Capstone Project 3
Here we put everything we had learned into a real-world project. My project is Kubernetes-based Personal Carbon Footprint Prediction. Check it out here: https://github.com/dangoled/MLZoomcamp2025_Capstone3/tree/main

Module 8: Neural Networks and Deep Learning
In this module we learned how to build an image classification model using PyTorch and transfer learning. We used a clothing dataset.
What was covered:
- Introduction to PyTorch for deep learning
- Loading and preprocessing image data
- Using pre-trained models (MobileNetV2)
- Understanding convolutional neural networks (CNNs)
- Transfer learning: adapting pre-trained models
- Hyperparameter tuning: learning rate optimization
- Model checkpointing: saving the best model
- Adding more layers to improve performance
- Dropout regularization to prevent overfitting
- Data augmentation for better generalization
- Training the final model
- Using the model for predictions
- Exporting models to ONNX format
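
To make the dropout item above concrete, here is a minimal pure-Python sketch of "inverted" dropout, the idea behind layers such as PyTorch's `nn.Dropout`. It is not the course code: the `dropout` helper, its parameters, and the toy activations are assumptions made up for illustration.

```python
import random


def dropout(values, drop_prob, training, rng):
    """Inverted dropout on a list of activations (illustrative helper)."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    # At prediction time dropout is switched off and values pass through.
    if not training or drop_prob == 0.0:
        return list(values)
    keep_prob = 1.0 - drop_prob
    # Zero each value with probability drop_prob and scale the survivors
    # by 1 / keep_prob so the expected activation stays the same.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0] * 8  # toy activations, not real model outputs

train_out = dropout(activations, drop_prob=0.5, training=True, rng=rng)
eval_out = dropout(activations, drop_prob=0.5, training=False, rng=rng)
```

During training some activations are zeroed at random, so the network cannot rely on any single unit. At prediction time the values are left untouched.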

Here we had to apply everything learned so far in a complete project by finding a dataset, training models, and deploying a web service. We were expected to:
- Describe the problem and explain how a model could be used
- Prepare the data, do EDA, and analyze important features
- Train multiple models, tune their performance, and select the best model
- Export the notebook into a script
- Put the model into a web service and deploy it locally with Docker
- Deploy the service to the cloud

Module 6: Decision Trees & Ensemble Learning
We learned about tree-based models and ensemble methods for better predictions.
Topics covered:
- Decision trees
- Random Forest
- Gradient boosting (XGBoost)
- Hyperparameter tuning
- Feature importance
This was hard. Needs a revisit.
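
As a refresher for that revisit, here is a minimal sketch of how a decision tree might pick a single split on one numeric feature by minimizing weighted Gini impurity. It is pure Python and not the course code: the helper names and the toy data are assumptions for illustration, and real libraries add many refinements on top of this idea.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    # For two classes, 1 - p^2 - (1 - p)^2 simplifies to 2p(1 - p).
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best single split."""
    best_threshold, best_impurity = None, float("inf")
    candidates = sorted(set(values))
    # Try the midpoint between each pair of neighbouring distinct values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best_impurity:
            best_threshold, best_impurity = threshold, weighted
    return best_threshold, best_impurity


# Toy data: small feature values belong to class 0, large ones to class 1.
values = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(values, labels)
```

A full tree repeats this search on each side of the split. Random Forest and gradient boosting then combine many such trees.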

Module 5: Deploying Machine Learning Models
This module was about turning ML models into web services and deploying them with Docker and cloud platforms.
Topics covered included:
- Model serialization with Pickle
- FastAPI web services
- Docker containerization
- Cloud deployment
Tools used included FastAPI, Docker, pickle, uvicorn, and uv.
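
The serialization step can be sketched with the standard library alone. The example below saves a stand-in "model" with `pickle`, loads it back, and checks that both copies score a customer identically. The model dictionary, feature names, weights, and file name are all made up for illustration; in a real service the loaded object would be the trained model, and the web framework would call something like `predict` inside a request handler.

```python
import math
import pickle
import tempfile
from pathlib import Path

# Stand-in for a trained model: hypothetical features and weights.
model = {
    "features": ["tenure", "monthly_charges"],
    "weights": [-0.05, 0.02],
    "bias": -0.4,
}


def predict(model, customer):
    """Score one customer with a logistic model stored as a plain dict."""
    z = model["bias"] + sum(
        w * customer[name] for name, w in zip(model["features"], model["weights"])
    )
    return 1.0 / (1.0 + math.exp(-z))


def save_model(model, path):
    with open(path, "wb") as f_out:
        pickle.dump(model, f_out)


def load_model(path):
    # Only unpickle files you trust: loading a pickle can execute code.
    with open(path, "rb") as f_in:
        return pickle.load(f_in)


with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.bin"  # hypothetical file name
    save_model(model, model_path)
    restored = load_model(model_path)

customer = {"tenure": 12, "monthly_charges": 70.0}
original_score = predict(model, customer)
restored_score = predict(restored, customer)
```

Loading the model once at startup, rather than on every request, keeps the service fast.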

Module 4: Evaluation Metrics for Classification
This module was about how to properly evaluate classification models and handle imbalanced datasets.
The following were covered:
- Accuracy, precision, recall, F1-score
- ROC curves and AUC
- Cross-validation
- Confusion matrices
- Class imbalance handling
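
A small worked example shows why accuracy alone can mislead on imbalanced data. The sketch below computes the confusion-matrix counts and the derived metrics in plain Python. The labels and predictions are invented for illustration, and the helper names are not from any library.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Imbalanced toy data: 8 negatives and only 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
```

Here accuracy is 0.8, yet precision, recall, and F1 are all 0.5, because the model finds only one of the two positives and raises one false alarm.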

Module 3: Machine Learning for Classification
In this module we created a customer churn prediction system using logistic regression and learned about feature selection.
Topics covered included:
- Logistic regression
- Feature importance and selection
- Categorical variable encoding
- Model interpretation
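
To illustrate how categorical encoding and logistic regression fit together, here is a minimal pure-Python sketch: it one-hot encodes two categorical fields and turns a weighted sum into a churn probability with the sigmoid function. The customer records, field names, and weights are hand-picked assumptions, not trained values or course data.

```python
import math


def one_hot_encode(records, categorical_keys):
    """Turn categorical fields into 0/1 columns named 'key=value'."""
    features = sorted({f"{k}={r[k]}" for r in records for k in categorical_keys})
    index = {name: i for i, name in enumerate(features)}
    rows = []
    for r in records:
        row = [0.0] * len(features)
        for k in categorical_keys:
            row[index[f"{k}={r[k]}"]] = 1.0
        rows.append(row)
    return features, rows


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(row, weights, bias):
    """Logistic regression prediction: sigmoid of the weighted sum."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


# Hypothetical customers with two categorical fields.
customers = [
    {"contract": "monthly", "internet": "fiber"},
    {"contract": "yearly", "internet": "dsl"},
    {"contract": "monthly", "internet": "dsl"},
]
features, X = one_hot_encode(customers, ["contract", "internet"])

# Hand-picked weights for illustration; a real model learns these from data.
weight_by_feature = {
    "contract=monthly": 1.2,
    "contract=yearly": -1.0,
    "internet=dsl": -0.3,
    "internet=fiber": 0.8,
}
weights = [weight_by_feature[name] for name in features]
churn_probabilities = [predict_proba(row, weights, bias=-0.5) for row in X]
```

Because every column is a 0/1 indicator, each weight can be read directly: a positive weight pushes the churn probability up for customers in that category, which is what makes the model easy to interpret.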