ML ZOOMCAMP 2025 - Module 1
As a Data Science enthusiast, I am always looking for opportunities to get better in the field. This quest led me to the Data Science Zoomcamp, Cohort 2025.
I am looking forward to the 4 months of learning, unlearning and re-learning, and documenting my journey with key highlights and insights.
The pre-course Q&A session, streamed live on YouTube on August 19th, set the stage for the course.
The course officially kicked off on September 15th and will follow the outline below:
- Module 1: Introduction to Machine Learning
- Module 2: Machine Learning for Regression
- Module 3: Machine Learning for Classification
- Module 4: Evaluation Metrics for Classification
- Module 5: Deploying Machine Learning Models
- Module 6: Decision Trees & Ensemble Learning
- Midterm Project
- Module 7: Neural Networks & Deep Learning
- Module 8: Serverless Deep Learning
- Module 9: Kubernetes & TensorFlow Serving
- Capstone Project
This module covers the fundamentals: what ML is, when to use it, and how to approach ML problems using the CRISP-DM framework.
1 Introduction to Machine Learning with Cars Data
In Machine Learning, a model is trained on the feature variables in the data, learns patterns from them, and can then be used to make predictions. We looked at data about cars, including their characteristics (features) and prices (target). A Machine Learning (ML) model can extract patterns from known information (data) about some cars in order to predict the prices of other cars from their characteristics.
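To make the features-and-target idea concrete, here is a minimal sketch in plain Python. It assumes a single feature (the car's age) and fits a straight line by ordinary least squares. The data points and the `fit_line` and `predict` helpers are invented for illustration and are not the course's dataset or code.

```python
# Toy data, invented for illustration: (age in years, price in dollars).
# The prices were chosen to lie exactly on a line so the result is easy to check.
cars = [(1, 27000), (2, 24000), (3, 21000), (4, 18000), (5, 15000)]


def fit_line(points):
    """Fit price = intercept + slope * age by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, age):
    """Apply the learned pattern to a car the model has not seen."""
    intercept, slope = model
    return intercept + slope * age


model = fit_line(cars)      # "training": extract the pattern from known cars
print(predict(model, 6))    # prediction for a 6-year-old car
```

Real car data has many features and noisy prices, so a real model would be fitted with a library rather than by hand, but the workflow is the same: learn from known cars, then predict for new ones.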
2 Rules-Based Systems vs. Machine Learning
Rule-based systems use a hand-written set of rules to determine an outcome (e.g., whether an email is spam or not), while Machine Learning models are trained on features and targets extracted from data, without the rules being explicitly programmed.
- For Rules-Based Systems, rules are manually converted into code using a programming language and then applied to data. This makes the process complex and challenging, especially when the rules keep changing and the code requires frequent updates and maintenance.
- In Machine Learning, instead of manually coding rules, ML models automatically extract patterns from data using Mathematics and Statistics.
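The contrast between the two approaches can be sketched with a toy spam filter. The example emails, the phrases in the hand-written rule, and the word-count scorer are all made up for illustration. The scorer is a deliberately simple stand-in for a trained model, not the method used in the course.

```python
from collections import Counter


# Rule-based: a person writes the rules and must update them by hand.
def is_spam_rules(email):
    text = email.lower()
    return "free money" in text or "click here" in text


# Machine learning: the "rules" are learned from labeled examples.
# Toy training data, invented for illustration: (text, 1 = spam / 0 = not spam).
emails = [
    ("win free money now", 1),
    ("click here for a prize", 1),
    ("free prize click now", 1),
    ("meeting moved to monday", 0),
    ("lunch on monday", 0),
    ("notes from the meeting", 0),
]


def train(examples):
    """Count how often each word appears in spam and in non-spam emails."""
    spam_words, ham_words = Counter(), Counter()
    for text, label in examples:
        target = spam_words if label == 1 else ham_words
        target.update(text.lower().split())
    return spam_words, ham_words


def is_spam_learned(model, email):
    """Flag an email when its words were seen more often in spam than not."""
    spam_words, ham_words = model
    score = sum(spam_words[w] - ham_words[w] for w in email.lower().split())
    return score > 0


model = train(emails)
new_email = "free prize inside"
print(is_spam_rules(new_email))           # the fixed phrases do not match
print(is_spam_learned(model, new_email))  # the learned word counts do
```

When spammers change their wording, the rule-based function needs new code, while the learned version only needs new labeled examples and a retrain.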
ML problems can be grouped by the type of output the model produces:
- Regression: the output is a number (e.g., a car's price).
- Classification: the output is a category (e.g., spam detection).
  - Binary classification: there are two categories (e.g., spam or not spam).
  - Multiclass classification: there are more than two categories (e.g., cat, dog, horse, donkey).
- Ranking: the output is an ordering of items by their assigned scores (e.g., shirts ranking higher than sweaters). Recommender systems are an example.
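The difference between these problem types shows up in how a model's raw output is turned into an answer. The sketch below uses hard-coded numbers in place of real model outputs. Every value, the 0.5 threshold, and the extra "jacket" item are assumptions made for illustration.

```python
# Hypothetical model outputs, invented to illustrate each problem type.

# Regression: the output is used directly as a number.
predicted_price = 18250.0

# Binary classification: a score is compared with a threshold (0.5 assumed here).
spam_probability = 0.83
is_spam = spam_probability >= 0.5

# Multiclass classification: pick the category with the highest score.
class_scores = {"cat": 0.10, "dog": 0.70, "horse": 0.15, "donkey": 0.05}
predicted_class = max(class_scores, key=class_scores.get)

# Ranking: order the items from highest score to lowest.
item_scores = {"shirt": 0.9, "sweater": 0.4, "jacket": 0.6}
ranking = sorted(item_scores, key=item_scores.get, reverse=True)

print(predicted_price, is_spam, predicted_class, ranking)
```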
CRISP-DM (Cross-Industry Standard Process for Data Mining) is a structured methodology for organizing machine learning projects. It follows these steps:
- Business Understanding
- Data Understanding
- Data Preparation
- Modeling (choosing and training models, then selecting the best one)
- Evaluation
- Deployment
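The Modeling step above (train several models, then select the best one) can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions: the data is invented, the two candidates (a mean baseline and a straight-line fit) are arbitrary choices, and RMSE on a held-out validation set is assumed as the selection metric.

```python
import math

# Toy data, invented for illustration: (age in years, price in dollars).
data = [(1, 27500), (2, 23800), (3, 21400), (4, 17900),
        (5, 15200), (6, 11800), (7, 9300), (8, 5900)]

# Hold out every third car for validation; train on the rest.
val = data[1::3]
train = [point for point in data if point not in val]


def fit_mean(points):
    """Baseline: always predict the average training price."""
    mean_price = sum(y for _, y in points) / len(points)
    return lambda age: mean_price


def fit_linear(points):
    """Straight-line fit by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    intercept = mean_y - slope * mean_x
    return lambda age: intercept + slope * age


def rmse(model, points):
    """Root mean squared error of the model on the given points."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in points) / len(points))


# Train each candidate on the training set, score it on the validation set.
candidates = {"mean": fit_mean, "linear": fit_linear}
results = {name: rmse(fit(train), val) for name, fit in candidates.items()}

# Select the candidate with the lowest validation error.
best = min(results, key=results.get)
print(best, results)
```

Scoring on data the models did not see during training is what makes the comparison meaningful. A model that merely memorized the training cars would look good on them and poor on the validation set.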
6 Setting up the Environment
7 Introduction to NumPy
8 Linear Algebra