Having examined bias and fairness, we now focus on improving models responsibly. You’ll explore iterative development, including hyperparameter tuning, feature engineering, collecting more data, and combining models to enhance performance while maintaining fairness.
Building a good ML model is an iterative process. We start with a simple model and test it, then make improvements. Each change (like using a different algorithm or adjusting settings) is evaluated on validation data.
It's similar to cooking: you taste a dish, adjust the seasoning, and try again until it's just right.
Models have hyperparameters (settings like learning rate, number of trees, etc.). We can tune these to improve performance. For example, a decision tree can be made deeper or shallower.
We try different settings and pick the one that works best on validation data, much like dialing in the right recipe.
Sometimes creating new features can boost performance. If we have raw text, we might extract the number of specific words; if we have dates, we might add "day of week".
By engineering useful features, we give the model clearer signals. It's like highlighting important clues in the data.
Often, collecting more data helps improve models. More examples allow the model to learn exceptions and reduce overfitting.
If a model isn't accurate enough, getting more labeled examples (especially from underrepresented cases) is a reliable step toward improvement.
Instead of a single train/test split, we can use k-fold cross-validation: split the data into k parts and train k times, each time using a different part as the validation set.
This gives a more robust performance estimate, ensuring our results aren't due to a lucky split.
Combining multiple models can boost accuracy. For example, you might average the predictions of several models or take a vote. This is called ensembling.
It's like consulting multiple experts and combining their opinions; often the ensemble outperforms any single model.
Which Approach Is Likely To Help A Model That Is Underfitting?
Adjusting a model's settings (like learning rate or tree depth) to improve performance is called hyperparameter ______.
Think of a skill you've improved over time (like playing an instrument). How does the process of feedback and adjustment in that skill compare to training and improving a machine learning model?
Fantastic work! You now understand how to iteratively improve ML models through tuning, feature engineering, and other techniques.