Chapter 5: Building Predictive Models Using Penalized Linear Methods
Python Packages for Penalized Linear Regression
Multivariable Regression: Predicting Wine Taste
Building and Testing a Model to Predict Wine Taste
CODE
- Listing 5-1: Using Cross-Validation to Estimate Out-of-Sample Error with Lasso Modeling Wine Taste—wineLassoCV.py
- Figure 5-1: ... un-normalized Y
- Figure 5-2: ... normalized Y
- Figure 5-3: ... un-normalized X and Y
Training on the Whole Data Set before Deployment
CODE
- Listing 5-2: Lasso Training on Full Data Set—wineLassoCoefCurves.py
- Figure 5-4: Coefficient curves for Lasso trained to predict wine quality
- Figure 5-5: Coefficient curves for Lasso trained on un-normalized Xs
Basis Expansion: Improving Performance by Creating New Variables from Old Ones
CODE
- Listing 5-3: Using Out-of-Sample Error to Evaluate New Attributes for Predicting Wine Quality—wineExpandedLassoCV.py
- Figure 5-6: Cross-validation error curves for Lasso trained on wine quality data with expanded feature set
Binary Classification: Using Penalized Linear Regression to Detect Unexploded Mines
CODE
- Listing 5-4: Using ElasticNet Regression to Build a Binary (Two-Class) Classifier—rocksVMinesENetRegCV.py
- Figure 5-7: Out-of-sample classifier misclassification performance
- Figure 5-8: Out-of-sample classifier AUC performance
- Figure 5-9: Receiver operating characteristic for best performing classifier
Build a Rocks versus Mines Classifier for Deployment
CODE
- Listing 5-5: Coefficient Trajectories for ElasticNet Trained on Rocks versus Mines Data—rocksVMinesCoefCurves.py
- Figure 5-10: Coefficient curves for ElasticNet trained on rocks versus mines data
- Listing 5-6: Penalized Logistic Regression Trained on Rocks versus Mines Data—rocksVMinesGlmnet.py
- Figure 5-11: Coefficient curves for ElasticNet penalized logistic regression trained on rocks versus mines data
Multiclass Classification: Classifying Crime Scene Glass Samples
- Listing 5-7: Multiclass Classification with Penalized Linear Regression - Classifying Crime Scene Glass Samples—glassENetRegCV.py
- Figure 5-12: Misclassification error rates using penalized linear regression for glass classification