Chapter 4: Penalized Linear Regression
Why Penalized Linear Regression Methods Are So Useful (p.122)
- Extremely Fast Coefficient Estimation
- Variable Importance Information
- Extremely Fast Evaluation When Deployed
- Reliable Performance
- Sparse Solutions
- Problem May Require Linear Model
- When to Use Ensemble Methods
Problem May Require Linear Model
- a linear model might be a requirement of the solution, e.g.
- Calculations of insurance payouts
- Drug testing [...] regulatory apparatus requires a linear form for statistical inference
When to Use Ensemble Methods
- you might get better performance with another technique, such as an ensemble method
- ensemble methods for measuring variable importance can yield more information about the relationship between attributes and predictions, including second-order (and higher) information about which pairs of variables are more important together
Penalized Linear Regression: Regulating Linear Regression for Optimum Performance (p.124)
Training Linear Models: Minimizing Errors and More
- Adding a Coefficient Penalty to the OLS Formulation
- Other Useful Coefficient Penalties—Manhattan and ElasticNet
- Why Lasso Penalty Leads to Sparse Coefficient Vectors
- ElasticNet Penalty Includes Both Lasso and Ridge
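The penalties above can be sketched in a few lines. This is an illustrative NumPy snippet (not the book's code): it evaluates the OLS error plus an ElasticNet penalty, where `alpha=1.0` recovers the Lasso (Manhattan/L1) penalty and `alpha=0.0` the ridge (L2) penalty, and shows why the L1 penalty favors a sparse coefficient vector on a toy problem.

```python
import numpy as np

def penalized_loss(X, y, beta, lam, alpha):
    """Sum-of-squares error plus an ElasticNet penalty.

    alpha = 1.0 -> pure Lasso (L1) penalty
    alpha = 0.0 -> pure ridge (L2) penalty
    values in between blend the two.
    """
    residuals = y - X @ beta
    sse = 0.5 * np.sum(residuals ** 2)
    l1 = np.sum(np.abs(beta))
    l2 = 0.5 * np.sum(beta ** 2)
    return sse + lam * (alpha * l1 + (1.0 - alpha) * l2)

# toy data: the target depends only on the first attribute
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = 2.0 * X[:, 0]

dense = np.array([2.0, 0.3, 0.3])   # spurious small weights
sparse = np.array([2.0, 0.0, 0.0])  # the exact solution
# under the L1 penalty the sparse vector achieves lower total loss
print(penalized_loss(X, y, sparse, lam=1.0, alpha=1.0)
      < penalized_loss(X, y, dense, lam=1.0, alpha=1.0))
```

The comparison prints `True`: the sparse vector fits exactly (zero error) and pays a smaller penalty, which is the mechanism behind "Why Lasso Penalty Leads to Sparse Coefficient Vectors" above.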
Solving the Penalized Linear Regression Problem (p.132)
Understanding Least Angle Regression and Its Relationship to Forward Stepwise Regression
How LARS Generates Hundreds of Models of Varying Complexity
Choosing the Best Model from the Hundreds LARS Generates
- Mechanizing Cross-Validation for Model Selection in Python Code
- Accumulating Errors on Each Cross-Validation Fold and Evaluating Results
- Practical Considerations with Model Selection and Training Sequence
CODE
- Listing 4-1: LARS Algorithm for Predicting Wine Taste—larsWine2.py
- Figure 4-3: Coefficient curves for LARS regression on wine data.
- Listing 4-2: 10-Fold Cross-Validation to Determine Best Set of Coefficients—larsWineCV.py
- Figure 4-4: Cross-validated mean square error for LARS on wine data.
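The core LARS iteration from Listing 4-1 can be compressed into a short sketch. This is a simplified stand-in on synthetic data, not the book's wine code: at each step the coefficient of the attribute most correlated with the current residual is nudged by a small fixed amount, and each step's coefficient vector is saved, so the path yields hundreds of candidate models of increasing complexity for cross-validation to choose among.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 1.5 * X[:, 2] + 0.1 * rng.normal(size=200)

# normalize attributes and center the target, as the book does
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = y - y.mean()

n_steps, step_size = 350, 0.01
beta = np.zeros(X.shape[1])
beta_path = []                          # one candidate model per step
for _ in range(n_steps):
    residual = y - X @ beta
    corr = X.T @ residual               # correlation with residual
    j = np.argmax(np.abs(corr))         # most-correlated attribute
    beta[j] += step_size * np.sign(corr[j])
    beta_path.append(beta.copy())

# early models on the path use only the strongest attributes
print("attributes in use after 10 steps:", np.nonzero(beta_path[10])[0])
```

Plotting each coefficient in `beta_path` against the step index reproduces the kind of coefficient curves shown in Figure 4-3; model selection then amounts to picking the step index with the lowest cross-validated error, as Listing 4-2 does.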
Using Glmnet: Very Fast and Very General (p.144)
Comparison of the Mechanics of Glmnet and LARS Algorithms
Initializing and Iterating the Glmnet Algorithm
CODE
- Listing 4-3: Glmnet Algorithm—glmnetWine.py
- Figure 4-6: Coefficient curves for glmnet models for predicting wine taste
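Glmnet's inner loop is cyclic coordinate descent with soft-thresholding. The sketch below is a simplified Lasso-only version on synthetic data (the real algorithm handles the full ElasticNet penalty and several link functions): each coefficient is updated in turn against the partial residual, and the soft-threshold step is what sets coefficients exactly to zero. Like glmnet, it sweeps the penalty from large (all coefficients zero) toward small (near the OLS solution).

```python
import numpy as np

def coordinate_descent_lasso(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for the Lasso (glmnet-style inner loop)."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # partial residual leaving attribute j out of the fit
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r_j / n
            z = (X[:, j] @ X[:, j]) / n
            # soft-threshold: shrinks small coefficients exactly to zero
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / z
    return beta

rng = np.random.default_rng(7)
X = rng.normal(size=(200, 6))
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = 2.0 * X[:, 1] - 1.2 * X[:, 4]
y = y - y.mean()

# sweep lambda from large to small, as glmnet does
for lam in [2.5, 0.5, 0.05]:
    beta = coordinate_descent_lasso(X, y, lam)
    print(lam, "-> nonzero attributes:", np.nonzero(beta)[0])
```

At the largest penalty no attribute enters; as the penalty shrinks, the two truly informative attributes (columns 1 and 4) appear, tracing out coefficient curves like those in Figure 4-6.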
Homework: review Bowles Ch. 4, pp. 122-150.
Extensions to Linear Regression with Numeric Input (p.151)
Solving Classification Problems with Penalized Regression
CODE
- Listing 4-4: Converting a Classification Problem to an Ordinary Regression Problem by Assigning Numeric Values to Binary Labels
- Figure 4-7: Coefficient curves for the rocks versus mines classification problem solved by converting the labels to numeric values
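The label-conversion trick of Listing 4-4 can be sketched on synthetic rocks-versus-mines-style data (this is an illustration, not the book's sonar code): recode the two class labels as 0.0 and 1.0, fit an ordinary regression, and predict class 1 whenever the model output exceeds 0.5.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
labels = rng.integers(0, 2, size=n)            # 0 = rock, 1 = mine
X = rng.normal(size=(n, 4))
X[:, 0] += 2.0 * labels                        # one informative attribute

y = labels.astype(float)                       # binary labels -> 0.0 / 1.0
Xb = np.column_stack([np.ones(n), X])          # add an intercept column
beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)  # ordinary least squares

pred = (Xb @ beta > 0.5).astype(int)           # threshold at 0.5
print("training accuracy:", np.mean(pred == labels))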
Working with Classification Problems Having More Than Two Outcomes (p.155)
Supplementary links:
- http://dataaspirant.com/2017/03/07/difference-between-softmax-function-and-sigmoid-function/
- "Multi-class classification means a classification task with more than two classes; each label are mutually exclusive. The classification makes the assumption that each sample is assigned to one and only one label [...] Multi-label classification assigns to each sample a set of target labels." https://towardsdatascience.com/multi-label-text-classification-with-scikit-learn-30714b7819c5
Understanding Basis Expansion: Using Linear Methods on Nonlinear Problems (p.156)
CODE
- Listing 4-5: Basis Expansion for Wine Taste Prediction
- Figure 4-8: Functions generated to expand a wine attribute
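Basis expansion in the spirit of Listing 4-5, sketched on synthetic data rather than the wine set: append nonlinear functions of an attribute (here just x squared) as extra columns, then solve an ordinary linear regression in the expanded attribute set. The model stays linear in its coefficients while capturing a nonlinear relationship.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(-2, 2, size=200)
y = 1.0 + 0.5 * x - 2.0 * x ** 2 + 0.1 * rng.normal(size=200)

# linear in x only: cannot represent the curvature
X_lin = np.column_stack([np.ones_like(x), x])
beta_lin, *_ = np.linalg.lstsq(X_lin, y, rcond=None)
err_lin = np.mean((y - X_lin @ beta_lin) ** 2)

# expanded basis [1, x, x**2]: still a linear regression problem
X_exp = np.column_stack([np.ones_like(x), x, x ** 2])
beta_exp, *_ = np.linalg.lstsq(X_exp, y, rcond=None)
err_exp = np.mean((y - X_exp @ beta_exp) ** 2)

print("MSE, linear basis  :", round(err_lin, 3))
print("MSE, expanded basis:", round(err_exp, 3))
```

With more candidate basis functions than in this sketch, the penalized methods from earlier in the chapter decide which expanded columns to keep.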
Incorporating Non-Numeric Attributes into Linear Methods (p.158)
CODE
- Listing 4-6: Coding Categorical Variable for Penalized Linear Regression - Abalone Data—larsAbalone.py
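The coding step in Listing 4-6 can be sketched as follows. The abalone "sex" attribute takes the categorical values M, F, and I, so it is recoded as 0/1 indicator columns before being passed to a penalized linear regression (this helper is illustrative, not the book's code):

```python
import numpy as np

def one_hot(values, categories):
    """Return an (n, len(categories)) matrix of 0/1 indicator columns."""
    return np.array([[1.0 if v == c else 0.0 for c in categories]
                     for v in values])

# a few abalone "sex" values: male, female, infant
sex = ["M", "F", "I", "M", "I"]
coded = one_hot(sex, categories=["M", "F", "I"])
print(coded)
```

Each row has exactly one 1, and the three resulting numeric columns join the other attributes in the regression's design matrix.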