The Most Popular Variable Selection Method: Ranking the Top Choice

Choose the method you think is the most popular!

Author: Gregor Krambs
Jun 16, 2023 10:18 (Updated on Dec 2, 2023 09:40)
Discover the most popular variable selection method in the world of statistics and data analysis! Dive into the exciting realm of StrawPoll's latest ranking, where thousands of users are casting their votes and sharing their opinions on their favorite ways to choose the best variables for their models. Are you a fan of Stepwise Selection, LASSO, or Ridge Regression? Or do you have a different technique up your sleeve? No matter your preference, join our ever-growing community of data enthusiasts, vote for your top pick, and even suggest a missing option to add to the mix! Unleash your inner statistician, and together, let's uncover the ultimate variable selection method that reigns supreme. Don't miss out on this thrilling ranking – the fate of the variables lies in your hands!

What Is the Most Popular Variable Selection Method?

  1. Forward selection
    This method begins with an empty model and adds variables one at a time, choosing the variable that provides the best fit until there is no further improvement.
    Forward selection is a variable selection method used to build a predictive model by iteratively adding the most significant predictors. It starts with an empty model and adds one variable at a time based on a pre-defined criterion, such as the highest increase in predictive power or lowest p-value.
    • Iteration: Iterative process
    • Criterion: Based on highest increase in predictive power or lowest p-value
    • Model Start: Empty model
    • Variable Addition: One variable at a time
    • Selection Control: Stops when a predefined criterion shows no further improvement, or when a target number of variables is reached
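As an illustrative sketch, forward selection can be run with scikit-learn's SequentialFeatureSelector; the synthetic dataset and the choice of three features here are assumptions for the example, not part of the method itself:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Synthetic data: 8 candidate predictors, only 3 of which are informative.
X, y = make_regression(n_samples=200, n_features=8, n_informative=3,
                       noise=5.0, random_state=0)

# Start from an empty model and greedily add the variable that most
# improves cross-validated fit, stopping once 3 features are selected.
selector = SequentialFeatureSelector(LinearRegression(),
                                     n_features_to_select=3,
                                     direction="forward", cv=5)
selector.fit(X, y)
selected = np.flatnonzero(selector.get_support())
print("Selected feature indices:", selected)
```

Here the stopping rule is a fixed feature count; in practice you might instead stop when cross-validated score no longer improves.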
  2. Backward elimination
    This method starts with a model that includes all variables and then removes them one at a time, choosing the variable that provides the best fit until there is no further improvement.
    Backward elimination is a variable selection method used in statistical modeling. It involves starting with a model that includes all predictor variables, and iteratively removing the least significant variable until a desirable model is obtained. The goal is to simplify the model by eliminating non-significant predictors and improving interpretability.
    • Purpose: Variable selection
    • Procedure: Iteratively eliminates the least significant variable
    • Starting point: Model with all predictor variables
    • Criterion for elimination: p-value, significance level
    • Sequential process: Elimination of one variable in each iteration
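The loop described above can be sketched by hand with NumPy and SciPy; the data, the helper function, and the 0.05 significance threshold are all assumptions for the example:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 5))
# Only columns 0 and 2 truly affect y; the rest are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=1.0, size=n)

def ols_pvalues(X, y):
    """OLS p-values for each coefficient (intercept excluded)."""
    Xd = np.column_stack([np.ones(len(X)), X])   # add intercept column
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    dof = len(y) - Xd.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(Xd.T @ Xd)
    t = beta / np.sqrt(np.diag(cov))
    return 2 * stats.t.sf(np.abs(t), dof)[1:]    # drop intercept p-value

# Backward elimination: start with all predictors, repeatedly drop the
# least significant one until every remaining p-value is below 0.05.
keep = list(range(X.shape[1]))
while keep:
    p = ols_pvalues(X[:, keep], y)
    worst = int(np.argmax(p))
    if p[worst] < 0.05:
        break
    keep.pop(worst)
print("Retained columns:", keep)
```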
  3. Lasso regression
    This method uses regularization to penalize the size of coefficients, forcing some of them to zero and thus selecting only the most important variables.
    Lasso regression, which applies L1 regularization, is a variable selection method used in linear regression analysis. It was introduced by Robert Tibshirani in 1996. It shrinks the coefficients of less important variables, driving some of them exactly to zero, which also helps when independent variables are highly correlated (multicollinearity).
    • Regularization Method: L1
    • Objective Function: Least Absolute Shrinkage and Selection Operator (LASSO)
    • Optimization Algorithm: Coordinate Descent
    • Penalty Term: Absolute value of regression coefficients
    • Variable Selection Approach: Automatically selects important variables while shrinking others
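A minimal sketch with scikit-learn's Lasso on made-up data (the penalty strength alpha=0.5 is an arbitrary choice for illustration) shows the automatic selection: coefficients of uninformative predictors are driven exactly to zero.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
# Only the first two predictors carry signal.
y = 4.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# The L1 penalty drives some coefficients exactly to zero,
# performing variable selection as part of the fit.
lasso = Lasso(alpha=0.5).fit(X, y)
print("Coefficients:", np.round(lasso.coef_, 2))
print("Selected features:", np.flatnonzero(lasso.coef_ != 0))
```

Larger alpha values zero out more coefficients; in practice alpha is usually chosen by cross-validation (e.g. LassoCV).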
  4. Ridge regression
    This method also uses regularization to shrink the size of coefficients, but it does not force any of them to zero.
    Ridge regression is a popular variable selection method that is used to tackle the issue of multicollinearity in regression analysis. It was introduced by Arthur E. Hoerl and Robert W. Kennard in 1970.
    • Purpose: To overcome multicollinearity by adding a penalty term to the least squares regression.
    • Penalty term: Ridge regression adds a penalty term to the least squares regression called a regularization parameter (lambda) which shrinks the coefficients towards zero.
    • Coefficient shrinkage: Ridge regression shrinks the coefficients, reducing the effect of variables with lower importance.
    • Bias-variance tradeoff: Ridge regression accepts a small increase in bias in exchange for a larger reduction in variance, guarding against overfitting.
    • Tikhonov regularization: Ridge regression is also known as Tikhonov regularization or L2 regularization.
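A small sketch contrasting ridge with ordinary least squares on two nearly collinear predictors (the data and alpha=10 are assumptions for the example); the L2 penalty shrinks the coefficients but, unlike lasso, sets none of them exactly to zero:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(2)
# Two highly correlated predictors -> unstable OLS coefficients.
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.05, size=100)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(scale=0.5, size=100)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)

# Ridge shrinks coefficients toward zero, stabilising the fit
# under multicollinearity; none are exactly zero.
print("OLS coefficients  :", np.round(ols.coef_, 2))
print("Ridge coefficients:", np.round(ridge.coef_, 2))
```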
  5. Elastic net
    This method combines the Lasso and Ridge methods to balance their strengths and weaknesses.
    Elastic net is a popular variable selection method that combines the L1 and L2 regularization techniques. It was introduced by Zou and Hastie in 2005 as an extension of the Lasso regression. The Elastic net algorithm is widely used in machine learning and statistics for feature selection and regularization to handle large-dimensional data.
    • Regularization techniques: Combines L1 and L2 regularization
    • Purpose: Feature selection and regularization for large-dimensional data
    • Advantage: Handles correlated predictors and selects groups of correlated variables
    • Interpretability: Produces sparse solutions, making it easier to interpret
    • Flexibility: Allows the user to control the balance between L1 and L2 regularization
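As a sketch of that L1/L2 balance, scikit-learn's ElasticNet exposes it through the l1_ratio parameter (the grouped synthetic data and the alpha/l1_ratio values below are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(3)
base = rng.normal(size=(150, 1))
# A group of three correlated predictors plus five pure-noise columns.
group = base + rng.normal(scale=0.1, size=(150, 3))
noise = rng.normal(size=(150, 5))
X = np.hstack([group, noise])
y = group.sum(axis=1) + rng.normal(scale=0.5, size=150)

# l1_ratio balances the L1 (sparsity) and L2 (grouping) penalties;
# l1_ratio=1 recovers lasso, l1_ratio=0 approaches ridge.
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print("Coefficients:", np.round(enet.coef_, 2))
```

Unlike pure lasso, which tends to pick one variable from a correlated group, elastic net tends to keep the whole group with similar coefficients.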
  6. Principal component analysis (PCA)
    This method transforms the original variables into a smaller set of uncorrelated variables, which can then be used in a regression model.
    Principal component analysis (PCA) is a popular variable selection method used in data analysis and machine learning. It is a statistical technique that reduces the dimensionality of a dataset while preserving most of its variability. By transforming the original variables into a new set of uncorrelated variables called principal components, PCA allows for a simplified representation of the data.
    • Objective: Dimensionality reduction
    • Main Purpose: Identify and capture the most important features in a dataset
    • Method: Linear algebra approach
    • Data Type: Numeric data
    • Input: Matrix or data frame
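A brief sketch with scikit-learn's PCA on hypothetical correlated measurements (the latent-factor data-generating setup is an assumption for the example); the uncorrelated component scores can then be used as regression inputs:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
# Six correlated measurements driven by two latent factors.
latent = rng.normal(size=(300, 2))
loadings = rng.normal(size=(2, 6))
X = latent @ loadings + rng.normal(scale=0.1, size=(300, 6))

# Project onto the two components that capture most of the variance;
# the resulting scores are uncorrelated and can feed a downstream model.
pca = PCA(n_components=2)
scores = pca.fit_transform(X)
print("Explained variance ratio:",
      np.round(pca.explained_variance_ratio_, 3))
```

Note that PCA reduces dimensionality rather than selecting original variables: each component is a linear mix of all inputs, which can cost some interpretability.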
  7. Random forest
    This method uses an ensemble of decision trees to identify the most important variables for predicting the outcome.
    Random forest is a popular variable selection method that combines multiple decision trees to improve prediction accuracy and feature importance ranking. It was first proposed by Leo Breiman and Adele Cutler in 2001.
    • Type: Supervised machine learning algorithm
    • Purpose: Variable selection, prediction, feature importance ranking
    • Method: Ensemble learning
    • Algorithm: Random forest builds multiple decision trees by bootstrapping training data and randomly selecting a subset of features for each tree
    • Feature Importance: Random forest ranks features by the mean decrease in impurity (e.g., the Gini index) averaged across all decision trees
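A short sketch with scikit-learn's RandomForestRegressor on invented data (the response, including a deliberately nonlinear term, is an assumption for the example) shows the importance ranking in action:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(5)
X = rng.normal(size=(300, 6))
# Only features 0 and 1 drive the response (feature 0 nonlinearly),
# which tree ensembles can pick up without any manual transformation.
y = X[:, 0] ** 2 + 2.0 * X[:, 1] + rng.normal(scale=0.2, size=300)

forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
# feature_importances_ holds the mean decrease in impurity per feature.
ranking = np.argsort(forest.feature_importances_)[::-1]
print("Features ranked by importance:", ranking)
```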
  8. Recursive feature elimination (RFE)
    This method starts with a model that includes all variables and then removes them one at a time, based on their importance as determined by the model.
    Recursive feature elimination (RFE) is a popular variable selection method that iteratively removes less important features from a model until a desired number of features remain. It is commonly used in machine learning tasks to improve model performance and interpretability.
    • Application: Machine learning variable selection
    • Algorithm: Greedy search with elimination
    • Objective: Optimize model performance and interpretability
    • Iterative Process: Features with low importance are eliminated at each iteration
    • Evaluation Metric: Commonly uses cross-validation or other performance metrics
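A minimal sketch with scikit-learn's RFE wrapped around a logistic regression (the synthetic classification task and the target of three features are assumptions for the example):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# 10 candidate features, only 3 of which are informative.
X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           n_redundant=0, random_state=0)

# RFE fits the model, drops the weakest feature (by coefficient
# magnitude here), and repeats until 3 features remain.
rfe = RFE(LogisticRegression(max_iter=1000),
          n_features_to_select=3).fit(X, y)
print("Kept features:", np.flatnonzero(rfe.support_))
```

A variant, RFECV, uses cross-validation to choose the number of features to keep instead of fixing it in advance.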
  9. Stepwise regression
    This method combines forward selection and backward elimination, adding and removing variables until no further improvement is possible.
    Stepwise regression is a variable selection method used in statistical modeling to determine the subset of predictor variables that are most relevant for predicting the outcome variable. It starts with an initial model and iteratively adds or removes variables based on their statistical significance or contribution to the model's fit. The method is called 'stepwise' because it proceeds step by step, alternating between forward selection and backward elimination.
    • Selection direction: Forward, backward, or both
    • Criteria for variable entry: Significance level or contribution to model fit
    • Criteria for variable removal: Significance level or change in model fit
    • Stopping criteria: P-value threshold or model fit improvement
    • Multiple testing correction: Optional, usually Bonferroni correction
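The alternating add/drop loop can be sketched by hand; this version uses adjusted R-squared as the entry and removal criterion (one of several possibilities, such as p-values or AIC), and the data are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(200, 6))
# Only columns 1 and 4 carry signal.
y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + rng.normal(size=200)

def adj_r2(cols):
    """Adjusted R^2 of an OLS fit on the given columns (with intercept)."""
    Xd = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    rss = np.sum((y - Xd @ beta) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    n, k = len(y), len(cols)
    return 1 - (rss / (n - k - 1)) / (tss / (n - 1))

# Stepwise: alternate a forward add with a backward drop, accepting a
# move only if it improves adjusted R^2; stop when neither move helps.
model = []
while True:
    score = adj_r2(model)
    adds = [c for c in range(X.shape[1]) if c not in model]
    best_add = max(adds, key=lambda c: adj_r2(model + [c]), default=None)
    if best_add is not None and adj_r2(model + [best_add]) > score:
        model.append(best_add)
        continue
    drops = [(adj_r2([k for k in model if k != c]), c) for c in model]
    if drops and max(drops)[0] > score:
        model.remove(max(drops)[1])
        continue
    break
print("Selected columns:", sorted(model))
```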
  10. Best subset selection
    This method tests all possible combinations of variables and chooses the one that provides the best fit.
    Best subset selection is a variable selection method used in statistical modeling to select the best subset of predictor variables. It involves fitting all possible combinations of predictor variables and selecting the subset that provides the highest model fit or the most meaningful interpretation. Best subset selection aims to find the optimal model that balances model complexity and predictive accuracy.
    • Optimization Goal: Finding the optimal subset of predictor variables.
    • Model Fit Criterion: Maximizing model fit or minimizing model error.
    • Variable Combinations: Fitting all possible combinations of predictor variables.
    • Model Complexity: Balancing model complexity and predictive accuracy.
    • Computational Complexity: High computational complexity for large feature spaces.
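A sketch of the exhaustive search with NumPy, scoring each subset by BIC (the data and the choice of BIC over, say, adjusted R-squared or AIC are assumptions for the example); note the 2^p model fits, which is why this is only practical for a small number of predictors:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(7)
X = rng.normal(size=(120, 5))
# Only columns 0 and 3 carry signal.
y = 1.5 * X[:, 0] + 2.0 * X[:, 3] + rng.normal(size=120)

def bic(cols):
    """BIC of an OLS fit on the given columns (lower is better)."""
    Xd = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    rss = np.sum((y - Xd @ beta) ** 2)
    n = len(y)
    return n * np.log(rss / n) + Xd.shape[1] * np.log(n)

# Exhaustively score every non-empty subset of predictors and keep
# the one with the lowest BIC.
subsets = [c for k in range(1, X.shape[1] + 1)
           for c in combinations(range(X.shape[1]), k)]
best = min(subsets, key=bic)
print("Best subset by BIC:", best)
```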


Ranking factors for popular variable selection method

  1. Accuracy
    The method should have a high accuracy rate in identifying relevant and important variables.
  2. Efficiency
    The method should be computationally efficient and able to handle large datasets.
  3. Interpretability
    The method should provide a clear and interpretable representation of the selected variables.
  4. Robustness
    The method should be robust to noise and outliers in the data.
  5. Scalability
    The method should have the ability to scale with increasing data sizes and complexity.
  6. Flexibility
    The method should be adaptable to different types of data and modeling techniques.
  7. Validation
    The method should be rigorously validated and benchmarked against other state-of-the-art methods.
  8. Simplicity
    The method should be easy to understand and apply, with clear documentation and user-friendliness.

About this ranking

This is a community-based ranking of the most popular variable selection methods. We do our best to provide fair voting, but it is not intended to be exhaustive. So if you notice that a method is missing, feel free to help improve the ranking!


  • 189 votes
  • 10 ranked items

Voting Rules

A participant may cast an up or down vote for each method once every 24 hours. The rank of each method is then calculated from the weighted sum of all up and down votes.


More information on most popular variable selection method

Variable selection is an important step in data analysis that involves choosing a subset of relevant variables from a larger set. The process helps to simplify statistical models, reduce overfitting, and enhance the accuracy of predictions. There are several methods of variable selection, each with its strengths and weaknesses. Some of the most popular techniques include forward selection, backward elimination, lasso regression, ridge regression, and principal component analysis (PCA). Choosing the best method largely depends on the nature and size of the data, the research question, and the desired level of prediction accuracy. In this article, we explore the most popular variable selection methods used by data analysts, and highlight their pros and cons.
