# Predictive Power Assessment using PLSPredict in SmartPLS3

### Learn to assess Predictive Power in PLS-SEM

The Concept and Process of Predictive Power Assessment using PLSPredict in SmartPLS3

## What is Predictive Power?

- Many researchers interpret the R-Sq statistic as a measure of their model’s predictive power (Sarstedt & Danks, 2021; Shmueli & Koppius, 2011). This interpretation is not entirely correct, however, since the R-Sq only indicates the model’s in-sample explanatory power.
- “In-sample” refers to the data that you have, and “out-of-sample” to the **data you don’t have** but want to forecast or estimate.
- The R-Sq says nothing about the model’s **predictive power** (Chin et al., 2020; Hair & Sarstedt, 2021), also referred to as **out-of-sample predictive power**, which indicates a model’s ability to predict new or future observations.
- The question, then, is: does the model have good predictive quality?
- Addressing this concern, Shmueli, Ray, Estrada, and Chatla (2016) introduced **PLSpredict**, a procedure for out-of-sample prediction.
- Executing PLSpredict involves estimating the model on a **training sample** and evaluating its predictive performance on a **holdout sample** (Shmueli et al., 2019).
- Note that the holdout sample is separated from the total sample before executing the initial analysis on the training sample data, so it includes data that were not used in the model estimation.
- PLSpredict repeats this split across several folds (k-fold cross-validation). Researchers need to make sure that the training sample for each fold meets minimum sample size guidelines (e.g., by following the inverse square root method).
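
The fold-size check in the last point can be sketched in a few lines of Python. This is only an illustration, not part of SmartPLS: the function names are hypothetical, and the constant 2.486 assumes the commonly cited form of the inverse square root method (Kock & Hadaya, 2018) at a 5% significance level and 80% power, where the minimum sample size must exceed (2.486 / |p_min|)², with p_min being the smallest path coefficient you expect to be significant.

```python
import math

# Assumption: inverse square root method at 5% significance and 80% power,
# n_min > (2.486 / |p_min|)^2 (Kock & Hadaya, 2018).
INVERSE_SQRT_CONSTANT_5PCT = 2.486


def min_sample_size(min_path_coefficient: float) -> int:
    """Smallest integer n that satisfies n > (2.486 / |p_min|)^2."""
    threshold = (INVERSE_SQRT_CONSTANT_5PCT / abs(min_path_coefficient)) ** 2
    return math.floor(threshold) + 1


def smallest_training_size(n_total: int, k_folds: int) -> int:
    """Training-sample size of the fold that holds out the most cases."""
    largest_holdout = math.ceil(n_total / k_folds)
    return n_total - largest_holdout


def folds_meet_guideline(n_total: int, k_folds: int,
                         min_path_coefficient: float) -> bool:
    """True if even the smallest training sample reaches the minimum size."""
    required = min_sample_size(min_path_coefficient)
    return smallest_training_size(n_total, k_folds) >= required


if __name__ == "__main__":
    # Hypothetical study: 200 cases, 10 folds, smallest expected path 0.20.
    print(min_sample_size(0.20))
    print(folds_meet_guideline(200, 10, 0.20))
```

If the check fails, reducing the holdout share (more folds) or collecting more data are the obvious options; which one is appropriate depends on the study.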

## How to Assess Predictive Power?

- To assess a model’s predictive power, researchers can draw on several **prediction statistics** that quantify the amount of **prediction error** in the indicators of a particular endogenous construct.
- “Error” here does not mean a mistake. It is a residual: the difference between the actual values and the predicted values. The lower it is, the better.
- The most popular metric to quantify the degree of prediction error is the **root-mean-square error (RMSE)**.
- Another popular metric is the **mean absolute error (MAE)**.
- In most instances, researchers should use the RMSE to examine a model’s predictive power.
- But if the prediction error distribution is highly nonsymmetric, as evidenced by a long left or right tail in the distribution of prediction errors (Danks & Ray, 2018), the MAE is the more appropriate prediction statistic (Shmueli et al., 2019).
- To interpret these metrics, researchers need to compare each indicator’s RMSE (or MAE) values with a naïve **linear regression model (LM) benchmark**.
- The LM benchmark values are obtained by running a linear regression of each of the dependent construct’s indicators on the indicators of the exogenous constructs in the PLS path model (Danks & Ray, 2018). In comparing the RMSE (or MAE) values with the LM values, the following guidelines apply (Shmueli et al., 2019):
  - If *all* indicators in the PLS-SEM analysis have lower RMSE (or MAE) values than the naïve LM benchmark, the model has high predictive power.
  - If the *majority* (or the same number) of indicators in the PLS-SEM analysis yields smaller prediction errors than the LM, this indicates medium predictive power.
  - If a *minority* of the dependent construct’s indicators produce lower PLS-SEM prediction errors than the naïve LM benchmark, this indicates the model has low predictive power.
  - If the PLS-SEM analysis yields lower prediction errors than the LM, in terms of the RMSE (or the MAE), for *none* of the indicators, this indicates the model lacks predictive power.
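
To make the comparison concrete, here is a minimal Python sketch of the two error metrics and of the guidelines from Shmueli et al. (2019). SmartPLS reports the RMSE and MAE values for you, so this is only meant to show the logic. The function names, the indicator names (`y1` to `y3`), and all numbers are made up for illustration.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error: square root of the mean squared residual."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return sum(abs(r) for r in residuals) / len(residuals)


def classify_predictive_power(pls_errors, lm_errors):
    """Compare per-indicator errors (all RMSE or all MAE) of PLS-SEM and LM.

    Both arguments are dicts mapping an indicator name to its error value.
    """
    if not pls_errors:
        raise ValueError("at least one indicator is required")
    n_indicators = len(pls_errors)
    # Count the indicators where PLS-SEM has the lower prediction error.
    pls_wins = sum(1 for name, err in pls_errors.items()
                   if err < lm_errors[name])
    if pls_wins == n_indicators:
        return "high"
    if pls_wins == 0:
        return "none"
    if 2 * pls_wins >= n_indicators:  # majority, or the same number
        return "medium"
    return "low"


if __name__ == "__main__":
    # Hypothetical actual and predicted values for one indicator.
    actual = [3, 4, 5, 2]
    predicted = [2.5, 4.5, 4.0, 2.0]
    print(rmse(actual, predicted), mae(actual, predicted))

    # Hypothetical RMSE values for three indicators of one construct.
    pls = {"y1": 0.9, "y2": 1.1, "y3": 0.8}
    lm = {"y1": 1.0, "y2": 1.0, "y3": 0.9}
    print(classify_predictive_power(pls, lm))
```

In this made-up example PLS-SEM beats the LM benchmark on two of the three indicators, so the function returns the medium category.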


## Additional SmartPLS Resources

- Analyzing Formative-Formative Higher-Order Construct in SmartPLS
- Categorical Predictor Variable using SMART-PLS
- Complex Higher-Order Model using SmartPLS
- Concept of Higher-Order Constructs in PLS-SEM
- How to Solve Convergent and Discriminant Validity Issues
- How to Start Data Analysis using SMART-PLS
- How to Structure, Format, and Report SMART PLS-SEM Results
- Mediation Analysis, Interpretation, and Reporting using SMART-PLS
- Moderation Analysis with Categorical Variables using SMART-PLS
- Moderation Analysis, Interpretation, and Reporting using SMART-PLS
- Reflective Vs Formative Indicators: The Concept and Differences
- Reflective-Formative Higher-Order Construct using SMART-PLS
- Reflective-Reflective Higher-Order Construct using SMART-PLS
- Reporting Measurement and Structural Model in SMART-PLS
- Understanding Convergent and Discriminant Validity using SMART-PLS
- Understanding R Square, F Square, and Q Square using SMART-PLS
- Validating Formative Indicators using SMART-PLS