Statistical Analysis - Regression
Class 10
The main objective of this class is to learn how to perform different types of regression models and generate tables and figures of those models.
Prepare
Before starting this class:
📦 Install packages for generating regression tables: sjPlot, correlation, broom, performance, parameters
📦 Install packages for regression analyses: e1071, ggeffects, mediation
📦 Install packages for path analysis: lavaan, semTools, semPlot
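All of the CRAN packages above can be installed in one call, for example:

# Install the CRAN packages used in this class
install.packages(c("sjPlot", "correlation", "broom", "performance",
                   "parameters", "e1071", "ggeffects", "mediation",
                   "lavaan", "semTools", "semPlot"))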
📦 Install my packages from GitHub for generating model tables: first install devtools, then install modeloutput and semoutput from GitHub, as shown below
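In the R console:

# devtools is needed to install packages hosted on GitHub
install.packages("devtools")
devtools::install_github("dr-JT/modeloutput")
devtools::install_github("dr-JT/semoutput")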
Setup an R Project
Create an R Project (if you don’t have one already for this course) with at least a data folder
📁 analyses
📁 data
Setup Quarto Document
To follow along, you should create an empty Quarto document for this class.
YAML
---
title: "Document Title"
author: Your Name
date: today
theme: default
format:
  html:
    code-fold: true
    code-tools: true
    code-link: true
    toc: true
    toc-depth: 1
    toc-location: left
    page-layout: full
    df-print: paged
execute:
  error: true
  warning: true
self-contained: true
editor_options:
  chunk_output_type: console
---
Headers
- Create a level 1 header for a Setup section to load packages and set some theme options:
# Setup
- Create a level 2 header to load packages
## Required Packages
- Create another level 2 header where we can set a ggplot2 theme
## Plot Theme
R Code Chunks
- Add an R code chunk below the Required Packages header and load the following packages:
library(here)
library(readr)
library(dplyr)
library(tidyr)
library(stringr)
library(ggplot2)
library(e1071)
library(sjPlot)
library(correlation)
library(broom)
library(performance)
library(parameters)
library(modeloutput)
library(ggeffects)
# library(mediation)  # we will call mediation::mediate() directly instead of loading it here
library(lavaan)
library(semTools)
library(semPlot)
library(semoutput)
- Add an R code chunk below the Plot Theme header and set your own ggplot2 theme to be used automatically in the rest of the document.

In this example I am using the theme_classic() settings in addition to bolding the title elements. This custom theme will automatically be applied to all ggplot2 objects by using theme_set()
custom_theme <- theme_classic() +
  theme(axis.title.x = element_text(face = "bold"),
        axis.title.y = element_text(face = "bold"),
        legend.title = element_text(face = "bold"),
        plot.title = element_text(face = "bold"))
theme_set(custom_theme)
Example Data Set
Suppose we have a data set in which we are interested in what variables can predict people’s self-reported levels of happiness, measured using valid and reliable questionnaires. We measured a number of other variables we thought might best predict happiness.
Financial wealth (annual income)
Emotional Regulation (higher values indicate better emotional regulation ability)
Social Support Network (higher values indicate a stronger social support network)
Chocolate Consumption (higher values indicate more chocolate consumption on a weekly basis)
Type of Chocolate (the type of chocolate they consume the most of)
You can download the data for this hypothetical experiment and save it in your data folder:
Import Data
- Create a level 1 header for importing and getting the data ready for statistical analysis
# Data
- Create a tabset with a tab to import the data and another tab to get the data ready
You can add multiple tabs easily from the toolbar: Insert -> Tabset…
::: panel-tabset
## Import Data
## Get Data Ready for Models
:::
- Create an R code chunk below the Import Data tab header
data_import <- read_csv(here("data", "happiness_data.csv"))
Specify Factor Levels
Notice that the Type of Chocolate variable is categorical. When dealing with categorical variables for statistical analyses in R, it is usually a good idea to define the order of the categories, as this will by default determine which category is treated as the reference (comparison) group.

Let's set factor levels for Type of Chocolate.

Remember you can use colnames() to get the columns in a data frame and unique() to evaluate the unique values in a column.
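For instance, a quick check on the data_import object created above:

# Inspect the columns and the unique values of the categorical variable
colnames(data_import)
unique(data_import$Type_of_Chocolate)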
- Create an R code chunk below the Get Data Ready for Models tab header
happiness_data <- data_import |>
  mutate(Type_of_Chocolate = factor(Type_of_Chocolate,
                                    levels = c("White", "Milk",
                                               "Dark", "Alcohol")))
From here on out, you can create your own headers and tabsets as you see fit.
Descriptives
It is a good idea to get a look at the data before running your analysis. One way of doing so is to calculate summary statistics (descriptives) and print them out in a table. There are some columns we can't or don't want descriptives for. For instance, there is usually a subject id column in data files, or some variables might be categorical rather than continuous. We can remove those columns using select() from the dplyr package.
happiness_data |>
  select(-Subject, -Type_of_Chocolate) |>
  pivot_longer(everything(),
               names_to = "Variable",
               values_to = "value") |>
  summarise(.by = Variable,
            n = length(which(!is.na(value))),
            Mean = mean(value, na.rm = TRUE),
            SD = sd(value, na.rm = TRUE),
            min = min(value, na.rm = TRUE),
            max = max(value, na.rm = TRUE),
            Skewness = skewness(value, na.rm = TRUE, type = 2),
            Kurtosis = e1071::kurtosis(value, na.rm = TRUE, type = 2),
            '% Missing' = 100 * (length(which(is.na(value))) / n()))
skewness() and kurtosis() are from the e1071 package.
I have an R package, modeloutput, for easily creating nice looking tables for statistical analyses. For a descriptives table we can use descriptives_table().
happiness_data |>
  select(-Subject, -Type_of_Chocolate) |>
  descriptives_table()
Descriptive Statistics

| Variable | n | Mean | SD | min | max | Skewness | Kurtosis | % Missing |
|---|---|---|---|---|---|---|---|---|
| Happiness | 152.00 | 86.47 | 32.71 | 10.00 | 173.04 | 0.08 | −0.22 | 0.00 |
| Financial_Wealth | 152.00 | 422.04 | 142.16 | 10.00 | 760.28 | −0.10 | −0.17 | 0.00 |
| Emotion_Regulation | 152.00 | 57.40 | 19.69 | 5.00 | 120.21 | −0.02 | 0.28 | 0.00 |
| Social_Support | 152.00 | 44.65 | 16.32 | 7.00 | 84.97 | 0.13 | −0.16 | 0.00 |
| Chocolate_Consumption | 152.00 | 10.50 | 3.96 | 0.00 | 20.54 | −0.08 | −0.13 | 0.00 |

Total N = 152
Histograms

Visualizing the actual distribution of values on each variable is a good idea too. This is how outliers or other problematic values can be detected.

The simplest way to do so is to plot a histogram for each variable in the data file using hist().
hist(happiness_data$Happiness)
Alternatively, you can use ggplot2:
ggplot(happiness_data, aes(Happiness)) +
geom_histogram(binwidth = 10, fill = "gray", color = "black")
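To see every continuous variable at once, one option (a sketch, not part of the original code) is to pivot the data long and facet:

# Histograms for all continuous variables in one faceted plot
happiness_data |>
  select(-Subject, -Type_of_Chocolate) |>
  pivot_longer(everything(), names_to = "Variable", values_to = "value") |>
  ggplot(aes(value)) +
  geom_histogram(bins = 20, fill = "gray", color = "black") +
  facet_wrap(~ Variable, scales = "free")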
Correlation Tables
Regression models are based on the co-variation (or correlation) between variables. Therefore, we might want to first evaluate all of the pairwise correlations in the data.
The correlation package has a convenient function for creating a table of correlations that includes confidence intervals, t-values, df, and p.
happiness_data |>
  select(-Subject, -Type_of_Chocolate) |>
  correlation() |>
  select(-CI, -Method)
Or you can use the correlation package to create a heat map:
happiness_data |>
  select(-Subject, -Type_of_Chocolate) |>
  correlation() |>
  summary(redundant = TRUE) |>
  plot()
The sjPlot package has a convenient function for creating a correlation table/matrix:
happiness_data |>
  select(-Subject, -Type_of_Chocolate) |>
  tab_corr(na.deletion = "pairwise", digits = 2, triangle = "lower")
| | Happiness | Financial_Wealth | Emotion_Regulation | Social_Support | Chocolate_Consumption |
|---|---|---|---|---|---|
| Happiness | | | | | |
| Financial_Wealth | 0.22** | | | | |
| Emotion_Regulation | 0.48*** | 0.25** | | | |
| Social_Support | 0.43*** | 0.25** | 0.63*** | | |
| Chocolate_Consumption | 0.34*** | 0.11 | 0.30*** | 0.20* | |

Computed correlation used pearson-method with pairwise-deletion.
Scatterplots
Using the base R plot() function:
plot(happiness_data$Financial_Wealth, happiness_data$Happiness)
Using ggplot2
ggplot(happiness_data, aes(Financial_Wealth, Happiness)) +
geom_point() +
geom_smooth(method = "lm") +
labs(x = "Financial Wealth (annual income)")
Using the correlation package:
plot(cor_test(happiness_data, "Financial_Wealth", "Happiness"))
Regression
Regression models allow us to test more complicated relationships between variables. It is important to match your research question with the correct regression model. But also, thinking about the different types of regression models can help you clarify your research question. We will take a look at how to conduct the following regression models in R:
Simple regression
Multiple regression
Hierarchical regression
Mediation analysis
Moderation analysis
Conducting a regression model in R is fairly simple: all you need to do is specify the regression formula in the lm() function.
Simple Regression
In what is called simple regression, there is only one predictor variable. Simple regression is not really necessary because it is directly equivalent to a correlation. However, it is useful to demonstrate what happens once you add multiple predictors into a model.
The formula in the lm() function takes the form: dv ~ iv.
Model
reg_simple <- lm(Happiness ~ Financial_Wealth, data = happiness_data)
For regression models we can use the summary() call to print model results:
summary(reg_simple)
Call:
lm(formula = Happiness ~ Financial_Wealth, data = happiness_data)
Residuals:
Min 1Q Median 3Q Max
-79.766 -25.047 0.446 22.921 79.219
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 65.41183 8.16488 8.011 2.93e-13 ***
Financial_Wealth 0.04989 0.01834 2.720 0.00729 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 32.04 on 150 degrees of freedom
Multiple R-squared: 0.04701, Adjusted R-squared: 0.04066
F-statistic: 7.4 on 1 and 150 DF, p-value: 0.007293
The broom and performance packages provide an easy way to get model results into a table format.

First, use broom::glance() to get an overall model summary:
glance(reg_simple)
Then, use parameters::model_parameters() to get unstandardized and standardized coefficient tables:
model_parameters(reg_simple)
model_parameters(reg_simple, standardize = "refit")
My modeloutput package provides a way to display regression tables in an output format similar to other statistical software packages like JASP or SPSS.
regression_tables(reg_simple)
Model Summary: Happiness

| Model | R2 | R2 adj. | BIC |
|---|---|---|---|
| H1 | 0.047 | 0.041 | 1498.356 |

H1: Happiness ~ Financial_Wealth; N = 152

ANOVA Table: Happiness

| Model | Term | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|---|
| H1 | Regression | 73472.006 | 2 | 36736.003 | 7.400 | 0.007 |
| H1 | Residual | 153960.370 | 150 | 1026.402 | | |

H1: Happiness ~ Financial_Wealth; N = 152

Regression Coefficients: Happiness

| Model | Term | b | b 95% CI | β | β 95% CI | β SE | t | df | p |
|---|---|---|---|---|---|---|---|---|---|
| H1 | (Intercept) | 65.412 | 49.279 — 81.545 | 0.000 | −0.157 — 0.157 | 0.079 | 8.011 | 150 | <0.001 |
| H1 | Financial_Wealth | 0.050 | 0.014 — 0.086 | 0.217 | 0.059 — 0.374 | 0.080 | 2.720 | 150 | 0.007 |

H1: Happiness ~ Financial_Wealth; N = 152
Multiple Regression
You can and will most likely want to include multiple predictor variables in a regression model. There will be a different interpretation of the beta coefficients compared to the simple regression model. Now, the beta coefficients will represent a unique contribution (controlling for all the other predictor variables) of that variable to the prediction of the dependent variable. We can compare the relative strengths of the beta coefficients to decide which variable(s) are the strongest predictors of the dependent variable.
In this case, we can go ahead and include all the continuous predictor variables in a model predicting Happiness and compare it with the simple regression when we only had Financial Wealth in the model.
Model
reg_multiple <- lm(Happiness ~ Financial_Wealth + Emotion_Regulation +
                     Social_Support + Chocolate_Consumption,
                   data = happiness_data)
For regression models we can use the summary() call to print model results:
summary(reg_multiple)
Call:
lm(formula = Happiness ~ Financial_Wealth + Emotion_Regulation +
Social_Support + Chocolate_Consumption, data = happiness_data)
Residuals:
Min 1Q Median 3Q Max
-75.847 -17.630 -0.047 17.125 66.950
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 16.97620 9.73743 1.743 0.08336 .
Financial_Wealth 0.01778 0.01645 1.081 0.28162
Emotion_Regulation 0.46080 0.15209 3.030 0.00289 **
Social_Support 0.39466 0.17851 2.211 0.02859 *
Chocolate_Consumption 1.70732 0.59612 2.864 0.00480 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 27.62 on 147 degrees of freedom
Multiple R-squared: 0.3061, Adjusted R-squared: 0.2872
F-statistic: 16.21 on 4 and 147 DF, p-value: 5.097e-11
The broom and performance packages provide an easy way to get model results into a table format.

First, use broom::glance() to get an overall model summary:
glance(reg_multiple)
Then, use parameters::model_parameters() to get unstandardized and standardized coefficient tables:
model_parameters(reg_multiple)
model_parameters(reg_multiple, standardize = "refit")
My modeloutput package provides a way to display regression tables in an output format similar to other statistical software packages like JASP or SPSS.
regression_tables(reg_multiple)
Model Summary: Happiness

| Model | R2 | R2 adj. | BIC |
|---|---|---|---|
| H1 | 0.306 | 0.287 | 1465.207 |

H1: Happiness ~ Financial_Wealth + Emotion_Regulation + Social_Support + Chocolate_Consumption; N = 152

ANOVA Table: Happiness

| Model | Term | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|---|
| H1 | Regression | 20192.945 | 5 | 4038.589 | 16.210 | <0.001 |
| H1 | Residual | 112107.036 | 147 | 762.633 | | |

H1: Happiness ~ Financial_Wealth + Emotion_Regulation + Social_Support + Chocolate_Consumption; N = 152

Regression Coefficients: Happiness

| Model | Term | b | b 95% CI | β | β 95% CI | β SE | t | df | p |
|---|---|---|---|---|---|---|---|---|---|
| H1 | (Intercept) | 16.976 | −2.267 — 36.220 | 0.000 | −0.135 — 0.135 | 0.068 | 1.743 | 147 | 0.083 |
| H1 | Financial_Wealth | 0.018 | −0.015 — 0.050 | 0.077 | −0.064 — 0.219 | 0.071 | 1.081 | 147 | 0.282 |
| H1 | Emotion_Regulation | 0.461 | 0.160 — 0.761 | 0.277 | 0.096 — 0.458 | 0.092 | 3.030 | 147 | 0.003 |
| H1 | Social_Support | 0.395 | 0.042 — 0.747 | 0.197 | 0.021 — 0.373 | 0.089 | 2.211 | 147 | 0.029 |
| H1 | Chocolate_Consumption | 1.707 | 0.529 — 2.885 | 0.207 | 0.064 — 0.349 | 0.072 | 2.864 | 147 | 0.005 |

H1: Happiness ~ Financial_Wealth + Emotion_Regulation + Social_Support + Chocolate_Consumption; N = 152
Hierarchical Regression
Hierarchical regression is a multiple regression approach where you compare multiple models with each other. We can compare the simple regression model we conducted above, with only Financial_Wealth as a predictor of Happiness, to the multiple regression model in which we added the other predictors.
We can use performance::compare_performance() to do model comparisons (hierarchical regression):
compare_performance(reg_simple, reg_multiple)
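Since the two models are nested, base R's anova() also gives an F-test of the change in fit (a quick alternative, not shown in the original output):

# F-test comparing the nested models
anova(reg_simple, reg_multiple)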
In modeloutput::regression_tables() you can add up to three models to compare:
regression_tables(reg_simple, reg_multiple)
Model Summary: Happiness

| Model | R2 | R2 adj. | ΔR2 | ΔF | df1 | df2 | p | BIC | BF | P(Model\|Data) |
|---|---|---|---|---|---|---|---|---|---|---|
| H1 | 0.047 | 0.041 | | | | | | 1498.356 | | |
| H2 | 0.306 | 0.287 | 0.259 | 18.293 | 3 | 147 | <0.001 | 1465.207 | 1.58 × 10^7 | 1.000 |

H1: Happiness ~ Financial_Wealth; N = 152
H2: Happiness ~ Financial_Wealth + Emotion_Regulation + Social_Support + Chocolate_Consumption; N = 152

ANOVA Table: Happiness

| Model | Term | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|---|
| H1 | Regression | 73472.006 | 2 | 36736.003 | 7.400 | 0.007 |
| H1 | Residual | 153960.370 | 150 | 1026.402 | | |
| H2 | Regression | 20192.945 | 5 | 4038.589 | 16.210 | <0.001 |
| H2 | Residual | 112107.036 | 147 | 762.633 | | |

H1: Happiness ~ Financial_Wealth; N = 152
H2: Happiness ~ Financial_Wealth + Emotion_Regulation + Social_Support + Chocolate_Consumption; N = 152

Regression Coefficients: Happiness

| Model | Term | b | b 95% CI | β | β 95% CI | β SE | t | df | p |
|---|---|---|---|---|---|---|---|---|---|
| H1 | (Intercept) | 65.412 | 49.279 — 81.545 | 0.000 | −0.157 — 0.157 | 0.079 | 8.011 | 150 | <0.001 |
| H1 | Financial_Wealth | 0.050 | 0.014 — 0.086 | 0.217 | 0.059 — 0.374 | 0.080 | 2.720 | 150 | 0.007 |
| H2 | (Intercept) | 16.976 | −2.267 — 36.220 | 0.000 | −0.135 — 0.135 | 0.068 | 1.743 | 147 | 0.083 |
| H2 | Financial_Wealth | 0.018 | −0.015 — 0.050 | 0.077 | −0.064 — 0.219 | 0.071 | 1.081 | 147 | 0.282 |
| H2 | Emotion_Regulation | 0.461 | 0.160 — 0.761 | 0.277 | 0.096 — 0.458 | 0.092 | 3.030 | 147 | 0.003 |
| H2 | Social_Support | 0.395 | 0.042 — 0.747 | 0.197 | 0.021 — 0.373 | 0.089 | 2.211 | 147 | 0.029 |
| H2 | Chocolate_Consumption | 1.707 | 0.529 — 2.885 | 0.207 | 0.064 — 0.349 | 0.072 | 2.864 | 147 | 0.005 |

H1: Happiness ~ Financial_Wealth; N = 152
H2: Happiness ~ Financial_Wealth + Emotion_Regulation + Social_Support + Chocolate_Consumption; N = 152
Mediation
Mediation analysis allows us to test whether the effect of a predictor variable on the dependent variable "goes through" a mediator variable. The effect that "goes through" the mediator is called the indirect effect, whereas any effect of the predictor variable that does not "go through" the mediator is called the direct effect, where:
Total Effect = Direct Effect + Indirect Effect
We are parsing the variance explained by the predictor variable into direct and indirect effects. Here are some possibilities for the outcome:
Full Mediation: The total effect of the predictor variable can be fully explained by the indirect effect. The mediator fully explains the effect of the predictor variable on the dependent variable. Direct effect = 0, indirect effect != 0.
Partial Mediation: The total effect of the predictor variable can be partially explained by the indirect effect. The mediator only partially explains the effect of the predictor variable on the dependent variable. Direct effect != 0, indirect effect != 0.
No Mediation: The total effect of the predictor variable cannot be explained by the indirect effect; there is no indirect effect at all. Direct effect != 0, indirect effect = 0.
There are two general ways to perform a mediation analysis in R.
The standard regression method
Path analysis
Regression
The regression method requires multiple regression models because we have two dependent variables in the model: the mediator and the outcome measure. The mediator acts as both an independent variable and a dependent variable in the model. Because regression requires only one dependent variable, we cannot test the full model in one regression model. Instead, we have to test the different pieces of the model one at a time. We also need to perform a test on the indirect effect itself, which will require a separate model.
In order to evaluate the standardized coefficients (which is often the most useful), we need to first convert the variables into z-scores. This is fairly straightforward but does require some extra code. A z-score is a standardized score on the scale of standard deviation units and is calculated as:

z-score = (raw score - mean) / sd
We can create new columns representing the z-scores using mutate() from the dplyr package.
data_mediation <- happiness_data %>%
  mutate(Happiness_z =
           (Happiness - mean(Happiness)) / sd(Happiness),
         Social_Support_z =
           (Social_Support - mean(Social_Support)) / sd(Social_Support),
         Emotion_Regulation_z =
           (Emotion_Regulation - mean(Emotion_Regulation)) / sd(Emotion_Regulation))
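Equivalently, and just as a sketch of an alternative, dplyr's across() with the built-in scale() function standardizes several columns at once:

# Alternative: z-score several columns in one step;
# as.numeric() drops the matrix attributes that scale() returns
data_mediation <- happiness_data |>
  mutate(across(c(Happiness, Social_Support, Emotion_Regulation),
                ~ as.numeric(scale(.x)),
                .names = "{.col}_z"))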
The standard regression model typically involves three regression models and a model testing the indirect effect.
- A model with the total effect:
dv ~ iv
reg_total <- lm(Happiness_z ~ Social_Support_z,
                data = data_mediation)
- A model with the effect of the predictor on the mediator:
m ~ iv
reg_med <- lm(Emotion_Regulation_z ~ Social_Support_z,
              data = data_mediation)
- A model with the direct effect and the effect of the mediator:
dv ~ iv + m
reg_direct <- lm(Happiness_z ~ Emotion_Regulation_z + Social_Support_z,
                 data = data_mediation)
- A statistical test of the indirect effect:
mediation::mediate()
model_mediation <- mediation::mediate(reg_med, reg_direct,
                                      treat = "Social_Support_z",
                                      mediator = "Emotion_Regulation_z",
                                      boot = TRUE, sims = 1000,
                                      boot.ci.type = "bca")
1. DV ~ IV
glance(reg_total)
model_parameters(reg_total, standardize = "refit")
2. M ~ IV
glance(reg_med)
model_parameters(reg_med, standardize = "refit")
3. DV ~ IV + M
glance(reg_direct)
model_parameters(reg_direct, standardize = "refit")
4. Mediation
model_parameters(model_mediation)
Regression Tables
regression_tables(reg_total, reg_direct)
Model Summary: Happiness_z

| Model | R2 | R2 adj. | ΔR2 | ΔF | df1 | df2 | p | BIC | BF | P(Model\|Data) |
|---|---|---|---|---|---|---|---|---|---|---|
| H1 | 0.187 | 0.181 | | | | | | 413.995 | | |
| H2 | 0.261 | 0.251 | 0.074 | 14.944 | 1 | 149 | <0.001 | 404.491 | 1.16 × 10^2 | 0.991 |

H1: Happiness_z ~ Social_Support_z; N = 152
H2: Happiness_z ~ Emotion_Regulation_z + Social_Support_z; N = 152

ANOVA Table: Happiness_z

| Model | Term | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|---|
| H1 | Regression | 28.207 | 2 | 14.103 | 34.456 | <0.001 |
| H1 | Residual | 122.793 | 150 | 0.819 | | |
| H2 | Regression | 15.276 | 3 | 5.092 | 26.302 | <0.001 |
| H2 | Residual | 111.600 | 149 | 0.749 | | |

H1: Happiness_z ~ Social_Support_z; N = 152
H2: Happiness_z ~ Emotion_Regulation_z + Social_Support_z; N = 152

Regression Coefficients: Happiness_z

| Model | Term | b | b 95% CI | β | β 95% CI | β SE | t | df | p |
|---|---|---|---|---|---|---|---|---|---|
| H1 | (Intercept) | 0.000 | −0.145 — 0.145 | 0.000 | −0.145 — 0.145 | 0.073 | 0.000 | 150 | 1.000 |
| H1 | Social_Support_z | 0.432 | 0.287 — 0.578 | 0.432 | 0.287 — 0.578 | 0.074 | 5.870 | 150 | <0.001 |
| H2 | (Intercept) | 0.000 | −0.139 — 0.139 | 0.000 | −0.139 — 0.139 | 0.070 | 0.000 | 149 | 1.000 |
| H2 | Emotion_Regulation_z | 0.350 | 0.171 — 0.530 | 0.350 | 0.171 — 0.530 | 0.091 | 3.866 | 149 | <0.001 |
| H2 | Social_Support_z | 0.212 | 0.033 — 0.391 | 0.212 | 0.033 — 0.391 | 0.091 | 2.335 | 149 | 0.021 |

H1: Happiness_z ~ Social_Support_z; N = 152
H2: Happiness_z ~ Emotion_Regulation_z + Social_Support_z; N = 152
regression_tables(reg_med)
Model Summary: Emotion_Regulation_z

| Model | R2 | R2 adj. | BIC |
|---|---|---|---|
| H1 | 0.396 | 0.392 | 368.722 |

H1: Emotion_Regulation_z ~ Social_Support_z; N = 152

ANOVA Table: Emotion_Regulation_z

| Model | Term | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|---|
| H1 | Regression | 59.837 | 2 | 29.918 | 98.456 | <0.001 |
| H1 | Residual | 91.163 | 150 | 0.608 | | |

H1: Emotion_Regulation_z ~ Social_Support_z; N = 152

Regression Coefficients: Emotion_Regulation_z

| Model | Term | b | b 95% CI | β | β 95% CI | β SE | t | df | p |
|---|---|---|---|---|---|---|---|---|---|
| H1 | (Intercept) | 0.000 | −0.125 — 0.125 | 0.000 | −0.125 — 0.125 | 0.063 | 0.000 | 150 | 1.000 |
| H1 | Social_Support_z | 0.630 | 0.504 — 0.755 | 0.630 | 0.504 — 0.755 | 0.063 | 9.922 | 150 | <0.001 |

H1: Emotion_Regulation_z ~ Social_Support_z; N = 152
Mediation
model_parameters(model_mediation)
Path Analysis
There is a statistical modelling approach called path analysis that does allow for multiple dependent variables in a single model. Here is how to specify a path analysis model using the lavaan package.
See the lavaan website for how to conduct mediation in lavaan
Model
<- "
model # a path
Emotion_Regulation ~ a*Social_Support
# b path
Happiness ~ b*Emotion_Regulation
# c prime path
Happiness ~ c*Social_Support
# indirect and total effects
indirect := a*b
total := c + a*b
"
<- sem(model, data = happiness_data) fit
summary(fit, standardized = TRUE)
lavaan 0.6.16 ended normally after 1 iteration
Estimator ML
Optimization method NLMINB
Number of model parameters 5
Number of observations 152
Model Test User Model:
Test statistic 0.000
Degrees of freedom 0
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Emotion_Regulation ~
Socl_Spprt (a) 0.760 0.076 9.988 0.000 0.760 0.629
Happiness ~
Emtn_Rgltn (b) 0.582 0.149 3.905 0.000 0.582 0.350
Socl_Spprt (c) 0.424 0.180 2.358 0.018 0.424 0.212
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Emotion_Regltn 232.557 26.676 8.718 0.000 232.557 0.604
.Happiness 785.537 90.107 8.718 0.000 785.537 0.739
Defined Parameters:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
indirect 0.442 0.122 3.637 0.000 0.442 0.221
total 0.866 0.147 5.909 0.000 0.866 0.432
I also have a package, semoutput, for displaying results from lavaan models:
sem_tables(fit)
Model Significance

| N | χ2 | df | p |
|---|---|---|---|
| 152 | 0.000 | 0 | |

Model Fit

| CFI | RMSEA | RMSEA 90% CI | TLI | SRMR | AIC | BIC |
|---|---|---|---|---|---|---|
| 1.000 | 0.000 | 0.000 — 0.000 | 1.000 | 0.000 | 2714.271 | 2729.391 |

Regression Paths

| Predictor | DV | β | β 95% CI | sig | SE | z | p |
|---|---|---|---|---|---|---|---|
| Social_Support | Emotion_Regulation | 0.629 | 0.544 — 0.715 | *** | 0.044 | 14.356 | 0.000 |
| Emotion_Regulation | Happiness | 0.350 | 0.181 — 0.520 | *** | 0.086 | 4.057 | <0.001 |
| Social_Support | Happiness | 0.212 | 0.039 — 0.384 | * | 0.088 | 2.402 | 0.016 |
| a*b | indirect | 0.221 | 0.109 — 0.333 | *** | 0.057 | 3.859 | <0.001 |
| c+a*b | total | 0.432 | 0.309 — 0.555 | *** | 0.063 | 6.882 | <0.001 |

* p < .05; ** p < .01; *** p < .001
size <- .65
semPaths(fit, whatLabels = "std", layout = "tree2",
         rotation = 2, style = "lisrel", optimizeLatRes = TRUE,
         structural = FALSE, layoutSplit = FALSE,
         intercepts = FALSE, residuals = FALSE,
         curve = 1, curvature = 3, nCharNodes = 8,
         sizeLat = 11 * size, sizeMan = 11 * size, sizeMan2 = 4 * size,
         edge.label.cex = 1.2 * size,
         edge.color = "#000000", edge.label.position = .40)
It is highly advised to calculate bootstrapped and bias-corrected confidence intervals around the indirect effect:
monteCarloCI(fit, nRep = 1000, standardized = TRUE)
Moderation
Another type of multiple regression model is the moderation analysis. The concept of moderation in regression is analogous to the concept of an interaction in ANOVA. The idea here is that the relationship (correlation or slope) between two variables changes as a function of another variable, called the moderator. As the values on the moderator variable increase, the correlation or slope might increase or decrease. The moderator can be either continuous or categorical.
In our happiness data, we can test whether the relationship between chocolate consumption and happiness is moderated by the type of chocolate consumed. In this case, type of chocolate is a categorical variable: White, milk, dark, or alcohol.
It is highly advisable to first "center" the predictor and moderator variables. Centering refers to centering the scores around the mean (making the mean = 0). This can simply be done by subtracting the mean from each score. We can do this with mutate() from the dplyr package.

Since type of chocolate is categorical, we do not need to center it.
data_mod <- happiness_data %>%
  mutate(Chocolate_Consumed_c =
           Chocolate_Consumption - mean(Chocolate_Consumption))
Model
model_null <- lm(Happiness ~ Chocolate_Consumed_c + Type_of_Chocolate,
                 data = data_mod)

model_moderation <- lm(Happiness ~ Chocolate_Consumed_c * Type_of_Chocolate,
                       data = data_mod)
summary(model_null)
Call:
lm(formula = Happiness ~ Chocolate_Consumed_c + Type_of_Chocolate,
data = data_mod)
Residuals:
Min 1Q Median 3Q Max
-81.628 -17.038 -0.842 20.550 72.419
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 76.4680 4.8883 15.643 < 2e-16 ***
Chocolate_Consumed_c 2.8006 0.6192 4.523 1.25e-05 ***
Type_of_ChocolateMilk 5.0000 6.9132 0.723 0.4707
Type_of_ChocolateDark 15.0000 6.9132 2.170 0.0316 *
Type_of_ChocolateAlcohol 20.0000 6.9132 2.893 0.0044 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 30.13 on 147 degrees of freedom
Multiple R-squared: 0.1738, Adjusted R-squared: 0.1513
F-statistic: 7.729 on 4 and 147 DF, p-value: 1.111e-05
summary(model_moderation)
Call:
lm(formula = Happiness ~ Chocolate_Consumed_c * Type_of_Chocolate,
data = data_mod)
Residuals:
Min 1Q Median 3Q Max
-68.72 -18.19 -2.37 20.53 74.22
Coefficients:
Estimate Std. Error t value
(Intercept) 76.4680 4.7887 15.968
Chocolate_Consumed_c 0.4167 1.2132 0.343
Type_of_ChocolateMilk 5.0000 6.7722 0.738
Type_of_ChocolateDark 15.0000 6.7722 2.215
Type_of_ChocolateAlcohol 20.0000 6.7722 2.953
Chocolate_Consumed_c:Type_of_ChocolateMilk 1.2821 1.7158 0.747
Chocolate_Consumed_c:Type_of_ChocolateDark 3.6060 1.7158 2.102
Chocolate_Consumed_c:Type_of_ChocolateAlcohol 4.6477 1.7158 2.709
Pr(>|t|)
(Intercept) < 2e-16 ***
Chocolate_Consumed_c 0.73176
Type_of_ChocolateMilk 0.46153
Type_of_ChocolateDark 0.02834 *
Type_of_ChocolateAlcohol 0.00367 **
Chocolate_Consumed_c:Type_of_ChocolateMilk 0.45613
Chocolate_Consumed_c:Type_of_ChocolateDark 0.03733 *
Chocolate_Consumed_c:Type_of_ChocolateAlcohol 0.00757 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 29.52 on 144 degrees of freedom
Multiple R-squared: 0.2233, Adjusted R-squared: 0.1855
F-statistic: 5.914 on 7 and 144 DF, p-value: 4.684e-06
Null Model
glance(model_null)
model_parameters(model_null, standardize = "refit")
Interaction
glance(model_moderation)
model_parameters(model_moderation, standardize = "refit")
regression_tables(model_null, model_moderation)
Model Summary: Happiness

| Model | R2 | R2 adj. | ΔR2 | ΔF | df1 | df2 | p | BIC | BF | P(Model\|Data) |
|---|---|---|---|---|---|---|---|---|---|---|
| H1 | 0.174 | 0.151 | | | | | | 1491.733 | | |
| H2 | 0.223 | 0.186 | 0.050 | 3.061 | 3 | 144 | 0.030 | 1497.409 | 5.85 × 10^−2 | 0.055 |

H1: Happiness ~ Chocolate_Consumed_c + Type_of_Chocolate; N = 152
H2: Happiness ~ Chocolate_Consumed_c * Type_of_Chocolate; N = 152

ANOVA Table: Happiness

| Model | Term | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|---|
| H1 | Regression | 250273.269 | 5 | 50054.654 | 7.729 | <0.001 |
| H1 | Residual | 133482.237 | 147 | 908.042 | | |
| H2 | Regression | 239803.383 | 8 | 29975.423 | 5.914 | <0.001 |
| H2 | Residual | 125481.246 | 144 | 871.398 | | |

H1: Happiness ~ Chocolate_Consumed_c + Type_of_Chocolate; N = 152
H2: Happiness ~ Chocolate_Consumed_c * Type_of_Chocolate; N = 152

Regression Coefficients: Happiness

| Model | Term | b | b 95% CI | β | β 95% CI | β SE | t | df | p |
|---|---|---|---|---|---|---|---|---|---|
| H1 | (Intercept) | 76.468 | 66.808 — 86.129 | −0.306 | −0.601 — −0.010 | 0.149 | 15.643 | 147 | <0.001 |
| H1 | Chocolate_Consumed_c | 2.801 | 1.577 — 4.024 | 0.339 | 0.191 — 0.487 | 0.075 | 4.523 | 147 | <0.001 |
| H1 | Type_of_ChocolateMilk | 5.000 | −8.662 — 18.662 | 0.153 | −0.265 — 0.571 | 0.211 | 0.723 | 147 | 0.471 |
| H1 | Type_of_ChocolateDark | 15.000 | 1.338 — 28.662 | 0.459 | 0.041 — 0.876 | 0.211 | 2.170 | 147 | 0.032 |
| H1 | Type_of_ChocolateAlcohol | 20.000 | 6.338 — 33.662 | 0.611 | 0.194 — 1.029 | 0.211 | 2.893 | 147 | 0.004 |
| H2 | (Intercept) | 76.468 | 67.003 — 85.933 | −0.306 | −0.595 — −0.016 | 0.146 | 15.968 | 144 | <0.001 |
| H2 | Chocolate_Consumed_c | 0.417 | −1.981 — 2.815 | 0.050 | −0.240 — 0.341 | 0.147 | 0.343 | 144 | 0.732 |
| H2 | Type_of_ChocolateMilk | 5.000 | −8.386 — 18.386 | 0.153 | −0.256 — 0.562 | 0.207 | 0.738 | 144 | 0.462 |
| H2 | Type_of_ChocolateDark | 15.000 | 1.614 — 28.386 | 0.459 | 0.049 — 0.868 | 0.207 | 2.215 | 144 | 0.028 |
| H2 | Type_of_ChocolateAlcohol | 20.000 | 6.614 — 33.386 | 0.611 | 0.202 — 1.021 | 0.207 | 2.953 | 144 | 0.004 |
| H2 | Chocolate_Consumed_c:Type_of_ChocolateMilk | 1.282 | −2.109 — 4.673 | 0.155 | −0.255 — 0.566 | 0.208 | 0.747 | 144 | 0.456 |
| H2 | Chocolate_Consumed_c:Type_of_ChocolateDark | 3.606 | 0.215 — 6.997 | 0.437 | 0.026 — 0.847 | 0.208 | 2.102 | 144 | 0.037 |
| H2 | Chocolate_Consumed_c:Type_of_ChocolateAlcohol | 4.648 | 1.256 — 8.039 | 0.563 | 0.152 — 0.973 | 0.208 | 2.709 | 144 | 0.008 |

H1: Happiness ~ Chocolate_Consumed_c + Type_of_Chocolate; N = 152
H2: Happiness ~ Chocolate_Consumed_c * Type_of_Chocolate; N = 152
To better understand the nature of a moderation, it helps to plot the results.
The easiest way to plot the results of a moderation analysis is using the plot_model() function from the sjPlot package. Note that it is a bit more complicated to plot a moderation analysis when the moderator is a continuous variable instead of categorical, but the sjPlot package actually makes it a bit easier.
plot_model(model_moderation, type = "int")
The most customizable way to plot model results is using the ggeffects and ggplot2 packages. We can use ggeffect() to get a data frame with estimated model values for specified terms in the model and the dependent variable.
For continuous variables, we can customize at what values we want to plot. For "all" possible values from the original data we can specify [all]. If your variable was standardized before entering it into the model, you can use [2, 0, -2] to get values at 2 standard deviations above the mean, at the mean, and 2 standard deviations below the mean.
data_plot <- ggeffect(model_moderation,
                      terms = c("Chocolate_Consumed_c [all]",
                                "Type_of_Chocolate")) |>
  rename(Chocolate_Consumed_c = x, Type_of_Chocolate = group,
         Happiness = predicted)
ggplot(data_plot, aes(Chocolate_Consumed_c, Happiness,
                      color = Type_of_Chocolate)) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high,
                  fill = Type_of_Chocolate),
              show.legend = FALSE, alpha = .1, color = NA) +
  geom_smooth(method = "lm", alpha = 0) +
  scale_color_brewer(palette = "Set1",
                     guide = guide_legend(title = "Type of Chocolate")) +
  labs(x = "Amount of Chocolate Consumed")