This is the central function of the package. It runs the specification curve analysis. It takes a data frame and vectors of analytical choices: the dependent variable(s), the independent variable(s), the type(s) of model to estimate, and the covariates to include (none, each individually, and all together), as well as a named list of potential subsets. It returns a tidy tibble with the relevant model parameters for each specification. The parameters are extracted with tidy(); exactly what tidy() considers to be a model component varies across models but is usually self-evident.
run_specs(
  df,
  x,
  y,
  model = "lm",
  controls = NULL,
  subsets = NULL,
  all.comb = FALSE,
  conf.level = 0.95,
  keep.results = FALSE
)
df: a data frame that includes all relevant variables.
x: a vector denoting the independent variables.
y: a vector denoting the dependent variables.
model: a vector denoting the model(s) that should be estimated. Defaults to "lm".
controls: a vector denoting which control variables should be included. Defaults to NULL.
subsets: a named list that includes potential subsets that should be evaluated (see examples). Defaults to NULL.
all.comb: a logical value indicating which combinations of the control variables should be specified. Defaults to FALSE (i.e., none, all, and each individually). If set to TRUE, all possible combinations of the control variables are specified (see examples).
conf.level: the confidence level to use for the confidence interval. Must be strictly greater than 0 and less than 1. Defaults to .95, which corresponds to a 95 percent confidence interval.
keep.results: a logical value indicating whether the complete model object should be kept. Defaults to FALSE.
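The difference between the two all.comb settings can be sketched in base R. This is only an illustration of the combinatorics described above, not the package's internal code: `control_sets()` is a hypothetical helper, and the "no covariates" label is a placeholder for the empty control set.

```r
# Hypothetical helper (not part of the package): list the control-variable
# combinations implied by `controls` and `all.comb`.
control_sets <- function(controls, all.comb = FALSE) {
  if (length(controls) == 0) {
    return("no covariates")
  }
  if (all.comb) {
    # every non-empty subset of the controls, from size 1 up to all of them
    combos <- unlist(
      lapply(seq_along(controls), function(k) {
        combn(controls, k, FUN = paste, collapse = " + ")
      })
    )
  } else {
    # each control on its own, plus all controls together
    combos <- unique(c(controls, paste(controls, collapse = " + ")))
  }
  c("no covariates", combos)
}

# Default: none, each individually, and all together
control_sets(c("c1", "c2", "c3"))

# all.comb = TRUE: additionally every pairwise combination
control_sets(c("c1", "c2", "c3"), all.comb = TRUE)
```

With only two controls, both settings produce the same four combinations, so all.comb only matters for three or more control variables.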
a tibble that includes all specifications and a tidy summary of model components.
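To give a sense of what a tidy summary of one specification contains, the sketch below rebuilds the core columns (estimate, std.error, statistic, p.value, conf.low, conf.high) for a single linear model using base R only. It is an approximation under stated assumptions: the data are simulated, `tidy_like` is a hypothetical name, and retaining only the row for the independent variable is inferred from the one-row-per-specification output rather than taken from the package source.

```r
# Simulated data, for illustration only
set.seed(1)
d <- data.frame(x1 = rnorm(100), c1 = rnorm(100))
d$y1 <- 0.5 * d$x1 + 0.2 * d$c1 + rnorm(100)

# One specification: y1 regressed on x1, controlling for c1
fit <- lm(y1 ~ x1 + c1, data = d)
conf.level <- 0.95

coefs <- coef(summary(fit))             # estimate, SE, t value, p value per term
ci <- confint(fit, level = conf.level)  # interval bounds at the chosen level

tidy_like <- data.frame(
  term      = rownames(coefs),
  estimate  = coefs[, "Estimate"],
  std.error = coefs[, "Std. Error"],
  statistic = coefs[, "t value"],
  p.value   = coefs[, "Pr(>|t|)"],
  conf.low  = ci[, 1],
  conf.high = ci[, 2],
  row.names = NULL
)

# ASSUMPTION: only the row for the independent variable is kept
tidy_like[tidy_like$term == "x1", ]
```

Changing conf.level widens or narrows conf.low and conf.high but leaves the estimate, standard error, and p-value untouched.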
Simonsohn, U., Simmons, J. P., & Nelson, L. D. (2019). Specification Curve: Descriptive and Inferential Statistics for all Plausible Specifications. Available at: https://doi.org/10.2139/ssrn.2694998
Steegen, S., Tuerlinckx, F., Gelman, A., & Vanpaemel, W. (2016). Increasing Transparency Through a Multiverse Analysis. Perspectives on Psychological Science, 11(5), 702-712. https://doi.org/10.1177/1745691616658637
plot_specs() to visualize the results of the specification curve analysis.
# run specification curve analysis
results <- run_specs(df = example_data,
                     y = c("y1", "y2"),
                     x = c("x1", "x2"),
                     model = c("lm"),
                     controls = c("c1", "c2"),
                     subsets = list(group1 = unique(example_data$group1),
                                    group2 = unique(example_data$group2)))

# Check results frame
results
#> # A tibble: 192 x 23
#>    x     y     model controls estimate std.error statistic  p.value conf.low
#>    <chr> <chr> <chr> <chr>       <dbl>     <dbl>     <dbl>    <dbl>    <dbl>
#>  1 x1    y1    lm    c1 + c2    4.95       0.525    9.43   3.11e-18    3.92
#>  2 x2    y1    lm    c1 + c2    6.83       0.321   21.3    1.20e-57    6.20
#>  3 x1    y2    lm    c1 + c2   -0.227      0.373   -0.607  5.44e- 1   -0.961
#>  4 x2    y2    lm    c1 + c2    0.985      0.324    3.04   2.62e- 3    0.347
#>  5 x1    y1    lm    c1         5.53       0.794    6.97   2.95e-11    3.96
#>  6 x2    y1    lm    c1         8.07       0.557   14.5    6.90e-35    6.98
#>  7 x1    y2    lm    c1         0.0461     0.466    0.0989 9.21e- 1   -0.872
#>  8 x2    y2    lm    c1         1.61       0.394    4.10   5.72e- 5    0.837
#>  9 x1    y1    lm    c2         5.15       0.625    8.24   9.95e-15    3.92
#> 10 x2    y1    lm    c2         6.50       0.466   13.9    5.38e-33    5.58
#> # … with 182 more rows, and 14 more variables: conf.high <dbl>,
#> #   fit_r.squared <dbl>, fit_adj.r.squared <dbl>, fit_sigma <dbl>,
#> #   fit_statistic <dbl>, fit_p.value <dbl>, fit_df <dbl>, fit_logLik <dbl>,
#> #   fit_AIC <dbl>, fit_BIC <dbl>, fit_deviance <dbl>, fit_df.residual <int>,
#> #   fit_nobs <int>, subsets <chr>
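The 192 rows in this output can be accounted for with a short piece of arithmetic. The sketch rests on two assumptions that the page does not state: that group1 has three levels and group2 has two, and that each subsetting variable is either left unrestricted or fixed to one of its levels, with the variables crossed. Under those assumptions the count matches the printed tibble.

```r
n_x        <- 2  # x1, x2
n_y        <- 2  # y1, y2
n_model    <- 1  # lm
n_controls <- 4  # no covariates, c1, c2, c1 + c2

# ASSUMPTION: number of distinct values per subsetting variable
levels_per_subset <- c(group1 = 3, group2 = 2)

# Each subsetting variable is either unrestricted or fixed to one level,
# and the variables are crossed: (3 + 1) * (2 + 1) = 12 subset definitions
n_subsets <- prod(levels_per_subset + 1)

n_specs <- n_x * n_y * n_model * n_controls * n_subsets
n_specs  # 192
```

The count grows multiplicatively with every added choice, which is worth checking before running a large analysis with all.comb = TRUE or many subsets.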