This vignette exemplifies different ways to plot the specification curve. In general, the function plot_specs() takes care of the overall process. However, more specific customizations are possible once we use the more specific functions plot_curve(), plot_choices, and plot_samplesizes(). These function produce regular ggplot-objects that can be customized further.

1. Run the specification curve analysis

In order to have some data to work with, we run the minimal example contained in the package.

library(specr)
library(dplyr)
# run spec analysis
results <- run_specs(example_data,
                     y = c("y1", "y2"),
                     x = c("x1", "x2"),
                     model = "lm",
                     controls = c("c1", "c2"),
                     subset = list(group1 = unique(example_data$group1),
                                   group2 = unique(example_data$group2)))

Let’s quickly get some ideas about the specification curve by using summarise_specs()

summarise_specs(results, group = c("x", "controls"))
## # A tibble: 8 x 9
## # Groups:   x [2]
##   x     controls      median   mad     min   max   q25   q75   obs
##   <chr> <chr>          <dbl> <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 x1    c1              3.16  4.50 -2.05    8.96 0.581  6.39   123
## 2 x1    c1 + c2         3.16  4.46 -1.94    8.49 0.228  6.25   123
## 3 x1    c2              3.79  4.51 -1.61    9.14 0.632  6.48   123
## 4 x1    no covariates   3.95  4.21 -1.80    9.28 1.13   6.73   123
## 5 x2    c1              4.75  4.77 -0.258   8.75 1.36   7.79   123
## 6 x2    c1 + c2         4.35  4.74 -0.0841  8.71 1.08   7.14   123
## 7 x2    c2              4.40  4.78  0.373   9.58 1.18   7.61   123
## 8 x2    no covariates   4.81  4.92  0.463   9.22 1.57   8.12   123

We see that it makes quite a difference whether x1 or x2 is used as independent variable.

2. A simple way to visualize the results

The simplest way to visualize most of the information contained in the results data frame is by using the plot_specs() function.

plot_specs(results)

We can further customize that function, e.g., by removing unnecessary information or by reordering/transforming the analytical choices (and thereby visualize specific contrasts).

plot_specs(results, 
           choices = c("x", "y", "controls", "subsets"),  # model is not plotted
           rel_heights = c(0.75, 2)) # changing relative heights of the two parts

# Investigating specific contrasts
results %>%
  mutate(group1 = ifelse(grepl("group1 = 0", subsets), "0", "1"),
         group2 = ifelse(grepl("group2 = A", subsets), "A", "B & C")) %>%
  plot_specs(choices = c("x", "y", "controls", "group1", "group2"), rel_heights = c(2.4, 1.9))

3. An alternative way to visualize the results

3.1. Plot curve and choices seperately

Alternatively, we can also plot the curve and the choice panel individually and bind them together afterwards.

# Plot specification curve (p1)
p1 <- plot_curve(results) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "grey") +
  ylim(-8, 12) +
  labs(x = "", y = "unstandarized regression coefficient") +
  theme_classic()

# Plot choices (p2)
p2 <- plot_choices(results) +
  labs(x = "specifications (ranked)") +
  theme_classic() +
  theme(strip.text.x = element_blank())

# Combine plots
plot_specs(plot_a = p1,
           plot_b = p2,
           labels = c("", ""),
           rel_height = c(2, 2.5))

3.2. Include sample size histogram

The additional function plot_samplesizes() provides and additional panel that can be added via the cowplot::plot_grid()

p3 <- plot_samplesizes(results) +
  theme_classic()

# Combine via cowplot
cowplot::plot_grid(p1, p2, p3,
                   ncol = 1,
                   align = "v",
                   rel_heights = c(1.5, 2, 0.8),
                   axis = "rbl")