What should be reported in the measures section of a standard research paper? Most scholars provide example items, some descriptive analyses (e.g., mean and standard deviation), and Cronbach’s alpha as a measure of reliability. We argue that this is not sufficient and that scholars should provide more in-depth analyses of the measures’ factorial validity. The following page includes reporting guidelines and examples.
Dienlin & Metzger (2016) added the following preamble before they discussed their measures in more detail:
Based on established scales and additional items that we designed in order to fit the research question more closely, confirmatory factor analyses (CFA) were run for each variable to select items that formed a unidimensional structure. To assess the assumption of normality, Shapiro-Wilk normality tests were done. As the results showed violations of normality, we used the more robust Satorra-Bentler scaled test statistic. Items that did not sufficiently load on the latent factor were deleted. To assess reliability of the constructed and congeneric scales, the usual fit indices (χ2, CFI, TLI, RMSEA, SRMR), McDonald’s composite reliability omega, and Cronbach’s alpha were calculated. All scales had adequate to good factorial validity and reliability. The variables and their psychometrics appear in Table 2; all questionnaire items, the data, item distributions, and the CFAs can be found in the online supplementary material.
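The two reliability coefficients mentioned above follow standard closed-form formulas: Cronbach’s alpha is computed from the item score matrix, and McDonald’s omega from the standardized loadings of a unidimensional CFA. The sketch below is illustrative only (it is not the authors’ code; function names and the example data are our own); in practice these values are typically obtained from CFA software.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha from an (n_respondents x k_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of sum score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def mcdonald_omega(loadings: np.ndarray) -> float:
    """McDonald's omega for a congeneric scale, from standardized loadings:
    omega = (sum of loadings)^2 / ((sum of loadings)^2 + residual variances)."""
    explained = loadings.sum() ** 2
    residual = (1.0 - loadings ** 2).sum()
    return explained / (explained + residual)

# Hypothetical example: three items with standardized loadings of .70 each
omega = mcdonald_omega(np.array([0.7, 0.7, 0.7]))
```

Unlike alpha, omega does not assume equal loadings across items (tau-equivalence), which is why papers such as the one quoted above report it alongside or instead of alpha.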
They then proceeded to describe each scale:
Facebook benefits measured how many positive aspects people attributed to Facebook use. Twelve items were initially developed based on an earlier focus group pilot study of users who discussed the benefits and risks that they experience as a result of using Facebook. Of the 12 items, 10 were retained based on the data preparation analysis discussed above; these included, for example, using Facebook for self-expression, learning new things, and making new personal or business contacts (see online supplementary material). Respondents answered all items on a 5-point scale ranging from 1 = strongly disagree to 5 = strongly agree.
Similarly, Bol et al. (2018) provide comprehensive information about all scales’ factorial validity:
Participants answered all items on a 7-point scale ranging from 1 = strongly disagree to 7 = strongly agree. Factor validity was tested via confirmatory factor analyses for each variable separately. In addition, to test discriminant validity and item cross-loadings, we computed an overall model analyzing all variables together. Referring to common fit criteria (e.g., Kline, 2016), all measures showed good model fit and reliability; likewise, the overall model revealed good fit (see Table 1). Several items violated the assumption of normal distribution (see Figure 1); therefore, we used maximum likelihood estimation with robust standard errors and a Satorra-Bentler scaled test statistic. All items are listed in the online supplementary material.
This description is followed by a table that outlines fit indices for the respective measurement models of all scales as well as relevant reliability estimates.
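The fit indices reported in such tables are derived from the chi-square statistics of the fitted model and of a baseline (null) model. As a rough orientation for readers, the standard formulas for CFI, TLI, and RMSEA can be sketched as follows (the function name and the input values are illustrative, not taken from the papers above; SRMR is omitted because it requires the full residual correlation matrix):

```python
def fit_indices(chisq_m: float, df_m: int,
                chisq_b: float, df_b: int, n: int) -> dict:
    """CFI, TLI, and RMSEA from the model ("m") and baseline ("b")
    chi-square statistics and the sample size n."""
    d_m = max(chisq_m - df_m, 0.0)  # model non-centrality
    d_b = max(chisq_b - df_b, 0.0)  # baseline non-centrality
    cfi = 1.0 - d_m / max(d_b, d_m, 1e-12)
    tli = ((chisq_b / df_b) - (chisq_m / df_m)) / ((chisq_b / df_b) - 1.0)
    rmsea = (d_m / (df_m * (n - 1))) ** 0.5
    return {"CFI": cfi, "TLI": tli, "RMSEA": rmsea}

# Hypothetical values for illustration
indices = fit_indices(chisq_m=100.0, df_m=50, chisq_b=1000.0, df_b=66, n=500)
```

Common rules of thumb (e.g., Kline, 2016, as cited above) treat CFI and TLI values near or above .95 and RMSEA values below .06–.08 as indicating good fit.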
Masur, DiFranzo & Bazarova (2021) provide comprehensive information about each scale:
To measure participants’ social norm perceptions we adapted the 12-item scale developed by Park and Smith (2007) to fit the simulated SNS environment ESL. Four items each referred to descriptive (e.g., “Most people on EatSnap.Love are visually identifiable in their posts”), injunctive (e.g., “The majority of people on EatSnap.Love think it is appropriate to share pictures of themselves.”), and subjective norm perceptions (e.g., “I have the feeling that most people on EatSnap.Love want other users to post pictures of themselves”), respectively, and were administered on a 7-point scale ranging from 1 (strongly disagree) to 7 (strongly agree). The three-dimensional model (χ2(51) = 343.74, p < .001; CFI = .97; TLI = .96; RMSEA = .10, 90% CI [.09, .11]; SRMR = .02) fitted the data well, but all three subdimensions correlated very strongly (r > .90), suggesting that the three factors did not have enough discriminant validity. A second-order model revealed that all three types of norms loaded highly onto a global factor (γ > .90). We hence continued our analyses with a single factor (and a single mean index, respectively). Reliabilities of all three subdimensions were high (ω = .93–.95). The reliability of the whole scale was likewise very high (ω = .98).
They further provide all item formulations, item-specific descriptive analyses, and more details about the confirmatory factor analyses in an online supplement: https://osf.io/rqft5