# Alternative Models Examination and Gender Measurement Invariance of the 12-item General Health Questionnaire among Nigerian Adolescents

## Abstract

### Objective

The objective of this study was to examine the factor structures of the *a priori* alternative models of the 12-item General Health Questionnaire (GHQ-12), its psychometric characteristics, and its gender measurement invariance in a non-clinical sample of Nigerian adolescents (n=1326; mean age=15.16 years).

### Methods

The participants, of whom 606 (45.7%) were male, completed the GHQ-12 in addition to the Hospital Anxiety and Depression Scale (HADS), the Rosenberg Self-Esteem Scale (RSES) and the Suicidal Behaviors Questionnaire-Revised (SBQ-R). Using confirmatory factor analysis, we compared 21 models to identify the one with the best fit indices. Gender measurement invariance was examined with nested multiple-group confirmatory factor analysis (MGCFA).

### Results

The model that best captured psychological distress was a three-factor model initially described in the Australian general population [CFI=0.952, SRMR=0.0310, RMSEA=0.042 (90% CI=0.035–0.049)]. The internal consistencies (ω) of this model and its dimensions were modestly satisfactory. The criterion validity of this model was supported by significant correlations with the other study measures. MGCFA supported the configural, metric and scalar gender invariance of this model.

### Conclusion

A three-factor GHQ-12 model (anhedonia/sleep disturbance, social performance, and loss of confidence) is useful as a psychological distress assessment tool among Nigerian adolescents.

**Keywords:** General Health Questionnaire-12; Nigerian adolescents; Factor structure; Psychometric properties; Gender invariance

## INTRODUCTION

The availability of self-completed measures for the quantitative evaluation of psychological distress has become an integral component of healthcare practice and research [1]. The psychometric assessment of healthcare instruments has made an extensive assortment of self- and clinician-completed scales available [2]. The General Health Questionnaire (GHQ) is a generic instrument that was originally developed to provide information about psychological health through the responses to a series of questions suggestive of distress [3]. It was not designed to diagnose specific psychiatric disorders [4], and was originally meant to assess the mental health of primary care patients in the United Kingdom [5].

It originally consisted of 60 items, and versions with 12, 28, and 30 items have since been derived from it. The 12-item GHQ is the most widely used version globally for assessing psychological distress and evaluating short-term changes in psychological health [6]. Its extensive use has been credited to its ease of administration, the availability of normative data, and its brevity [7-9]. There are three approaches to scoring the GHQ-12 items: the original binary method (0-0-1-1), the Likert-type method (0-1-2-3), and the corrected binary method [10], in which the negatively worded items are scored 0-1-1-1 while the original binary scoring is applied to the positively worded items.
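The three scoring schemes can be illustrated with a minimal Python sketch. The function names are hypothetical, each response is assumed to be coded 0 to 3 by the position of the selected option, and the default set of negatively worded items (2, 5, 6, 9, 10, 11) is the commonly cited split rather than something stated in this article, so it is exposed as a parameter.

```python
# Assumption: commonly cited negatively worded GHQ-12 items (1-based numbering).
# This split is not given in the article; pass a different set if needed.
NEGATIVE_ITEMS = frozenset({2, 5, 6, 9, 10, 11})


def score_item(response, method, negatively_worded=False):
    """Score one item whose response is the 0-3 position of the chosen option."""
    if response not in (0, 1, 2, 3):
        raise ValueError("response must be 0, 1, 2 or 3")
    if method == "likert":        # 0-1-2-3
        return response
    if method == "binary":        # 0-0-1-1
        return 0 if response < 2 else 1
    if method == "corrected":     # 0-1-1-1 for negatively worded items only
        if negatively_worded:
            return 0 if response == 0 else 1
        return 0 if response < 2 else 1
    raise ValueError("unknown scoring method: " + method)


def score_ghq12(responses, method="binary", negative_items=NEGATIVE_ITEMS):
    """Return the total GHQ-12 score for a list of 12 responses."""
    if len(responses) != 12:
        raise ValueError("GHQ-12 requires exactly 12 responses")
    return sum(
        score_item(r, method, item in negative_items)
        for item, r in enumerate(responses, start=1)
    )
```

With the binary method the total ranges from 0 to 12, and with the Likert-type method from 0 to 36.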

Since the development of the GHQ-12, its dimensionality has been contentious, with different authors reporting diverse and contradictory factorial structures. Attempts have been made to clarify the factorial structure of the GHQ-12 using exploratory and confirmatory factor analytic techniques. The persistent debate regarding the underlying latent structure of the GHQ-12 is attributed to the observation that several studies have not been able to replicate the original unidimensional structure. Furthermore, studies that adopted the same scoring methods have also failed to produce the same factorial structure [11-13]. While the unidimensional structure was confirmed by some authors [8], others have reported bi-dimensional [14,15] and tri-dimensional [16-18] factor structures.

In Nigeria, the psychometric characteristics of the GHQ-12 have been examined only among adults [12]. It would be erroneous to extrapolate psychometric findings from Nigerian adults to the adolescent population. Therefore, the present study had three aims. The first was to use structural equation modeling, specifically confirmatory factor analysis (CFA), to compare 21 *a priori* alternative GHQ-12 models described in the literature and identify the one with the best fit indices. The second was to examine the criterion validity and internal reliability of the selected model, and the third was to examine its measurement invariance (MI) across the genders. MI must be established before psychological distress can be compared between the genders using the latent factors of the selected model [19].

## METHODS

### Participants

The study sample of senior high school adolescents was selected from the four public high schools in Osogbo, a city in Southwestern Nigeria, using a multistage stratified sampling method. In the first stage, four classrooms from each of the 3 arms of the senior secondary classes (I, II, and III) were randomly selected in each high school (yielding 12 classrooms per school and a total of 48 classrooms across the four schools). In the second stage, we selected 30 students per classroom by balloting, producing a total of 1440 senior high school adolescents. The participants were recruited between July and November 2018.

### Measures

#### Sociodemographic questionnaire

The variables in this questionnaire included age and gender (male and female).

#### General Health Questionnaire-12

This was used to quantify psychological distress among our respondents. Each item was scored using the binary method (0-0-1-1) adopted by the questionnaire’s developer [5]. The total score ranged from 0 to 12. Higher scores suggest greater psychological distress or an increased likelihood of having a psychiatric disorder. Satisfactory screening qualities for identifying the presence of a psychiatric disorder in primary health care have been described among the Nigerian adult population [12].

#### The description of the 21 *a priori* alternative GHQ-12 models

We specified and examined 21 models identified from the literature [20-22]. The first was the original model [5]. The second [14] and seventh [23] models were the earliest two-factor 12-item models to be reported in the literature. The third model [24] is a two-factor structure that first described the GHQ-12 in terms of the positively and negatively worded items. The eighth [25] and ninth [9] models were two-factor models with a reduced number of items. The fourth [16] and the sixth [26] models were the initial three-factor structures. The fifth model [17] is a three-factor structure described based on item content analysis, while the tenth model [18] is a three-factor Arabic version of the questionnaire. The eleventh model [20] is a three-factor model described recently among Malaysian university students, while the twelfth model [15] is a two-factor model of the Iranian version. The thirteenth [7], fourteenth [13] and sixteenth [27] models were all two-factor structures initially described among the Brazilian adult population. The fifteenth model [28] is a three-factor model described recently among young Chinese civil servants. The seventeenth model [8] is a single-factor model with correlated unique variances of the negatively worded items. The eighteenth and nineteenth models [11] were three- and two-factor models, respectively, described among Japanese men and women. The twentieth model [12] was the two-factor model reported among Nigerian adults, while the twenty-first model [29] is a three-factor model described among the nursing staff of a hospital in Tasmania. There were no similarities among the two- or three-factor models concerning the item loadings on the factors.

#### Rosenberg’s Self-Esteem Scale

The Rosenberg’s Self-Esteem Scale (RSES) consists of 10 items assessed on a 4-point Likert (0 to 3) scale, ranging from strongly agree to strongly disagree. Higher cumulative scores reflect greater self-esteem [30]. Satisfactory reliability and validity have been reported among Nigerian adolescents [31].

#### Hospital Anxiety and Depression Scale

This is a 14-item scale (the anxiety and depression subscales each consist of 7 items) that was used to quantitatively evaluate anxiety and depressive symptoms [32]. Each item is rated on a 4-point Likert (0 to 3) scale. The cumulative score on each subscale ranges from 0 to 21. Higher scores indicate more severe anxiety and depressive symptoms. Its reliability and validity as a screening tool for anxiety and depressive disorders have been reported to be adequate among non-clinical and clinical populations in Nigeria [33].

#### Suicidal Behaviors Questionnaire-Revised

This is a 4-item instrument that quantitatively assesses the different dimensions of suicidal behaviors [34]. The cumulative score ranges from 3 to 18. The prospective likelihood of suicidal behaviors is reflected by higher scores. Satisfactory psychometric properties as a screening instrument for the identification of those at a high risk of suicidal behaviors have been described among Nigerian adolescents and young adults [35].

### Procedure

The Research and Ethical Committees of the State-owned university (LTH/REC/2018/04/16/189) and Education Ministry (ED/2018/03/18/091) approved the study protocol. We excluded adolescents older than 18 years, those with a current or previous history of a psychiatric disorder, those who refused to assent, and those whose parents or guardians refused consent. The adolescents who agreed to participate went home with a study-specific parental consent form that described the objectives of the study. Their parents or guardians were asked to sign the form as an indication of their consent. Twenty-three of the students refused to give assent, while the parents or guardians of ninety-one refused to give their consent. Thus, study measures from 1326 respondents were available for analysis (response rate of 92%).
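The recruitment figures can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers reported in the text (sampling design, refusals, and the 606 male participants); the constant and function names are illustrative.

```python
# Figures taken from the Participants and Procedure text and the abstract.
SCHOOLS = 4
ARMS_PER_SCHOOL = 3          # senior secondary classes I, II and III
CLASSROOMS_PER_ARM = 4
STUDENTS_PER_CLASSROOM = 30
REFUSED_ASSENT = 23
REFUSED_CONSENT = 91
MALES = 606


def recruitment_summary():
    """Recompute the sample size, response rate and gender split."""
    classrooms = SCHOOLS * ARMS_PER_SCHOOL * CLASSROOMS_PER_ARM
    approached = classrooms * STUDENTS_PER_CLASSROOM
    analysed = approached - REFUSED_ASSENT - REFUSED_CONSENT
    return {
        "classrooms": classrooms,
        "approached": approached,
        "analysed": analysed,
        "response_rate_pct": round(100 * analysed / approached, 1),
        "male_pct": round(100 * MALES / analysed, 1),
        "female_pct": round(100 * (analysed - MALES) / analysed, 1),
    }
```

The recomputed values agree with the reported 48 classrooms, 1440 students approached, 1326 respondents analysed, a response rate of about 92%, and a 45.7% to 54.3% male-to-female split.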

### Data analyses

These were performed with version 25 of the Statistical Package for Social Sciences (IBM Corp., Armonk, NY, USA) and the R psych statistical software (version 3.4.2). Scores on the study measures were summarized using descriptive statistics. The dimensionality of the 21 *a priori* alternative models of the GHQ-12 was evaluated through CFA with version 20 of the SPSS Analysis of Moment Structure (AMOS) software, using maximum likelihood estimation (MLE) with the covariance matrix as input. These models were assessed with several fit indices: the comparative fit index (CFI) [36], the root mean square error of approximation (RMSEA) [37], and the standardized root mean square residual (SRMR). We did not focus on the significance of the chi-square (χ²) or its ratio to the degrees of freedom (χ²/df), because of its sensitivity to sample size, which tends to lead to unjustified model rejection [38,39]. Acceptable model fit is indicated by CFI>0.90, SRMR<0.10, and RMSEA<0.08 [40], although more stringent cutoff values (CFI>0.95, SRMR<0.08, and RMSEA<0.06) have also been proposed [41]. All further analyses were conducted with the model with the best fit indices. The criterion validity of the selected model and its underlying dimensions was examined through correlational analyses with the other study measures. The internal reliability of the selected model and its subscales was evaluated with McDonald’s omega coefficient (ω), which tends to yield a more accurate reliability estimate for a multidimensional scale than Cronbach’s alpha [42,43]. The ω coefficients were calculated using the R psych statistical package.
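To make the reliability coefficient concrete, the following Python sketch computes ω for a single factor from standardized loadings, using the textbook congeneric formula ω = (Σλ)² / ((Σλ)² + Σθ), where θ denotes the item error variances. This is a simplified illustration and not necessarily the computation performed by the R psych package; the function name and the loadings in the usage note are hypothetical.

```python
def mcdonald_omega(loadings, error_variances=None):
    """Omega for one factor with uncorrelated errors.

    loadings: standardized factor loadings of the items on the factor.
    error_variances: item error variances; if omitted, they are taken as
    1 - loading**2, which assumes standardized items (an assumption of
    this sketch).
    """
    if not loadings:
        raise ValueError("at least one loading is required")
    if error_variances is None:
        error_variances = [1 - lam ** 2 for lam in loadings]
    if len(error_variances) != len(loadings):
        raise ValueError("one error variance per loading is required")
    common = sum(loadings) ** 2   # variance attributable to the factor
    unique = sum(error_variances)  # variance attributable to item errors
    return common / (common + unique)
```

For example, four hypothetical items that each load 0.60 on one factor give ω of about 0.69, which illustrates why short subscales with moderate loadings tend to yield modest reliability.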

Next, we examined the gender MI of the selected model. MI is supported if the construct of psychological distress, as evaluated by the selected GHQ-12 model, is measured equivalently across groups (i.e., the male and female adolescents in our study). The MI across genders of the selected model was examined with multiple-group CFA (MGCFA) through a series of nested hierarchical steps [44]. First, we established the adequacy of the fit indices of the selected model separately for the male and female adolescents [45]; this initial step confirms a baseline model that adequately fits the data for each gender [46]. Having established acceptable fit for the genders separately, we then examined a configural MI model simultaneously across both genders. In configural MI, no equality constraints are placed on the model, i.e., all the model parameters are freely estimated in each gender. To confirm configural MI, the same manifest variables (GHQ-12 items) must load on the same dimensions (subscales) in both genders, the subscales must have inter-correlations below one, and the loadings of all the manifest items must be significant. The fulfillment of configural MI means that the structure of the selected GHQ-12 model is similar across the genders. This level of MI alone does not fully establish the gender invariance of the selected model [45].

Afterward, we examined the metric MI of the model by constraining the factor loadings of the selected model to be equal across the genders [19]. This step reveals whether the loadings of the same manifest variables on the GHQ-12 subscales are equal for the genders. The affirmation of metric MI indicates that the 12 items of the selected GHQ-12 model have the same meaning for the male and female adolescents. The establishment of metric MI allows the correlates of the selected GHQ-12 model factors to be compared between the genders [45]. The extent of the changes in the fit indices between the metric and configural models supports the presence or absence of metric MI [47]. Subsequently, we examined scalar MI by placing equality constraints on both the factor loadings (as in metric MI) and the intercepts of the manifest variables across both genders [19]. Scalar MI is confirmed based on changes in the fit indices compared to the metric model [47].
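The nested sequence of models can be summarized in conventional multiple-group factor-analytic notation. The symbols below are not taken from the article; they are the standard ones (item vector x, intercepts τ, loading matrix Λ, latent factors ξ, and errors δ for group g) and are offered only as an illustrative sketch.

```latex
\begin{align*}
\mathbf{x}^{(g)} &= \boldsymbol{\tau}^{(g)} + \boldsymbol{\Lambda}^{(g)}\boldsymbol{\xi}^{(g)} + \boldsymbol{\delta}^{(g)},
  \qquad g \in \{\text{male}, \text{female}\} \\
\text{Configural:}\quad & \boldsymbol{\Lambda}^{(\text{male})} \text{ and } \boldsymbol{\Lambda}^{(\text{female})}
  \text{ share the same pattern of zero and non-zero loadings} \\
\text{Metric:}\quad & \boldsymbol{\Lambda}^{(\text{male})} = \boldsymbol{\Lambda}^{(\text{female})} \\
\text{Scalar:}\quad & \boldsymbol{\Lambda}^{(\text{male})} = \boldsymbol{\Lambda}^{(\text{female})}
  \ \text{and}\ \boldsymbol{\tau}^{(\text{male})} = \boldsymbol{\tau}^{(\text{female})}
\end{align*}
```

Each level adds constraints to the previous one, which is why the models are nested and can be compared through changes in their fit indices.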

Metric MI is supported when the changes in the CFI, RMSEA, and SRMR values relative to the configural model are ≤0.01, ≤0.015, and ≤0.03 respectively, while scalar MI is supported when the changes in the CFI, RMSEA, and SRMR values relative to the metric model are ≤0.01, ≤0.015, and ≤0.01 respectively [47,48].

## RESULTS

### Sociodemographic and study measures characteristics of the participants

As shown in Table 1, females constituted 54.3% of the total sample. The mean age was 15.6 (SD 1.30). The mean total score on the GHQ-12 was 1.33 (SD 1.90). Table 1 also shows the descriptive characteristics of the subscales of the GHQ-12 and the other study measures.

### Confirmatory factor analysis of the 21 *a priori* alternative GHQ-12 models

As seen in Table 2, our data exhibited acceptable fit indices with 6 of the 21 models [8,11,16,24,26]. The best fit indices [CFI=0.952, SRMR=0.0310, RMSEA=0.042 (90% CI=0.035–0.049)] were exhibited by the three-factor model that was initially described among 603 adults selected from the general community in Australia [26]. The three dimensions (subscales) in this model were modestly correlated (0.36 to 0.67), reflecting limited shared variance among them. This observation supports the multidimensionality of this model in our sample. We did not attempt to free the covariances between measurement error terms, since this approach to improving model fit has been described as unacceptable [49], although it was the approach used to establish the 17th model [8]. Figure 1 shows the CFA path diagram for this model.
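The cutoff rules stated in the Data analyses section can be applied mechanically to the reported indices. The following Python sketch does so; the function name and the three labels are hypothetical, and the thresholds are those quoted in the text (CFI>0.90, SRMR<0.10, RMSEA<0.08 for acceptable fit; CFI>0.95, SRMR<0.08, RMSEA<0.06 for the more stringent criteria).

```python
def classify_fit(cfi, srmr, rmsea):
    """Classify model fit against the two sets of cutoffs quoted in the text."""
    if cfi > 0.95 and srmr < 0.08 and rmsea < 0.06:
        return "meets stringent cutoffs"
    if cfi > 0.90 and srmr < 0.10 and rmsea < 0.08:
        return "acceptable"
    return "not acceptable"


# Indices reported for the selected three-factor model in the full sample.
selected_model = classify_fit(cfi=0.952, srmr=0.0310, rmsea=0.042)
```

Under these rules the selected three-factor model satisfies even the more stringent cutoffs, although its CFI exceeds 0.95 only marginally.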

### Correlational analyses (criterion validity) between GHQ-12 and other study measures

The correlations (Spearman’s rho) between the selected GHQ-12 model, its subscales and the other study measures in terms of their directions and strengths are shown in Table 3. All the correlations with the other measures were statistically significant (p<0.001). Also depicted in Table 3 are the internal consistencies (ω) of the selected GHQ-12 model and its subscales.
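For readers unfamiliar with Spearman’s rho, the coefficient is the Pearson correlation computed on the ranks of the two variables, with tied values given their average rank. The dependency-free Python sketch below illustrates this; the function names are hypothetical and the scores in the usage note are toy values, not study data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return pearson(average_ranks(x), average_ranks(y))
```

For instance, `spearman_rho([0, 1, 3, 5], [4, 6, 9, 15])` equals 1 because the two toy score lists are ordered identically, even though their relationship is not linear. In practice a tested routine such as `scipy.stats.spearmanr` would normally be used, since it also returns a p-value.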

### Gender measurement invariance of the selected 3-factor GHQ-12 model

The initial CFA performed separately for the male and female adolescents using the selected 3-factor model [26] showed acceptable fit indices for both genders. As depicted in Table 4, among the males, this model exhibited an acceptable fit [CFI=0.913, RMSEA=0.061 (90% CI: 0.050–0.071), SRMR=0.0463]. Among the male respondents, the item loadings on the Anhedonia/Sleep disturbance, Social performance and Loss of confidence subscales ranged from 0.57 to 0.67, 0.41 to 0.64, and 0.55 to 0.60 respectively. The correlations among the subscales ranged from 0.35 to 0.70. Among the females, this model also exhibited an acceptable fit [CFI=0.969, RMSEA=0.032 (90% CI: 0.020–0.043), SRMR=0.0297]. The two items on the Anhedonia/Sleep disturbance subscale had equal loadings of 0.51, while the item loadings on the Social performance and Loss of confidence subscales ranged from 0.38 to 0.55 and 0.50 to 0.67 respectively. The correlations among the three subscales for the females ranged from 0.36 to 0.67. Table 4 indicates that the configural MI model for the selected three-factor model [26] had acceptable fit indices [CFI=0.952, RMSEA=0.030 (90% CI: 0.026–0.033), SRMR=0.0310]. A metric MI model across the genders also yielded acceptable fit indices [CFI=0.955, RMSEA=0.027 (90% CI: 0.024–0.031), SRMR=0.0310]. Finally, a scalar MI model for both genders also demonstrated acceptable fit indices [CFI=0.957, RMSEA=0.025 (90% CI: 0.022–0.028), SRMR=0.0310]. No changes were observed in the SRMR values across the three nested models. The changes in the CFI and RMSEA values were within the recommended cutoffs [47,48], supporting the metric and scalar gender invariance of the selected three-factor model [26].
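The invariance decision can be reproduced from the reported indices. The Python sketch below applies the change cutoffs given in the Data analyses section (metric: ΔCFI≤0.01, ΔRMSEA≤0.015, ΔSRMR≤0.03; scalar: ΔCFI≤0.01, ΔRMSEA≤0.015, ΔSRMR≤0.01) to the configural, metric and scalar values above. The names are hypothetical, and comparing absolute changes is an interpretation of how the text states the rule.

```python
# Maximum allowed change in each index when moving to the more constrained model.
THRESHOLDS = {
    "metric": {"cfi": 0.01, "rmsea": 0.015, "srmr": 0.03},
    "scalar": {"cfi": 0.01, "rmsea": 0.015, "srmr": 0.01},
}

# Fit indices reported for the three nested multiple-group models.
CONFIGURAL = {"cfi": 0.952, "rmsea": 0.030, "srmr": 0.0310}
METRIC = {"cfi": 0.955, "rmsea": 0.027, "srmr": 0.0310}
SCALAR = {"cfi": 0.957, "rmsea": 0.025, "srmr": 0.0310}


def fit_changes(less_constrained, more_constrained):
    """Absolute change in each fit index between two nested models."""
    return {
        name: abs(more_constrained[name] - less_constrained[name])
        for name in less_constrained
    }


def invariance_supported(less_constrained, more_constrained, level):
    """True if every index changes by no more than the cutoff for this level."""
    limits = THRESHOLDS[level]
    changes = fit_changes(less_constrained, more_constrained)
    return all(changes[name] <= limits[name] for name in limits)


metric_ok = invariance_supported(CONFIGURAL, METRIC, "metric")
scalar_ok = invariance_supported(METRIC, SCALAR, "scalar")
```

Both checks pass: the CFI changes by 0.003 and 0.002, the RMSEA by 0.003 and 0.002, and the SRMR does not change, so all differences fall well inside the cutoffs.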

## DISCUSSION

One of the primary objectives of this study was to identify which of the 21 *a priori* alternative models of the GHQ-12 described in the literature best measures the construct of psychological distress in a non-clinical sample of Nigerian adolescents. The other objectives were to assess the internal consistency and criterion validity of the selected model, and to examine whether this model is invariant across male and female adolescents. To the knowledge of the authors, this is the only study in Nigeria, the most populous black nation globally [50], that has attempted to examine the dimensionality of the GHQ-12 among adolescents using CFA, in addition to its psychometric properties and gender invariance.

We noted that out of the 21 alternative models, 6 yielded acceptable fit indices. Even though the GHQ-12 was conceived originally as a unidimensional structure, numerous bi-dimensional and tri-dimensional structures, and even a modified unidimensional structure, have been described; thus, there is no consensus regarding its dimensionality. Even among the authors that reported the same number of subscales, there are discrepancies regarding the item loadings and the labeling of the subscales, as seen among the two-factor [14,23,25] and three-factor [16,17,20,28] models.

In our study, the model that best captured the construct of psychological distress was the three-factor model initially described in a cross-sectional community sample of Australian adults [26]. The three dimensions in this model were originally labeled as Anhedonia/Sleep disturbance (2 items), Social performance (6 items) and Loss of confidence (4 items). In our sample of Nigerian adolescents, the items on each of the dimensions had positive, reasonably high, and statistically significant standardized factor loadings. The three dimensions were all modestly correlated, reflecting limited shared variance among them and further supporting the view that this multidimensional model best explains psychological distress in our sample.

We also noted that one [8] of the other five models with acceptable fit indices had values close to those of our selected model [26]. The author of this modified unidimensional model correlated the unique error variances of the negatively phrased items based on the opinion that they were influenced by response bias [8], despite the criticism of allowing error residuals to covary in a CFA model [39]. A study [51] that examined the dimensionality of the Dutch version of the GHQ-12 in a large sample of adults in Belgium reported that the same three-factor model [26] had the best fit indices.

In terms of internal consistency, McDonald’s coefficient omega (ω) values for the overall model (0.78) and its subscales (0.50 to 0.72) were rather modest. Omega coefficient values are interpreted similarly to Cronbach’s alpha, and values ranging from 0.60 to 0.69 are marginally acceptable [52]. The criterion validity of this model was also supported by correlational analyses with the other study measures. Although the strengths of the correlations of the selected model and its dimensions with the other study measures were modest, the directions were all as expected. In this study, we noted that psychological distress correlated positively with anxiety, depression, and suicidality. In other words, higher scores on the GHQ-12 were associated with higher scores on the HADS Anxiety and Depression subscales and the SBQ-R. Positive correlations have been recently reported between psychological distress and anxiety and depressive symptoms among Australian adolescents [53]. Dimensions of suicidality such as suicidal ideation and attempts have also been observed to have positive correlations with psychological distress among adolescents [54]. Higher psychological distress was associated with lower self-esteem in our sample. A similar observation was reported in a recent study that examined the correlates of psychological distress among adolescents in Kosovo [55].

Another objective of our study was to evaluate the gender measurement invariance of the selected model. Authors in developed countries have previously demonstrated that the three-factor model with the best fit indices in our study is invariant across male and female adults [22]. To the knowledge of the authors of this current study, the gender invariance of any of the 21 identified *a priori* models has not been specifically examined among adolescents. Our results indicate that the three-factor model [26] that we selected fitted acceptably well for both genders in our adolescent sample. There was also evidence to support the configural, metric and scalar invariance of this model across the genders. Thus, in our sample, the selected three-factor model did not differ meaningfully between the genders in its factor loadings and item intercepts. The confirmation of gender invariance is a prerequisite that should be fulfilled before any meaningful comparison can be made between the male and female adolescents with the GHQ-12 and its latent variables. Further analysis of our data showed no gender difference in our sample on the selected model [26] and its three dimensions.

Future additional research among Nigerian adolescents should be targeted at the evaluation of the GHQ-12 in terms of its clinical sensitivity, specificity, and predictive validity by comparing to a ‘gold standard’. This will enable the identification of cut-off scores that will be clinically useful in identifying psychologically distressed Nigerian adolescents. Since we recruited only a non-clinical sample, additional studies are needed to further explore the psychometric characteristics in Nigerian adolescent clinical samples.

### Limitations

There are some limitations to be considered regarding our study. First, caution should be exercised in generalizing our findings to the adolescent population in other regions of the country, since our sample was drawn from only one of its six geopolitical zones (the southwest). Second, all the research measures we adopted were self-completed.

### Conclusions

The three-factor model of the GHQ-12 [26], first described among Australian adults, was the best measurement model for the construct of psychological distress among Nigerian adolescents. The findings of our study appear to support the validity of applying the GHQ-12 as a multidimensional instrument among Nigerian adolescents.

## Notes

The authors have no potential conflicts of interest to disclose.

**Author Contributions**

Conceptualization: Olutayo Aloba, Tolulope Opakunle. Data curation: Olutayo Aloba, Tolulope Opakunle, Kunle Ogunrinu. Formal analysis: Olutayo Aloba, Tolulope Opakunle. Funding acquisition: Tolulope Opakunle. Investigation: Tolulope Opakunle, Kunle Ogunrinu. Methodology: Olutayo Aloba, Tolulope Opakunle. Project administration: Tolulope Opakunle, Kunle Ogunrinu. Resources: Olutayo Aloba. Software: Olutayo Aloba, Tolulope Opakunle. Supervision: Olutayo Aloba. Validation: Olutayo Aloba, Tolulope Opakunle. Writing—original draft: Olutayo Aloba, Tolulope Opakunle. Writing—review & editing: Olutayo Aloba, Tolulope Opakunle. All authors read and approved the final version of the manuscript.