Is Longer Always Better? Clinical Validation of the Korean Altman Self-Rating Mania Scale With a Comparison of 5-Item Versus 11-Item Versions
Abstract
Objective
The Altman Self-Rating Mania Scale (ASRM) has been recommended as a brief and psychometrically promising tool for the self-reporting of (hypo)manic symptoms. However, there is a shortage of validation in clinical samples, and none for the newly developed 11-item version. Thus, this study aimed to test the reliability and validity of the ASRM-11 in comparison with the ASRM-5.
Methods
Through a retrospective chart review, self-reported data on the Korean ASRM (K-ASRM) were collected from 122 patients diagnosed with either bipolar or depressive disorder via a (semi)structured diagnostic interview. The patients were also interviewed individually using the Korean Young Mania Rating Scale (K-YMRS). Reliability and construct, convergent, and diagnostic validity were examined.
Results
The reliability of both versions of the K-ASRM ranged from good to excellent. Factor analyses revealed a unifactorial solution for K-ASRM-5, while a dual-factor structure emerged for K-ASRM-11, corresponding to the bright and dark sides of (hypo)mania. Moreover, both versions demonstrated a significant positive correlation with the total K-YMRS score. In the receiver operating characteristic curve analysis, the discriminating ability of both versions was fair when distinguishing manic from non-manic patients.
Conclusion
Both the K-ASRM-5 and K-ASRM-11 demonstrated comparable levels of sound psychometric properties, which supports their continued usage in research and clinical practice. It is necessary to test potential utility of the K-ASRM-11 for symptom monitoring and treatment response because of its additional coverage of dark-side (hypo)manic symptoms, including irritability and impulsivity, which are considered important for the quality of life of patients with bipolar disorder.
INTRODUCTION
(Hypo)mania is a defining feature of bipolar disorder (BD). The diagnostic criteria for (hypo)manic episodes are an abnormally elevated mood and increased goal-directed behavior or energy. In addition, inflated self-confidence/grandiosity, decreased need for sleep, talkativeness, flight of ideas, distractibility, psychomotor agitation, and/or impulsivity often accompany these episodes. Given that BD tends to be under-recognized at the expense of unipolar depression and has a highly recurrent course [1], it is crucial to screen and monitor (hypo)manic symptoms with reliable and valid measures for both the early detection and successful management of BD [2].
Accordingly, several self-report scales of (hypo)mania have been developed since the 1990s, refuting the past notion that self-rating of mania is neither reliable nor valid because of the lack of insight or uncooperative attitudes of manic patients [3]. A recent systematic review of the existing self-report measures of manic symptoms [4] suggested that three scales, namely the Altman Self-Rating Mania Scale (ASRM) [5], Internal State Scale (ISS) [6], and Self-Report Mania Inventory (SRMI) [7], are currently the most promising. Supporting evidence for their psychometric properties has been reported from BD patient samples in more than three independent studies. In addition, the authors highlighted that the ASRM has a comparative advantage of brevity (5 items) and time frame (over the past week) in accordance with the Diagnostic and Statistical Manual of Mental Disorders (DSM) diagnostic criteria (cf., ISS 16 items, SRMI 47 items, both over the past month). Moreover, Cerimele et al. [8] also recommended the ASRM as having the highest clinical utility (9.6, range=3–9.6) as a self-report tool for (hypo)manic symptoms in implementing measurement-based care for BD.
The original ASRM consists of five groups of statements designed to assess the severity of (hypo)manic symptoms: elevated mood, increased self-esteem, reduced need for sleep, pressured speech, and psychomotor agitation (more activity [9]) [5]. These 5 items (manic factor), extracted from a principal component analysis of 14 items, were selected in the scale construction process based on the finding that the sum score could discriminate patients with manic BD from other psychiatric diagnostic groups (22 schizophrenia, 13 schizoaffective disorder, 30 major depressive disorder, 6 bipolar disorder, depressive episode, and 34 bipolar disorder, manic episode). It also showed a strong positive correlation with the Young Mania Rating Scale (YMRS) and the Clinician-Administered Rating Scale for Mania (CARS-M) in a subset of 34 bipolar manic patients, indicating concurrent validity. Furthermore, an ASRM score greater than 5 demonstrated a sensitivity of 0.86 and a specificity of 0.87 in the receiver-operating characteristic (ROC) analysis. In a subsequent study comparing three self-report scales of mania [10], the ASRM score showed a medium positive association with the CARS-M and a high sensitivity of 0.93, as opposed to a lower specificity of 0.33, in a sample of 44 acutely manic inpatients, outperforming other self-reports.
Since its original development, the ASRM has been translated into various languages [11]. Kim and Kwon [12] demonstrated satisfactory psychometric properties of the Korean ASRM (K-ASRM) in a large sample of non-clinical participants. Validation in clinical samples has also been conducted; however, formal reports on such validation data are limited. The Spanish version of the ASRM evidenced high internal consistency and convergent validity with the CARS-M in a sample of 74 patients with BD [13]. Factor analysis yielded a unifactor structure that could discriminate patients with BD in the manic state from asymptomatic patients with BD. To the best of our knowledge, the most recent clinical validation was performed by Skokou et al. [9]. They validated the Greek version of the ASRM (G-ASRM) using a sample of 86 patients with BD and 37 healthy controls. Internal consistency was excellent, and a one-factor structure was found in the exploratory factor analysis (EFA). The G-ASRM was positively correlated with the YMRS but unrelated to the Montgomery-Asberg Depression Rating Scale, demonstrating convergent and divergent validity. At the optimal cutoff score of 6, it classified (hypo)manic patients well (area under the curve [AUC]=0.84, sensitivity=0.83, specificity=0.84).
However, the brevity of the ASRM cuts both ways. Although it is convenient and economical in terms of the time and effort required for test administration, it has been criticized for its limited content coverage. Acknowledging this limitation, the original author recently proposed longer versions, the 11-item and 14-item ASRM [11]. Six items (irritability, lability of mood, grandiosity, racing thoughts, distractibility, and impulsivity/poor judgement), which constituted the “irritability factor,” were added to construct a more comprehensive self-report of mania, focusing on symptoms listed in the DSM diagnostic criteria. The longest, ASRM-14, additionally includes the remaining three items for psychosis.
Therefore, the objectives of the current study were twofold. First, we aimed to validate the K-ASRM in a clinical sample of patients diagnosed with mood disorders, including BD. A previous study provided the initial psychometric properties of the K-ASRM [12], but it was labeled as preliminary because it did not involve patients with BD. Even though the ASRM has been recommended in the DSM-5 as an emergent measure requiring additional follow-up research to confirm its utility in diverse settings, there is still a paucity of clinical validation of the ASRM. Furthermore, we explored whether psychometric properties differ across diagnostic groups, based on the possibility that current mood state or insight level could influence the self-rated ASRM scores. Second, we sought, for the first time, to compare two versions of the ASRM: the original ASRM-5 and the recent ASRM-11. Since the suggestion by Altman and Østergaard [11], no study has yet reported the reliability or validity of the ASRM-11, nor verified the incremental utility of the longer version in comparison to the original ASRM-5. This remains an open issue that requires further empirical investigation.
METHODS
Participants and procedures
A total of 122 psychiatric patients (mean age=30.01, SD=10.15, range=19–67 years, 77% female) were consecutively enrolled either from the inpatient ward or outpatient clinic of the Department of Psychiatry, Samsung Medical Center, from March 2021 to December 2022. The inclusion criteria were diagnoses of 1) bipolar I disorder, manic, 2) bipolar I disorder, depressed, 3) bipolar II disorder, hypomanic, 4) bipolar II disorder, depressed, 5) other specified bipolar and related disorder, and 6) depressive disorder. Diagnosis was confirmed using the Structured Clinical Interview for DSM-5 [14] or the Mini-International Neuropsychiatric Interview [15], which was individually administered by a trained interviewer with a master’s degree in clinical psychology, under the supervision of a doctoral-level clinical psychologist. The same interviewer administered the YMRS without reference to individual ASRM scores. Exclusion criteria were recent history of substance abuse, unstable medical condition, acute psychotic state, or severe thought disorder at assessment, which may interfere with the ability for self-reporting [5]. As this was a retrospective chart review, the need for informed consent was waived by the Institutional Review Board (IRB No: SMC 2024-09-086).
Measures
K-ASRM
Detailed information on the prior validation of the 5-item K-ASRM in a large nonclinical sample can be found in Kim and Kwon [12]. For continued clinical validation of ASRM-11, permission was obtained from the original author, Dr. Altman (personal communication, January 15, 2021). After the first author translated the English items into Korean, a bilingual person back-translated the items, which were then checked by the original author to confirm the equivalence of meaning. Any disagreements were resolved through consensus between the first and original authors through a series of personal communications.
As mentioned previously, ASRM-11 consists of 11 groups of statements aimed at assessing the severity of (hypo)manic symptoms in accordance with the DSM diagnostic criteria. Respondents were instructed to choose one statement that best described how they had been feeling in the past week on a 5-point Likert scale (0=never, 1=occasionally, 2=frequently, 3=most of the time, and 4=constantly). Here, it is noteworthy that we used the ASRM-11 version in which the Likert options were slightly changed by Altman and Østergaard [11], which may result in a change in sum score.
K-YMRS
The YMRS is one of the most frequently used clinician rating scales to evaluate the severity of (hypo)manic symptoms [16]. The K-YMRS was validated by Jung et al. [17]. It comprises 11 items (elevated mood, increased motor activation and energy, sexual interest, sleep, irritability, speech, language-thought disorder, content, disruptive-aggressive behavior, appearance, and insight). Each item was rated on a 5-point Likert scale, taking into account both patients’ reports of recent conditions and clinician observations during the interview. A higher total score indicated the presence of greater (hypo)manic symptoms.
Statistical analysis
All analyses were performed using Mplus 8.4 [18] and Jamovi 2.5 (The Jamovi Project, 2024; https://www.jamovi.org/) software. Following descriptive statistics, the reliability of both versions of the ASRM was evaluated in terms of internal consistency, using McDonald’s ω and item-to-total correlations.
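As a rough illustration of the two reliability indices named above, the following sketch computes McDonald's ω from the standardized loadings of a one-factor model and item-rest (corrected item-to-total) correlations from raw item scores. It is not the Mplus or Jamovi procedure used in the study. The loadings and response rows are hypothetical, and the use of the corrected (item-rest) form of the item-to-total correlation is an assumption.

```python
from math import sqrt

def mcdonald_omega(loadings):
    """Omega total for a one-factor model with standardized loadings,
    assuming uncorrelated residuals: (sum l)^2 / ((sum l)^2 + sum(1 - l^2))."""
    common = sum(loadings) ** 2
    unique = sum(1.0 - l ** 2 for l in loadings)
    return common / (common + unique)

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def item_rest_correlations(rows):
    """Correlate each item with the sum of the remaining items.
    rows: one list of item scores per respondent."""
    n_items = len(rows[0])
    result = []
    for j in range(n_items):
        item = [r[j] for r in rows]
        rest = [sum(r) - r[j] for r in rows]
        result.append(pearson(item, rest))
    return result

# Hypothetical standardized loadings (not the study's estimates).
example_loadings = [0.51, 0.65, 0.70, 0.75, 0.88]
omega = mcdonald_omega(example_loadings)

# Hypothetical responses of five people to three 0-4 items.
toy_rows = [[0, 0, 1], [1, 1, 1], [2, 1, 2], [3, 2, 4], [4, 3, 3]]
item_rest = item_rest_correlations(toy_rows)
```

With real data, the loadings would come from the fitted factor model rather than being typed in by hand.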
To evaluate construct validity, a series of factor analyses was conducted. For the K-ASRM-5, confirmatory factor analysis (CFA) was performed based on previous research that reported a unifactorial structure. Given the ordinal nature of the Likert scale, treating the responses as continuous may not be appropriate. In addition, positively skewed distributions are possible because the frequency of reported manic symptoms may be sparse. Therefore, CFA was performed using the weighted least squares mean- and variance-adjusted (WLSMV) estimator, a robust diagonally weighted least squares method that does not assume normality or continuity, as an alternative to conventional maximum likelihood (ML) estimation, which can be biased with skewed ordinal data [19,20]. It also provides more robust parameter estimates and standard errors than ML estimation [21]. The model fit was assessed using the following criteria: comparative fit index (CFI) >0.95, root mean square error of approximation (RMSEA) <0.06, and standardized root mean square residual (SRMR) <0.08 [22,23].
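The fit criteria listed above can be written as a simple rule check. The sketch below encodes the three thresholds from this paragraph and applies them to the fit indices reported for the K-ASRM-5 in the Results section. The function and variable names are illustrative only.

```python
# Thresholds taken from the criteria stated in the text.
FIT_CRITERIA = {
    "CFI": (">", 0.95),
    "RMSEA": ("<", 0.06),
    "SRMR": ("<", 0.08),
}

def check_fit(indices):
    """Return, for each index, whether it meets its threshold."""
    results = {}
    for name, (direction, threshold) in FIT_CRITERIA.items():
        value = indices[name]
        if direction == ">":
            results[name] = value > threshold
        else:
            results[name] = value < threshold
    return results

# Fit indices reported for the one-factor K-ASRM-5 model in the Results.
reported = {"CFI": 1.0, "RMSEA": 0.00, "SRMR": 0.02}
fit = check_fit(reported)
```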
An EFA was conducted for the K-ASRM-11 items because of the absence of pre-existing evidence for its factor structure. Weighted least squares was used for factor extraction, with oblimin rotation to allow for possible correlations between factors. The minimum threshold for a significant factor loading was defined as 0.30 or above, and cross-loading was defined as an item loading significantly on more than one factor with a difference between the loadings of less than 0.10 [24].
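The loading rules in this paragraph amount to a small decision procedure. A minimal sketch follows, assuming that loadings are compared by absolute value and that the gap is taken between the two largest loadings. Both points are assumptions, and the example loadings are hypothetical rather than taken from Table 2.

```python
def classify_item(loadings, salient=0.30, min_gap=0.10):
    """Classify one item from its loadings on each factor.

    salient: minimum absolute loading counted as significant (0.30 in the text).
    min_gap: if two salient loadings differ by less than this, the item is
             treated as cross-loading (0.10 in the text).
    """
    magnitudes = sorted((abs(l) for l in loadings), reverse=True)
    n_salient = sum(m >= salient for m in magnitudes)
    if n_salient == 0:
        return "no salient loading"
    if n_salient > 1 and (magnitudes[0] - magnitudes[1]) < min_gap:
        return "cross-loading"
    return "retained"

# Hypothetical two-factor loadings for three items.
examples = {
    "item_a": [0.72, 0.05],   # clear loading on one factor
    "item_b": [0.41, 0.36],   # salient on both factors, gap below 0.10
    "item_c": [0.12, 0.20],   # nothing reaches 0.30
}
labels = {name: classify_item(l) for name, l in examples.items()}
```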
Pearson’s correlation analyses were then performed to assess the convergent validity of both versions of the K-ASRM and K-YMRS. Steiger’s Z-test was applied to determine whether these correlations differed significantly between the two versions [25]. Finally, ROC curves were generated to assess how well the scale scores predicted the likelihood of being diagnosed as manic. Consistent with Skokou et al. [9], we defined the manic state as a YMRS score greater than 11 and the non-manic state as a YMRS score less than or equal to 11 when dichotomous variables were necessary for analysis. In interpreting the ROC analyses, we regarded the value of the AUC between 0.70 and 0.80 as fair, between 0.80 and 0.90 as good, and values of 0.90 or higher as excellent [26]. Optimal cutoff scores were determined using Youden’s index [27], which maximizes the balance between sensitivity and specificity.
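To make the ROC procedure concrete, the sketch below dichotomizes patients by the YMRS rule stated above (manic if YMRS > 11), computes the AUC as the probability that a manic patient scores higher than a non-manic patient (ties counted as one half), and picks the cutoff that maximizes Youden's index J = sensitivity + specificity − 1. The scores are made-up toy values, and the "score ≥ cutoff" convention is an assumption chosen to match how the cutoffs are reported in the Results.

```python
def auc(scores, is_manic):
    """AUC via pairwise comparison of manic and non-manic scores."""
    pos = [s for s, y in zip(scores, is_manic) if y]
    neg = [s for s, y in zip(scores, is_manic) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(scores, is_manic):
    """Return (cutoff, J, sensitivity, specificity) maximizing Youden's J,
    classifying a patient as manic when score >= cutoff."""
    n_pos = sum(1 for y in is_manic if y)
    n_neg = len(is_manic) - n_pos
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, is_manic) if y and s >= c)
        tn = sum(1 for s, y in zip(scores, is_manic) if not y and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (c, j, sens, spec)
    return best

# Hypothetical toy data: clinician-rated YMRS and self-rated ASRM totals.
ymrs = [2, 4, 15, 20, 8, 13, 0, 11]
asrm = [1, 2, 6, 9, 3, 2, 0, 4]
manic = [y > 11 for y in ymrs]          # YMRS > 11 defines the manic state

toy_auc = auc(asrm, manic)
cutoff, j, sens, spec = youden_cutoff(asrm, manic)
```

When several cutoffs tie on J, this sketch keeps the lowest one. Statistical packages may break ties differently.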
RESULTS
Descriptive statistics and reliability
Table 1 presents the descriptive statistics and mean differences between the groups. The internal consistency of both scales was in the good to excellent range, with McDonald’s ω values of 0.81 for K-ASRM-5 and 0.83 for K-ASRM-11, respectively. Additionally, item-to-total correlations within each scale were all acceptable, with values ranging from 0.43 to 0.57 for K-ASRM-5 and from 0.22 to 0.69 for K-ASRM-11. These findings strongly support the reliability of the two ASRM versions.
Reliability analyses by diagnostic group revealed a differential pattern. Both the bipolar I disorder and bipolar II disorder groups demonstrated good to excellent internal consistency on both scales. For the K-ASRM-5, McDonald’s ω was 0.88 for the bipolar I disorder group and 0.90 for the bipolar II disorder group. Similarly, for the K-ASRM-11, McDonald’s ω was 0.92 for the bipolar I disorder group and 0.87 for the bipolar II disorder group. However, the depressive disorder group exhibited lower internal consistency, with McDonald’s ω values of 0.58 for the K-ASRM-5 and 0.73 for the K-ASRM-11, reflecting low to acceptable reliability.
Construct validity
Before conducting CFA on the K-ASRM-5 items, the Shapiro-Wilk test for normality was performed, revealing that the distributions of all items deviated significantly from normality (p<0.001). Thus, as previously stated, the WLSMV estimation, which does not assume the normality of data, was utilized. The CFA results indicated that the unidimensional factor model demonstrated an adequate fit for K-ASRM-5 (χ²=4.15, df=5, p=0.53, CFI=1.0, RMSEA=0.00, SRMR=0.02) with all factor loadings ranging from 0.51 (less need for sleep) to 0.88 (psychomotor agitation [more activity]).
EFA with the K-ASRM-11 items suggested a dual-factor structure (Table 2). The first factor (F1) included four items from the ASRM-5, in addition to item 5 (grandiosity), whereas the second factor (F2) consisted of 5 items (irritability, lability, racing thoughts, distractibility, and self-defeating impulsive behaviors). Most items showed strong factor loadings to the designated factor, ranging from 0.50 to 0.87, except item 6 (less need for sleep), which was dropped due to cross-loading.
Convergent validity
Pearson’s correlation analysis was conducted to evaluate convergent validity. There were significant medium-sized positive correlations between the total score of the K-YMRS and both the K-ASRM-5 (r=0.48, p<0.01) and K-ASRM-11 (r=0.43, p<0.01), which were not statistically different (Steiger’s Z=0.89, p=0.38). When the K-ASRM-11 subfactors were considered, both factors demonstrated significant positive correlations with the total K-YMRS score. However, the correlation with F1 (r=0.46, p<0.01) was significantly larger than that with F2 (r=0.26, p<0.01) (Steiger’s Z=2.14, p<0.05).
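For readers unfamiliar with Steiger's Z-test, the sketch below implements one common form of the test for two dependent correlations that share a variable (here, the K-YMRS). This is the Fisher-z version with a pooled covariance term. Whether it matches the exact variant computed in the study is an assumption. The correlation between the two self-report scores (r23) is not given in the text, so the value used here is a hypothetical placeholder and the resulting Z is not expected to reproduce the reported statistic.

```python
from math import atanh, sqrt, erf

def steiger_z(r12, r13, r23, n):
    """Compare r12 and r13, two correlations sharing variable 1,
    given r23 (correlation between variables 2 and 3) and sample size n."""
    r_bar = (r12 + r13) / 2.0
    r_bar_sq = r_bar ** 2
    # Covariance term between the two Fisher-transformed correlations.
    psi = r23 * (1 - 2 * r_bar_sq) - 0.5 * r_bar_sq * (1 - 2 * r_bar_sq - r23 ** 2)
    s = psi / (1 - r_bar_sq) ** 2
    z = (atanh(r12) - atanh(r13)) * sqrt(n - 3) / sqrt(2 - 2 * s)
    return z

def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# r12, r13 and n are from the text; r23 = 0.80 is a hypothetical placeholder.
z_example = steiger_z(r12=0.48, r13=0.43, r23=0.80, n=122)
p_example = two_sided_p(z_example)
```

The test is sensitive to r23: the more strongly the two self-report scores correlate with each other, the smaller the difference in their correlations with the K-YMRS needs to be to reach significance.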
In subsequent analyses examining whether convergent validity varied by diagnostic group, a differential pattern emerged, similar to that of the reliability analyses. In the bipolar I disorder group, the K-YMRS showed significant positive correlations with both K-ASRM-5 (r=0.53, p=0.02) and K-ASRM-11 (r=0.59, p=0.01). The bipolar II disorder group also demonstrated a comparable level of significant positive correlations between K-YMRS and both K-ASRM-5 (r=0.61, p<0.01) and K-ASRM-11 (r=0.69, p<0.01). In contrast, the depressive disorder group did not exhibit significant correlations between K-YMRS and either K-ASRM-5 (r=0.11, p=0.42) or K-ASRM-11 (r=0.10, p=0.50).
Diagnostic validity
Finally, ROC curves were generated to determine the optimal cutoff scores for distinguishing the manic group from the non-manic group. The AUC values were 0.78 for the K-ASRM-5 and 0.71 for the K-ASRM-11 (F1=0.77, F2=0.65), indicating fair discriminative power for both versions. The optimal cutoff scores were ≥3 for the K-ASRM-5 and ≥19 for the K-ASRM-11. Detailed information on sensitivity, specificity, positive predictive value, and negative predictive value is provided in Table 3.
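The four indices mentioned above follow directly from a 2×2 classification table at a given cutoff. The sketch below shows the arithmetic. The counts are hypothetical and are not the values in Table 3.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 table.

    tp: manic patients scoring at or above the cutoff
    fn: manic patients scoring below the cutoff
    fp: non-manic patients scoring at or above the cutoff
    tn: non-manic patients scoring below the cutoff
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical counts for a sample of 122 with few manic patients.
metrics = diagnostic_metrics(tp=12, fn=4, fp=30, tn=76)
```

With a low proportion of manic patients, as in this kind of sample, the positive predictive value can be modest even when sensitivity and specificity are acceptable, while the negative predictive value tends to be high.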
DISCUSSION
In response to Altman and Østergaard’s suggestion [11], this study is the first to explore the psychometric properties of the K-ASRM-11 in comparison to the traditional K-ASRM-5 in a clinical sample of bipolar and depressive disorder patients. Overall, the results demonstrated that both versions of the K-ASRM possessed adequate reliability and validity as self-report measures of (hypo)manic symptoms. The reliability was good to excellent in terms of internal consistency and item-to-total correlation. In addition, promising evidence was obtained in support of the construct, convergent, and diagnostic validity of both versions of the K-ASRM. Although the ASRM has been recommended as a representative self-report tool for (hypo)manic symptoms, there are only limited validation studies in well-defined clinical samples and none for ASRM-11. In such circumstances, the present study provides valuable empirical evidence to justify the continued use of the ASRM in clinical practice.
Of additional note, when psychometric properties were compared across diagnostic groups, the reliability and convergent validity of both versions of the ASRM were satisfactory in both the bipolar I and II disorder groups, while they fell below the minimum standards in the depressive disorder group. Though early studies found that self-rating of manic patients could be reliable and valid [3,9], to our knowledge, the present study is the first to report that the psychometric properties of the ASRM are comparable across the bipolar I and II disorder groups. Considering that most participants diagnosed with bipolar II disorder were in a depressive state, it seems that diagnostic group, rather than current mood state, may have contributed to this between-group difference. Taken together, this suggests that the ASRM is a measure that is specifically relevant to capturing (hypo)manic symptoms of BD patients.
Some points require further discussion. First, CFA on the K-ASRM-5 yielded a unifactorial structure consistent with previous research [9,12]. In contrast, a dual-factor solution emerged in the EFA with K-ASRM-11. Existing literature indicates that mania is a multifaceted construct composed of distinct factors [28], and factor analyses of (hypo)manic symptoms suggest a dual structure with bright (active/elated factor) and dark (irritable/risk-taking factor) sides [29]. Our EFA pattern was interpretable, as it was in line with the following distinction: the bright-side items belonged to F1 (elevated mood, increased self-esteem, pressured speech, psychomotor agitation [more activity], and grandiosity), whereas F2 pertained to the so-called dark-side items (irritability, lability, racing thoughts, distractibility, and self-defeating impulsive behaviors). Thus, the use of ASRM-11 would be conducive to a comprehensive evaluation by overcoming the limited content coverage issue raised against ASRM-5, which only captures the bright side.
Second, although the discriminating power of both versions of the K-ASRM was found to be fair, the cutoff score of the 5-item version (≥3) was lower than that reported in previous research. The optimal cutoff score was ≥6 in the original development study [5] and the most recent clinical validation [9]. In our opinion, this difference may stem from the changed anchor wording in the Likert option [11]. In particular, the score of 3 (frequently) in the previous ASRM-5 [5] was changed to 2 (frequently) for three items (decreased sleep need, pressured speech, and psychomotor agitation [more activity]), which may have affected the total score of the K-ASRM-5. Prior to data collection, the first author consulted with the original author, Dr. Altman, regarding this issue with a concern that this might result in a lower cutoff point (personal communication, January 19, 2021); however, the final decision was to test the new Likert options. It seems probable that this change affected the cutoff score for ASRM-5. Roughly speaking, adding one point for each of those three items would raise the cutoff to six if respondents were to answer with the “frequently” option. However, a direct comparison of the two versions of the anchor wording is needed in future investigations to verify this.
Third, we did not find sufficient support for the incremental validity of the longer 11-item version, and further research is warranted before reaching a conclusion. Convergent validity, as evidenced by the correlation with the clinician rating scale, was comparable between the two versions. The diagnostic validity of the longer version did not exceed that of the shorter version. This was due to the low discriminating power of F2, the newly added dark-side items. Considering the initial scale development process, this can be readily understood, as the original authors constructed the ASRM-5 with 5 items that could distinguish manic BD patients from others. This confirms the idea that the bright-side items are more specific to mania, as irritability, distractibility, and impulsivity may also be common in other psychiatric conditions, such as depression.
Taken together, K-ASRM-5 is thought to be an economical option that demonstrates psychometric properties comparable to those of K-ASRM-11. However, this does not preclude the potential utility of K-ASRM-11, depending on the purpose of the evaluation. For example, although the F2 items did not provide additional information for diagnostic classification, they could be meaningful and useful for symptom monitoring and treatment response evaluation in the management of patients with BD, as irritability and impulsivity are known to be highly and inversely correlated with quality of life (QoL) in patients with BD [30,31]. Although more research is needed, our results suggest that the K-ASRM-5, which possesses higher sensitivity and brevity, could be preferable for screening for BD, while the K-ASRM-11 might be useful for symptom monitoring and treatment response evaluation.
Limitations
We acknowledge that several limitations should be considered in interpreting the current results. First, the sample size was not determined by power analysis but relied on the availability and consent of participants during routine practice. The proportion of patients with BD in a (hypo)manic state was relatively small (approximately 13%). However, the total sample size was sufficiently large for factor analysis (e.g., exceeding the 10:1 ratio between sample size and items). Moreover, unlike Skokou et al. [9], psychiatric diagnosis was confirmed using a (semi)structured diagnostic interview, which enhances diagnostic reliability. Second, the K-ASRM-5 was not administered in a standalone manner. Third, the correlation coefficients (r=0.43–0.48) between self-report and clinician ratings were lower than those reported in previous studies that also used the YMRS [5,9]. According to a systematic review [4], the magnitude of the correlation with the YMRS differed depending on the sample characteristics and ranged from 0.40 (BD patient sample) to 0.72 (inpatient sample). In contrast to Skokou et al.’s [9] sample, which consisted of patients with BD and healthy controls, we recruited only patients with BD and depressive disorder, as a differential diagnosis is critical between these two groups. Therefore, it seems possible that the restricted range may have resulted in a reduced magnitude of the correlation. Finally, the “less need for sleep” item was dropped in the EFA of the K-ASRM-11 due to cross-loading. Considering that it is a specific symptom that is uniquely related to (hypo)mania, future research needs to address this issue.
Conclusion
In conclusion, this study provided psychometric properties of the ASRM-11 in comparison with the ASRM-5 in a clinical sample diagnosed with mood disorders. Although the K-ASRM-11 did not outperform the K-ASRM-5 in terms of convergent and diagnostic validity, it has comparative strength in a more comprehensive content coverage of (hypo)manic symptoms, which is possibly important for QoL issues in BD. If reliability and validity are verified, self-reports such as the ASRM can be easily and conveniently utilized as an adjunct method of assessment in mass screening and repeated symptom monitoring of BD, as they are more time- and effort-efficient than clinician ratings. We hope that this research will contribute to the early detection and successful management of BD.
Notes
Availability of Data and Material
The datasets generated or analyzed during the study are available from the corresponding author on reasonable request.
Conflicts of Interest
The authors have no potential conflicts of interest to disclose.
Author Contributions
Conceptualization: Bin-Na Kim, Jungkyu Park, Ji Hyun Baek. Data curation: Ji Hyun Baek. Formal analysis: Bin-Na Kim, Jungkyu Park, Yongmin Shin. Funding acquisition: Bin-Na Kim. Investigation: Bin-Na Kim, Jungkyu Park, Ji Hyun Baek. Methodology: Bin-Na Kim, Jungkyu Park, Ji Hyun Baek. Project administration: Bin-Na Kim. Resources: Ji Hyun Baek. Supervision: Jungkyu Park, Ji Hyun Baek. Validation: Bin-Na Kim, Ji Hyun Baek. Writing—original draft: Bin-Na Kim, Jungkyu Park. Writing—review & editing: Jungkyu Park, Ji Hyun Baek, Yongmin Shin.
Funding Statement
This work was supported by the Gachon University research fund of 2023 (GCU-202304630001).
Acknowledgments
None
