Validating the Autism Diagnostic Interview-Revised in the Korean Population
Abstract
Objective
This study aimed to examine the validity of the Korean version of the Autism Diagnostic Interview-Revised (K-ADI-R) and determine its efficacy in identifying individuals with autism spectrum disorder (ASD).
Methods
Data were pooled from several past and ongoing studies as well as clinical records acquired at Seoul National University Bundang Hospital from 2008 to 2017. The K-ADI-R was administered and scored by trained, research-reliable examiners. Validity was assessed using sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and Cohen’s kappa.
Results
A total of 1,271 participants (age 88.9±62.42 months, male=927) were included. The K-ADI-R yielded strong psychometric properties with high sensitivity (86.06–99.27%), specificity (84.75–99.55%), PPV (92.33–99.72%), and NPV (79.43–98.64%). Scores on the K-ADI-R diagnostic algorithm differed significantly between the ASD and non-ASD groups regardless of age and sex (p<0.001). Agreement between the K-ADI-R and other ASD-related measurements ranged from good to excellent.
Conclusion
Despite language and cultural differences, the K-ADI-R demonstrated high sensitivity, specificity, PPV, and NPV across a wide range of participants, suggesting that it is a valuable diagnostic instrument for individuals with ASD.
INTRODUCTION
Autism spectrum disorder (ASD) is defined by persistent deficits in social communication and social interaction, together with restricted, repetitive patterns of behavior, interests, or activities [1]. The prevalence of ASD has risen significantly in recent years, and ASD is considered one of the fastest-growing developmental disabilities affecting individuals worldwide [2,3]. As ASD influences multiple domains of an individual’s life throughout the lifespan, the timing of detection is crucial to ensure early linkage to care and optimize long-term outcomes [4-6]. However, because ASD manifests as a heterogeneous phenotype with continuous variation, diagnosis can be complex and requires extensive information ranging from early childhood development to school life and social relationships with peers [7].
While initially developed for research purposes, the Autism Diagnostic Interview-Revised (ADI-R) and the Autism Diagnostic Observation Schedule (ADOS) are considered the gold standard diagnostic instruments for ASD [8-11]. Both closely parallel the dimensions of the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision (DSM-IV-TR) and the ICD-10 criteria. The ADI-R and ADOS are often administered together to improve agreement with clinical judgment by drawing on multiple sources of information about an individual’s past and current behavior [12-14].
The ADI-R is a semi-structured interview administered by trained examiners to caregivers of individuals with suspected ASD. The assessment solicits information from caregivers on four domains: 1) social interactions, 2) communication, 3) restricted, repetitive and stereotyped behaviors (RRB), and 4) the period in which concerns or ASD-related symptoms became apparent. For each item, the examiner scores current and past behaviors by level of severity, based on the evidence provided during the interview, with higher scores indicating greater abnormality. Depending on the individual’s age, a diagnostic algorithm systematically combines a subset of the questions and classifies the individual as ASD or non-ASD according to whether the domain scores exceed the cut-off points.
Having demonstrated good to excellent sensitivity and specificity across varying samples, the ADI-R has attracted international attention, with studies examining whether its psychometric properties are equivalent across diverse cultures [15,16]. Previous research has shown that adapting an instrument is a complicated and challenging process that involves linguistic translation and adjustment to cultural values or customs while considering feasibility [17,18]. Despite increased awareness, understanding, and recognition of ASD, cultures differ in how much concerns are over- or under-reported because standards for which behaviors are considered acceptable and expected differ [19,20]. Hence, there is an increasing need to adequately assess the compatibility of the ADI-R in different population groups to ensure the instrument’s appropriateness.
To date, the ADI-R has been translated into 17 languages, with validation studies conducted in Germany, Greece, Finland, Brazil, Japan, the Netherlands, China, Sweden, the Latino population in the United States, South Korea, and Poland [18,21-30]. While all of the studies have reported promising results for the use of the ADI-R in clinical and research settings, most have been restricted to small sample sizes or to participants with relatively narrow age ranges, which limits the generalizability of the findings. For instance, Kim et al. [29] demonstrated moderate-to-high classification accuracy for individuals with ASD using the ADI-R in South Korea. However, that study only included school-age (7–14-year-old) children, leaving a lack of evidence for its use with younger children, adolescents, and adults. Additionally, because concerns have been raised that individual factors (e.g., age or intelligence quotient) as well as administrative factors influence ADI-R results, additional research on its applicability is required [31,32]. Therefore, the purpose of this study is to expand on previous studies by investigating the diagnostic validity of the Korean-translated version of the ADI-R (K-ADI-R) in a wider age range and examining whether its ability to detect individuals with ASD is comparable to that in the original validation conducted by Lord et al. [8].
METHODS
Participants and data collection
Data were compiled from past and ongoing projects, including a genetic study to identify ASD-related biomarkers, a social skills training intervention for individuals with ASD, and the development of an early ASD screening instrument, as well as from outpatient clinical records at Seoul National University Bundang Hospital (SNUBH) from 2008 to 2017. Participants were recruited through a combination of diverse routes (i.e., the psychiatric clinic and Pediatric and Child Rehabilitation clinic at SNUBH, local primary clinics, community health and mental health centers, daycare centers, advertisements on both online and offline bulletin boards at public institutions, and with the aid of parents’ self-help communities). While specific projects (i.e., the genetics study) only included participants whose biological parents are Korean, other studies did not limit participation based on ethnicity but required that caregivers could comprehend the interview questions and were comfortable being evaluated with the K-ADI-R. Each participant provided informed consent for the study in which they were enrolled, and the retrospective analysis of the collected data for the present study was approved by the Institutional Review Board at SNUBH (IRB no. B-1711-435-106).
As seen in Table 1, a total of 1,271 children, adolescents, and adults aged 24 months to 34 years were included in the analyses: 825 participants with ASD (84.6% male) and 446 non-ASD participants (51.1% male). Participants included individuals with ASD, unaffected siblings, and typically developing individuals. A subgroup of the non-ASD group who scored lower than 80 points on the full-scale intelligence quotient (FSIQ), the Korean version of the Vineland Adaptive Behavior Scale, second edition (K-VABS), or the Korean Vineland Social Maturity Scale (K-SMS) while not meeting the ASD diagnostic criteria was separately categorized as other developmental disorders (OD). In addition to a battery of parent-report questionnaires, described in further detail below, trained, research-reliable examiners administered the Korean-translated versions of the ADOS and ADI-R during the participants’ one-time on-site visit. Sessions of the diagnostic evaluations were video-recorded and checked to ensure adequate levels of inter-rater reliability. In total, 43.4% of the videos were watched by two independent raters, while another 10% of the cases were viewed together during weekly research meetings. After reviewing the comprehensive information gathered, two board-certified psychiatrists made the best estimate clinical diagnosis following the DSM-IV-TR and DSM-5 criteria.
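The grouping rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the authors' procedure: the function name `categorize`, the group labels, and the handling of missing scores (treated here as "no evidence of a score below 80") are assumptions made for the example.

```python
from typing import Optional

# Threshold taken from the text: "scored lower than 80 points".
OD_THRESHOLD = 80


def categorize(meets_asd_criteria: bool,
               fsiq: Optional[float] = None,
               k_vabs: Optional[float] = None,
               k_sms: Optional[float] = None) -> str:
    """Return 'ASD', 'OD', or 'non-ASD' for one participant.

    Hypothetical helper: missing scores (None) are simply ignored,
    which is an assumption and not something the article specifies.
    """
    if meets_asd_criteria:
        return "ASD"
    available = [s for s in (fsiq, k_vabs, k_sms) if s is not None]
    # OD: not ASD, but below 80 on at least one available measure.
    if any(score < OD_THRESHOLD for score in available):
        return "OD"
    return "non-ASD"


if __name__ == "__main__":
    print(categorize(False, fsiq=75, k_vabs=92))   # OD
    print(categorize(False, fsiq=101, k_sms=95))   # non-ASD
    print(categorize(True, fsiq=70))               # ASD
```

Because a participant with no available FSIQ, K-VABS, or K-SMS score cannot be placed in the OD group with certainty, a real analysis would likely exclude such cases rather than default them to non-ASD.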
Measurements
Autism Diagnostic Interview-Revised (ADI-R) [9]
The ADI-R is a semi-structured interview that allows parents or primary caregivers to report on a child’s current behavior as well as reflect on early developmental history, provided that the child has a mental age above two years. Consisting of 93 questions, the ADI-R uses the information provided by the caregiver to rate each item on a scale of 0 (socially appropriate) to 3 (evidence of severe abnormality). Depending on the individual’s age, scores are converted, and the subset of items that forms the diagnostic algorithm is summed into four domains: 1) social interactions; 2) communication (verbal and non-verbal); 3) RRB; and 4) whether developmental concerns were present before three years of age. All four domains need to reach their cut-offs for an individual to meet the ASD diagnostic criteria. The K-ADI-R used in the present study was translated and back-translated by Yoo et al. [33] and approved through Western Psychological Services.
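As a rough illustration of the decision rule, the sketch below compares summed domain scores with per-domain cut-offs and requires every domain to reach its threshold. The cut-off values, the domain keys, and the function names are placeholders chosen for the example and are not taken from this article; the instrument manual is the authoritative source. The optional `min_domains` argument is also an assumption, included so the same function can express the relaxed "at least one domain" rule used in the statistical analyses.

```python
from typing import Dict, Optional

# Placeholder cut-offs for illustration only (assumed, not from the article).
EXAMPLE_CUTOFFS: Dict[str, int] = {
    "social": 10,
    "communication": 8,
    "rrb": 3,
    "onset": 1,
}


def domains_met(scores: Dict[str, int], cutoffs: Dict[str, int]) -> Dict[str, bool]:
    """Flag each domain whose summed score equals or exceeds its cut-off."""
    return {domain: scores[domain] >= cutoffs[domain] for domain in cutoffs}


def classify(scores: Dict[str, int],
             cutoffs: Dict[str, int],
             min_domains: Optional[int] = None) -> str:
    """Classify as 'ASD' when enough domains reach their cut-offs.

    By default every domain must be met (the standard algorithm);
    min_domains=1 gives the relaxed 'at least one domain' variant.
    """
    required = len(cutoffs) if min_domains is None else min_domains
    n_met = sum(domains_met(scores, cutoffs).values())
    return "ASD" if n_met >= required else "non-ASD"


if __name__ == "__main__":
    participant = {"social": 14, "communication": 9, "rrb": 2, "onset": 1}
    print(classify(participant, EXAMPLE_CUTOFFS))                 # non-ASD
    print(classify(participant, EXAMPLE_CUTOFFS, min_domains=1))  # ASD
```

The example participant misses only the RRB cut-off, so the strict rule returns non-ASD while the relaxed rule returns ASD, which is the trade-off between specificity and sensitivity that the two analysis approaches explore.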
Autism Diagnostic Observation Schedule (ADOS) and Autism Diagnostic Observation Schedule-second edition (ADOS-2) [10,11]
The ADOS and ADOS-2 are observational instruments that use play-based methods to assess communication and social behaviors in a series of standardized contexts. Participants’ behavior and interactions during the assessment are recorded by trained examiners and used to determine item scores in verbal and non-verbal communication, social interaction, imagination/creativity, and RRBs. Taking into account an individual’s developmental trajectory by age and expressive verbal ability, the ADOS-2 consists of five modules, each with slightly different tasks and diagnostic algorithms. Summing the converted scores for a set of items yields the Social Affect and RRB domains. Although each module has a distinct combination of items and cut-off points, the total score obtained by adding the two domains classifies individuals as autism, ASD, or non-spectrum. While Western Psychological Services approved the Korean translations of both the ADOS [34] and ADOS-2 [35], the latter was administered to toddlers and to any participant whose data were collected after publication of the second edition.
Korean version of the Childhood Autism Rating Scale (K-CARS) [36]
The K-CARS is a rating scale used to identify the presence of ASD-related behaviors and assess the severity of symptoms in children over 24 months. Fifteen items with scores ranging from 1 (appropriate behavior for age level) to 4 (severe deviance compared to age level) are rated by evaluators using information gathered from clinical observations, caregiver interviews, and other questionnaires. Item scores are added together to derive a total, which then classifies an individual as non-ASD or indicates mild, moderate, or severe ASD symptoms. While the psychometric properties of the CARS are well documented, the cut-off score used in this study was set to 28 based on previous validations in Korea [37].
Korean Vineland Social Maturity Scale (K-SMS) [38]
The K-SMS, based on Doll’s Vineland Social Maturity Scale (SMS) [39], is used as an interview and behavior-observation scale to evaluate social competence and adaptive functioning. Conducted with a caregiver who is familiar with the person being assessed, it can be administered to individuals from birth up to 30 years of age. Six domains, each organized into year levels, were built upon the standardization using a representative sample of Korean participants [40]. Results of the K-SMS can be used to determine the social maturity and social quotient of an individual.
Social Reciprocity Scale (SRS) [41]
The SRS is a 65-item caregiver-report questionnaire that is often used as a screening instrument to recognize ASD-related behaviors and capture their severity. With high internal consistency and good discriminant validity, the SRS has been useful in distinguishing individuals with and without ASD [42,43]. Items are rated on a 4-point Likert scale, with higher T-scores suggesting more significant impairments. The Korean-translated version of the SRS was approved by Western Psychological Services for use in each of the past and ongoing studies.
Social Communication Questionnaire (SCQ) [44]
The SCQ is a 40-item questionnaire based on the ADI-R. Items are rated as “yes” or “no” by caregivers of individuals aged 24 months or above. There are two versions of the SCQ: the current form, which focuses on the child’s behavior in the last three months, and the lifetime form, which asks about the developmental history in the past 12 months. Behaviors are assessed in the domains of social interaction, language and communication, and RRB. Because the SCQ was initially intended as a screening instrument for children who are four years of age, researchers have adjusted its threshold when including younger children [45]. In Korea, Kim et al. [46] reported that cut-off scores of 10 points for children under 47 months and 12 points for children over 48 months were most effective at maximizing sensitivity and specificity for identifying individuals with ASD.
Korean version of the Vineland Adaptive Behavior Scale, second edition (K-VABS) [47,48]
The K-VABS questionnaire was used to measure an individual’s adaptive functioning from birth through 90 years of age. Items are arranged in order of developmental sequence across nine subdomains. Caregivers rate how often each item is performed on a 3-point scale. Scores are standardized with a mean of 100 and a standard deviation of 15 in the domains of Communication, Socialization, Daily Living Skills, and Motor Skills. An overall rating derived from all four domains was used to determine whether an individual has OD.
Statistical analyses
Group differences in participant characteristics were analyzed using chi-square tests for categorical variables and independent-sample t-tests for continuous variables such as the K-ADI-R domain and subdomain scores. Independent-sample t-tests were also used to examine whether sex affected the K-ADI-R scores, followed by an additional exploration of age differences (i.e., children, adolescents, adults). Additionally, following the ADI-R diagnostic algorithm, age groups were divided into verbal and non-verbal individuals below 47 months and those who were 48 months and older.
Agreement between participant categorization by the K-ADI-R algorithm and the final clinical diagnosis was evaluated using sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and Cohen’s kappa (k). To examine whether the classification rule affected the results, the analysis was conducted in two ways. While the diagnostic algorithm requires individuals to equal or exceed the thresholds across all four domains to be classified as ASD, an additional analysis classified participants as ASD if they met the cut-off score in at least one domain. Agreement between the K-ADI-R results and other instruments measuring ASD-related traits or symptoms was assessed with k coefficients and interpreted according to the divisions proposed by Landis and Koch (Slight: 0–0.20; Fair: 0.21–0.40; Moderate: 0.41–0.60; Substantial: 0.61–0.80; Almost Perfect: 0.81–1).
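For readers who want to reproduce the agreement statistic, the following minimal sketch computes Cohen's kappa for two binary classifications from a 2×2 table and maps the result onto the Landis and Koch bands listed above. The analyses in this study were run in SPSS; the function names and the example counts here are hypothetical, and the "Poor" label for negative values comes from Landis and Koch's original scheme rather than from the bands quoted in this article.

```python
def cohens_kappa(both_pos: int, only_first_pos: int,
                 only_second_pos: int, both_neg: int) -> float:
    """Cohen's kappa for two binary ratings summarized as a 2x2 table."""
    n = both_pos + only_first_pos + only_second_pos + both_neg
    observed = (both_pos + both_neg) / n
    # Chance agreement from the marginal totals of each rating.
    first_pos = both_pos + only_first_pos
    second_pos = both_pos + only_second_pos
    first_neg = n - first_pos
    second_neg = n - second_pos
    expected = (first_pos * second_pos + first_neg * second_neg) / n ** 2
    return (observed - expected) / (1 - expected)


def landis_koch(kappa: float) -> str:
    """Label a kappa value using the Landis and Koch divisions."""
    if kappa < 0:
        return "Poor"  # below the bands quoted in the text
    if kappa <= 0.20:
        return "Slight"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Substantial"
    return "Almost Perfect"


if __name__ == "__main__":
    # Hypothetical counts, not data from this study.
    k = cohens_kappa(both_pos=45, only_first_pos=5,
                     only_second_pos=10, both_neg=40)
    print(f"kappa = {k:.3f} ({landis_koch(k)})")
```

In the hypothetical table, observed agreement is 0.85 and chance agreement is 0.50, giving a kappa of about 0.70, which falls in the "Substantial" band.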
To examine whether the K-ADI-R could differentiate individuals with ASD from individuals with OD, sensitivity, specificity, PPV, NPV, and Cohen’s kappa were calculated between the two groups. As each research study followed different protocols and acquired various measurements, only those participants who had available data to categorize groups with diagnostic certainty were included for further analysis. All statistical analyses were performed using SPSS Statistics 26.0 (IBM Corp., Armonk, NY, USA).
RESULTS
Of the total 1,271 participants, 72.9% (n=927) were male, and there was a statistically significant difference (p<0.001) in sex distribution between the ASD (84.6% male) and non-ASD (51.1% male) groups. Based on the DSM-IV criteria, of the 825 individuals diagnosed with ASD, 448 were classified as having autistic disorder, 291 as having Asperger’s disorder, and 86 as having pervasive developmental disorder not otherwise specified. The participants’ average age was 88.9 months (SD=62.4), with the ASD group averaging 92.2 months (SD=64.1). The FSIQ of the ASD group was 82.5 (SD=26.6), which was significantly lower than that of the non-ASD group (105.3, SD=18.9) (p<0.001). The mean K-ADI-R algorithm scores, shown in Table 1, differed significantly between the ASD and non-ASD groups across all domains (p<0.001). The t-test results show that the percentage of individuals with language delay, language regression, or loss of skills, measured with the K-ADI-R items, was significantly higher in the ASD group (p<0.001).
Analysis of the data showed significantly higher mean K-ADI-R subdomain scores for the ASD group than for the non-ASD group (Table 2) (p<0.001). The analysis stratified by sex, conducted to investigate potential sex differences, showed that scores in the ASD group were significantly higher than those in the non-ASD group for both males and females (Supplementary Table 1 in the online-only Data Supplement). Similarly, when broken down by age and verbal ability, the results showed significant differences, with consistently higher scores in the ASD group (Supplementary Tables 2 and 3 in the online-only Data Supplement).
When the ASD group was defined as participants who equaled or exceeded the cut-off score in at least one domain of the K-ADI-R, the result was high sensitivity (99.27%), specificity (84.75%), PPV (92.33%), and NPV (98.64%) across all age groups regardless of expressive verbal abilities (Table 3). Limiting the ASD group to those who met the cut-offs in all four domains of the K-ADI-R also yielded high sensitivity (86.06%), specificity (99.55%), PPV (99.72%), and NPV (79.43%). Similar patterns were seen when the data were analyzed by age group.
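The four validity measures follow directly from a 2×2 table of K-ADI-R classification against clinical diagnosis. The sketch below shows the arithmetic. The cell counts are not reported in the text; they are an assumption, back-calculated here from the group sizes (825 ASD, 446 non-ASD) and the reported sensitivity and specificity, so Table 3 of the article remains the authoritative source.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV from 2x2 cell counts."""
    return {
        "sensitivity": tp / (tp + fn),  # ASD cases flagged by the instrument
        "specificity": tn / (tn + fp),  # non-ASD cases correctly not flagged
        "ppv": tp / (tp + fp),          # flagged cases that are truly ASD
        "npv": tn / (tn + fn),          # unflagged cases that are truly non-ASD
    }


# Assumed counts reconstructed from the reported percentages (not from Table 3).
at_least_one = diagnostic_metrics(tp=819, fp=68, fn=6, tn=378)
all_four = diagnostic_metrics(tp=710, fp=2, fn=115, tn=444)

if __name__ == "__main__":
    for label, metrics in (("at least one domain", at_least_one),
                           ("all four domains", all_four)):
        summary = ", ".join(f"{name}={value * 100:.1f}%"
                            for name, value in metrics.items())
        print(f"{label}: {summary}")
```

Under these assumed counts, the all-four-domain rule reproduces the reported 86.06%, 99.55%, 99.72%, and 79.43%, and the at-least-one rule reproduces the reported sensitivity, specificity, and PPV, with an NPV of 378/384, or about 98.4%. The comparison also makes the trade-off explicit: relaxing the rule moves 109 ASD participants from false negative to true positive at the cost of 66 additional false positives.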
Based on the pre-defined standards, 54 participants were classified into the OD group (Supplementary Table 4 in the online-only Data Supplement). Attempts to explore the validity in differentiating ASD and OD by using the K-ADI-R resulted in ranges of 86.06–99.27% sensitivity, 59.26–98.15% specificity, 97.38–99.86% PPV, and 31.55–84.21% NPV, depending on the number of domains used to classify individuals into the ASD group (Table 3). The small number of participants with OD restricted further analysis by age groups.
As seen in Table 4, Cohen’s kappa for the agreement between the K-ADI-R and other ASD-related measurements such as the K-ADOS, K-CARS, SCQ, and SRS ranged from moderate to excellent. When participants were grouped as ASD if they met the cut-off in at least one domain of the K-ADI-R, k values ranged from 0.429 to 0.947, whereas when they were grouped as ASD only if they satisfied all four domains, k values ranged from 0.481 to 1.00.
DISCUSSION
This study examined the diagnostic validity of the K-ADI-R, which differentiated ASD and non-ASD participants by showing significant differences in algorithm scores between the two groups. The study is particularly meaningful because the analyses were based on a large sample with a wide age range, covering individuals from 24 months to 34 years. Additionally, the analyses compared participants who met the K-ADI-R diagnostic algorithm thresholds in at least one domain with those who met them in all four domains. The results showed high sensitivity, specificity, PPV, and NPV in all age groups and good to excellent agreement with the ADOS, CARS, SCQ, and SRS. Incorporating the results of the K-ADI-R when making the best estimate clinical diagnosis could have influenced our findings. To mitigate this issue, and following previous studies demonstrating increased diagnostic accuracy when multiple sources of information are combined [49], our best estimate clinical diagnosis was based on a combination of direct observations, caregiver questionnaires and interviews, and other psychological assessments.
This study found excellent sensitivity (99.27%) and specificity (84.75%) for the K-ADI-R. The overall PPV and NPV were 92.33 and 98.44%, respectively, indicating excellent clinical utility. When the ASD group was limited to those who satisfied all four domains of the K-ADI-R, the instrument also showed high sensitivity (86.06%), specificity (99.55%), PPV (99.72%), and NPV (79.43%). In addition to sensitivity, specificity, PPV, and NPV, this study compared algorithm scores for each domain and found significant differences in all categories, which strengthens the findings. Overall, these results suggest that the K-ADI-R is an effective tool with good to excellent validity in diagnosing ASD. Even when comparisons were limited to the OD group, the K-ADI-R demonstrated high validity in detecting individuals with ASD.
Previous studies investigating the diagnostic validity of the ADI-R in samples with different languages and cultures have reported good to excellent sensitivity for the diagnosis of ASD [18,22,24-26,50]. Lord et al. [8] conducted a validation study of the original English version of the ADI-R and found high sensitivity (96%) and specificity (92%). Compared to the initial validation of the ADI-R, the current study showed higher sensitivity but lower specificity. Findings from studies in other languages have been inconsistent, with results varying widely. For example, while the study with Brazilian subjects showed perfect sensitivity and specificity [22,24], the study with a Japanese sample of 317 participants aged 2 to 19 years found the sensitivity and specificity to be 92 and 89%, and the PPV and NPV to be 89 and 90%, respectively [25]. The ADI-R sensitivity and specificity in the Greek sample were somewhat lower than in other studies (85 and 75%, respectively). The Greek and Brazilian studies had small sample sizes (77 and 40 individuals, respectively), and the use of a homogeneous control group could have accounted for the differences in validity. The participants’ chronological and mental age could also have affected the results, with studies of younger subjects reporting lower specificity and sensitivity [22,51]. One study of participants aged 16 to 31 months showed poor sensitivity (56%) and moderate specificity (67%) [51]. In contrast, the study in Brazil, which showed 100% sensitivity and specificity, was conducted with participants aged 7 to 18 years [24]. A systematic review supports the finding that ADI-R sensitivity for children under 3 years of age (82%) is much lower than that for children over 3 years of age (91%) [32].
The differences in validity measures can be attributed to varying sample sizes, the severity of symptoms, characteristics of comparison groups, as well as linguistic and cultural factors [18].
Cultural factors may shape the perception of ASD symptoms because risk signs are judged differently, which may stem from varying values and standards regarding what is deemed appropriate behavior [52]. For instance, Bong et al. [53] found that cultural differences potentially contributed to lower sensitivity regarding social smiling among Korean parents. Question items may also need to be worded in a certain way, with examples applicable to the communities being assessed [54]. For example, white parents tended to report higher levels of communication symptoms than Latina parents [55]. Among other cross-cultural comparisons, one study showed that Latina parents underreported social interactions, and another found that Latina parents reported lower levels of restricted, repetitive and stereotyped behaviors than white parents [56,57]. As our study was primarily focused on the validity of the K-ADI-R in detecting individuals with ASD, we did not examine specific question items, but we urge future studies to investigate whether patterns of cultural differences affect the validity of the assessment when used in non-Western populations.
Several interviews, questionnaires, and observational instruments have been developed to support clinicians in assessing the specific behaviors found in individuals with ASD. The ADI-R, ADOS, CARS, SCQ, and SRS are typical examples, and past studies have analyzed the diagnostic agreement between these assessments. According to a study of 119 children aged 9 to 13 years, the SCQ and SRS were highly correlated with the ADI-R, while other studies showed that agreement between the ADI-R and CARS reached 85.7% [58,59]. In this study, Cohen’s kappa (0.432–0.842) likewise indicated moderate to excellent diagnostic consistency between the ADI-R and other diagnostic tools. The CARS showed the lowest level of agreement with the ADI-R (k=0.432). This finding might be due to the psychometric characteristics of the K-CARS. The K-CARS cut-off score used in the study was 28, based on the standardization and validation study of the Korean version of the CARS conducted in 1998 [37]. Because of high rates of false negatives (26.2%), especially among high-functioning subjects, and an emphasis on sensory sensitivity, the cut-off scores were recently re-adjusted [60]. The SRS and the SCQ are questionnaires completed by parents, caregivers, and teachers, whose interpretation of the questions may differ from that of the trained clinicians who score the ADI-R [7]. Caregivers and teachers may downplay or exaggerate a child’s behaviors, or they may not be aware of the symptoms or risk signs of ASD. Their answers may also vary depending on which aspect of a question they focus on. Therefore, it may be beneficial to reach a diagnosis using multiple tools rather than relying on one source.
Although the current study has many positive findings, it has several limitations. First, it did not demonstrate how ASD severity and functioning levels are reflected in the ADI-R. Second, the sample of adult participants was relatively small, which restricted further analysis by age group. Third, in addition to validating the ADI-R, the study sought to identify differences between typically developing (TD) individuals and those with OD and ASD. However, some of the information used to classify individuals into the OD group, such as the FSIQ, VABS, and SMS, was missing, which limited the classification of the non-ASD groups. Further studies are encouraged to confirm differences between TD, OD, and ASD.
To our knowledge, this study used the largest sample in Korea to examine the validity of the K-ADI-R. It showed high sensitivity, specificity, PPV, and NPV across a wide age range and demonstrated excellent agreement with other ASD diagnostic instruments. These findings suggest that the ADI-R might be a valuable diagnostic instrument for individuals with ASD across countries with different languages, cultural backgrounds, and levels of awareness of ASD.
Supplementary Materials
The online-only Data Supplement is available with this article at https://doi.org/10.30773/pi.2020.0337.
Acknowledgements
This work was supported by the Original Technology Research Program for Brain Science of the NRF, funded by the Korean government, MSIT (NRF-2017M3C7A1027467).
Notes
The authors have no potential conflicts of interest to disclose.
Authors’ contribution
Conceptualization: Miae Oh, Guiyoung Bong, Hee Jeong Yoo. Data curation: Guiyoung Bong, Joo-Hyun Kim. Formal analysis: Miae Oh, Da-Yea Song, Nan-He Yoon. Investigation: Da-Yea Song. Methodology: Guiyoung Bong. Project administration: Miae Oh, Guiyoung Bong. Resources: Guiyoung Bong, Jongmyeong Kim. Supervision: Hee Jeong Yoo. Validation: Miae Oh, Da-Yea Song, So Yoon Kim. Visualization: Da-Yea Song. Writing—original draft: Miae Oh, Da-Yea Song, Hee Jeong Yoo. Writing—review & editing: Miae Oh, Da-Yea Song, Hee Jeong Yoo.