A Meta-Analysis Comparing Open-Label versus Placebo-Controlled Clinical Trials for Aripiprazole Augmentation in the Treatment of Major Depressive Disorder: Lessons and Promises
Abstract
Objective
The present study aimed to examine whether open-label studies (OLSs) can adequately predict the efficacy findings of randomized, placebo-controlled trials (RCTs), using a meta-analysis of OLS and RCT data for aripiprazole in the treatment of major depressive disorder (MDD).
Methods
Studies were identified by searching PubMed and PsycINFO with the key terms "depression and aripiprazole" for the period from January 2005 through July 2013. Studies were selected according to rigorous inclusion criteria and verified as published in English-language peer-reviewed journals. Extracted data were entered into and analyzed with the Comprehensive Meta Analysis program v2.
Results
The pooled standardized mean difference (SMD) for the primary efficacy measure was statistically significant, indicating a significant reduction of depressive symptoms after aripiprazole augmentation (AA) of current antidepressant treatment in OLSs (pooled SMD=-2.114, z=-9.625, p<0.001); similar results were found in RCTs (pooled SMD=-2.202, z=-6.862, p<0.001). The meta-regression analysis revealed no influence of study design on treatment outcome.
Conclusion
The treatment effects of aripiprazole as an augmentation therapy did not differ between OLSs and RCTs, indicating that the open-label design may be a potentially useful predictor of treatment outcomes in controlled clinical trials. Properly conducted OLSs may provide informative preliminary clinical data and identify factors to be addressed in controlled clinical trials, leading to a better understanding of the role of AA in the treatment of MDD in clinical practice (e.g., dosing issues, proper duration of treatment, specific populations for AA).
INTRODUCTION
Major depressive disorder (MDD) is a common and debilitating illness that results in functional disability, decreased quality of life, and increased healthcare costs.1 MDD is also the third leading cause of moderate to severe disability and of disease burden worldwide.2 A large array of antidepressants from different classes is currently available; however, there has been controversy regarding class and individual differences in efficacy among antidepressants for the treatment of MDD,3,4 although differences may exist in benefits, acceptability, and acquisition cost.5,6 Despite the availability of antidepressants from different classes, only 30% of patients achieve symptomatic remission with their first antidepressant treatment, and those who do not remit suffer significant functional impairment.7,8 Such inadequate antidepressant efficacy results in significant residual symptoms, functional incapacity, increased utilization of medical services, and frequent recurrence and relapse.9,10,11
Most guidelines suggest that such non-responders or partial responders should be considered for a switch, combination, or augmentation of treatment.3,12,13,14 Among these strategies, augmentation is the use of a non-antidepressant agent to broaden or enhance the therapeutic effectiveness of an antidepressant by affecting different neurotransmitter systems, combining agents with different mechanisms of action and/or indications. Traditional augmentation agents such as lithium and triiodothyronine (T3), as well as buspirone, dopamine agonists, and stimulants, have commonly been used in this patient population despite limited evidence (e.g., weak efficacy, tolerability issues, a shortage of controlled clinical trials, and lack of official approval).15
Atypical antipsychotics such as olanzapine, quetiapine extended release (XR), and aripiprazole have demonstrated efficacy as augmentation agents for patients with MDD in a number of small-scale open-label studies and randomized, placebo-controlled clinical trials (RCTs). Among these atypical antipsychotics, aripiprazole was the first to be approved by the U.S. FDA as an augmentation therapy to antidepressants for treating MDD, in November 2007.
Industry-sponsored registration studies using RCTs have been the gold standard and cornerstone of modern research on medical therapies for demonstrating a medication's efficacy and safety. However, such rigorous designs usually exclude patients with substantial medical comorbidity and do not allow concomitant medications, both of which are commonly encountered in routine practice. Hence, RCTs are also criticized for lacking external validity, and it has been asserted that these efficacy studies do not provide sufficient information to clinicians in real-world settings.9 In addition, they have limitations such as high cost, loss of professional autonomy, inflation of rating scale scores prior to randomization, patient expectations of improvement, unexpected influence of sponsorship, lack of staff and training, observer bias among investigators, a longer time frame, and difficulties with randomization or recruitment. Hence, the usual and easier first step is an open-label, prospectively designed study to explore a new medication (or a new indication), to acquire supplemental information about a medication, and to facilitate exploratory and hypothesis-generating research.
However, specific information on how such open-label trials inform future RCTs, or address clinical issues that cannot be explained or investigated in RCTs, is still lacking. Hence, the objective of the present study was to examine whether open-label studies (OLSs) can adequately predict the efficacy findings of RCTs, using a meta-analysis of published OLSs and RCTs of aripiprazole augmentation (AA) in the treatment of MDD.
METHODS
Source of data
Studies were identified by searching PubMed and PsycINFO with the key terms "depression and aripiprazole" for the period from January 2005 through July 2013. Publication in English-language peer-reviewed journals was verified. We also searched the reference lists of identified articles and reviews to find additional studies. Abstracts identified during the literature search were screened independently by two review authors. Potentially eligible papers were read in full by two review authors to determine whether they met the eligibility criteria. Disagreements were discussed with a third review author until consensus was reached. Study selection was handled first by two of the authors (H.J. S and C. H.) and then independently reassessed by C.U. P.
Inclusion criteria
We included studies of AA in the treatment of MDD. To be included in the present meta-analysis, studies had to fulfill the following criteria: 1) a prospective design using efficacy measures commonly adopted in clinical trials, such as the Hamilton Depression Rating Scale (HAMD); 2) at least one follow-up visit; 3) a study duration of at least 4 weeks; 4) an OLS or RCT of aripiprazole; and 5) publication in an English-language peer-reviewed journal. No requirements or restrictions regarding date were imposed on the data search. However, post-hoc analyses of RCTs, redundant studies, case reports, and letters to the editor describing interesting or rare cases of aripiprazole use for treating MDD were not included. If necessary, the study authors were contacted for additional information. Figure 1 summarizes the disposition of the studies included in the present meta-analysis.
Data extraction
The following data were extracted: patient characteristics (e.g., age and gender), design (e.g., open-label, randomization, allocation concealment), duration of treatment (weeks), name of antidepressant, range of aripiprazole doses, sample size, type of primary outcome measure for efficacy, baseline and endpoint mean and standard deviation (SD) of the primary efficacy measure, and remission and response rates based on the criteria defined in each study. When these means and SDs were not available from the original study, missing values were replaced by the baseline ones, estimated manually from figures, or derived from standard errors, confidence intervals (CIs), or t-values. The primary efficacy measure in the included studies was the 17-item Hamilton Depression Rating Scale (HAMD-17), the Montgomery-Åsberg Depression Rating Scale (MADRS), or the Quick Inventory of Depressive Symptomatology-16 items (QIDS-16).
Data analysis
Primary efficacy measure
The primary efficacy measure was the mean change from baseline to endpoint in the total score on the HAMD-17, MADRS, or QIDS, which were the most frequently used rating scales in the OLSs and RCTs included in the present meta-analysis. These scales have been found to be highly and significantly correlated in a number of studies based on direct comparison of HAMD-17 and MADRS scores in a rather homogeneous sample.16
Safety and tolerability
Safety and tolerability measures were not compared between OLSs and RCTs because the reported measures were very limited and inconsistent.
Effect size
The effect size for the primary efficacy measure in each study was presented as a standardized mean difference (SMD) with a 95% CI, because the measures were continuous. Cohen's classification was used to evaluate the magnitude of the overall effect size in absolute value: 1) SMD=0.2 to 0.5, small; 2) SMD=0.5 to 0.8, medium; and 3) SMD>0.8, large. SMDs were calculated as follows: 1) (endpoint mean primary efficacy score minus baseline primary efficacy score) / pooled SD of the total treatment groups, or 2) (endpoint mean primary efficacy score of the active drug group minus baseline primary efficacy score of the placebo group) / pooled SD.
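The first SMD formula and Cohen's bands can be illustrated with a minimal sketch. The function names, the choice of pooling the baseline and endpoint SDs with equal sample sizes, the handling of band boundaries, and the example numbers are all assumptions made for illustration; they are not taken from the included studies or from the software used.

```python
import math


def pooled_sd(sd_a, n_a, sd_b, n_b):
    """Pooled SD of two sets of scores (here: baseline and endpoint)."""
    num = (n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2
    return math.sqrt(num / (n_a + n_b - 2))


def pre_post_smd(mean_base, sd_base, mean_end, sd_end, n):
    """(endpoint mean - baseline mean) / pooled SD for one treatment arm."""
    return (mean_end - mean_base) / pooled_sd(sd_base, n, sd_end, n)


def cohen_label(smd):
    """Cohen's bands applied to |SMD|; boundary handling is an assumption."""
    size = abs(smd)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


# Hypothetical arm: total score falls from 24 (SD 4) to 14 (SD 6), n = 30.
smd = pre_post_smd(24.0, 4.0, 14.0, 6.0, 30)
print(round(smd, 3), cohen_label(smd))
```

A negative SMD here means the score decreased from baseline, which is the direction of improvement on HAMD-17, MADRS, and QIDS.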
Statistical model
A random effects model was applied because it yields more balanced study weights than a fixed effects model (smaller studies receive relatively more weight and larger studies relatively less) and because it allows for sampling variability both within and between studies. In general, a random effects model is used to combine subgroups and yield the overall effect. The study-to-study variance (tau-squared) is assumed to be the same for all subgroups; this value is computed within subgroups and then pooled across subgroups.
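The weighting behavior described above can be sketched as follows. This assumes the DerSimonian-Laird moment estimator of tau-squared, which the text does not name; the function name and the three example studies are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Random effects pooling with a DerSimonian-Laird tau-squared (assumed)."""
    w = [1.0 / v for v in variances]              # fixed effects weights
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]  # random effects weights
    sw_re = sum(w_re)
    pooled = sum(wi * y for wi, y in zip(w_re, effects)) / sw_re
    return {
        "pooled": pooled,
        "se": math.sqrt(1.0 / sw_re),
        "tau2": tau2,
        "fixed_share": [wi / sw for wi in w],
        "random_share": [wi / sw_re for wi in w_re],
    }


# Hypothetical SMDs and variances; the third study is the least precise.
result = random_effects_pool([-1.0, -2.0, -3.0], [0.04, 0.09, 0.25])
print(result["fixed_share"], result["random_share"])
```

Comparing `fixed_share` with `random_share` shows the least precise study gaining relative weight once tau-squared is added to every study's variance.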
Heterogeneity and sensitivity analysis
Heterogeneity between studies was analyzed using the I² statistic, a measure of how much of the variance between studies can be attributed to true differences between studies rather than chance. An I² of 75-100% is usually regarded as considerable heterogeneity. To test the robustness of significant results, sensitivity analyses were conducted for studies with high versus low risk of bias. If statistical heterogeneity was present in a meta-analysis, subgroup and sensitivity analyses were also used to explore possible reasons for it, namely whether a single study had a large impact on, or exerted an underlying influence over, the overall estimate. For the sensitivity analysis within each group (OLS and RCT), the pooled estimate was repeatedly recalculated with one study omitted at a time.
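A minimal sketch of the heterogeneity statistic and the leave-one-out procedure is given below. It assumes the usual definition I² = (Q - df) / Q with df = k - 1 and a DerSimonian-Laird random effects pool; the helper names and example data are hypothetical.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the fixed effects mean."""
    w = [1.0 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    return sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))


def i_squared(q, k):
    """I-squared in percent, floored at 0, for k studies."""
    df = k - 1
    return max(0.0, 100.0 * (q - df) / q) if q > 0 else 0.0


def dl_pool(effects, variances):
    """Pooled effect under an assumed DerSimonian-Laird random effects model."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    c = sw - sum(wi ** 2 for wi in w) / sw
    q = cochran_q(effects, variances)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    return sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)


def leave_one_out(effects, variances):
    """Re-pool k times, omitting one study each time."""
    pooled = []
    for i in range(len(effects)):
        kept_e = effects[:i] + effects[i + 1:]
        kept_v = variances[:i] + variances[i + 1:]
        pooled.append(dl_pool(kept_e, kept_v))
    return pooled


# Hypothetical SMDs and variances for four studies.
effects = [-1.8, -2.4, -2.0, -2.6]
variances = [0.05, 0.10, 0.08, 0.20]
print(i_squared(cochran_q(effects, variances), len(effects)))
print(leave_one_out(effects, variances))
```

If the range of leave-one-out estimates stays narrow and keeps the same sign, no single study is driving the pooled result.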
Publication bias
The Egger test was used to detect publication bias, that is, the bias arising from the greater likelihood that positive studies are published than negative studies. We adopted Egger's linear regression method because it quantifies the bias captured by the funnel plot using the actual values of the effect sizes and their precision, whereas Begg and Mazumdar's test uses ranks.
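Egger's regression can be sketched in a few lines: the standardized effect (effect / SE) is regressed on precision (1 / SE), and the intercept is tested against zero with k - 2 degrees of freedom. The function name and data below are hypothetical, and the sketch stops at the t statistic because the standard library has no t distribution; the p-value would need a table or a statistics package.

```python
import math


def egger_test(effects, std_errors):
    """Egger's regression: standardized effect on precision, test intercept."""
    k = len(effects)
    x = [1.0 / se for se in std_errors]                  # precision
    z = [y / se for y, se in zip(effects, std_errors)]   # standardized effect
    x_bar = sum(x) / k
    z_bar = sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar
    sse = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    df = k - 2
    se_int = math.sqrt(sse / df * (1.0 / k + x_bar ** 2 / sxx))
    t = intercept / se_int if se_int > 0 else float("nan")
    return {"intercept": intercept, "slope": slope, "t": t, "df": df}


# Hypothetical SMDs and standard errors for five studies.
out = egger_test([-2.1, -1.9, -2.0, -2.2, -1.8], [0.1, 0.2, 0.3, 0.4, 0.5])
print(out)
```

An intercept close to zero means small and large studies point to the same effect; a large intercept signals funnel plot asymmetry.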
Meta-regression
Meta-regression was performed to test the effect of study design (OLS vs. RCT) as an independent parameter on the mean change in the primary efficacy measure. In this method, a weighted logistic regression of the 2k cases per study is usually fitted, where k is the number of study arms and the weight is the number of patients who have or do not have the outcome, respectively.17 Meta-regression is a sophisticated analytic approach that merges meta-analytic and linear regression principles. It aims to explore whether a linear relationship exists between an outcome measure and one or more covariates. The associations found in a meta-regression should be considered hypothesis generating and not regarded as proof of causality.17
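For a continuous outcome such as the SMD, a meta-regression on a binary design covariate reduces to a weighted least squares fit. The sketch below is one way to express that idea, not the procedure implemented in the software used here: the function name, the inverse-variance weights, the externally supplied tau-squared, and the data are all assumptions.

```python
import math


def meta_regression_binary(effects, variances, design, tau2=0.0):
    """Weighted least squares of effect on a 0/1 design indicator."""
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, design)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, effects)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, design))
    sxy = sum(wi * (x - x_bar) * (y - y_bar)
              for wi, x, y in zip(w, design, effects))
    slope = sxy / sxx                    # RCT minus OLS difference in SMD
    intercept = y_bar - slope * x_bar    # weighted mean SMD when design = 0
    se_slope = math.sqrt(1.0 / sxx)
    z = slope / se_slope
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal p-value
    return {"intercept": intercept, "slope": slope,
            "se": se_slope, "z": z, "p": p}


# Hypothetical data: design 0 = OLS, 1 = RCT.
fit = meta_regression_binary(
    effects=[-2.0, -2.2, -1.9, -2.3, -2.1],
    variances=[0.05, 0.08, 0.06, 0.02, 0.03],
    design=[0, 0, 0, 1, 1],
)
print(fit)
```

With a single binary covariate, the slope equals the difference between the two weighted subgroup means, so a non-significant slope corresponds to no detectable effect of study design.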
Software program for meta-analysis
All data extracted directly or computed from the included studies were entered into Comprehensive Meta Analysis version 2.0 (CMA v2, Englewood, NJ, USA) for data synthesis and analysis.
RESULTS
Demographics
With the search terms and conditions, we identified 721 articles in the PubMed and PsycINFO databases. Seven hundred and three papers were excluded due to ineligibility. The total number of subjects was 990, of whom 627 were from RCTs and 363 from OLSs. Among the subjects, 317 were male (197 from RCTs and 137 from OLSs, 33.7%). The mean numbers of subjects in OLSs18,19,20,21,22,23,24,25,26,27,28,29,30 and RCTs31,32,33,34,35 were 27 and 125.4, respectively. The mean ages in OLSs and RCTs were 48.0 (11.5) and 43.6 (11.0) years, respectively. The mean duration of trials was 9.5 and 7.2 weeks in OLSs and RCTs, respectively. Table 1 summarizes the studies included in the present meta-analysis.
Briefly, the clinical benefit of AA for treating patients with MDD has been shown in a number of early-phase, small-scale OLSs using various primary efficacy rating scales such as the HAMD, MADRS, and QIDS-16.18,19,20,21,22,23,24 In the small OLSs, the primary endpoint improvement varied across studies due to multiple factors (e.g., patient characteristics, AA dose, duration of treatment). For instance, in a recent 12-week OLS, the cumulative remission and response rates showed that approximately 60% of patients met criteria for remission and 80% met criteria for response at the end of treatment,28 while in a similar 12-week OLS the remission rate at endpoint was 41.3% and the response rate was 55.2%. There have been three identically designed initial-phase RCTs31,32,33 and two subsequent RCTs.34,36 In the three RCTs,31,32,33 patients with 1-3 historical failures of adequate antidepressant trials (total score ≥18 on the HAMD-17) were screened and then entered an 8-week prospective treatment phase. Incomplete responders were then randomized to treatment with either aripiprazole or placebo for 6 weeks. The primary efficacy endpoint was the mean change from baseline in the MADRS total score. In total, 1,092 prospectively identified partial responders were randomized, and 940 (86.4%) patients completed the three 6-week RCTs. In these three RCTs, significant improvements in the mean change of the MADRS total score were observed with AA (range=-8.5 to -10.1) over placebo (-5.8 to -6.4).31,32,33
OLSs
The pooled SMD for the primary efficacy measure was statistically significant, indicating a significant reduction of depressive symptoms after aripiprazole augmentation of current antidepressant treatment (pooled SMD=-2.114, z=-9.625, p<0.001) (Figure 2). The heterogeneity between OLSs was significant, indicating substantial variability in the magnitude of the treatment difference and in the underlying variance influencing the outcome (I²=80.1%, Q-value=60.3, p<0.001). As a sensitivity analysis, the pooled SMD was repeatedly recalculated with one study omitted at a time; the pooled SMD of the primary efficacy measure ranged from -2.210 to -1.898 (all 95% CIs indicated statistical significance, ranging from -2.710 to -1.552), suggesting that no single study strongly affected the pooled SMD. The Egger test was not statistically significant (t=2.114, p=0.055), indicating no statistically significant evidence of publication bias.
RCTs
The pooled SMD for the primary efficacy measure was statistically significant, indicating a significant reduction of depressive symptoms after aripiprazole augmentation of current antidepressant treatment (pooled SMD=-2.202, z=-6.862, p<0.001) (Figure 2). The heterogeneity between RCTs was significant, indicating substantial variability in the magnitude of the treatment difference and in the underlying variance influencing the outcome (I²=94.3%, Q-value=70.6, p<0.001). As a sensitivity analysis, the pooled SMD was repeatedly recalculated with one study omitted at a time; the pooled SMD of the primary efficacy measure ranged from -2.430 to -1.896 (all 95% CIs indicated statistical significance, ranging from -3.154 to -1.333), suggesting that no single study strongly affected the pooled SMD. The Egger test was not statistically significant (t=1.117, p=0.345), indicating no statistically significant evidence of publication bias.
Aripiprazole vs. placebo SMDs for RCTs
The pooled SMD for the primary efficacy measure was statistically significant, indicating a significant reduction of depressive symptoms with aripiprazole augmentation vs. placebo added to current antidepressant treatment (pooled SMD=-2.182, z=-3.135, p=0.002) (Figure 3). The heterogeneity among studies was also significant, indicating substantial variability in the magnitude of the treatment difference and in the underlying variance influencing the outcome (I²=98.8%, Q-value=341.7, p<0.001). As a sensitivity analysis, the pooled SMD was repeatedly recalculated with one study omitted at a time; the pooled SMD of the primary efficacy measure ranged from -2.673 to -1.655 (all 95% CIs indicated statistical significance, ranging from -4.025 to -0.255), suggesting that no single study strongly affected the pooled SMD. The Egger test was not statistically significant (t=0.416, p=0.705), indicating no statistically significant evidence of publication bias.
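As a consistency check, the reported heterogeneity percentages can be recovered from the reported Q-values using the standard definition of the I² statistic. The study counts are assumptions read off the citations (13 OLSs, references 18-30; 5 RCTs, references 31-35), and k=5 is likewise assumed for the aripiprazole vs. placebo comparison.

```latex
\begin{align*}
I^2 &= \frac{Q - (k - 1)}{Q} \times 100\% \\
\text{OLSs } (k = 13):\quad I^2 &= \frac{60.3 - 12}{60.3} \times 100\% \approx 80.1\% \\
\text{RCTs } (k = 5):\quad I^2 &= \frac{70.6 - 4}{70.6} \times 100\% \approx 94.3\% \\
\text{Aripiprazole vs. placebo } (k = 5):\quad I^2 &= \frac{341.7 - 4}{341.7} \times 100\% \approx 98.8\%
\end{align*}
```

Under these assumed study counts, all three values agree with the percentages reported above.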
Meta-regression
Meta-regression was performed to test the effect of study design (OLSs vs. RCTs) as an independent parameter on the mean change in the primary efficacy measure. There was no evidence of a statistical difference in the pooled SMDs for the primary efficacy measure between OLSs and RCTs (t=0.119, p=0.737), suggesting no substantial influence of design on the primary treatment outcome. However, the pooled SMD was numerically larger in magnitude in RCTs (-2.202, 95% CI=-2.831 to -1.573) than in OLSs (-2.114, 95% CI=-2.545 to -1.684). Given these results, the overall pooled SMDs of the two designs closely resembled each other, and the correlation between the pooled SMDs of the two designs was 0.714, indicating similar results across the two designs.
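The pooled SMDs and their 95% CIs can be cross-checked against the z values reported for each design. The sketch below assumes symmetric normal-theory intervals, so that the standard error is the CI width divided by twice the normal critical value; the function name is hypothetical.

```python
from statistics import NormalDist


def z_from_ci(estimate, lower, upper, level=0.95):
    """Recover the SE and z statistic from a symmetric normal-theory CI."""
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 1.96 for 95%
    se = (upper - lower) / (2.0 * crit)
    return se, estimate / se


# Pooled SMDs and 95% CIs as reported for the two designs.
se_ols, z_ols = z_from_ci(-2.114, -2.545, -1.684)
se_rct, z_rct = z_from_ci(-2.202, -2.831, -1.573)
print(round(z_ols, 2), round(z_rct, 2))
```

The recovered statistics come out close to the reported z=-9.625 for OLSs and z=-6.862 for RCTs, which suggests the intervals and test statistics are mutually consistent.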
DISCUSSION
We sought useful and informative data on the two study designs for AA in the treatment of MDD, OLS and RCT, to determine whether OLSs may have potential utility in guiding RCTs that test the effect of AA in MDD. According to our results, the pooled SMD for the primary efficacy measure was statistically significant in both study designs, showing a significant reduction of depressive symptoms after AA of current antidepressant treatment in OLSs and in RCTs. The effect sizes measured by SMDs were quite similar and adequately correlated between the OLS and RCT designs, indicating the practical utility of the OLS design as a step toward conducting RCTs in the treatment of MDD. When sensitivity analyses show that the overall result and conclusions are not affected by the different decisions that could be made during the review process, the results of the review can be regarded with a higher degree of certainty. Where sensitivity analyses identify particular decisions or missing information that greatly influence the findings of the review, greater resources can be deployed to try to resolve uncertainties and obtain extra information, possibly by contacting trial authors and obtaining individual patient data.37 If this cannot be achieved, the results must be interpreted with an appropriate degree of caution. Such findings may generate proposals for further investigations and future research.37 In addition, no publication bias was found in the present meta-analysis, supporting the validity of its results; where publication bias is present, the results cannot be relied upon, no matter how systematic and thorough the meta-analysis is in other respects.37 Our meta-regression analysis revealed no influence of study design on treatment outcome, suggesting that study design does not act as a predictor of the primary treatment outcome.17
Our results are in line with a previous meta-analysis that found similar treatment effects between OLSs and RCTs in youth with bipolar disorder, indicating that studies with an open-label design are useful predictors of the potential safety and efficacy of a given compound in the treatment of pediatric bipolar disorder; that study was the first meta-analysis to investigate this design issue in psychiatry.38 In that study, the pooled effect size was statistically significant in both OLSs (z=8.88, p<0.001) and RCTs (z=13.75, p<0.001), indicating a significant reduction in the Young Mania Rating Scale (YMRS) score from baseline to the end of treatment in both study designs. The meta-regression also confirmed that study design was not a significant predictor of mean change in the YMRS. Therefore, our meta-analysis replicated the previous findings of Biederman et al. that study design does not affect treatment outcome and that an OLS may be a substantial indicator leading to a subsequent RCT that fully addresses a medication's efficacy.
Recently, a number of meta-analytic approaches have emerged to investigate clinically critical and informative study design issues with implications for clinical practice. Examples include placebo response rates, which are a challenging obstacle to new treatment development in MDD (the relative efficacy of the active drug compared to placebo in clinical trials for MDD is highly heterogeneous across studies with different placebo response rates; the higher the placebo response, the poorer the performance of the active drug);39 the application of a prospective lead-in trial phase to assess antidepressant nonresponse (historical data only to define treatment resistance prior to patient enrollment);40 the impact of the number of follow-up assessments (increasing the number of follow-up visits, specifically after the third week rather than within the first 3 weeks of the trial, may be an effective approach to improve the likelihood of trial success);40 the impact of study duration on treatment outcome (4 weeks is the minimum adequate length of a trial); and the starting dose of SSRIs (the higher the starting dose, the higher the response). These results deliver useful and valuable information to clinicians about trial design as well as clinical practice.
The most important clinical implication of the OLS and RCT designs concerns generalizability and external validity, since patients encountered in clinical practice often do not mirror the populations enrolled in well-controlled and adequately powered industry-sponsored or independent clinical trials.41 This emphasis has yielded large effectiveness trials such as the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) and the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) trials, designed to inform clinicians about the relative strengths of existing but not fully investigated treatment approaches in the management of major psychiatric disorders. These studies aim to enroll typical community patients by having relatively lenient inclusion/exclusion criteria and concomitant medication restrictions, thus maximizing external validity. Hence, innovations in clinical trial methodology may stem from a handful of case reports and small-scale OLSs, which may serve as a useful transition to RCTs or more advanced controlled clinical trials.41 Accordingly, OLSs and RCTs each have their own merits and limitations as research methodologies, and their complementary roles should be considered in order to gain a better and clearer understanding of advanced and innovative treatment tactics and strategies.
The limitations of our results include the following: 1) the databases searched in the present meta-analysis were confined to PubMed and PsycINFO and to published journals, so we could not collect all available clinical trial data, although our results did not show any evidence of publication bias or of skew in the sensitivity analyses; 2) the clinical samples were adult populations and were predominantly female; 3) the mean duration was less than 10 weeks in both trial designs, so we cannot rule out different results in longer-term clinical trials; 4) different primary efficacy measures such as the HAMD, MADRS, and QIDS were included, although such rating scales have been found to be highly correlated with each other; and 5) safety and tolerability measures were not included due to the high variability of their measurement methods, so our results cannot be applied to such clinical issues.
In conclusion, we found that an OLS of AA may be a useful indicator for deciding whether to conduct a time-consuming, complex, and very expensive RCT in the treatment of MDD. Our results should be confirmed in other studies with different atypical antipsychotics for the treatment of MDD. Furthermore, the value of OLSs should be re-evaluated as one of the crucial steps in the development of new drugs or the acquisition of new indications.
Acknowledgments
This work was supported by a grant of the Korean Health Technology R&D Project, Ministry of Health & Welfare, Republic of Korea (HI12C0003).