Clinical Trial Designs for Biomarker Evaluation

The translation of clinical research to practice in the field of oncology has been slow despite a growing understanding of the genetic and molecular basis of this disease.¹ Although a myriad of factors contribute to this slow pace of progress, issues relating to clinical trial design play a significant role, including patient selection, choice of end points, choice of control arm, and assay-related issues.² Through a series of simulation studies, Stewart and colleagues³ studied the impact of subpopulation characteristics on overall study outcomes and concluded that a lack of molecular profiling can often lead to incomplete and incorrect conclusions. Consider the case of the Oncotype DX breast cancer assay that analyzes the expression of 21 genes to provide a recurrence score (RS) unique to each patient.^4-6

The RS provides information about the likelihood of cancer recurrence and the likelihood of chemotherapy benefit in women with early-stage, estrogen receptor–positive breast cancer. It has recently been demonstrated, based on a meta-analysis of decision impact data from 912 patients from 7 independent studies, that physicians who used Oncotype DX changed their treatment decisions for more than one-third of patients.⁷ Specifically, 33% of the overall population switched from the combination of chemotherapy and hormonal therapy to hormonal therapy alone based on a low RS, and 4% of the overall population switched from hormonal therapy alone to the combination of chemotherapy and hormonal therapy based on a high RS. From the individual patient perspective, the impact of this is dramatic – reduced chemotherapy use spares the patients the negative effect on health and quality of life from unnecessary chemotherapy. These changes also result in reduced costs to society and the healthcare system. From a trial design perspective, if a biomarker can identify a group of patients more (or less) likely to respond, it can fundamentally help with the efficient demonstration of a treatment benefit.

Prognostic marker validation can be established using the marker and outcome data from a cohort of uniformly treated patients with adequate follow-up.⁸ Designs for predictive marker validation, on the other hand, are inherently complex and involve the following key steps: 1) developmental study (phase 2 trial or previously conducted phase 3 trial with archived samples) with analysis focused on predicting response or treatment benefit (split-sample or cross-validation for assessing prediction accuracy); 2) data used to develop the marker or classifier should be distinct from the data used to test hypotheses about marker treatment effects; and 3) prospective randomized controlled trial (RCT) comparing new treatment to control using a biomarker-based trial design and/or prospective-retrospective analysis of biomarker outcome data from multiple previously conducted RCTs. The use of an RCT, as opposed to a cohort or single-arm study, is essential for initial as well as definitive predictive marker validation.

The RCT assures that the patients who are treated with the therapy for whom the marker is purported to be predictive are comparable to those who are not, as changes in patient population based on biologic subsetting and/or evolution in imaging technologies can make comparisons against historical controls inaccurate. In the absence of an RCT, it is difficult if not impossible to isolate any causal effect of the marker on therapeutic efficacy from the multitude of other factors that may influence the decision to treat or not to treat a patient. For example, in 1 paper examining the predictive utility of tumor microsatellite instability (MSI) for the efficacy of 5-fluorouracil–based chemotherapy in colon cancer, a cohort of nonrandomized patients was used in whom the median age of the treated patients was 13 years younger than that of the nontreated patients.⁹ This made it difficult to separate the predictive ability of tumor MSI from factors such as age that may have influenced the decision to treat or not to treat the patient with chemotherapy.⁹ RCTs are also essential for making the distinction between a prognostic and predictive marker and provide the opportunity to assess and validate multiple promising markers for a given disease simultaneously.

The term “biomarker” in oncology refers to a broad range of markers, including biochemical markers, cellular markers, cytokine markers, genetic markers, physiological results, radiological measurements, physical signs, and pathological assessment. Furthermore, a single marker can refer to a “single” trait or a composite score from a signature. Given the current landscape of targeted therapeutics in oncology, which impact multiple downstream pathways, the focus is shifting from targeting a single marker to a composite score or multiple markers. For a marker or markers to be useful in clinical practice, the assay results must be accurate and reproducible (analytically valid), and the status of the marker or markers be associated with the outcome of interest (clinically valid).

Finally, there must be a specific clinical question, proposed alteration in clinical management, and improved clinical outcomes (clinical utility). In this review article, we focus on trial designs for assessing and validating a single marker (or a composite score) for a targeted therapeutic (alone or in combination with chemotherapy) as well as designs for validating multiple markers for a single targeted therapeutic or a combination of multiple therapeutics. We will assume that the issues surrounding technical feasibility, assay performance metrics, and the logistics of specimen collection are resolved and that initial results demonstrate promise with regard to the predictive ability of the marker(s). Examples of real clinical trials, where available, will be used to illustrate the design concepts.

Designs for the Evaluation and Validation of a Single Biomarker

Retrospective Evaluation

As stated earlier, the term single biomarker can refer to either a “single” trait or a composite score from a signature. While the gold standard for any predictive marker validation continues to be a prospective RCT, retrospective validation may be acceptable in certain circumstances. The terms “retrospective” and “prospective” refer to both the data collection (using existing vs collecting new data) as well as the data analysis (prior to vs after seeing the data). While the literature is replete with examples of a “successful” initial assessment of the predictive utility of a marker, most of these results have not been replicated or validated in a prospective trial. A recent study found that the quality of the preclinical data that were utilized to perform clinical research was a major contributing factor to the high failure rate of oncology clinical trials.¹⁰ Notably, findings from only 6 of 53 landmark studies (11%) could be replicated in an independent study. The following guidelines need to be met for a valid prospective-retrospective validation of a marker¹¹:

Adequate amounts of archived tissue available from a large majority of the patients from a prospective RCT to avoid bias (ie, representative of the patients in the trial) and have adequate statistical power
Test is analytically validated for use with archived tissue
Prospective specification of the biomarker evaluation focusing on the evaluation of a single completely defined signature
Results from archived specimens are validated/replicated using specimens from multiple independent studies

An example of a marker that was successfully validated retrospectively is KRAS as predictive of efficacy of panitumumab and cetuximab in advanced colon cancer. This marker was first identified in single-arm trials after nontargeted phase 3 RCTs had been completed.^12-14 A prospective KRAS analysis plan was specified and implemented using the data from the multiple retrospective RCTs. The percentage of study populations for which KRAS status was assessed in these trials ranged from as low as 23% to as high as 92%. The results consistently demonstrated that the benefit from panitumumab and cetuximab is restricted to patients with wild-type KRAS status, with mutant KRAS patients deriving no clinical benefit.

An example of a well-conducted initial assessment of a predictive marker is epidermal growth factor receptor (EGFR) expression as a predictive tumor biomarker of survival benefit from the addition of cetuximab to first-line chemotherapy in patients with advanced non–small cell lung cancer (NSCLC).¹⁵ Samples were available from 99.6% of the patients enrolled in the FLEX trial, and the hypothesis was prospectively specified. Patients with high EGFR expression based on immunohistochemistry (≥200) had significantly longer overall survival (OS) with cetuximab plus chemotherapy, thus warranting further validation of the predictive utility of this marker.

Prospective Evaluation

Prospective designs for the evaluation (ie, initial validation) and the definitive validation of a single predictive marker (or a score from a marker panel) for a targeted agent (alone or in combination with chemotherapy) can be categorized as follows:

A) Enrichment Designs
B) All-Comers Designs, which can be further classified into:
- B1) Hybrid Designs
- B2) Marker by Treatment Interaction Designs
- B3) Marker-Based Strategy Designs
C) Adaptive Signature Designs

Trials can also utilize a combination of the above designs; for example, the in-development phase 3 marker validation trial Z41102 (A081105), a double-blind placebo-controlled trial of personalized adjuvant treatment in completely resected NSCLC with EGFR mutation (Figure 1). In this trial, completely resected NSCLC stage I-III (excluding N3 disease and T1aN0M0) patients with EGFR mutation (following an enrichment strategy) are randomized to erlotinib versus standard of care within group 1 (T2bN0, T3N0, T4N0; any T1N1, N2) and group 2 (T1aN0, T1bN0, T2aN0).

Figure 1. Z41102 Trial Design

Here, groups 1 and 2 can be considered as the “marker subgroups.” The primary comparison will be OS for patients receiving personalized adjuvant treatment with erlotinib versus standard of care. The null hypothesis of interest is that personalized adjuvant treatment with erlotinib is not superior to standard of care. The target sample size is set so that there will be at least 85% power to reject this null hypothesis if the truth is that personalized adjuvant treatment with erlotinib is superior to standard of care with a hazard ratio (HR) of at least 0.67 in favor of erlotinib (50% improvement, or 7.5 years vs 5.0 years in median OS). Secondary within-group comparisons will also be undertaken to determine if further studies are warranted within that group.
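The correspondence between the hazard ratio and the median OS values quoted above can be checked with a short calculation. The following is a minimal sketch, assuming exponential survival in both arms, 1:1 randomization, and a one-sided type I error of 0.025. The significance level, the allocation ratio, and the function names are assumptions made for illustration and are not taken from the Z41102 protocol; the event count comes from Schoenfeld's approximation for the log-rank test and should not be read as the trial's actual target.

```python
import math
from statistics import NormalDist


def hazard_ratio_from_medians(median_control, median_experimental):
    """HR (experimental vs control), assuming exponential survival in both arms."""
    return median_control / median_experimental


def schoenfeld_events(hr, alpha_one_sided, power, alloc_fraction=0.5):
    """Approximate number of events needed by a log-rank test (Schoenfeld formula)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha_one_sided)
    z_beta = NormalDist().inv_cdf(power)
    p = alloc_fraction  # fraction of patients randomized to the experimental arm
    return math.ceil((z_alpha + z_beta) ** 2 / (p * (1 - p) * math.log(hr) ** 2))


# Medians quoted in the text: 5.0 years (standard of care) vs 7.5 years (erlotinib)
hr = hazard_ratio_from_medians(5.0, 7.5)
# alpha = 0.025 one-sided is an assumption; 85% power is from the text
events = schoenfeld_events(hr, alpha_one_sided=0.025, power=0.85)
print(f"HR = {hr:.2f}, approximate events needed = {events}")
```

Under these assumptions the medians reproduce the HR of 0.67; the required number of events would change with a different significance level or allocation ratio.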

Enrichment Designs

An enrichment design screens patients for the presence or absence of a biomarker profile, and then only includes patients who either have or do not have the profile in the clinical trial.¹⁶ The goal of these designs is to understand the safety, tolerability, and clinical benefit of the treatment within the patient subgroup determined by a specific marker status. This design is based on the paradigm that not all patients will benefit from the study treatment under consideration, but rather that the benefit will be restricted to a biomarker-defined subgroup of patients. N0923 is an example of an ongoing phase 2 trial following an enrichment design strategy for assessing the predictive utility of the presence of more than 1 neuroendocrine marker (synaptophysin, chromogranin, CD56) to NTX-010, a replication-competent picornavirus. This double-blind phase 2 study randomizes patients with extensive-stage small cell lung cancer with the presence of the marker (described above) to the experimental agent versus placebo after standard platinum-containing cytoreductive induction chemotherapy.

An example of a phase 3 trial that utilized an enrichment design strategy for definitive marker validation is N9831 (and NSABP B-31), in which only patients who were positive for HER2 were enrolled (based on a local assessment).¹⁷ This trial demonstrated that trastuzumab combined with paclitaxel after doxorubicin and cyclophosphamide significantly improved disease-free survival among women with surgically removed HER2-positive breast cancer. However, subsequent analyses raised questions regarding the assay reproducibility based on local versus central testing for HER2 status.^18,19

Since only patients deemed HER2 positive based on the local assessment were enrolled, and tissue from patients deemed HER2 negative was not collected, the question of whether trastuzumab therapy benefits a potentially larger group than the approximately 20% of patients defined as HER2 positive in these 2 trials is the subject of an ongoing trial.²⁰ Clearly, this example reiterates that enrichment designs are to be used for marker validation if, and only if, there is compelling evidence to suggest benefit only in a marker-defined subgroup(s), and when the assay performance is well established with short turnaround times for marker assessment.

All-Comers Designs

In this design, all patients meeting the eligibility criteria, which do not include the biomarker status in question, are entered.¹⁶ The ability to provide adequate tissue may be an eligibility criterion for these designs, but not the specific biomarker result or the status of a biomarker characteristic. These designs are further categorized into hybrid designs, marker-based strategy designs, and marker by treatment interaction designs.

Hybrid Designs

In the case of a hybrid all-comers design, only a certain subgroup of patients based on their marker status are randomized between treatments, whereas patients in the other marker-defined subgroups are assigned the standard of care treatment(s).¹⁶ This design is an appropriate choice for validating a predictive marker when there is compelling evidence demonstrating the efficacy of a certain treatment(s) for a marker-defined subgroup, thereby making it unethical to randomize patients with that particular marker status to other treatment options. However, unlike the enrichment design strategy, all patients regardless of the marker status are enrolled and followed. This provides the possibility for future testing for other potential prognostic markers.

An example of a marker validation trial (to validate a composite score) that utilized the hybrid design strategy is the TAILORx (Trial Assigning Individualized Options for Treatment [Rx]) trial to validate Oncotype DX, a 21-gene RS in tamoxifen-treated breast cancer patients (Figure 2).⁴ A noninferiority design (null hypothesis of no difference) was utilized to determine whether patients with an RS between 11 and 25 derive benefit from adjuvant chemotherapy.⁴ A decrease in the 5-year disease-free survival rate from 90% with chemotherapy to 87% or lower on hormonal therapy alone would be considered unacceptable. A key aspect of this trial is that all patients will provide tissue samples for banking and future research.

Figure 2. TAILORx Trial Design
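The noninferiority threshold described for TAILORx (5-year disease-free survival of 90% vs 87%) can be re-expressed as a hazard ratio. The sketch below assumes proportional hazards, under which the survival curve of one arm equals that of the other raised to the power of the HR; the function name is hypothetical, and the result is an illustration of the arithmetic rather than a statement of the protocol's margin.

```python
import math


def hr_from_survival_rates(surv_reference, surv_comparator):
    """HR of comparator vs reference at a common time point.

    Assumes proportional hazards: S_comparator(t) = S_reference(t) ** HR.
    """
    return math.log(surv_comparator) / math.log(surv_reference)


# 5-year disease-free survival: 90% with chemotherapy vs 87% on hormonal therapy alone
margin_hr = hr_from_survival_rates(0.90, 0.87)
print(f"Implied hazard ratio at the unacceptable threshold: {margin_hr:.2f}")
```

A three-percentage-point drop at this survival level corresponds to roughly a one-third increase in the hazard, which illustrates why small absolute differences at high survival rates still require large trials.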

Marker-Based Strategy Designs

This design randomizes patients to have their treatment either based on or independent of the marker status.¹⁶ A disadvantage of this design is that it fundamentally includes patients treated with the same regimen on both the marker-based and the nonmarker-based arms, resulting in a significant overlap (driven by the prevalence of the marker) in the number of patients receiving the same treatment regimen in both arms. As a consequence, the overall detectable difference in outcomes between the 2 arms is reduced (depending on the marker prevalence), thus resulting in a comparatively larger trial.
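The dilution described above can be made concrete with a small calculation. This is a minimal sketch under a hypothetical setup: in the marker-based arm, marker-positive patients receive the new treatment and marker-negative patients receive the standard treatment, whereas everyone in the nonmarker-based arm receives the standard treatment. The response rates and the function name are invented for illustration.

```python
def strategy_design_difference(prevalence, rate_new_pos, rate_std_pos, rate_std_neg):
    """Expected response rates in the two arms of a marker-based strategy design.

    Hypothetical setup: marker-negative patients get the standard treatment in
    both arms, so they contribute nothing to the between-arm difference.
    """
    marker_based = prevalence * rate_new_pos + (1 - prevalence) * rate_std_neg
    non_marker_based = prevalence * rate_std_pos + (1 - prevalence) * rate_std_neg
    return marker_based, non_marker_based, marker_based - non_marker_based


# Assumed rates: 50% response on the new treatment in marker-positive patients,
# 30% response on the standard treatment regardless of marker status
for prevalence in (0.2, 0.5, 0.8):
    _, _, diff = strategy_design_difference(prevalence, 0.50, 0.30, 0.30)
    print(f"marker prevalence {prevalence:.0%}: between-arm difference {diff:.2f}")
```

In this setup the between-arm difference equals the marker prevalence multiplied by the treatment effect in marker-positive patients. Because the required sample size grows roughly with the inverse square of the detectable difference, a low-prevalence marker makes this design considerably larger than a direct comparison within the marker-positive subgroup.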

An example of a marker-based strategy design in the phase 2 setting is the comparison of adenosine triphosphate (ATP)-based tumor chemosensitivity assay (ATP-TCA)-directed chemotherapy versus clinician’s best choice of treatment in recurrent platinum-resistant ovarian cancer patients.²¹ The primary end point of the trial was comparison of response rates and progression-free survival (PFS) between the ATP-TCA arm and the clinician’s choice arm. A total of 180 patients were randomized. There were no significant differences in outcomes, although a trend toward improved response rates and PFS in the ATP-TCA arm was noted. A notable observation was that within the clinician’s choice arm, oncologists switched to the use of chemotherapy combinations similar to those in the ATP-TCA–directed arm over time (~70% overlap in the treatment choices between the arms).

As a consequence, patients randomized to the clinician’s choice arm after the first year had significantly better PFS compared with patients randomized to that arm within the first year. Two limitations of this design are: 1) significant overlap of patients receiving the same regimen in both arms (depending on the marker prevalence), thus diluting the detectable treatment effect (and lowering the power); and 2) true interaction between a treatment regimen and the ATP-TCA marker status cannot be assessed, as not all marker subgroups (as classified by the ATP-TCA score) receive all treatments in the nonmarker-based (ie, the clinician’s choice) arm. However, the latter can be overcome by implementing a second randomization in the clinician’s choice arm to the possible treatment regimens. Admittedly though, this will require a large number of randomized patients.

Marker by Treatment Interaction Designs

The marker by treatment interaction design uses the marker status as a stratification factor and randomizes patients to treatment choices within each marker-based subgroup.¹⁶ While this is similar to conducting 2 independent RCTs under 1 large RCT umbrella, it differs from a single large RCT in 2 essential characteristics: 1) only patients with a valid marker result are randomized, and 2) there is a prospective sample size specification for each marker-based subgroup.¹⁶ A separate evaluation of the treatment effect can be tested in the 2 marker-defined subgroups, or a test of interaction can be carried out first.
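A test of interaction asks whether the treatment effect differs between the marker-defined subgroups. The sketch below illustrates one simple version for a binary response end point, using a normal approximation to the difference between the two subgroup-specific risk differences. The counts are invented, the function names are hypothetical, and a real trial with a time-to-event end point would more likely fit a regression model with a marker by treatment interaction term.

```python
import math
from statistics import NormalDist


def risk_difference(resp_trt, n_trt, resp_ctl, n_ctl):
    """Treatment-minus-control difference in response rates and its variance."""
    p_t, p_c = resp_trt / n_trt, resp_ctl / n_ctl
    variance = p_t * (1 - p_t) / n_trt + p_c * (1 - p_c) / n_ctl
    return p_t - p_c, variance


def interaction_z_test(pos, neg):
    """Two-sided z-test that the treatment effect differs between subgroups.

    pos and neg are tuples (resp_trt, n_trt, resp_ctl, n_ctl) for the
    marker-positive and marker-negative subgroups, respectively.
    """
    d_pos, v_pos = risk_difference(*pos)
    d_neg, v_neg = risk_difference(*neg)
    z = (d_pos - d_neg) / math.sqrt(v_pos + v_neg)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return d_pos, d_neg, z, p_value


# Made-up counts for illustration only
d_pos, d_neg, z, p_value = interaction_z_test(
    pos=(60, 100, 30, 100),  # marker-positive: 60% vs 30% response
    neg=(32, 100, 30, 100),  # marker-negative: 32% vs 30% response
)
print(f"effect in marker+: {d_pos:.2f}, effect in marker-: {d_neg:.2f}")
print(f"interaction z = {z:.2f}, two-sided p = {p_value:.4f}")
```

Because the interaction contrast combines the variability of both subgroup estimates, it generally needs more patients than a within-subgroup comparison, which is one reason separate subgroup evaluations are sometimes prespecified instead.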

An example of the marker by treatment interaction design with separate evaluation within 2 marker-defined subgroups is the biomarker validation study (MARVEL: Marker Validation of Erlotinib in Lung Cancer) of second-line therapy in patients with advanced NSCLC randomized to receive pemetrexed or erlotinib (N0723) to validate the predictive utility of EGFR as a marker for erlotinib.

Adaptive Signature Designs

These are a class of sequential testing strategy designs that have a single primary hypothesis that is tested either in the overall population first and then in a prospectively planned subset (if the overall test is not significant), or in the marker-defined subgroup first and then in the entire population if the subgroup analysis is significant.¹⁶ The former approach is recommended in cases where the experimental treatment is hypothesized to be broadly effective, and the subset analysis is ancillary. The latter (also known as the closed testing procedure) is recommended when there are strong preliminary data to support that the treatment effect is strongest in the marker-defined subgroup, and that the marker has sufficient prevalence that the power for testing the treatment effect in the subgroup is adequate.

Sample size considerations for these strategies are largely driven by 3 statistical parameters: 1) α, the type I error or probability of a false-positive result; 2) β, the type II error or probability of a false-negative result; and 3) δ, the targeted difference or targeted effect size. These designs differ in the choice of the values for these statistical parameters, which is dictated by the inference framework of the design, and appropriately control for the type I error rates associated with multiple testing. A modification to this approach, taking into account potential correlation arising from testing the overall treatment effect and the treatment effect within the marker-defined subgroup, has also been proposed.²²
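The two testing orders described above can be written down as simple decision rules. The following is a minimal sketch: the split of an overall α of 0.05 into 0.04 for the overall test and 0.01 for the subgroup test is an assumed example of a Bonferroni-style allocation, not a value taken from any trial discussed here, and the function names are hypothetical.

```python
def overall_first_test(p_overall, p_subgroup, alpha_overall=0.04, alpha_subgroup=0.01):
    """Test the overall population first, then the prespecified subset.

    The two significance levels sum to the total type I error (0.05 here),
    which is an assumed split chosen for illustration.
    """
    if p_overall <= alpha_overall:
        return "benefit in overall population"
    if p_subgroup <= alpha_subgroup:
        return "benefit in marker-defined subgroup"
    return "no benefit demonstrated"


def subgroup_first_test(p_subgroup, p_overall, alpha=0.05):
    """Closed testing: the overall test is performed only if the subgroup test is significant."""
    if p_subgroup > alpha:
        return "no benefit demonstrated"
    if p_overall <= alpha:
        return "benefit in overall population"
    return "benefit in marker-defined subgroup"


print(overall_first_test(p_overall=0.07, p_subgroup=0.004))
print(subgroup_first_test(p_subgroup=0.01, p_overall=0.20))
```

In the overall-first rule the subgroup test is held to a stricter level because part of the type I error has already been spent; in the subgroup-first rule the full α can be reused at each step because the second test is reached only after the first is rejected.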

The adaptive signature designs (ASDs) are a class of sequential testing strategy designs used when the marker and the threshold are both unknown at the start of the trial.^23,24 The ASD allows for the “discovery and validation” process of the marker within the realm of the single phase 3 trial, using either a cross-validation approach or the split-alpha approach. In the ASD approach, the new treatment is compared with the control arm in all patients at a prespecified significance level. If this overall comparison is significant, then it is taken that the treatment is broadly effective.

However, if the overall comparison is not significant, a second-stage analysis is undertaken for the development and use of a biomarker signature, using a split-sample or a cross-validated approach.^23,24 In the cross-validated ASD, the algorithm for developing a predictive classifier is prospectively defined. At the end of the process, an indication classifier for future patients is obtained by applying the algorithm to the full set of patients treated in the clinical trial, whereas a conservative estimate of the treatment effect for future classifier-positive patients is obtained by employing a K-fold cross-validation procedure.

The effectiveness of the indication classifier, however, depends on the algorithm used and the data set (ie, the unknown truth about how treatment effect varies among patient subsets). Appropriate caution needs to be exercised against model overfitting to prevent classifiers from making poor predictions. The predictive ability of the classifier will be reflected in the cross-validated estimate of the treatment effect for classifier-positive patients. An exploratory analysis conducted on the full data set without any cross-validation is fraught with many issues; the cross-validation approach, by contrast, likely produces an unbiased estimate of the performance of a prespecified algorithm for developing a predictive classifier using the data set of the clinical trial itself.
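The K-fold procedure described above can be sketched in a few lines. This is a toy illustration under strong simplifying assumptions: a single continuous marker, a binary response end point, and a hypothetical prespecified algorithm that simply picks the cutpoint with the largest observed treatment effect. The simulated data, the candidate cutpoints, and all function names are invented; the published adaptive signature designs use more elaborate classifier-development algorithms.

```python
import random


def effect(patients):
    """Response-rate difference (new treatment minus control); None if an arm is empty."""
    trt = [p["response"] for p in patients if p["arm"] == 1]
    ctl = [p["response"] for p in patients if p["arm"] == 0]
    if not trt or not ctl:
        return None
    return sum(trt) / len(trt) - sum(ctl) / len(ctl)


def develop_classifier(train, cutpoints=(0.2, 0.4, 0.6, 0.8)):
    """Hypothetical prespecified algorithm: choose the cutpoint whose
    'marker >= cutpoint' subset shows the largest treatment effect."""
    best_cut, best_eff = cutpoints[0], float("-inf")
    for cut in cutpoints:
        eff = effect([p for p in train if p["marker"] >= cut])
        if eff is not None and eff > best_eff:
            best_cut, best_eff = cut, eff
    return best_cut


def cross_validated_effect(patients, k=5, seed=0):
    """K-fold estimate of the treatment effect in classifier-positive patients."""
    shuffled = patients[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    positives = []
    for i, held_out in enumerate(folds):
        # Develop the classifier without the held-out fold, then apply it to that fold
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        cut = develop_classifier(train)
        positives += [p for p in held_out if p["marker"] >= cut]
    return effect(positives), len(positives)


# Simulated trial for illustration only: benefit restricted to marker >= 0.6
rng = random.Random(1)
patients = []
for _ in range(400):
    marker, arm = rng.random(), rng.randint(0, 1)
    p_resp = 0.6 if (arm == 1 and marker >= 0.6) else 0.3
    patients.append({"marker": marker, "arm": arm, "response": int(rng.random() < p_resp)})

cv_effect, n_positive = cross_validated_effect(patients)
final_cut = develop_classifier(patients)  # indication classifier from the full data set
print("cross-validated effect in classifier-positive patients:", cv_effect)
print("classifier-positive patients:", n_positive, "| final cutpoint:", final_cut)
```

The key point the sketch tries to capture is that each patient is classified by a classifier developed without that patient's data, so the pooled estimate is not inflated by reusing the same data for development and evaluation.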

Designs for the Evaluation and Validation of Multiple Markers

In the previous section, we discussed possible trial designs for the case of a single marker (or a composite score derived from multiple genes) hypothesized to have predictive ability. However, in reality, there may be multiple markers that can predict a clinical outcome. For example, let us suppose that markers M1 and M2 jointly predict a clinical outcome. One straightforward approach to testing the value of the markers in this situation would be to perform a sequential testing for M1 and M2 and then eliminate/randomize patients based on the combined marker status.

A second option could be to use a composite score approach (as outlined in the examples in the previous section) developed using information from both M1 and M2 and then randomizing patients based on the composite score. Here, we discuss the use of designs outlined in the previous section for the multiple markers scenario. We also introduce the concept of adaptive trial designs, where multiple therapeutics/multiple markers can be evaluated in the same trial.

Hybrid Designs

The all-comers hybrid design strategy can be utilized to validate multiple markers. The MINDACT (Microarray in Node-Negative Disease May Avoid Chemotherapy) trial for node-negative breast cancer patients was designed to evaluate MammaPrint, the 70-gene expression profile discovered at the Netherlands Cancer Institute.²⁵ This trial utilized composite risk scores from 2 markers: clinicopathological factors as well as the 70-gene expression profile.

Another example of a hybrid design for validating multiple markers is ECOG 5202, in which patients are stratified by disease stage (IIA vs IIB), microsatellite stability (MSS; stable vs MSI, where MSI is further classified into MSI-Low, MSI-High), and 18q loss of heterozygosity (LOH). Patients deemed to be at a high risk (MSS/18q LOH or MSI-Low/18q LOH) for recurrence after surgery (estimated 5-year survival rate of 60%) are randomized to 1 of 2 treatment arms, whereas patients deemed to be at a low risk (MSS or MSI-Low with retention of 18q alleles) for recurrence after surgery (5-year survival rate estimate of 90%) will not receive any adjuvant therapy.¹⁶ One limitation of this design is that it does not allow for a determination of the benefit of bevacizumab in the low-risk strata; however, if the outcomes in the absence of treatment are as favorable as predicted in that group, no postsurgical therapy would generally be recommended.

Combination Designs

An example of a phase 3 trial utilizing an enrichment followed by a marker by treatment interaction design is the Tailor trial in second-line NSCLC (Figure 3).²⁶ The aim of this study is to validate the predictive value of the KRAS mutation, EGFR protein expression, and EGFR gene copy number, as well as smoking and histotype in patients who do not have EGFR mutations. The primary hypothesis, based on a 2-sided interaction test with 95% power, is that docetaxel is better than erlotinib in group A (30% improvement in OS, for an HR of 1.43 in favor of docetaxel), and erlotinib is better than docetaxel in group B (21% improvement in OS, for an HR of 0.79 in favor of erlotinib).

Figure 3. Combination Design Strategy (Tailor): Enrichment Followed by a Marker by Treatment Interaction Design

A limitation of this trial is that the secondary within-group comparisons are not adequately powered to detect clinically relevant differences in outcomes. Another example of a combination design of an enrichment strategy followed by a marker-based strategy design is trial 0601, coordinated by the Spanish Lung Cancer Group. This is also a phase 3 trial comparing erlotinib with chemotherapy in stage IV NSCLC patients with EGFR mutations (Figure 4).²⁷

Figure 4. Combination Design Strategy (0601): Enrichment Followed by a Marker-Based Strategy Design

Adaptive Designs

Adaptive design strategies are a class of randomized designs by which a variety of marker signatures and drugs can be tested under 1 umbrella protocol.^16,28 In these designs, the success of the drug-biomarker subgroup is assessed in an ongoing manner that allows either the randomization ratio to be altered to place more patients on the most promising arm(s) and/or the underperforming drugs and/or the biomarker subgroups to be eliminated midway through the trial. Key requirements for adaptive designs include: 1) a rapid and reliable end point, which can be somewhat challenging in the oncology setting where time to event end points or end points that involve following a patient’s status for a predetermined time period (such as the progression status at 2 years) are typically used; and 2) real-time access to all clinical and biologic data, which can be a daunting task in multicenter trials at the current time.

Examples of phase 2 trials that have utilized or are utilizing an adaptive design strategy are the I-SPY 2 (investigation of serial studies to predict therapeutic response with imaging and molecular analysis 2) and BATTLE (Biomarker-integrated Approaches of Targeted Therapy of Lung Cancer Elimination) trials.^29,30 I-SPY 2 is an ongoing neoadjuvant trial in breast cancer that is designed to compare the efficacy of standard therapy with the efficacy of novel drugs in combination with chemotherapy. All drugs will be evaluated within the biomarker-defined signature groups. Regimens that have a high predicted probability of being successful in a phase 3 trial are moved forward to phase 3 testing within subpopulations corresponding to the most promising biomarker signature(s). Regimens that have a low probability of efficacy for all biomarker signature subgroups will be dropped from further development.²⁹

The BATTLE trial used an outcome-based adaptive design for randomizing patients to treatment choices based on multiple biomarker profiles in NSCLC. Patients had their tumors tested for 11 different biomarkers, were subsequently categorized into 1 of 5 biomarker subgroups, and were then randomized to 1 of 4 treatment choices.³⁰ The first 97 patients were assigned equally using a balanced randomization to 1 of the 4 treatments. Subsequent patients were adaptively randomized, where the randomization rate was proportional to the ratio of the estimated 8-week disease control rates. The results from the BATTLE trial showed, as hypothesized, that each drug works best for patients with a specific molecular profile.³¹ Two successor trials, BATTLE 2 and BATTLE 3, are currently in development, both following an adaptive design strategy.
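The idea of making randomization probabilities proportional to estimated disease control rates can be illustrated as follows. This is a minimal sketch, not the statistical model used in BATTLE: the smoothed rate estimate (adding 1 success and 1 failure to each arm), the arm labels, and the interim counts are all assumptions made for illustration.

```python
import random


def adaptive_probabilities(controlled, treated, prior=(1, 1)):
    """Randomization probabilities proportional to estimated disease control rates.

    Each rate is smoothed by adding prior[0] successes and prior[1] failures,
    a simple stand-in (assumption) for a formal Bayesian model.
    """
    a, b = prior
    rates = {arm: (controlled[arm] + a) / (treated[arm] + a + b) for arm in treated}
    total = sum(rates.values())
    return {arm: rate / total for arm, rate in rates.items()}


def assign(probabilities, rng):
    """Draw the next patient's treatment arm according to the current probabilities."""
    arms = list(probabilities)
    return rng.choices(arms, weights=[probabilities[arm] for arm in arms], k=1)[0]


# Made-up interim data within one biomarker subgroup:
# patients with 8-week disease control / patients treated, per hypothetical arm
controlled = {"A": 6, "B": 2, "C": 4, "D": 1}
treated = {"A": 10, "B": 10, "C": 10, "D": 10}

probs = adaptive_probabilities(controlled, treated)
next_arm = assign(probs, random.Random(0))
for arm, prob in probs.items():
    print(f"arm {arm}: randomization probability {prob:.2f}")
print("next patient assigned to arm", next_arm)
```

As outcomes accumulate, the probabilities would be recomputed within each biomarker subgroup, so arms with better observed disease control receive a larger share of subsequent patients while poorer arms are still sampled occasionally.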

The integrated phase 2/3 designs (also known as the multiarm multistage [MAMS] designs) also fall under the class of adaptive design strategies, as they enable the simultaneous assessment of multiple markers/experimental agents against the standard of care in the phase 2 portion using an intermediate (or surrogate) end point.^32,33 The phase 3 portion subsequently continues with the promising marker subgroups/experimental arms from the phase 2 portion, comparing them with the standard of care. GOG-182 is an example of a cooperative group trial funded by the National Cancer Institute that utilized the MAMS design. This was a 5-arm trial in advanced-stage ovarian cancer or primary peritoneal carcinoma.³⁴ Finally, the adaptive signature designs, introduced earlier, can also be used to develop and validate multiple markers.²⁴

Summary

An optimal design can help to validate biomarkers designed to predict which patient is likely to benefit from a treatment and/or require intensive treatment. This will help improve the success rate of clinical drug development, bring down trial costs in terms of patients and resources, and prevent patients from being exposed to toxic treatments that may not benefit them. A well-designed prospective RCT based on biomarkers that are analytically and clinically valid is a key step in translating basic science to clinically useful markers that improve clinical practice. In addition, to fully realize clinical utility, it is imperative that there be 1) a strong biological basis for the target of therapy; 2) a clearly defined subgroup that will benefit from therapy, and a practical method that can identify them from the general patient population; and 3) further studies demonstrating that the biomarker results truly influence clinical decisions and are cost-effective.

Acknowledgments

Supported in part by the National Cancer Institute grants: Mayo Clinic Cancer Center (CA-15083) and the North Central Cancer Treatment Group (CA-25224).

Disclosures: DJS: Consultant for Genomic Health and Amgen; SJM: no potential conflicts.

References

Hutchinson L, Kirk R. High drug attrition rates – where are we going wrong? Nat Rev Clin Oncol. 2011;8:189-190.
Rubin EH, Gilliland DG. Drug development and clinical trials – the path to an approved cancer drug. Nat Rev Clin Oncol. 2012;9:215-222.
Stewart DJ, Whitney SN, Kurzrock R. Equipoise lost: ethics, costs, and the regulation of cancer clinical research. J Clin Oncol. 2010;28:2925-2935.
Sparano JA, Paik S. Development of the 21-gene assay and its application in clinical practice and clinical trials. J Clin Oncol. 2008;26:721-728.
Paik S, Tang G, Shak S, et al. Gene expression and benefit of chemotherapy in women with node-negative, estrogen receptor–positive breast cancer. J Clin Oncol. 2006;24:3726-3734.
Paik S, Shak S, Tang G, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med. 2004;351:2817-2826.
Hornberger J, Chien R. Meta-analysis of the decision impact of the 21- gene breast cancer Recurrence Score® in clinical practice. Presented at 33rd Annual San Antonio Breast Cancer Symposium; December 8-12, 2010; San Antonio, TX. Poster P2-09-06.
Mandrekar SJ, Sargent DJ. Genomic advances and their impact on clinical trial design. Genome Med. 2009;1:69.
Elsaleh H, Joseph D, Grieu F, et al. Association of tumour site and sex with survival benefit from adjuvant chemotherapy in colorectal cancer. Lancet. 2000;355:1745-1750.
Begley CG, Ellis LM. Drug development: raise standards for preclinical cancer research. Nature. 2012;483:531-533.
Simon RM, Paik S, Hayes DF. Use of archived specimens in evaluation of prognostic and predictive biomarkers. J Natl Cancer Inst. 2009;101:1446-1452.
Jonker DJ, O’Callaghan CJ, Karapetis CS, et al. Cetuximab for the treatment of colorectal cancer. N Engl J Med. 2007;357:2040-2048.
Karapetis CS, Khambata-Ford S, Jonker DJ, et al. K-ras mutations and benefit from cetuximab in advanced colorectal cancer. N Engl J Med. 2008;359:1757-1765.
Van Cutsem E, Köhne CH, Hitre E, et al. Cetuximab and chemotherapy as initial treatment for metastatic colorectal cancer. N Engl J Med. 2009;360:1408-1417.
Pirker R, Pereira JR, von Pawel J, et al. EGFR expression as a predictor of survival for first-line chemotherapy plus cetuximab in patients with advanced non-small-cell lung cancer: analysis of data from the phase 3 FLEX study. Lancet Oncol. 2012;13:33-42.
Mandrekar SJ, Sargent DJ. Clinical trial designs for predictive biomarker validation: theoretical considerations and practical challenges. J Clin Oncol. 2009;27:4027-4034.
Romond EH, Perez EA, Bryant J, et al. Trastuzumab plus adjuvant chemotherapy for operable HER2-positive breast cancer. N Engl J Med. 2005;353:1673-1684.
Perez EA, Suman VJ, Davidson NE, et al. HER2 testing by local, central, and reference laboratories in specimens from the North Central Cancer Treatment Group N9831 intergroup adjuvant trial. J Clin Oncol. 2006;24:3032-3038.
Paik S, Kim C, Wolmark N. HER2 status and benefit from adjuvant trastuzumab in breast cancer. N Engl J Med. 2008;358:1409-1411.
Hayes DF. Steady progress against HER2-positive breast cancer. N Engl J Med. 2011;365:1336-1338.
Cree IA, Kurbacher CM, Lamont A, et al; TCA Ovarian Cancer Trial Group. A prospective randomized controlled trial of tumour chemosensitivity assay directed chemotherapy versus physician’s choice in patients with recurrent platinum-resistant ovarian cancer. Anticancer Drugs. 2007;18:1093-1101.
Song Y, Chi GY. A method for testing a prespecified subgroup in clinical trials. Stat Med. 2007;26:3535-3549.
Freidlin B, Simon R. Adaptive signature design: an adaptive clinical trial design for generating and prospectively testing a gene expression signature for sensitive patients. Clin Cancer Res. 2005;11:7872-7878.
Freidlin B, Jiang W, Simon R. The cross-validated adaptive signature design. Clin Cancer Res. 2010;16:691-698.
Bogaerts J, Cardoso F, Buyse M, et al; TRANSBIG consortium. Gene signature evaluation as a prognostic tool: challenges in the design of the MINDACT trial. Nat Clin Pract Oncol. 2006;3:540-551.
Farina G, Longo F, Martelli O, et al. Rationale for treatment and study design of tailor: a randomized phase III trial of second-line erlotinib versus docetaxel in the treatment of patients affected by advanced non-small-cell lung cancer with the absence of epidermal growth factor receptor mutations. Clin Lung Cancer. 2011;12:138-141.
Rosell R, Taron M, Sanchez JJ, et al. Setting the benchmark for tailoring treatment with EGFR tyrosine kinase inhibitors. Future Oncol. 2007;3:277-283.
Mandrekar SJ, Sargent DJ. Design of clinical trials for biomarker research in oncology. Clin Investig (Lond). 2011;1:1629-1636.
Barker AD, Sigman CC, Kelloff GJ, et al. I-SPY 2: an adaptive breast cancer trial design in the setting of neoadjuvant chemotherapy. Clin Pharmacol Ther. 2009;86:97-100.
Zhou X, Liu S, Kim ES, et al. Bayesian adaptive design for targeted therapy development in lung cancer – a step toward personalized medicine. Clin Trials. 2008;5:181-193.
Kim ES, Herbst RS, Lee JJ, et al. The BATTLE trial (Biomarker-integrated Approaches of Targeted Therapy for Lung Cancer Elimination): personalizing therapy for lung cancer. Presented at the 101st Annual Meeting of the American Association of Cancer Research; April 17-21, 2010; Washington, DC. Abstract LB-1.
Hunsberger S, Zhao Y, Simon R. A comparison of phase II study strategies. Clin Cancer Res. 2009;15:5950-5955.
Parmar MK, Barthel FM, Sydes M, et al. Speeding up the evaluation of new agents in cancer. J Natl Cancer Inst. 2008;100:1204-1214.
Copeland LJ, Bookman M, Trimble E. Gynecologic Oncology Group Protocol GOG 182-ICON5. Clinical trials of newer regimens for treating ovarian cancer: the rationale for Gynecologic Oncology Group Protocol GOG 182-ICON5. Gynecol Oncol. 2003;90(2 Pt 2):S1-S7.
