(C) PLOS One

This story was originally published by PLOS One and is unaltered.

A systematic evaluation of deep learning methods for the prediction of drug synergy in cancer [1]

['Delora Baptista', 'Ceb - Centre Of Biological Engineering', 'University Of Minho', 'Braga', 'Labbels - Associate Laboratory', 'Guimarães', 'Pedro G. Ferreira', 'Department Of Computer Science', 'Faculty Of Sciences', 'University Of Porto']

Date: 2023-05

One of the main obstacles to the successful treatment of cancer is the phenomenon of drug resistance. A common strategy to overcome resistance is the use of combination therapies. However, the space of possibilities is huge and efficient search strategies are required. Machine learning (ML) can be a useful tool for the discovery of novel, clinically relevant anti-cancer drug combinations. In particular, deep learning (DL) has become a popular choice for modeling drug combination effects. Here, we set out to examine the impact of different methodological choices on the performance of multimodal DL-based drug synergy prediction methods, including the use of different input data types, preprocessing steps and model architectures. Focusing on the NCI ALMANAC dataset, we found that feature selection based on prior biological knowledge has a positive impact: limiting gene expression data to cancer or drug response-specific genes improved performance. Drug features appeared to be more predictive of drug response, with a 41% increase in coefficient of determination (R²) and 26% increase in Spearman correlation relative to a baseline model that used only cell line and drug identifiers. Molecular fingerprint-based drug representations performed slightly better than learned representations: ECFP4 fingerprints increased R² by 5.3% and Spearman correlation by 2.8% relative to the best learned representations. In general, fully connected feature-encoding subnetworks outperformed other architectures. DL outperformed other ML methods by more than 35% (R²) and 14% (Spearman). Additionally, an ensemble combining the top DL and ML models improved performance by about 6.5% (R²) and 4% (Spearman). Using a state-of-the-art interpretability method, we showed that DL models can learn to associate drug and cell line features with drug response in a biologically meaningful way. The strategies explored in this study will help to improve the development of computational methods for the rational design of effective drug combinations for cancer therapy.
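As an illustration of the two evaluation metrics quoted above, the following minimal sketch computes the coefficient of determination and the Spearman correlation in plain Python, together with a relative percentage increase. The function names and the toy numbers are hypothetical and are not taken from the study's code; the sketch also assumes that the reported percentages are relative increases over the comparison model, which the text does not state explicitly.

```python
from math import sqrt


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def average_ranks(values):
    """Rank values from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def relative_increase_pct(baseline, improved):
    """Relative change of a metric with respect to a baseline, in percent."""
    return 100.0 * (improved - baseline) / baseline


# Toy values (hypothetical, for illustration only).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
print(round(r_squared(observed, predicted), 3))
print(round(spearman(observed, predicted), 3))
```

R² penalizes errors in the predicted magnitudes, whereas Spearman correlation only measures whether the predictions rank the samples in the right order, which is presumably why both are reported side by side.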

Cancer therapies often fail because tumor cells become resistant to treatment. One way to overcome resistance is by treating patients with a combination of two or more drugs. Some combinations are more effective than would be expected from the effects of the individual drugs, a phenomenon called drug synergy. Computational drug synergy prediction methods can help to identify new, clinically relevant drug combinations. In this study, we developed several deep learning models for drug synergy prediction. We examined the effect of using different types of deep learning architectures, and different ways of representing drugs and cancer cell lines. We explored the use of biological prior knowledge to select relevant cell line features, and also tested data-driven feature reduction methods. We tested both precomputed drug features and deep learning methods that can directly learn features from raw representations of molecules. We also evaluated whether including genomic features, in addition to gene expression data, improves the predictive performance of the models. Through these experiments, we were able to identify strategies that will help guide the development of new deep learning models for drug synergy prediction in the future.

Funding: This study was supported by the Portuguese Foundation for Science and Technology (FCT), through a PhD scholarship (SFRH/BD/130913/2017 awarded to DB) and under the scope of the strategic funding of UIDB/04469/2020 unit. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Data Availability: The original drug response and RNA-Seq datasets used in this study are available from CellMinerCDB (https://discover.nci.nih.gov/rsconnect/cellminercdb/), and the mutation and copy number variation data are available from CBioPortal (https://www.cbioportal.org/). The preprocessed response dataset, the filtered gene expression, mutation and copy number variation files (before merging with the response dataset), and the fully preprocessed drug and gene expression data required to run the expr (DGI) + drugs(ECFP4) model described in the study can be obtained from Zenodo (https://doi.org/10.5281/zenodo.6545638). All of the code used in this study is available online at https://github.com/BioSystemsUM/drug_response_pipeline.

Copyright: © 2023 Baptista et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Introduction

The phenomenon of drug resistance is one of the greatest challenges in the fight against cancer. Although many tumors initially respond well to a given treatment, the efficacy of single-drug anti-cancer therapies is often diminished due to the existence of tumor drug resistance mechanisms. Resistance-conferring characteristics may already be present in the tumor cells prior to therapy, or they may arise as an adaptive response of the tumor to the treatment itself [1]. One of the main drivers of resistance is intratumoral heterogeneity. Genomic instability in cancer leads to the emergence of subpopulations of cells within a tumor with distinct characteristics and different sensitivity to drugs. Treatment may exert selective pressure on the cells and select subpopulations possessing characteristics that favor drug resistance, leading to future relapse [2].

Combining multiple treatments instead of administering a single drug can help to reduce drug resistance [3]. Drug combinations may circumvent pre-existing resistance mechanisms more easily and prevent the development of acquired resistance mechanisms [4]. In addition, certain combinations may be more effective than would be expected when taking into account the effects of each of the constituent compounds on their own, a phenomenon called drug synergy. Drug synergy can increase treatment efficacy without requiring an increase in drug dosage, potentially avoiding an increase in toxicity as well [5]. Synergistic interactions can arise through a variety of mechanisms: the compounds in a combination may have the same target, different targets belonging to the same pathway/biological process, or different targets belonging to related pathways (S1 Fig) [6]. Drug combinations may also produce effects that are greater than expected when the activity of one of the drugs enhances the transport, permeation, distribution or metabolism of the other drug [6]. Synergy is quantified based on the difference between the observed effect of a given drug combination and the expected combination effect determined using a reference model [7]. Combination effects are then usually summarized as a single synergy score per drug pair, taking all of the tested doses into consideration. Classifying combinations as synergistic based on synergy scores is not straightforward, however. Multiple drug combination reference models exist, each with different underlying assumptions, which can lead to contradictory results [7, 8]: a drug combination that would be considered synergistic according to one reference model might be classified as antagonistic according to another. A given reference model may lead to erroneous conclusions when the mechanism behind the drug interaction fails to adhere to its assumptions [7]. In addition, if synergy is considered a feature of each dose pair combination instead of a characteristic of the drug pair itself, different conclusions regarding synergy may be reached [8]. Differences in experimental design [7, 8] and the dose-response profiles of individual compounds [7] may also affect the determination of combination effects.
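To make the dependence on the reference model concrete, the sketch below scores the same toy measurements against two widely used reference models, Bliss independence and highest single agent (HSA). Neither model is named in the text above; they are chosen here purely as examples. Effects are assumed to be fractional inhibitions between 0 and 1, the per-pair summary is assumed to be a simple mean over dose pairs, and all numbers are hypothetical.

```python
def bliss_expected(effect_a, effect_b):
    """Expected combined effect if the two drugs act independently."""
    return effect_a + effect_b - effect_a * effect_b


def hsa_expected(effect_a, effect_b):
    """Expected combined effect equals that of the stronger single agent."""
    return max(effect_a, effect_b)


def excess(observed, effect_a, effect_b, reference):
    """Observed minus expected effect; positive values suggest synergy."""
    return observed - reference(effect_a, effect_b)


def summary_score(dose_grid, reference):
    """Mean excess over all tested dose pairs (one score per drug pair)."""
    scores = [excess(obs, ea, eb, reference) for ea, eb, obs in dose_grid]
    return sum(scores) / len(scores)


# Each tuple: (effect of drug A alone, effect of drug B alone, observed
# effect of the combination) at one dose pair. Hypothetical values.
dose_grid = [
    (0.5, 0.4, 0.60),
    (0.2, 0.1, 0.25),
    (0.7, 0.6, 0.80),
]

print(round(summary_score(dose_grid, bliss_expected), 3))
print(round(summary_score(dose_grid, hsa_expected), 3))
```

With these toy numbers the Bliss-based score is negative while the HSA-based score is positive, so the same measurements would be labeled antagonistic under one model and synergistic under the other.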

Novel effective anti-cancer drug combinations can be discovered using high-throughput cell viability assays. In these assays, a large number of candidate drug combinations are screened at different concentrations across a panel of cancer cell lines and the cellular response to the drug is measured. In recent years, several datasets from large-scale drug screening initiatives have been made publicly available [9–11]. The largest of these is the National Cancer Institute (NCI) A Large Matrix of Anti-Neoplastic Agent Combinations (ALMANAC) project [10], which screened over 5,000 pairs of FDA-approved drugs against the National Cancer Institute 60 Human Cancer Cell Line Screen (NCI-60), a panel of 60 tumor cell lines that have been extensively characterized at the molecular level [12]. The project uncovered several synergistic drug pairs, including two clinically novel combinations that are currently being evaluated in phase I clinical trials [10].

Despite the existence of these high-throughput technologies, screening all conceivable drug combinations is still infeasible, for both practical and financial reasons [11, 13]. Computational methods could greatly reduce the search space, thus minimizing the experimental effort required to find truly effective anti-cancer drug combinations. Biological network analysis-based approaches can be used to prioritize drug combinations and study the underlying mechanisms of joint action [14–16]. Another alternative is to use ML methods to model the response of cells to drug combinations. ML can be used to learn functional mappings between very high-dimensional input data and a score that reflects drug combination effects. This makes it a powerful approach to develop models that are able to predict drug synergy based on drug combination screening experiments and other relevant data. Several ML models for drug synergy prediction have been described in the literature [11, 17–21]. Many of these studies used tree-based ML methods, such as random forests (RFs) [17, 18, 20] or gradient boosting [18, 19, 21]. ML approaches for drug synergy prediction are usually developed using large-scale, publicly available drug combination screening data and omics datasets characterizing the screened cancer cell lines. Some of these resources are listed in Table 1, and we refer readers to other articles [22, 23] for information on additional resources.

Table 1. Publicly available high-throughput drug combination screening datasets and large-scale cancer cell line genomics and transcriptomics datasets that can be used to develop drug synergy prediction models. The datasets that were used in this work are highlighted in bold. https://doi.org/10.1371/journal.pcbi.1010200.t001

One particular subset of ML that has attracted great interest from researchers in this field is deep learning (DL). DL models are composed of multiple processing layers [31], which gives them the ability to learn complex, non-linear functions. Furthermore, unlike most traditional ML methods, DL approaches typically do not require extensive feature selection before training, since they have the ability to learn higher-order representations directly from raw input data [32]. Since DL models can handle large amounts of high-dimensional and noisy data, they are good candidates for the development of drug synergy prediction models.

Preuer et al. [33] presented DeepSynergy, a feedforward, fully-connected deep neural network that uses chemical features and gene expression data to predict drug synergy. Xia et al. developed a multimodal DL model to predict the growth inhibition of cell lines from the NCI ALMANAC project [34]. This model includes separate feature-encoding subnetworks for each input data type (drug descriptors, gene expression, microRNA and proteomics data) and a cell line growth prediction network. Several other DL-based drug synergy prediction models have since been reported in the literature. Similar to the model proposed in 2018 by Xia et al., many of these more recent models adopt a multimodal architecture [35–38].
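The multimodal layout described above can be sketched as a forward pass in plain Python: each input type is first encoded by its own fully connected subnetwork, the encodings are concatenated, and a prediction network maps the merged vector to a single response value. This is an untrained illustration with random weights, not the architecture of any of the cited models. The layer sizes, the function names, and the choice to share one drug encoder between the two drugs are all assumptions.

```python
import random


def dense(inputs, weights, biases, relu=True):
    """Fully connected layer: one output unit per row of `weights`."""
    outputs = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(max(0.0, z) if relu else z)
    return outputs


def init_layer(rng, n_in, n_out):
    """Random weights and zero biases for a layer with n_out units."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_model(rng, n_expr, n_drug, n_hidden=4):
    """One encoder per input type plus a prediction network on top."""
    return {
        "expr_encoder": init_layer(rng, n_expr, n_hidden),
        # Assumption: both drugs in a pair pass through the same encoder.
        "drug_encoder": init_layer(rng, n_drug, n_hidden),
        "head_hidden": init_layer(rng, 3 * n_hidden, n_hidden),
        "head_out": init_layer(rng, n_hidden, 1),
    }


def predict(model, expr, drug_a, drug_b):
    """Encode each input separately, concatenate, then predict one value."""
    h_expr = dense(expr, *model["expr_encoder"])
    h_a = dense(drug_a, *model["drug_encoder"])
    h_b = dense(drug_b, *model["drug_encoder"])
    merged = h_expr + h_a + h_b  # list concatenation of the three encodings
    hidden = dense(merged, *model["head_hidden"])
    return dense(hidden, *model["head_out"], relu=False)[0]


# Hypothetical inputs: 5 expression values and 6-bit drug fingerprints.
model = build_model(random.Random(0), n_expr=5, n_drug=6)
expression = [0.1, 0.4, 0.0, 0.9, 0.3]
fingerprint_a = [1, 0, 1, 0, 0, 1]
fingerprint_b = [0, 1, 0, 0, 1, 0]
print(predict(model, expression, fingerprint_a, fingerprint_b))
```

Keeping the encoders separate means that each input type can be swapped for a different representation or subnetwork type without touching the rest of the model.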

Beyond fully connected models [33, 34, 36, 37, 39], other innovative architectures have been proposed. Zhang et al. [40] developed a sparsely-connected deep belief network constrained by biological prior knowledge. The recent architecture of the TranSynergy model [41] includes a transformer [42] component, as well as fully connected layers. A method called REpresentation of Features as Images with NEighborhood Dependencies (REFINED) was developed to transform drug descriptors into images, so that convolutional neural networks (CNNs) could be used to model drug synergy instead of the typical fully connected networks [43]. Another study used graph neural networks (GNNs) for drug-specific subnetworks to learn drug representations directly from the compound structures in an end-to-end manner [38]. Several recent studies have used GNNs trained on graphs containing information on interactions between the drugs in a combination, between drugs and their targets, and/or interactions between genes or proteins in the cell lines [44–47].

Most drug synergy prediction models use drug features, gene expression features, or a combination of both. Other models include additional cell line information, such as genetic data (somatic mutations and/or copy number variations (CNVs)) [35, 40] or proteomics data [34, 39]. Drug target-specific features have also been included [37, 40, 41]. Since adding more features increases the complexity of the models, assessing which types of input data are more informative and predictive of drug synergy is essential.

Precomputed molecular descriptors or fingerprints are used as chemical features to represent the drugs, as an alternative to the use of end-to-end DL methods to learn the relevant compound features directly from the compound structures. Given that the screening datasets that are currently available only contain a very limited number of compounds, it is still unclear whether there is any benefit in using learned representations instead of traditional fingerprints and descriptors. A recent study benchmarked several compound representations on a large drug synergy dataset and found that DL-based representations were able to outperform traditional fingerprints [21]. However, the authors also noted that the difference between the top performing DL-based methods and the best fingerprints was not substantial and that other concerns, such as interpretability, may be more important.

Feature reduction is often applied to the cell line omics data, either by using specific gene lists [38, 40, 41], or by employing unsupervised data dimensionality reduction techniques, such as principal component analysis (PCA) [39, 48] or autoencoders [35, 39]. Using predefined gene lists to select features provides greater control over the selection process and might make the models easier to interpret biologically. However, certain approaches, such as limiting the gene features to known drug targets present in the training set, may limit the generalization of the models. Data-driven approaches avoid this problem.
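Selection by a predefined gene list amounts to keeping only the named columns of the expression matrix, as in the minimal sketch below. The nested-dictionary data layout, the function name, the gene list, and the expression values are hypothetical and serve only to illustrate the idea.

```python
def select_genes(expression, gene_list):
    """Keep only listed genes that were measured in every cell line.

    `expression` maps cell line name -> {gene symbol: expression value}.
    Returns the retained genes (in list order) and the reduced profiles.
    """
    profiles = list(expression.values())
    kept = [gene for gene in gene_list if all(gene in p for p in profiles)]
    reduced = {
        cell_line: {gene: profile[gene] for gene in kept}
        for cell_line, profile in expression.items()
    }
    return kept, reduced


# Toy expression values (hypothetical) for two cell lines.
expression = {
    "MCF7": {"TP53": 5.1, "EGFR": 2.3, "GAPDH": 11.0},
    "A549": {"TP53": 4.2, "EGFR": 7.8, "GAPDH": 10.5},
}
# Hypothetical list of genes of interest; BRCA1 is absent from the data.
gene_list = ["TP53", "EGFR", "BRCA1"]

kept, reduced = select_genes(expression, gene_list)
print(kept)
print(reduced["MCF7"])
```

Because every retained feature is a named gene, the inputs of a model trained on the reduced matrix remain directly interpretable, which is not the case for components produced by PCA or an autoencoder.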

Another advantage of data-driven dimensionality reduction techniques is the capacity to be trained using much larger datasets with data from more cell lines than those used in the screening datasets [39], or even patient data [35]. Nevertheless, a limitation of this approach is the difficulty in interpreting the results. Therefore, evaluating which feature reduction methods are capable of achieving satisfactory performance, as well as simultaneously facilitating model interpretability, is an essential step when designing drug synergy prediction models.

The impact of different methodological aspects on the drug synergy prediction models is still unclear and a systematic evaluation is missing. In this work, we set out to investigate the impact of different methodological variables on the performance of drug synergy prediction DL methods, using the ALMANAC drug combination screening dataset. Namely, we evaluated the impact of different preprocessing steps, types of input data, and DL architectures on the final performance of the methods. Prior biological knowledge was used to select cell line features and to facilitate model interpretation.

Interpretability is an important requirement of biomedical predictive systems. We further explored recent methodologies to determine the importance of features and the interpretability of the prediction mechanisms.

We were able to identify the types of input data that are more predictive of drug response, as well as the feature selection and data representation methods that produce the best results. We also found that combining different models improves performance. Additionally, we demonstrate that the decisions made by the DL models are driven by biologically meaningful features.

The remainder of this article is structured as follows: the Results section briefly summarizes the different models and methodological choices that were tested in this work, and reports the results of these tests. It also includes a subsection focusing on model interpretability. In the Discussion section, the main findings of this study, as well as its limitations, are discussed. The research methodology is described in detail in the Materials and Methods section at the end of the article.

[END]
---
[1] Url: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010200

Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.
