(C) PLOS One
This story was originally published by PLOS One and is unaltered.
. . . . . . . . . .



Antigen discovery by bioinformatics analysis and peptide microarray for the diagnosis of cystic echinococcosis [1]

['Gherard Batisti Biffignandi', 'Department Of Clinical-Surgical', 'Diagnostic', 'Pediatric Sciences', 'University Of Pavia', 'Pavia', 'Department Of Biology', 'Biotechnology', 'L.Spallanzani', 'Ambra Vola']

Date: 2023-05

Cystic echinococcosis (CE) is a complex, neglected parasitic zoonosis. Tools and expertise in the diagnosis of CE are not widely available and the development of accurate and easily implementable diagnostic assays for CE has been highlighted as a priority also by the WHO. Antigen discovery for the immunodiagnosis of CE has relied so far on classical methods, which are impractical and poorly sensitive. Here, starting from the Echinococcus granulosus sensu stricto genome, we aimed to identify antigenic peptides having a potential for CE diagnosis based on a multiple-criteria strategy involving the bioinformatics selection of proteins, followed by screening by peptide microarray and validation by ELISA, using a panel of well-characterized sera from patients with hepatic CE in all development stages, and from clinically relevant controls. This methodology has virtually never been applied in the field of CE. Importantly, the developed database can be used to identify other E. granulosus s.s. antigenic candidates.

The development of new laboratory-based, easy-to-use, and robust tests with improved performance for the diagnosis and follow-up of CE is therefore highly needed. Previous works suggested that patients with CE might have both infection-specific and stage-specific serology profiles [ 10 – 12 ], indicating that the development of such tests is possible. So far, 2D gel electrophoresis of cyst fluid followed by immunoblot with sera from patients with CE has been used to identify immunogenic proteins [ 10 ]. However, this method requires the collection of echinococcal cyst material from slaughtered animals or invasively from humans, which is impractical and prevents the use of material from all cyst stages, since not all cyst stages require treatment with invasive procedures. Furthermore, this approach has low sensitivity and thus might not be able to identify weakly expressed and/or stage-specific antigens. Available data on the proteomic and transcriptomic profiles of E. granulosus s.l. [ 13 ], together with the publication of its genome [ 12 ], now allow different antigen discovery approaches, such as studies based on bioinformatics, which could overcome the need for the availability of parasitic material. Antigen identification through protein/peptide microarrays challenged with patients’ sera is an innovative technology, which has greatly expanded in the last decade, also in the field of parasitic diseases, but is almost unexplored for CE [ 14 – 16 ]. In this study, we aimed to identify antigens of E. granulosus sensu stricto (s.s.) with potentially high diagnostic accuracy and stage-specific expression, through a first in silico bioinformatics selection, followed by reactivity screening by peptide microarray and ELISA validation using sera from well-characterized patients with hepatic CE in different stages and clinically relevant controls with non-parasitic focal liver lesions.

This heterogeneity of presentations makes the differential diagnosis of CE cysts particularly broad, and thus difficult to make outside specialized centers. At present, serology has a complementary role, supporting diagnosis when imaging is inconclusive, but seroassays are not standardized, have suboptimal specificity, and lack sensitivity especially for cysts in CE1 and CE4-CE5 stages and in extra-hepatic localizations [ 6 – 8 ]. Furthermore, serology is currently not useful for the follow-up of CE since it does not correlate with stage/viability of the cysts [ 9 ], which is pivotal information to choose the clinical management approach and evaluate the cyst’s evolution over time. Consequently, patients have to undergo years-long follow-up by imaging to evaluate the progression of the cyst and long-term response to treatment, with attendant costs and the frequent need to travel to referral centers, which is often unfeasible.

The diagnosis of CE currently relies on imaging, especially ultrasound for the abdominal localizations, but instruments and specific expertise for a correct diagnosis are not widespread. Furthermore, CE cysts pass through different stages, named CE1 to CE5 according to the WHO Informal Working Group on Echinococcosis [IWGE] classification [ 4 ]. CE1, CE2, and CE3b stages are active (biologically viable), CE3a is transitional (a proportion biologically viable and a proportion not viable), and CE4-CE5 stages are inactive (with low biological viability or not viable). Each stage has peculiar imaging features [ 4 , 5 ]. Briefly, on ultrasonography, CE1 cysts are fluid-filled unilocular cysts with a double cyst wall; CE2 are characterized by the presence of fluid-filled daughter cysts inside the original cyst; CE3a present the parasitic layers of the cyst wall detached and floating in the fluid-filled cyst content; CE3b contain both daughter cysts and a solid component showing the features of the CE4 stage; CE4 cysts have a solid content inside which folded parasitic layers are visible as hypoechoic structures; and CE5 cysts have the features of CE4 cysts and evident peripheral calcifications.

Cystic echinococcosis (CE) is a parasitic zoonosis caused by infection with the larval stage of the tapeworm Echinococcus granulosus sensu lato (s.l.). The parasite’s life cycle develops between canids, mainly the domestic dog, as definitive hosts, and ungulates, mainly livestock such as sheep, as intermediate hosts. Humans can acquire infection by accidental ingestion of parasite eggs shed through infected dogs’ feces, and act as dead-end intermediate hosts. In humans, the parasite develops in the form of fluid-filled, expansive cysts, mainly localized in the liver, followed by lungs, but any organ and tissue can be affected [ 1 ]. CE is a chronic infection, the clinical manifestations of which are non-specific and range from asymptomatic infection to disabling and even fatal disease [ 1 ]. CE is especially prevalent in livestock-breeding communities worldwide, with the highest prevalence occurring in Central Asia, Western China, South America, East Africa, Eastern Europe and the Mediterranean. It is (under)estimated that over 1.2 million people are affected worldwide at any given time, and the World Health Organization (WHO) has recently renewed the focus on CE in its 2021–2030 roadmap for neglected tropical diseases [ 2 ]. Among the needed actions indicated for CE, the WHO highlights the improvement of diagnostics as critical. Robust and easy to apply laboratory assays for screening and diagnosis of human infection are lacking. Techniques currently employed for the diagnosis and staging of CE are not always easily implementable and require experience for their interpretation, resulting in misdiagnosis and/or inappropriate management [ 3 ], with attendant risks and costs.

The validation of the microarray results was done by a home-made ELISA. The initial set-up of the methodology was performed using a pool of sera obtained from patients with CE independently of the cysts status or previous therapy (positive pool) and a pool of sera obtained from subjects with non-parasitic hepatic cysts and healthy donors (negative pool). Different concentrations of antigens, blocking buffers, dilutions of pooled sera and of secondary antibody as well as different incubation times were tested. Briefly, the optimized protocol used in the validation analysis was as follows: 5 μg of peptide diluted in phosphate buffer (PBS, pH 7.4) (Euroclone, ECB4004L) were adsorbed onto Maxisorp microtiter 96-well plates (NUNC) at 4°C for 16–20 hours. Then, the plate was saturated to avoid non-specific binding for 1 hour at 37°C using 5% non-fat dry milk (Biosigma, 711160) in PBS-Tween20 0.05% (blocking buffer). After the blocking step, the plate was incubated for 3 hours with 100 μl of serum diluted 1:2 in blocking buffer. The specific peptide-antibody complexes were identified using a rabbit anti-human IgG conjugated to horseradish peroxidase (Merck Life Science, AP101P) diluted at 1:10,000 in blocking buffer for 1 hour at room temperature. Tetramethylbenzidine ultrasensitive (Merck, T4444) was used as a substrate for 30 minutes at room temperature. The reaction was stopped by adding 50 μl/well of 1M sulphuric acid and optical density (OD) was read by a plate reader (Biorad) at 450 nm. Washing steps were done with 300 μl/well of PBS containing 0.05% Tween20. Native Antigen B (AgB) [ 34 – 38 ] and an AgB peptide pool [ 36 ] were used as experimental positive controls. The OD read for the wells where no antigen was adsorbed (“peptide-/serum+/secondary antibody+”) was considered as the background. Technical negative controls were “peptide+/serum-/secondary antibody+” and “peptide-/serum-/secondary antibody+”. The result for each well was expressed as the OD reading minus the respective background.
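The background correction described above amounts to a per-serum subtraction. The following is a minimal sketch in Python, assuming that each serum has one peptide-coated well and one matching no-antigen well; the function name, the sample labels, and the OD values are hypothetical and are not study data.

```python
def background_corrected_od(sample_od, background_od):
    """Return the OD reading minus the OD of the matching no-antigen well."""
    return round(sample_od - background_od, 3)


# Hypothetical OD readings at 450 nm: (peptide-coated well, no-antigen well).
# These numbers are illustrative only and do not come from the study.
readings = {
    "serum_A": (0.812, 0.105),
    "serum_B": (0.230, 0.118),
}

corrected = {
    serum: background_corrected_od(od, background)
    for serum, (od, background) in readings.items()
}

for serum, value in corrected.items():
    print(serum, value)
```

Using the well with serum but without peptide as the background removes signal caused by non-specific binding of that particular serum to the plate.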

The selected proteins were then subjected to the second step of the pipeline, with the aim of identifying 20 amino acid (aa)-long peptides [ 27 – 29 ] with predicted B-cell antigenic properties. To accomplish this goal, three software tools were applied in parallel: Bepipred, SVMTrip, and LBtope. Bepipred [ 30 ] combines the predictions of a hidden Markov model and the hydrophilicity property of amino acids. This software has quite low sensitivity and high specificity, achieving 80% specificity at 30% sensitivity. SVMTrip [ 29 ], based on a Support Vector Machine (SVM) which combines the tri-peptide similarity assessed with the Blosum62 matrix and the propensity scores of the tri-peptide model, showed 80% sensitivity and 55% precision with five-fold cross-validation. LBtope [ 31 ] is also based on an SVM algorithm including characteristics like dipeptide composition and amino acid pair (AAP) profile. These profiles are used by the software to convert the protein sequence into numerical values, which are then fed to the SVM model to assess the epitope prediction [ 28 ]. Since Bepipred works on frames of different lengths, 20aa antigenic sequences were extracted from the output of this software using a Python script. All sequences defined as immunogenic were retained, using default cut-offs for Bepipred and SVMTrip, and scores above 95% motif and LBtope_confirm (confirm dataset, trained on epitopes verified at least by two studies) for LBtope.
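The text does not describe how the Python script extracted 20aa sequences from the Bepipred output. One possible way, shown here only as a sketch, is to slide a 20-residue window over per-residue scores and keep the windows in which every residue reaches a cut-off. The function name, the toy sequence, the scores, and the cut-off value are all assumptions made for illustration.

```python
def antigenic_windows(sequence, scores, threshold, length=20):
    """Return (start, peptide) for each window whose residues all score >= threshold.

    `scores` holds one per-residue epitope score per amino acid. This rule is an
    assumption for illustration, not the script used in the study.
    """
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")
    windows = []
    for start in range(len(sequence) - length + 1):
        window_scores = scores[start:start + length]
        if all(score >= threshold for score in window_scores):
            windows.append((start, sequence[start:start + length]))
    return windows


# Toy 25-residue sequence with hypothetical scores: the first 22 residues
# score above the cut-off, the last 3 below it.
toy_sequence = "MKTLLVAGSDEPTKNQRSGEAPLVW"
toy_scores = [0.5] * 22 + [0.1] * 3
hits = antigenic_windows(toy_sequence, toy_scores, threshold=0.35)

for start, peptide in hits:
    print(start, peptide)
```

Overlapping windows are all reported here; a real script might instead merge or thin them, which is a design choice the source leaves open.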

The identification of these proteins was accomplished through the first part of the pipeline, using in parallel the TMHMM2, SignalP4 and PredGPI software, widely used in previous studies with aims overlapping those of our research [ 14 , 24 , 25 ]. The results from each software were filtered as follows. The candidates detected by TMHMM2 were selected according to strong indicators of the presence of a transmembrane protein or signal peptide: the number of predicted transmembrane helices (predTMHs) was ≥ 3; the expected number of amino acids present within the transmembrane helices (ExpAAsTMHs) was > 18; the expected number of amino acids within transmembrane helices in the first 60 amino acids (First60AAs) was ≥ 10. SignalP4 results were filtered by selecting proteins with a D-score >0.8. The D-score is a parameter that combines both signal-peptide and cleavage site prediction networks. A score above the specified threshold (0.45) indicates the presence of a signal peptide. Finally, proteins screened with PredGPI were selected only if the score of the prediction was >99%. Gene function and subcellular localization of the candidate proteins were also checked, using the NCBI database and the DeepLoc algorithm [ 26 ] respectively, to provide additional clues on protein localization.
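The thresholds listed above can be written as simple predicates. The sketch below makes two assumptions that the text does not state explicitly: that the three TMHMM2 criteria must all hold together, and that a protein is retained when at least one of the three tools selects it. The function and parameter names are hypothetical and do not reproduce the output format of any of the tools.

```python
def passes_tmhmm(pred_tmhs, exp_aas_tmhs, first60_aas):
    """TMHMM2 filter: assumes the three criteria are combined with AND."""
    return pred_tmhs >= 3 and exp_aas_tmhs > 18 and first60_aas >= 10


def passes_signalp(d_score):
    """SignalP4 filter: D-score strictly above 0.8."""
    return d_score > 0.8


def passes_predgpi(score_percent):
    """PredGPI filter: prediction score strictly above 99%."""
    return score_percent > 99


def is_candidate(protein):
    """Retain a protein if any tool selects it (assumed union of the three sets)."""
    return (
        passes_tmhmm(protein["pred_tmhs"], protein["exp_aas_tmhs"], protein["first60_aas"])
        or passes_signalp(protein["d_score"])
        or passes_predgpi(protein["gpi_score_percent"])
    )


# Hypothetical record; the field names and values are illustrative only.
example = {
    "pred_tmhs": 4,
    "exp_aas_tmhs": 88.2,
    "first60_aas": 21.5,
    "d_score": 0.31,
    "gpi_score_percent": 12.0,
}
print(is_candidate(example))
```

Note that the 0.8 D-score cut-off is stricter than the 0.45 threshold mentioned in the text, so this filter keeps only high-confidence signal peptides.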

Three software tools were used in parallel: TMHMM2 [ 17 ] was applied for the prediction of transmembrane helices, SignalP4 [ 18 ] to identify putative secreted proteins, and PredGPI [ 19 ] to assess glycosylphosphatidylinositol (GPI)-anchored proteins. Accessibility to the immune system has been associated with proteins exposed on the cell surface or secreted/excreted [ 20 ], all of which have been shown to be involved in host-immunity interactions [ 21 ]. Proteins that contain a signal peptide, as well as transmembrane proteins, are accessible to the immune system and involved in several processes such as cell signaling and recognition [ 20 ]. Moreover, the GPI-anchored proteins face the extracellular environment and have different functions including surface antigens, host-pathogen immune modulation and signaling, due to their localization on the external surface of the cells [ 19 ]. Specifically for cestode parasites, membrane proteins have been shown to be promising diagnostic antigens for cysticercosis [ 22 ].

Two cohorts of CE patients and controls were included in the study: a “screening cohort” and a “validation cohort”. Demographic and clinical information of all potentially eligible patients and their samples were retrieved from clinical records. For CE patients, the etiological diagnosis of CE was based on the identification by ultrasound of pathognomonic features in liver cysts. CE cysts were staged according to the WHO-IWGE classification [ 4 ]. Patients were classified into five clinical groups based on cyst stage: CE1 (active, unilocular fluid-filled cysts), CE2/CE3b (active, with daughter cysts), CE3a (transitional, with detached parasitic layers), CE4/CE5 (inactive, solid content) having attained the inactive stage as the result of therapy in the last 5 years (“CE4/CE5 therapy in the last 5 years”), and CE4/CE5 having reached inactivation spontaneously (“CE4/CE5 no therapy”). Only patients with CE of the liver as the only cyst localization were included. Patients with >1 hepatic CE cyst were included only if the cysts in active stages were in the same stage; when both active and inactive cysts were present, the patient/sample was classified according to the stage of the active cyst(s). For the purpose of patient/sample classification, CE3a cysts were considered as active. The exclusion criteria were the presence of extra-hepatic cysts or the presence of >1 active hepatic cysts in different active stages. Controls were patients with non-parasitic focal liver lesions that could enter the differential diagnosis with CE (e.g. biliary cysts, neoplasms). The “screening cohort” and the “validation cohort” had the same inclusion/exclusion criteria.
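The inclusion and classification rules above can be summarized as a small decision function. This is a sketch under stated assumptions: "same stage" is read strictly, so a patient with one CE2 and one CE3b cyst is treated as excluded, and treatment history is reduced to a single yes/no flag. The function name and its parameters are hypothetical.

```python
ACTIVE = {"CE1", "CE2", "CE3a", "CE3b"}  # CE3a counted as active for classification
INACTIVE = {"CE4", "CE5"}
ACTIVE_GROUP = {"CE1": "CE1", "CE2": "CE2/CE3b", "CE3b": "CE2/CE3b", "CE3a": "CE3a"}


def classify_patient(liver_cyst_stages, has_extrahepatic_cyst, therapy_last_5_years):
    """Return the clinical group of a CE patient, or None if the patient is excluded.

    `therapy_last_5_years` is a simplification: True if inactivation followed
    therapy in the last 5 years, False if it was spontaneous.
    """
    stages = set(liver_cyst_stages)
    unknown = stages - ACTIVE - INACTIVE
    if unknown:
        raise ValueError(f"unknown cyst stage(s): {sorted(unknown)}")
    if has_extrahepatic_cyst or not stages:
        return None  # only liver-only CE is eligible
    active = stages & ACTIVE
    if len(active) > 1:
        return None  # active cysts in different stages (strict reading)
    if active:
        # Active cyst(s) determine the group even if inactive cysts coexist.
        return ACTIVE_GROUP[next(iter(active))]
    if therapy_last_5_years:
        return "CE4/CE5 therapy in the last 5 years"
    return "CE4/CE5 no therapy"


print(classify_patient(["CE1", "CE4"], False, False))
print(classify_patient(["CE5"], False, True))
```

Whether CE2 and CE3b cysts in the same patient should count as the same group is not stated in the text; relaxing the strict reading would only require comparing groups instead of stages.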

The performance of the antigenic peptides as predictors was tested on the dependent variable (presence of CE or presence of active CE cyst) using logistic regression analysis. Based on the sensitivity and specificity assessed through the regression model, followed by manual curation based on peptide annotations, we selected eight candidates: four that were the most promising for the diagnosis of CE infection, and four with potential for discriminating between active and inactive CE cysts. The annotations, as well as the values of sensitivity and specificity predicted for each peptide, are summarized in Table 2 .
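To make the two accuracy measures concrete, the sketch below computes sensitivity and specificity for a single peptide at a fixed reactivity cut-off. This is a simplification and not the logistic regression model used in the study; the function name, the reactivity values, and the cut-off are hypothetical.

```python
def sensitivity_specificity(values, has_ce, cutoff):
    """Return (sensitivity, specificity) when a value >= cutoff is called positive."""
    tp = fn = tn = fp = 0
    for value, positive in zip(values, has_ce):
        predicted_positive = value >= cutoff
        if positive and predicted_positive:
            tp += 1
        elif positive:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one CE sample and one control sample")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical reactivity values: three CE sera followed by three control sera.
reactivity = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1]
labels = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(reactivity, labels, cutoff=0.5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Raising the cut-off generally trades sensitivity for specificity, which is why each peptide is reported with a range of possible values rather than a single pair.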

The localization of these proteins was further evaluated using the DeepLoc software and all proteins were retained for the second step of the pipeline. The vast majority of these 950 proteins were predicted to be associated with the cell membrane (n = 409), endoplasmic reticulum (n = 215), and extracellular compartment (n = 181). The presence of proteins known for being associated with extracellular vesicles and involved in host-immune regulation [ 41 – 43 ], such as Calreticulin, Tetraspanin-7, Basement membrane-specific heparan sulfate, Peroxidasin, Antigen B, Neurogenic locus Notch protein, and Basigin, was also observed. Peptides found to be promising in the statistical analyses were then screened using BLAST v2.2.31 against the non-redundant (nr) database to avoid the inclusion of peptides showing high sequence identity with other parasites (e.g. Taenia solium) and thus possibly causing cross-reaction during the immunological tests.

4. Discussion

Hepatic CE is a neglected and complex disease. Thanks to its unique ability to depict pathognomonic signs of the parasitic cysts in each stage, ultrasonography is the mainstay of the etiological diagnosis of CE cysts, of cyst staging (that guides clinical management), and of follow-up [45]. However, ultrasound machines and expertise in interpreting CE-specific imaging features are not always available. Furthermore, the imaging-based active/inactive aspect of cysts does not perfectly correlate with cyst viability in biological terms. Indeed, CE3a cysts may be biologically viable or not viable [46] and a proportion of inactive CE4/5 cysts may actually be biologically alive as demonstrated by their possible reactivation over time [47–49]. This results in patients requiring years-long follow-up to assess the evolution of infection or the response to treatment over time. Serology has a complementary role since it may allow confirming the diagnosis of CE in case of doubtful imaging [50]. However, the application of serology in the diagnostic workup of CE is not standardized, seroassays are heterogeneous in format and performances, have inadequate diagnostic accuracy especially for CE1 and inactive cysts [6], and are not useful for follow-up since their results do not correlate with cyst’s evolution over time [9]. Therefore, the development of accurate, robust, and easy-to-use, ideally point-of-care, new laboratory tests is urgently needed, to support the etiological diagnosis of the lesions visualized on imaging and the follow-up of patients with CE through identification of the biological viability of the parasite, as well as the development of imaging-independent screening tools.

Currently available seroassays for CE are largely based on crude, variably purified antigens from CE cyst fluid (hydatid fluid) obtained from slaughtered animals, with attendant problems in supply and standardization of antigenic composition [8,9]. Several recombinant antigens, which could be produced in large quantity and in a standardized manner, have been investigated as potential substitutes for the hydatid fluid and its components [9]. However, until now, no recombinant antigen has been able to overcome current limitations of seroassays for CE, especially concerning sensitivity for early active (CE1) and inactive cysts [6]. Recombinant antigens investigated so far have been largely derived from immunogenic proteins known to be present in high concentration in the hydatid fluid, identified through classic protein-based analysis approaches [10,51]. The identification of E. granulosus s.l. antigen candidates through peptide microarray starting from the in-silico analysis of the parasite genome, the approach carried out in our study, has been implemented only in one previous study [14], which is also the only one so far published on Echinococcus spp. using this approach [16]. As recently reviewed [16], in the field of human parasitic infections, the microarray technology has been previously explored for the characterization of new diagnostic antigens especially for protozoan infections and, in particular, in silico prediction of B-cell epitopes has been mainly applied with positive results in the field of toxoplasmosis [52,53]. Concerning human helminth infections, the peptide microarray approach for antigen discovery with diagnostic scope has been used so far in a very limited number of studies targeting Schistosoma mansoni [54], Onchocerca volvulus [55], and, as mentioned, Echinococcus spp. [14]. In that study, however, List and colleagues [14] did not have access to a fully sequenced genome, and thus were only able to start their selection from about 1000 proteins. 
Additionally, they did not specifically prioritize excreted/secreted proteins, which are known for being potential diagnostic markers thanks to their accessibility to the immune system [56,57]. Finally, only patients with CE cysts in two stages were investigated and healthy blood donors as well as samples from patients with parasitic infections not causing lesions requiring differential diagnosis with CE were used as controls.

In this study, starting from the full E. granulosus s.s. genome [12], we compiled a database of candidate antigenic peptides having a potential for CE diagnosis based on a multiple-criteria strategy involving the bioinformatics selection of proteins that had the highest probability to be accessible to the immune system, in particular proteins having transmembrane domains and containing a signal peptide [58,59]. Indeed, immunogenic epitopes are present in high density along the transmembrane domains [60]; moreover, the proteins harboring a signal peptide are involved in secretion pathways [61], indicating i) translocation to the cell surface, ii) accessibility by the host immune system [62], and iii) possibility to act as immune-regulators [63]. Considering the antigenic properties of the GPI-anchor and its known role in host-parasite interactions [64,65], GPI-anchor predicted proteins were also added to the analysis. Moreover, the bioinformatics selection was steered to identify epitopes likely recognized by antibodies. The immunogenicity of the selected peptides was tested by microarray using a panel of well-characterized sera obtained from patients with CE cysts in all stage groups, and from clinically meaningful controls, i.e. patients with focal liver lesions potentially requiring differential diagnosis with CE. Our microarray analysis led to the identification of eight antigenic candidates, four of which had an estimated sensitivity of 76%-84% and a specificity of 60%-80% for CE diagnosis vs controls, and four a sensitivity of 70%-73% and a specificity of 70%-90% for diagnosis of active vs inactive CE cysts. There was substantial agreement in terms of reactivity of sera to the selected peptides in microarray and in ELISA, as already reported in studies with similar experimental approaches [54]. However, this was overall not reflected in terms of sensitivity and specificity values in the ELISA validation step. 
Indeed, seven out of eight peptides showed poor accuracy for CE diagnosis or low immunogenicity. Furthermore, none of the candidate peptides was significantly associated with the presence of active or inactive CE cysts. Interestingly, one of these peptides was part of the tetraspanin protein, a membrane protein described as a potential diagnostic candidate for cysticercosis [66] and Schistosoma japonicum infection [67]. However, the validation experiment using an independent panel of sera did not further confirm this result for the diagnosis of CE. Only the reactivity to peptide EGR_08586, part of a hypothetical protein, was significantly higher in the CE group compared to controls. Unfortunately, however, the preliminary results of the accuracy of this peptide in ELISA do not favor its potential for clinical use, since (i) its highest possible sensitivity (72%) would be too low to be used for infection screening purposes; (ii) it is not associated specifically with active infection, which would have opened the possibility for its use as an “active CE-only” screening tool; and (iii) its maximum specificity (86%) would also be unsatisfactory, and coupled with too low a sensitivity (41%), for its possible use as a confirmatory test.

Altogether, these results show that the identified peptides, used as individual antigens, have at best suboptimal accuracy for the diagnosis of CE and are not useful for cyst stage differentiation. This may be due to a low immunogenicity of these peptides, but technical issues such as a low adsorption onto the ELISA plate due to their intrinsic chemical characteristics may have also occurred. The use of single peptides may also be suboptimal, since it has been suggested that the use of a pool of peptides, as opposed to single peptides, might reach higher diagnostic accuracy [68]. Therefore, future studies should focus on the use of combinations of those peptides with predicted good and complementary accuracy for the diagnosis of CE (vs controls) or of active (vs inactive) CE cysts.

This study had several limitations. Firstly, the available sample cohort was relatively small. However, it must be emphasized that many variables are known to influence the results of seroassays [7] and, in the field of echinococcosis, it is exceedingly rare to be able to analyze and obtain results from well-characterized samples from patients with homogeneous clinical characteristics (cyst stage, localization, treatment status), which represents an important value of the study. Secondly, we did not have access to samples from patients with alveolar echinococcosis (AE) caused by Echinococcus multilocularis, which can cause liver lesions entering the differential diagnosis with CE, and which shows a very high rate (70–100% [9,69,70]) of cross-reactivity with E. granulosus s.l. in seroassays. Three of the eight peptides selected for validation showed in silico identity and therefore potential cross-reactivity with E. multilocularis, including EGR_08586. Therefore, analysis of cross-reactivity and sera from patients with AE will have to be included in any further validation study. Thirdly, a protein, as opposed to a peptide, microarray would have likely allowed identifying different and better-performing antigens, since proteins retain their structure allowing complete antibody interactions, including those with conformational epitopes that are recognized by B cells only. However, the use of a protein microarray was unfortunately not possible in our study, nor was the evaluation of peptide combinations. Nevertheless, although linear peptides do not provide conformational information, they are recognized by both T and B lymphocytes and can form weak interactions with specific antibodies. Moreover, when the predictions were performed, there were not enough 3D structures of E. granulosus proteins available in the public databases and in silico prediction systems available at the time were still unsatisfactory. 
Fourthly, here we focused only on three groups of proteins. The choice of these categories was based on their general properties, as mentioned above, which made them theoretically suitable candidates for antigenic properties and secretion in the cyst fluid or in the apical syncytium of the germinal layer, the cellular structure of the parasitic wall. By no means, however, are these groups exhaustive of the protein types with known or potential antigenic properties, nor are they surely exposed/released outside the cyst [23,71]. Non-protein molecules, which could give a more complete picture of the set of antibodies produced against a pathogen [72], were also not included. Finally, this is only one of the potential study designs that could be applied for antigen discovery, while the combination of microarray starting from the pathogen transcriptome or secretome could be an alternative approach [16]. Specifically in the field of CE, however, the complex cyst structure, the difficulty in accessing material from cysts in different stages (from either humans, or naturally infected or experimentally infected hosts/animal models), the limitations of in vitro CE vesicle culture, and the still limited knowledge of what and how parasite molecules reach the external host environment from the cyst [23,73], make these approaches more difficult. All these aspects surely warrant the attention of further studies, hoping that the attention of funding agencies will be drawn also to CE, so far excluded from the neglected tropical diseases prioritized for funding.

In conclusion, here we performed bioinformatics analysis and peptide microarray as a discovery approach to identify antigens which could be useful for the diagnosis of CE. Eight candidates were selected and validated. Reactivity to one peptide was significantly higher in CE patients compared to controls but had suboptimal diagnostic accuracy. Nevertheless, our results show that the approach applied in this study is feasible and encourages the scientific community to implement further studies, expanding the scientific methodology and the identification and validation of other antigenic candidates and their combinations, taking advantage also of the E. granulosus s.s. antigens database compiled in this study.

[END]
---
[1] Url: https://journals.plos.org/plosntds/article?id=10.1371/journal.pntd.0011210

Published and (C) by PLOS One
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.
