(C) PLOS One
This story was originally published by PLOS One and is unaltered.
. . . . . . . . . .
Infrastructure for bioinformatics applications in Tanzania: Lessons from the Sickle Cell Programme [1]
Authors: Liberata A. Mwita, William F. Mawalla, Frank R. Mtiiye, Daniel Kandonga
Affiliations: Department of Pharmaceutical Microbiology; Department of Haematology and Blood Transfusion; Muhimbili University of Health and Allied Sciences, Dar es Salaam
Date: 2023-04
Sickle cell disease (SCD) is a common genetic disorder in Africa. Ongoing SCD research includes the analysis and comparison of variation in phenotypic presentation and disease outcomes against genotypic signatures. This work has contributed to the observed growth of molecular and genetic data in SCD. However, while “omics” data continue to pile up, the capacity to interpret genetic findings and turn them into clinical practice is still underdeveloped, especially in developing regions. Building bioinformatics infrastructure and capacity in these regions is key to bridging the gap. This paper illustrates how the Sickle Cell Programme (SCP) at the Muhimbili University of Health and Allied Sciences (MUHAS) in Tanzania modeled the integration of infrastructure for bioinformatics and clinical research while running day-to-day clinical care for SCD.
Copyright: © 2023 Mwita et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Background
Bioinformatics is the interdisciplinary application of computational tools to harvest, organize, analyze, link, and store biological molecular data [1]. Clinical and epidemiological research continues to reveal significant phenotypic differences in presentation, progression, and disease outcomes in cohorts of patients, such as those with sickle cell anemia, who were initially thought to have a common disease etiology [2,3]. The diverse presentations and clinical manifestations in these patients call for genetics and bioinformatics research to elucidate the underlying causes of variation and so improve individual patient management [2,4]. Thus, genome-wide association studies (GWAS) are expected to play a key role in understanding the genetic and molecular mechanisms underlying individual disease presentation [5,6]. GWAS have benefited sickle cell disease (SCD) research by helping to identify variants that are associated with SCD clinical complications such as vaso-occlusive pain and stroke [7–11]. In addition, GWAS results are now used as input data for functional studies that reveal the mechanisms of candidate genes and single nucleotide polymorphisms (SNPs).
Bioinformatics and sickle cell disease
SCD is the most common monogenic disease in humans [12]. Hemoglobin S (HbS) arises from a single nucleotide substitution, GAG→GTG, in codon 6 of the beta-globin gene on chromosome 11p15.5, which changes the amino acid from glutamic acid to valine (E6V). In hemoglobin C, mutation of the same codon results in a GAG→AAG change that substitutes lysine for glutamic acid (E6K). Individuals with SCD may carry HbSS, the homozygous genotype for hemoglobin S, or HbSC, the compound heterozygous genotype for hemoglobin S and hemoglobin C, while carriers may have the HbAS (sickle cell trait) or HbAC (hemoglobin C trait) genotypes [13–16]. About 75% of the affected populations are born in Africa [12]. The distribution of the disease follows the malaria belt because heterozygotes for the variant (HbAS), who are carriers of the disease, are protected against P. falciparum malaria [15,16]. Efforts to answer the question of how the βS-gene originated and spread in the malaria belt led to the identification of haplotypes in the βS-gene cluster using restriction endonucleases [17]. Even though the resulting molecular defect at the β-globin protein level is the same across the haplotypes, clinical manifestation, disease history, and severity vary significantly across patients [18,19]. The most prominent haplotypes in Africa are Cameroon in Central Africa, Senegal and Benin in West Africa, and Bantu (also known as the Central African Republic haplotype) in East and Central Africa. Early studies showed the Bantu haplotype to be associated with the worst prognosis, with an increased risk of complications and early mortality [14,20]. In particular, the Bantu haplotype is linked with severe vaso-occlusive events, stroke, early onset of end-organ failure (i.e., lung, kidney, and eyes), and death, compared with non-Bantu haplotypes [20].
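The codon changes described above can be checked with a few lines of Python. This is a minimal sketch, not part of the SCP pipelines: the lookup table covers only the three codons named in the text, and the function names `describe_substitution` and `changed_positions` are hypothetical helpers introduced here for illustration.

```python
# Minimal codon lookup covering only the three codons discussed in the text.
# Values are (three-letter, one-letter) amino acid codes from the standard genetic code.
CODON_TO_AMINO_ACID = {
    "GAG": ("Glu", "E"),  # glutamic acid (reference codon 6)
    "GTG": ("Val", "V"),  # valine (hemoglobin S)
    "AAG": ("Lys", "K"),  # lysine (hemoglobin C)
}


def describe_substitution(ref_codon, alt_codon, position=6):
    """Return protein-level shorthand such as 'E6V' for a codon change."""
    ref_aa = CODON_TO_AMINO_ACID[ref_codon][1]
    alt_aa = CODON_TO_AMINO_ACID[alt_codon][1]
    return f"{ref_aa}{position}{alt_aa}"


def changed_positions(ref_codon, alt_codon):
    """List the 1-based positions within the codon where the bases differ."""
    return [i + 1 for i, (a, b) in enumerate(zip(ref_codon, alt_codon)) if a != b]


print(describe_substitution("GAG", "GTG"), changed_positions("GAG", "GTG"))  # E6V [2]
print(describe_substitution("GAG", "AAG"), changed_positions("GAG", "AAG"))  # E6K [1]
```

Both variants alter a single base of the same codon: the second base for HbS and the first base for HbC.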
Until recently, the βS-gene was thought to have arisen from different mutations occurring at different times in the same locus, which would explain the existence of different haplotypes. However, a more recent analysis supports the theory of a single-origin mutation that occurred before the development of haplotypes [21].
The Sickle Cell Programme (SCP) [22] is one of the Tanzanian H3ABioNet nodes. H3ABioNet supports the bioinformatics capacity and the team working on the data collected by clinicians and researchers in the different research projects at SCP [23]. The SCP began with a database of GWAS data from 1,952 SCD individuals that originally sought to elucidate the association between SCD and fetal hemoglobin (HbF) [11,22]. The findings of the original study led to a new study that selected a few individuals with extreme HbF levels and used targeted next-generation sequencing to identify the genetic variants associated with those levels [24]. The GWAS database has now enabled the initiation of several other parallel studies. The SCP maintains a server dedicated to bioinformatics analysis. It also has access to high-performance computing (HPC) resources through collaboration with the Dar es Salaam Institute of Technology (DIT) in Tanzania. At the moment, the network supports 2 main bioinformatics pipelines: GWAS data analysis and next-generation sequencing (NGS) data analysis. The center has GWAS data for approximately 2,000 patients. Different phenotypes and conditions (i.e., hemoglobin F levels, anemia, and liver function) are also continuously analyzed against patient genotypes. The team has 2 bioinformaticians, 1 software engineer, 1 informatician, and 1 statistician employed to support the bioinformatics infrastructure. The team works closely within the program, whose members include data clerks, clinicians, nurses, and other researchers.
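As an illustration of what analyzing a phenotype against a genotype involves, the sketch below fits a simple additive model for one variant: the phenotype is regressed on allele dosage (0, 1, or 2 copies) by ordinary least squares. It is an assumption-laden toy rather than the SCP pipeline, which the text does not describe in detail. The function name `additive_effect` is hypothetical, the numbers are made up for illustration and are not SCP data, and a real GWAS would add covariates, quality control, and multiple-testing correction.

```python
from statistics import mean


def additive_effect(dosages, phenotypes):
    """Least-squares slope and intercept of a phenotype on allele dosage (0, 1, 2)."""
    if len(dosages) != len(phenotypes) or len(dosages) < 2:
        raise ValueError("need two or more paired observations")
    x_bar = mean(dosages)
    y_bar = mean(phenotypes)
    sxx = sum((x - x_bar) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("no genotype variation at this variant")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dosages, phenotypes))
    slope = sxy / sxx  # estimated change in phenotype per extra allele copy
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up toy values for illustration only (not SCP data).
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_hbf_percent = [4.0, 6.0, 7.0, 9.0, 10.0, 12.0]

slope, intercept = additive_effect(toy_dosages, toy_hbf_percent)
print(f"per-allele effect: {slope:.1f}, intercept: {intercept:.1f}")
```

In a genome-wide setting this fit is repeated for every variant, which is one reason access to a dedicated server or HPC resources matters.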
Bioinformatics in Africa
As a discipline and an integral component of health research and disease causal-intervention analysis, bioinformatics is lagging in Africa [25]. To build bioinformatics capacity and familiarize the continent and its scientists with this growing field, the Human Heredity and Health in Africa (H3Africa) initiative (https://h3abionet.org/) provides a framework for integration and communication to enable full exploitation of Africa’s genomic and environmental data [26]. H3ABioNet, the Pan African Bioinformatics Network, is building bioinformatics capacity through different tasks, including building Pan-African informatics infrastructure and providing informatics support to H3Africa projects, in addition to developing the H3Africa data coordinating center [27,28]. It also organizes regular online and onsite short bioinformatics courses in several African countries to enable participants to acquire basic and advanced knowledge of bioinformatics [29]. Some African universities that received H3ABioNet funding were equipped with computational facilities such as desktops and servers, which enabled them to set up dedicated computer laboratories for running bioinformatics practical classes. This allowed undergraduate bioinformatics courses to begin in these universities [27,30]. In South Africa, bioinformatics started in the mid-1990s with the founding of the South African National Bioinformatics Institute (SANBI) at the University of the Western Cape [31]. SANBI expanded throughout the country by establishing computational biology units in many universities involved in undergraduate and postgraduate training in bioinformatics and research activities. Researchers involved the government from the beginning; the government provides funds and creates employment opportunities [31]. The establishment of societies, networks, and collaborations within and outside the country contributed to the rapid growth of the bioinformatics field [31,32]. South Africa is considered a leader in bioinformatics among African countries [31]. In the early 2000s, West African countries led by Nigeria and Ghana started introducing bioinformatics as an academic field by hosting seminars, workshops, and symposia and later developing postgraduate courses [32–35].
The growing demand for experts and bioinformatics applications led to the establishment of the Nigerian Bioinformatics and Genomics Network (NBGN) in 2019 [34]. Elsewhere, in Eastern Africa, support from the Fogarty International Center of the National Institutes of Health (FIC-NIH, United States of America) and the Initiative to Develop African Research Leaders (IDeAL) facilitated the launch of the Eastern Africa Network for Bioinformatics Training (EANBiT). EANBiT (http://eanbit.icipe.org/) is a network of universities and research centers in Kenya, Uganda, and Tanzania. It offers fellowships for bioinformatics training at the Master’s level to increase the number and quality of potential PhD students and researchers.
Bioinformatics in other countries
Different countries have followed different paths in recognizing their bioinformatics needs and in building the infrastructure and education required to start and sustain bioinformatics in their settings. Greece and Cyprus [36] started their bioinformatics journey in the 1980s, and establishment took 20 years. Collaboration was key to the recognition of bioinformatics activities in both countries, and informal and formal meetings between research groups were crucial. The establishment stage was followed by a recognition stage that led to the growth of bioinformatics in terms of research activities, education and training, service provision, and other community and infrastructure projects. The authors recommended identifying national priorities, expanding research and research support without neglecting training at all levels, connecting with industry, and participating in local and international conferences as some of the means that facilitated the growth of bioinformatics in the 2 countries [36]. The objectives of bioinformatics training differ in level of detail and content depending on the needs of the end user. Such training is crucial as the size and complexity of sequencing data increase, and it needs to be tailored to the end user [37]. Countries with different backgrounds, tools, and knowledge should therefore tailor their bioinformatics training and research to their own needs.
The integration model of sickle cell disease research in Tanzania
The SCP engages in health provision in parallel with research, education, and advocacy activities [38]. Comprehensive healthcare is made possible through the integration of data capture and management, ICT, and day-to-day clinical services. The approach provides a platform for improved clinical care, research, and rapid feedback mechanisms (Fig 1).
Fig 1. SCP service integration model. The SCP has a central office that coordinates all the program’s activities. The center has 3 key sections: the Clinical and Research unit, the Data and Biorepository unit, and the Advocacy and Education unit. The center supports and coordinates the activities of the sickle cell clinics across the regional hospitals, and all the clinics are linked with the center’s servers using ICT systems, enabling online data collection and communication.
https://doi.org/10.1371/journal.pcbi.1010848.g001
[END]
---
[1] Url:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010848
Published and (C) by PLOS One
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.