Assessment of vector-host-pathogen relationships using data mining and machine learning.

Abstract:

Infectious diseases, including vector-borne diseases transmitted by arthropods, are a leading cause of morbidity and mortality worldwide. In the era of big data, addressing broad-scale, fundamental questions regarding the complex dynamics of these diseases will increasingly require the integration of diverse datasets to produce new biological knowledge. This review provides a current snapshot of the systematic assessment of the relationships between microbial pathogens, arthropod vectors and mammalian hosts using data mining and machine learning. We employ PRISMA to identify 32 key papers relevant to this topic. Our analysis shows an increasing use of data mining and machine learning tasks and techniques, including prediction, classification, clustering, association rules mining, and deep learning, over the last decade. However, it also reveals a number of critical challenges in applying these to the study of vector-host-pathogen interactions at various systems biology levels. Here, relevant studies, current limitations and future directions are discussed. Furthermore, the quality of data in relevant papers was assessed using the FAIR (Findable, Accessible, Interoperable, Reusable) compliance criteria to evaluate and encourage reproducibility and shareability of research outcomes. Although shortcomings in their application remain, data mining and machine learning have significant potential to break new ground in understanding fundamental aspects of vector-host-pathogen relationships and their application in this field should be encouraged. In particular, while predictive modeling, feature engineering and supervised machine learning are already being used in the field, other data mining and machine learning methods such as deep learning and association rules analysis lag behind and should be implemented in combination with established methods to accelerate hypothesis and knowledge generation in the domain.

authors

Agany DDM,Pietri JE,Gnimpieba EZ

doi

10.1016/j.csbj.2020.06.031

subject

Has Abstract

pub_date

2020-06-25 00:00:00

pages

1704-1721

issn

2001-0370

pii

S2001-0370(20)30320-2

journal_volume

18

pub_type

Journal Article, Review
  • Fungi.guru: Comparative genomic and transcriptomic resource for the fungi kingdom.

    abstract::The fungi kingdom is composed of eukaryotic heterotrophs, which are responsible for balancing the ecosystem and play a major role as decomposers. They also produce a vast diversity of secondary metabolites, which have antibiotic or pharmacological properties. However, our lack of knowledge of gene function in fungi pr...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.11.019

    authors: Lim JJJ,Koh J,Moo JR,Villanueva EMF,Putri DA,Lim YS,Seetoh WS,Mulupuri S,Ng JWZ,Nguyen NLU,Reji R,Foo H,Zhao MX,Chan TL,Rodrigues EE,Kairon RS,Hee KM,Chee NC,Low AD,Chen ZHX,Lim SC,Lunardi V,Fong TC,Chua CX

    update_date: 2020-11-20 00:00:00

  • Microvesicles from indoxyl sulfate-treated endothelial cells induce vascular calcification in vitro.

    abstract::Vascular calcification (VC), an unpredictable pathophysiological process and critical event in patients with cardiovascular diseases (CVDs), is the leading cause of morbi-mortality and disability in chronic kidney disease (CKD) patients worldwide. Currently, no diagnostic method is available for identifying patients a...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.04.006

    authors: Alique M,Bodega G,Corchete E,García-Menéndez E,de Sequera P,Luque R,Rodríguez-Padrón D,Marqués M,Portolés J,Carracedo J,Ramírez R

    update_date: 2020-04-09 00:00:00

  • The era of big data: Genome-scale modelling meets machine learning.

    abstract::With omics data being generated at an unprecedented rate, genome-scale modelling has become pivotal in its organisation and analysis. However, machine learning methods have been gaining ground in cases where knowledge is insufficient to represent the mechanisms underlying such data or as a means for data curation prio...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2020.10.011

    authors: Antonakoudis A,Barbosa R,Kotidis P,Kontoravdi C

    update_date: 2020-10-16 00:00:00

  • The impact of Gag non-cleavage site mutations on HIV-1 viral fitness from integrative modelling and simulations.

    abstract::The high mutation rate in retroviruses is one of the leading causes of drug resistance. In human immunodeficiency virus type-1 (HIV-1), synergistic mutations in its protease and the protease substrate - the Group-specific antigen (Gag) polyprotein - work together to confer drug resistance against protease inhibitors a...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.12.022

    authors: Samsudin F,Gan SK,Bond PJ

    update_date: 2020-12-23 00:00:00

  • BOG: R-package for Bacterium and virus analysis of Orthologous Groups.

    abstract::BOG (Bacterium and virus analysis of Orthologous Groups) is a package for identifying groups of differentially regulated genes in the light of gene functions for various virus and bacteria genomes. It is designed to identify Clusters of Orthologous Groups (COGs) that are enriched among genes that have gone through sig...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2015.05.002

    authors: Park J,Taslim C,Lin S

    update_date: 2015-05-21 00:00:00

  • Effect of mutations on the thermostability of Aspergillus aculeatus β-1,4-galactanase.

    abstract::New variants of β-1,4-galactanase from the mesophilic organism Aspergillus aculeatus were designed using the structure of β-1,4-galactanase from the thermophile organism Myceliophthora thermophila as a template. Some of the variants were generated using PROPKA 3.0, a validated pKa prediction tool, to test its usefulne...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2015.03.010

    authors: Torpenholt S,De Maria L,Olsson MH,Christensen LH,Skjøt M,Westh P,Jensen JH,Lo Leggio L

    update_date: 2015-04-09 00:00:00

  • Meta-analysis of Liver and Heart Transcriptomic Data for Functional Annotation Transfer in Mammalian Orthologs.

    abstract::Functional annotation transfer across multi-gene family orthologs can lead to functional misannotations. We hypothesised that co-expression network will help predict functional orthologs amongst complex homologous gene families. To explore the use of transcriptomic data available in public domain to identify functiona...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2017.08.002

    authors: Reyes PFL,Michoel T,Joshi A,Devailly G

    update_date: 2017-08-26 00:00:00

  • Mini-review: Strategies for Variation and Evolution of Bacterial Antigens.

    abstract::Across the eubacteria, antigenic variation has emerged as a strategy to evade host immunity. However, phenotypic variation in some of these antigens also allows the bacteria to exploit variable host niches as well. The specific mechanisms are not shared-derived characters although there is considerable convergent evol...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2015.07.002

    authors: Foley J

    update_date: 2015-07-26 00:00:00

  • Dynamic exchange of two types of stator units in Bacillus subtilis flagellar motor in response to environmental changes.

    abstract::Bacteria can migrate towards more suitable environments by rotating flagella that are under the control of sensory signal transduction networks. The bacterial flagellum is composed of the long helical filament functioning as a propeller, the flexible hook as a universal joint and the basal body as a rotary motor power...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2020.10.009

    authors: Terahara N,Namba K,Minamino T

    update_date: 2020-10-15 00:00:00

  • Developing a mobile application to better inform patients and enable effective consultation in implant dentistry.

    abstract::The field of dentistry lacks satisfactory tools to help visualize planned procedures and their potential results to patients. Dentists struggle to provide an effective image in their patient's mind of the end results of the planned treatment only through verbal explanations. Thus, verbal explanations alone often canno...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2016.06.006

    authors: Canbazoglu E,Salman YB,Yildirim ME,Merdenyan B,Ince IF

    update_date: 2016-06-29 00:00:00

  • Combining Ramachandran plot and molecular dynamics simulation for structural-based variant classification: Using TP53 variants as model.

    abstract::The wide application of new DNA sequencing technologies is generating vast quantities of genetic variation data at unprecedented speed. Developing methodologies to decode the pathogenicity of the variants is imperatively demanding. We hypothesized that as deleterious variants may function through disturbing structural...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.11.041

    authors: Tam B,Sinha S,Wang SM

    update_date: 2020-12-02 00:00:00

  • Statistical methods for the analysis of high-throughput metabolomics data.

    abstract::Metabolomics is a relatively new high-throughput technology that aims at measuring all endogenous metabolites within a biological sample in an unbiased fashion. The resulting metabolic profiles may be regarded as functional signatures of the physiological state, and have been shown to comprise effects of genetic regul...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.5936/csbj.201301009

    authors: Bartel J,Krumsiek J,Theis FJ

    update_date: 2013-03-22 00:00:00

  • Reactive oxygen species: A generalist in regulating development and pathogenicity of phytopathogenic fungi.

    abstract::Reactive oxygen species (ROS) are small molecules with high oxidative activity, and are usually produced as byproducts of metabolic processes in organisms. ROS play an important role during the interaction between plant hosts and pathogenic fungi. Phytopathogenic fungi have evolved sophisticated ROS producing and scav...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2020.10.024

    authors: Zhang Z,Chen Y,Li B,Chen T,Tian S

    update_date: 2020-11-04 00:00:00

  • AUDACITY: A comprehensive approach for the detection and classification of Runs of Homozygosity in medical and population genomics.

    abstract::Runs of Homozygosity (RoHs) are popular among geneticists as the footprint of demographic processes, evolutionary forces and inbreeding in shaping our genome, and are known to confer risk of Mendelian and complex diseases. Notwithstanding growing interest in their study, there is unmet need for reliable and rapid meth...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.07.003

    authors: Magi A,Giangregorio T,Semeraro R,Carangelo G,Palombo F,Romeo G,Seri M,Pippucci T

    update_date: 2020-07-14 00:00:00

  • Network analysis of human post-mortem microarrays reveals novel genes, microRNAs, and mechanistic scenarios of potential importance in fighting huntington's disease.

    abstract::Huntington's disease is a progressive neurodegenerative disorder characterized by motor disturbances, cognitive decline, and neuropsychiatric symptoms. In this study, we utilized network-based analysis in an attempt to explore and understand the underlying molecular mechanism and to identify critical molecular players...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2016.02.001

    authors: Chandrasekaran S,Bonchev D

    update_date: 2016-02-10 00:00:00

  • Computational drug repurposing for inflammatory bowel disease using genetic information.

    abstract::As knowledge of the genetics behind inflammatory bowel disease (IBD) has continually improved, there has been a demand for methods that can use this data in a clinically significant way. Genome-wide association analyses for IBD have identified 232 risk genetic loci for the disorder. While identification of these risk ...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2019.01.001

    authors: Grenier L,Hu P

    update_date: 2019-01-07 00:00:00

  • YaTCM: Yet another Traditional Chinese Medicine Database for Drug Discovery.

    abstract::Traditional Chinese Medicine (TCM) has a long history of widespread clinical applications, especially in East Asia, and is becoming frequently used in Western countries. However, owing to extreme complicacy in both chemical ingredients and mechanism of action, a deep understanding of TCM is still difficult. To acceler...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2018.11.002

    authors: Li B,Ma C,Zhao X,Hu Z,Du T,Xu X,Wang Z,Lin J

    update_date: 2018-11-23 00:00:00

  • Current computational methods for predicting protein interactions of natural products.

    abstract::Natural products (NPs) are an indispensable source of drugs and they have a better coverage of the pharmacological space than synthetic compounds, owing to their high structural diversity. The prediction of their interaction profiles with druggable protein targets remains a major challenge in modern drug discovery. Ex...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2019.08.008

    authors: Moumbock AFA,Li J,Mishra P,Gao M,Günther S

    update_date: 2019-10-28 00:00:00

  • On fusion methods for knowledge discovery from multi-omics datasets.

    abstract::Recent years have witnessed the tendency of measuring a biological sample on multiple omics scales for a comprehensive understanding of how biological activities on varying levels are perturbed by genetic variants, environments, and their interactions. This new trend raises substantial challenges to data integration a...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2020.02.011

    authors: Baldwin E,Han J,Luo W,Zhou J,An L,Liu J,Zhang HH,Li H

    update_date: 2020-03-05 00:00:00

  • An Artificial Neural Network Integrated Pipeline for Biomarker Discovery Using Alzheimer's Disease as a Case Study.

    abstract::The field of machine learning has allowed researchers to generate and analyse vast amounts of data using a wide variety of methodologies. Artificial Neural Networks (ANN) are some of the most commonly used statistical models and have been successful in biomarker discovery studies in multiple disease types. This review...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2018.02.001

    authors: Zafeiris D,Rutella S,Ball GR

    update_date: 2018-02-21 00:00:00

  • A molecular device: A DNA molecular lock driven by the nicking enzymes.

    abstract::As people are placing more and more importance on information security, how to realize the protection of information has become a hotspot of current research. As a security device, DNA molecular locks have great potential to realize information protection at the molecular level. However, building a highly secure molec...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.08.004

    authors: Zhang X,Zhang Q,Liu Y,Wang B,Zhou S

    update_date: 2020-08-06 00:00:00

  • Quantitative comparison of ABC membrane protein type I exporter structures in a standardized way.

    abstract::An increasing number of ABC membrane protein structures are determined by cryo-electron microscopy and X-ray crystallography, consequently identifying differences between their conformations has become an arising issue. Therefore, we propose to define standardized measures for ABC Type I exporter structure characteriz...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2018.10.008

    authors: Csizmadia G,Farkas B,Spagina Z,Tordai H,Hegedűs T

    update_date: 2018-10-18 00:00:00

  • Alternative splicing: Human disease and quantitative analysis from high-throughput sequencing.

    abstract::Alternative splicing contributes to the majority of protein diversity in higher eukaryotes by allowing one gene to generate multiple distinct protein isoforms. It adds another regulation layer of gene expression. Up to 95% of human multi-exon genes undergo alternative splicing to encode proteins with different functio...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.1016/j.csbj.2020.12.009

    authors: Jiang W,Chen L

    update_date: 2020-12-24 00:00:00

  • Discovery of a novel (R)-selective bacterial hydroxynitrile lyase from Acidobacterium capsulatum.

    abstract::Hydroxynitrile lyases (HNLs) are powerful carbon-carbon bond forming enzymes. The reverse of their natural reaction - the stereoselective addition of hydrogen cyanide (HCN) to carbonyls - yields chiral cyanohydrins, versatile building blocks for the pharmaceutical and chemical industry. Recently, bacterial HNLs have b...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2014.07.002

    authors: Wiedner R,Gruber-Khadjawi M,Schwab H,Steiner K

    update_date: 2014-07-08 00:00:00

  • Engineering microbes for plant polyketide biosynthesis.

    abstract::Polyketides are an important group of secondary metabolites, many of which have important industrial applications in the food and pharmaceutical industries. Polyketides are synthesized from one of three classes of enzymes differentiated by their biochemical features and product structure: type I, type II or type III p...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article, Review

    doi:10.5936/csbj.201210020

    authors: Lussier FX,Colatriano D,Wiltshire Z,Page JE,Martin VJ

    update_date: 2013-02-22 00:00:00

  • Structure-based discovery of neoandrographolide as a novel inhibitor of Rab5 to suppress cancer growth.

    abstract::Rab5 is a small GTPase that plays a crucial role in oncogenic signal transduction, which was considered as an attractive target for cancer therapy. Rapid GDP/GTP exchange in the packet of Rab5 sustains its high activity for promoting cancer progression. However, Rab5 currently remains undruggable due to the lack of sp...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.11.033

    authors: Zhang J,Sun Y,Zhong LY,Yu NN,Ouyang L,Fang RD,Wang Y,He QY

    update_date: 2020-11-30 00:00:00

  • Causal inference for the effect of environmental chemicals on chronic kidney disease.

    abstract::The impacts of environmental chemicals on the decline of kidney function have been suggested by a limited number of statistical and animal studies. Thus, those exposures may be modifiable risk factors for chronic kidney disease. Some of the chemicals, such as Perfluoroalkyl acid (PFA), are pervasive throughout our env...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2019.12.001

    authors: Zhao J,Hinton P,Chen J,Jiang J

    update_date: 2019-12-17 00:00:00

  • Both intra and inter-domain interactions define the intrinsic dynamics and allosteric mechanism in DNMT1s.

    abstract::DNA methyltransferase 1 (DNMT1), a large multidomain enzyme, is believed to be involved in the passive transmission of genomic methylation patterns via methylation maintenance. Yet, the molecular mechanism of interaction networks underlying DNMT1 structures, dynamics, and its biological significance has yet to be full...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.03.016

    authors: Liang Z,Zhu Y,Long J,Ye F,Hu G

    update_date: 2020-03-23 00:00:00

  • A Blockchain-Based Notarization Service for Biomedical Knowledge Retrieval.

    abstract::Biomedical research and clinical decision depend increasingly on scientific evidence realized by a number of authoritative databases, mostly public and continually enriched via peer scientific contributions. Given the dynamic nature of biomedical evidence data and their usage in the sensitive domain of biomedical scie...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2018.08.002

    authors: Kleinaki AS,Mytis-Gkometh P,Drosatos G,Efraimidis PS,Kaldoudi E

    update_date: 2018-08-17 00:00:00

  • Rewired functional regulatory networks among miRNA isoforms (isomiRs) from let-7 and miR-10 gene families in cancer.

    abstract::Classical microRNA (miRNA) has been so far believed as a single sequence, but it indeed contains multiple miRNA isoforms (isomiR) with various sequences and expression patterns. It is not clear whether these diverse isomiRs have potential relationships and whether they contribute to miRNA:mRNA interactions. Here, we a...

    journal_title:Computational and structural biotechnology journal

    pub_type: Journal Article

    doi:10.1016/j.csbj.2020.05.001

    authors: Liang T,Han L,Guo L

    update_date: 2020-05-13 00:00:00