Predictability of drug-induced liver injury by machine learning.


BACKGROUND: Drug-induced liver injury (DILI) is a major concern in drug development, as hepatotoxicity may not be apparent at early stages but can lead to life-threatening consequences. The ability to predict DILI from in vitro data would be a crucial advantage. In 2018, the Critical Assessment of Massive Data Analysis (CAMDA) group proposed the CMap Drug Safety challenge focusing on DILI prediction.

METHODS AND RESULTS: The challenge data included Affymetrix GeneChip expression profiles for the two cancer cell lines MCF7 and PC3 treated with 276 drug compounds and empty vehicles. Binary DILI labeling and a recommended train/test split for the development of predictive classification approaches were also provided. We devised three deep learning architectures for DILI prediction on the challenge data and compared them to random forest and multi-layer perceptron classifiers. On a subset of the data and for some of the models, we additionally tested several strategies for balancing the two DILI classes and for identifying alternative informative train/test splits. All the models were trained with the MAQC data analysis protocol (DAP), i.e., 10x5 cross-validation over the training set. In all the experiments, the classification performance in both cross-validation and external validation gave Matthews correlation coefficient (MCC) values below 0.2. We observed minimal differences between the two cell lines. Notably, deep learning approaches did not improve classification performance.

DISCUSSION: We extensively tested multiple machine learning approaches for the DILI classification task, obtaining poor to mediocre performance. The results suggest that the CMap expression data on the two cell lines MCF7 and PC3 are not sufficient for accurate DILI label prediction.

REVIEWERS: This article was reviewed by Maciej Kandula and Paweł P. Labaj.
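The two evaluation ingredients named in the abstract, the Matthews correlation coefficient and 10x5 cross-validation, can be illustrated with a minimal sketch. This is not the authors' code: the function names `mcc` and `repeated_stratified_kfold` are hypothetical, and stratifying the folds by class and seeding the shuffle are assumptions made here for illustration, since the abstract does not state how the DAP builds its folds.

```python
import math
import random
from collections import defaultdict


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def repeated_stratified_kfold(labels, n_repeats=10, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) pairs for n_repeats x n_splits CV.

    Assumption: folds are stratified so each keeps the class proportions.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            # Deal the samples of each class round-robin across the folds.
            for j, i in enumerate(shuffled):
                folds[j % n_splits].append(i)
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for f, fold in enumerate(folds) if f != k for i in fold)
            yield train, test


if __name__ == "__main__":
    # Toy labels only; these are not the challenge data.
    toy_labels = [0] * 20 + [1] * 10
    n_rounds = sum(1 for _ in repeated_stratified_kfold(toy_labels))
    print("CV rounds:", n_rounds)
    print("MCC of a perfect prediction:", mcc([1, 0, 1, 0], [1, 0, 1, 0]))
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, which is why the reported values below 0.2 indicate weak predictive power even if plain accuracy on an imbalanced label set looks acceptable.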


Biol Direct


Biology direct


Chierici M,Francescatto M,Bussola N,Jurman G,Furlanello C






2020-02-13 00:00:00












  • Putative adaptive inter-slope divergence of transposon frequency in fruit flies (Drosophila melanogaster) at "Evolution Canyon", Mount Carmel, Israel.

    abstract:BACKGROUND:The current analysis of transposon elements (TE) in Drosophila melanogaster at Evolution Canyon (EC), Israel, is based on data and analysis done by our collaborators (Drs. J. Gonzalez, J. Martinez and W. Makalowski, this issue). They estimated the frequencies of 28 TEs (transposon elements) in fruit flies (...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Beiles A,Raz S,Ben-Abu Y,Nevo E

    update_date:2015-10-14 00:00:00

  • Outer membrane protein genes and their small non-coding RNA regulator genes in Photorhabdus luminescens.

    abstract:INTRODUCTION:Three major outer membrane protein genes of Escherichia coli, ompF, ompC, and ompA respond to stress factors. Transcripts from these genes are regulated by the small non-coding RNAs micF, micC, and micA, respectively. Here we examine Photorhabdus luminescens, an organism that has a different habitat from E...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Papamichail D,Delihas N

    update_date:2006-05-22 00:00:00

  • Domain enhanced lookup time accelerated BLAST.

    abstract:BACKGROUND:BLAST is a commonly-used software package for comparing a query sequence to a database of known sequences; in this study, we focus on protein sequences. Position-specific-iterated BLAST (PSI-BLAST) iteratively searches a protein sequence database, using the matches in round i to construct a position-specific...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Boratyn GM,Schäffer AA,Agarwala R,Altschul SF,Lipman DJ,Madden TL

    update_date:2012-04-17 00:00:00

  • On origin of genetic code and tRNA before translation.

    abstract:BACKGROUND:Synthesis of proteins is based on the genetic code - a nearly universal assignment of codons to amino acids (aas). A major challenge to the understanding of the origins of this assignment is the archetypal "key-lock vs. frozen accident" dilemma. Here we re-examine this dilemma in light of 1) the fundamental ...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Rodin AS,Szathmáry E,Rodin SN

    update_date:2011-02-22 00:00:00

  • Rotational restriction of nascent peptides as an essential element of co-translational protein folding: possible molecular players and structural consequences.

    abstract:BACKGROUND:A basic tenet of protein science is that all information about the spatial structure of proteins is present in their sequences. Nonetheless, many proteins fail to attain native structure upon experimental denaturation and refolding in vitro, raising the question of the specific role of cellular machinery in ...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Sorokina I,Mushegian A

    update_date:2017-05-31 00:00:00

  • A computational approach to candidate gene prioritization for X-linked mental retardation using annotation-based binary filtering and motif-based linear discriminatory analysis.

    abstract:BACKGROUND:Several computational candidate gene selection and prioritization methods have recently been developed. These in silico selection and prioritization techniques are usually based on two central approaches--the examination of similarities to known disease genes and/or the evaluation of functional annotation of...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Lombard Z,Park C,Makova KD,Ramsay M

    update_date:2011-06-13 00:00:00

  • Global analyses of Chromosome 17 and 18 genes of lung telocytes compared with mesenchymal stem cells, fibroblasts, alveolar type II cells, airway epithelial cells, and lymphocytes.

    abstract:BACKGROUND:Telocytes (TCs) are interstitial cells with extremely long and thin telopodes (Tps) with thin segments (podomers) and dilations (podoms) to interact with neighboring cells. TCs have been found in different organs, while there is still a lack of TCs-specific biomarkers to distinguish TCs from the other cells...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Wang J,Ye L,Jin M,Wang X

    update_date:2015-03-11 00:00:00

  • Pathophysiology of Crohn's disease inflammation and recurrence.

    abstract:Crohn's disease is a chronic inflammatory intestinal disease, first described at the beginning of the last century. The disease is characterized by the alternation of periods of flares and remissions influenced by a complex pathogenesis in which inflammation plays a key role. Crohn's disease evolution is mediated by a...

    journal_title:Biology direct

    pub_type: Journal Article, Review

    authors: Petagna L,Antonelli A,Ganini C,Bellato V,Campanelli M,Divizia A,Efrati C,Franceschilli M,Guida AM,Ingallinella S,Montagnese F,Sensi B,Siragusa L,Sica GS

    update_date:2020-11-07 00:00:00

  • Stringent homology-based prediction of H. sapiens-M. tuberculosis H37Rv protein-protein interactions.

    abstract:BACKGROUND:H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are essential for understanding the infection mechanism of the formidable pathogen M. tuberculosis H37Rv. Computational prediction is an important strategy to fill the gap in experimental H. sapiens-M. tuberculosis H37Rv PPI data. Homolo...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Zhou H,Gao S,Nguyen NN,Fan M,Jin J,Liu B,Zhao L,Xiong G,Tan M,Li S,Wong L

    update_date:2014-04-08 00:00:00

  • Hereditary profiles of disorderly transcription?

    abstract:BACKGROUND:Microscopic examination of living cells often reveals that cells from some cell strains appear to be in a permanent state of disarray without obvious reason. In all probability such a disorderly state affects cell functioning. The aim of this study was to establish whether a disorderly state could occur that...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Simons JW

    update_date:2006-04-02 00:00:00

  • A network-based approach to classify the three domains of life.

    abstract:BACKGROUND:Identifying group-specific characteristics in metabolic networks can provide better insight into evolutionary developments. Here, we present an approach to classify the three domains of life using topological information about the underlying metabolic networks. These networks have been shown to share domain-...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Mueller LA,Kugler KG,Netzer M,Graber A,Dehmer M

    update_date:2011-10-13 00:00:00

  • Structural analysis of hubs in human NR-RTK network.

    abstract:BACKGROUND:Currently, a huge amount of protein-protein interaction data is available; extracting the meaningful interactions is therefore a challenging task. In a protein-protein interaction network, hubs are considered as key proteins maintaining function and stability of the network. Therefore, studying protein-protein complexes ...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Choura M,Rebaï A

    update_date:2011-10-05 00:00:00

  • Proteomic changes associated with deletion of the Magnaporthe oryzae conidial morphology-regulating gene COM1.

    abstract:BACKGROUND:The rice blast disease caused by Magnaporthe oryzae is a major constraint on world rice production. The conidia produced by this fungal pathogen are the main source of disease dissemination. The morphology of conidia may be a critical factor in the spore dispersal and virulence of M. oryzae in the field. Del...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Bhadauria V,Wang LX,Peng YL

    update_date:2010-11-02 00:00:00

  • Plant viruses of the Amalgaviridae family evolved via recombination between viruses with double-stranded and negative-strand RNA genomes.

    abstract:Plant viruses of the recently recognized family Amalgaviridae have monopartite double-stranded (ds) RNA genomes and encode two proteins: an RNA-dependent RNA polymerase (RdRp) and a putative capsid protein (CP). Whereas the RdRp of amalgaviruses has been found to be most closely related to the RdRps of dsRNA viruses o...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Krupovic M,Dolja VV,Koonin EV

    update_date:2015-03-29 00:00:00

  • A highly conserved family of inactivated archaeal B family DNA polymerases.

    abstract:A widespread and highly conserved family of apparently inactivated derivatives of archaeal B-family DNA polymerases is described. Phylogenetic analysis shows that the inactivated forms comprise a distinct clade among archaeal B-family polymerases and that, within this clade, Euryarchaea and Crenarchaea are clearly sep...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Rogozin IB,Makarova KS,Pavlov YI,Koonin EV

    update_date:2008-08-06 00:00:00

  • The archaeo-eukaryotic GINS proteins and the archaeal primase catalytic subunit PriS share a common domain.

    abstract:UNLABELLED:Primase and GINS are essential factors for chromosomal DNA replication in eukaryotic and archaeal cells. Here we describe a previously undetected relationship between the C-terminal domain of the catalytic subunit (PriS) of archaeal primase and the B-domains of the archaeo-eukaryotic GINS proteins in the for...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Swiatek A,Macneill SA

    update_date:2010-04-12 00:00:00

  • xHMMER3x2: Utilizing HMMER3's speed and HMMER2's sensitivity and specificity in the glocal alignment mode for improved large-scale protein domain annotation.

    abstract:BACKGROUND:While the local-mode HMMER3 is notable for its massive speed improvement, the slower glocal-mode HMMER2 is more exact for domain annotation by enforcing full domain-to-sequence alignments. Since a unit of domain necessarily implies a unit of function, local-mode HMMER3 alone remains insufficient for precise ...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Yap CK,Eisenhaber B,Eisenhaber F,Wong WC

    update_date:2016-11-29 00:00:00

  • Component retention in principal component analysis with application to cDNA microarray data.

    abstract:Shannon entropy is used to provide an estimate of the number of interpretable components in a principal component analysis. In addition, several ad hoc stopping rules for dimension determination are reviewed and a modification of the broken stick model is presented. The modification incorporates a test for the presenc...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Cangelosi R,Goriely A

    update_date:2007-01-17 00:00:00

  • Optimal treatment and stochastic modeling of heterogeneous tumors.

    abstract:UNLABELLED:In this work we review past articles that have mathematically studied cancer heterogeneity and the impact of this heterogeneity on the structure of optimal therapy. We look at past works on modeling how heterogeneous tumors respond to radiotherapy, and take a particularly close look at how the optimal radiot...

    journal_title:Biology direct

    pub_type: Journal Article, Review

    authors: Badri H,Leder K

    update_date:2016-08-23 00:00:00

  • Evolution before genes.

    abstract:BACKGROUND:Our current understanding of evolution is so tightly linked to template-dependent replication of DNA and RNA molecules that the old idea from Oparin of a self-reproducing 'garbage bag' ('coacervate') of chemicals that predated fully-fledged cell-like entities seems to be farfetched to most scientists today. ...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Vasas V,Fernando C,Santos M,Kauffman S,Szathmáry E

    update_date:2012-01-05 00:00:00

  • Why call it developmental bias when it is just development?

    abstract:The concept of developmental constraints has been central to understand the role of development in morphological evolution. Developmental constraints are classically defined as biases imposed by development on the distribution of morphological variation. This opinion article argues that the concepts of developmental co...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Salazar-Ciudad I

    update_date:2021-01-09 00:00:00

  • Once upon a time the cell membranes: 175 years of cell boundary research.

    abstract:All modern cells are bounded by cell membranes best described by the fluid mosaic model. This statement is so widely accepted by biologists that little attention is generally given to the theoretical importance of cell membranes in describing the cell. This has not always been the case. When the Cell Theory was first ...

    journal_title:Biology direct

    pub_type: Historical Article, Journal Article, Review

    authors: Lombard J

    update_date:2014-12-19 00:00:00

  • The fundamental units, processes and patterns of evolution, and the tree of life conundrum.

    abstract:BACKGROUND:The elucidation of the dominant role of horizontal gene transfer (HGT) in the evolution of prokaryotes led to a severe crisis of the Tree of Life (TOL) concept and intense debates on this subject. CONCEPT:Prompted by the crisis of the TOL, we attempt to define the primary units and the fundamental patterns ...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Koonin EV,Wolf YI

    update_date:2009-09-29 00:00:00

  • Infinitely long branches and an informal test of common ancestry.

    abstract:BACKGROUND:The evidence for universal common ancestry (UCA) is vast and persuasive. A phylogenetic test has been proposed for quantifying its odds against independently originated sequences based on the comparison between one versus several trees. This test was successfully applied to a well-supported homologous sequen...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: de Oliveira Martins L,Posada D

    update_date:2016-04-07 00:00:00

  • The manoeuvrability hypothesis to explain the maintenance of bilateral symmetry in animal evolution.

    abstract:BACKGROUND:The overwhelming majority of animal species exhibit bilateral symmetry. However, the precise evolutionary importance of bilateral symmetry is unknown, although elements of the understanding of the phenomenon have been present within the scientific community for decades. PRESENTATION OF THE HYPOTHESIS:Here w...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Holló G,Novák M

    update_date:2012-07-12 00:00:00

  • Activating and inhibiting connections in biological network dynamics.

    abstract:BACKGROUND:Many studies of biochemical networks have analyzed network topology. Such work has suggested that specific types of network wiring may increase network robustness and therefore confer a selective advantage. However, knowledge of network topology does not allow one to predict network dynamical behavior--for e...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: McDonald D,Waterbury L,Knight R,Betterton MD

    update_date:2008-12-04 00:00:00

  • Prediction and mechanistic analysis of drug-induced liver injury (DILI) based on chemical structure.

    abstract:BACKGROUND:Drug-induced liver injury (DILI) is a major safety concern characterized by a complex and diverse pathogenesis. In order to identify DILI early in drug development, a better understanding of the injury and models with better predictivity are urgently needed. One approach in this regard are in silico models w...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Liu A,Walter M,Wright P,Bartosik A,Dolciami D,Elbasir A,Yang H,Bender A

    update_date:2021-01-18 00:00:00

  • The mechanistic and evolutionary aspects of the 2'- and 3'-OH paradigm in biosynthetic machinery.

    abstract:BACKGROUND:The translation machinery underlies a multitude of biological processes within the cell. The design and implementation of the modern translation apparatus on even the simplest course of action is extremely complex, and involves different RNA and protein factors. According to the "RNA world" idea, the critica...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Safro M,Klipcan L

    update_date:2013-07-08 00:00:00

  • Why eukaryotic cells use introns to enhance gene expression: splicing reduces transcription-associated mutagenesis by inhibiting topoisomerase I cutting activity.

    abstract:BACKGROUND:The costs and benefits of spliceosomal introns in eukaryotes have not been established. One recognized effect of intron splicing is its known enhancement of gene expression. However, the mechanism regulating such splicing-mediated expression enhancement has not been defined. Previous studies have shown that ...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Niu DK,Yang YF

    update_date:2011-05-18 00:00:00

  • Unraveling the biochemistry and provenance of pupylation: a prokaryotic analog of ubiquitination.

    abstract:UNLABELLED:Recently Mycobacterium tuberculosis was shown to possess a novel protein modification, in which a small protein Pup is conjugated to the epsilon-amino groups of lysines in target proteins. Analogous to ubiquitin modification in eukaryotes, this remarkable modification recruits proteins for degradation via ar...

    journal_title:Biology direct

    pub_type: Journal Article

    authors: Iyer LM,Burroughs AM,Aravind L

    update_date:2008-11-03 00:00:00