Predictability of drug-induced liver injury by machine learning.


BACKGROUND: Drug-induced liver injury (DILI) is a major concern in drug development, as hepatotoxicity may not be apparent at early stages but can lead to life-threatening consequences. The ability to predict DILI from in vitro data would be a crucial advantage. In 2018, the Critical Assessment of Massive Data Analysis (CAMDA) group proposed the CMap Drug Safety challenge, focusing on DILI prediction.

METHODS AND RESULTS: The challenge data included Affymetrix GeneChip expression profiles for the two cancer cell lines MCF7 and PC3 treated with 276 drug compounds and empty vehicles. Binary DILI labeling and a recommended train/test split for the development of predictive classification approaches were also provided. We devised three deep learning architectures for DILI prediction on the challenge data and compared them to random forest and multi-layer perceptron classifiers. On a subset of the data and for some of the models, we additionally tested several strategies for balancing the two DILI classes and for identifying alternative informative train/test splits. All the models were trained with the MAQC data analysis protocol (DAP), i.e., 10×5-fold cross-validation over the training set. In all the experiments, the classification performance in both cross-validation and external validation gave Matthews correlation coefficient (MCC) values below 0.2. We observed minimal differences between the two cell lines. Notably, deep learning approaches did not improve classification performance.

DISCUSSION: We extensively tested multiple machine learning approaches for the DILI classification task, obtaining poor to mediocre performance. The results suggest that the CMap expression data on the two cell lines MCF7 and PC3 are not sufficient for accurate DILI label prediction.

REVIEWERS: This article was reviewed by Maciej Kandula and Paweł P. Labaj.
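The evaluation scheme named in the abstract (repeated cross-validation scored by MCC) can be illustrated with a minimal sketch. This is not the authors' DAP implementation: it assumes binary labels coded 0/1, reads "10x5" as 10 repeats of 5-fold cross-validation, assumes stratified folds, and uses hypothetical function names (`mcc`, `repeated_stratified_folds`) chosen only for this example.

```python
import math
import random
from collections import defaultdict


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention (an assumption here): MCC is 0 when a marginal is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def repeated_stratified_folds(labels, n_repeats=10, n_folds=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated stratified CV."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    for r in range(n_repeats):
        folds = [[] for _ in range(n_folds)]
        # Deal each class round-robin so every fold keeps the class ratio.
        for idx in by_class.values():
            shuffled = idx[:]
            rng.shuffle(shuffled)
            for j, i in enumerate(shuffled):
                folds[j % n_folds].append(i)
        for k in range(n_folds):
            test = sorted(folds[k])
            train = sorted(i for f in folds[:k] + folds[k + 1:] for i in f)
            yield r, k, train, test


if __name__ == "__main__":
    # Toy, made-up labels: 10 negatives and 5 positives.
    labels = [0] * 10 + [1] * 5
    scores = []
    for _, _, train, test in repeated_stratified_folds(labels):
        # Placeholder "model": always predict the training majority class.
        majority = round(sum(labels[i] for i in train) / len(train))
        scores.append(mcc([labels[i] for i in test], [majority] * len(test)))
    print(sum(scores) / len(scores))
```

MCC ranges from -1 to 1, and under the convention used above a constant predictor scores 0, which puts the reported values below 0.2 in context: they are close to chance-level agreement.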


Biol Direct


Biology Direct


Chierici M,Francescatto M,Bussola N,Jurman G,Furlanello C






2020-02-13 00:00:00












  • Evolution before genes.

    abstract:BACKGROUND:Our current understanding of evolution is so tightly linked to template-dependent replication of DNA and RNA molecules that the old idea from Oparin of a self-reproducing 'garbage bag' ('coacervate') of chemicals that predated fully-fledged cell-like entities seems to be farfetched to most scientists today. ...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Vasas V,Fernando C,Santos M,Kauffman S,Szathmáry E

    update_date:2012-01-05 00:00:00

  • On origin of genetic code and tRNA before translation.

    abstract:BACKGROUND:Synthesis of proteins is based on the genetic code - a nearly universal assignment of codons to amino acids (aas). A major challenge to the understanding of the origins of this assignment is the archetypal "key-lock vs. frozen accident" dilemma. Here we re-examine this dilemma in light of 1) the fundamental ...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Rodin AS,Szathmáry E,Rodin SN

    update_date:2011-02-22 00:00:00

  • Impairment of translation in neurons as a putative causative factor for autism.

    abstract:BACKGROUND:A dramatic increase in the prevalence of autism and Autistic Spectrum Disorders (ASD) has been observed over the last two decades in the USA, Europe and Asia. Given the accumulating data on the possible role of translation in the etiology of ASD, we analyzed potential effects of rare synonymous substitutions ass...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Poliakov E,Koonin EV,Rogozin IB

    update_date:2014-07-10 00:00:00

  • The common ancestry of life.

    abstract:BACKGROUND:It is a common belief that all cellular life forms on earth have a common origin. This view is supported by the universality of the genetic code and the universal conservation of multiple genes, particularly those that encode key components of the translation system. A remarkable recent study claims to provide...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Koonin EV,Wolf YI

    update_date:2010-11-18 00:00:00

  • The origins of phagocytosis and eukaryogenesis.

    abstract:BACKGROUND:Phagocytosis, that is, engulfment of large particles by eukaryotic cells, is found in diverse organisms and is often thought to be central to the very origin of the eukaryotic cell, in particular, for the acquisition of bacterial endosymbionts including the ancestor of the mitochondrion. RESULTS:Comparisons...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Yutin N,Wolf MY,Wolf YI,Koonin EV

    update_date:2009-02-26 00:00:00

  • Component retention in principal component analysis with application to cDNA microarray data.

    abstract::Shannon entropy is used to provide an estimate of the number of interpretable components in a principal component analysis. In addition, several ad hoc stopping rules for dimension determination are reviewed and a modification of the broken stick model is presented. The modification incorporates a test for the presenc...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Cangelosi R,Goriely A

    update_date:2007-01-17 00:00:00

  • Description of plant tRNA-derived RNA fragments (tRFs) associated with argonaute and identification of their putative targets.

    abstract::tRNA-derived RNA fragments (tRFs) are 19mer small RNAs that associate with Argonaute (AGO) proteins in humans. However, in plants, it is unknown if tRFs bind with AGO proteins. Here, using public deep sequencing libraries of immunoprecipitated Argonaute proteins (AGO-IP) and bioinformatics approaches, we identified th...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Loss-Morais G,Waterhouse PM,Margis R

    update_date:2013-02-12 00:00:00

  • xHMMER3x2: Utilizing HMMER3's speed and HMMER2's sensitivity and specificity in the glocal alignment mode for improved large-scale protein domain annotation.

    abstract:BACKGROUND:While the local-mode HMMER3 is notable for its massive speed improvement, the slower glocal-mode HMMER2 is more exact for domain annotation by enforcing full domain-to-sequence alignments. Since a unit of domain necessarily implies a unit of function, local-mode HMMER3 alone remains insufficient for precise ...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Yap CK,Eisenhaber B,Eisenhaber F,Wong WC

    update_date:2016-11-29 00:00:00

  • Issues associated with the use of phosphospecific antibodies to localise active and inactive pools of GSK-3 in cells.

    abstract:BACKGROUND:Glycogen synthase kinase-3 (GSK-3) is a ubiquitously expressed serine/threonine (Ser/Thr) kinase comprising two isoforms, GSK-3α and GSK-3β. Both enzymes are similarly inactivated by serine phosphorylation (GSK-3α at Ser21 and GSK-3β at Ser9) and activated by tyrosine phosphorylation (GSK-3α at Tyr279 and GS...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Campa VM,Kypta RM

    update_date:2011-01-24 00:00:00

  • Once upon a time the cell membranes: 175 years of cell boundary research.

    abstract::All modern cells are bounded by cell membranes best described by the fluid mosaic model. This statement is so widely accepted by biologists that little attention is generally given to the theoretical importance of cell membranes in describing the cell. This has not always been the case. When the Cell Theory was first ...

    journal_title:Biology Direct

    pub_type: historical article, journal article, review


    authors: Lombard J

    update_date:2014-12-19 00:00:00

  • Rotational restriction of nascent peptides as an essential element of co-translational protein folding: possible molecular players and structural consequences.

    abstract:BACKGROUND:A basic tenet of protein science is that all information about the spatial structure of proteins is present in their sequences. Nonetheless, many proteins fail to attain native structure upon experimental denaturation and refolding in vitro, raising the question of the specific role of cellular machinery in ...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Sorokina I,Mushegian A

    update_date:2017-05-31 00:00:00

  • Prokaryotic homologs of Argonaute proteins are predicted to function as key components of a novel system of defense against mobile genetic elements.

    abstract:BACKGROUND:In eukaryotes, RNA interference (RNAi) is a major mechanism of defense against viruses and transposable elements as well as of regulating translation of endogenous mRNAs. The RNAi systems recognize the target RNA molecules via small guide RNAs that are completely or partially complementary to a region of the ta...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Makarova KS,Wolf YI,van der Oost J,Koonin EV

    update_date:2009-08-25 00:00:00

  • Diverse bacterial genomes encode an operon of two genes, one of which is an unusual class-I release factor that potentially recognizes atypical mRNA signals other than normal stop codons.

    abstract:BACKGROUND:While all codons that specify amino acids are universally recognized by tRNA molecules, codons signaling termination of translation are recognized by proteins known as class-I release factors (RF). In most eukaryotes and archaea a single RF accomplishes termination at all three stop codons. In most bacteria,...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Baranov PV,Vestergaard B,Hamelryck T,Gesteland RF,Nyborg J,Atkins JF

    update_date:2006-09-13 00:00:00

  • Why call it developmental bias when it is just development?

    abstract::The concept of developmental constraints has been central to understanding the role of development in morphological evolution. Developmental constraints are classically defined as biases imposed by development on the distribution of morphological variation. This opinion article argues that the concepts of developmental co...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Salazar-Ciudad I

    update_date:2021-01-09 00:00:00

  • The fundamental units, processes and patterns of evolution, and the tree of life conundrum.

    abstract:BACKGROUND:The elucidation of the dominant role of horizontal gene transfer (HGT) in the evolution of prokaryotes led to a severe crisis of the Tree of Life (TOL) concept and intense debates on this subject. CONCEPT:Prompted by the crisis of the TOL, we attempt to define the primary units and the fundamental patterns ...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Koonin EV,Wolf YI

    update_date:2009-09-29 00:00:00

  • The progene hypothesis: the nucleoprotein world and how life began.

    abstract::In this article, I review the results of studies on the origin of life distinct from the popular RNA world hypothesis. The alternate scenario postulates the origin of the first bimolecular genetic system (a polynucleotide gene and a polypeptide processive polymerase) with simultaneous replication and translation and i...

    journal_title:Biology Direct

    pub_type: journal article, review


    authors: Altstein AD

    update_date:2015-11-26 00:00:00

  • Origin of the nuclear proteome on the basis of pre-existing nuclear localization signals in prokaryotic proteins.

    abstract:BACKGROUND:The origin of the selective nuclear protein import machinery, which consists of nuclear pore complexes and adaptor molecules interacting with the nuclear localization signals (NLSs) of cargo molecules, is one of the most important events in the evolution of eukaryotic cells. How proteins were selected for im...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Lisitsyna OM,Kurnaeva MA,Arifulin EA,Shubina MY,Musinova YR,Mironov AA,Sheval EV

    update_date:2020-04-28 00:00:00

  • Outer membrane protein genes and their small non-coding RNA regulator genes in Photorhabdus luminescens.

    abstract:INTRODUCTION:Three major outer membrane protein genes of Escherichia coli, ompF, ompC, and ompA respond to stress factors. Transcripts from these genes are regulated by the small non-coding RNAs micF, micC, and micA, respectively. Here we examine Photorhabdus luminescens, an organism that has a different habitat from E...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Papamichail D,Delihas N

    update_date:2006-05-22 00:00:00

  • PEPstrMOD: structure prediction of peptides containing natural, non-natural and modified residues.

    abstract:BACKGROUND:In the past, many methods have been developed for peptide tertiary structure prediction but they are limited to peptides having natural amino acids. This study describes a method PEPstrMOD, which is an updated version of PEPstr, developed specifically for predicting the structure of peptides containing natur...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Singh S,Singh H,Tuknait A,Chaudhary K,Singh B,Kumaran S,Raghava GP

    update_date:2015-12-21 00:00:00

  • Is pre-Darwinian evolution plausible?

    abstract:BACKGROUND:This essay highlights critical aspects of the plausibility of pre-Darwinian evolution. It is based on a critical review of some better-known open, far-from-equilibrium system-based scenarios supposed to explain processes that took place before Darwinian evolution had emerged and that resulted in the origin o...

    journal_title:Biology Direct

    pub_type: journal article, review


    authors: Tessera M

    update_date:2018-09-21 00:00:00

  • The mechanistic and evolutionary aspects of the 2'- and 3'-OH paradigm in biosynthetic machinery.

    abstract:BACKGROUND:The translation machinery underlies a multitude of biological processes within the cell. The design and implementation of the modern translation apparatus on even the simplest course of action is extremely complex, and involves different RNA and protein factors. According to the "RNA world" idea, the critica...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Safro M,Klipcan L

    update_date:2013-07-08 00:00:00

  • Hereditary profiles of disorderly transcription?

    abstract:BACKGROUND:Microscopic examination of living cells often reveals that cells from some cell strains appear to be in a permanent state of disarray without obvious reason. In all probability such a disorderly state affects cell functioning. The aim of this study was to establish whether a disorderly state could occur that...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Simons JW

    update_date:2006-04-02 00:00:00

  • Plant viruses of the Amalgaviridae family evolved via recombination between viruses with double-stranded and negative-strand RNA genomes.

    abstract::Plant viruses of the recently recognized family Amalgaviridae have monopartite double-stranded (ds) RNA genomes and encode two proteins: an RNA-dependent RNA polymerase (RdRp) and a putative capsid protein (CP). Whereas the RdRp of amalgaviruses has been found to be most closely related to the RdRps of dsRNA viruses o...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Krupovic M,Dolja VV,Koonin EV

    update_date:2015-03-29 00:00:00

  • Proteomic changes associated with deletion of the Magnaporthe oryzae conidial morphology-regulating gene COM1.

    abstract:BACKGROUND:The rice blast disease caused by Magnaporthe oryzae is a major constraint on world rice production. The conidia produced by this fungal pathogen are the main source of disease dissemination. The morphology of conidia may be a critical factor in the spore dispersal and virulence of M. oryzae in the field. Del...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Bhadauria V,Wang LX,Peng YL

    update_date:2010-11-02 00:00:00

  • From tumors to species: a SCANDAL hypothesis.

    abstract::Some tumor cells can evolve into transmissible parasites. Notable examples include the Tasmanian devil facial tumor disease, the canine transmissible venereal tumor and transmissible cancers of mollusks. We present a hypothesis that such transmissible tumors existed in the past and that some modern animal taxa are ...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Panchin AY,Aleoshin VV,Panchin YV

    update_date:2019-01-23 00:00:00

  • Global analyses of Chromosome 17 and 18 genes of lung telocytes compared with mesenchymal stem cells, fibroblasts, alveolar type II cells, airway epithelial cells, and lymphocytes.

    abstract:BACKGROUND:Telocytes (TCs) are interstitial cells with extremely long and thin telopodes (Tps) with thin segments (podomers) and dilations (podoms) to interact with neighboring cells. TCs have been found in different organs, while there is still a lack of TC-specific biomarkers to distinguish TCs from the other cells...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Wang J,Ye L,Jin M,Wang X

    update_date:2015-03-11 00:00:00

  • A computational approach to candidate gene prioritization for X-linked mental retardation using annotation-based binary filtering and motif-based linear discriminatory analysis.

    abstract:BACKGROUND:Several computational candidate gene selection and prioritization methods have recently been developed. These in silico selection and prioritization techniques are usually based on two central approaches--the examination of similarities to known disease genes and/or the evaluation of functional annotation of...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Lombard Z,Park C,Makova KD,Ramsay M

    update_date:2011-06-13 00:00:00

  • Optimal treatment and stochastic modeling of heterogeneous tumors.

    abstract::In this work we review past articles that have mathematically studied cancer heterogeneity and the impact of this heterogeneity on the structure of optimal therapy. We look at past works on modeling how heterogeneous tumors respond to radiotherapy, and take a particularly close look at how the optimal radiot...

    journal_title:Biology Direct

    pub_type: journal article, review


    authors: Badri H,Leder K

    update_date:2016-08-23 00:00:00

  • The manoeuvrability hypothesis to explain the maintenance of bilateral symmetry in animal evolution.

    abstract:BACKGROUND:The overwhelming majority of animal species exhibit bilateral symmetry. However, the precise evolutionary importance of bilateral symmetry is unknown, although elements of the understanding of the phenomenon have been present within the scientific community for decades. PRESENTATION OF THE HYPOTHESIS:Here w...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Holló G,Novák M

    update_date:2012-07-12 00:00:00

  • Evidence-based gene models for structural and functional annotations of the oil palm genome.

    abstract:BACKGROUND:Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years), led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes...

    journal_title:Biology Direct

    pub_type: journal article


    authors: Chan KL,Tatarinova TV,Rosli R,Amiruddin N,Azizi N,Halim MAA,Sanusi NSNM,Jayanthi N,Ponomarenko P,Triska M,Solovyev V,Firdaus-Raih M,Sambanthamurthi R,Murphy D,Low EL

    update_date:2017-09-08 00:00:00