Abstract:
:Atomic charges play a very important role in drug-target recognition. However, computation of atomic charges with high-level quantum mechanics (QM) calculations is very time-consuming. A number of machine learning (ML)-based atomic charge prediction methods have been proposed to speed up the calculation of high-accuracy atomic charges in recent years. However, most of them used a set of predefined molecular properties, such as molecular fingerprints, for model construction, which is knowledge-dependent and may lead to biased predictions due to the representation preference of different molecular properties used for training. To solve the problem, we present a new architecture based on graph convolutional network (GCN) and develop a high-accuracy atomic charge prediction model named DeepAtomicCharge. The new GCN architecture is designed with only the atomic properties and the connection information between the atoms in molecules and can dynamically learn and convert molecules into appropriate atomic features without any prior knowledge of the molecules. Using the designed GCN architecture, substantial improvement is achieved for the prediction accuracy of atomic charges. The average root-mean-square error (RMSE) of DeepAtomicCharge is 0.0121 e, which is obviously more accurate than that (0.0180 e) reported by the previous benchmark study on the same two external test sets. Moreover, the new GCN architecture needs much lower storage space compared with other methods, and the predicted DDEC atomic charges can be efficiently used in large-scale structure-based drug design, thus opening a new avenue for high-performance atomic charge prediction and application.
journal_name
Brief Bioinformjournal_title
Briefings in bioinformaticsauthors
Wang J,Cao D,Tang C,Xu L,He Q,Yang B,Chen X,Sun H,Hou Tdoi
10.1093/bib/bbaa183subject
Has Abstractpub_date
2020-08-25 00:00:00eissn
1467-5463issn
1477-4054pii
5896581pub_type
杂志文章abstract::If the completion of the first draft of the human genome represents the coming of age of bioinformatics, then the emergence of bioinformatics as a university degree subject represents its establishment. In this paper bioinformatics as a subject for formal study is discussed, rather than as a subject for research, and ...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章,评审
doi:10.1093/bib/4.1.7
更新日期:2003-03-01 00:00:00
abstract::Glycosylation of proteins is involved in immune defense, cell-cell adhesion, cellular recognition and pathogen binding and is one of the most common and complex post-translational modifications. Science is still struggling to assign detailed mechanisms and functions to this form of conjugation. Even the structural ana...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章,评审
doi:10.1093/bib/bbs045
更新日期:2013-05-01 00:00:00
abstract::The iteratively reweighted least square (IRLS) method is mostly identical to maximum likelihood (ML) method in terms of parameter estimation and power of quantitative trait locus (QTL) detection. But the IRLS is greatly superior to ML in terms of computing speed and the robustness of parameter estimation. In conjuncti...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbs062
更新日期:2014-01-01 00:00:00
abstract:BACKGROUND:Whole genome sequencing (WGS) is increasingly used for Mycobacterium tuberculosis (Mtb) research. Countries with the highest tuberculosis (TB) burden face important challenges to integrate WGS into surveillance and research. METHODS:We assessed the global status of Mtb WGS and developed a 3-week training co...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa246
更新日期:2020-10-03 00:00:00
abstract::Competitive endogenous RNA (ceRNA) represents a novel layer of gene regulation that controls both physiological and pathological processes. However, there is still lack of computational tools for quickly identifying ceRNA regulation. To address this problem, we presented an R-package, CeRNASeek, which allows identifyi...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa048
更新日期:2020-05-04 00:00:00
abstract::Rapidly evolving sequencing technologies produce data on an unparalleled scale. A central challenge to the analysis of this data is sequence alignment, whereby sequence reads must be compared to a reference. A wide variety of alignment algorithms and software have been subsequently developed over the past two years. I...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章,评审
doi:10.1093/bib/bbq015
更新日期:2010-09-01 00:00:00
abstract::Mediation analysis has been a useful tool for investigating the effect of mediators that lie in the path from the independent variable to the outcome. With the increasing dimensionality of mediators such as in (epi)genomics studies, high-dimensional mediation model is needed. In this work, we focus on epigenetic studi...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa113
更新日期:2020-07-01 00:00:00
abstract::Most of current gene expression signatures for cancer prognosis are based on risk scores, usually calculated as some summaries of expression levels of the signature genes, whose applications require presetting risk score thresholds and data normalization. In this study, we demonstrate the critical limitations of such ...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbv064
更新日期:2016-03-01 00:00:00
abstract::With the increasing recognition of its role in trait and disease development, it is crucial to account for genetic imprinting to illustrate the genetic architecture of complex traits. Genetic mapping can be innovated to test and estimate effects of genetic imprinting in a segregating population derived from experiment...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbu019
更新日期:2015-05-01 00:00:00
abstract::In order to reach precision medicine and improve patients' quality of life, machine learning is increasingly used in medicine. Brain disorders are often complex and heterogeneous, and several modalities such as demographic, clinical, imaging, genetics and environmental data have been studied to improve their understan...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa310
更新日期:2020-12-15 00:00:00
abstract::Numerous studies have shown that copy number variation (CNV) in lncRNA regions play critical roles in the initiation and progression of cancer. However, our knowledge about their functionalities is still limited. Here, we firstly provided a computational method to identify lncRNAs with copy number variation (lncRNAs-C...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbz113
更新日期:2020-12-01 00:00:00
abstract::DNA-binding hot spot residues of proteins are dominant and fundamental interface residues that contribute most of the binding free energy of protein-DNA interfaces. As experimental methods for identifying hot spots are expensive and time consuming, computational approaches are urgently required in predicting hot spots...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbz037
更新日期:2020-05-21 00:00:00
abstract::RNA-sequencing (RNA-seq) has rapidly become a popular tool to characterize transcriptomes. A fundamental research problem in many RNA-seq studies is the identification of reliable molecular markers that show differential expression between distinct sample groups. Together with the growing popularity of RNA-seq, a numb...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbt086
更新日期:2015-01-01 00:00:00
abstract::Integrative analyses of genomic, epigenomic and transcriptomic features for human and various model organisms have revealed that many such features are nonrandomly distributed in the genome. Significant enrichment (or depletion) of genomic features is anticipated to be biologically important. Detection of genomic regi...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbt053
更新日期:2014-11-01 00:00:00
abstract::Systems pharmacology is an emerging field that integrates systems biology and pharmacology to advance the process of drug discovery, development and the understanding of therapeutic mechanisms. The aim of the present work is to highlight the role that the systems pharmacology plays across the traditional herbal medici...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbt035
更新日期:2014-09-01 00:00:00
abstract::The products of multi-exon genes are a mixture of alternatively spliced isoforms, from which the translated proteins can have similar, different or even opposing functions. It is therefore essential to differentiate and annotate functions for individual isoforms. Computational approaches provide an efficient complemen...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbv109
更新日期:2016-11-01 00:00:00
abstract::Since the 1st discovery of transcriptional enhancers in 1981, their textbook definition has remained largely unchanged in the past 37 years. With the emergence of high-throughput assays and genome editing, which are switching the paradigm from bottom-up discovery and testing of individual enhancers to top-down profili...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbz030
更新日期:2020-05-21 00:00:00
abstract::Systematic sequencing of cancer genomes has revealed prevalent heterogeneity, with patients harboring various combinatorial patterns of genetic alteration. In particular, a phenomenon that a group of genes exhibits mutually exclusive patterns has been widespread across cancers, covering a broad spectrum of crucial can...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbx109
更新日期:2019-01-18 00:00:00
abstract::RNA-seq has been an increasingly popular high-throughput platform to identify differentially expressed (DE) genes, which is much more reproducible and accurate than the previous microarray technology. Yet, a number of statistical issues remain to be resolved in data analysis, largely due to the high-throughput data vo...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章,评审
doi:10.1093/bib/bbv035
更新日期:2016-03-01 00:00:00
abstract::Gene set analysis (GSA) is one of the methods of choice for analyzing the results of current omics studies; however, it has been mainly developed to analyze mRNA (microarray, RNA-Seq) data. The following review includes an update regarding general methods and resources for GSA and then emphasizes GSA methods and tools...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbz090
更新日期:2020-09-25 00:00:00
abstract::Prognostic tests using expression profiles of several dozen genes help provide treatment choices for prostate cancer (PCa). However, these tests require improvement to meet the clinical need for resolving overtreatment, which continues to be a pervasive problem in PCa management. Genomic selection (GS) methodology, wh...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa197
更新日期:2020-09-08 00:00:00
abstract::Precision and personalized medicine will be increasingly based on the integration of various type of information, particularly electronic health records and genome sequences. The availability of cheap genome sequencing services and the information interoperability will increase the role of online bioinformatics analys...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbw078
更新日期:2017-11-01 00:00:00
abstract::Computational detection methods have been widely used in studies on the biogenesis and the function of circular RNAs (circRNAs). However, all of the existing tools showed disadvantages on certain aspects of circRNA detection. Here, we propose an improved multithreading detection tool, CIRI2, which used an adapted maxi...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbx014
更新日期:2018-09-28 00:00:00
abstract::One of the most complex and computationally intensive tasks of genome sequence analysis is genome assembly. Even today, few centres have the resources, in both software and hardware, to assemble a genome from the thousands or millions of individual sequences generated in a whole-genome shotgun sequencing project. With...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/5.3.237
更新日期:2004-09-01 00:00:00
abstract::Posttranscriptional crosstalk and communication between RNAs yield large regulatory competing endogenous RNA (ceRNA) networks via shared microRNAs (miRNAs), as well as miRNA synergistic networks. The ceRNA crosstalk represents a novel layer of gene regulation that controls both physiological and pathological processes...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbx137
更新日期:2019-07-19 00:00:00
abstract::Technical advances such as the development of molecular cloning, Sanger sequencing, PCR and oligonucleotide microarrays are key to our current capacity to sequence, annotate and study complete organismal genomes. Recent years have seen the development of a variety of so-called 'next-generation' sequencing platforms, w...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章,评审
doi:10.1093/bib/bbp046
更新日期:2010-03-01 00:00:00
abstract::The number of bioinformatics tools and resources that support molecular and cell biology approaches is continuously expanding. Moreover, systems and network biology analyses are accompanied more and more by integrated bioinformatics methods. Traditional information-centered university teaching methods often fail, as (...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbt024
更新日期:2013-09-01 00:00:00
abstract::Deoxyribonucleic acid replication is one of the most crucial tasks taking place in the cell, and it has to be precisely regulated. This process is initiated in the replication origins (ORIs), and thus it is essential to identify such sites for a deeper understanding of the cellular processes and functions related to t...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa304
更新日期:2020-11-25 00:00:00
abstract:MOTIVATION:Long noncoding RNAs (lncRNAs) correspond to a eukaryotic noncoding RNA class that gained great attention in the past years as a higher layer of regulation for gene expression in cells. There is, however, a lack of specific computational approaches to reliably predict lncRNA in plants, which contrast the vari...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章,评审
doi:10.1093/bib/bby034
更新日期:2019-03-25 00:00:00
abstract::Next generation sequencers have greatly improved our ability to mine polymorphisms and mutations out of entire (or portions of) genomes. The reliability of their outputs, though, showed to be very related to the sequencing chemistry and to deeply affect the quality of the downstream analyses. We focus here on the two-...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbs048
更新日期:2013-11-01 00:00:00