Abstract:
MOTIVATION:With the recent advances in DNA sequencing technologies, the study of the genetic composition of living organisms has become more accessible for researchers. Several advances have been achieved because of it, especially in the health sciences. However, many challenges which emerge from the complexity of sequencing projects remain unsolved. Among them is the task of assembling DNA fragments from previously unsequenced organisms, which is classified as an NP-hard (nondeterministic polynomial time hard) problem, for which no efficient computational solution with reasonable execution time exists. However, several tools that produce approximate solutions have been used with results that have facilitated scientific discoveries, although there is ample room for improvement. As with other NP-hard problems, machine learning algorithms have been one of the approaches used in recent years in an attempt to find better solutions to the DNA fragment assembly problem, although still at a low scale. RESULTS:This paper presents a broad review of pioneering literature comprising artificial intelligence-based DNA assemblers-particularly the ones that use machine learning-to provide an overview of state-of-the-art approaches and to serve as a starting point for further study in this field.
journal_name
Brief Bioinformjournal_title
Briefings in bioinformaticsauthors
Padovani de Souza K,Setubal JC,Ponce de Leon F de Carvalho AC,Oliveira G,Chateau A,Alves Rdoi
10.1093/bib/bby072subject
Has Abstractpub_date
2019-11-27 00:00:00pages
2116-2129issue
6eissn
1467-5463issn
1477-4054pii
5074612journal_volume
20pub_type
杂志文章,评审abstract::It is easy for today's students and researchers to believe that modern bioinformatics emerged recently to assist next-generation sequencing data analysis. However, the very beginnings of bioinformatics occurred more than 50 years ago, when desktop computers were still a hypothesis and DNA could not yet be sequenced. T...
journal_title:Briefings in bioinformatics
pub_type: 历史文章,杂志文章,评审
doi:10.1093/bib/bby063
更新日期:2019-11-27 00:00:00
abstract::Fibrosis is a key component in the pathogenic mechanism of a variety of diseases. These diseases involving fibrosis may share common mechanisms and therapeutic targets, and therefore common intervention strategies and medicines may be applicable for these diseases. For this reason, deliberately introducing anti-fibros...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa115
更新日期:2020-06-22 00:00:00
abstract::Dissecting the genetic mechanism underlying a complex disease hinges on discovering gene-environment interactions (GXE). However, detecting GXE is a challenging problem especially when the genetic variants under study are rare. Haplotype-based tests have several advantages over the so-called collapsing tests for detec...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbz031
更新日期:2020-05-21 00:00:00
abstract::With the availability of gene expression data by RNA-seq, powerful statistical approaches for grouping similar gene expression profiles across different environments have become increasingly important. We describe and assess a computational model for clustering genes into distinct groups based on the pattern of gene e...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbt029
更新日期:2014-07-01 00:00:00
abstract::Since the completion of the Human Genome Project, it has been widely established that most DNA is not transcribed into proteins. These non-protein-coding regions are believed to be moderators within transcriptional and post-transcriptional processes, which play key roles in the onset of diseases. Long non-coding RNAs ...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章,评审
doi:10.1093/bib/bbv114
更新日期:2017-01-01 00:00:00
abstract::Next generation sequencers have greatly improved our ability to mine polymorphisms and mutations out of entire (or portions of) genomes. The reliability of their outputs, though, showed to be very related to the sequencing chemistry and to deeply affect the quality of the downstream analyses. We focus here on the two-...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbs048
更新日期:2013-11-01 00:00:00
abstract::Maintaining the consistency of genomic annotations is an increasingly complex task because of the iterative and dynamic nature of assembly and annotation, growing numbers of biological databases and insufficient integration of annotations across databases. As information exchange among databases is poor, a 'novel' seq...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbw017
更新日期:2017-03-01 00:00:00
abstract::Mathematical models can serve as a tool to formalize biological knowledge from diverse sources, to investigate biological questions in a formal way, to test experimental hypotheses, to predict the effect of perturbations and to identify underlying mechanisms. We present a pipeline of computational tools that performs ...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbx163
更新日期:2019-07-19 00:00:00
abstract::CyanoPATH is a database that curates and analyzes the common genomic functional repertoire for cyanobacteria harmful algal blooms (CyanoHABs) in eutrophic waters. Based on the literature of empirical studies and genome/protein databases, it summarizes four types of information: common biological functions (pathways) d...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa375
更新日期:2020-12-16 00:00:00
abstract::The use of microarray technology to measure gene expression on a genome-wide scale has been well established for more than a decade. Methods to process and analyse the vast quantity of expression data generated by a typical microarray experiment are similarly well-established. The Affymetrix Exon 1.0 ST array is a rel...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbq086
更新日期:2011-11-01 00:00:00
abstract::Information Integrator is an extension to IBM's relational database DB2, which uses data federation to provide benefits to molecular biology researchers through two unique capabilities: increased flexibility in combining data from disparate sources, and SQL access to non-SQL data, easing the task of automating data an...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/4.4.375
更新日期:2003-12-01 00:00:00
abstract::Verification in phylogenetics represents an extremely difficult subject. Phylogenetic analysis deals with the reconstruction of evolutionary histories of species, and as long as mankind is not able to travel in time, it will not be possible to verify deep evolutionary histories reconstructed with modern computational ...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbq079
更新日期:2011-05-01 00:00:00
abstract::The prevalence of dropout events is a serious problem for single-cell Hi-C (scHiC) data due to insufficient sequencing depth and data coverage, which brings difficulties in downstream studies such as clustering and structural analysis. Complicating things further is the fact that dropouts are confounded with structura...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa289
更新日期:2020-11-17 00:00:00
abstract::Rapidly evolving sequencing technologies produce data on an unparalleled scale. A central challenge to the analysis of this data is sequence alignment, whereby sequence reads must be compared to a reference. A wide variety of alignment algorithms and software have been subsequently developed over the past two years. I...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章,评审
doi:10.1093/bib/bbq015
更新日期:2010-09-01 00:00:00
abstract::Functional annotation of protein sequence with high accuracy has become one of the most important issues in modern biomedical studies, and computational approaches of significantly accelerated analysis process and enhanced accuracy are greatly desired. Although a variety of methods have been developed to elevate prote...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbz081
更新日期:2020-07-15 00:00:00
abstract::In order to reach precision medicine and improve patients' quality of life, machine learning is increasingly used in medicine. Brain disorders are often complex and heterogeneous, and several modalities such as demographic, clinical, imaging, genetics and environmental data have been studied to improve their understan...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa310
更新日期:2020-12-15 00:00:00
abstract::This Briefing reviews the widely used, currently active, up-to-date databases derived from the worldwide Protein Data Bank (PDB) to facilitate browsing, finding and exploring its entries. These databases contain visualization and analysis tools tailored to specific kinds of molecules and interactions, often including ...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbw049
更新日期:2017-07-01 00:00:00
abstract::Data from the SEER reports reveal that the occurrence rate of a cancer type generally follows a unimodal distribution over age, peaking at an age that is cancer-type specific and ranges from 30+ through 70+. Previous studies attribute such bell-shaped distributions to the reduced proliferative potential in senior year...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa349
更新日期:2020-12-30 00:00:00
abstract::The presence of different transcripts of a gene across samples can be analysed by whole-transcriptome microarrays. Reproducing results from published microarray data represents a challenge owing to the vast amounts of data and the large variety of preprocessing and filtering steps used before the actual analysis is ca...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbt011
更新日期:2014-07-01 00:00:00
abstract::The increasing ease with which massive genetic information can be obtained from patients or healthy individuals has stimulated the development of interpretive bioinformatics tools as aids in clinical practice. Most such tools analyze evolutionary information and simple physical-chemical properties to predict whether r...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbz146
更新日期:2021-01-18 00:00:00
abstract::Relative changes in mRNA as well as protein levels induced by sublethal doses of antibiotics on bacteria are measured and results visualised in the context of metabolic pathway diagrams. The mRNA levels present at a given time point after the addition of the antibiotic are measured using microarrays from Affymetrix. A...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/1.3.297
更新日期:2000-09-01 00:00:00
abstract::The combination of DNA bisulfite treatment with high-throughput sequencing technologies has enabled investigation of genome-wide DNA methylation beyond CpG sites and CpG islands. These technologies have opened new avenues to understand the interplay between epigenetic events, chromatin plasticity and gene regulation. ...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章,评审
doi:10.1093/bib/bbu016
更新日期:2015-05-01 00:00:00
abstract::RNA editing is a widespread co/posttranscriptional mechanism affecting primary RNAs by specific nucleotide modifications, which plays relevant roles in molecular processes including regulation of gene expression and/or the processing of noncoding RNAs. In recent years, the detection of editing sites has been improved ...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章,评审
doi:10.1093/bib/bbx129
更新日期:2019-03-22 00:00:00
abstract::The explosion in genomic sequence available in public databases has resulted in an unprecedented opportunity for computational whole genome analyses. A number of promising comparative-based approaches have been developed for gene finding, regulatory element discovery and other purposes, and it is clear that these tool...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/3.1.18
更新日期:2002-03-01 00:00:00
abstract::Gene expression data have played an essential role in many biomedical studies. When the number of genes is large and sample size is limited, there is a 'lack of information' problem, leading to low-quality findings. To tackle this problem, both horizontal and vertical data integrations have been developed, where verti...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa169
更新日期:2020-08-14 00:00:00
abstract:BACKGROUND:Whole genome sequencing (WGS) is increasingly used for Mycobacterium tuberculosis (Mtb) research. Countries with the highest tuberculosis (TB) burden face important challenges to integrate WGS into surveillance and research. METHODS:We assessed the global status of Mtb WGS and developed a 3-week training co...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa246
更新日期:2020-10-03 00:00:00
abstract::Phase separation is an important mechanism that mediates the spatial distribution of proteins in different cellular compartments. While phase-separated proteins share certain sequence characteristics, including intrinsically disordered regions (IDRs) and prion-like domains, such characteristics are insufficient for ma...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa187
更新日期:2020-09-02 00:00:00
abstract::Protein dynamics is central to all biological processes, including signal transduction, cellular regulation and biological catalysis. Among them, in-depth exploration of ligand-driven protein dynamics contributes to an optimal understanding of protein function, which is particularly relevant to drug discovery. Hence, ...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbz141
更新日期:2020-12-01 00:00:00
abstract::Cell lines are widely used as in vitro models of tumorigenesis. However, an increasing number of researchers have found that cell lines differ from their sourced tumour samples after long-term cell culture. The application of unsuitable cell lines in experiments will affect the experimental accuracy and the treatment ...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章,评审
doi:10.1093/bib/bbw082
更新日期:2017-05-01 00:00:00
abstract::In drug development, preclinical safety and pharmacokinetics assessments of candidate drugs to ensure the safety profile are a must. While in vivo and in vitro tests are traditionally used, experimental determinations have disadvantages, as they are usually time-consuming and costly. In silico predictions of these pre...
journal_title:Briefings in bioinformatics
pub_type: 杂志文章
doi:10.1093/bib/bbaa160
更新日期:2020-08-07 00:00:00