Characteristics and evolution of the ecosystem of software tools supporting research in molecular biology.

Abstract:

:Daily work in molecular biology presently depends on a large number of computational tools. An in-depth, large-scale study of that 'ecosystem' of Web tools, its characteristics, interconnectivity, patterns of usage/citation, temporal evolution and rate of decay is crucial for understanding the forces that shape it and for informing initiatives aimed at its funding, long-term maintenance and improvement. In particular, the long-term maintenance of these tools is compromised because of their specific development model. Hundreds of published studies become irreproducible de facto, as the software tools used to conduct them become unavailable. In this study, we present a large-scale survey of >5400 publications describing Web servers within the two main bibliographic resources for disseminating new software developments in molecular biology. For all these servers, we studied their citation patterns, the subjects they address, their citation networks and the temporal evolution of these factors. We also analysed how these factors affect the availability of these servers (whether they are alive). Our results show that this ecosystem of tools is highly interconnected and adapts to the 'trendy' subjects in every moment. The servers present characteristic temporal patterns of citation/usage, and there is a worrying rate of server 'death', which is influenced by factors such as the server popularity and the institutions that hosts it. These results can inform initiatives aimed at the long-term maintenance of these resources.

journal_name

Brief Bioinform

authors

Pazos F,Chagoyen M

doi

10.1093/bib/bby001

subject

Has Abstract

pub_date

2019-07-19 00:00:00

pages

1329-1336

issue

4

eissn

1467-5463

issn

1477-4054

pii

4812639

journal_volume

20

pub_type

杂志文章
  • Bioinformatics approaches for genomics and post genomics applications of next-generation sequencing.

    abstract::Technical advances such as the development of molecular cloning, Sanger sequencing, PCR and oligonucleotide microarrays are key to our current capacity to sequence, annotate and study complete organismal genomes. Recent years have seen the development of a variety of so-called 'next-generation' sequencing platforms, w...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章,评审

    doi:10.1093/bib/bbp046

    authors: Horner DS,Pavesi G,Castrignanò T,De Meo PD,Liuni S,Sammeth M,Picardi E,Pesole G

    更新日期:2010-03-01 00:00:00

  • Irinotecan and vandetanib create synergies for treatment of pancreatic cancer patients with concomitant TP53 and KRAS mutations.

    abstract:BACKGROUND:The most frequently mutated gene pairs in pancreatic adenocarcinoma (PAAD) are KRAS and TP53, and our goal is to illustrate the multiomics and molecular dynamics landscapes of KRAS/TP53 mutation and also to obtain prospective novel drugs for KRAS- and TP53-mutated PAAD patients. Moreover, we also made an att...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbaa149

    authors: Kaushik AC,Wang YJ,Wang X,Wei DQ

    更新日期:2020-07-31 00:00:00

  • Protein structure prediction in genomics.

    abstract::As the number of completely sequenced genomes rapidly increases, including now the complete Human Genome sequence, the post-genomic problems of genome-scale protein structure determination and the issue of gene function identification become ever more pressing. In fact, these problems can be seen as interrelated in th...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章,评审

    doi:10.1093/bib/2.2.111

    authors: Jones DT

    更新日期:2001-05-01 00:00:00

  • Machine learning meets genome assembly.

    abstract:MOTIVATION:With the recent advances in DNA sequencing technologies, the study of the genetic composition of living organisms has become more accessible for researchers. Several advances have been achieved because of it, especially in the health sciences. However, many challenges which emerge from the complexity of sequ...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章,评审

    doi:10.1093/bib/bby072

    authors: Padovani de Souza K,Setubal JC,Ponce de Leon F de Carvalho AC,Oliveira G,Chateau A,Alves R

    更新日期:2019-11-27 00:00:00

  • Extended application of genomic selection to screen multiomics data for prognostic signatures of prostate cancer.

    abstract::Prognostic tests using expression profiles of several dozen genes help provide treatment choices for prostate cancer (PCa). However, these tests require improvement to meet the clinical need for resolving overtreatment, which continues to be a pervasive problem in PCa management. Genomic selection (GS) methodology, wh...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbaa197

    authors: Li R,Wang S,Cui Y,Qu H,Chater JM,Zhang L,Wei J,Wang M,Xu Y,Yu L,Lu J,Feng Y,Zhou R,Huang Y,Ma R,Zhu J,Zhong W,Jia Z

    更新日期:2020-09-08 00:00:00

  • Iteratively reweighted LASSO for mapping multiple quantitative trait loci.

    abstract::The iteratively reweighted least square (IRLS) method is mostly identical to maximum likelihood (ML) method in terms of parameter estimation and power of quantitative trait locus (QTL) detection. But the IRLS is greatly superior to ML in terms of computing speed and the robustness of parameter estimation. In conjuncti...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbs062

    authors: Liu Y,Yang T,Li H,Yang R

    更新日期:2014-01-01 00:00:00

  • Multilevel heterogeneous omics data integration with kernel fusion.

    abstract::High-throughput omics data are generated almost with no limit nowadays. It becomes increasingly important to integrate different omics data types to disentangle the molecular machinery of complex diseases with the hope for better disease prevention and treatment. Since the relationship among different omics data featu...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bby115

    authors: Yang H,Cao H,He T,Wang T,Cui Y

    更新日期:2018-11-29 00:00:00

  • The virtual cell--a candidate co-ordinator for 'middle-out' modelling of biological systems.

    abstract::Understanding the functioning of biological systems depends on tackling complexity spanning spatial scales from genome to organ to whole organism. The basic unit of life, the cell, acts to co-ordinate information received across these scales and processes the myriad of signals to produce an integrated cellular respons...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章,评审

    doi:10.1093/bib/bbp010

    authors: Walker DC,Southgate J

    更新日期:2009-07-01 00:00:00

  • Using prior knowledge from cellular pathways and molecular networks for diagnostic specimen classification.

    abstract::For many complex diseases, an earlier and more reliable diagnosis is considered a key prerequisite for developing more effective therapies to prevent or delay disease progression. Classical statistical learning approaches for specimen classification using omics data, however, often cannot provide diagnostic models wit...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbv044

    authors: Glaab E

    更新日期:2016-05-01 00:00:00

  • Cloud 3D-QSAR: a web tool for the development of quantitative structure-activity relationship models in drug discovery.

    abstract::Effective drug discovery contributes to the treatment of numerous diseases but is limited by high costs and long cycles. The Quantitative Structure-Activity Relationship (QSAR) method was introduced to evaluate the activity of a large number of compounds virtually, reducing the time and labor costs required for chemic...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbaa276

    authors: Wang YL,Wang F,Shi XX,Jia CY,Wu FX,Hao GF,Yang GF

    更新日期:2020-11-03 00:00:00

  • TRCirc: a resource for transcriptional regulation information of circRNAs.

    abstract::In recent years, high-throughput genomic technologies like chromatin immunoprecipitation sequencing (ChIp-seq) and transcriptome sequencing (RNA-seq) have been becoming both more refined and less expensive, making them more accessible. Many circular RNAs (circRNAs) that originate from back-spliced exons have been iden...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bby083

    authors: Tang Z,Li X,Zhao J,Qian F,Feng C,Li Y,Zhang J,Jiang Y,Yang Y,Wang Q,Li C

    更新日期:2019-11-27 00:00:00

  • Comparing enrichment analysis and machine learning for identifying gene properties that discriminate between gene classes.

    abstract::Biologists very often use enrichment methods based on statistical hypothesis tests to identify gene properties that are significantly over-represented in a given set of genes of interest, by comparison with a 'background' set of genes. These enrichment methods, although based on rigorous statistical foundations, are n...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbz028

    authors: Fabris F,Palmer D,de Magalhães JP,Freitas AA

    更新日期:2020-05-21 00:00:00

  • Bioinformatic analysis of SMN1-ACE/ACE2 interactions hinted at a potential protective effect of spinal muscular atrophy against COVID-19-induced lung injury.

    abstract::Patients with spinal muscular atrophy (SMA) are susceptible to the respiratory infections and might be at a heightened risk of poor clinical outcomes upon contracting coronavirus disease 2019 (COVID-19). In the face of the COVID-19 pandemic, the potential associations of SMA with the susceptibility to and prognosticat...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbaa285

    authors: Li Z,Li X,Shen J,Tan H,Rong T,Lin Y,Feng E,Chen Z,Jiao Y,Liu G,Zhang L,Vai Chan MT,Kei Wu WK

    更新日期:2020-11-14 00:00:00

  • Understanding the unimodal distributions of cancer occurrence rates: it takes two factors for a cancer to occur.

    abstract::Data from the SEER reports reveal that the occurrence rate of a cancer type generally follows a unimodal distribution over age, peaking at an age that is cancer-type specific and ranges from 30+ through 70+. Previous studies attribute such bell-shaped distributions to the reduced proliferative potential in senior year...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbaa349

    authors: Qiu S,An Z,Tan R,He PA,Jing J,Li H,Wu S,Xu Y

    更新日期:2020-12-30 00:00:00

  • Accounting for differential variability in detecting differentially methylated regions.

    abstract::DNA methylation plays an essential role in cancer. Differential variability (DV) in cancer was recently observed that contributes to cancer heterogeneity and has been shown to be crucial in detecting epigenetic field defects, DNA methylation alterations happening early in carcinogenesis. As neighboring CpG sites are h...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbx097

    authors: Wang Y,Teschendorff AE,Widschwendter M,Wang S

    更新日期:2019-01-18 00:00:00

  • Architecture for interoperable software in biology.

    abstract::Understanding biological complexity demands a combination of high-throughput data and interdisciplinary skills. One way to bring to bear the necessary combination of data types and expertise is by encapsulating domain knowledge in software and composing that software to create a customized data analysis environment. T...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbs074

    authors: Bare JC,Baliga NS

    更新日期:2014-07-01 00:00:00

  • Methodological aspects of whole-genome bisulfite sequencing analysis.

    abstract::The combination of DNA bisulfite treatment with high-throughput sequencing technologies has enabled investigation of genome-wide DNA methylation beyond CpG sites and CpG islands. These technologies have opened new avenues to understand the interplay between epigenetic events, chromatin plasticity and gene regulation. ...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章,评审

    doi:10.1093/bib/bbu016

    authors: Adusumalli S,Mohd Omar MF,Soong R,Benoukraf T

    更新日期:2015-05-01 00:00:00

  • Computational methods for annotation of plant regulatory non-coding RNAs using RNA-seq.

    abstract::Plant transcriptome encompasses numerous endogenous, regulatory non-coding RNAs (ncRNAs) that play a major biological role in regulating key physiological mechanisms. While studies have shown that ncRNAs are extremely diverse and ubiquitous, the functions of the vast majority of ncRNAs are still unknown. With ever-inc...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbaa322

    authors: Vivek AT,Kumar S

    更新日期:2020-12-18 00:00:00

  • MeSHHeading2vec: a new method for representing MeSH headings as vectors based on graph embedding algorithm.

    abstract::Effectively representing Medical Subject Headings (MeSH) headings (terms) such as disease and drug as discriminative vectors could greatly improve the performance of downstream computational prediction models. However, these terms are often abstract and difficult to quantify. In this paper, we converted the MeSH tree ...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbaa037

    authors: Guo ZH,You ZH,Huang DS,Yi HC,Zheng K,Chen ZH,Wang YB

    更新日期:2020-03-31 00:00:00

  • Docking of peptides to GPCRs using a combination of CABS-dock with FlexPepDock refinement.

    abstract::The structural description of peptide ligands bound to G protein-coupled receptors (GPCRs) is important for the discovery of new drugs and deeper understanding of the molecular mechanisms of life. Here we describe a three-stage protocol for the molecular docking of peptides to GPCRs using a set of different programs: ...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbaa109

    authors: Badaczewska-Dawid AE,Kmiecik S,Koliński M

    更新日期:2020-06-10 00:00:00

  • Computational recognition for long non-coding RNA (lncRNA): Software and databases.

    abstract::Since the completion of the Human Genome Project, it has been widely established that most DNA is not transcribed into proteins. These non-protein-coding regions are believed to be moderators within transcriptional and post-transcriptional processes, which play key roles in the onset of diseases. Long non-coding RNAs ...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章,评审

    doi:10.1093/bib/bbv114

    authors: Yotsukura S,duVerle D,Hancock T,Natsume-Kitatani Y,Mamitsuka H

    更新日期:2017-01-01 00:00:00

  • Result verification, code verification and computation of support values in phylogenetics.

    abstract::Verification in phylogenetics represents an extremely difficult subject. Phylogenetic analysis deals with the reconstruction of evolutionary histories of species, and as long as mankind is not able to travel in time, it will not be possible to verify deep evolutionary histories reconstructed with modern computational ...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbq079

    authors: Stamatakis A,Izquierdo-Carrasco F

    更新日期:2011-05-01 00:00:00

  • Gene set analysis methods for the functional interpretation of non-mRNA data-Genomic range and ncRNA data.

    abstract::Gene set analysis (GSA) is one of the methods of choice for analyzing the results of current omics studies; however, it has been mainly developed to analyze mRNA (microarray, RNA-Seq) data. The following review includes an update regarding general methods and resources for GSA and then emphasizes GSA methods and tools...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbz090

    authors: Mora A

    更新日期:2020-09-25 00:00:00

  • A proteogenomic approach to understand splice isoform functions through sequence and expression-based computational modeling.

    abstract::The products of multi-exon genes are a mixture of alternatively spliced isoforms, from which the translated proteins can have similar, different or even opposing functions. It is therefore essential to differentiate and annotate functions for individual isoforms. Computational approaches provide an efficient complemen...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbv109

    authors: Li HD,Omenn GS,Guan Y

    更新日期:2016-11-01 00:00:00

  • Exon array data analysis using Affymetrix power tools and R statistical software.

    abstract::The use of microarray technology to measure gene expression on a genome-wide scale has been well established for more than a decade. Methods to process and analyse the vast quantity of expression data generated by a typical microarray experiment are similarly well-established. The Affymetrix Exon 1.0 ST array is a rel...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbq086

    authors: Lockstone HE

    更新日期:2011-11-01 00:00:00

  • Probe mapping across multiple microarray platforms.

    abstract::Access to gene expression data has become increasingly common in recent years; however, analysis has become more difficult as it is often desirable to integrate data from different platforms. Probe mapping across microarray platforms is the first and most crucial step for data integration. In this article, we systemat...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbr076

    authors: Allen JD,Wang S,Chen M,Girard L,Minna JD,Xie Y,Xiao G

    更新日期:2012-09-01 00:00:00

  • Teaching the bioinformatics of signaling networks: an integrated approach to facilitate multi-disciplinary learning.

    abstract::The number of bioinformatics tools and resources that support molecular and cell biology approaches is continuously expanding. Moreover, systems and network biology analyses are accompanied more and more by integrated bioinformatics methods. Traditional information-centered university teaching methods often fail, as (...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbt024

    authors: Korcsmaros T,Dunai ZA,Vellai T,Csermely P

    更新日期:2013-09-01 00:00:00

  • Vertical integration methods for gene expression data analysis.

    abstract::Gene expression data have played an essential role in many biomedical studies. When the number of genes is large and sample size is limited, there is a 'lack of information' problem, leading to low-quality findings. To tackle this problem, both horizontal and vertical data integrations have been developed, where verti...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbaa169

    authors: Wu M,Yi H,Ma S

    更新日期:2020-08-14 00:00:00

  • Visualising gene expression in its metabolic context.

    abstract::Relative changes in mRNA as well as protein levels induced by sublethal doses of antibiotics on bacteria are measured and results visualised in the context of metabolic pathway diagrams. The mRNA levels present at a given time point after the addition of the antibiotic are measured using microarrays from Affymetrix. A...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/1.3.297

    authors: Wolf D,Gray CP,de Saizieu A

    更新日期:2000-09-01 00:00:00

  • Resolving the problem of multiple accessions of the same transcript deposited across various public databases.

    abstract::Maintaining the consistency of genomic annotations is an increasingly complex task because of the iterative and dynamic nature of assembly and annotation, growing numbers of biological databases and insufficient integration of annotations across databases. As information exchange among databases is poor, a 'novel' seq...

    journal_title:Briefings in bioinformatics

    pub_type: 杂志文章

    doi:10.1093/bib/bbw017

    authors: Weirick T,John D,Uchida S

    更新日期:2017-03-01 00:00:00