Structural characterization of genomes by large scale sequence-structure threading: application of reliability analysis in structural genomics.

Abstract:

BACKGROUND: We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment.

RESULTS: We have found that the Weibull function describes protein fold distribution within and among genomes more accurately than conventional power functions which have been used in a number of structural genomic studies reported to date. It has also been found that the Weibull reliability parameter beta for protein fold distributions varies between genomes and may reflect differences in rates of gene duplication in evolutionary history of organisms.

CONCLUSIONS: The results of this work demonstrate that reliability analysis can provide useful insights and testable predictions in the fields of comparative and structural genomics.
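
As a rough illustration of the two functional forms the abstract compares, the sketch below evaluates a textbook two-parameter Weibull reliability function alongside a simple power law. The function names, the scale parameter eta, the power-law coefficients and the sample values are all assumptions made for illustration; the abstract does not state the exact parameterization or fitting procedure used in the paper.

```python
import math


def weibull_reliability(x, eta, beta):
    """Textbook two-parameter Weibull reliability (survival) function.

    eta is the scale parameter and beta the shape parameter. This standard
    form is an assumption; it is not taken from the paper itself.
    """
    if x < 0 or eta <= 0 or beta <= 0:
        raise ValueError("x must be >= 0 and eta, beta must be > 0")
    return math.exp(-((x / eta) ** beta))


def power_law(x, a, b):
    """Simple power function a * x**(-b), shown only for comparison."""
    if x <= 0:
        raise ValueError("x must be > 0")
    return a * x ** (-b)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the two curve shapes.
    for x in (1, 2, 5, 10, 20):
        w = weibull_reliability(x, eta=5.0, beta=0.8)
        p = power_law(x, a=1.0, b=0.8)
        print(f"x={x:>2}  weibull={w:.4f}  power={p:.4f}")
```

With beta equal to 1 the Weibull form reduces to a plain exponential decay, which is why the shape parameter is the quantity of interest when comparing genomes.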

journal_name

BMC Bioinformatics

authors

Cherkasov A,Ho Sui SJ,Brunham RC,Jones SJ

doi

10.1186/1471-2105-5-101

pub_date

2004-07-26 00:00:00

pages

101

issn

1471-2105

pii

1471-2105-5-101

journal_volume

5

pub_type

Journal Article
  • Class prediction for high-dimensional class-imbalanced data.

    abstract:BACKGROUND:The goal of class prediction studies is to develop rules to accurately predict the class membership of new samples. The rules are derived using the values of the variables available for each subject: the main characteristic of high-dimensional data is that the number of variables greatly exceeds the number o...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-523

    authors: Blagus R,Lusa L

    update_date:2010-10-20 00:00:00

  • Identifying overrepresented concepts in gene lists from literature: a statistical approach based on Poisson mixture model.

    abstract:BACKGROUND:Large-scale genomic studies often identify large gene lists, for example, the genes sharing the same expression patterns. The interpretation of these gene lists is generally achieved by extracting concepts overrepresented in the gene lists. This analysis often depends on manual annotation of genes based on c...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-272

    authors: He X,Sarma MS,Ling X,Chee B,Zhai C,Schatz B

    update_date:2010-05-20 00:00:00

  • Alternative mapping of probes to genes for Affymetrix chips.

    abstract:BACKGROUND:Short oligonucleotide arrays have several probes measuring the expression level of each target transcript. Therefore the selection of probes is a key component for the quality of measurements. However, once probes have been selected and synthesized on an array, it is still possible to re-evaluate the results...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-5-111

    authors: Gautier L,Møller M,Friis-Hansen L,Knudsen S

    update_date:2004-08-14 00:00:00

  • MiRFinder: an improved approach and software implementation for genome-wide fast microRNA precursor scans.

    abstract:BACKGROUND:MicroRNAs (miRNAs) are recognized as one of the most important families of non-coding RNAs that serve as important sequence-specific post-transcriptional regulators of gene expression. Identification of miRNAs is an important requirement for understanding the mechanisms of post-transcriptional regulation. Hu...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-341

    authors: Huang TH,Fan B,Rothschild MF,Hu ZL,Li K,Zhao SH

    update_date:2007-09-17 00:00:00

  • Qxpak.5: old mixed model solutions for new genomics problems.

    abstract:BACKGROUND:Mixed models have a long and fruitful history in statistics. They are pertinent to genomics problems because they are highly versatile, accommodating a wide variety of situations within the same theoretical and algorithmic framework. RESULTS:Qxpak is a package for versatile statistical genomics, specificall...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-202

    authors: Pérez-Enciso M,Misztal I

    update_date:2011-05-25 00:00:00

  • NOXclass: prediction of protein-protein interaction types.

    abstract:BACKGROUND:Structural models determined by X-ray crystallography play a central role in understanding protein-protein interactions at the molecular level. Interpretation of these models requires the distinction between non-specific crystal packing contacts and biologically relevant interactions. This has been investiga...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-27

    authors: Zhu H,Domingues FS,Sommer I,Lengauer T

    update_date:2006-01-19 00:00:00

  • Time-course analysis of genome-wide gene expression data from hormone-responsive human breast cancer cells.

    abstract:BACKGROUND:Microarray experiments enable simultaneous measurement of the expression levels of virtually all transcripts present in cells, thereby providing a 'molecular picture' of the cell state. On the other hand, the genomic responses to a pharmacological or hormonal stimulus are dynamic molecular processes, where t...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-S2-S12

    authors: Mutarelli M,Cicatiello L,Ferraro L,Grober OM,Ravo M,Facchiano AM,Angelini C,Weisz A

    update_date:2008-03-26 00:00:00

  • Parameterizing sequence alignment with an explicit evolutionary model.

    abstract:BACKGROUND:Inference of sequence homology is inherently an evolutionary question, dependent upon evolutionary divergence. However, the insertion and deletion penalties in the most widely used methods for inferring homology by sequence alignment, including BLAST and profile hidden Markov models (profile HMMs), are not b...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0832-5

    authors: Rivas E,Eddy SR

    update_date:2015-12-10 00:00:00

  • Improved identification of conserved cassette exons using Bayesian networks.

    abstract:BACKGROUND:Alternative splicing is a major contributor to the diversity of eukaryotic transcriptomes and proteomes. Currently, large scale detection of alternative splicing using expressed sequence tags (ESTs) or microarrays does not capture all alternative splicing events. Moreover, for many species genomic data is be...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-477

    authors: Sinha R,Hiller M,Pudimat R,Gausmann U,Platzer M,Backofen R

    update_date:2008-11-12 00:00:00

  • Multi-view feature selection for identifying gene markers: a diversified biological data driven approach.

    abstract:BACKGROUND:In recent years, to investigate challenging bioinformatics problems, the utilization of multiple genomic and proteomic sources has become immensely popular among researchers. One such issue is feature or gene selection and identifying relevant and non-redundant marker genes from high dimensional gene express...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03810-0

    authors: Acharya S,Cui L,Pan Y

    update_date:2020-12-30 00:00:00

  • Markov chain Monte Carlo for active module identification problem.

    abstract:BACKGROUND:Integrative network methods are commonly used for interpretation of high-throughput experimental biological data: transcriptomics, proteomics, metabolomics and others. One of the common approaches is finding a connected subnetwork of a global interaction network that best encompasses significant individual c...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03572-9

    authors: Alexeev N,Isomurodov J,Sukhov V,Korotkevich G,Sergushichev A

    update_date:2020-11-18 00:00:00

  • DeepCryoPicker: fully automated deep neural network for single protein particle picking in cryo-EM.

    abstract:BACKGROUND:Cryo-electron microscopy (Cryo-EM) is widely used in the determination of the three-dimensional (3D) structures of macromolecules. Particle picking from 2D micrographs remains a challenging early step in the Cryo-EM pipeline due to the diversity of particle shapes and the extremely low signal-to-noise ratio ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03809-7

    authors: Al-Azzawi A,Ouadou A,Max H,Duan Y,Tanner JJ,Cheng J

    update_date:2020-11-09 00:00:00

  • BioNanoAnalyst: a visualisation tool to assess genome assembly quality using BioNano data.

    abstract:BACKGROUND:Reference genome assemblies are valuable, as they provide insights into gene content, genetic evolution and domestication. The higher the quality of a reference genome assembly the more accurate the downstream analysis will be. During the last few years, major efforts have been made towards improving the qua...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1735-4

    authors: Yuan Y,Bayer PE,Scheben A,Chan CK,Edwards D

    update_date:2017-06-30 00:00:00

  • Linear space string correction algorithm using the Damerau-Levenshtein distance.

    abstract:BACKGROUND:The Damerau-Levenshtein (DL) distance metric has been widely used in the biological science. It tries to identify the similar region of DNA,RNA and protein sequences by transforming one sequence to the another using the substitution, insertion, deletion and transposition operations. Lowrance and Wagner have ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3184-8

    authors: Zhao C,Sahni S

    update_date:2020-12-09 00:00:00

  • An integrative method to normalize RNA-Seq data.

    abstract:BACKGROUND:Transcriptome sequencing is a powerful tool for measuring gene expression, but as well as some other technologies, various artifacts and biases affect the quantification. In order to correct some of them, several normalization approaches have emerged, differing both in the statistical strategy employed and i...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-188

    authors: Filloux C,Cédric M,Romain P,Lionel F,Christophe K,Dominique R,Abderrahman M,Daniel P

    update_date:2014-06-14 00:00:00

  • A knowledge discovery object model API for Java.

    abstract:BACKGROUND:Biological data resources have become heterogeneous and derive from multiple sources. This introduces challenges in the management and utilization of this data in software development. Although efforts are underway to create a standard format for the transmission and storage of biological data, this objectiv...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-4-51

    authors: Zuyderduyn SD,Jones SJ

    update_date:2003-10-28 00:00:00

  • An automatic device for detection and classification of malaria parasite species in thick blood film.

    abstract:BACKGROUND:Current malaria diagnosis relies primarily on microscopic examination of Giemsa-stained thick and thin blood films. This method requires vigorously trained technicians to efficiently detect and classify the malaria parasite species such as Plasmodium falciparum (Pf) and Plasmodium vivax (Pv) for an appropria...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S17-S18

    authors: Kaewkamnerd S,Uthaipibull C,Intarapanich A,Pannarut M,Chaotheing S,Tongsima S

    update_date:2012-01-01 00:00:00

  • HMMvar-func: a new method for predicting the functional outcome of genetic variants.

    abstract:BACKGROUND:Numerous tools have been developed to predict the fitness effects (i.e., neutral, deleterious, or beneficial) of genetic variants on corresponding proteins. However, prediction in terms of whether a variant causes the variant bearing protein to lose the original function or gain new function is also needed f...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0781-z

    authors: Liu M,Watson LT,Zhang L

    update_date:2015-10-30 00:00:00

  • A simple method for assessing sample sizes in microarray experiments.

    abstract:BACKGROUND:In this short article, we discuss a simple method for assessing sample size requirements in microarray experiments. RESULTS:Our method starts with the output from a permutation-based analysis for a set of pilot data, e.g. from the SAM package. Then for a given hypothesized mean difference and various sample...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-106

    authors: Tibshirani R

    update_date:2006-03-02 00:00:00

  • LDpop: an interactive online tool to calculate and visualize geographic LD patterns.

    abstract:BACKGROUND:Linkage disequilibrium (LD)-the non-random association of alleles at different loci-defines population-specific haplotypes which vary by genomic ancestry. Assessment of allelic frequencies and LD patterns from a variety of ancestral populations enables researchers to better understand population histories as...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-3340-1

    authors: Alexander TA,Machiela MJ

    update_date:2020-01-10 00:00:00

  • Menoci: lightweight extensible web portal enhancing data management for biomedical research projects.

    abstract:BACKGROUND:Biomedical research projects deal with data management requirements from multiple sources like funding agencies' guidelines, publisher policies, discipline best practices, and their own users' needs. We describe functional and quality requirements based on many years of experience implementing data managemen...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03928-1

    authors: Suhr M,Lehmann C,Bauer CR,Bender T,Knopp C,Freckmann L,Öst Hansen B,Henke C,Aschenbrandt G,Kühlborn LK,Rheinländer S,Weber L,Marzec B,Hellkamp M,Wieder P,Sax U,Kusch H,Nussbeck SY

    update_date:2020-12-17 00:00:00

  • Exploring the transcription factor activity in high-throughput gene expression data using RLQ analysis.

    abstract:BACKGROUND:Interpretation of gene expression microarray data in the light of external information on both columns and rows (experimental variables and gene annotations) facilitates the extraction of pertinent information hidden in these complex data. Biologists classically interpret genes of interest after retrieving f...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-178

    authors: Baty F,Rüdiger J,Miglino N,Kern L,Borger P,Brutsche M

    update_date:2013-06-06 00:00:00

  • iMEGES: integrated mental-disorder GEnome score by deep neural network for prioritizing the susceptibility genes for mental disorders in personal genomes.

    abstract:BACKGROUND:A range of rare and common genetic variants have been discovered to be potentially associated with mental diseases, but many more have not been uncovered. Powerful integrative methods are needed to systematically prioritize both variants and genes that confer susceptibility to mental diseases in personal gen...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2469-7

    authors: Khan A,Liu Q,Wang K

    update_date:2018-12-28 00:00:00

  • Random forest versus logistic regression: a large-scale benchmark experiment.

    abstract:BACKGROUND AND GOAL:The Random Forest (RF) algorithm for regression and classification has considerably gained popularity since its introduction in 2001. Meanwhile, it has grown to a standard classification approach competing with logistic regression in many innovation-friendly scientific fields. RESULTS:In this conte...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2264-5

    authors: Couronné R,Probst P,Boulesteix AL

    update_date:2018-07-17 00:00:00

  • CDKAM: a taxonomic classification tool using discriminative k-mers and approximate matching strategies.

    abstract:BACKGROUND:Current taxonomic classification tools use exact string matching algorithms that are effective to tackle the data from the next generation sequencing technology. However, the unique error patterns in the third generation sequencing (TGS) technologies could reduce the accuracy of these programs. RESULTS:We d...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03777-y

    authors: Bui VK,Wei C

    update_date:2020-10-20 00:00:00

  • Optimizing agent-based transmission models for infectious diseases.

    abstract:BACKGROUND:Infectious disease modeling and computational power have evolved such that large-scale agent-based models (ABMs) have become feasible. However, the increasing hardware complexity requires adapted software designs to achieve the full potential of current high-performance workstations. RESULTS:We have found l...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0612-2

    authors: Willem L,Stijven S,Tijskens E,Beutels P,Hens N,Broeckhove J

    update_date:2015-06-02 00:00:00

  • WebChem Viewer: a tool for the easy dissemination of chemical and structural data sets.

    abstract:BACKGROUND:Sharing sets of chemical data (e.g., chemical properties, docking scores, etc.) among collaborators with diverse skill sets is a common task in computer-aided drug design and medicinal chemistry. The ability to associate this data with images of the relevant molecular structures greatly facilitates scientifi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-159

    authors: Durrant JD,Amaro RE

    update_date:2014-05-23 00:00:00

  • Is EC class predictable from reaction mechanism?

    abstract:BACKGROUND:We investigate the relationships between the EC (Enzyme Commission) class, the associated chemical reaction, and the reaction mechanism by building predictive models using Support Vector Machine (SVM), Random Forest (RF) and k-Nearest Neighbours (kNN). We consider two ways of encoding the reaction mechanism ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-60

    authors: Nath N,Mitchell JB

    update_date:2012-04-24 00:00:00

  • A novel approach for predicting protein S-glutathionylation.

    abstract:BACKGROUND:S-glutathionylation is the formation of disulfide bonds between the tripeptide glutathione and cysteine residues of the protein, protecting them from irreversible oxidation and in some cases causing change in their functions. Regulatory glutathionylation of proteins is a controllable and reversible process a...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03571-w

    authors: Anashkina AA,Poluektov YM,Dmitriev VA,Kuznetsov EN,Mitkevich VA,Makarov AA,Petrushanko IY

    update_date:2020-09-14 00:00:00

  • BRCA-Pathway: a structural integration and visualization system of TCGA breast cancer data on KEGG pathways.

    abstract:BACKGROUND:Bioinformatics research for finding biological mechanisms can be done by analysis of transcriptome data with pathway based interpretation. Therefore, researchers have tried to develop tools to analyze transcriptome data with pathway based interpretation. Over the years, the amount of omics data has become hu...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2016-6

    authors: Kim I,Choi S,Kim S

    update_date:2018-02-19 00:00:00