Abstract:
BACKGROUND: Automated genotype calling in tetraploid species was not possible until recently, which hampered genetic analysis. Modern genotyping assays often produce two signals, one for each allele of a bi-allelic marker. While ample software is available to obtain genotypes (homozygous for either allele, or heterozygous) for diploid species from these signals, such software is not available for tetraploid species, which may be scored as five alternative genotypes (aaaa, baaa, bbaa, bbba and bbbb; nulliplex to quadruplex).
RESULTS: We present a novel algorithm, implemented in the R package fitTetra, to assign genotypes for bi-allelic markers to tetraploid samples from genotyping assays that produce intensity signals for both alleles. The algorithm is based on the fitting of several mixture models with five components, one for each of the five possible genotypes. The models have different numbers of parameters specifying the relation between the five component means, and some of them impose a constraint on the mixing proportions to conform to Hardy-Weinberg equilibrium (HWE) ratios. The software rejects markers that do not allow reliable genotyping for the majority of the samples, and it assigns a missing score to samples that cannot be scored into one of the five possible genotypes with sufficient confidence.
CONCLUSIONS: We have validated the software with data from a collection of 224 potato varieties assayed with an Illumina GoldenGate™ 384 SNP array and shown that all SNPs with informative ratio distributions are fitted. Almost all fitted models appear to be correct based on visual inspection and comparison with diploid samples. When the collection of potato varieties is analyzed as if it were a population, almost all markers seem to be in Hardy-Weinberg equilibrium. The R package fitTetra is freely available under the GNU Public License from http://www.plantbreeding.wur.nl/UK/software_fitTetra.html and as Additional files with this article.
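The HWE constraint on the mixing proportions mentioned in the abstract can be illustrated with a short sketch. It is not taken from fitTetra: it assumes random mating with polysomic (random chromosome) inheritance, under which the five dosage classes follow a binomial distribution with four trials. The function name `hwe_tetraploid_proportions` and the parameter `p_b` (frequency of allele b) are hypothetical names chosen for this illustration.

```python
from math import comb

# Dosage classes in the order used in the abstract (0 to 4 copies of allele b).
GENOTYPES = ["aaaa", "baaa", "bbaa", "bbba", "bbbb"]


def hwe_tetraploid_proportions(p_b):
    """Expected frequencies of the five tetraploid genotypes under HWE.

    Assumption (not from the source): polysomic inheritance, so the number
    of b alleles per individual is Binomial(4, p_b).
    """
    if not 0.0 <= p_b <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q_a = 1.0 - p_b
    return [comb(4, k) * p_b**k * q_a ** (4 - k) for k in range(5)]


# Example: equal allele frequencies.
props = hwe_tetraploid_proportions(0.5)
for genotype, proportion in zip(GENOTYPES, props):
    print(genotype, proportion)
```

In a constrained mixture model of this kind, the five mixing proportions would be tied to the single parameter `p_b` rather than estimated freely, which reduces the number of free parameters by three.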
journal_name: BMC Bioinformatics
journal_title: BMC bioinformatics
authors: Voorrips RE, Gort G, Vosman B
doi: 10.1186/1471-2105-12-172
subject: Has Abstract
pub_date: 2011-05-19 00:00:00
pages: 172
issn: 1471-2105
pii: 1471-2105-12-172
journal_volume: 12
pub_type: Journal Article
abstract:BACKGROUND:Prioritizing disease genes is trying to identify potential disease causing genes for a given phenotype, which can be applied to reveal the inherited basis of human diseases and facilitate drug development. Our motivation is inspired by label propagation algorithm and the false positive protein-protein intera...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2040-6
updated:2018-02-08 00:00:00
abstract:BACKGROUND:Many Bioinformatics studies begin with a multiple sequence alignment as the foundation for their research. This is because multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence structure relationships. RESULTS:In this paper, we have proposed a Vertical...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-353
updated:2011-08-25 00:00:00
abstract:BACKGROUND:Designing small-molecule kinase inhibitors with desirable selectivity profiles is a major challenge in drug discovery. A high-throughput screen for inhibitors of a given kinase will typically yield many compounds that inhibit more than one kinase. A series of chemical modifications are usually required befor...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-9-491
updated:2008-11-25 00:00:00
abstract:BACKGROUND:Most known eukaryotic genomes contain mobile copied elements called transposable elements. In some species, these elements account for the majority of the genome sequence. They have been subject to many mutations and other genomic events (copies, deletions, captures) during transposition. The identification ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-474
updated:2010-09-22 00:00:00
abstract:BACKGROUND:New web-based technologies provide an excellent opportunity for sharing and accessing information and using web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bio...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-10-S14-S4
updated:2009-11-10 00:00:00
abstract:BACKGROUND:The analysis of tissue-specific protein interaction networks and their functional enrichment in pathological and normal tissues provides insights on the etiology of diseases. The Pan-cancer proteomic project, in The Cancer Genome Atlas, collects protein expressions in human cancers and it is a reference reso...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2183-5
updated:2018-07-09 00:00:00
abstract:BACKGROUND:The spatial configuration of chromosomes is essential to various cellular processes, notably gene regulation, while architecture related alterations, such as translocations and gene fusions, are often cancer drivers. Thus, eliciting chromatin conformation is important, yet challenging due to compaction, dyna...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-3424-y
updated:2020-02-24 00:00:00
abstract:BACKGROUND:Enhancers are stretches of DNA (100-1000 bp) that play a major role in development gene expression, evolution and disease. It has been recently shown that in high-level eukaryotes enhancers rarely work alone, instead they collaborate by forming clusters of cis-regulatory modules (CRMs). Although the binding ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-016-0980-2
updated:2016-03-18 00:00:00
abstract:BACKGROUND:Differences in cell-type composition across subjects and conditions often carry biological significance. Recent advancements in single cell sequencing technologies enable cell-types to be identified at the single cell level, and as a result, cell-type composition of tissues can now be studied in exquisite de...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-019-3211-9
updated:2019-12-24 00:00:00
abstract:BACKGROUND:Automated protein function prediction methods are needed to keep pace with high-throughput sequencing. With the existence of many programs and databases for inferring different protein functions, a pipeline that properly integrates these resources will benefit from the advantages of each method. However, int...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-9-52
updated:2008-01-25 00:00:00
abstract:BACKGROUND:We study the statistical properties of fragment coverage in genome sequencing experiments. In an extension of the classic Lander-Waterman model, we consider the effect of the length distribution of fragments. We also introduce a coding of the shape of the coverage depth function as a tree and explain how thi...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-430
updated:2010-08-18 00:00:00
abstract:BACKGROUND:The adaptation of the CRISPR-Cas9 system to pooled library gene knockout screens in mammalian cells represents a major technological leap over RNA interference, the prior state of the art. New methods for analyzing the data and evaluating results are needed. RESULTS:We offer BAGEL (Bayesian Analysis of Gene...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-016-1015-8
updated:2016-04-16 00:00:00
abstract:BACKGROUND:Current malaria diagnosis relies primarily on microscopic examination of Giemsa-stained thick and thin blood films. This method requires vigorously trained technicians to efficiently detect and classify the malaria parasite species such as Plasmodium falciparum (Pf) and Plasmodium vivax (Pv) for an appropria...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-13-S17-S18
updated:2012-01-01 00:00:00
abstract:BACKGROUND:Statistical models and methods that associate changes in the physicochemical properties of amino acids with natural selection at the molecular level typically do not take into account the correlations between such properties. We propose a Bayesian hierarchical regression model with a generalization of the Di...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-13-278
updated:2012-10-30 00:00:00
abstract:BACKGROUND:An important mechanism of endocrine activity is chemicals entering target cells via transport proteins and then interacting with hormone receptors such as the estrogen receptor (ER). α-Fetoprotein (AFP) is a major transport protein in rodent serum that can bind and sequester estrogens, thus preventing entry ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-14-S14-S6
updated:2013-01-01 00:00:00
abstract:BACKGROUND:The analysis of genome synteny is a common practice in comparative genomics. With the advent of DNA sequencing technologies, individual biologists can rapidly produce their genomic sequences of interest. Although web-based synteny visualization tools are convenient for biologists to use, none of the existing...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-316
updated:2011-08-02 00:00:00
abstract:BACKGROUND:While biomedical text mining is emerging as an important research area, practical results have proven difficult to achieve. We believe that an important first step towards more accurate text-mining lies in the ability to identify and characterize text that satisfies various types of information needs. We rep...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-7-356
updated:2006-07-25 00:00:00
abstract:BACKGROUND:We investigate the relationships between the EC (Enzyme Commission) class, the associated chemical reaction, and the reaction mechanism by building predictive models using Support Vector Machine (SVM), Random Forest (RF) and k-Nearest Neighbours (kNN). We consider two ways of encoding the reaction mechanism ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-13-60
updated:2012-04-24 00:00:00
abstract:BACKGROUND:The information in large collections of phylogenetic trees is useful for many comparative genomic studies. Therefore, there is a need for flexible tools that allow exploration of such collections in order to retrieve relevant data as quickly as possible. RESULTS:In this paper, we present TPMS (Tree Pattern-...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-14-109
updated:2013-03-27 00:00:00
abstract:BACKGROUND:MicroRNAs (miRNAs) are single-stranded non-coding RNAs known to regulate a wide range of cellular processes by silencing the gene expression at the protein and/or mRNA levels. Computational prediction of miRNA targets is essential for elucidating the detailed functions of miRNA. However, the prediction speci...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-11-476
updated:2010-09-22 00:00:00
abstract:BACKGROUND:The PathoLogic program constructs Pathway/Genome databases by using a genome's annotation to predict the set of metabolic pathways present in an organism. PathoLogic determines the set of reactions composing those pathways from the enzymes annotated in the organism's genome. Most annotation efforts fail to a...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-5-76
updated:2004-06-09 00:00:00
abstract:BACKGROUND:In real-time PCR data analysis, the cycle threshold (CT) method is currently the gold standard. This method is based on an assumption of equal PCR efficiency in all reactions, and precision may suffer if this condition is not met. Nonlinear regression analysis (NLR) or curve fitting has therefore been sugges...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-7-107
updated:2006-03-03 00:00:00
abstract:BACKGROUND:One of the main goals in cancer studies including high-throughput microRNA (miRNA) and mRNA data is to find and assess prognostic signatures capable of predicting clinical outcome. Both mRNA and miRNA expression changes in cancer diseases are described to reflect clinical characteristics like staging and pro...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-488
updated:2011-12-21 00:00:00
abstract:BACKGROUND:Hierarchical Multi-Label Classification is a classification task where the classes to be predicted are hierarchically organized. Each instance can be assigned to classes belonging to more than one path in the hierarchy. This scenario is typically found in protein function prediction, considering that each pr...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-016-1232-1
updated:2016-09-15 00:00:00
abstract:BACKGROUND:Analysis of gene expression data in terms of a priori-defined gene sets has recently received significant attention as this approach typically yields more compact and interpretable results than those produced by traditional methods that rely on individual genes. The set-level strategy can also be adopted wit...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-13-S10-S15
updated:2012-06-25 00:00:00
abstract:BACKGROUND:The developments of high-throughput genotyping technologies, which enable the simultaneous genotyping of hundreds of thousands of single nucleotide polymorphisms (SNP) have the potential to increase the benefits of genetic epidemiology studies. Although the enhanced resolution of these platforms increases th...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-10-102
updated:2009-04-03 00:00:00
abstract:BACKGROUND:The traditional phylogeny analysis within gene family is mainly based on DNA or amino acid sequence homologies. However, these phylogenetic tree analyses are not suitable for those "non-traditional" gene families like microRNA with very short sequences. For the normal protein-coding gene families, low bootst...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-79
updated:2011-03-18 00:00:00
abstract:BACKGROUND:The importance of biodiversity conservation has been increasing steadily due to its benefits to human beings. Recently, producing and managing biodiversity databases have become much easier because of the information technology (IT) advancement. This made the general public's participation in biodiversity co...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-10-S15-S11
updated:2009-12-03 00:00:00
abstract:BACKGROUND:Fluorescence microscopy is widely used to determine the subcellular location of proteins. Efforts to determine location on a proteome-wide basis create a need for automated methods to analyze the resulting images. Over the past ten years, the feasibility of using machine learning methods to recognize all maj...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-8-210
updated:2007-06-19 00:00:00
abstract:BACKGROUND:Many biases and spurious effects are inherent in RNA-seq technology, resulting in a non-uniform distribution of sequencing read counts for each base position in a gene. Therefore, a base-level strategy is required to model the non-uniformity. Also, the properties of sequencing read counts can be leveraged to...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-017-1780-z
updated:2017-08-09 00:00:00