A format for databasing and comparison of AFLP fingerprint profiles.

Abstract:

BACKGROUND: Amplified fragment length polymorphism (AFLP) is a PCR-based technique that involves restriction of genomic DNA, followed by ligation of adaptors to the resulting fragments and selective PCR amplification of a subset of these fragments. The amplified fragments are separated on a sequencing gel and visualized by autoradiography or fluorescent sequencing equipment. AFLP allows high-resolution genotyping, but the lack of a format for databasing and comparing AFLP fingerprint profiles limits its wider application in profiling large numbers of biological samples.
RESULTS: A scheme is described to represent a DNA fingerprint profile in a nucleotide sequence-like format, in which the information line contains the minimal details necessary to interpret an AFLP DNA fingerprint profile. These details include the technique used, the restriction enzymes, the primer combination, the biological source of the DNA material, fragment sizing and annotation. The body lines record the size and relative intensity of DNA fragments as a string of defined letters or symbols. Algorithms for normalizing raw data, binning fragments and comparing AFLP DNA fingerprint profiles are described. First, the peak heights are normalized against their average and then represented by five symbols according to their relative intensities. Second, a binning algorithm based loosely on common springs and rubber bands is applied, which positions sequence fragments at their best possible integer approximations. A BLAST-like reward-penalty concept is used to compare AFLP fingerprint profiles by matching peaks, using two metrics: score and percentage of similarity. A software package was developed based on our scheme and proposed algorithms. An example of its use is given in evaluating the novelty of a new tropical orchid cultivar by comparing its AFLP fingerprint profile against those of related commercial cultivars in a database.
CONCLUSIONS: AFLP DNA fingerprint profiles can be databased and compared effectively with software developed based on our scheme and algorithms. This will facilitate wider use of this DNA fingerprinting technique in areas such as forensic study, intellectual property protection for biological materials and biodiversity management. Moreover, the same concepts can be applied to databasing and comparing DNA fingerprint profiles obtained with other DNA fingerprinting techniques.
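
The normalization and comparison steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' software: the abstract does not state the intensity cut-offs, the symbol alphabet, or the reward and penalty values, so `SYMBOL_CUTOFFS`, the letters `a` to `e`, `reward=2` and `penalty=-1` are all hypothetical. The spring-and-rubber-band binning step is not reproduced; fragment sizes are assumed to be already rounded to integers.

```python
from statistics import mean

# Hypothetical cut-offs on the height/average ratio (not taken from the paper).
# A ratio below the first bound maps to "a", below the second to "b", and so on.
SYMBOL_CUTOFFS = [(0.5, "a"), (1.0, "b"), (1.5, "c"), (2.0, "d")]
TOP_SYMBOL = "e"  # any ratio at or above the last cut-off


def encode_peaks(peaks):
    """Map {fragment_size_bp: raw_peak_height} to {fragment_size_bp: symbol}.

    Peak heights are divided by their average, then each ratio is
    replaced by one of five symbols.
    """
    if not peaks:
        return {}
    avg = mean(peaks.values())
    if avg == 0:
        raise ValueError("average peak height is zero; cannot normalize")
    encoded = {}
    for size, height in peaks.items():
        ratio = height / avg
        symbol = TOP_SYMBOL
        for upper_bound, candidate in SYMBOL_CUTOFFS:
            if ratio < upper_bound:
                symbol = candidate
                break
        encoded[size] = symbol
    return encoded


def compare_profiles(query, subject, reward=2, penalty=-1):
    """Return (score, percent_similarity) for two size -> symbol profiles.

    A peak present at the same size in both profiles earns `reward`;
    a peak present in only one profile costs `penalty`. Intensity
    symbols are ignored here, which is a simplification.
    """
    all_sizes = set(query) | set(subject)
    if not all_sizes:
        return 0, 0.0
    shared = set(query) & set(subject)
    unmatched = len(all_sizes) - len(shared)
    score = reward * len(shared) + penalty * unmatched
    similarity = 100.0 * len(shared) / len(all_sizes)
    return score, similarity


if __name__ == "__main__":
    raw = {100: 50.0, 150: 100.0, 200: 150.0, 250: 300.0}
    query = encode_peaks(raw)
    subject = {100: "b", 150: "c", 300: "a"}
    print(query)
    print(compare_profiles(query, subject))
```

Here percentage of similarity is taken as shared peaks over all distinct peak positions; the published software may define both metrics differently.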

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Hong Y,Chuah A

doi

10.1186/1471-2105-4-7

keywords:

subject

Has Abstract

pub_date

2003-02-25 00:00:00

pages

7

issn

1471-2105

journal_volume

4

pub_type

Journal Article
  • An SVM-based system for predicting protein subnuclear localizations.

    abstract:BACKGROUND:The large gap between the number of protein sequences in databases and the number of functionally characterized proteins calls for the development of a fast computational tool for the prediction of subnuclear and subcellular localizations generally applicable to protein sequences. The information on localiza...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-291

    authors: Lei Z,Dai Y

    updated:2005-12-07 00:00:00

  • PVT: an efficient computational procedure to speed up next-generation sequence analysis.

    abstract:BACKGROUND:High-throughput Next-Generation Sequencing (NGS) techniques are advancing genomics and molecular biology research. This technology generates substantially large data which puts up a major challenge to the scientists for an efficient, cost and time effective solution to analyse such data. Further, for the dif...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-167

    authors: Maji RK,Sarkar A,Khatua S,Dasgupta S,Ghosh Z

    updated:2014-06-04 00:00:00

  • SitesIdentify: a protein functional site prediction tool.

    abstract:BACKGROUND:The rate of protein structures being deposited in the Protein Data Bank surpasses the capacity to experimentally characterise them and therefore computational methods to analyse these structures have become increasingly important. Identifying the region of the protein most likely to be involved in function i...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-379

    authors: Bray T,Chan P,Bougouffa S,Greaves R,Doig AJ,Warwicker J

    updated:2009-11-18 00:00:00

  • Automated prediction of HIV drug resistance from genotype data.

    abstract:BACKGROUND:HIV/AIDS is a serious threat to public health. The emergence of drug resistance mutations diminishes the effectiveness of drug therapy for HIV/AIDS. Developing a computational prediction of drug resistance phenotype will enable efficient and timely selection of the best treatment regimens. RESULTS:A unified...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1114-6

    authors: Shen C,Yu X,Harrison RW,Weber IT

    updated:2016-08-31 00:00:00

  • Colonyzer: automated quantification of micro-organism growth characteristics on solid agar.

    abstract:BACKGROUND:High-throughput screens comparing growth rates of arrays of distinct micro-organism cultures on solid agar are useful, rapid methods of quantifying genetic interactions. Growth rate is an informative phenotype which can be estimated by measuring cell densities at one or more times after inoculation. Precise ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-287

    authors: Lawless C,Wilkinson DJ,Young A,Addinall SG,Lydall DA

    updated:2010-05-28 00:00:00

  • Assembly-free genome comparison based on next-generation sequencing reads and variable length patterns.

    abstract:BACKGROUND:With the advent of Next-Generation Sequencing technologies (NGS), a large amount of short read data has been generated. If a reference genome is not available, the assembly of a template sequence is usually challenging because of repeats and the short length of reads. When NGS reads cannot be mapped onto a r...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-S9-S1

    authors: Comin M,Schimd M

    updated:2014-01-01 00:00:00

  • Exploring the transcription factor activity in high-throughput gene expression data using RLQ analysis.

    abstract:BACKGROUND:Interpretation of gene expression microarray data in the light of external information on both columns and rows (experimental variables and gene annotations) facilitates the extraction of pertinent information hidden in these complex data. Biologists classically interpret genes of interest after retrieving f...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-178

    authors: Baty F,Rüdiger J,Miglino N,Kern L,Borger P,Brutsche M

    updated:2013-06-06 00:00:00

  • Computational approaches to protein inference in shotgun proteomics.

    abstract:Shotgun proteomics has recently emerged as a powerful approach to characterizing proteomes in biological samples. Its overall objective is to identify the form and quantity of each protein in a high-throughput manner by coupling liquid chromatography with tandem mass spectrometry. As a consequence of its high throughp...

    journal_title:BMC bioinformatics

    pub_type: Journal Article, Review

    doi:10.1186/1471-2105-13-S16-S4

    authors: Li YF,Radivojac P

    updated:2012-01-01 00:00:00

  • Integrated prediction of one-dimensional structural features and their relationships with conformational flexibility in helical membrane proteins.

    abstract:BACKGROUND:Many structural properties such as solvent accessibility, dihedral angles and helix-helix contacts can be assigned to each residue in a membrane protein. Independent studies exist on the analysis and sequence-based prediction of some of these so-called one-dimensional features. However, there is little expla...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-533

    authors: Ahmad S,Singh YH,Paudel Y,Mori T,Sugita Y,Mizuguchi K

    updated:2010-10-27 00:00:00

  • A global optimization algorithm for protein surface alignment.

    abstract:BACKGROUND:A relevant problem in drug design is the comparison and recognition of protein binding sites. Binding sites recognition is generally based on geometry often combined with physico-chemical properties of the site since the conformation, size and chemical composition of the protein surface are all relevant for ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-488

    authors: Bertolazzi P,Guerra C,Liuzzi G

    updated:2010-09-29 00:00:00

  • The effect of rare variants on inflation of the test statistics in case-control analyses.

    abstract:BACKGROUND:The detection of bias due to cryptic population structure is an important step in the evaluation of findings of genetic association studies. The standard method of measuring this bias in a genetic association study is to compare the observed median association test statistic to the expected median test stati...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0496-1

    authors: Pirie A,Wood A,Lush M,Tyrer J,Pharoah PD

    updated:2015-02-20 00:00:00

  • FQStat: a parallel architecture for very high-speed assessment of sequencing quality metrics.

    abstract:BACKGROUND:High throughput DNA/RNA sequencing has revolutionized biological and clinical research. Sequencing is widely used, and generates very large amounts of data, mainly due to reduced cost and advanced technologies. Quickly assessing the quality of giga-to-tera base levels of sequencing data has become a routine ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3015-y

    authors: Chanumolu SK,Albahrani M,Otu HH

    updated:2019-08-15 00:00:00

  • OpenMS - an open-source software framework for mass spectrometry.

    abstract:BACKGROUND:Mass spectrometry is an essential analytical technique for high-throughput analysis in proteomics and metabolomics. The development of new separation techniques, precise mass analyzers and experimental protocols is a very active field of research. This leads to more complex experimental setups yielding ever ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-163

    authors: Sturm M,Bertsch A,Gröpl C,Hildebrandt A,Hussong R,Lange E,Pfeifer N,Schulz-Trieglaff O,Zerck A,Reinert K,Kohlbacher O

    updated:2008-03-26 00:00:00

  • Integrated olfactory receptor and microarray gene expression databases.

    abstract:BACKGROUND:Gene expression patterns of olfactory receptors (ORs) are an important component of the signal encoding mechanism in the olfactory system since they determine the interactions between odorant ligands and sensory neurons. We have developed the Olfactory Receptor Microarray Database (ORMD) to house OR gene exp...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-231

    authors: Liu N,Crasto CJ,Ma M

    updated:2007-06-30 00:00:00

  • Quality determination and the repair of poor quality spots in array experiments.

    abstract:BACKGROUND:A common feature of microarray experiments is the occurrence of missing gene expression data. These missing values occur for a variety of reasons, in particular, because of the filtering of poor quality spots and the removal of undefined values when a logarithmic transformation is applied to negative backgro...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-234

    authors: Tom BD,Gilks WR,Brooke-Powell ET,Ajioka JW

    updated:2005-09-26 00:00:00

  • Bounded search for de novo identification of degenerate cis-regulatory elements.

    abstract:BACKGROUND:The identification of statistically overrepresented sequences in the upstream regions of coregulated genes should theoretically permit the identification of potential cis-regulatory elements. However, in practice many cis-regulatory elements are highly degenerate, precluding the use of an exhaustive word-cou...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-254

    authors: Carlson JM,Chakravarty A,Khetani RS,Gross RH

    updated:2006-05-15 00:00:00

  • On the consistency of orthology relationships.

    abstract:BACKGROUND:Orthologs inference is the starting point of most comparative genomics studies, and a plethora of methods have been designed in the last decade to address this challenging task. In this paper we focus on the problems of deciding consistency with a species tree (known or not) of a partial set of orthology/par...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1267-3

    authors: Jones M,Paul C,Scornavacca C

    updated:2016-11-11 00:00:00

  • PFClust: a novel parameter free clustering algorithm.

    abstract:BACKGROUND:We present the algorithm PFClust (Parameter Free Clustering), which is able automatically to cluster data and identify a suitable number of clusters to group them into without requiring any parameters to be specified by the user. The algorithm partitions a dataset into a number of clusters that share some co...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-213

    authors: Mavridis L,Nath N,Mitchell JB

    updated:2013-07-03 00:00:00

  • Fast and accurate clustering of noncoding RNAs using ensembles of sequence alignments and secondary structures.

    abstract:BACKGROUND:Clustering of unannotated transcripts is an important task to identify novel families of noncoding RNAs (ncRNAs). Several hierarchical clustering methods have been developed using similarity measures based on the scores of structural alignment. However, the high computational cost of exact structural alignme...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-S1-S48

    authors: Saito Y,Sato K,Sakakibara Y

    updated:2011-02-15 00:00:00

  • In silico modelling of hormone response elements.

    abstract:BACKGROUND:An important step in understanding the conditions that specify gene expression is the recognition of gene regulatory elements. Due to high diversity of different types of transcription factors and their DNA binding preferences, it is a challenging problem to establish an accurate model for recognition of fun...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-S4-S27

    authors: Stepanova M,Lin F,Lin VC

    updated:2006-12-12 00:00:00

  • Protein local 3D structure prediction by Super Granule Support Vector Machines (Super GSVM).

    abstract:BACKGROUND:Understanding the relationship between the protein sequence and the 3D structure is a major research area in bioinformatics. The prediction of complete protein tertiary structure based only on sequence information is still an impractical work. This paper aims at revealing the hidden knowledge of the sequence...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-S11-S15

    authors: Chen B,Johnson M

    updated:2009-10-08 00:00:00

  • Phylogenomics and sequence-structure-function relationships in the GmrSD family of Type IV restriction enzymes.

    abstract:BACKGROUND:GmrSD is a modification-dependent restriction endonuclease that specifically targets and cleaves glucosylated hydroxymethylcytosine (glc-HMC) modified DNA. It is encoded either as two separate single-domain GmrS and GmrD proteins or as a single protein carrying both domains. Previous studies suggested that G...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0773-z

    authors: Machnicka MA,Kaminska KH,Dunin-Horkawicz S,Bujnicki JM

    updated:2015-10-23 00:00:00

  • A weighted string kernel for protein fold recognition.

    abstract:BACKGROUND:Alignment-free methods for comparing protein sequences have proved to be viable alternatives to approaches that first rely on an alignment of the sequences to be compared. Much work however need to be done before those methods provide reliable fold recognition for proteins whose sequences share little simila...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1795-5

    authors: Nojoomi S,Koehl P

    updated:2017-08-25 00:00:00

  • Functionally specified protein signatures distinctive for each of the different blue copper proteins.

    abstract:BACKGROUND:Proteins having similar functions from different sources can be identified by the occurrence in their sequences, a conserved cluster of amino acids referred to as pattern, motif, signature or fingerprint. The wide usage of protein sequence analysis in par with the growth of databases signifies the importance...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-5-127

    authors: Giri AV,Anishetty S,Gautam P

    updated:2004-09-09 00:00:00

  • Model based heritability scores for high-throughput sequencing data.

    abstract:BACKGROUND:Heritability of a phenotypic or molecular trait measures the proportion of variance that is attributable to genotypic variance. It is an important concept in breeding and genetics. Few methods are available for calculating heritability for traits derived from high-throughput sequencing. RESULTS:We propose s...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1539-6

    authors: Rudra P,Shi WJ,Vestal B,Russell PH,Odell A,Dowell RD,Radcliffe RA,Saba LM,Kechris K

    updated:2017-03-02 00:00:00

  • CGHpower: exploring sample size calculations for chromosomal copy number experiments.

    abstract:BACKGROUND:Determining a suitable sample size is an important step in the planning of microarray experiments. Increasing the number of arrays gives more statistical power, but adds to the total cost of the experiment. Several approaches for sample size determination have been developed for expression array studies, but...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-331

    authors: Scheinin I,Ferreira JA,Knuutila S,Meijer GA,van de Wiel MA,Ylstra B

    updated:2010-06-17 00:00:00

  • RWRMTN: a tool for predicting disease-associated microRNAs based on a microRNA-target gene network.

    abstract:BACKGROUND:The misregulation of microRNA (miRNA) has been shown to cause diseases. Recently, we have proposed a computational method based on a random walk framework on a miRNA-target gene network to predict disease-associated miRNAs. The prediction performance of our method is better than that of some existing state-o...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03578-3

    authors: Le DH,Tran TTH

    updated:2020-06-15 00:00:00

  • GibbsST: a Gibbs sampling method for motif discovery with enhanced resistance to local optima.

    abstract:BACKGROUND:Computational discovery of transcription factor binding sites (TFBS) is a challenging but important problem of bioinformatics. In this study, improvement of a Gibbs sampling based technique for TFBS discovery is attempted through an approach that is widely known, but which has never been investigated before:...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-486

    authors: Shida K

    updated:2006-11-04 00:00:00

  • Knowledge-guided multi-scale independent component analysis for biomarker identification.

    abstract:BACKGROUND:Many statistical methods have been proposed to identify disease biomarkers from gene expression profiles. However, from gene expression profile data alone, statistical methods often fail to identify biologically meaningful biomarkers related to a specific disease under study. In this paper, we develop a nove...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-416

    authors: Chen L,Xuan J,Wang C,Shih IeM,Wang Y,Zhang Z,Hoffman E,Clarke R

    updated:2008-10-06 00:00:00

  • Spot quantification in two dimensional gel electrophoresis image analysis: comparison of different approaches and presentation of a novel compound fitting algorithm.

    abstract:BACKGROUND:Various computer-based methods exist for the detection and quantification of protein spots in two dimensional gel electrophoresis images. Area-based methods are commonly used for spot quantification: an area is assigned to each spot and the sum of the pixel intensities in that area, the so-called volume, is ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-181

    authors: Brauner JM,Groemer TW,Stroebel A,Grosse-Holz S,Oberstein T,Wiltfang J,Kornhuber J,Maler JM

    updated:2014-06-11 00:00:00