ModuleOrganizer: detecting modules in families of transposable elements.

Abstract:

BACKGROUND: Most known eukaryotic genomes contain mobile, repeatedly copied elements called transposable elements. In some species, these elements account for the majority of the genome sequence. They have been subject to many mutations and other genomic events (copies, deletions, captures) during transposition, and identifying these transformations remains difficult. The study of families of transposable elements is generally founded on a multiple alignment of their sequences, a critical step that is suited to transposons containing mostly localized nucleotide mutations. Many transposons that have lost their protein-coding capacity have undergone more complex rearrangements, which calls for more elaborate methods to characterize the architecture of sequence variations. RESULTS: In this study, we introduce the concept of a transposable element module, a flexible motif present in at least two sequences of a family of transposable elements and built on a succession of maximal repeats. The paper proposes an assembly method that works on the set of exact maximal repeats of a set of sequences to create such modules. It results in a graphical view of sequences segmented into modules, a representation that allows a flexible analysis of the transformations that have occurred between them. As a demonstration data set, we have chosen an in-depth analysis of the transposable element Foldback in Drosophila melanogaster. Comparison with multiple alignment methods shows that our method is more sensitive for highly variable sequences. The study of this family and of two other families, AtREP21 and SIDER2, reveals new copies of very different sizes and various combinations of modules, which shows the potential of our method. CONCLUSIONS: ModuleOrganizer is available on the Genouest bioinformatics center at http://moduleorganizer.genouest.org.
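
To make the notion of an exact maximal repeat shared by at least two sequences more concrete, here is a minimal Python sketch. It is not ModuleOrganizer's algorithm: it is a simple quadratic pairwise comparison, and the function names (`maximal_exact_matches`, `shared_repeats`), the `min_len` threshold and the toy sequences are all assumptions made for illustration. It only finds the repeats; assembling them into modules is not attempted.

```python
from itertools import combinations


def maximal_exact_matches(a, b, min_len=4):
    """Return (start_a, start_b, length) for each exact match between a and b
    that cannot be extended to the left or to the right.

    Hypothetical helper: a dynamic-programming scan over common suffixes,
    keeping only two rows of the table at a time.
    """
    matches = []
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Length of the longest common suffix ending at a[i-1], b[j-1];
                # by construction it is already maximal on the left.
                curr[j] = prev[j - 1] + 1
                right_maximal = i == len(a) or j == len(b) or a[i] != b[j]
                if right_maximal and curr[j] >= min_len:
                    matches.append((i - curr[j], j - curr[j], curr[j]))
        prev = curr
    return matches


def shared_repeats(seqs, min_len=4):
    """Map each repeated word to the set of sequence names that contain it.

    `seqs` is a dict {name: sequence}; every pair of sequences is compared.
    """
    found = {}
    for (name_a, a), (name_b, b) in combinations(seqs.items(), 2):
        for start_a, _start_b, length in maximal_exact_matches(a, b, min_len):
            word = a[start_a:start_a + length]
            found.setdefault(word, set()).update({name_a, name_b})
    return found


if __name__ == "__main__":
    # Toy family of three made-up sequences (not real transposon copies).
    family = {"s1": "TTACGTACGG", "s2": "CCACGTACAA", "s3": "GGGGGGGG"}
    for word, names in shared_repeats(family).items():
        print(word, sorted(names))
```

The quadratic table keeps the sketch short and easy to check by hand. For real families of long sequences, an index-based approach such as a suffix tree or suffix array would be the usual choice.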

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Tempel S,Rousseau C,Tahi F,Nicolas J

doi

10.1186/1471-2105-11-474

pub_date

2010-09-22 00:00:00

pages

474

issn

1471-2105

pii

1471-2105-11-474

journal_volume

11

pub_type

Journal article
  • Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments.

    abstract:BACKGROUND:High-throughput sequencing technologies, such as the Illumina Genome Analyzer, are powerful new tools for investigating a wide range of biological and medical questions. Statistical and computational methods are key for drawing meaningful and accurate conclusions from the massive and complex datasets generat...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-11-94

    authors: Bullard JH,Purdom E,Hansen KD,Dudoit S

    update_date:2010-02-18 00:00:00

  • Random generalized linear model: a highly accurate and interpretable ensemble predictor.

    abstract:BACKGROUND:Ensemble predictors such as the random forest are known to have superior accuracy but their black-box predictions are difficult to interpret. In contrast, a generalized linear model (GLM) is very interpretable especially when forward feature selection is used to construct the model. However, forward feature ...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-14-5

    authors: Song L,Langfelder P,Horvath S

    update_date:2013-01-16 00:00:00

  • NeurphologyJ: an automatic neuronal morphology quantification method and its application in pharmacological discovery.

    abstract:BACKGROUND:Automatic quantification of neuronal morphology from images of fluorescence microscopy plays an increasingly important role in high-content screenings. However, there exist very few freeware tools and methods which provide automatic neuronal morphology quantification for pharmacological discovery. RESULTS:T...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-12-230

    authors: Ho SY,Chao CY,Huang HL,Chiu TW,Charoenkwan P,Hwang E

    update_date:2011-06-08 00:00:00

  • hsegHMM: hidden Markov model-based allele-specific copy number alteration analysis accounting for hypersegmentation.

    abstract:BACKGROUND:Somatic copy number alteration (SCNA) is a common feature of the cancer genome and is associated with cancer etiology and prognosis. The allele-specific SCNA analysis of a tumor sample aims to identify the allele-specific copy numbers of both alleles, adjusting for the ploidy and the tumor purity. Next gene...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/s12859-018-2412-y

    authors: Choo-Wosoba H,Albert PS,Zhu B

    update_date:2018-11-14 00:00:00

  • An automated method for rapid identification of putative gene family members in plants.

    abstract:BACKGROUND:Gene duplication events have played a significant role in genome evolution, particularly in plants. Exhaustive searches for all members of a known gene family as well as the identification of new gene families has become increasingly important. Subfunctionalization via changes in regulatory sequences followi...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-7-S2-S19

    authors: Frank RL,Mane A,Ercal F

    update_date:2006-09-06 00:00:00

  • CONSTAX: a tool for improved taxonomic resolution of environmental fungal ITS sequences.

    abstract:BACKGROUND:One of the most crucial steps in high-throughput sequence-based microbiome studies is the taxonomic assignment of sequences belonging to operational taxonomic units (OTUs). Without taxonomic classification, functional and biological information of microbial communities cannot be inferred or interpreted. The ...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/s12859-017-1952-x

    authors: Gdanetz K,Benucci GMN,Vande Pol N,Bonito G

    update_date:2017-12-06 00:00:00

  • JISTIC: identification of significant targets in cancer.

    abstract:BACKGROUND:Cancer is caused through a multistep process, in which a succession of genetic changes, each conferring a competitive advantage for growth and proliferation, leads to the progressive conversion of normal human cells into malignant cancer cells. Interrogation of cancer genomes holds the promise of understandi...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-11-189

    authors: Sanchez-Garcia F,Akavia UD,Mozes E,Pe'er D

    update_date:2010-04-14 00:00:00

  • Species-specific analysis of protein sequence motifs using mutual information.

    abstract:BACKGROUND:Protein sequence motifs are by definition short fragments of conserved amino acids, often associated with a specific function. Accordingly protein sequence profiles derived from multiple sequence alignments provide an alternative description of functional motifs characterizing families of related sequences. ...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-6-164

    authors: Hummel J,Keshvari N,Weckwerth W,Selbig J

    update_date:2005-06-29 00:00:00

  • Efficient computation of absent words in genomic sequences.

    abstract:BACKGROUND:Analysis of sequence composition is a routine task in genome research. Organisms are characterized by their base composition, dinucleotide relative abundance, codon usage, and so on. Unique subsequences are markers of special interest in genome comparison, expression profiling, and genetic engineering. Relat...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-9-167

    authors: Herold J,Kurtz S,Giegerich R

    update_date:2008-03-26 00:00:00

  • Temporal dynamics of protein complexes in PPI networks: a case study using yeast cell cycle dynamics.

    abstract:Complexes of physically interacting proteins are one of the fundamental functional units responsible for driving key biological mechanisms within the cell. With the advent of high-throughput techniques, significant amount of protein interaction (PPI) data has been catalogued for organisms such as yeast, which has in t...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-13-S17-S16

    authors: Srihari S,Leong HW

    update_date:2012-01-01 00:00:00

  • Bias detection and correction in RNA-Sequencing data.

    abstract:BACKGROUND:High throughput sequencing technology provides us unprecedented opportunities to study transcriptome dynamics. Compared to microarray-based gene expression profiling, RNA-Seq has many advantages, such as high resolution, low background, and ability to identify novel transcripts. Moreover, for genes with mult...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-12-290

    authors: Zheng W,Chung LM,Zhao H

    update_date:2011-07-19 00:00:00

  • Automatic localization and identification of mitochondria in cellular electron cryo-tomography using faster-RCNN.

    abstract:BACKGROUND:Cryo-electron tomography (cryo-ET) enables the 3D visualization of cellular organization in near-native state which plays important roles in the field of structural cell biology. However, due to the low signal-to-noise ratio (SNR), large volume and high content complexity within cells, it remains difficult a...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/s12859-019-2650-7

    authors: Li R,Zeng X,Sigmund SE,Lin R,Zhou B,Liu C,Wang K,Jiang R,Freyberg Z,Lv H,Xu M

    update_date:2019-03-29 00:00:00

  • Markov clustering versus affinity propagation for the partitioning of protein interaction graphs.

    abstract:BACKGROUND:Genome scale data on protein interactions are generally represented as large networks, or graphs, where hundreds or thousands of proteins are linked to one another. Since proteins tend to function in groups, or complexes, an important goal has been to reliably identify protein complexes from these graphs. Th...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-10-99

    authors: Vlasblom J,Wodak SJ

    update_date:2009-03-30 00:00:00

  • Local functional descriptors for surface comparison based binding prediction.

    abstract:BACKGROUND:Molecular recognition in proteins occurs due to appropriate arrangements of physical, chemical, and geometric properties of an atomic surface. Similar surface regions should create similar binding interfaces. Effective methods for comparing surface regions can be used in identifying similar regions, and to p...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-13-314

    authors: Cipriano GM,Phillips GN Jr,Gleicher M

    update_date:2012-11-24 00:00:00

  • DNAscan: personal computer compatible NGS analysis, annotation and visualisation.

    abstract:BACKGROUND:Next Generation Sequencing (NGS) is a commonly used technology for studying the genetic basis of biological processes and it underpins the aspirations of precision medicine. However, there are significant challenges when dealing with NGS data. Firstly, a huge number of bioinformatics tools for a wide range o...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/s12859-019-2791-8

    authors: Iacoangeli A,Al Khleifat A,Sproviero W,Shatunov A,Jones AR,Morgan SL,Pittman A,Dobson RJ,Newhouse SJ,Al-Chalabi A

    update_date:2019-04-27 00:00:00

  • Quartet decomposition server: a platform for analyzing phylogenetic trees.

    abstract:BACKGROUND:The frequent exchange of genetic material among prokaryotes means that extracting a majority or plurality phylogenetic signal from many gene families, and the identification of gene families that are in significant conflict with the plurality signal is a frequent task in comparative genomics, and especially ...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-13-123

    authors: Mao F,Williams D,Zhaxybayeva O,Poptsova M,Lapierre P,Gogarten JP,Xu Y

    update_date:2012-06-07 00:00:00

  • MultiLoc2: integrating phylogeny and Gene Ontology terms improves subcellular protein localization prediction.

    abstract:BACKGROUND:Knowledge of subcellular localization of proteins is crucial to proteomics, drug target discovery and systems biology since localization and biological function are highly correlated. In recent years, numerous computational prediction methods have been developed. Nevertheless, there is still a need for predi...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-10-274

    authors: Blum T,Briesemeister S,Kohlbacher O

    update_date:2009-09-01 00:00:00

  • Domain fusion analysis by applying relational algebra to protein sequence and domain databases.

    abstract:BACKGROUND:Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain datab...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-4-16

    authors: Truong K,Ikura M

    update_date:2003-05-06 00:00:00

  • Robust detection of periodic time series measured from biological systems.

    abstract:BACKGROUND:Periodic phenomena are widespread in biology. The problem of finding periodicity in biological time series can be viewed as a multiple hypothesis testing of the spectral content of a given time series. The exact noise characteristics are unknown in many bioinformatics applications. Furthermore, the observed ...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-6-117

    authors: Ahdesmäki M,Lähdesmäki H,Pearson R,Huttunen H,Yli-Harja O

    update_date:2005-05-13 00:00:00

  • A novel parametric approach to mine gene regulatory relationship from microarray datasets.

    abstract:BACKGROUND:Microarray has been widely used to measure the gene expression level on the genome scale in the current decade. Many algorithms have been developed to reconstruct gene regulatory networks based on microarray data. Unfortunately, most of these models and algorithms focus on global properties of the expression...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-11-S11-S15

    authors: Liu W,Li D,Liu Q,Zhu Y,He F

    update_date:2010-12-14 00:00:00

  • Network hub-node prioritization of gene regulation with intra-network association.

    abstract:BACKGROUND:To identify and prioritize the influential hub genes in a gene-set or biological pathway, most analyses rely on calculation of marginal effects or tests of statistical significance. These procedures may be inappropriate since hub nodes are common connection points and therefore may interact with other nodes ...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/s12859-020-3444-7

    authors: Chang HC,Chu CP,Lin SJ,Hsiao CK

    update_date:2020-03-12 00:00:00

  • The EnzymeTracker: an open-source laboratory information management system for sample tracking.

    abstract:BACKGROUND:In many laboratories, researchers store experimental data on their own workstation using spreadsheets. However, this approach poses a number of problems, ranging from sharing issues to inefficient data-mining. Standard spreadsheets are also error-prone, as data do not undergo any validation process. To overc...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-13-15

    authors: Triplet T,Butler G

    update_date:2012-01-26 00:00:00

  • Discovering functional interaction patterns in protein-protein interaction networks.

    abstract:BACKGROUND:In recent years, a considerable amount of research effort has been directed to the analysis of biological networks with the availability of genome-scale networks of genes and/or proteins of an increasing number of organisms. A protein-protein interaction (PPI) network is a particular biological network which...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-9-276

    authors: Turanalp ME,Can T

    update_date:2008-06-11 00:00:00

  • Selection of optimal reference genes for normalization in quantitative RT-PCR.

    abstract:BACKGROUND:Normalization in real-time qRT-PCR is necessary to compensate for experimental variation. A popular normalization strategy employs reference gene(s), which may introduce additional variability into normalized expression levels due to innate variation (between tissues, individuals, etc). To minimize this inna...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-11-253

    authors: Chervoneva I,Li Y,Schulz S,Croker S,Wilson C,Waldman SA,Hyslop T

    update_date:2010-05-14 00:00:00

  • PubFocus: semantic MEDLINE/PubMed citations analytics through integration of controlled biomedical dictionaries and ranking algorithm.

    abstract:BACKGROUND:Understanding research activity within any given biomedical field is important. Search outputs generated by MEDLINE/PubMed are not well classified and require lengthy manual citation analysis. Automation of citation analytics can be very useful and timesaving for both novices and experts. RESULTS:PubFocus w...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-7-424

    authors: Plikus MV,Zhang Z,Chuong CM

    update_date:2006-10-02 00:00:00

  • Efficient estimation of grouped survival models.

    abstract:BACKGROUND:Time- and dose-to-event phenotypes used in basic science and translational studies are commonly measured imprecisely or incompletely due to limitations of the experimental design or data collection schema. For example, drug-induced toxicities are not reported by the actual time or dose triggering the event, ...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/s12859-019-2899-x

    authors: Li Z,Lin J,Sibley AB,Truong T,Chua KC,Jiang Y,McCarthy J,Kroetz DL,Allen A,Owzar K

    update_date:2019-05-28 00:00:00

  • Considering scores between unrelated proteins in the search database improves profile comparison.

    abstract:BACKGROUND:Profile-based comparison of multiple sequence alignments is a powerful methodology for the detection of remote protein sequence similarity, which is essential for the inference and analysis of protein structure, function, and evolution. Accurate estimation of statistical significance of detected profile similar...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-10-399

    authors: Sadreyev RI,Wang Y,Grishin NV

    update_date:2009-12-04 00:00:00

  • SAlign-a structure aware method for global PPI network alignment.

    abstract:BACKGROUND:High throughput experiments have generated a significantly large amount of protein interaction data, which is being used to study protein networks. Studying complete protein networks can reveal more insight about healthy/disease states than studying proteins in isolation. Similarly, a comparative study of pr...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/s12859-020-03827-5

    authors: Ayub U,Haider I,Naveed H

    update_date:2020-11-04 00:00:00

  • Phylogenomics and sequence-structure-function relationships in the GmrSD family of Type IV restriction enzymes.

    abstract:BACKGROUND:GmrSD is a modification-dependent restriction endonuclease that specifically targets and cleaves glucosylated hydroxymethylcytosine (glc-HMC) modified DNA. It is encoded either as two separate single-domain GmrS and GmrD proteins or as a single protein carrying both domains. Previous studies suggested that G...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/s12859-015-0773-z

    authors: Machnicka MA,Kaminska KH,Dunin-Horkawicz S,Bujnicki JM

    update_date:2015-10-23 00:00:00

  • On pairwise distances and median score of three genomes under DCJ.

    abstract:In comparative genomics, the rearrangement distance between two genomes (equal the minimal number of genome rearrangements required to transform them into a single genome) is often used for measuring their evolutionary remoteness. Generalization of this measure to three genomes is known as the median score (while a re...

    journal_title:BMC bioinformatics

    pub_type: Journal article

    doi:10.1186/1471-2105-13-S19-S1

    authors: Aganezov S Jr,Alekseyev MA

    update_date:2012-01-01 00:00:00