Abstract:
BACKGROUND: Horizontal gene transfer, i.e. the acquisition of genetic material from a non-parent organism, is considered an important force driving species evolution. Many cases of horizontal gene transfer from prokaryotes to eukaryotes have been reported, but no transfer mechanism has been deciphered so far, although several studies have proposed viruses as possible vectors. In agreement with this idea, our previous study found that in two eukaryotic proteins a bacteriophage recombination site (AttP) was adjacent to regions originating via horizontal gene transfer. In one of those cases the AttP site was present inside the introns of cysteine-rich repeats. In the present study we aimed to apply computational tools to find multiple horizontal gene transfer events in large genome databases. For that purpose we used a sequence of cysteine-rich repeats to identify genes potentially acquired through horizontal transfer.
RESULTS: A HMMER remote similarity search detected 382 proteins containing cysteine-rich repeats at a significant level. All but 8 of these sequences belong to eukaryotes. Conserved structural domains were predicted in 124 proteins. Although cysteine-rich repeats are found almost exclusively in eukaryotic proteins, many of the predicted domains are most common in prokaryotes or bacteriophages. Ninety-eight of the 124 proteins contain typical prokaryotic domains; these proteins were considered as potentially originating via horizontal transfer. In addition, an HHblits search revealed that two domains of the same fungal protein, Glycoside hydrolase and Peptidase M15, are highly similar to proteins of two different prokaryotic species, hinting at independent horizontal gene transfer events.
CONCLUSIONS: Cysteine-rich repeats in eukaryotic proteins are usually accompanied by conserved domains typical of prokaryotes or bacteriophages. These proteins, containing both cysteine-rich repeats and characteristic prokaryotic domains, might represent multiple independent horizontal gene transfer events from prokaryotes to eukaryotes. We believe that the presence of a bacteriophage recombination site inside the cysteine-rich repeat coding sequence may facilitate horizontal gene transfer. Thus, the computational approach described in the present study can help to find multiple sequences that originated from horizontal transfer in eukaryotic genomes.
journal_name: BMC Bioinformatics
journal_title: BMC bioinformatics
authors: Daugavet MA, Shabelnikov SV, Podgornaya OI
doi: 10.1186/s12859-020-03599-y
subject: Has Abstract
pub_date: 2020-07-24 00:00:00
pages: 305
issue: Suppl 12
issn: 1471-2105
pii: 10.1186/s12859-020-03599-y
journal_volume: 21
pub_type: Journal article

abstract:BACKGROUND:Maize is a leading crop in the modern agricultural industry that accounts for more than 40% of grain production worldwide. The double haploid technique that uses fewer breeding generations for generating a maize line has accelerated the pace of development of superior commercial seed varieties and has been tran...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-018-2267-2
update_date:2018-08-13 00:00:00
abstract:BACKGROUND:The generation of multiple sequence alignments (MSAs) is a crucial step for many bioinformatic analyses. Thus improving MSA accuracy and identifying potential errors in MSAs is important for a wide range of post-genomic research. We present a novel method called MergeAlign which constructs consensus MSAs fro...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-13-117
update_date:2012-05-30 00:00:00
abstract:BACKGROUND:It has long been recognized that sensitivity analysis plays a key role in modeling and analyzing cellular and biochemical processes. Systems biology markup language (SBML) has become a well-known platform for coding and sharing mathematical models of such processes. However, current SBML compatible software ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-9-342
update_date:2008-08-15 00:00:00
abstract:BACKGROUND:While biomedical text mining is emerging as an important research area, practical results have proven difficult to achieve. We believe that an important first step towards more accurate text-mining lies in the ability to identify and characterize text that satisfies various types of information needs. We rep...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-7-356
update_date:2006-07-25 00:00:00
abstract:BACKGROUND:Inferring molecular pathway activity is an important step towards reducing the complexity of genomic data, understanding the heterogeneity in clinical outcome, and obtaining molecular correlates of cancer imaging traits. Increasingly, approaches towards pathway activity inference combine molecular profiles (...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-403
update_date:2011-10-19 00:00:00
abstract:BACKGROUND:Protein-DNA interactions are important for many cellular processes, however structural knowledge for a large fraction of known and putative complexes is still lacking. Computational docking methods aim at the prediction of complex architecture given detailed structures of its constituents. They are becoming ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-13-228
update_date:2012-09-11 00:00:00
abstract:BACKGROUND:Most state-of-the-art biomedical entity normalization systems, such as rule-based systems, merely rely on morphological information of entity mentions, but rarely consider their semantic information. In this paper, we introduce a novel convolutional neural network (CNN) architecture that regards biomedical e...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-017-1805-7
update_date:2017-10-03 00:00:00
abstract:BACKGROUND:Acquiring and exploring whole genome sequence information for a species under investigation is now a routine experimental approach. On most genome browsers, typically, only the DNA sequence, EST support, motif search results, and GO annotations are displayed. However, for many species, a growing...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-447
update_date:2011-11-15 00:00:00
abstract:BACKGROUND:In many biomedical applications, there is a need for developing classification models based on noisy annotations. Recently, various methods addressed this scenario by relying on unreliable annotations obtained from multiple sources. RESULTS:We proposed a probabilistic classification algorithm based on labe...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-14-S12-S5
update_date:2013-01-01 00:00:00
abstract:BACKGROUND:Antibiotics are widely prescribed drugs for children and are the most likely to be associated with adverse reactions. Records of adverse reactions and allergies from antibiotics considerably affect prescription choices. We consider this a biomedical decision-making problem and explore hidden knowledge in survey ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-15-S6-S7
update_date:2014-01-01 00:00:00
abstract:BACKGROUND:Infectious disease modeling and computational power have evolved such that large-scale agent-based models (ABMs) have become feasible. However, the increasing hardware complexity requires adapted software designs to achieve the full potential of current high-performance workstations. RESULTS:We have found l...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-015-0612-2
update_date:2015-06-02 00:00:00
abstract:BACKGROUND:DNA methylation is an important heritable epigenetic mark that plays a crucial role in transcriptional regulation and the pathogenesis of various human disorders. The commonly used DNA methylation measurement approaches, e.g., Illumina Infinium HumanMethylation-27 and -450 BeadChip arrays (27 K and 450 K arr...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-020-03865-z
update_date:2020-12-01 00:00:00
abstract:BACKGROUND:The computation of phylogenetic trees on the same set of species that are based on different orthologous genes can lead to incongruent trees. One possible explanation for this behavior is interspecific hybridization events recombining genes of different species. An important approach to analyze such events ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-015-0660-7
update_date:2015-07-30 00:00:00
abstract:We provide a 2007 update on the bioinformatics research in the Asia-Pacific from the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998. From 2002, APBioNet has organized the first International Conference on Bioinformatics (InCoB) bringing together scientists work...
journal_title:BMC bioinformatics
pub_type:
doi:10.1186/1471-2105-9-S1-S1
update_date:2008-01-01 00:00:00
abstract:BACKGROUND:To understand biological processes and diseases, it is crucial to unravel the concerted interplay of transcription factors (TFs), microRNAs (miRNAs) and their targets within regulatory networks and fundamental sub-networks. An integrative computational resource generating a comprehensive view of these regula...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-67
update_date:2011-03-04 00:00:00
abstract:BACKGROUND:Protein sequence alignment analyses have become a crucial step for many bioinformatics studies during the past decades. Multiple sequence alignment (MSA) and pair-wise sequence alignment (PSA) are two major approaches in sequence alignment. Former benchmark studies revealed drawbacks of MSA methods on nucleo...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-018-2524-4
update_date:2018-12-31 00:00:00
abstract:BACKGROUND:Formal classification of a large collection of protein structures aids the understanding of evolutionary relationships among them. Classifications involving manual steps, such as SCOP and CATH, face the challenge of increasing volume of available structures. Automatic methods such as FSSP or Dali Domain Dict...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-9-74
update_date:2008-01-31 00:00:00
abstract:BACKGROUND:Recent advances in global genomic profiling methodologies have enabled multi-dimensional characterization of biological systems. Complete analysis of these genomic profiles requires an in-depth look at parallel profiles of segmental DNA copy number status, DNA methylation state, single nucleotide polymorphism...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-9-243
update_date:2008-05-20 00:00:00
abstract:BACKGROUND:The efficient and robust statistical analysis of the shape of plant organs of different cultivars is an important issue in plant breeding and enables a robust cultivar description within the breeding progress. Laser scanning is a highly accurate and high resolution technique to acquire the 3D sh...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-020-03654-8
update_date:2020-07-29 00:00:00
abstract:BACKGROUND:The canonical code, although prevailing in complex genomes, is not universal. The canonical genetic code has been shown to be more robust than random codes, but it is not clearly determined how it evolved towards its current form. The error minimization theory considers the minimization of point mutat...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-017-1608-x
update_date:2017-03-27 00:00:00
abstract:BACKGROUND:The goal of class prediction studies is to develop rules to accurately predict the class membership of new samples. The rules are derived using the values of the variables available for each subject: the main characteristic of high-dimensional data is that the number of variables greatly exceeds the number o...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-11-523
update_date:2010-10-20 00:00:00
abstract:BACKGROUND:The PathoLogic program constructs Pathway/Genome databases by using a genome's annotation to predict the set of metabolic pathways present in an organism. PathoLogic determines the set of reactions composing those pathways from the enzymes annotated in the organism's genome. Most annotation efforts fail to a...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-5-76
update_date:2004-06-09 00:00:00
abstract:BACKGROUND:In addition to single-locus (main) effects of disease variants, there is a growing consensus that gene-gene and gene-environment interactions may play important roles in disease etiology. However, for the very large numbers of genetic markers currently in use, it has proven difficult to develop suitable and ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-10-S1-S75
update_date:2009-01-30 00:00:00
abstract:BACKGROUND:Mechanistic models that describe the dynamical behaviors of biochemical systems are common in computational systems biology, especially in the realm of cellular signaling. The development of families of such models, either by a single research group or by different groups working within the same area, presen...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-15-316
update_date:2014-09-25 00:00:00
abstract:BACKGROUND:Microarray co-expression signatures are an important tool for studying gene function and relations between genes. In addition to genuine biological co-expression, correlated signals can result from technical deficiencies like hybridization of reporters with off-target transcripts. An approach that is able to...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-8-461
update_date:2007-11-26 00:00:00
abstract:BACKGROUND:New sequencing techniques require new visualization strategies, as is the case for epigenomics data such as DNA base modifications, small non-coding RNAs, and histone modifications. RESULTS:We present a set of plugins for the genome browser JBrowse that are targeted for epigenomics visualizations. Specifica...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-018-2160-z
update_date:2018-04-25 00:00:00
abstract:BACKGROUND:We carried out an analysis of intron length conservation across a diverse group of nineteen mammalian species. Motivated by recent research suggesting a role for time delays associated with intron transcription in gene expression oscillations required for early embryonic patterning, we searched for examples ...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-S9-S16
update_date:2011-10-05 00:00:00
abstract:BACKGROUND:Viral infectious diseases are a serious threat to human health. Receptor binding is the first step in the viral infection of hosts. To more effectively treat human viral infectious diseases, the hidden virus-receptor interactions must be discovered. However, current computational methods for predicti...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-019-3278-3
update_date:2019-12-27 00:00:00
abstract:BACKGROUND:Genome-wide transcriptional profiling of patient blood samples offers a powerful tool to investigate underlying disease mechanisms and personalized treatment decisions. Most studies are based on analysis of total peripheral blood mononuclear cells (PBMCs), a mixed population. In this case, accuracy is inhere...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/1471-2105-12-258
update_date:2011-06-24 00:00:00
abstract:BACKGROUND:Recently copy number variation (CNV) has gained considerable interest as a type of genomic/genetic variation that plays an important role in disease susceptibility. Advances in sequencing technology have created an opportunity for detecting CNVs more accurately. Recently whole exome sequencing (WES) has beco...
journal_title:BMC bioinformatics
pub_type: Journal article
doi:10.1186/s12859-017-1705-x
update_date:2017-05-31 00:00:00