Phylogenomics and sequence-structure-function relationships in the GmrSD family of Type IV restriction enzymes.

Abstract:

BACKGROUND: GmrSD is a modification-dependent restriction endonuclease that specifically targets and cleaves glucosylated hydroxymethylcytosine (glc-HMC) modified DNA. It is encoded either as two separate single-domain GmrS and GmrD proteins or as a single protein carrying both domains. Previous studies suggested that GmrS acts as an endonuclease and NTPase, whereas GmrD binds DNA.

METHODS: In this work we applied homology detection, sequence conservation analysis, fold recognition and homology modeling methods to study sequence-structure-function relationships in the GmrSD family of restriction endonucleases. We also analyzed the phylogeny and genomic context of the family members.

RESULTS: Our comparative genomics study shows that GmrS exhibits similarity to proteins of the ParB/Srx fold, which can have both NTPase and nuclease activity. In contrast to previous studies, however, we also attribute nuclease activity to GmrD, as we found it to contain the HNH endonuclease motif. We identified residues potentially important for structure and function in both domains. Moreover, we found that GmrSD systems exist predominantly in a fused, double-domain form rather than as a heterodimer, and that their homologs are often encoded in regions enriched in defense and gene mobility-related elements. Finally, phylogenetic reconstructions of the GmrS and GmrD domains revealed that they coevolved and that only a few GmrSD systems appear to be assembled from distantly related GmrS and GmrD components.

CONCLUSIONS: Our study provides insight into sequence-structure-function relationships in this still poorly characterized family of Type IV restriction enzymes. Comparative genomics allowed us to propose a possible role for the GmrD domain in the function of the GmrSD enzyme and possible active sites of both the GmrS and GmrD domains. The presented results can guide further experimental characterization of these enzymes.

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Machnicka MA,Kaminska KH,Dunin-Horkawicz S,Bujnicki JM

doi

10.1186/s12859-015-0773-z

pub_date

2015-10-23 00:00:00

pages

336

issn

1471-2105

pii

10.1186/s12859-015-0773-z

journal_volume

16

pub_type

Journal Article
  • DECA: scalable XHMM exome copy-number variant calling with ADAM and Apache Spark.

    abstract:BACKGROUND:XHMM is a widely used tool for copy-number variant (CNV) discovery from whole exome sequencing data but can require hours to days to run for large cohorts. A more scalable implementation would reduce the need for specialized computational resources and enable increased exploration of the configuration parame...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3108-7

    authors: Linderman MD,Chia D,Wallace F,Nothaft FA

    update_date:2019-10-11 00:00:00

  • Phylogenetic detection of conserved gene clusters in microbial genomes.

    abstract:BACKGROUND:Microbial genomes contain an abundance of genes with conserved proximity forming clusters on the chromosome. However, the conservation can be a result of many factors such as vertical inheritance, or functional selection. Thus, identification of conserved gene clusters that are under functional selection pro...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-243

    authors: Zheng Y,Anton BP,Roberts RJ,Kasif S

    update_date:2005-10-03 00:00:00

  • Maximum expected accuracy structural neighbors of an RNA secondary structure.

    abstract:BACKGROUND:Since RNA molecules regulate genes and control alternative splicing by allostery, it is important to develop algorithms to predict RNA conformational switches. Some tools, such as paRNAss, RNAshapes and RNAbor, can be used to predict potential conformational switches; nevertheless, no existent tool can detec...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S5-S6

    authors: Clote P,Lou F,Lorenz WA

    update_date:2012-04-12 00:00:00

  • Dynamic changes in the secondary structure of ECE-1 and XCE account for their different substrate specificities.

    abstract:BACKGROUND:X-converting enzyme (XCE), involved in nervous control of respiration, is a member of the M13 family of zinc peptidases, for which no natural substrate has been identified yet. In contrast, its well-characterized homologue endothelin-converting enzyme-1 (ECE-1) showed broad substrate specificity and acts as ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-285

    authors: Ul-Haq Z,Iqbal S,Moin ST

    update_date:2012-11-01 00:00:00

  • Hierarchical structure and modules in the Escherichia coli transcriptional regulatory network revealed by a new top-down approach.

    abstract:BACKGROUND:Cellular functions are coordinately carried out by groups of genes forming functional modules. Identifying such modules in the transcriptional regulatory network (TRN) of organisms is important for understanding the structure and function of these fundamental cellular networks and essential for the emerging ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-5-199

    authors: Ma HW,Buer J,Zeng AP

    update_date:2004-12-16 00:00:00

  • Quick, "imputation-free" meta-analysis with proxy-SNPs.

    abstract:BACKGROUND:Meta-analysis (MA) is widely used to pool genome-wide association studies (GWASes) in order to a) increase the power to detect strong or weak genotype effects or b) as a result verification method. As a consequence of differing SNP panels among genotyping chips, imputation is the method of choice within GWAS...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-231

    authors: Meesters C,Leber M,Herold C,Angisch M,Mattheisen M,Drichel D,Lacour A,Becker T

    update_date:2012-09-12 00:00:00

  • Identification of consensus RNA secondary structures using suffix arrays.

    abstract:BACKGROUND:The identification of a consensus RNA motif often consists in finding a conserved secondary structure with minimum free energy in an ensemble of aligned sequences. However, an alignment is often difficult to obtain without prior structural information. Thus the need for tools to automate this process. RESUL...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-244

    authors: Anwar M,Nguyen T,Turcotte M

    update_date:2006-05-05 00:00:00

  • TCGA2BED: extracting, extending, integrating, and querying The Cancer Genome Atlas.

    abstract:BACKGROUND:Data extraction and integration methods are becoming essential to effectively access and take advantage of the huge amounts of heterogeneous genomics and clinical data increasingly available. In this work, we focus on The Cancer Genome Atlas, a comprehensive archive of tumoral data containing the results of ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-016-1419-5

    authors: Cumbo F,Fiscon G,Ceri S,Masseroli M,Weitschek E

    update_date:2017-01-03 00:00:00

  • Insertion and deletion correcting DNA barcodes based on watermarks.

    abstract:BACKGROUND:Barcode multiplexing is a key strategy for sharing the rising capacity of next-generation sequencing devices: Synthetic DNA tags, called barcodes, are attached to natural DNA fragments within the library preparation procedure. Different libraries can individually be labeled with barcodes for a joint sequenc...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0482-7

    authors: Kracht D,Schober S

    update_date:2015-02-18 00:00:00

  • GenHtr: a tool for comparative assessment of genetic heterogeneity in microbial genomes generated by massive short-read sequencing.

    abstract:BACKGROUND:Microevolution is the study of short-term changes of alleles within a population and their effects on the phenotype of organisms. The result of the below-species-level evolution is heterogeneity, where populations consist of subpopulations with a large number of structural variations. Heterogeneity analysis ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-508

    authors: Yu G

    update_date:2010-10-12 00:00:00

  • Predicting protein functions by relaxation labelling protein interaction network.

    abstract:BACKGROUND:One of the key issues in the post-genomic era is to assign functions to uncharacterized proteins. Proteins seldom act alone; rather, they must interact with other biomolecular units to execute their functions. Thus, the functions of unknown proteins may be discovered through studying their interactions wit...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-S1-S64

    authors: Hu P,Jiang H,Emili A

    update_date:2010-01-18 00:00:00

  • PubFocus: semantic MEDLINE/PubMed citations analytics through integration of controlled biomedical dictionaries and ranking algorithm.

    abstract:BACKGROUND:Understanding research activity within any given biomedical field is important. Search outputs generated by MEDLINE/PubMed are not well classified and require lengthy manual citation analysis. Automation of citation analytics can be very useful and timesaving for both novices and experts. RESULTS:PubFocus w...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-424

    authors: Plikus MV,Zhang Z,Chuong CM

    update_date:2006-10-02 00:00:00

  • Notos - a galaxy tool to analyze CpN observed expected ratios for inferring DNA methylation types.

    abstract:BACKGROUND:DNA methylation patterns store epigenetic information in the vast majority of eukaryotic species. The relatively high costs and technical challenges associated with the detection of DNA methylation however have created a bias in the number of methylation studies towards model organisms. Consequently, it rema...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2115-4

    authors: Bulla I,Aliaga B,Lacal V,Bulla J,Grunau C,Chaparro C

    update_date:2018-03-27 00:00:00

  • Gene set enrichment meta-learning analysis: next-generation sequencing versus microarrays.

    abstract:BACKGROUND:Reproducibility of results can have a significant impact on the acceptance of new technologies in gene expression analysis. With the recent introduction of the so-called next-generation sequencing (NGS) technology and established microarrays, one is able to choose between two completely different platforms f...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-176

    authors: Stiglic G,Bajgot M,Kokol P

    update_date:2010-04-08 00:00:00

  • 3DScapeCS: application of three dimensional, parallel, dynamic network visualization in Cytoscape.

    abstract:BACKGROUND:The exponential growth of gigantic biological data from various sources, such as protein-protein interaction (PPI), genome sequences scaffolding, Mass spectrometry (MS) molecular networking and metabolic flux, demands an efficient way for better visualization and interpretation beyond the conventional, two-d...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-322

    authors: Wang Q,Tang B,Song L,Ren B,Liang Q,Xie F,Zhuo Y,Liu X,Zhang L

    update_date:2013-11-14 00:00:00

  • Tandem repeats discovery service (TReaDS) applied to finding novel cis-acting factors in repeat expansion diseases.

    abstract:BACKGROUND:Tandem repeats are multiple duplications of substrings in the DNA that occur contiguously, or at a short distance, and may involve some mutations (such as substitutions, insertions, and deletions). Tandem repeats have been extensively studied also for their association with the class of repeat expansion dise...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S4-S3

    authors: Pellegrini M,Renda ME,Vecchio A

    update_date:2012-03-28 00:00:00

  • A multiple-alignment based primer design algorithm for genetically highly variable DNA targets.

    abstract:BACKGROUND:Primer design for highly variable DNA sequences is difficult, and experimental success requires attention to many interacting constraints. The advent of next-generation sequencing methods allows the investigation of rare variants otherwise hidden deep in large populations, but requires attention to populatio...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-255

    authors: Brodin J,Krishnamoorthy M,Athreya G,Fischer W,Hraber P,Gleasner C,Green L,Korber B,Leitner T

    update_date:2013-08-21 00:00:00

  • Assessing and predicting protein interactions by combining manifold embedding with multiple information integration.

    abstract:BACKGROUND:Protein-protein interactions (PPIs) play crucial roles in virtually every aspect of cellular function within an organism. Over the last decade, the development of novel high-throughput techniques has resulted in enormous amounts of data and provided valuable resources for studying protein interactions. Howev...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S7-S3

    authors: Lei YK,You ZH,Ji Z,Zhu L,Huang DS

    update_date:2012-05-08 00:00:00

  • Comparative evaluation of gene-set analysis methods.

    abstract:BACKGROUND:Multiple data-analytic methods have been proposed for evaluating gene-expression levels in specific biological pathways, assessing differential expression associated with a binary phenotype. Following Goeman and Bühlmann's recent review, we compared statistical performance of three methods, namely Global Tes...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-431

    authors: Liu Q,Dinu I,Adewale AJ,Potter JD,Yasui Y

    update_date:2007-11-07 00:00:00

  • Filling out the structural map of the NTF2-like superfamily.

    abstract:BACKGROUND:The NTF2-like superfamily is a versatile group of protein domains sharing a common fold. The sequences of these domains are very diverse and they share no common sequence motif. These domains serve a range of different functions within the proteins in which they are found, including both catalytic and non-ca...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-327

    authors: Eberhardt RY,Chang Y,Bateman A,Murzin AG,Axelrod HL,Hwang WC,Aravind L

    update_date:2013-11-19 00:00:00

  • PVT: an efficient computational procedure to speed up next-generation sequence analysis.

    abstract:BACKGROUND:High-throughput Next-Generation Sequencing (NGS) techniques are advancing genomics and molecular biology research. This technology generates very large volumes of data, which poses a major challenge to scientists seeking an efficient, cost- and time-effective solution to analyse such data. Further, for the dif...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-167

    authors: Maji RK,Sarkar A,Khatua S,Dasgupta S,Ghosh Z

    update_date:2014-06-04 00:00:00

  • Widespread evidence of viral miRNAs targeting host pathways.

    abstract:BACKGROUND:MicroRNAs (miRNA) are regulatory genes that target and repress other RNA molecules via sequence-specific binding. Several biological processes are regulated across many organisms by evolutionarily conserved miRNAs. Plants and invertebrates employ their miRNA in defense against viruses by targeting and degrad...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-S2-S3

    authors: Carl JW Jr,Trgovcich J,Hannenhalli S

    update_date:2013-01-01 00:00:00

  • Extracting predictors for lung adenocarcinoma based on Granger causality test and stepwise character selection.

    abstract:BACKGROUND:Lung adenocarcinoma is the most common type of lung cancer, with high mortality worldwide. Its occurrence and development were thoroughly studied by high-throughput expression microarray, which produced abundant data on gene expression, DNA methylation, and miRNA quantification. However, the hub genes, which...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2739-z

    authors: Fan X,Wang Y,Tang XQ

    update_date:2019-05-01 00:00:00

  • Detecting lateral gene transfers by statistical reconciliation of phylogenetic forests.

    abstract:BACKGROUND:To understand the evolutionary role of Lateral Gene Transfer (LGT), accurate methods are needed to identify transferred genes and infer their timing of acquisition. Phylogenetic methods are particularly promising for this purpose, but the reconciliation of a gene tree with a reference (species) tree is compu...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-324

    authors: Abby SS,Tannier E,Gouy M,Daubin V

    update_date:2010-06-15 00:00:00

  • A web services choreography scenario for interoperating bioinformatics applications.

    abstract:BACKGROUND:Very often genome-wide data analysis requires the interoperation of multiple databases and analytic tools. A large number of genome databases and bioinformatics applications are available through the web, but it is difficult to automate interoperation because: 1) the platforms on which the applications run a...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-5-25

    authors: de Knikker R,Guo Y,Li JL,Kwan AK,Yip KY,Cheung DW,Cheung KH

    update_date:2004-03-10 00:00:00

  • Analysis of Bovine Viral Diarrhea Viruses-infected monocytes: identification of cytopathic and non-cytopathic biotype differences.

    abstract:BACKGROUND:Bovine Viral Diarrhea Virus (BVDV) infection is widespread in cattle worldwide, causing important economic losses. Pathogenesis of the disease caused by BVDV is complex, as each BVDV strain has two biotypes: non-cytopathic (ncp) and cytopathic (cp). BVDV can cause a persistent latent infection and immune sup...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-S6-S9

    authors: Ammari M,McCarthy FM,Nanduri B,Pinchuk LM

    update_date:2010-10-07 00:00:00

  • A high-throughput de novo sequencing approach for shotgun proteomics using high-resolution tandem mass spectrometry.

    abstract:BACKGROUND:High-resolution tandem mass spectra can now be readily acquired with hybrid instruments, such as LTQ-Orbitrap and LTQ-FT, in high-throughput shotgun proteomics workflows. The improved spectral quality enables more accurate de novo sequencing for identification of post-translational modifications and amino ac...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-118

    authors: Pan C,Park BH,McDonald WH,Carey PA,Banfield JF,VerBerkmoes NC,Hettich RL,Samatova NF

    update_date:2010-03-05 00:00:00

  • MultiLoc2: integrating phylogeny and Gene Ontology terms improves subcellular protein localization prediction.

    abstract:BACKGROUND:Knowledge of subcellular localization of proteins is crucial to proteomics, drug target discovery and systems biology since localization and biological function are highly correlated. In recent years, numerous computational prediction methods have been developed. Nevertheless, there is still a need for predi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-274

    authors: Blum T,Briesemeister S,Kohlbacher O

    update_date:2009-09-01 00:00:00

  • Detection of gene pathways with predictive power for breast cancer prognosis.

    abstract:BACKGROUND:Prognosis is of critical interest in breast cancer research. Biomedical studies suggest that genomic measurements may have independent predictive power for prognosis. Gene profiling studies have been conducted to search for predictive genomic measurements. Genes have the inherent pathway structure, where pat...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-1

    authors: Ma S,Kosorok MR

    update_date:2010-01-01 00:00:00

  • DraGnET: software for storing, managing and analyzing annotated draft genome sequence data.

    abstract:BACKGROUND:New "next generation" DNA sequencing technologies offer individual researchers the ability to rapidly generate large amounts of genome sequence data at dramatically reduced costs. As a result, a need has arisen for new software tools for storage, management and analysis of genome sequence data. Although bioi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-100

    authors: Duncan S,Sirkanungo R,Miller L,Phillips GJ

    update_date:2010-02-22 00:00:00