Discovery and characterization of Alu repeat sequences via precise local read assembly.

Abstract:

Alu insertions have contributed to >11% of the human genome and ∼30-35 Alu subfamilies remain actively mobile, yet the characterization of polymorphic Alu insertions from short-read data remains a challenge. We build on existing computational methods to combine Alu detection and de novo assembly of WGS data as a means to reconstruct the full sequence of insertion events from Illumina paired-end reads. Comparison with published calls obtained using PacBio long reads indicates a false discovery rate below 5%, at the cost of reduced sensitivity due to the colocation of reference and non-reference repeats. We generate a highly accurate call set of 1614 completely assembled Alu variants from 53 samples from the Human Genome Diversity Project (HGDP) panel. We utilize the reconstructed alternative insertion haplotypes to genotype 1010 fully assembled insertions, obtaining >99% agreement with genotypes obtained by PCR. In our assembled sequences, we find evidence of premature insertion mechanisms and observe 5' truncation in 16% of AluYa5 and AluYb8 insertions. The sites of truncation coincide with stem-loop structures and SRP9/14 binding sites in the Alu RNA, implicating L1 ORF2p pausing in the generation of 5' truncations. Additionally, we identified variable AluJ and AluS elements that likely arose due to non-retrotransposition mechanisms.

journal_name

Nucleic Acids Res

journal_title

Nucleic acids research

authors

Wildschutte JH,Baron A,Diroff NM,Kidd JM

doi

10.1093/nar/gkv1089

subject

Has Abstract

pub_date

2015-12-02 00:00:00

pages

10292-307

issue

21

eissn

1362-4962

issn

0305-1048

pii

gkv1089

journal_volume

43

pub_type

Journal Article
  • Affinity proteomic dissection of the human nuclear cap-binding complex interactome.

    abstract::A 5',7-methylguanosine cap is a quintessential feature of RNA polymerase II-transcribed RNAs, and a textbook aspect of co-transcriptional RNA processing. The cap is bound by the cap-binding complex (CBC), canonically consisting of nuclear cap-binding proteins 1 and 2 (NCBP1/2). Interest in the CBC has recently renewed...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkaa743

    authors: Dou Y,Kalmykova S,Pashkova M,Oghbaie M,Jiang H,Molloy KR,Chait BT,Rout MP,Fenyö D,Jensen TH,Altukhov I,LaCava J

    update_date:2020-10-09 00:00:00

  • Systematic screening of CTCF binding partners identifies that BHLHE40 regulates CTCF genome-wide distribution and long-range chromatin interactions.

    abstract::CTCF plays a pivotal role in mediating chromatin interactions, but it does not do so alone. A number of factors have been reported to co-localize with CTCF and regulate CTCF loops, but no comprehensive analysis of binding partners has been performed. This prompted us to identify CTCF loop participants and regulators b...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkaa705

    authors: Hu G,Dong X,Gong S,Song Y,Hutchins AP,Yao H

    update_date:2020-09-25 00:00:00

  • Molecular footprints of human immunoglobulin gene evolution: a new sequence family.

    abstract::Analysis of the human VK (ref. 2) gene locus led to the detection of a new sequence family (L sequences). Its copy number is in the range of 10^2. The L sequences, which are about 500 bp long, are found as part of the 3' flanking regions of a clustered set of human VKI genes but they occur also separate from the gene...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/12.13.5265

    authors: Straubinger B,Pech M,Mühlebach K,Jaenichen HR,Bauer HG,Zachau HG

    update_date:1984-07-11 00:00:00

  • siRNAs from miRNA sites mediate DNA methylation of target genes.

    abstract::Arabidopsis microRNA (miRNA) genes (MIR) give rise to 20- to 22-nt miRNAs that are generated predominantly by the type III endoribonuclease Dicer-like 1 (DCL1) but do not require any RNA-dependent RNA Polymerases (RDRs) or RNA Polymerase IV (Pol IV). Here, we identify a novel class of non-conserved MIR genes that give...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkq590

    authors: Chellappan P,Xia J,Zhou X,Gao S,Zhang X,Coutino G,Vazquez F,Zhang W,Jin H

    update_date:2010-11-01 00:00:00

  • An efficient string matching algorithm with k differences for nucleotide and amino acid sequences.

    abstract::There are a few algorithms designed to solve the problem of the optimal alignment of one sequence, the pattern, of length m, with another, longer sequence, the text, of length n. These algorithms allow mismatches, deletions and insertions. Algorithms to date run in O(mn) time. Let us define an integer, k, which is the ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/14.1.31

    authors: Landau GM,Vishkin U,Nussinov R

    update_date:1986-01-10 00:00:00

  • Crystal engineering of HIV-1 reverse transcriptase for structure-based drug design.

    abstract::HIV-1 reverse transcriptase (RT) is a primary target for anti-AIDS drugs. Structures of HIV-1 RT, usually determined at approximately 2.5-3.0 Å resolution, are important for understanding enzyme function and mechanisms of drug resistance in addition to being helpful in the design of RT inhibitors. Despite hundreds of ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkn464

    authors: Bauman JD,Das K,Ho WC,Baweja M,Himmel DM,Clark AD Jr,Oren DA,Boyer PL,Hughes SH,Shatkin AJ,Arnold E

    update_date:2008-09-01 00:00:00

  • SCOP: a structural classification of proteins database.

    abstract::The Structural Classification of Proteins (SCOP) database provides a detailed and comprehensive description of the relationships of known protein structures. The classification is on hierarchical levels: the first two levels, family and superfamily, describe near and distant evolutionary relationships; the third, fold...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/28.1.257

    authors: Lo Conte L,Ailey B,Hubbard TJ,Brenner SE,Murzin AG,Chothia C

    update_date:2000-01-01 00:00:00

  • Action of pancreatic DNase: requirements for activation of DNA as a template-primer for DNA polymerase.

    abstract::Pancreatic DNase requires both Ca2+ and Mg2+ for its activity as measured by formation of an activated DNA template for in vitro DNA polymerase alpha assay and by the hyperchromic shift. Mn2+ can partially satisfy the Mg2+ requirement of the DNase for activation of DNA but the resulting template is only 50% as active ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/4.8.2641

    authors: Baril E,Mitchener J,Lee L,Baril B

    update_date:1977-08-01 00:00:00

  • Differential regulation of RNF8-mediated Lys48- and Lys63-based poly-ubiquitylation.

    abstract::Pairing of a given E3 ubiquitin ligase with different E2s allows synthesis of ubiquitin conjugates of different topologies. While this phenomenon contributes to functional diversity, it remains largely unknown how a single E3 ubiquitin ligase recognizes multiple E2s, and whether identical structural requirements deter...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkr655

    authors: Lok GT,Sy SM,Dong SS,Ching YP,Tsao SW,Thomson TM,Huen MS

    update_date:2012-01-01 00:00:00

  • MutaBind estimates and interprets the effects of sequence variants on protein-protein interactions.

    abstract::Proteins engage in highly selective interactions with their macromolecular partners. Sequence variants that alter protein binding affinity may cause significant perturbations or complete abolishment of function, potentially leading to diseases. There exists a persistent need to develop a mechanistic understanding of i...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkw374

    authors: Li M,Simonetti FL,Goncearenco A,Panchenko AR

    update_date:2016-07-08 00:00:00

  • A distinct class of homeodomain proteins is encoded by two sequentially expressed Drosophila genes from the 93D/E cluster.

    abstract::Homeodomains appear to be one of the most frequently employed DNA-binding domains in a superfamily of transacting factors. It is likely that during evolution several sub-types of homeodomain have evolved from a common ancestral domain, resulting in distinct but closely related DNA-binding preferences. Here we describe...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/22.7.1202

    authors: Jagla K,Stanceva I,Dretzen G,Bellard F,Bellard M

    update_date:1994-04-11 00:00:00

  • Models of triple-stranded polynucleotides with optimised stereochemistry.

    abstract::Detailed models are presented for the triple-stranded polynucleotide helices of poly (U)-poly (A)-poly (U) (two forms), poly (U)-poly d(A)-poly (U), poly d(C)-poly d(I)-poly d(C), poly d(T)-poly d(A)-poly d(T) and poly (I)-poly (A)-poly (I). The models were generated using a computerized, linked-atom procedure which p...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/3.10.2459

    authors: Arnott S,Bond PJ,Selsing E,Smith PJ

    update_date:1976-10-01 00:00:00

  • Identification and characterization of the mouse nuclear export factor (Nxf) family members.

    abstract::TAP/hNXF1 is a key factor that mediates general cellular mRNA export from the nucleus, and its orthologs are structurally and functionally conserved from yeast to humans. Metazoans encode additional proteins that share homology and domain organization with TAP/hNXF1, suggesting their participation in mRNA metabolism; ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gki706

    authors: Tan W,Zolotukhin AS,Tretyakova I,Bear J,Lindtner S,Smulevitch SV,Felber BK

    update_date:2005-07-13 00:00:00

  • Genome urbanization: clusters of topologically co-regulated genes delineate functional compartments in the genome of Saccharomyces cerevisiae.

    abstract::The eukaryotic genome evolves under the dual constraint of maintaining coordinated gene transcription and performing effective DNA replication and cell division, the coupling of which brings about inevitable DNA topological tension. DNA supercoiling is resolved and, in some cases, even harnessed by the genome through ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkx198

    authors: Tsochatzidou M,Malliarou M,Papanikolaou N,Roca J,Nikolaou C

    update_date:2017-06-02 00:00:00

  • Crystal structure of the 5hmC specific endonuclease PvuRts1I.

    abstract::PvuRts1I is a prototype for a larger family of restriction endonucleases that cleave DNA containing 5-hydroxymethylcytosine (5hmC) or 5-glucosylhydroxymethylcytosine (5ghmC), but not 5-methylcytosine (5mC) or cytosine. Here, we report a crystal structure of the enzyme at 2.35 Å resolution. Although the protein has bee...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gku186

    authors: Kazrani AA,Kowalska M,Czapinska H,Bochtler M

    update_date:2014-05-01 00:00:00

  • The 5' and 3' splice sites come together via a three dimensional diffusion mechanism.

    abstract::We present evidence that the splice sites in mammalian pre-mRNAs are brought together via a three dimensional diffusion mechanism. We tested two mechanisms for splice site pairing: a lateral diffusion ('scanning') model and the currently favored three dimensional diffusion ('jumping') model. Two lines of evidence that...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/24.9.1638

    authors: Pasman Z,Garcia-Blanco MA

    update_date:1996-05-01 00:00:00

  • DNA sequence of Rhizobium trifolii nodulation genes reveals a reiterated and potentially regulatory sequence preceding nodABC and nodFE.

    abstract::The Rhizobium trifolii nod genes required for host-specific nodulation of clovers are located on 14 kb of Sym (symbiotic) plasmid DNA. Analysis of the nucleotide sequence of a 3.7 kb portion of this region has revealed open reading frames corresponding to the nodABCDEF genes. A DNA sequencing technique, using primer e...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/14.7.2891

    authors: Schofield PR,Watson JM

    update_date:1986-04-11 00:00:00

  • NUCPLOT: a program to generate schematic diagrams of protein-nucleic acid interactions.

    abstract::Proteins that bind to DNA are found in all areas of genetic activity within the cell. To help understand how these proteins perform their various functions, it is useful to analyse which residues are involved in binding to the DNA and how they interact with the bases and sugar-phosphate backbone of nucleic acids. Here...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/25.24.4940

    authors: Luscombe NM,Laskowski RA,Thornton JM

    update_date:1997-12-15 00:00:00

  • Structural classification of zinc fingers: survey and summary.

    abstract::Zinc fingers are small protein domains in which zinc plays a structural role contributing to the stability of the domain. Zinc fingers are structurally diverse and are present among proteins that perform a broad range of functions in various cellular processes, such as replication and repair, transcription and transla...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkg161

    authors: Krishna SS,Majumdar I,Grishin NV

    update_date:2003-01-15 00:00:00

  • Infectious TYMV RNA from cloned cDNA: effects in vitro and in vivo of point substitutions in the initiation codons of two extensively overlapping ORFs.

    abstract::Full-length cDNA of the 6.3 kb turnip yellow mosaic virus (TYMV) genome was placed between a T7 promoter and a unique Hind III site. In vitro transcription of Hind III-linearized DNA of clone pTYMC yielded full-length RNA transcripts. In inoculations of Chinese cabbage protoplasts and plants, capped transcripts and vi...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/17.12.4675

    authors: Weiland JJ,Dreher TW

    update_date:1989-06-26 00:00:00

  • A novel method for determining linkage between DNA sequences: hybridization to paired probe arrays.

    abstract::Cooperative hybridization has been used to establish physical linkage between two loci on a DNA strand. Linkage was detected by hybridization to a new type of high-density oligonucleotide array. Each synthesis location on the array contains a mixture of two different probe sequences. Each of the two probes can hybridi...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/27.6.1485

    authors: Gentalen E,Chee M

    update_date:1999-03-15 00:00:00

  • ADA3-containing complexes associate with estrogen receptor alpha.

    abstract::Transcriptional repression and activation by nuclear receptors (NRs) are brought about by coregulator complexes. These complexes modify the chromatin environment of target genes and affect the activity of the basal transcription machinery. We have previously implicated the yeast ADA3 protein in transcriptional activat...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/30.11.2508

    authors: Benecke A,Gaudon C,Garnier JM,vom Baur E,Chambon P,Losson R

    update_date:2002-06-01 00:00:00

  • SDPMOD: an automated comparative modeling server for small disulfide-bonded proteins.

    abstract::Small disulfide-bonded proteins (SDPs) are rich sources for therapeutic drugs. Designing drugs from these proteins requires three-dimensional structural information, which is only available for a subset of these proteins. SDPMOD addresses this deficit in structural information by providing a freely available automated...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkh394

    authors: Kong L,Lee BT,Tong JC,Tan TW,Ranganathan S

    update_date:2004-07-01 00:00:00

  • An obligate intermediate along the slow folding pathway of a group II intron ribozyme.

    abstract::Most RNA molecules collapse rapidly and reach the native state through a pathway that contains numerous traps and unproductive intermediates. The D135 group II intron ribozyme is unusual in that it can fold slowly and directly to the native state, despite its large size and structural complexity. Here we use hydroxyl ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gki973

    authors: Su LJ,Waldsich C,Pyle AM

    update_date:2005-11-27 00:00:00

  • Structure and mechanism of the 2',3' phosphatase component of the bacterial Pnkp-Hen1 RNA repair system.

    abstract::Pnkp is the end-healing and end-sealing component of an RNA repair system present in diverse bacteria from many phyla. Pnkp is composed of three catalytic modules: an N-terminal polynucleotide 5' kinase, a central 2',3' phosphatase and a C-terminal ligase. The phosphatase module is a Mn(2+)-dependent phosphodiesterase...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkt221

    authors: Wang LK,Smith P,Shuman S

    update_date:2013-06-01 00:00:00

  • Major capsid reinforcement by a minor protein in herpesviruses and phage.

    abstract::Herpes simplex type 1 virus (HSV-1) and bacteriophage λ capsids undergo considerable structural changes during self-assembly and DNA packaging. The initial steps of viral capsid self-assembly require weak, non-covalent interactions between the capsid subunits to ensure free energy minimization and error-free assembly....

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gku634

    authors: Sae-Ueng U,Liu T,Catalano CE,Huffman JB,Homa FL,Evilevitch A

    update_date:2014-08-01 00:00:00

  • KRAS promoter oligonucleotide with decoy activity dimerizes into a unique topology consisting of two G-quadruplex units.

    abstract::Mutations of the KRAS proto-oncogene are associated with several tumor types, which is why it is being considered as a target for anti-cancer drug development. The human KRAS promoter contains a nuclease hypersensitive element (NHE), which can bind to nuclear proteins and is believed to form G-quadruplex structures. P...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkv1359

    authors: Podbevšek P,Plavec J

    update_date:2016-01-29 00:00:00

  • U1-independent pre-mRNA splicing contributes to the regulation of alternative splicing.

    abstract::U1 snRNP plays a crucial role in the 5' splice site recognition during splicing. Here we report the first example of naturally occurring U1-independent U2-type splicing in humans. The U1 components were not included in the pre-spliceosomal E complex formed on the human F1gamma (hF1gamma) intron 9 in vitro. Moreover, h...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkp050

    authors: Fukumura K,Taniguchi I,Sakamoto H,Ohno M,Inoue K

    update_date:2009-04-01 00:00:00

  • Microchip electrophoresis: a method for high-speed SNP detection.

    abstract::As a trial practical application, we have applied optimized microfabricated electrophoresis devices, combined with enzymatic mutation detection methods, to the determination of single nucleotide polymorphism (SNP) sites in the p53 suppressor gene. Using clinical samples, we have achieved robust assays with quality fac...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/28.9.e43

    authors: Schmalzing D,Belenky A,Novotny MA,Koutny L,Salas-Solano O,El-Difrawy S,Adourian A,Matsudaira P,Ehrlich D

    update_date:2000-05-01 00:00:00

  • The European Bioinformatics Institute in 2017: data coordination and integration.

    abstract::The European Bioinformatics Institute (EMBL-EBI) supports life-science research throughout the world by providing open data, open-source software and analytical tools, and technical infrastructure (https://www.ebi.ac.uk). We accommodate an increasingly diverse range of data types and integrate them, so that biologists...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkx1154

    authors: Cook CE,Bergman MT,Cochrane G,Apweiler R,Birney E

    update_date:2018-01-04 00:00:00