Discovery and characterization of Alu repeat sequences via precise local read assembly.


Alu insertions have contributed to >11% of the human genome and ∼30-35 Alu subfamilies remain actively mobile, yet the characterization of polymorphic Alu insertions from short-read data remains a challenge. We build on existing computational methods to combine Alu detection and de novo assembly of WGS data as a means to reconstruct the full sequence of insertion events from Illumina paired-end reads. Comparison with published calls obtained using PacBio long reads indicates a false discovery rate below 5%, at the cost of reduced sensitivity due to the colocation of reference and non-reference repeats. We generate a highly accurate call set of 1614 completely assembled Alu variants from 53 samples from the Human Genome Diversity Project (HGDP) panel. We utilize the reconstructed alternative insertion haplotypes to genotype 1010 fully assembled insertions, obtaining >99% agreement with genotypes obtained by PCR. In our assembled sequences, we find evidence of premature insertion mechanisms and observe 5' truncation in 16% of AluYa5 and AluYb8 insertions. The sites of truncation coincide with stem-loop structures and SRP9/14 binding sites in the Alu RNA, implicating L1 ORF2p pausing in the generation of 5' truncations. Additionally, we identified variable AluJ and AluS elements that likely arose due to non-retrotransposition mechanisms.


Nucleic Acids Res


Nucleic acids research


Wildschutte JH,Baron A,Diroff NM,Kidd JM






2015-12-02 00:00:00














  • Advanced computational techniques for re-sequencing DNA with polymerase signaling assay arrays.

    abstract::Re-sequencing, the identification of the specific variants in a sequence of interest compared with a known genomic sequence, is a ubiquitous task in today's biology. Universal arrays, which interrogate all possible oligonucleotides of a certain length in a target sequence, have been suggested for computationally deter...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Pe'er I,Arbili N,Liu Y,Enck C,Gelfand CA,Shamir R

    updated: 2003-10-01 00:00:00

  • Changes in methylation pattern of albumin and alpha-fetoprotein genes in developing rat liver and neoplasia.

    abstract::To determine whether methylation changes in specific DNA sequences of the albumin and AFP genes are implicated in the modulation of transcriptional activity during rat liver development and neoplasia we have analysed the methylation pattern of C-C-G-G sequences within these genes in DNA isolated from fetal and adult h...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Vedel M,Gomez-Garcia M,Sala M,Sala-Trepat JM

    updated: 1983-07-11 00:00:00

  • The internal loops in the lower stem of primary microRNA transcripts facilitate single cleavage of human Microprocessor.

    abstract::The human Microprocessor complex cleaves primary microRNA (miRNA) transcripts (pri-miRNAs) to initiate miRNA synthesis. Microprocessor consists of DROSHA (an RNase III enzyme), and DGCR8. DROSHA contains two RNase III domains, RIIIDa and RIIIDb, which simultaneously cleave the 3p- and 5p-strands of pri-miRNAs, respect...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Nguyen TL,Nguyen TD,Bao S,Li S,Nguyen TA

    updated: 2020-03-18 00:00:00

  • The Zebrafish Information Network: the zebrafish model organism database provides expanded support for genotypes and phenotypes.

    abstract::The Zebrafish Information Network (ZFIN), the model organism database for zebrafish, provides the central location for curated zebrafish genetic, genomic and developmental data. Extensive data integration of mutant phenotypes, genes, expression patterns, sequences, genetic markers, morpholinos, map po...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Sprague J,Bayraktaroglu L,Bradford Y,Conlin T,Dunn N,Fashena D,Frazer K,Haendel M,Howe DG,Knight J,Mani P,Moxon SA,Pich C,Ramachandran S,Schaper K,Segerdell E,Shao X,Singer A,Song P,Sprunger B,Van Slyke CE,Weste

    updated: 2008-01-01 00:00:00

  • Database resources of the National Center for Biotechnology Information.

    abstract::The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. The Entrez system provides search and re...

    journal_title:Nucleic acids research

    pub_type: Journal Article, Review


    authors: Sayers EW,Beck J,Bolton EE,Bourexis D,Brister JR,Canese K,Comeau DC,Funk K,Kim S,Klimke W,Marchler-Bauer A,Landrum M,Lathrop S,Lu Z,Madden TL,O'Leary N,Phan L,Rangwala SH,Schneider VA,Skripchenko Y,Wang J,Ye J,

    updated: 2021-01-08 00:00:00

  • DNA sequence of the 16S rRNA/23S rRNA intercistronic spacer of two rDNA operons of the archaebacterium Methanococcus vannielii.

    abstract::The DNA sequence of the spacer (plus flanking) regions separating the 16S rRNA and 23S rRNA genes of two presumptive rDNA operons of the archaebacterium Methanococcus vannielii was determined. The spacers are 156 and 242 base pairs in size and they share a sequence homology of 49 base pairs following the 3' terminus o...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Jarsch M,Böck A

    updated: 1983-11-11 00:00:00

  • The major form of MeCP2 has a novel N-terminus generated by alternative splicing.

    abstract::MeCP2 is a methyl-CpG binding protein that can repress transcription of nearby genes. In humans, mutations in the MECP2 gene are the major cause of Rett syndrome. By searching expressed sequence tag (EST) databases we have found a novel MeCP2 splice isoform (MeCP2alpha) which encodes a distinct N-terminus. We demonstr...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Kriaucionis S,Bird A

    updated: 2004-03-19 00:00:00

  • ZKSCAN3 counteracts cellular senescence by stabilizing heterochromatin.

    abstract::Zinc finger protein with KRAB and SCAN domains 3 (ZKSCAN3) has long been known as a master transcriptional repressor of autophagy. Here, we identify a novel role for ZKSCAN3 in alleviating senescence that is independent of its autophagy-related activity. Downregulation of ZKSCAN3 is observed in aged human mesenchymal ...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Hu H,Ji Q,Song M,Ren J,Liu Z,Wang Z,Liu X,Yan K,Hu J,Jing Y,Wang S,Zhang W,Liu GH,Qu J

    updated: 2020-06-19 00:00:00

  • Carbodiimide modification analysis of aminoacylated yeast phenylalanine tRNA: evidence for change in the apex region.

    abstract::The G- and U-specific reagent, carbodiimide was used to probe the solution structure of aminoacylated yeast phenylalanine tRNA. Both quantitative and qualitative changes in modification were observed when the modification patterns of tRNA-CCA(3'OH), tRNA-CCA(3'NH2) and phe-tRNA-CCA(3'NH2) were compared. Five nucleotid...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Fritzinger DC,Fournier MJ

    updated: 1982-04-10 00:00:00

  • ATPase activity of the UvrA and UvrAB protein complexes of the Escherichia coli UvrABC endonuclease.

    abstract::We have analyzed the ATPase activity exhibited by the UvrABC DNA repair complex. The UvrA protein is an ATPase whose lack of DNA dependence may be related to the ATP induced monomer-dimer transitions. ATP induced dimerization may be responsible for the enhanced DNA binding activity observed in the presence of ATP. Alt...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Oh EY,Claassen L,Thiagalingam S,Mazur S,Grossman L

    updated: 1989-06-12 00:00:00

  • Inhibition of DNA replication and repair by cadmium in mammalian cells. Protective interaction of zinc.

    abstract::The effects of the treatment of cultured human and simian cells with Cadmium (Cd), a toxic and carcinogenic metal, were first assayed on macromolecular synthesis. It was observed that DNA synthesis was inhibited by Cd concentrations considerably lower than those inhibiting protein and RNA synthesis. Because of the nec...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Nocentini S

    updated: 1987-05-26 00:00:00

  • Splicing of the U6 RNA precursor is impaired in fission yeast pre-mRNA splicing mutants.

    abstract::U6 RNA is a member of a class of small abundant stable nuclear RNAs that are essential for splicing. In all species examined so far, the U6 RNA is a RNA polymerase III transcript. The U6 gene of the fission yeast Schizosaccharomyces pombe is unusual in that it is interrupted by an intron whose structure is similar to ...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Potashkin J,Frendewey D

    updated: 1989-10-11 00:00:00

  • Unique organization of the human BCR gene promoter.

    abstract::The promoter of the human BCR gene, regulating the transcription of the chimeric BCR/ABL mRNA in leukemia, has been isolated and characterized. A region of 1.1 kb immediately 5' to the transcription start site was analyzed in detail by sequencing, DNase 1 footprinting, gel retardation and functional studies. These exp...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Zhu QS,Heisterkamp N,Groffen J

    updated: 1990-12-11 00:00:00

  • Interaction of N-terminal domain of U1A protein with an RNA stem/loop.

    abstract::The U1A protein is a sequence-specific RNA binding protein found in the U1 snRNP particle where it binds to stem/loop II of U1 snRNA. U1A contains two 'RNP' or 'RRM' (RNA Recognition Motif) domains, which are common to many RNA-binding proteins. The N-terminal RRM has been shown to bind specifically to the U1 RNA stem...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Hall KB,Stump WT

    updated: 1992-08-25 00:00:00

  • Promoter and enhancer activities of long terminal repeats associated with cellular retrovirus-like (VL30) elements.

    abstract::LTR units associated with cellular retrovirus-like elements are abundantly present in chromosomal DNA of animal cells. We have analyzed the promoter and enhancer activities of diverse LTR units associated with different members of the murine retrovirus-like family known as VL30. We report here that the structurally he...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Rotman G,Itin A,Keshet E

    updated: 1986-01-24 00:00:00

  • Recognition of distinct HLA-DQA1 promoter elements by a single nuclear factor containing Jun and Fos or antigenically related proteins.

    abstract::The activity of MHC class II promoters depends upon conserved regulatory signals one of which, the extended X-box, contains in its X2 subregion a sequence related to the cAMP response element, CRE and to the TPA response element, TRE. Accordingly, X2 is recognized by the AP-1 factor and by other c-Jun or c-Fos contain...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Neve Ombra M,Autiero M,DeLerma Barbaro A,Barretta R,Del Pozzo G,Guardiola J

    updated: 1993-04-25 00:00:00

  • Role of the ITS2-proximal stem and evidence for indirect recognition of processing sites in pre-rRNA processing in yeast.

    abstract::Eucaryotic ribosome biogenesis involves many cis-acting sequences and trans-acting factors, including snoRNAs. We have used directed mutagenesis of rDNA plasmids in yeast to identify critical sequence and structural elements within and flanking the ITS2-proximal stem. This base paired structure, present in the mature ...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Côté CA,Peculis BA

    updated: 2001-05-15 00:00:00

  • On the mechanism of the modular primer effect.

    abstract::Modular primers are strings of three contiguously annealed unligated oligonucleotides (modules) as short as 5- or 6-mers, selected from a presynthesized library. It was previously found that such strings can prime DNA sequencing reactions specifically, thus eliminating the need for the primer synthesis step in DNA seq...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Beskin AD,Zevin-Sonkin D,Sobolev IA,Ulanovsky LE

    updated: 1995-08-11 00:00:00

  • A variable tandem repeat locus mapped to chromosome band 10q26 is amplified and rearranged in leukocyte DNAs of two cancer patients.

    abstract::A highly polymorphic locus associated with the variable tandem repetition of a 35 bp consensus sequence was mapped to chromosome 10, band q26. Examination of leukocyte DNA from a cancer patient revealed the twenty-fold amplification of one allelic fragment of this locus, while the other allelic fragment demonstrated a...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Colb M,Yang-Feng T,Francke U,Mermer B,Parkinson DR,Krontiris TG

    updated: 1986-10-24 00:00:00

  • Stochastic noise in splicing machinery.

    abstract::The number of known alternative human isoforms has been increasing steadily with the amount of available transcription data. To date, over 100 000 isoforms have been detected in EST libraries, and at least 75% of human genes have at least one alternative isoform. In this paper, we propose that most alternative splicin...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Melamud E,Moult J

    updated: 2009-08-01 00:00:00

  • G-quadruplex formation in human telomeric (TTAGGG)4 sequence with complementary strand in close vicinity under molecularly crowded condition.

    abstract::Chromosomes in vertebrates are protected at both ends by telomere DNA composed of tandem (TTAGGG)n repeats. DNA replication produces a blunt-ended leading strand telomere and a lagging strand telomere carrying a single-stranded G-rich overhang at its end. The G-rich strand can form G-quadruplex structure in the presen...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Kan ZY,Lin Y,Wang F,Zhuang XY,Zhao Y,Pang DW,Hao YH,Tan Z

    updated: 2007-01-01 00:00:00

  • A processed chicken pseudogene (CPS1) related to the ras oncogene superfamily.

    abstract::We describe the first polyA-containing processed pseudogene reported in the chicken. It includes a 0.52 kb open reading frame which could encode a 175 amino acid protein. The putative protein shows extensive homology to the ras oncogene superfamily, being most closely related to the yeast protein YP2. It is one of the...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Alsip GR,Konkel DA

    updated: 1986-03-11 00:00:00

  • Modification of the aminopyridine unit of 2'-deoxyaminopyridinyl-pseudocytidine allowing triplex formation at CG interruptions in homopurine sequences.

    abstract::The antigene strategy based on site-specific recognition of duplex DNA by triplex DNA formation has been exploited in a wide range of biological activities. However, specific triplex formation is mostly restricted to homo-purine strands within the target duplex DNA, due to the destabilizing effect of CG and TA inversi...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Wang L,Taniguchi Y,Okamura H,Sasaki S

    updated: 2018-09-28 00:00:00

  • Two distinct overstretched DNA states.

    abstract::The DNA double helix undergoes an 'overstretching' transition in a narrow force range near 65 pN. Despite numerous studies the basic question of whether the strands are separated or not remains controversial. Here we show that overstretching in fact involves two distinct types of double-helix reorganization: slow hyst...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Fu H,Chen H,Marko JF,Yan J

    updated: 2010-09-01 00:00:00

  • Direct sequencing of RAPD fragments using 3'-extended oligonucleotide primers and dye terminator cycle-sequencing.

    abstract::Random amplified polymorphic DNA (RAPD) markers are used widely to develop high resolution genetic maps and for genome fingerprinting. Typically, single oligomers of approximately 10 nucleotides are used to PCR amplify characteristic RAPD marker fragments. We describe an efficient method for the direct end-sequencing ...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Mitchelson KR,Drenth J,Duong H,Chaparro JX

    updated: 1999-10-01 00:00:00

  • A WW-like module in the RAG1 N-terminal domain contributes to previously unidentified protein-protein interactions.

    abstract::More than one-third of the RAG1 protein can be truncated from the N-terminus with only subtle effects on the products of V(D)J recombination in vitro or in a mouse. What, then, is the function of the N-terminal domain? We believe it to be regulatory. We determined, several years ago, that an included RING motif could ...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Maitra R,Sadofsky MJ

    updated: 2009-06-01 00:00:00

  • Unidirectional translocation from recognition site and a necessary interaction with DNA end for cleavage by Type III restriction enzyme.

    abstract::Type III restriction enzymes have been demonstrated to require two unmethylated asymmetric recognition sites oriented head-to-head to elicit double-strand break 25-27 bp downstream of one of the two sites. The proposed DNA cleavage mechanism involves ATP-dependent DNA translocation. The sequence context of the recogni...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Raghavendra NK,Rao DN

    updated: 2004-10-22 00:00:00

  • A temperature-sensitive mutant of Escherichia coli affected in the alpha subunit of RNA polymerase.

    abstract::A temperature-sensitive mutant of Escherichia coli affected in the alpha subunit of RNA polymerase has been investigated. Gene mapping and complementation experiments placed the mutation to temperature-sensitivity within the alpha operon at 72 min. on the bacterial chromosome. The rate of RNA synthesis in vivo and the...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Mehrpouyan M,Champney WS

    updated: 1990-06-25 00:00:00

  • The Pfam protein families database in 2019.

    abstract::The last few years have witnessed significant changes in Pfam. The number of families has grown substantially to a total of 17,929 in release 32.0. New additions have been coupled with efforts to improve existing families, including refinement of domain boundaries, their classification into Pfa...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: El-Gebali S,Mistry J,Bateman A,Eddy SR,Luciani A,Potter SC,Qureshi M,Richardson LJ,Salazar GA,Smart A,Sonnhammer ELL,Hirsh L,Paladin L,Piovesan D,Tosatto SCE,Finn RD

    updated: 2019-01-08 00:00:00

  • MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity.

    abstract::MCScan is an algorithm able to scan multiple genomes or subgenomes in order to identify putative homologous chromosomal regions, and align these regions using genes as anchors. The MCScanX toolkit implements an adjusted MCScan algorithm for detection of synteny and collinearity that extends the original software by in...

    journal_title:Nucleic acids research

    pub_type: Journal Article


    authors: Wang Y,Tang H,Debarry JD,Tan X,Li J,Wang X,Lee TH,Jin H,Marler B,Guo H,Kissinger JC,Paterson AH

    updated: 2012-04-01 00:00:00