Genome classification improvements based on k-mer intervals in sequences.

Abstract:

Given the vast amount of genomic data, alignment-free sequence comparison methods are required due to their low computational complexity. k-mer based methods can improve comparison accuracy by extracting an effective feature of the genome sequences. The aim of this paper is to extract k-mer intervals of a sequence as a feature of a genome for high comparison accuracy. In the proposed method, we calculated the distance between genome sequences by comparing the distribution of k-mer intervals. Then, we identified the classification results using phylogenetic trees. We used viral, mitochondrial (MT), microbial and mammalian genome sequences to perform classification for various genome sets. We confirmed that the proposed method provides a better classification result than other k-mer based methods. Furthermore, the proposed method could efficiently be applied to long sequences such as human and mouse genomes.
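The abstract does not spell out how a k-mer interval or the distance is defined, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes an interval is the gap between consecutive occurrences of the same k-mer, that gaps are pooled over all k-mers into one normalized histogram, and that two genomes are compared by the Euclidean distance between their histograms. The function names and the parameters `k` and `max_gap` are hypothetical.

```python
import math
from collections import Counter


def kmer_interval_distribution(seq, k=3, max_gap=50):
    """Normalized histogram of gaps between consecutive occurrences of
    the same k-mer (assumed definition of a 'k-mer interval')."""
    last_seen = {}      # k-mer -> start position of its previous occurrence
    counts = Counter()  # gap length -> number of times it was observed
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            gap = i - last_seen[kmer]
            if gap <= max_gap:  # assumption: ignore very long gaps
                counts[gap] += 1
        last_seen[kmer] = i
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gap: n / total for gap, n in counts.items()}


def interval_distance(p, q):
    """Euclidean distance between two interval histograms
    (an assumed choice; the abstract does not name the metric)."""
    gaps = set(p) | set(q)
    return math.sqrt(sum((p.get(g, 0.0) - q.get(g, 0.0)) ** 2 for g in gaps))


if __name__ == "__main__":
    a = kmer_interval_distribution("ACGACGACG", k=3)
    b = kmer_interval_distribution("AAAA", k=2)
    print(a, b, interval_distance(a, b))
```

A pairwise matrix of such distances could then be passed to a standard tree-building routine (for example neighbor joining) to obtain the phylogenetic trees the abstract mentions.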

journal_name

Genomics

journal_title

Genomics

authors

Han GB,Cho DH

doi

10.1016/j.ygeno.2018.11.001

subject

Has Abstract

pub_date

2019-12-01 00:00:00

pages

1574-1582

issue

6

eissn

1089-8646

issn

0888-7543

pii

S0888-7543(18)30447-6

journal_volume

111

pub_type

Journal article

Related literature

Genomics literature collection
  • DNA-methylation dependent regulation of embryo-specific 5S ribosomal DNA cluster transcription in adult tissues of sea urchin Paracentrotus lividus.

    abstract::We have previously reported a molecular and cytogenetic characterization of three different 5S rDNA clusters in the sea urchin Paracentrotus lividus and recently, demonstrated the presence of high heterogeneity in functional 5S rRNA. In this paper, we show some important distinctive data on 5S rRNA transcription for t...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/j.ygeno.2013.08.001

    authors: Bellavia D,Dimarco E,Naselli F,Caradonna F

    update_date: 2013-10-01 00:00:00

  • XY sex reversal associated with a nonsense mutation in SRY.

    abstract::Sex determination in humans is mediated through the expression of a testis-determining gene on the Y chromosome. In humans, a candidate gene for the testis-determining factor (TDF) that encodes a protein with a putative DNA-binding motif and has been isolated is termed SRY. Here we describe an XY sex-reversed female w...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/0888-7543(92)90164-n

    authors: McElreavey KD,Vilain E,Boucekkine C,Vidaud M,Jaubert F,Richaud F,Fellous M

    update_date: 1992-07-01 00:00:00

  • Genomic organization of the neurofibromatosis 1 gene (NF1).

    abstract::Neurofibromatosis 1 maps to chromosome band 17q11.2, and the NF1 locus has been partially characterized. Even though the full-length NF1 cDNA has been sequenced, the complete genomic structure of the NF1 gene has not been elucidated. The 5' end of NF1 is embedded in a CpG island containing a NotI restriction site, and...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/0888-7543(95)80104-t

    authors: Li Y,O'Connell P,Breidenbach HH,Cawthon R,Stevens J,Xu G,Neil S,Robertson M,White R,Viskochil D

    update_date: 1995-01-01 00:00:00

  • Striking bimodal methylation of the repeat unit of the tandem array encoding human U2 snRNA (the RNU2 locus).

    abstract::The genes encoding human U2 small nuclear RNA are arrayed in tandem (the RNU2 locus) and have undergone concerted evolution for >35 Myr. Tandem organization of repetitive sequences may facilitate recombination that underlies concerted evolution, but could risk instability. Since DNA methylation plays a crucial role in...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1006/geno.1999.6052

    authors: Jiang C,Liao D

    update_date: 1999-12-15 00:00:00

  • Characterization of a human glycoprotein with a potential role in sperm-egg fusion: cDNA cloning, immunohistochemical localization, and chromosomal assignment of the gene (AEGL1).

    abstract::Acidic epididymal glycoprotein (AEG), thus far identified only in rodents, is one of the sperm surface proteins involved in the fusion of the sperm and egg plasma membranes. In the present study, we describe the isolation and characterization of cDNA encoding a human glycoprotein related to AEG. Although this protein,...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1006/geno.1996.0131

    authors: Hayashi M,Fujimoto S,Takano H,Ushiki T,Abe K,Ishikura H,Yoshida MC,Kirchhoff C,Ishibashi T,Kasahara M

    update_date: 1996-03-15 00:00:00

  • Physical mapping and cloning of the proximal segment of the myotonic dystrophy gene region.

    abstract::The myotonic dystrophy (DM) region has been recently shown to be bracketed by two key recombinant events. One recombinant occurs in a Dutch DM family, which maps the DM locus distal to the ERCC1 gene and D19S115 (pE0.8). The other recombinant event is in a French Canadian DM family, which maps DM proximal to D19S51 (p...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/0888-7543(92)90119-d

    authors: Shutler G,Korneluk RG,Tsilfidis C,Mahadevan M,Bailly J,Smeets H,Jansen G,Wieringa B,Lohman F,Aslanidis C

    update_date: 1992-07-01 00:00:00

  • Differential regulation of the human gene DAB2IP in normal and malignant prostatic epithelia: cloning and characterization.

    abstract::Human DAB2IP (for DAB2 interaction protein) is a novel member of the RasGTPase-activating protein family. It interacts directly with DAB2, which suppresses growth of many cancer types. We demonstrated that DAB2IP is often downregulated in human prostate cancer cell lines. The predicted DAB2IP protein (967 amino acids)...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1006/geno.2002.6739

    authors: Chen H,Pong RC,Wang Z,Hsieh JT

    update_date: 2002-04-01 00:00:00

  • Methylation dynamics of IG-DMR and Gtl2-DMR during murine embryonic and placental development.

    abstract::The Dlk1-Dio3 imprinted domain on mouse chromosome 12 contains IG-DMR and Gtl2-DMR, whose methylation patterns are established in the germline and after fertilization, respectively. In this study, we determine that acquisition of DNA methylation at the paternal allele of the Gtl2-DMR is initiated after the blastocyst ...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/j.ygeno.2011.05.003

    authors: Sato S,Yoshida W,Soejima H,Nakabayashi K,Hata K

    update_date: 2011-08-01 00:00:00

  • Microarray analysis of gene expression profile in resistant and susceptible Bombyx mori strains reveals resistance-related genes to nucleopolyhedrovirus.

    abstract::To investigate the molecular mechanism of silkworm resistance to BmNPV infection, we constructed a near-isogenic line (BC8) with BmNPV resistance using highly resistant (NB) and highly susceptible parental strains (306). We investigated variations in the gene expression in the midguts of BmNPV-infected BC8 and 306 at ...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/j.ygeno.2013.02.004

    authors: Zhou Y,Gao L,Shi H,Xia H,Gao L,Lian C,Chen L,Yao Q,Chen K,Liu X

    update_date: 2013-04-01 00:00:00

  • A novel androgen-regulated gene, PMEPA1, located on chromosome 20q13 exhibits high level expression in prostate.

    abstract::Biologic effects of androgen on target cells are mediated in part by transcriptional regulation of androgen-regulated genes (ARGs) by androgen receptor. Using serial analysis of gene expression (SAGE), we have identified a comprehensive repertoire of ARGs in LNCaP cells. One of the SAGE-derived tags exhibiting homolog...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1006/geno.2000.6214

    authors: Xu LL,Shanmugam N,Segawa T,Sesterhenn IA,McLeod DG,Moul JW,Srivastava S

    update_date: 2000-06-15 00:00:00

  • Major rearrangements in the alpha 5(IV) collagen gene in three patients with Alport syndrome.

    abstract::The gene coding for the alpha 5 chain of type IV collagen (alpha 5(IV) collagen), which maps to Xq22, is a candidate gene for the X-linked dominant disease Alport syndrome (AS). Using three cDNA clones, covering the 3' end of the alpha 5(IV) collagen gene, 3 of 38 patients have been identified with mutations in this g...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/0888-7543(91)90040-l

    authors: Boye E,Vetrie D,Flinter F,Buckle B,Pihlajaniemi T,Hamalainen ER,Myers JC,Bobrow M,Harris A

    update_date: 1991-12-01 00:00:00

  • Isolation and characterization of a novel human paired-like homeodomain-containing transcription factor gene, VSX1, expressed in ocular tissues.

    abstract::Homeodomain transcription factors control cell fates during the development of all animals. The paired-like subfamily of homeodomain proteins has been particularly implicated in ocular development in different species. In this paper we report the cDNA sequence, genomic structure, localization, and expression data of a...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1006/geno.1999.6093

    authors: Semina EV,Mintz-Hittner HA,Murray JC

    update_date: 2000-01-15 00:00:00

  • Functional analysis of bacterial artificial chromosomes in mammalian cells: mouse Cdc6 is associated with the mitotic spindle apparatus.

    abstract::Bacterial artificial chromosomes (BACs) provide a well-characterized resource for studying the functional organization of genes and other large chromosomal domains. To facilitate functional studies in cell cultures, we have developed a simple approach for generating stable cell lines with variable copy numbers of any ...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/s0888-7543(03)00205-2

    authors: Illenye S,Heintz NH

    update_date: 2004-01-01 00:00:00

  • Accurate assessment of intragenic recombination frequency within the Duchenne muscular dystrophy gene.

    abstract::Polymorphic loci that lie at the two extremities of the Duchenne/Becker muscular dystrophy (DMD/BMD) gene have been used to estimate intragenic recombination rates. Multipoint linkage analysis of the CEPH panel of families suggests a total intragenic recombination frequency of nearly 0.12 (confidence intervals 0.041-0...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/0888-7543(90)90205-9

    authors: Abbs S,Roberts RG,Mathew CG,Bentley DR,Bobrow M

    update_date: 1990-08-01 00:00:00

  • CancerProView: a graphical image database of cancer-related genes and proteins.

    abstract::We have developed a graphical image database CancerProView (URL: http://cancerproview.dmb.med.keio.ac.jp/php/cpv.html) to assist the search for alterations of the motifs/domains in the cancer-related proteins that are caused by mutations in the corresponding genes. For the CancerProView, we have collected various kind...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/j.ygeno.2012.05.011

    authors: Mitsuyama S,Shimizu N

    update_date: 2012-08-01 00:00:00

  • Fragmented mitochondrial genomes evolved in opposite directions between closely related macaque louse Pedicinus obtusus and colobus louse Pedicinus badii.

    abstract::We report for the first time the fragmented mitochondrial (mt) genomes of two Pedicinus species: Pedicinus obtusus and Pedicinus badii, and compared them with the lice of humans and chimpanzees. Despite being congeneric, the two monkey lice are distinct from each other in mt karyotype. The variation in mt karyotype be...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/j.ygeno.2020.09.005

    authors: Fu YT,Dong Y,Wang W,Nie Y,Liu GH,Shao R

    update_date: 2020-11-01 00:00:00

  • Sensitivity and selectivity in protein similarity searches: a comparison of Smith-Waterman in hardware to BLAST and FASTA.

    abstract::To predict the functions of a possible protein product of any new or uncharacterized DNA sequence, it is important first to detect all significant similarities between the encoded amino acid sequence and any accumulated protein sequence data. We have implemented a set of queries and database sequences and proceeded to...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1006/geno.1996.0614

    authors: Shpaer EG,Robinson M,Yee D,Candlin JD,Mines R,Hunkapiller T

    update_date: 1996-12-01 00:00:00

  • A fine integrated map of the SPG4 locus excludes an expanded CAG repeat in chromosome 2p-linked autosomal dominant spastic paraplegia.

    abstract::Autosomal dominant hereditary spastic paraplegia (AD-HSP) is a genetically heterogeneous disorder characterized by progressive spasticity of the lower limbs. A major locus (SPG4) causing AD-HSP in about 40% of the families was mapped to chromosome 2p. The analysis of six SPG4-linked AD-HSP families using the RED proce...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1006/geno.1999.5932

    authors: Hazan J,Davoine CS,Mavel D,Fonknechten N,Paternotte C,Fizames C,Cruaud C,Samson D,Muselet D,Vega-Czarny N,Brice A,Gyapay G,Heilig R,Fontaine B,Weissenbach J

    update_date: 1999-09-15 00:00:00

  • Molecular and cytogenetic characterization of a Chinese hamster/human hybrid cell line containing a der (21)t(Ypter-->cenY::cen21-->21qter) chromosome.

    abstract::Human/rodent somatic cell hybrids have been exceedingly useful in assigning human genes and DNA sequences to specific human chromosomes. As new technologies for analyzing the human chromosome complement of such human/rodent hybrid cells become available, it is of critical importance that these be applied to enhance ch...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1006/geno.1993.1026

    authors: Patterson D,Hart I,Lai LW,Brahe C,Moscetti A,Tassone F,Raimondi E,Jones C

    update_date: 1993-01-01 00:00:00

  • The 2p21 deletion syndrome: characterization of the transcription content.

    abstract::The vast majority of small-deletion syndromes are caused by haploinsufficiency of one or several genes and are transmitted as dominant traits. We have previously identified a homozygous deletion of 179,311 bp on chromosome 2p21 as the cause of a unique syndrome, inherited in a recessive mode, consisting of cystinuria,...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/j.ygeno.2005.04.001

    authors: Parvari R,Gonen Y,Alshafee I,Buriakovsky S,Regev K,Hershkovitz E

    update_date: 2005-08-01 00:00:00

  • Targeted construction of a high-resolution, integrated, comprehensive, and comparative map for a region specific to bovine chromosome 6 based on radiation hybrid mapping.

    abstract::To resolve a candidate chromosome region on the middle part of bovine chromosome 6 (BTA6) containing several different quantitative trait locus (QTL) intervals, we constructed a high-resolution, integrated, comprehensive, and comparative map using a 12,000-rad, whole-genome, cattle-hamster radiation hybrid (RH) panel....

    journal_title:Genomics

    pub_type: journal article

    doi:10.1006/geno.2002.6778

    authors: Weikard R,Kühn C,Goldammer T,Laurent P,Womack JE,Schwerin M

    update_date: 2002-06-01 00:00:00

  • Genomic sequence, organization, and chromosomal localization of human JAK3.

    abstract::Members of the Janus (JAK) protein tyrosine kinase family including JAK3 have recently emerged as important components in cytokine signal transduction. Mutations of JAK3 have been found in a number of patients who present with severe combined immunodeficiency. To facilitate the further identification of JAK3-SCID pati...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1006/geno.1996.0520

    authors: Riedy MC,Dutra AS,Blake TB,Modi W,Lal BK,Davis J,Bosse A,O'Shea JJ,Johnston JA

    update_date: 1996-10-01 00:00:00

  • An explanation for the phenotypic differences between patients bearing partial deletions of the DMD locus.

    abstract::Deletions giving rise to Duchenne muscular dystrophy (DMD) and the less severe Becker muscular dystrophy (BMD) occur in the same large gene on the short arm of the human X chromosome. We present a molecular mechanism to explain the clinical difference in severity between DMD and BMD patients who bear partial deletions...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/0888-7543(88)90113-9

    authors: Monaco AP,Bertelson CJ,Liechti-Gallati S,Moser H,Kunkel LM

    update_date: 1988-01-01 00:00:00

  • Long-range mapping of the gene for the human alpha 5(IV) collagen chain at Xq22-q23.

    abstract::The X-linked kidney disorder known as Alport syndrome (AS) has been shown to be due to mutations in the gene for an alpha 5 chain of type IV collagen that maps to Xq22-23. Using overlapping cDNA clones that represent approximately 90% of this gene and pulsed-field gel electrophoresis, we have constructed a 2.4-Mb long...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/0888-7543(92)90415-o

    authors: Vetrie D,Flinter F,Bobrow M,Harris A

    update_date: 1992-01-01 00:00:00

  • A novel human Mcm protein: homology to the yeast replication protein Mis5 and chromosomal location.

    abstract::Mcm proteins perform functions related to the regulation of eukaryotic genome replication. Previous work has shown that human cells contain at least five different Mcm proteins. We report now the amino acid sequence of an additional human Mcm protein, p105Mcm, and show that it is homologous to the Schizosaccharomyces ...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1006/geno.1996.0530

    authors: Holthoff HP,Hameister H,Knippers R

    update_date: 1996-10-01 00:00:00

  • HERV-K-T47D-Related long terminal repeats mediate polyadenylation of cellular transcripts.

    abstract::The human genome harbors thousands of long terminal repeats (LTRs) that are derived from endogenous retroviruses and contain elements able to regulate the expression of neighboring cellular genes. We have investigated the ability of human endogenous retroviral (HERV)-K LTRs to provide transcriptional processing signal...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1006/geno.2000.6175

    authors: Baust C,Seifarth W,Germaier H,Hehlmann R,Leib-Mösch C

    update_date: 2000-05-15 00:00:00

  • A genetic map of mouse chromosome 1 near the Lsh-Ity-Bcg disease resistance locus.

    abstract::Isozyme and restriction fragment length polymorphism (RFLP) analyses of backcross progeny, recombinant inbred strains, and congenic strains of mice positioned eight genetic markers with respect to the Lsh-Ity-Bcg disease resistance locus. Allelic isoforms of Idh-1 and Pep-3 and RFLPs detected by Southern hybridization...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/0888-7543(90)90518-y

    authors: Mock B,Krall M,Blackwell J,O'Brien A,Schurr E,Gros P,Skamene E,Potter M

    update_date: 1990-05-01 00:00:00

  • Cloning and chromosome localization of the mouse Ews gene.

    abstract::The human EWS gene encodes a putative RNA binding protein. As a result of acquired chromosome rearrangement, the N-terminal portion of the EWS protein is fused to the DNA binding domain of either FLI-1 or ERG in the Ewing family of tumors and to the DNA binding domain of ATF1 in malignant melanoma of soft parts. We ha...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1006/geno.1994.1495

    authors: Plougastel B,Mattei MG,Thomas G,Delattre O

    update_date: 1994-09-01 00:00:00

  • The proximity of DNA sequences in interphase cell nuclei is correlated to genomic distance and permits ordering of cosmids spanning 250 kilobase pairs.

    abstract::The physical distance between DNA sequences in interphase nuclei was determined using eight cosmids containing fragments of the Chinese hamster genome that span 273 kb surrounding the dihydrofolate reductase (DHFR) gene. The distance between these sequences at the molecular level has been determined previously by rest...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/0888-7543(89)90112-2

    authors: Trask B,Pinkel D,van den Engh G

    update_date: 1989-11-01 00:00:00

  • Evidence of systematic expressed sequence tag IMAGE clone cross-hybridization on cDNA microarrays.

    abstract::We present evidence of a potentially serious source of error intrinsic to all spotted cDNA microarrays that use IMAGE clones of expressed sequence tags (ESTs). We found that a high proportion of these EST sequences contain 5'-end poly(dT) sequences that are remnants from the oligo(dT)-primed reverse transcription of p...

    journal_title:Genomics

    pub_type: journal article

    doi:10.1016/j.ygeno.2003.12.010

    authors: Handley D,Serban N,Peters D,O'Doherty R,Field M,Wasserman L,Spirtes P,Scheines R,Glymour C

    update_date: 2004-06-01 00:00:00