Abstract:
Random forests (RF) is a popular tree-based ensemble machine learning tool that is highly data adaptive, applies to "large p, small n" problems, and is able to account for correlation as well as interactions among features. This makes RF particularly appealing for high-dimensional genomic data analysis. In this article, we systematically review the applications and recent progresses of RF for genomic data, including prediction and classification, variable selection, pathway analysis, genetic association and epistasis detection, and unsupervised learning.
journal_name: Genomics
journal_title: Genomics
authors: Chen X, Ishwaran H
doi: 10.1016/j.ygeno.2012.04.003
subject: Has Abstract
pub_date: 2012-06-01 00:00:00
pages: 323-9
issue: 6
issn: 0888-7543
eissn: 1089-8646
pii: S0888-7543(12)00062-6
journal_volume: 99
pub_type: Journal Article, Review
Related literature
GENOMICS literature collection
abstract::Although occasional DNA polymorphisms have been observed in inbred mice, CBA/J and C3H/HeN mice have two microsatellite alleles at over 1/3 of microsatellite loci tested. Since DNA polymorphisms were not detected in DBA/2J, C57BL/6J, and BALB/cJ, the frequency of microsatellite polymorphisms appears to be strain speci...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1996.0592
update_date: 1996-11-15 00:00:00
abstract::The regional assignments of 30 expressed sequence tags (ESTs) on human chromosome 7 were determined by studying the segregation of their PCR-amplified products in a panel of mouse somatic cell hybrids. ESTs are important molecular landmarks for physical mapping and can be considered as tags to candidate genes for gene...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1995.0021
update_date: 1995-11-01 00:00:00
abstract::Portions of 16 chromosome 21 NotI linking clones were sequenced. These linking clone sequences represent sequence-tagged restriction sites that are potentially useful for finding genes and for finer genome mapping and sequencing. All of the clones were G+C rich (54 to 83%). CpG and GpC dinucleotide frequencies were ve...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1993.1455
update_date: 1993-11-01 00:00:00
abstract::In this paper we describe a method that uses the nearly covalent strength biotin-streptavidin interaction to attach a paramagnetic bead of micrometer size to a DNA molecule of nanometer size, scaling up the spatial size of a query DNA strand by a factor of 1000, making it visible to the human eye. The use of magnetic ...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1016/j.ygeno.2007.07.014
update_date: 2007-12-01 00:00:00
abstract::Methyl-CpG binding domain proteins (MBD) can specifically bind to methylated CpG sites and play important roles in epigenetic gene regulation. Here, we identified and functionally characterized the MBD protein in Tribolium castaneum. T. castaneum genome encodes only one MBD protein: TcMBD2/3. RNA interference targetin...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1016/j.ygeno.2019.12.018
update_date: 2020-05-01 00:00:00
abstract::NotI and EagI boundary libraries were constructed for human chromosome 21. One hundred forty-seven clones were isolated from the somatic cell hybrid 72532X-6 and localized using a hybrid mapping panel. After identification of those clones, which were isolated more than once, as well as those probes derived from a prev...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1016/0888-7543(91)90497-3
update_date: 1991-05-01 00:00:00
abstract::A complete genetic linkage map of the soybean, in which sequence-based (SB) genetic markers are evenly distributed genomewide, was constructed from an F(12) population composed of 113 recombinant inbred lines derived from an interspecific cross involving Korean genotypes Hwangkeum and IT182932. Several approaches were...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1016/j.ygeno.2008.03.008
update_date: 2008-07-01 00:00:00
abstract::FISH techniques have opened new possibilities for high-resolution genome mapping. Effective utilization of these techniques for the rapid orientation and ordering of adjacent and overlapping probes as well as for the characterization of long-range genomic contigs would facilitate physical mapping and positional clonin...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1995.0005
update_date: 1995-11-01 00:00:00
abstract::The goat beta-globin cluster is composed of a triplicated four-gene set. A locus control region (LCR) containing elements homologous to 5'DNase I hypersensitive sites (HS) 1, 2, and 3 of the human beta-globin LCR has been identified at the 5' end of this locus. We determined 10.2 kb of nucleotide sequence from the goa...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1016/0888-7543(91)90415-b
update_date: 1991-03-01 00:00:00
abstract::Laboratory mouse strains are known to have emerged from recent interbreeding between individuals of Mus musculus isolated populations. As a result of this breeding history, the collection of polymorphisms observed between laboratory mouse strains is likely to harbor the effects of natural selection between reproductiv...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1016/j.ygeno.2010.02.004
update_date: 2010-04-01 00:00:00
abstract::3',5'-Cyclic guanosine monophosphate is the intracellular second messenger regulating phototransduction in mammals. The level of cGMP in photoreceptor cells is controlled by the cGMP-hydrolyzing enzyme cGMP phosphodiesterase and the cGMP-producing enzyme guanylate cyclase. Identification of a photoreceptor-specific gu...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1994.1415
update_date: 1994-07-15 00:00:00
abstract::We isolated a mouse cDNA encoding APEX2 protein and demonstrated that APEX2 binds to PCNA. The level of Apex2 mRNA was high in the thymus, bone marrow, spleen, and kidney in adult mice. Apex2 consists of six exons and is flanked on the 3' end by Alas2 on X chromosome 63.0. Furthermore, Apex2 is flanked on the 5' end b...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1016/s0888-7543(02)00009-5
update_date: 2003-01-01 00:00:00
abstract::The WD-repeat protein family consists of a large group of structurally related yet functionally diverse proteins found predominantly in eukaryotic cells. These factors contain several (4-16) copies of a recognizable amino-acid sequence motif (the WD unit) thought to be organized into a "propeller-like" structure invol...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.2001.6682
update_date: 2002-01-01 00:00:00
abstract::Thirty-five new, unique, DNA probes have been isolated and each has been assigned to one of five regions on chromosome 22. The distribution of probes along the chromosome is what would be expected based on the estimated size of each region with the exception of the short arm (22p). RFLP analysis was performed using 13...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1016/0888-7543(91)90190-p
update_date: 1991-08-01 00:00:00
abstract::In response to nutrient deprivation, the ubiquitous Gram-negative soil bacterium Myxococcus xanthus undergoes a well-characterized developmental response, resulting in the formation of a multicellular fruiting body. The center of the fruiting body consists of myxospores; surrounding this structure are rod-shaped perip...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1016/j.ygeno.2019.09.008
update_date: 2020-03-01 00:00:00
abstract::Large deletions in Xq21 often are associated with contiguous gene syndromes consisting of X-linked deafness type 3 (DFN3), mental retardation (MRX), and choroideremia (CHM). The identification of deletions associated with classic CHM or DFN3 facilitated the positional cloning of the underlying genes, REP-1 and POU3F4,...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1999.6004
update_date: 1999-12-15 00:00:00
abstract::A human corneal fibroblast cDNA library was screened with a bovine lumican cDNA probe to obtain three clones. Sequencing of the longest clone (1.75 kb) yielded an open reading frame of 1014 bp coding for a 338-amino-acid core protein. Amino acid sequencing of a tryptic peptide resulted in a 9-amino-acid match with the...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1995.1080
update_date: 1995-06-10 00:00:00
abstract::The general strategies of phototransduction in vertebrates and invertebrates share many similarities, but differ significantly in their underlying molecular machinery. The CDS gene encodes the CDP-diacylglycerol synthase (CDS) enzyme and is required for phototransduction in Drosophila. Using a bioinformatic approach, ...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1998.5610
update_date: 1999-01-01 00:00:00
abstract::Alternative splicing is an important mechanism mediating the function of genes in multicellular organisms. Recently, we discovered a new splicing-junction wobble mechanism that generates subtle alterations in mRNA by randomly selecting tandem 5' and 3' splicing-junction sites. Here we developed a sensitive approach to...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1016/j.ygeno.2006.07.004
update_date: 2006-12-01 00:00:00
abstract::We report a genetic linkage map of the pericentromeric region of the human X chromosome, extending from Xp11 to Xq13. Genetic analysis with five polymorphic markers, including centromeric alpha satellite DNA, spanned a distance of approximately 38 cM. Significant lod scores were obtained with linkage analysis in 26 fa...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1016/0888-7543(88)90017-1
update_date: 1988-05-01 00:00:00
abstract::The promoter is a regulatory DNA region and important for gene transcriptional regulation. It is located near the transcription start site (TSS) upstream of the corresponding gene. In the post-genomics era, the availability of data makes it possible to build computational models for robustly detecting the promoters as...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1016/j.ygeno.2019.08.009
update_date: 2020-03-01 00:00:00
abstract::SCG10 is a neuronal growth-associated protein that shares an amino acid sequence similarity with an 18- to 19-kDa phosphoprotein named stathmin (also called p19, p18, Op18, pp17, prosolin, pp20, 19K, and leukemia-associated phosphoprotein, Lap18), which is more broadly expressed in a variety of cell types of the neura...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1993.1477
update_date: 1993-11-01 00:00:00
abstract::The solute carrier family 22 (SLC22) is a large family of organic cation and anion transporters. These are transmembrane proteins expressed predominantly in kidneys and liver and mediate the uptake and excretion of environmental toxins, endogenous substances, and drugs from the body. Through a comprehensive database s...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1016/j.ygeno.2007.03.017
update_date: 2007-11-01 00:00:00
abstract::One of the major challenges in genome research is the identification of the complete set of genes in a genome. Alignments of expressed sequences (RNA and EST) with genomic sequences have been used to characterize genes. However, the number of alignments far exceeds the likely number of genes in a genome, suggesting th...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1016/j.ygeno.2003.07.003
update_date: 2004-04-01 00:00:00
abstract::We have determined the cDNA and genomic structure of a gene (-14 gene) that lies adjacent to the human alpha-globin cluster. Although it is expressed in a wide range of cell lines and tissues, a previously described erythroid-specific regulatory element that controls expression of the alpha-globin genes lies within in...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1995.9951
update_date: 1995-10-10 00:00:00
abstract::Deficiency of the lysosomal enzyme, N-acetylgalactosamine 6-sulfatase (GALNS;EC 3.1.6.4), results in the storage of the glycosaminoglycans, keratan sulfate and chondroitin 6-sulfate, which leads to the lysosomal storage disorder Morquio A syndrome. Four overlapping genomic clones derived from a chromosome 16-specific ...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1994.1443
update_date: 1994-08-01 00:00:00
abstract::The human genome contains a group of gene families whose members map within the same regions of chromosomes 1, 6, and 9. The number of gene families involved and their pronounced clustering to the same areas of the genome indicate that their mapping relationship is nonrandom. By combining mapping data and sequence inf...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1996.0328
update_date: 1996-07-01 00:00:00
abstract::The primary structure of the human microsomal glutathione S-transferase gene (GST12) was determined by genomic cloning. The gene structure of GST12 spans 12.8 kb and consists of four exons and three introns. The coding sequence resides on exons 2, 3, and 4. Sequencing of the exons revealed two nucleotide differences c...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1996.0429
update_date: 1996-08-15 00:00:00
abstract::CL-20 is a novel gene encoding a protein that is structurally related to but distinct from the peripheral myelin protein PMP22. Like PMP22, CL-20 is likely to play important roles in the regulation of cell proliferation, differentiation, and cell death. In this study, we describe the cloning and sequencing of a cDNA e...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1997.4524
update_date: 1997-04-01 00:00:00
abstract::To facilitate studies of the SRY gene, a 4741-bp portion of the sex-determining region of the human Y chromosome was sequenced and characterized. Two RNAs were found to hybridize to this genomic segment, one transcript deriving from SRY and the second cross-hybridizing to a pseudogene located 2.5 kb 5' of the SRY open...
journal_title: Genomics
pub_type: Journal Article
doi: 10.1006/geno.1993.1395
update_date: 1993-09-01 00:00:00