Spectral methods in machine learning and new strategies for very large datasets.

Abstract:

Spectral methods are of fundamental importance in statistics and machine learning, because they underlie algorithms from classical principal components analysis to more recent approaches that exploit manifold structure. In most cases, the core technical problem can be reduced to computing a low-rank approximation to a positive-definite kernel. For the growing number of applications dealing with very large or high-dimensional datasets, however, the optimal approximation afforded by an exact spectral decomposition is too costly, because its complexity scales as the cube of either the number of training examples or their dimensionality. Motivated by such applications, we present here two new algorithms for the approximation of positive-semidefinite kernels, together with error bounds that improve on results in the literature. We approach this problem by seeking to determine, in an efficient manner, the most informative subset of our data relative to the kernel approximation task at hand. This leads to two new strategies based on the Nyström method that are directly applicable to massive datasets. The first of these, based on sampling, leads to a randomized algorithm in which the kernel induces a probability distribution on its set of partitions, whereas the second, based on sorting, selects a partition deterministically. We detail their numerical implementation and provide simulation results for a variety of representative problems in statistical data analysis, each of which demonstrates the improved performance of our approach relative to existing methods.
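
The Nyström idea mentioned in the abstract can be illustrated with a minimal sketch. It approximates a positive-semidefinite kernel matrix K from a subset of its columns, using the standard form C W^+ C^T, where C holds the selected columns and W is the corresponding principal submatrix. The two selection rules below (sampling in proportion to the diagonal, and sorting by the diagonal) are simple hypothetical stand-ins chosen for illustration; the abstract does not specify the paper's actual sampling distribution or sorting criterion, and all function names here are assumptions.

```python
import numpy as np


def nystrom(K, idx):
    """Nystrom approximation of a symmetric PSD matrix K from the columns in idx."""
    C = K[:, idx]                    # n x k: selected columns of K
    W = K[np.ix_(idx, idx)]          # k x k: principal submatrix on the selected indices
    return C @ np.linalg.pinv(W) @ C.T


def select_by_sampling(K, k, rng):
    """Hypothetical randomized rule: draw k distinct indices with probability
    proportional to the diagonal of K (an assumption, not the paper's distribution)."""
    p = np.diag(K) / np.trace(K)
    return rng.choice(K.shape[0], size=k, replace=False, p=p)


def select_by_sorting(K, k):
    """Hypothetical deterministic rule: keep the k indices with the largest diagonal entries."""
    return np.argsort(np.diag(K))[::-1][:k]


def relative_error(K, K_hat):
    """Frobenius-norm error of the approximation, relative to the norm of K."""
    return np.linalg.norm(K - K_hat) / np.linalg.norm(K)


rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
K = X @ X.T                          # rank-5 linear kernel on 200 points

for name, idx in [("sampling", select_by_sampling(K, 5, rng)),
                  ("sorting", select_by_sorting(K, 5))]:
    print(name, relative_error(K, nystrom(K, idx)))
```

Because the toy kernel here has rank 5, five linearly independent columns should reproduce it up to rounding error. For a full-rank kernel the approximation error depends on which subset is selected, which is the problem the abstract addresses. The approximation costs one k x k pseudoinverse plus matrix products with the n x k block, rather than a decomposition of the full n x n matrix.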

authors

Belabbas MA,Wolfe PJ

doi

10.1073/pnas.0810600105

subject

Has Abstract

pub_date

2009-01-13 00:00:00

pages

369-74

issue

2

eissn

1091-6490

issn

0027-8424

pii

0810600105

journal_volume

106

pub_type

Journal Article
  • Expression of the mouse serum albumin gene introduced into differentiated and dedifferentiated rat hepatoma cells.

    abstract::A 23-kilobase-pair segment of DNA containing the entire mouse serum albumin gene as well as 2.2 kilobase pairs of 5' and 4.3 kilobase pairs of 3' flanking sequences has been introduced into pSV2dhfr, a plasmid in which expression of the mouse dihydrofolate reductase cDNA is under the control of simian virus 40 sequenc...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.82.3.765

    authors: Deschatrette J,Fougere-Deschatrette C,Corcos L,Schimke RT

    update_date:1985-02-01 00:00:00

  • Most of the G1 period in hamster cells is eliminated by lengthening the S period.

    abstract::Two Chinese hamster cell lines, G1+-1 and CHO, have been grown in the presence of low concentrations of hydroxyurea to determine how a slowing of DNA synthesis (i.e., a lengthening of the S period) affects the length of the G1 period. Hydroxyurea concentrations of approximately 10 microM do not alter the generation times...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.78.10.6295

    authors: Stancel GM,Prescott DM,Liskay RM

    update_date:1981-10-01 00:00:00

  • Antagonistic nature of T helper 1/2 developmental programs in opposing peripheral induction of Foxp3+ regulatory T cells.

    abstract::Recent studies have highlighted the importance of peripheral induction of Foxp3-expressing regulatory T cells (Tregs) in the dominant control of immunological tolerance. However, Foxp3(+) Treg differentiation from naïve CD4(+) T cells occurs only under selective conditions, whereas the classical T helper (Th) 1 and 2 ...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.0703642104

    authors: Wei J,Duramad O,Perng OA,Reiner SL,Liu YJ,Qin FX

    update_date:2007-11-13 00:00:00

  • The gene for the neuropeptide gonadotropin-releasing hormone is expressed in the mammary gland of lactating rats.

    abstract::The high concentration of gonadotropin-releasing hormone (GnRH) in milk of several species implies that the mammary gland is either a site of synthesis for this neuropeptide or that it is efficiently concentrated from plasma by this organ. By PCR amplification of mammary gland cDNA, we have demonstrated expression of ...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.91.11.4994

    authors: Palmon A,Ben Aroya N,Tel-Or S,Burstein Y,Fridkin M,Koch Y

    update_date:1994-05-24 00:00:00

  • Anandamide transport is independent of fatty-acid amide hydrolase activity and is blocked by the hydrolysis-resistant inhibitor AM1172.

    abstract::The endogenous cannabinoid anandamide is removed from the synaptic space by a high-affinity transport system present in neurons and astrocytes, which is inhibited by N-(4-hydroxyphenyl)-arachidonamide (AM404). After internalization, anandamide is hydrolyzed by fatty-acid amide hydrolase (FAAH), an intracellular membra...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.0400997101

    authors: Fegley D,Kathuria S,Mercier R,Li C,Goutopoulos A,Makriyannis A,Piomelli D

    update_date:2004-06-08 00:00:00

  • Atherosclerosis and hypertension induction by lead and cadmium ions: an effect prevented by calcium ion.

    abstract::In epidemiological studies, both positive and negative correlations have been found between cardiovascular disease and mortality and the presence of several inorganic ions in the drinking water. In an attempt to resolve this apparent disagreement, we exposed White Carneau pigeons to drinking water containing calcium (...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.78.10.6494

    authors: Revis NW,Zinsmeister AR,Bull R

    update_date:1981-10-01 00:00:00

  • Three forms of cytochrome b 559 and their relation to the photosynthetic activity of chloroplasts.

    abstract::Spinach chloroplasts were found to contain three forms of cytochrome b(559) that have the same alpha-peak at 559 nm, but are distinguished from one another by their oxidation-reduction potentials. The high-potential (H) form (E(m) about 330-350 mV) is reducible by hydroquinone, the middle-potential (M) form (E(m) abou...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.68.12.3064

    authors: Wada K,Arnon DI

    update_date:1971-12-01 00:00:00

  • The 2.8-Å structure of rat liver F1-ATPase: configuration of a critical intermediate in ATP synthesis/hydrolysis.

    abstract::During mitochondrial ATP synthesis, F1-ATPase (the portion of the ATP synthase that contains the catalytic and regulatory nucleotide binding sites) undergoes a series of concerted conformational changes that couple proton translocation to the synthesis of the high levels of ATP required for cellular function. In the str...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.95.19.11065

    authors: Bianchet MA,Hullihen J,Pedersen PL,Amzel LM

    update_date:1998-09-15 00:00:00

  • High-amplitude cofluctuations in cortical activity drive functional connectivity.

    abstract::Resting-state functional connectivity is used throughout neuroscience to study brain organization and to generate biomarkers of development, disease, and cognition. The processes that give rise to correlated activity are, however, poorly understood. Here we decompose resting-state functional connectivity using a tempo...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.2005531117

    authors: Zamani Esfahlani F,Jo Y,Faskowitz J,Byrge L,Kennedy DP,Sporns O,Betzel RF

    update_date:2020-11-10 00:00:00

  • A cation binding motif stabilizes the compound I radical of cytochrome c peroxidase.

    abstract::Cytochrome c peroxidase reacts with peroxide to form compound I, which contains an oxyferryl heme and an indolyl radical at Trp-191. The indolyl free radical has a half-life of several hours at room temperature, and this remarkable stability is essential for the catalytic function of cytochrome c peroxidase. To probe ...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.91.23.11118

    authors: Miller MA,Han GW,Kraut J

    update_date:1994-11-08 00:00:00

  • Amino acid sequence of a progesterone-binding protein.

    abstract::The amino acid sequence of blastokinin, also called uteroglobin, has been determined by a combined study of both the intact native molecule and the peptide fragments resulting from tryptic and chymotryptic digestions. Sequence analyses performed by automated methods and by sequential digestion with leucine aminopeptid...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.75.11.5516

    authors: Popp RA,Foresman KR,Wise LD,Daniel JC Jr

    update_date:1978-11-01 00:00:00

  • Serine protease autotransporters from Shigella flexneri and pathogenic Escherichia coli target a broad range of leukocyte glycoproteins.

    abstract::The serine protease autotransporters of Enterobacteriaceae (SPATEs) are secreted by pathogenic Gram-negative bacteria through the autotransporter pathway. We previously classified SPATE proteins into two classes: cytotoxic (class 1) and noncytotoxic (class 2). Here, we show that Pic, a class 2 SPATE protein produced b...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.1101006108

    authors: Ruiz-Perez F,Wahid R,Faherty CS,Kolappaswamy K,Rodriguez L,Santiago A,Murphy E,Cross A,Sztein MB,Nataro JP

    update_date:2011-08-02 00:00:00

  • Defining recovery neurobiology of injured spinal cord by synthetic matrix-assisted hMSC implantation.

    abstract::Mesenchymal stromal stem cells (MSCs) isolated from adult tissues offer tangible potential for regenerative medicine, given their feasibility for autologous transplantation. MSC research shows encouraging results in experimental stroke, amyotrophic lateral sclerosis, and neurotrauma models. However, further translatio...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.1616340114

    authors: Ropper AE,Thakor DK,Han I,Yu D,Zeng X,Anderson JE,Aljuboori Z,Kim SW,Wang H,Sidman RL,Zafonte RD,Teng YD

    update_date:2017-01-31 00:00:00

  • Accelerated autoxidation and heme loss due to instability of sickle hemoglobin.

    abstract::The pleiotropic effect of the sickle gene suggests that factors in addition to polymerization of the mutant gene product might be involved in sickle disease pathobiology. We have examined rates of heme transfer to hemopexin from hemoglobin in dilute aqueous solution (0.5 mg of Hb per ml) at 37 degrees C. HbO2 S loses ...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.85.1.237

    authors: Hebbel RP,Morgan WT,Eaton JW,Hedlund BE

    update_date:1988-01-01 00:00:00

  • A PIIB-type Ca2+-ATPase is essential for stress adaptation in Physcomitrella patens.

    abstract::Transient cytosolic Ca(2+) ([Ca(2+)](cyt)) elevations are early events in plant signaling pathways including those related to abiotic stress. The restoration of [Ca(2+)](cyt) to prestimulus levels involves ATP-driven Ca(2+) pumps, but direct evidence for an essential role of a plant Ca(2+)-ATPase in abiotic stress ada...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.0800864105

    authors: Qudeimat E,Faltusz AM,Wheeler G,Lang D,Holtorf H,Brownlee C,Reski R,Frank W

    update_date:2008-12-09 00:00:00

  • Achievements and challenges in the biology of environmental effects.

    abstract::The starting point for the study of adverse experiences is that some have enduring consequences that continue after the period of exposure to the adversity. That raises four basic issues: whether social adversities can be considered homogeneous, whether the crucial effect lies in the "objective" or subjectively percei...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article, Review

    doi:10.1073/pnas.1121258109

    authors: Rutter M

    update_date:2012-10-16 00:00:00

  • Beyond superquenching: hyper-efficient energy transfer from conjugated polymers to gold nanoparticles.

    abstract::Gold nanoparticles quench the fluorescence of cationic polyfluorene with Stern-Volmer constants (KSV) approaching 10^11 M^-1, several orders of magnitude larger than any previously reported conjugated polymer-quencher pair and 9-10 orders of magnitude larger than small molecule dye-quencher pairs. The dependence of KSV ...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.1132025100

    authors: Fan C,Wang S,Hong JW,Bazan GC,Plaxco KW,Heeger AJ

    update_date:2003-05-27 00:00:00

  • Statistical methods for characterizing similarities and differences between semantic structures.

    abstract::This paper describes a variety of statistical methods for obtaining precise quantitative estimates of the similarities and differences in the structures of semantic domains in different languages. The methods include comparing mean correlations within and between groups, principal components analysis of interspeaker c...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.97.1.518

    authors: Romney AK,Moore CC,Batchelder WH,Hsia TL

    update_date:2000-01-04 00:00:00

  • Sequence requirements for cleavage activation of influenza virus hemagglutinin expressed in mammalian cells.

    abstract::Cleavage of the hemagglutinin (HA) in tissue culture systems has been correlated with virulence of avian influenza viruses. To examine the structural requirements for cleavage of the HA, the HA gene from a virulent H5 influenza virus was expressed in mammalian cells (CV-1), and the cleavage site of the HA was explored...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.85.2.324

    authors: Kawaoka Y,Webster RG

    update_date:1988-01-01 00:00:00

  • Repair of O6-ethylguanine in DNA by a chromatin fraction from rat liver: transfer of the ethyl group to an acceptor protein.

    abstract::Incubation of O6-[3H]ethylguanine-containing DNA with a rat liver chromatin fraction resulted in a decrease in the O6-ethylguanine content of the DNA. Analysis of the products of this reaction showed that the ethyl group had been transferred from the O6-ethylguanine to a protein acceptor. When the incubation mixture w...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.78.11.6766

    authors: Mehta JR,Ludlum DB,Renard A,Verly WG

    update_date:1981-11-01 00:00:00

  • Agricultural intensification escalates future conservation costs.

    abstract::The supposition that agricultural intensification results in land sparing for conservation has become central to policy formulations across the tropics. However, underlying assumptions remain uncertain and have been little explored in the context of conservation incentive schemes such as policies for Reducing Emission...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.1220070110

    authors: Phelps J,Carrasco LR,Webb EL,Koh LP,Pascual U

    update_date:2013-05-07 00:00:00

  • Polycystic disease caused by deficiency in xylosyltransferase 2, an initiating enzyme of glycosaminoglycan biosynthesis.

    abstract::The basic biochemical mechanisms underlying many heritable human polycystic diseases are unknown despite evidence that most cases are caused by mutations in members of several protein families, the most prominent being the polycystin gene family, whose products are found on the primary cilia, or due to mutations in po...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.0700908104

    authors: Condac E,Silasi-Mansat R,Kosanke S,Schoeb T,Towner R,Lupu F,Cummings RD,Hinsdale ME

    update_date:2007-05-29 00:00:00

  • Sequential repression and activation of the CCAAT enhancer-binding protein-alpha (C/EBPalpha) gene during adipogenesis.

    abstract::CCAAT enhancer-binding protein-alpha (C/EBPalpha) functions as a pleiotropic transcriptional activator of adipocyte genes during adipogenesis. Nuclear factor C/EBP undifferentiated protein (CUP), an isoform of activator protein-2alpha (AP-2alpha), binds to repressive elements in the C/EBPalpha gene promoter, silencing...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.220426097

    authors: Jiang MS,Lane MD

    update_date:2000-11-07 00:00:00

  • Accumulation of formamide in hydrothermal pores to form prebiotic nucleobases.

    abstract::Formamide is one of the important compounds from which prebiotic molecules can be synthesized, provided that its concentration is sufficiently high. For nucleotides and short DNA strands, it has been shown that a high degree of accumulation in hydrothermal pores occurs, so that temperature gradients might play a role ...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.1600275113

    authors: Niether D,Afanasenkau D,Dhont JK,Wiegand S

    update_date:2016-04-19 00:00:00

  • Interaction between amyloid precursor protein and presenilins in mammalian cells: implications for the pathogenesis of Alzheimer disease.

    abstract::Mutations in the presenilin 1 (PS1) and presenilin 2 (PS2) genes increase the production of the highly amyloidogenic 42-residue form of amyloid beta-protein (Abeta42) in a variety of cell lines and transgenic mice. To elucidate the molecular mechanism of this effect, wild-type (wt) or mutant PS1 and PS2 genes were sta...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.94.15.8208

    authors: Xia W,Zhang J,Perez R,Koo EH,Selkoe DJ

    update_date:1997-07-22 00:00:00

  • Nucleosome positioning by genomic excluding-energy barriers.

    abstract::Recent genome-wide nucleosome mappings along with bioinformatics studies have confirmed that the DNA sequence plays a more important role in the collective organization of nucleosomes in vivo than previously thought. Yet in living cells, this organization also results from the action of various external factors like D...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.0909511106

    authors: Milani P,Chevereau G,Vaillant C,Audit B,Haftek-Terreau Z,Marilley M,Bouvet P,Argoul F,Arneodo A

    update_date:2009-12-29 00:00:00

  • Complementation of areA- regulatory gene mutations of Aspergillus nidulans by the heterologous regulatory gene nit-2 of Neurospora crassa.

    abstract::Loss-of-function mutations in the regulatory gene areA of Aspergillus nidulans prevent the utilization of a wide variety of nitrogen sources. The phenotypes of nit-2 mutants of Neurospora crassa suggest that this gene may be analogous to the areA gene. Transformation has been used to introduce a plasmid containing the...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.84.11.3753

    authors: Davis MA,Hynes MJ

    update_date:1987-06-01 00:00:00

  • Electron transfer pathways in a multiheme cytochrome MtrF.

    abstract::In MtrF, an outer-membrane multiheme cytochrome, the 10 heme groups are arranged in heme binding domains II and IV along the pseudo-C2 axis, forming the electron transfer (ET) pathways. Previous reports based on molecular dynamics simulations showed that the redox potential (Em) values for the heme pairs located in sy...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.1617615114

    authors: Watanabe HC,Yamashita Y,Ishikita H

    update_date:2017-03-14 00:00:00

  • BACE2, a beta-secretase homolog, cleaves at the beta site and within the amyloid-beta region of the amyloid-beta precursor protein.

    abstract::Production of amyloid-beta protein (Abeta) is initiated by a beta-secretase that cleaves the Abeta precursor protein (APP) at the N terminus of Abeta (the beta site). A recently identified aspartyl protease, BACE, cleaves the beta site and at residue 11 within the Abeta region of APP. Here we show that BACE2, a BACE h...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.160115697

    authors: Farzan M,Schnitzler CE,Vasilieva N,Leung D,Choe H

    update_date:2000-08-15 00:00:00

  • Hypothalamic extended synaptotagmin-3 contributes to the development of dietary obesity and related metabolic disorders.

    abstract::The C2 domain containing protein extended synaptotagmin (E-Syt) plays important roles in both lipid homeostasis and the intracellular signaling; however, its role in physiology remains largely unknown. Here, we show that hypothalamic E-Syt3 plays a critical role in diet-induced obesity (DIO). E-Syt3 is characteristica...

    journal_title:Proceedings of the National Academy of Sciences of the United States of America

    pub_type: Journal Article

    doi:10.1073/pnas.2004392117

    authors: Zhang Y,Guan Y,Pan S,Yan L,Wang P,Chen Z,Shen Q,Zhao F,Zhang X,Li J,Li J,Cai D,Zhang G

    update_date:2020-08-18 00:00:00