Component retention in principal component analysis with application to cDNA microarray data.

Abstract:

Shannon entropy is used to provide an estimate of the number of interpretable components in a principal component analysis. In addition, several ad hoc stopping rules for dimension determination are reviewed and a modification of the broken stick model is presented. The modification incorporates a test for the presence of an "effective degeneracy" among the subspaces spanned by the eigenvectors of the correlation matrix of the data set and then allocates the total variance among subspaces. A summary of the performance of the methods applied to both published microarray data sets and to simulated data is given.
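
As a rough illustration of the two ideas named in the abstract, the sketch below computes the classical (unmodified) broken stick rule and an entropy-based effective dimension from a list of eigenvalues. It is not the authors' implementation: the function names are hypothetical, the modified broken stick model with its degeneracy test is not reproduced, and taking the exponential of the Shannon entropy of the normalized eigenvalues as the dimension estimate is an assumption made here for illustration.

```python
import math


def broken_stick_thresholds(p):
    """Expected share of the k-th longest piece of a unit stick broken
    at random into p pieces: b_k = (1/p) * sum_{i=k}^{p} 1/i."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]


def retained_by_broken_stick(eigenvalues):
    """Classical broken stick rule: keep leading components while their
    share of the total variance exceeds the broken stick expectation."""
    vals = sorted(eigenvalues, reverse=True)
    total = sum(vals)
    shares = [v / total for v in vals]
    count = 0
    for share, threshold in zip(shares, broken_stick_thresholds(len(vals))):
        if share <= threshold:
            break
        count += 1
    return count


def entropy_effective_dimension(eigenvalues):
    """Assumed estimate: exp(H), where H is the Shannon entropy of the
    normalized eigenvalues (zero eigenvalues contribute nothing)."""
    total = sum(eigenvalues)
    shares = [v / total for v in eigenvalues if v > 0]
    entropy = -sum(q * math.log(q) for q in shares)
    return math.exp(entropy)


if __name__ == "__main__":
    # Hypothetical eigenvalues of a 4 x 4 correlation matrix (they sum to 4).
    example = [3.0, 0.5, 0.3, 0.2]
    print(retained_by_broken_stick(example))
    print(entropy_effective_dimension(example))
```

With equal eigenvalues the entropy-based value equals the number of variables, and with a single nonzero eigenvalue it equals one, so it behaves as a continuous count of components that carry variance.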

journal_name

Biol Direct

journal_title

Biology direct

authors

Cangelosi R,Goriely A

doi

10.1186/1745-6150-2-2

subject

Has Abstract

pub_date

2007-01-17 00:00:00

pages

2

issn

1745-6150

pii

1745-6150-2-2

journal_volume

2

pub_type

Journal Article
  • Distinct groups of repetitive families preserved in mammals correspond to different periods of regulatory innovations in vertebrates.

    abstract:BACKGROUND:Mammalian genomes are repositories of repetitive DNA sequences derived from transposable elements (TEs). Typically, TEs generate multiple, mostly inactive copies of themselves, commonly known as repetitive families or families of repeats. Recently, we proposed that families of TEs originate in small populati...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-7-36

    authors: Jurka J,Bao W,Kojima KK,Kohany O,Yurka MG

    update_date:2012-10-25 00:00:00

  • Putative adaptive inter-slope divergence of transposon frequency in fruit flies (Drosophila melanogaster) at "Evolution Canyon", Mount Carmel, Israel.

    abstract:BACKGROUND:The current analysis of transposon elements (TE) in Drosophila melanogaster at Evolution Canyon, (EC), Israel, is based on data and analysis done by our collaborators (Drs. J. Gonzalez, J. Martinez and W. Makalowski, this issue). They estimated the frequencies of 28 TEs (transposon elements) in fruit flies (...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/s13062-015-0074-5

    authors: Beiles A,Raz S,Ben-Abu Y,Nevo E

    update_date:2015-10-14 00:00:00

  • Global analyses of Chromosome 17 and 18 genes of lung telocytes compared with mesenchymal stem cells, fibroblasts, alveolar type II cells, airway epithelial cells, and lymphocytes.

    abstract:BACKGROUND:Telocytes (TCs) are interstitial cells with extremely long and thin telopodes (Tps) with thin segments (podomers) and dilations (podoms) to interact with neighboring cells. TCs have been found in different organs, while there is still a lack of TC-specific biomarkers to distinguish TCs from the other cells...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/s13062-015-0042-0

    authors: Wang J,Ye L,Jin M,Wang X

    update_date:2015-03-11 00:00:00

  • Episodic, transient systemic acidosis delays evolution of the malignant phenotype: Possible mechanism for cancer prevention by increased physical activity.

    abstract:BACKGROUND:The transition from premalignant to invasive tumour growth is a prolonged multistep process governed by phenotypic adaptation to changing microenvironmental selection pressures. Cancer prevention strategies are required to interrupt or delay somatic evolution of the malignant invasive phenotype. Empirical st...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-5-22

    authors: Smallbone K,Maini PK,Gatenby RA

    update_date:2010-04-20 00:00:00

  • Infinitely long branches and an informal test of common ancestry.

    abstract:BACKGROUND:The evidence for universal common ancestry (UCA) is vast and persuasive. A phylogenetic test has been proposed for quantifying its odds against independently originated sequences based on the comparison between one versus several trees. This test was successfully applied to a well-supported homologous sequen...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/s13062-016-0120-y

    authors: de Oliveira Martins L,Posada D

    update_date:2016-04-07 00:00:00

  • The origins of phagocytosis and eukaryogenesis.

    abstract:BACKGROUND:Phagocytosis, that is, engulfment of large particles by eukaryotic cells, is found in diverse organisms and is often thought to be central to the very origin of the eukaryotic cell, in particular, for the acquisition of bacterial endosymbionts including the ancestor of the mitochondrion. RESULTS:Comparisons...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-4-9

    authors: Yutin N,Wolf MY,Wolf YI,Koonin EV

    update_date:2009-02-26 00:00:00

  • Insights into archaeal evolution and symbiosis from the genomes of a nanoarchaeon and its inferred crenarchaeal host from Obsidian Pool, Yellowstone National Park.

    abstract:BACKGROUND:A single cultured marine organism, Nanoarchaeum equitans, represents the Nanoarchaeota branch of symbiotic Archaea, with a highly reduced genome and unusual features such as multiple split genes. RESULTS:The first terrestrial hyperthermophilic member of the Nanoarchaeota was collected from Obsidian Pool, a ...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-8-9

    authors: Podar M,Makarova KS,Graham DE,Wolf YI,Koonin EV,Reysenbach AL

    update_date:2013-04-22 00:00:00

  • Why call it developmental bias when it is just development?

    abstract:The concept of developmental constraints has been central to understanding the role of development in morphological evolution. Developmental constraints are classically defined as biases imposed by development on the distribution of morphological variation. This opinion article argues that the concepts of developmental co...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/s13062-020-00289-w

    authors: Salazar-Ciudad I

    update_date:2021-01-09 00:00:00

  • Interplay of recombination and selection in the genomes of Chlamydia trachomatis.

    abstract:BACKGROUND:Chlamydia trachomatis is an obligate intracellular bacterial parasite, which causes several severe and debilitating diseases in humans. This study uses comparative genomic analyses of 12 complete published C. trachomatis genomes to assess the contribution of recombination and selection in this pathogen and t...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-6-28

    authors: Joseph SJ,Didelot X,Gandhi K,Dean D,Read TD

    update_date:2011-05-26 00:00:00

  • Outer membrane protein genes and their small non-coding RNA regulator genes in Photorhabdus luminescens.

    abstract:INTRODUCTION:Three major outer membrane protein genes of Escherichia coli, ompF, ompC, and ompA respond to stress factors. Transcripts from these genes are regulated by the small non-coding RNAs micF, micC, and micA, respectively. Here we examine Photorhabdus luminescens, an organism that has a different habitat from E...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-1-12

    authors: Papamichail D,Delihas N

    update_date:2006-05-22 00:00:00

  • Predictability of drug-induced liver injury by machine learning.

    abstract:BACKGROUND:Drug-induced liver injury (DILI) is a major concern in drug development, as hepatotoxicity may not be apparent at early stages but can lead to life threatening consequences. The ability to predict DILI from in vitro data would be a crucial advantage. In 2018, the Critical Assessment Massive Data Analysis gro...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/s13062-020-0259-4

    authors: Chierici M,Francescatto M,Bussola N,Jurman G,Furlanello C

    update_date:2020-02-13 00:00:00

  • The mechanistic and evolutionary aspects of the 2'- and 3'-OH paradigm in biosynthetic machinery.

    abstract:BACKGROUND:The translation machinery underlies a multitude of biological processes within the cell. The design and implementation of the modern translation apparatus on even the simplest course of action is extremely complex, and involves different RNA and protein factors. According to the "RNA world" idea, the critica...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-8-17

    authors: Safro M,Klipcan L

    update_date:2013-07-08 00:00:00

  • Elusive data underlying debate at the prokaryote-eukaryote divide.

    abstract:BACKGROUND:The origin of eukaryotic cells was an important transition in evolution. The factors underlying the origin and evolutionary success of the eukaryote lineage are still discussed. One camp argues that mitochondria were essential for eukaryote origin because of the unique configuration of internalized bioenerge...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/s13062-018-0221-x

    authors: Gerlitz M,Knopp M,Kapust N,Xavier JC,Martin WF

    update_date:2018-10-03 00:00:00

  • Origin of the nuclear proteome on the basis of pre-existing nuclear localization signals in prokaryotic proteins.

    abstract:BACKGROUND:The origin of the selective nuclear protein import machinery, which consists of nuclear pore complexes and adaptor molecules interacting with the nuclear localization signals (NLSs) of cargo molecules, is one of the most important events in the evolution of eukaryotic cells. How proteins were selected for im...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/s13062-020-00263-6

    authors: Lisitsyna OM,Kurnaeva MA,Arifulin EA,Shubina MY,Musinova YR,Mironov AA,Sheval EV

    update_date:2020-04-28 00:00:00

  • On origin of genetic code and tRNA before translation.

    abstract:BACKGROUND:Synthesis of proteins is based on the genetic code - a nearly universal assignment of codons to amino acids (aas). A major challenge to the understanding of the origins of this assignment is the archetypal "key-lock vs. frozen accident" dilemma. Here we re-examine this dilemma in light of 1) the fundamental ...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-6-14

    authors: Rodin AS,Szathmáry E,Rodin SN

    update_date:2011-02-22 00:00:00

  • PEPstrMOD: structure prediction of peptides containing natural, non-natural and modified residues.

    abstract:BACKGROUND:In the past, many methods have been developed for peptide tertiary structure prediction but they are limited to peptides having natural amino acids. This study describes a method PEPstrMOD, which is an updated version of PEPstr, developed specifically for predicting the structure of peptides containing natur...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/s13062-015-0103-4

    authors: Singh S,Singh H,Tuknait A,Chaudhary K,Singh B,Kumaran S,Raghava GP

    update_date:2015-12-21 00:00:00

  • Domain enhanced lookup time accelerated BLAST.

    abstract:BACKGROUND:BLAST is a commonly-used software package for comparing a query sequence to a database of known sequences; in this study, we focus on protein sequences. Position-specific-iterated BLAST (PSI-BLAST) iteratively searches a protein sequence database, using the matches in round i to construct a position-specific...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-7-12

    authors: Boratyn GM,Schäffer AA,Agarwala R,Altschul SF,Lipman DJ,Madden TL

    update_date:2012-04-17 00:00:00

  • Once upon a time the cell membranes: 175 years of cell boundary research.

    abstract:All modern cells are bounded by cell membranes best described by the fluid mosaic model. This statement is so widely accepted by biologists that little attention is generally given to the theoretical importance of cell membranes in describing the cell. This has not always been the case. When the Cell Theory was first ...

    journal_title:Biology direct

    pub_type: Historical Article, Journal Article, Review

    doi:10.1186/s13062-014-0032-7

    authors: Lombard J

    update_date:2014-12-19 00:00:00

  • IPC - Isoelectric Point Calculator.

    abstract:BACKGROUND:Accurate estimation of the isoelectric point (pI) based on the amino acid sequence is useful for many analytical biochemistry and proteomics techniques such as 2-D polyacrylamide gel electrophoresis, or capillary isoelectric focusing used in combination with high-throughput mass spectrometry. Additionally, p...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/s13062-016-0159-9

    authors: Kozlowski LP

    update_date:2016-10-21 00:00:00

  • Prokaryotic homologs of Argonaute proteins are predicted to function as key components of a novel system of defense against mobile genetic elements.

    abstract:BACKGROUND:In eukaryotes, RNA interference (RNAi) is a major mechanism of defense against viruses and transposable elements as well of regulating translation of endogenous mRNAs. The RNAi systems recognize the target RNA molecules via small guide RNAs that are completely or partially complementary to a region of the ta...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-4-29

    authors: Makarova KS,Wolf YI,van der Oost J,Koonin EV

    update_date:2009-08-25 00:00:00

  • A web server for analysis, comparison and prediction of protein ligand binding sites.

    abstract:BACKGROUND:One of the major challenges in the field of system biology is to understand the interaction between a wide range of proteins and ligands. In the past, methods have been developed for predicting binding sites in a protein for a limited number of ligands. RESULTS:In order to address this problem, we developed...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/s13062-016-0118-5

    authors: Singh H,Srivastava HK,Raghava GP

    update_date:2016-03-25 00:00:00

  • Proteomic changes associated with deletion of the Magnaporthe oryzae conidial morphology-regulating gene COM1.

    abstract:BACKGROUND:The rice blast disease caused by Magnaporthe oryzae is a major constraint on world rice production. The conidia produced by this fungal pathogen are the main source of disease dissemination. The morphology of conidia may be a critical factor in the spore dispersal and virulence of M. oryzae in the field. Del...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-5-61

    authors: Bhadauria V,Wang LX,Peng YL

    update_date:2010-11-02 00:00:00

  • Comparative genomic analysis of the DUF71/COG2102 family predicts roles in diphthamide biosynthesis and B12 salvage.

    abstract:BACKGROUND:The availability of over 3000 published genome sequences has enabled the use of comparative genomic approaches to drive the biological function discovery process. Classically, one used to link gene with function by genetic or biochemical approaches, a lengthy process that often took years. Phylogenetic distr...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-7-32

    authors: de Crécy-Lagard V,Forouhar F,Brochier-Armanet C,Tong L,Hunt JF

    update_date:2012-09-26 00:00:00

  • Is pre-Darwinian evolution plausible?

    abstract:BACKGROUND:This essay highlights critical aspects of the plausibility of pre-Darwinian evolution. It is based on a critical review of some better-known open, far-from-equilibrium system-based scenarios supposed to explain processes that took place before Darwinian evolution had emerged and that resulted in the origin o...

    journal_title:Biology direct

    pub_type: Journal Article, Review

    doi:10.1186/s13062-018-0216-7

    authors: Tessera M

    update_date:2018-09-21 00:00:00

  • Systematic evaluation of supervised machine learning for sample origin prediction using metagenomic sequencing data.

    abstract:BACKGROUND:The advent of metagenomic sequencing provides microbial abundance patterns that can be leveraged for sample origin prediction. Supervised machine learning classification approaches have been reported to predict sample origin accurately when the origin has been previously sampled. Using metagenomic datasets p...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/s13062-020-00287-y

    authors: Chen JC,Tyler AD

    update_date:2020-12-10 00:00:00

  • Activating and inhibiting connections in biological network dynamics.

    abstract:BACKGROUND:Many studies of biochemical networks have analyzed network topology. Such work has suggested that specific types of network wiring may increase network robustness and therefore confer a selective advantage. However, knowledge of network topology does not allow one to predict network dynamical behavior--for e...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-3-49

    authors: McDonald D,Waterbury L,Knight R,Betterton MD

    update_date:2008-12-04 00:00:00

  • Stringent homology-based prediction of H. sapiens-M. tuberculosis H37Rv protein-protein interactions.

    abstract:BACKGROUND:H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are essential for understanding the infection mechanism of the formidable pathogen M. tuberculosis H37Rv. Computational prediction is an important strategy to fill the gap in experimental H. sapiens-M. tuberculosis H37Rv PPI data. Homolo...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-9-5

    authors: Zhou H,Gao S,Nguyen NN,Fan M,Jin J,Liu B,Zhao L,Xiong G,Tan M,Li S,Wong L

    update_date:2014-04-08 00:00:00

  • Stable feature selection and classification algorithms for multiclass microarray data.

    abstract:BACKGROUND:Recent studies suggest that gene expression profiles are a promising alternative for clinical cancer classification. One major problem in applying DNA microarrays for classification is the dimension of obtained data sets. In this paper we propose a multiclass gene selection method based on Partial Least Squa...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-7-33

    authors: Student S,Fujarewicz K

    update_date:2012-10-02 00:00:00

  • The UBR-box and its relationship to binuclear RING-like treble clef zinc fingers.

    abstract:BACKGROUND:The N-end rule pathway is a part of the ubiquitin-dependent proteolytic system wherein N-recognin proteins recognize the amino terminal degradation signals (N-degrons) of the substrate. The type 1 N-degron recognizing UBR-box domain of the eukaryotic Arg/N-end rule pathway is known to possess a novel three-z...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/s13062-015-0066-5

    authors: Kaur G,Subramanian S

    update_date:2015-07-17 00:00:00

  • Orphan SelD proteins and selenium-dependent molybdenum hydroxylases.

    abstract:Bacterial and Archaeal cells use selenium structurally in selenouridine-modified tRNAs, in proteins translated with selenocysteine, and in the selenium-dependent molybdenum hydroxylases (SDMH). The first two uses both require the selenophosphate synthetase gene, selD. Examining over 500 complete prokaryotic genomes fi...

    journal_title:Biology direct

    pub_type: Journal Article

    doi:10.1186/1745-6150-3-4

    authors: Haft DH,Self WT

    update_date:2008-02-20 00:00:00