Abstract:
BACKGROUND: The challenge of classifying cortical interneurons is yet to be solved. Data-driven classification into established morphological types may provide insight and practical value. RESULTS: We trained models using 217 high-quality morphologies of rat somatosensory neocortex interneurons reconstructed by a single laboratory and pre-classified into eight types. We quantified 103 axonal and dendritic morphometrics, including novel ones that capture features such as arbor orientation, extent in layer one, and dendritic polarity. We trained a one-versus-rest classifier for each type, combining well-known supervised classification algorithms with feature selection and over- and under-sampling. We accurately classified the nest basket, Martinotti, and basket cell types with the Martinotti model outperforming 39 out of 42 leading neuroscientists. We had moderate accuracy for the double bouquet, small and large basket types, and limited accuracy for the chandelier and bitufted types. We characterized the types with interpretable models or with up to ten morphometrics. CONCLUSION: Except for large basket, 50 high-quality reconstructions sufficed to learn an accurate model of a type. Improving these models may require quantifying complex arborization patterns and finding correlates of bouton-related features. Our study brings attention to practical aspects important for neuron classification and is readily reproducible, with all code and data available online.
journal_name: BMC Bioinformatics
journal_title: BMC bioinformatics
authors: Mihaljević B, Larrañaga P, Benavides-Piccione R, Hill S, DeFelipe J, Bielza C
doi: 10.1186/s12859-018-2470-1
subject: Has Abstract
pub_date: 2018-12-17 00:00:00
pages: 511
issue: 1
issn: 1471-2105
pii: 10.1186/s12859-018-2470-1
journal_volume: 19
pub_type: Journal Article
abstract:BACKGROUND:Evolutionary trees are central to a wide range of biological studies. In many of these studies, tree nodes and branches need to be associated (or annotated) with various attributes. For example, in studies concerned with organismal relationships, tree nodes are associated with taxonomic names, whereas tree b...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-10-356
update_date:2009-10-27 00:00:00
abstract:BACKGROUND:In many bacteria, intragenomic diversity in synonymous codon usage among genes has been reported. However, no quantitative attempt has been made to compare the diversity levels among different genomes. Here, we introduce a mean dissimilarity-based index (Dmean) for quantifying the level of diversity in synon...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-10-167
update_date:2009-06-01 00:00:00
abstract:BACKGROUND:Reverse transcription followed by real-time PCR is widely used for quantification of specific mRNA, and with the use of double-stranded DNA binding dyes it is becoming a standard for microarray data validation. Despite the kinetic information generated by real-time PCR, most popular analysis methods assume c...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-8-85
update_date:2007-03-09 00:00:00
abstract:BACKGROUND:Multiple co-inertia analysis (mCIA) is a multivariate analysis method that can assess relationships and trends in multiple datasets. Recently it has been used for integrative analysis of multiple high-dimensional -omics datasets. However, its estimated loading vectors are non-sparse, which presents challenge...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-3455-4
update_date:2020-04-15 00:00:00
abstract:BACKGROUND:Because loops connect regular secondary structures, analysis of the former depends directly on the definition of the latter. The numerous assignment methods, however, can offer different definitions. In a previous study, we defined a structural alphabet composed of 16 average protein fragments, which we call...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-5-58
update_date:2004-05-12 00:00:00
abstract:BACKGROUND:Multiple data-analytic methods have been proposed for evaluating gene-expression levels in specific biological pathways, assessing differential expression associated with a binary phenotype. Following Goeman and Bühlmann's recent review, we compared statistical performance of three methods, namely Global Tes...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-8-431
update_date:2007-11-07 00:00:00
abstract:BACKGROUND:Metabolomics is one of most recent omics technologies. It has been applied on fields such as food science, nutrition, drug discovery and systems biology. For this, gas chromatography-mass spectrometry (GC-MS) has been largely applied and many computational tools have been developed to support the analysis of...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-014-0374-2
update_date:2014-12-10 00:00:00
abstract:BACKGROUND:Zebrafish is a widely used model organism for studying heart development and cardiac-related pathogenesis. With the ability of surviving without a functional circulation at larval stages, strong genetic similarity between zebrafish and mammals, prolific reproduction and optically transparent embryos, zebrafi...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2166-6
update_date:2018-05-09 00:00:00
abstract:BACKGROUND:Molecular biology data exist on diverse scales, from the level of molecules to -omics. At the same time, the data at each scale can be categorised into multiple layers, such as the genome, transcriptome, proteome, metabolome, and biochemical pathways. Due to the highly multi-layer and multi-dimensional natur...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-10-31
update_date:2009-01-23 00:00:00
abstract:BACKGROUND:The explosive growth of biological data provides opportunities for new statistical and comparative analyses of large information sets, such as alignments comprising tens of thousands of sequences. In such studies, sequence annotations frequently play an essential role, and reliable results depend on metadata...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-9-S1-S7
update_date:2008-01-01 00:00:00
abstract:BACKGROUND:Proteins having similar functions from different sources can be identified by the occurrence in their sequences, a conserved cluster of amino acids referred to as pattern, motif, signature or fingerprint. The wide usage of protein sequence analysis in par with the growth of databases signifies the importance...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-5-127
update_date:2004-09-09 00:00:00
abstract:BACKGROUND:Great strides have been made in the effective treatment of HIV-1 with the development of second-generation protease inhibitors (PIs) that are effective against historically multi-PI-resistant HIV-1 variants. Nevertheless, mutation patterns that confer decreasing susceptibility to available PIs continue to ar...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-12-477
update_date:2011-12-15 00:00:00
abstract:BACKGROUND:During the last few years, DNA sequence analysis has become one of the primary means of taxonomic identification of species, particularly so for species that are minute or otherwise lack distinct, readily obtainable morphological characters. Although the number of sequences available for comparison in public...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-6-178
update_date:2005-07-18 00:00:00
abstract:BACKGROUND:In many laboratories, researchers store experimental data on their own workstation using spreadsheets. However, this approach poses a number of problems, ranging from sharing issues to inefficient data-mining. Standard spreadsheets are also error-prone, as data do not undergo any validation process. To overc...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-13-15
update_date:2012-01-26 00:00:00
abstract:BACKGROUND:Many functional RNA molecules fold into pseudoknot structures, which are often essential for the formation of an RNA's 3D structure. Currently the design of RNA molecules, which fold into a specific structure (known as RNA inverse folding) within biotechnological applications, is lacking the feature of incor...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-015-0815-6
update_date:2015-11-18 00:00:00
abstract:BACKGROUND:Aiming to understand cellular responses to different perturbations, the NIH Common Fund Library of Integrated Network-based Cellular Signatures (LINCS) program involves many institutes and laboratories working on over a thousand cell lines. The community-based Cell Line Ontology (CLO) is selected as the defa...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-017-1981-5
update_date:2017-12-21 00:00:00
abstract:BACKGROUND:Genome imputation, admixture resolution and genome-wide association analyses are timely and computationally intensive processes with many composite and requisite steps. Analysis time increases further when building and installing the run programs required for these analyses. For scientists that may not be as...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-019-2964-5
update_date:2019-06-28 00:00:00
abstract:BACKGROUND:Protein quality assessment (QA) useful for ranking and selecting protein models has long been viewed as one of the major challenges for protein tertiary structure prediction. Especially, estimating the quality of a single protein model, which is important for selecting a few good models out of a large model ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-016-1405-y
update_date:2016-12-05 00:00:00
abstract:BACKGROUND:Biochemically detailed stoichiometric matrices have now been reconstructed for various bacteria, yeast, and for the human cardiac mitochondrion based on genomic and proteomic data. These networks have been manually curated based on legacy data and elementally and charge balanced. Comparative analysis of thes...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-7-111
update_date:2006-03-06 00:00:00
abstract:Complexes of physically interacting proteins are one of the fundamental functional units responsible for driving key biological mechanisms within the cell. With the advent of high-throughput techniques, significant amount of protein interaction (PPI) data has been catalogued for organisms such as yeast, which has in t...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-13-S17-S16
update_date:2012-01-01 00:00:00
abstract:BACKGROUND:Automated protein function prediction methods are needed to keep pace with high-throughput sequencing. With the existence of many programs and databases for inferring different protein functions, a pipeline that properly integrates these resources will benefit from the advantages of each method. However, int...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-9-52
update_date:2008-01-25 00:00:00
abstract:BACKGROUND:As phenotypic features derived from heritable characters, the topologies of metabolic pathways contain both phylogenetic and phenetic components. In the post-genomic era, it is possible to measure the "phylophenetic" contents of different pathways topologies from a global perspective. RESULTS:We reconstruct...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-7-252
update_date:2006-05-09 00:00:00
abstract:BACKGROUND:Severity gradation of missense mutations is a big challenge for exome annotation. Predictors of deleteriousness that are most frequently used to filter variants found by next generation sequencing, produce qualitative predictions, but also numerical scores. It has never been tested if these scores correlate ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-018-2416-7
update_date:2018-11-30 00:00:00
abstract:BACKGROUND:With the advent of high-throughput proteomic experiments such as arrays of purified proteins comes the need to analyse sets of proteins as an ensemble, as opposed to the traditional one-protein-at-a-time approach. Although there are several publicly available tools that facilitate the analysis of protein set...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-7-338
update_date:2006-07-12 00:00:00
abstract:BACKGROUND:Reliable prediction of antibody, or B-cell, epitopes remains challenging yet highly desirable for the design of vaccines and immunodiagnostics. A correlation between antigenicity, solvent accessibility, and flexibility in proteins was demonstrated. Subsequently, Thornton and colleagues proposed a method for ...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-9-514
update_date:2008-12-02 00:00:00
abstract:BACKGROUND:DNA methylation changes are associated with a wide array of biological processes. Bisulfite conversion of DNA followed by high-throughput sequencing is increasingly being used to assess genome-wide methylation at single-base resolution. The relative slowness of most commonly used aligners for processing such...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-15-337
update_date:2014-10-18 00:00:00
abstract:BACKGROUND:Sequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings i...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-10-421
update_date:2009-12-15 00:00:00
abstract:BACKGROUND:The ability to confidently predict health outcomes from gene expression would catalyze a revolution in molecular diagnostics. Yet, the goal of developing actionable, robust, and reproducible predictive signatures of phenotypes such as clinical outcome has not been attained in almost any disease area. Here, w...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/s12859-020-3427-8
update_date:2020-03-20 00:00:00
abstract:BACKGROUND:An important question in the analysis of biochemical data is that of identifying subsets of molecular variables that may jointly influence a biological response. Statistical variable selection methods have been widely used for this purpose. In many settings, it may be important to incorporate ancillary biolo...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-13-94
update_date:2012-05-11 00:00:00
abstract:BACKGROUND:To infer the tree of life requires knowledge of the common characteristics of each species descended from a common ancestor as the measuring criteria and a method to calculate the distance between the resulting values of each measure. Conventional phylogenetic analysis based on genomic sequences provides inf...
journal_title:BMC bioinformatics
pub_type: Journal Article
doi:10.1186/1471-2105-7-284
update_date:2006-06-06 00:00:00