Abstract:
BACKGROUND:Chemical named entities represent an important facet of biomedical text. RESULTS:We have developed a system to use character-based n-grams, Maximum Entropy Markov Models and rescoring to recognise chemical names and other such entities, and to make confidence estimates for the extracted entities. An adjustable threshold allows the system to be tuned to high precision or high recall. At a threshold set for balanced precision and recall, we were able to extract named entities at an F score of 80.7% from chemistry papers and 83.2% from PubMed abstracts. Furthermore, we were able to achieve 57.6% and 60.3% recall at 95% precision, and 58.9% and 49.1% precision at 90% recall. CONCLUSION:These results show that chemical named entities can be extracted with good performance, and that the properties of the extraction can be tuned to suit the demands of the task.
journal_name
BMC Bioinformaticsjournal_title
BMC bioinformaticsauthors
Corbett P,Copestake Adoi
10.1186/1471-2105-9-S11-S4subject
Has Abstractpub_date
2008-11-19 00:00:00pages
S4issn
1471-2105pii
1471-2105-9-S11-S4journal_volume
9 Suppl 11pub_type
杂志文章abstract:BACKGROUND:Infections are often associated to comorbidity that increases the risk of medical conditions which can lead to further morbidity and mortality. SARS is a threat which is similar to MERS virus, but the comorbidity is the key aspect to underline their different impacts. One UK doctor says "I'd rather have HIV ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-333
更新日期:2014-10-24 00:00:00
abstract:BACKGROUND:Transposable elements (TE) are mobile genetic entities present in nearly all genomes. Previous work has shown that TEs tend to have a different nucleotide composition than the host genes, either considering codon usage bias or dinucleotide frequencies. We show here how these compositional differences can be ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-5-94
更新日期:2004-07-13 00:00:00
abstract:BACKGROUND:We carried out an analysis of intron length conservation across a diverse group of nineteen mammalian species. Motivated by recent research suggesting a role for time delays associated with intron transcription in gene expression oscillations required for early embryonic patterning, we searched for examples ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S9-S16
更新日期:2011-10-05 00:00:00
abstract:BACKGROUND:Meta-analysis (MA) is widely used to pool genome-wide association studies (GWASes) in order to a) increase the power to detect strong or weak genotype effects or b) as a result verification method. As a consequence of differing SNP panels among genotyping chips, imputation is the method of choice within GWAS...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-13-231
更新日期:2012-09-12 00:00:00
abstract:BACKGROUND:Mass spectrometry (MS) coupled with online separation methods is commonly applied for differential and quantitative profiling of biological samples in metabolomic as well as proteomic research. Such approaches are used for systems biology, functional genomics, and biomarker discovery, among others. An ongoin...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-395
更新日期:2010-07-23 00:00:00
abstract:BACKGROUND:Population stratification is a systematic difference in allele frequencies between subpopulations. This can lead to spurious association findings in the case-control genome wide association studies (GWASs) used to identify single nucleotide polymorphisms (SNPs) associated with disease-linked phenotypes. Meth...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-14-61
更新日期:2013-02-22 00:00:00
abstract:BACKGROUND:Hepatocellular carcinoma (HCC) is an aggressive epithelial tumor which shows very poor prognosis and high rate of recurrence, representing an urgent problem for public healthcare. MicroRNAs (miRNAs/miRs) are a class of small, non-coding RNAs that attract great attention because of their role in regulation of...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-015-0836-1
更新日期:2015-12-10 00:00:00
abstract:BACKGROUND:The identification of statistically overrepresented sequences in the upstream regions of coregulated genes should theoretically permit the identification of potential cis-regulatory elements. However, in practice many cis-regulatory elements are highly degenerate, precluding the use of an exhaustive word-cou...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-254
更新日期:2006-05-15 00:00:00
abstract:BACKGROUND:Polymorphic variants and mutations disrupting canonical splicing isoforms are among the leading causes of human hereditary disorders. While there is a substantial evidence of aberrant splicing causing Mendelian diseases, the implication of such events in multi-genic disorders is yet to be well understood. We...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-22
更新日期:2010-01-12 00:00:00
abstract:BACKGROUND:In protein design, correct use of topology is among the initial and most critical feature. Meticulous selection of backbone topology aids in drastically reducing the structure search space. With ProLego, we present a server application to explore the component aspect of protein structures and provide an intu...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2171-9
更新日期:2018-05-04 00:00:00
abstract:BACKGROUND:Microarray technology allows the analysis of genomic aberrations at an ever increasing resolution, making functional interpretation of these vast amounts of data the main bottleneck in routine implementation of high resolution array platforms, and emphasising the need for a centralised and easy to use CNV da...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-4
更新日期:2011-01-05 00:00:00
abstract:BACKGROUND:The development of high-throughput sequencing and analysis has accelerated multi-omics studies of thousands of microbial species, metagenomes, and infectious disease pathogens. Omics studies are enabling genotype-phenotype association studies which identify genetic determinants of pathogen virulence and drug...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2580-9
更新日期:2019-01-07 00:00:00
abstract:BACKGROUND:Inflammation is a core element of many different, systemic and chronic diseases that usually involve an important autoimmune component. The clinical phase of inflammatory diseases is often the culmination of a long series of pathologic events that started years before. The systemic characteristics and relate...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-018-2413-x
更新日期:2018-11-30 00:00:00
abstract:BACKGROUND:In two-channel competitive genomic hybridization microarray experiments, the ratio of the two fluorescent signal intensities at each spot on the microarray is commonly used to infer the relative amounts of the test and reference sample DNA levels. This ratio may be influenced by systematic measurement effect...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-6-274
更新日期:2005-11-18 00:00:00
abstract:BACKGROUND:Post-transcriptional regulation is a complex mechanism that plays a central role in defining multiple cellular identities starting from a common genome. Modifications in the length of 3'UTRs have been found to play an important role in this context, since alternative 3' UTRs could lead to differences for exa...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-016-1254-8
更新日期:2016-10-18 00:00:00
abstract:BACKGROUND:Carbohydrates are a class of large and diverse biomolecules, ranging from a simple monosaccharide to large multi-branching glycan structures. The covalent linkage of a carbohydrate to the nitrogen atom of an asparagine, a process referred to as N-linked glycosylation, plays an important role in the physiolog...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-019-3097-6
更新日期:2019-10-22 00:00:00
abstract:BACKGROUND:Microarray technologies produced large amount of data. The hierarchical clustering is commonly used to identify clusters of co-expressed genes. However, microarray datasets often contain missing values (MVs) representing a major drawback for the use of the clustering methods. Usually the MVs are not treated,...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-5-114
更新日期:2004-08-23 00:00:00
abstract:BACKGROUND:Endometrial cancers (ECs) are one of the most common types of malignant tumor in females. Substantial efforts had been made to identify significantly mutated genes (SMGs) in ECs and use them as biomarkers for the classification of histological subtypes and the prediction of clinical outcomes. However, the im...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-017-1891-6
更新日期:2017-12-28 00:00:00
abstract:BACKGROUND:Protein sequence profile-profile alignment is an important approach to recognizing remote homologs and generating accurate pairwise alignments. It plays an important role in protein sequence database search, protein structure prediction, protein function prediction, and phylogenetic analysis. RESULTS:In thi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-252
更新日期:2014-07-25 00:00:00
abstract:BACKGROUND:In recent times, there has been an exponential rise in the number of protein structures in databases e.g. PDB. So, design of fast algorithms capable of querying such databases is becoming an increasingly important research issue. This paper reports an algorithm, motivated from spectral graph matching techniq...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-7-S5-S5
更新日期:2006-12-18 00:00:00
abstract:BACKGROUND:The total number of known three-dimensional protein structures is rapidly increasing. Consequently, the need for fast structural search against complete databases without a significant loss of accuracy is increasingly demanding. Recently, TopSearch, an ultra-fast method for finding rigid structural relations...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-015-0866-8
更新日期:2016-01-05 00:00:00
abstract:BACKGROUND:Biological data that are well-organized by an ontology, such as Gene Ontology, enables high-throughput availability of the semantic web. It can also be used to facilitate high throughput classification of biomedical information. However, to our knowledge, no evaluation has been published on automating classi...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-8-S3-S7
更新日期:2007-05-09 00:00:00
abstract:BACKGROUND:The Triplex cell vaccine is a cancer cellular vaccine that can prevent almost completely the mammary tumor onset in HER-2/neu transgenic mice. In a translational perspective, the activity of the Triplex vaccine was also investigated against lung metastases showing that the vaccine is an effective treatment a...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-S7-S13
更新日期:2010-10-15 00:00:00
abstract:BACKGROUND:Identification of the recombination hot/cold spots is critical for understanding the mechanism of recombination as well as the genome evolution process. However, experimental identification of recombination spots is both time-consuming and costly. Developing an accurate and automated method for reliably and ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-15-340
更新日期:2014-11-20 00:00:00
abstract:BACKGROUND:Over the last few years transcriptome sequencing (RNA-Seq) has almost completely taken over microarrays for high-throughput studies of gene expression. Currently, the most popular use of RNA-Seq is to identify genes which are differentially expressed between two or more conditions. Despite the importance of ...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-014-0397-8
更新日期:2014-12-05 00:00:00
abstract:BACKGROUND:Prognosis is of critical interest in breast cancer research. Biomedical studies suggest that genomic measurements may have independent predictive power for prognosis. Gene profiling studies have been conducted to search for predictive genomic measurements. Genes have the inherent pathway structure, where pat...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-1
更新日期:2010-01-01 00:00:00
abstract:BACKGROUND:Most state-of-the-art biomedical entity normalization systems, such as rule-based systems, merely rely on morphological information of entity mentions, but rarely consider their semantic information. In this paper, we introduce a novel convolutional neural network (CNN) architecture that regards biomedical e...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/s12859-017-1805-7
更新日期:2017-10-03 00:00:00
abstract:BACKGROUND:Clustering of unannotated transcripts is an important task to identify novel families of noncoding RNAs (ncRNAs). Several hierarchical clustering methods have been developed using similarity measures based on the scores of structural alignment. However, the high computational cost of exact structural alignme...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-12-S1-S48
更新日期:2011-02-15 00:00:00
abstract:BACKGROUND:One of key issues in the post-genomic era is to assign functions to uncharacterized proteins. Since proteins seldom act alone; rather, they must interact with other biomolecular units to execute their functions. Thus, the functions of unknown proteins may be discovered through studying their interactions wit...
journal_title:BMC bioinformatics
pub_type: 杂志文章
doi:10.1186/1471-2105-11-S1-S64
更新日期:2010-01-18 00:00:00
abstract::Following publication of the original article [1], the author reported that there are several errors in the original article. ...
journal_title:BMC bioinformatics
pub_type: 杂志文章,已发布勘误
doi:10.1186/s12859-019-3318-z
更新日期:2020-01-22 00:00:00