Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

Abstract:

As many structures of protein-DNA complexes have become known in recent years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One reason is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other uses both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. On an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3%, and an MCC of 0.324. The other model, which uses both DNA and protein sequences, achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. On an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5%, and an MCC of 0.329. In both cross-validation and independent testing, the second model, which uses both DNA and protein sequence data, performed better than the first model, which uses DNA sequence data only.
To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone.
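
The three figures reported in the abstract (accuracy, F-measure, MCC) can all be derived from a binary confusion matrix over nucleotides. The following is a minimal sketch of the standard definitions, not the authors' evaluation code; the function name `binary_metrics` and the counts in the usage line are hypothetical and chosen only for illustration. The abstract does not state how the F-measure was averaged, so the positive-class F-measure used here is an assumption.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, f_measure, mcc) from binary confusion-matrix counts.

    tp/fn: binding nucleotides predicted as binding / non-binding.
    tn/fp: non-binding nucleotides predicted as non-binding / binding.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure: harmonic mean of precision and recall (positive class only).
    pr_sum = precision + recall
    f_measure = 2 * precision * recall / pr_sum if pr_sum else 0.0

    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f_measure, mcc


# Hypothetical counts (not taken from the paper's data).
acc, f1, mcc = binary_metrics(tp=67, fp=33, tn=67, fn=33)
print(f"accuracy={acc:.3f} F-measure={f1:.3f} MCC={mcc:.3f}")
```

On a balanced set, an accuracy near 67% corresponds to an MCC near 0.34, which is why the MCC values in the abstract look modest beside the accuracy figures: MCC ranges from -1 to 1 with 0 meaning chance-level prediction.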

authors

Park B,Im J,Tuvshinjargal N,Lee W,Han K

doi

10.1016/j.cmpb.2014.07.009

subject

Has Abstract

pub_date

2014-11-01 00:00:00

pages

158-67

issue

2

eissn

1872-7565

issn

0169-2607

pii

S0169-2607(14)00297-1

journal_volume

117

pub_type

Journal Article
  • Comparative analysis of active contour and convolutional neural network in rapid left-ventricle volume quantification using echocardiographic imaging.

    abstract::In cardiology, ultrasound is often used to diagnose heart disease associated with myocardial infarction. This study aims to develop robust segmentation techniques for segmenting the left ventricle (LV) in ultrasound images to check myocardium movement during heartbeat. The proposed technique utilizes machine learning ...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2020.105914

    authors: Zhu X,Wei Y,Lu Y,Zhao M,Yang K,Wu S,Zhang H,Wong KKL

    update_date:2020-12-17 00:00:00

  • A novel biomedical image indexing and retrieval system via deep preference learning.

    abstract:BACKGROUND AND OBJECTIVES:The traditional biomedical image retrieval methods as well as content-based image retrieval (CBIR) methods originally designed for non-biomedical images either only consider using pixel and low-level features to describe an image or use deep features to describe images but still leave a lot of...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2018.02.003

    authors: Pang S,Orgun MA,Yu Z

    update_date:2018-05-01 00:00:00

  • A simple computer programme for biokinetic study of 99Tcm-radiopharmaceuticals.

    abstract::A simple programme has been written in GW BASIC to calculate the percentage activity of 99Tcm-radiopharmaceuticals in different tissues after biodistribution. The programme is efficient, easy to handle and produces a permanent record in terms of a final report. ...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/0169-2607(94)90139-2

    authors: Imran MB,Khurshid SJ,Anwar K

    update_date:1994-02-01 00:00:00

  • compound.Cox: Univariate feature selection and compound covariate for predicting survival.

    abstract:BACKGROUND AND OBJECTIVE:Univariate feature selection is one of the simplest and most commonly used techniques to develop a multigene predictor for survival. Presently, there is no software tailored to perform univariate feature selection and predictor construction. METHODS:We develop the compound.Cox R package that i...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2018.10.020

    authors: Emura T,Matsui S,Chen HY

    update_date:2019-01-01 00:00:00

  • Deep learning algorithms for detection of diabetic retinopathy in retinal fundus photographs: A systematic review and meta-analysis.

    abstract:BACKGROUND:Diabetic retinopathy (DR) is one of the leading causes of blindness globally. Earlier detection and timely treatment of DR are desirable to reduce the incidence and progression of vision loss. Currently, deep learning (DL) approaches have offered better performance in detecting DR from retinal fundus images....

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2020.105320

    authors: Islam MM,Yang HC,Poly TN,Jian WS,Jack Li YC

    update_date:2020-07-01 00:00:00

  • Structure of the standardized computerized 24-h diet recall interview used as reference method in the 22 centers participating in the EPIC project. European Prospective Investigation into Cancer and Nutrition.

    abstract::A computerized 24-h diet recall interview program (EPIC-SOFT) was developed for use in a large European multi-center study, namely the European Prospective Investigation into Cancer and Nutrition (EPIC). This program, which was adapted for each participating country and translated into nine languages, was developed to...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article, Multicenter Study

    doi:10.1016/s0169-2607(98)00088-1

    authors: Slimani N,Deharveng G,Charrondière RU,van Kappel AL,Ocké MC,Welch A,Lagiou A,van Liere M,Agudo A,Pala V,Brandstetter B,Andren C,Stripp C,van Staveren WA,Riboli E

    update_date:1999-03-01 00:00:00

  • MCML--Monte Carlo modeling of light transport in multi-layered tissues.

    abstract::A Monte Carlo model of steady-state light transport in multi-layered tissues (MCML) has been coded in ANSI Standard C; therefore, the program can be used on various computers. Dynamic data allocation is used for MCML, hence the number of tissue layers and grid elements of the grid system can be varied by users at run ...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/0169-2607(95)01640-f

    authors: Wang L,Jacques SL,Zheng L

    update_date:1995-07-01 00:00:00

  • IntroStat: a hypertext-based design for an electronic textbook to introduce biomedical statistics.

    abstract::A hypertext-based system called IntroStat has been developed to introduce fundamental methods of biomedical statistics. The system has been developed on a Macintosh II using HyperCard. It is written mainly in Hypertalk, a scripting language of HyperCard. Being an electronic textbook of probability and statistics, the ...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/0169-2607(95)01623-2

    authors: Okada M,O'Brien M

    update_date:1995-04-01 00:00:00

  • Development of a knowledge-base for automatic monitoring of renal function of intensive care patients over time.

    abstract::Renal dysfunction is a major problem in the management of critically ill patients. Monitoring of renal parameters over time is a prerequisite for detection of any significant deterioration of kidney function. Thus, we developed a knowledge-base for the dynamic monitoring of renal function of critically ill patients. A...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/s0169-2607(99)00044-9

    authors: Heindl B,Pollwein B,Schleutermann S,Haller M,Finsterer U

    update_date:2000-05-01 00:00:00

  • A method for measuring and reporting manual data extraction reliability.

    abstract::As health care costs have risen dramatically, the use of clinical data to analyze the quality of health care provided has increased. Central to this analysis is the means by which the clinical data itself is obtained. The reliability and validity of the data must be established in order to insure credible use of the d...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/0169-2607(93)90063-q

    authors: Rozewski CM

    update_date:1993-09-01 00:00:00

  • Benchmarking of a T-wave alternans detection method based on empirical mode decomposition.

    abstract:BACKGROUND AND OBJECTIVE:T-wave alternans (TWA) is a fluctuation of the ST-T complex occurring on an every-other-beat basis of the surface electrocardiogram (ECG). It has been shown to be an informative risk stratifier for sudden cardiac death, though the lack of a gold standard to benchmark detection methods has promote...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2017.04.005

    authors: Blanco-Velasco M,Goya-Esteban R,Cruz-Roldán F,García-Alberola A,Rojo-Álvarez JL

    update_date:2017-07-01 00:00:00

  • Computer-based assessment for facioscapulohumeral dystrophy diagnosis.

    abstract::The paper presents a computer-based assessment for facioscapulohumeral dystrophy (FSHD) diagnosis through characterisation of the fat and oedema percentages in the muscle region. A novel multi-slice method for the muscle-region segmentation in the T1-weighted magnetic resonance images is proposed using principles of t...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2015.03.006

    authors: Chambers O,Milenković J,Pražnikar A,Tasič JF

    update_date:2015-06-01 00:00:00

  • A novel approach to speckle noise filtering based on Artificial Bee Colony algorithm: an ultrasound image application.

    abstract::In this study a novel approach based on 2D FIR filters is presented for denoising digital images. In this approach the filter coefficients of 2D FIR filters were optimized using the Artificial Bee Colony (ABC) algorithm. To obtain the best filter design, the filter coefficients were tested with different numbers (3×3,...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2013.05.009

    authors: Latifoğlu F

    update_date:2013-09-01 00:00:00

  • TAQIH, a tool for tabular data quality assessment and improvement in the context of health data.

    abstract:BACKGROUND AND OBJECTIVES:Data curation is a tedious task but of paramount relevance for data analytics, especially in the health context, where data-driven decisions must be extremely accurate. The ambition of TAQIH is to support non-technical users on 1) the exploratory data analysis (EDA) process of tabular he...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2018.12.029

    authors: Álvarez Sánchez R,Beristain Iraola A,Epelde Unanue G,Carlin P

    update_date:2019-11-01 00:00:00

  • Virtual reality-based measurement of ocular deviation in strabismus.

    abstract:BACKGROUND AND OBJECTIVE:Strabismus is an eye movement disorder characterized by abnormal ocular deviation. Cover tests have mainly been used in the clinical diagnosis of strabismus for treatment. However, the whole process depends on the doctor's level of experience, which can be subject to several factors. In t...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2019.105132

    authors: Miao Y,Jeon JY,Park G,Park SW,Heo H

    update_date:2020-03-01 00:00:00

  • An optimal code for patient identifiers.

    abstract::How to distinguish 1 billion individuals by an identifier consisting of eight characters, allowing a reasonable amount of error detection or even error correction? Our solution of this problem is an optimal code over a 32-character alphabet that detects up to two errors and corrects one error as well as a transpositio...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2005.03.004

    authors: Faldum A,Pommerening K

    update_date:2005-07-01 00:00:00

  • The role of AIDA in a primary care information system.

    abstract::In this article the development of a computer system for General Practice, ELIAS, is described. The use of the 4th-generation software toolkit AIDA proved to be very helpful in increasing the speed of development as well as the quality of the ELIAS software. The programming support that AIDA offered, not only in incre...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/0169-2607(87)90086-1

    authors: Boon WM,Westerhof HP,Duisterhout JS,Cromme PV

    update_date:1987-11-01 00:00:00

  • Predicting and improving the probability of live birth for women undergoing frozen-thawed embryo transfer: a data-driven estimation and simulation model.

    abstract:BACKGROUND AND OBJECTIVE:Frozen-thawed embryo transfer (FET) is now widely used for the treatment of infertility. For many couples and clinicians, concerns over the probability and how to increase the chance of a successful birth are very common. Currently, there is not a single model to predict the live birth outcomes...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2020.105780

    authors: Liang R,An J,Zheng Y,Li J,Wang Y,Jia Y,Zhang J,Lu Q

    update_date:2021-01-01 00:00:00

  • Preparation of 2D sequences of corneal images for 3D model building.

    abstract::A confocal microscope provides a sequence of images, at incremental depths, of the various corneal layers and structures. From these, medical practitioners can extract clinical information on the state of health of the patient's cornea. In this work we are addressing problems associated with capturing and processing the...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2014.01.009

    authors: Elbita A,Qahwaji R,Ipson S,Sharif MS,Ghanchi F

    update_date:2014-04-01 00:00:00

  • Accelerated event-by-event Monte Carlo microdosimetric calculations of electrons and protons tracks on a multi-core CPU and a CUDA-enabled GPU.

    abstract::For microdosimetric calculations event-by-event Monte Carlo (MC) methods are considered the most accurate. The main shortcoming of those methods is the extensive requirement for computational time. In this work we present an event-by-event MC code of low projectile energy electron and proton tracks for accelerated mic...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2013.09.009

    authors: Kalantzis G,Tachibana H

    update_date:2014-01-01 00:00:00

  • Integration of morphological preprocessing and fractal based feature extraction with recursive feature elimination for skin lesion types classification.

    abstract:BACKGROUND AND OBJECTIVE:Skin cancer is the commonest form of cancer in the worldwide population. Non-invasive and non-contact imaging modalities are being used for the screening of melanoma and other cutaneous malignancies to endorse early detection and prevention of the disease. Traditionally it has been a problem fo...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2019.06.018

    authors: Chatterjee S,Dey D,Munshi S

    update_date:2019-09-01 00:00:00

  • Adaptive median binary patterns for fully automatic nerves tracking in ultrasound images.

    abstract:BACKGROUND AND OBJECTIVE:In the last decade, Ultrasound-Guided Regional Anesthesia (UGRA) gained importance in surgical procedures and pain management, due to its ability to perform target delivery of local anesthetics under direct sonographic visualization. However, practicing UGRA can be challenging, since it require...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2018.03.013

    authors: Alkhatib M,Hafiane A,Tahri O,Vieyres P,Delbos A

    update_date:2018-07-01 00:00:00

  • Evaluation of Epilepsy Expert--a decision support system.

    abstract::Epilepsy Expert is a decision support system based on the International Classification of Epilepsies and Epileptic Syndromes (1989). The aim of this study was to evaluate the Epilepsy Expert. First the diagnostic performance was validated. This was done in 3 stages: collection of the patient cases, determination of th...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/0169-2607(94)90206-2

    authors: Korpinen L,Pietilä T,Peltola J,Nissilä M,Keränen T,Touvinen T,Falck B,Petránek ES,Frey H

    update_date:1994-11-01 00:00:00

  • MSFCN-multiple supervised fully convolutional networks for the osteosarcoma segmentation of CT images.

    abstract:BACKGROUND AND OBJECTIVE:Automatic osteosarcoma tumor segmentation on computed tomography (CT) images is a challenging problem, as tumors have large spatial and structural variabilities. In this study, an automatic tumor segmentation method, which was based on a fully convolutional network with multiple supervised sid...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2017.02.013

    authors: Huang L,Xia W,Zhang B,Qiu B,Gao X

    update_date:2017-05-01 00:00:00

  • Library resources for problem-based learning: the program perspective.

    abstract::The impact of a problem based curriculum has been the subject of increasing interest, as evidenced by several recent articles on the subject [1-4]. McMaster was able to design its library to serve a problem-based curriculum, but since there had been no prior experience with such a curriculum, the library was designed ...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/0169-2607(94)90110-4

    authors: Blake J

    update_date:1994-09-01 00:00:00

  • Cross-sectional photoacoustic tomography image reconstruction with a multi-curve integration model.

    abstract:BACKGROUND AND OBJECTIVE:In acoustic inversion of photoacoustic tomography (PAT), an imaging model that precisely describes both the ultrasonic wave propagation and the detector properties is of crucial importance. Inspired by the multi-stripe integration model in clinical X-ray computed tomography systems, in this wor...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2020.105731

    authors: Qi L,Huang S,Li X,Zhang S,Lu L,Feng Q,Chen W

    update_date:2020-12-01 00:00:00

  • Analysis of sub-anatomic diffusion tensor imaging indices in white matter regions of Alzheimer with MMSE score.

    abstract::In this study, an attempt has been made to find the correlation between diffusion tensor imaging (DTI) indices of white matter (WM) regions and mini mental state examination (MMSE) score of Alzheimer patients. Diffusion weighted images are obtained from the ADNI database. These are preprocessed for eddy current correc...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2014.06.004

    authors: Patil RB,Ramakrishnan S

    update_date:2014-10-01 00:00:00

  • The monitoring and managing application of cloud computing based on Internet of Things.

    abstract::Cloud computing and the Internet of Things are two active topics in the Internet application field. Applications of these two new technologies are widely discussed and researched, but much less so in the field of medical monitoring and management. Thus, in this paper, we study and analyze the application of cl...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2016.03.024

    authors: Luo S,Ren B

    update_date:2016-07-01 00:00:00

  • Correlation method versus enhanced modified moving average method for automatic detection of T-wave alternans.

    abstract::Enhanced modified moving average method (EMMAM) and correlation method (CM) for microvolt TWA identification are compared with the aid of simulated ECG tracings (cases of absence of TWA and presence of stationary or time-varying TWA) and ECG recordings from healthy subjects (H-group) and patients who survived an acute myoca...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/j.cmpb.2010.01.008

    authors: Burattini L,Bini S,Burattini R

    update_date:2010-04-01 00:00:00

  • Fulfilling programmatic needs at the University of Maryland at Baltimore: the reality.

    abstract::This article will focus on the actual needs enumerated in the University of Maryland at Baltimore's Health Sciences Library/Information Services program entitled, Health Sciences Library: A New Building for Information Services Including the Health Sciences Library. These needs fall into one of 3 categories: collectio...

    journal_title:Computer methods and programs in biomedicine

    pub_type: Journal Article

    doi:10.1016/0169-2607(94)90120-1

    authors: Tooey MJ

    update_date:1994-09-01 00:00:00