Detection of DNA N6-Methyladenine Modification through SMRT-seq Features and Machine Learning Model

Abstract

Introduction: N6-methyldeoxyadenine (6mA) is the most prevalent DNA modification in both prokaryotes and eukaryotes. While single-molecule real-time sequencing (SMRT-seq) can detect 6mA events at the individual nucleotide level, its practical application is hindered by a high rate of false positives.

Methods: We propose a computational model for identifying DNA 6mA that incorporates comprehensive site features from SMRT-seq and employs machine learning classifiers.

Results: In C. reinhardtii, 99.54% and 96.55% of the identified DNA 6mA sites correspond to the motifs and peak regions, respectively, identified by methylated DNA immunoprecipitation sequencing (MeDIP-seq). Compared with SMRT-seq, the proportion of predicted DNA 6mA sites within MeDIP-seq peak regions increases by 2% to 70% across six bacterial strains.
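
As an illustration of the peak-overlap metric reported above, the following minimal sketch computes the fraction of predicted 6mA sites that fall inside MeDIP-seq peak regions. It is not the authors' pipeline: the function name `fraction_in_peaks`, the tuple-based input layout, and the half-open interval convention are assumptions made for this example only.

```python
from bisect import bisect_right


def fraction_in_peaks(sites, peaks):
    """Fraction of predicted sites that lie inside at least one peak.

    sites: iterable of (chrom, pos) tuples (hypothetical input layout).
    peaks: iterable of (chrom, start, end) tuples, treated as half-open
           intervals [start, end) (an assumption, as in BED files).
    """
    sites = list(sites)
    if not sites:
        return 0.0

    # Group peak intervals by chromosome.
    by_chrom = {}
    for chrom, start, end in peaks:
        by_chrom.setdefault(chrom, []).append((start, end))

    # Merge overlapping peaks so each position is covered at most once.
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = ([s for s, _ in out], [e for _, e in out])

    # Binary-search each site against the merged peak starts.
    hits = 0
    for chrom, pos in sites:
        if chrom not in merged:
            continue
        starts, ends = merged[chrom]
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < ends[i]:
            hits += 1
    return hits / len(sites)
```

Applying the same function to the raw SMRT-seq calls and to the model's filtered predictions would give the two proportions being compared; the actual peak-calling and coordinate conventions used in the study may differ from those assumed here.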

Conclusion: Our proposed method effectively reduces the false-positive rate in DNA 6mA prediction.