Protecting the Future of Information: LOCO Coding With Error Detection for DNA Data Storage
Date
2024-01-01
Author
İrimağzı, Canberk
Uslan, Yusuf
Hareedy, Ahmed
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
From an information-theoretic perspective, DNA strands serve as a storage medium for 4-ary data over the alphabet {A, T, G, C}. DNA data storage promises formidable information density, long-term durability, and ease of replicability. However, information in this storage technology can be corrupted by error-prone data sequences as well as by insertion, deletion, and substitution errors. Experiments have revealed that DNA sequences with long homopolymers and/or with low GC-content are notably more subject to errors upon storage. To address this biochemical challenge, constrained codes have been proposed for use in DNA data storage systems and studied in the literature accordingly. This paper investigates the use of the recently introduced method for designing lexicographically-ordered constrained (LOCO) codes in DNA data storage to improve performance. LOCO codes offer capacity-achievability, low complexity, and ease of reconfigurability. This paper introduces novel constrained codes, namely DNA LOCO (D-LOCO) codes, over the alphabet {A, T, G, C} with limited runs of identical symbols. Due to their ordered structure, these codes come with an encoding-decoding rule we derive, which provides simple and affordable encoding-decoding algorithms. In terms of storage overhead, the proposed encoding-decoding algorithms outperform those in the existing literature. Our algorithms are based on small-size adders, and therefore they are readily reconfigurable. D-LOCO codes are intrinsically balanced, which allows us to achieve balanced AT- and GC-content over the entire DNA strand with minimal rate penalty. Moreover, we propose four schemes to bridge consecutive codewords, three of which guarantee single substitution error detection per codeword. We examine the probability of undetected errors over a presumed symmetric DNA storage channel subject to substitution errors only. We also show that D-LOCO codes are capacity-achieving and that they offer remarkably high rates even at moderate lengths.
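The two constraints named in the abstract, limited runs of identical symbols and GC-content, can be illustrated with a minimal Python sketch. This is not the D-LOCO encoding-decoding rule from the paper; it only checks the constraints and counts run-limited strings with a standard recurrence on the length of the last run. The function names, the parameter names `m` (string length) and `ell` (maximum run length), and the symbol order in `ALPHABET` are assumptions made for illustration.

```python
from itertools import groupby, product
from math import log2

ALPHABET = "ACGT"  # symbol order is an arbitrary choice for this sketch


def max_run(strand: str) -> int:
    """Length of the longest homopolymer run in the strand (0 if empty)."""
    return max((len(list(group)) for _, group in groupby(strand)), default=0)


def gc_content(strand: str) -> float:
    """Fraction of symbols in a non-empty strand that are G or C."""
    return sum(symbol in "GC" for symbol in strand) / len(strand)


def count_run_limited(m: int, ell: int, q: int = 4) -> int:
    """Number of q-ary strings of length m >= 1 whose runs have length <= ell.

    Condition on the length j of the last run: the prefix of length n - j
    is itself run-limited, and the last run may use any of the q - 1 symbols
    that differ from the prefix's final symbol. If the whole string is a
    single run (n <= ell), there are q choices.
    """
    counts = [0] * (m + 1)
    for n in range(1, m + 1):
        total = q if n <= ell else 0
        for j in range(1, min(ell, n - 1) + 1):
            total += (q - 1) * counts[n - j]
        counts[n] = total
    return counts[m]


def brute_force_count(m: int, ell: int) -> int:
    """Exhaustive cross-check of count_run_limited; feasible for small m only."""
    return sum(
        max_run("".join(symbols)) <= ell
        for symbols in product(ALPHABET, repeat=m)
    )


def raw_rate(m: int, ell: int) -> float:
    """Bits per nucleotide if every run-limited string of length m were usable."""
    return log2(count_run_limited(m, ell)) / m


if __name__ == "__main__":
    print(max_run("AACCCGT"))       # 3
    print(gc_content("ACGT"))       # 0.5
    print(count_run_limited(4, 3))  # 252: all 256 strings minus AAAA, CCCC, GGGG, TTTT
    print(raw_rate(20, 3))
```

The value returned by `raw_rate` is only an upper reference for a fixed length: it ignores the bridging symbols between codewords and the balancing penalty discussed in the abstract, so it should not be read as the rate of any specific D-LOCO code.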
Subject Keywords
balancing, Codes, Constrained codes, DNA, DNA data storage, Encoding, error-detection, homopolymer run, LOCO codes, low-complexity algorithms, Memory, reconfigurable coding, Signal processing algorithms, Symbols, Table lookup
URI
https://hdl.handle.net/11511/110029
Journal
IEEE Transactions on Molecular, Biological, and Multi-Scale Communications
DOI
https://doi.org/10.1109/tmbmc.2024.3400794
Collections
Department of Mathematics, Article
Citation
C. İrimağzı, Y. Uslan, and A. Hareedy, “Protecting the Future of Information: LOCO Coding With Error Detection for DNA Data Storage,” IEEE Transactions on Molecular, Biological, and Multi-Scale Communications, 2024. [Online]. Available: https://hdl.handle.net/11511/110029.