Copy-fm: A tool for determination of the fraction of mosaicism in copy number variations
Date
2022-10
Author
Acun, Melisa
Çetinkaya, Arda
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Copy number variations (CNVs) are structural chromosomal variants larger than 50 bp that represent a regional change in the normal diploid (two-copy) copy number (CN) of genomic regions. All CNs in a single cell are necessarily integers; however, a population of cells containing subpopulations with different CNs may show non-integer CN values for the total cell population. This is called genetic mosaicism, and CNVs that result in such mosaicism are referred to as mosaic CNVs (mCNVs). mCNVs are encountered when a CNV is not germline but acquired later in life. Such acquired changes are frequently found in human cancer tissues and rarely in congenital genetic disorders. Determining the fraction of mosaicism (fm) is crucial for establishing disease severity and for evaluating disease progression and response to treatment in individuals with cancer. Although some specific clinical genetic tests are available for determining the fm of common cancer-associated structural variants, versatile methods for determining fm for rare or novel mCNVs are yet to be developed. Microarray analysis has emerged as a commonly employed and reliable genome-wide method for detecting human CNVs, which are usually undetectable by classical cytogenetic approaches. Here, we present a computational tool developed in R, which we name Copy-fm (Copy number variation – fraction of mosaicism), to address the need for determining fm for large CNVs using data obtained from SNP microarrays. The approach used by Copy-fm makes use of B Allele Frequency (BAF) data (Figure 1), one of the two fundamental values obtained from SNP microarrays for each oligonucleotide probe, the other being LRR (log2 R Ratio). The Copy-fm algorithm relies on fitting the cumulative distribution function (CDF) of heterozygous BAF values of given genomic regions in a sample suspected to harbor mCNVs to mCNV CDF models calculated from a set of control microarray data.
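To make the link between fm and heterozygous BAF values concrete, the following Python sketch computes the idealised, noise-free expectation for a mosaic single-copy loss. It is an illustration only and is not taken from Copy-fm, which is written in R and works with CDF models built from control data; the function names `expected_mosaic_loss` and `fm_from_baf_split` are hypothetical.

```python
def expected_mosaic_loss(fm):
    """Idealised copy number and heterozygous BAF bands for a mosaic loss.

    fm is the fraction of cells (0..1) that have lost one copy of the region.
    Assumption: the remaining cells are normal diploid and there is no noise.
    """
    if not 0.0 <= fm <= 1.0:
        raise ValueError("fm must lie in [0, 1]")
    copy_number = 2.0 - fm                 # average copies per cell
    baf_upper = 1.0 / copy_number          # affected cells kept the B allele
    baf_lower = (1.0 - fm) / copy_number   # affected cells kept the A allele
    return copy_number, baf_lower, baf_upper


def fm_from_baf_split(baf_lower, baf_upper):
    """Invert the model: recover fm from the gap between the two BAF bands.

    The gap is d = fm / (2 - fm), so fm = 2 * d / (1 + d).
    """
    d = baf_upper - baf_lower
    return 2.0 * d / (1.0 + d)


if __name__ == "__main__":
    cn, lower, upper = expected_mosaic_loss(0.5)
    print(cn, lower, upper, fm_from_baf_split(lower, upper))
```

In this simplified picture, heterozygous probes sit at a BAF of 0.5 when fm is 0 and separate into two bands that move toward 0 and 1 as fm approaches 1, which is why the distribution of heterozygous BAF values carries information about fm.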
The algorithm then evaluates the goodness of fit with the Kolmogorov-Smirnov (KS) test to find the best fit. Copy-fm also tests several features of the control and test data that would lead to failure of the analysis or miscalculation of fm values (Figure 1). To determine the reliability of Copy-fm, we initially employed a series of experimentally generated CNV loss models for the X chromosome with varying fm values, using DNA samples from a mother-son pair. Because a son shares one X chromosome with his mother and naturally carries only a single copy of chromosome X, mixing the two samples allows preparation of any desired copy number value between 1 and 2 for the X chromosome. Using these models, we tested microarray data obtained from two commercially available platforms (Affymetrix CytoScan Optima Array and Illumina Infinium Human CytoSNP-12 v2.1 BeadChip) to compare real fm values with those determined by our algorithm. Copy-fm called fm values for all 11 Illumina-array-generated experimental models within a 5% margin of uncertainty (maximum deviation 4.4%). However, its success was lower for Affymetrix-array-generated data, for which it predicted 7 of 11 experimental models within a 5% margin of uncertainty (maximum deviation 9.3%) (Figure 1). The error margins were higher for lower fm values on both platforms. Furthermore, we tested Copy-fm using microarray data from real clinical peripheral blood samples belonging to an individual with myelodysplastic syndrome, containing a subpopulation of blood cells with two distinct and colocalizing mCNV loss regions on different chromosomes (chr5: 142,310,899–154,530,330 and chr12: 91,865,761–95,215,021). Affymetrix Optima data obtained at different time points revealed similar fm values for chromosome 5 (56.3%, 59.8%, 61.3%, respectively) and chromosome 12 (58.2%, 59.8%, 61.7%, respectively).
As expected, the two distinct mCNVs had similar fm values at each time point (Figure 1). These results demonstrate that Copy-fm successfully determines fm for loss mCNVs, both in experimentally prepared mCNV models and in clinical data, within acceptable margins of uncertainty. In addition, fm values from example Affymetrix data sets (https://www.thermofisher.com/tr/en/home/life-science/microarray-analysis/microarray-data-analysis/microarrayanalysis-sample-data.html) for mCNV gain and loss are in agreement with those calculated by Copy-fm (Figure 1). The minimum sum of residuals has previously been used in a similar approach for calculating fm (PMID: 22277120), but the use of the KS test in Copy-fm additionally provides confidence intervals for better evaluation and comparison. Without any user intervention, Copy-fm takes into account the loss-of-heterozygosity (LOH) and germline CNV status of the genomic region under evaluation, either of which may otherwise lead to erroneous fm calculations. This is a valuable improvement for fm calculations, as LOH regions in both control and test data are more common in highly inbred populations such as that of Turkey. The approach put forward by Copy-fm to calculate fm for mCNVs is platform independent and can easily be adapted for next-generation sequencing data. With adjustments, it can be utilized for genome-wide screening of mCNVs and for calculating fm for uniparental disomy mosaicism. Especially in cancer genetics, fm values of frequently encountered mCNVs calculated by specialized locus-specific methods are widely employed. With Copy-fm, we offer a method for harnessing the mosaicism information from less frequent mCNVs encountered in microarrays, which can be utilized for uncovering unknown mCNVs and monitoring disease progression. This work was supported by TÜBİTAK (319S062) within the RiboEurope consortium and by the Hacettepe University Scientific Research Projects Coordination Unit (THD-2021-19532).
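The fitting idea described in the abstract (comparing the CDF of heterozygous BAF values in a test region against model CDFs derived from control data, and scoring the fit with a KS statistic) can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the Copy-fm implementation: the helper names, the grid search over fm, and the way control BAF values are shifted to mimic a mosaic loss are all hypothetical, and the confidence intervals, LOH checks and germline CNV checks mentioned in the abstract are not reproduced.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for x in a + b:  # the maximum gap is attained at an observed value
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


def shift_baf(b, fm, lost_allele):
    """Assumed model: scale the lost allele's signal by (1 - fm).

    b must be a heterozygous control BAF strictly between 0 and 1.
    """
    if lost_allele == "A":
        return b / (b + (1.0 - b) * (1.0 - fm))
    return b * (1.0 - fm) / (b * (1.0 - fm) + (1.0 - b))


def model_sample(control_bafs, fm):
    """Build a mosaic-loss model sample from control heterozygous BAFs.

    Each control value contributes one point to each of the two BAF bands.
    """
    shifted = []
    for b in control_bafs:
        shifted.append(shift_baf(b, fm, "A"))
        shifted.append(shift_baf(b, fm, "B"))
    return shifted


def fit_fm(test_bafs, control_bafs, step=0.01):
    """Grid-search fm in [0, 0.99] for the smallest KS statistic."""
    best_fm, best_gap = 0.0, float("inf")
    for i in range(int(round(0.99 / step)) + 1):
        fm = i * step
        gap = ks_statistic(test_bafs, model_sample(control_bafs, fm))
        if gap < best_gap:
            best_fm, best_gap = fm, gap
    return best_fm, best_gap


if __name__ == "__main__":
    # Synthetic, noise-like control values spread around 0.5 (illustrative only).
    control = [0.45 + 0.002 * i for i in range(51)]
    test = model_sample(control, 0.40)
    print(fit_fm(test, control))
```

A grid search is used here purely for simplicity; the statistic for each candidate fm is independent of the others, so any other search strategy could be substituted without changing the scoring step.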
URI
https://hibit2022.ims.metu.edu.tr/
https://hdl.handle.net/11511/101330
Conference Name
The International Symposium on Health Informatics and Bioinformatics
Collections
Graduate School of Informatics, Conference / Seminar
Citation Formats
M. Acun and A. Çetinkaya, “Copy-fm: A tool for determination of the fraction of mosaicism in copy number variations,” Erdemli, Mersin, TÜRKİYE, 2022, p. 2038, Accessed: 00, 2023. [Online]. Available: https://hibit2022.ims.metu.edu.tr/.