TY - JOUR
T1 - RAREsim: A simulation method for very rare genetic variants
AU - Null, Megan
AU - Dupuis, Josée
AU - Sheinidashtegol, Pezhman
AU - Layer, Ryan M.
AU - Gignoux, Christopher R.
AU - Hendricks, Audrey E.
N1 - Publisher Copyright: © 2022 American Society of Human Genetics
PY - 2022/4/7
Y1 - 2022/4/7
N2 - Identification of rare-variant associations is crucial to full characterization of the genetic architecture of complex traits and diseases. Essential in this process is the evaluation of novel methods in simulated data that mirror the distribution of rare variants and haplotype structure in real data. Additionally, importing real-variant annotation enables in silico comparison of methods, such as rare-variant association tests and polygenic scoring methods, that focus on putative causal variants. Existing simulation methods are either unable to employ real-variant annotation or severely under- or overestimate the number of singletons and doubletons, thereby reducing the ability to generalize simulation results to real studies. We present RAREsim, a flexible and accurate rare-variant simulation algorithm. Using parameters and haplotypes derived from real sequencing data, RAREsim efficiently simulates the expected variant distribution and enables real-variant annotations. We highlight RAREsim's utility across various genetic regions, sample sizes, ancestries, and variant classes.
KW - rare variants
KW - RAREsim
KW - simulated data
KW - simulated genetic variants
UR - http://www.scopus.com/inward/record.url?scp=85127526149&partnerID=8YFLogxK
U2 - 10.1016/j.ajhg.2022.02.009
DO - 10.1016/j.ajhg.2022.02.009
M3 - Article
C2 - 35298919
AN - SCOPUS:85127526149
SN - 0002-9297
VL - 109
SP - 680
EP - 691
JO - American Journal of Human Genetics
JF - American Journal of Human Genetics
IS - 4
ER -