RAREsim: A simulation method for very rare genetic variants

Megan Null, Josée Dupuis, Pezhman Sheinidashtegol, Ryan M. Layer, Christopher R. Gignoux, Audrey E. Hendricks

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Identification of rare-variant associations is crucial to full characterization of the genetic architecture of complex traits and diseases. Essential in this process is the evaluation of novel methods in simulated data that mirror the distribution of rare variants and haplotype structure in real data. Additionally, importing real-variant annotation enables in silico comparison of methods, such as rare-variant association tests and polygenic scoring methods, that focus on putative causal variants. Existing simulation methods are either unable to employ real-variant annotation or severely under- or overestimate the number of singletons and doubletons, thereby reducing the ability to generalize simulation results to real studies. We present RAREsim, a flexible and accurate rare-variant simulation algorithm. Using parameters and haplotypes derived from real sequencing data, RAREsim efficiently simulates the expected variant distribution and enables real-variant annotations. We highlight RAREsim's utility across various genetic regions, sample sizes, ancestries, and variant classes.

Original language: English
Pages (from-to): 680-691
Number of pages: 12
Journal: American Journal of Human Genetics
Volume: 109
Issue number: 4
State: Published - 7 Apr 2022

Keywords

  • rare variants
  • RAREsim
  • simulated data
  • simulated genetic variants
