New R tool for the comprehensive standardisation and quality control of GWAS summary statistics
We are delighted to announce the publication in Bioinformatics of MungeSumstats, a Bioconductor R package for the standardisation and quality control of many GWAS summary statistics.
The package was developed in the group by Alan Murphy, Brian Schilder and Nathan Skene to address the inconsistency in how GWAS results are stored and shared. The aim was to make meta-analysis painless and rapid by ensuring that whatever GWAS results researchers use can be standardised, which helps avoid spurious results.
The package handles a variety of input file types and runs more than 35 checks on the supplied summary statistics, including checks against reference genomes to ensure consistent allelic direction and correct annotation of SNPs. It can also convert the dataset to a different reference genome, offers a choice of output formats (TSV, LDSC-ready, VCF and native R objects), and is integrated with IEU GWAS VCF to directly import and standardise their VCFs. More information on the package is available on its website.
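As a rough illustration of the workflow described above, the following is a minimal sketch of a single standardisation run. It assumes a recent Bioconductor release of MungeSumstats with the GRCh37 reference packages installed, and it uses the example file that ships with the package. The argument names (path, ref_genome, return_data) reflect our understanding of format_sumstats() and should be checked against the package documentation for your installed version.

```r
# Assumption: MungeSumstats and the GRCh37 reference data are installed, e.g.
# BiocManager::install(c("MungeSumstats",
#                        "SNPlocs.Hsapiens.dbSNP144.GRCh37",
#                        "BSgenome.Hsapiens.1000genomes.hs37d5"))
library(MungeSumstats)

# Example summary statistics file bundled with the package
sumstats_path <- system.file("extdata", "eduAttainOkbay.txt",
                             package = "MungeSumstats")

# Run the standardisation and quality control checks against GRCh37.
# return_data = TRUE asks for the result as an R object rather than
# the path to the written output file.
standardised <- format_sumstats(path = sumstats_path,
                                ref_genome = "GRCh37",
                                return_data = TRUE)

# Inspect the standardised column names and the first few rows
print(names(standardised))
print(head(standardised))
```

The first run can be slow because the reference genome data has to be loaded. Other output formats and reference genome conversion are controlled through further arguments that are not shown here.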