FAQ of BIGA GWAS

1. How can I query summary statistics from the GWAS Catalog if they haven't been harmonized yet?

BIGA cannot query unharmonized data from the GWAS Catalog because the column layouts of the original files vary from study to study. However, if the original summary statistics file is available, you can still run a BIGA analysis by uploading it yourself. In the figure below, the harmonised folder is missing but a tsv file is available.

To use these coronary artery disease summary statistics, download the original tsv file. The file is around 3 GB, which exceeds BIGA's 600 MB input file limit. We recommend using the Linux `awk` command to extract only the necessary columns, such as snp, chromosome, position, a1, a2, eaf, odds_ratio, se, and pvalue.

The original coronary artery disease file includes the following columns:

p_value	chromosome	base_pair_location	effect_allele	other_allele	effect_allele_frequency	odds_ratio	beta	standard_error	markername	freqse	minfreq	maxfreq	direction	hetisq	hetchisq	hetdf	hetpval	cases	effective_cases	n	meta_analysis
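
Because the extraction step relies on column positions, it may help to print each header name next to its field number first. The following is a minimal sketch; the function name `list_columns` is only an illustrative helper and is not part of BIGA.

```shell
# list_columns FILE : print "index: name" for every tab-separated header field.
# (Hypothetical helper for illustration; reads only the first line of FILE.)
list_columns() {
  awk -F'\t' 'NR == 1 { for (i = 1; i <= NF; i++) print i ": " $i; exit }' "$1"
}
```

For example, `list_columns your_file.tsv` prints one line per column, so you can confirm which field number holds 'n' before writing the extraction command.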

You can use the following command to extract desired columns and gzip the file.

awk -F'\t' 'BEGIN {OFS="\t"} {if(NR==1) print "p_value", "chromosome", "base_pair_location", "effect_allele", "other_allele", "effect_allele_frequency", "odds_ratio", "beta", "standard_error", "markername", "n"; else print $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $21}' your_file.tsv | gzip > output_file.tsv.gz

Explanation of the command:
-F'\t': Sets the input field separator to a tab character for TSV files.
BEGIN {OFS="\t"}: Sets the output field separator to a tab character.
if(NR==1) ...: For the first row (the header), it prints the names of the columns you want to keep.
else print $1, $2, ..., $10, $21: For data rows, it prints the corresponding fields. Columns 1 to 10 run from p_value to markername, and 'n' is column 21 in this file. Note that $NF (the last field) is meta_analysis here, not 'n'. If your file has a different column order, replace these numbers with the correct column numbers.
your_file.tsv: This should be replaced with the path to your TSV file.
| gzip > output_file.tsv.gz: This pipes the awk output to gzip, compressing it, and writes the compressed data to output_file.tsv.gz. Replace output_file.tsv.gz with your desired output file name.
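
Positional field numbers are easy to get wrong when the column order differs between files. As an alternative, here is a rough sketch that selects columns by header name instead. The function name `extract_columns` is a hypothetical helper, and the sketch assumes a tab-separated file whose first row is the header.

```shell
# extract_columns FILE COL1,COL2,... : print the named columns of a TSV,
# tab-separated, in the order requested. Hypothetical helper for illustration.
extract_columns() {
  awk -F'\t' -v want="$2" '
    BEGIN { OFS = "\t"; n = split(want, names, ",") }
    NR == 1 {
      # Map every header name to its field number.
      for (i = 1; i <= NF; i++) idx[$i] = i
      # Stop early if a requested column is not in the header.
      for (j = 1; j <= n; j++) {
        if (!(names[j] in idx)) {
          print "missing column: " names[j] > "/dev/stderr"
          exit 1
        }
      }
    }
    {
      # Runs for the header row too, so the selected names are printed first.
      line = $(idx[names[1]])
      for (j = 2; j <= n; j++) line = line OFS $(idx[names[j]])
      print line
    }
  ' "$1"
}
```

A possible usage is `extract_columns your_file.tsv p_value,chromosome,base_pair_location,effect_allele,other_allele,effect_allele_frequency,odds_ratio,beta,standard_error,n | gzip > output_file.tsv.gz`.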

Then you can upload the gzipped data to do a BIGA analysis.
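
Before uploading, it can be worth confirming that the compressed file actually fits under the size limit. The sketch below assumes the 600 MB limit is counted as 600 × 1024 × 1024 bytes, which is a guess (BIGA may count it differently). The function name `check_upload_size` is a hypothetical helper.

```shell
# check_upload_size FILE [LIMIT_MB] : succeed if FILE is at most LIMIT_MB in size.
# LIMIT_MB defaults to 600; 1 MB is ASSUMED to be 1024 * 1024 bytes.
check_upload_size() {
  [ -f "$1" ] || { echo "no such file: $1" >&2; return 2; }
  limit_mb="${2:-600}"
  # Arithmetic expansion strips the padding some wc implementations add.
  bytes=$(( $(wc -c < "$1") ))
  limit_bytes=$(( limit_mb * 1024 * 1024 ))
  if [ "$bytes" -le "$limit_bytes" ]; then
    echo "OK: $bytes bytes"
  else
    echo "Too large: $bytes bytes (limit $limit_bytes)"
    return 1
  fi
}
```

For example, run `check_upload_size output_file.tsv.gz`, and use `gzip -dc output_file.tsv.gz | head -n 3` to look at the header and the first rows.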

2. How can I query summary statistics from IEU OpenGWAS if they are not available for download?

If a trait in IEU OpenGWAS has no download buttons, as in the figure below, its data is not downloadable, and BIGA cannot query it.

However, the trait ID ebi-a-GCST90038632 indicates that the data is sourced from the GWAS Catalog, so you can enter GCST90038632 on the GWAS Catalog website to search for this trait. It may already be harmonized, or the original file may be available for download. For instance, the following figure shows that this trait has been harmonized by the GWAS Catalog and is therefore queryable from the GWAS Catalog.
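
If you have several such IDs, the accession can be extracted with a small shell helper. This sketch assumes the ID has the form `ebi-a-GCST` followed by digits, as in the example above. The function name `gcst_from_ieu_id` is hypothetical.

```shell
# gcst_from_ieu_id ID : print the GCST accession embedded in an "ebi-a-GCST..." ID.
# Hypothetical helper; IDs with any other prefix are rejected.
gcst_from_ieu_id() {
  case "$1" in
    ebi-a-GCST[0-9]*) printf '%s\n' "${1#ebi-a-}" ;;
    *) echo "not an ebi-a-GCST ID: $1" >&2; return 1 ;;
  esac
}
```

For example, `gcst_from_ieu_id ebi-a-GCST90038632` prints `GCST90038632`, which you can then search for on the GWAS Catalog website.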