Table 1.
Disease | Discovery GWAS (n) | Prevalence in validation dataset | Prevalence in testing dataset | Polymorphisms in GPS | Tuning parameter | AUC (95% CI) in validation dataset | AUC (95% CI) in testing dataset |
---|---|---|---|---|---|---|---|
CAD | 60,801 cases; 123,504 controls (ref. 16) | 3,963/120,280 (3.4%) | 8,676/288,978 (3.0%) | 6,630,150 | LDPred (ρ = 0.001) | 0.81 (0.80–0.81) | 0.81 (0.81–0.81) |
Atrial fibrillation | 17,931 cases; 115,142 controls (ref. 30) | 2,024/120,280 (1.7%) | 4,576/288,978 (1.6%) | 6,730,541 | LDPred (ρ = 0.003) | 0.77 (0.76–0.78) | 0.77 (0.76–0.77) |
Type 2 diabetes | 26,676 cases; 132,532 controls (ref. 31) | 2,785/120,280 (2.4%) | 5,853/288,978 (2.0%) | 6,917,436 | LDPred (ρ = 0.01) | 0.72 (0.72–0.73) | 0.73 (0.72–0.73) |
Inflammatory bowel disease | 12,882 cases; 21,770 controls (ref. 32) | 1,360/120,280 (1.1%) | 3,102/288,978 (1.1%) | 6,907,112 | LDPred (ρ = 0.1) | 0.63 (0.62–0.65) | 0.63 (0.62–0.64) |
Breast cancer | 122,977 cases; 105,974 controls (ref. 33) | 2,576/63,347 (4.1%) | 6,586/157,895 (4.2%) | 5,218 | Pruning and thresholding (r² < 0.2; P < 5 × 10⁻⁴) | 0.68 (0.67–0.69) | 0.69 (0.68–0.69) |
AUC was determined using a logistic regression model adjusted for age, sex, genotyping array, and the first four principal components of ancestry. The breast cancer analysis was restricted to female participants. For the LDPred algorithm, the tuning parameter ρ reflects the proportion of polymorphisms assumed to be causal for the disease. For the pruning and thresholding strategy, r² reflects the degree of independence from other variants in the linkage disequilibrium reference panel, and P reflects the P value noted for a given variant in the discovery GWAS. CI, confidence interval.
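
To make the two pruning and thresholding parameters in the note above concrete, the following is a minimal sketch of one common greedy form of the strategy. It is not the study's pipeline: the function names, the variant IDs (`rsA` to `rsD`), the P values, the r² values, the effect weights, and the dosages are all hypothetical toy values, and treating variant pairs missing from the LD table as unlinked is an assumption made only for this illustration.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def pairwise_r2(ld_r2: Dict[Pair, float], a: str, b: str) -> float:
    # LD is symmetric; pairs absent from the toy table are assumed unlinked.
    return ld_r2.get((a, b), ld_r2.get((b, a), 0.0))


def prune_and_threshold(
    p_values: Dict[str, float],
    ld_r2: Dict[Pair, float],
    r2_max: float = 0.2,
    p_max: float = 5e-4,
) -> List[str]:
    """Greedy pruning and thresholding over a toy variant set."""
    # Thresholding: keep only variants with discovery P value below p_max,
    # ordered from most to least significant.
    candidates = sorted(
        (v for v, p in p_values.items() if p < p_max),
        key=lambda v: p_values[v],
    )
    kept: List[str] = []
    for variant in candidates:
        # Pruning: retain a variant only if its r2 with every variant
        # already retained is below r2_max.
        if all(pairwise_r2(ld_r2, variant, k) < r2_max for k in kept):
            kept.append(variant)
    return kept


def polygenic_score(
    weights: Dict[str, float], dosages: Dict[str, float], variants: List[str]
) -> float:
    # Weighted sum of risk-allele dosages over the retained variants.
    return sum(weights[v] * dosages.get(v, 0.0) for v in variants)


# Hypothetical toy inputs (not taken from any GWAS).
p_values = {"rsA": 1e-8, "rsB": 1e-6, "rsC": 2e-4, "rsD": 0.03}
ld_r2 = {("rsA", "rsB"): 0.8, ("rsA", "rsC"): 0.05, ("rsB", "rsC"): 0.1}
weights = {"rsA": 0.12, "rsB": 0.10, "rsC": -0.05, "rsD": 0.01}
dosages = {"rsA": 2.0, "rsB": 1.0, "rsC": 1.0, "rsD": 0.0}

kept = prune_and_threshold(p_values, ld_r2)
score = polygenic_score(weights, dosages, kept)
print(kept, round(score, 4))
```

In this toy run, `rsD` fails the P value threshold and `rsB` is pruned because of its high r² with the more significant `rsA`, so only `rsA` and `rsC` contribute to the score. Tightening `r2_max` or `p_max` shrinks the retained set, which is why the two values act as tuning parameters.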