Illustrative Examples of Evidence Matrix Columns
The tables below show examples of how PEG evidence matrix columns can be named and formatted.
These examples are not mandatory fields; they are provided to demonstrate recommended naming patterns, data formats, and reporting styles.
- Projects may define additional or alternative columns, but we recommend following these general conventions.
- Metadata should provide comprehensive information to understand the data type, provenance, and scale used for each column.

The examples are grouped into three tables, in this order:

- Variant-centric evidence example
- Gene-centric evidence example
- Integration example

Evidence Category | Column header | Data Format | Description | Requirement | Example data |
---|---|---|---|---|---|
GWAS | GWAS_pvalue | exponent or -log10 | P-value of the primary variant in the source GWAS. Specify whether exponent (e.g. 4×10⁻⁹) or -log10 scale in the metadata file. | optional | 4×10⁻⁹ |
Proximity | PROX_nearest_gene | boolean | Indicates whether the gene is the nearest gene to the variant. Details on how distance is derived (e.g. to TSS, to gene footprint) should be documented in the metadata. | optional | N |
QTL | QTL_eQTL_pancreas_pvalue | exponent or -log10 | Significance value for eQTL association in pancreas tissue. | optional | 0.01 |
QTL | QTL_eQTL_pancreas_CI | range | Confidence interval for the eQTL effect. Define confidence level (e.g. 95%) in metadata. | optional | [1.2, 2.5] |
Functional | FUNC_CADD | float | CADD functional prediction score. Specify genome build and release in the metadata. | optional | 15.62 |
Fine-mapping | FM_credible_set_ID | string | Identifier of the credible set variant from fine-mapping. | optional | chr10:114754071:T:C |
Fine-mapping | FM_PIP | float | Posterior inclusion probability (PIP) from fine-mapping. | optional | 0.98 |
Coloc | COLOC_PPH4 | float | Colocalisation posterior probability that both traits share a causal variant (PPH4). | optional | 0.85 |
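
Because p-value columns such as `GWAS_pvalue` may be reported either as a raw p-value or on the -log10 scale, a consumer has to read the scale from the metadata before comparing values across matrices. A minimal Python sketch of that normalisation follows; the function name `to_neg_log10` and the scale labels `"exponent"` and `"neg_log10"` are assumptions made for illustration, not part of the specification.

```python
import math


def to_neg_log10(value: float, scale: str) -> float:
    """Return a p-value on the -log10 scale.

    `scale` is a hypothetical label taken from the metadata file:
    "exponent" means a raw p-value (e.g. 4e-9), "neg_log10" means
    the value is already -log10(p).
    """
    if scale == "neg_log10":
        return float(value)
    if scale == "exponent":
        if not 0.0 < value <= 1.0:
            raise ValueError(f"p-value out of range: {value}")
        return -math.log10(value)
    raise ValueError(f"unknown scale: {scale!r}")


# The GWAS_pvalue example above, written as a Python float.
gwas_neg_log10 = to_neg_log10(4e-9, "exponent")
```

Rejecting an unknown scale, rather than guessing, keeps a missing metadata entry from silently mixing the two scales.
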
Evidence Category | Column header | Data Format | Description | Requirement | Example data |
---|---|---|---|---|---|
TWAS | TPWAS_TWAS_pvalue | float | Transcriptome-wide association study (TWAS) p-value linking gene expression to trait. | optional | 1×10⁻⁷ |
Expression | EXP_Adipose_TPM | float | Expression level of the gene in adipose tissue, reported as Reads Per Kilobase per Million mapped reads (RPKM) or Transcripts Per Million (TPM). | optional | 0.8 |
Expression | EXP_pancreas_TPM | float | Expression level of the gene in pancreas tissue, reported as RPKM or TPM. | optional | - |
Perturbation | PERTURB_mouse | Free text / ontology terms | Observed phenotype in mouse perturbation models (e.g., knockout, overexpression). Terms can be free text or ontology labels, defined in metadata. | optional | hypoglycemia \| increased insulin secretion \| impaired glucose tolerance |
Knowledge | KNOW | Narrative text | Expert or knowledge-base curation describing gene function and its relationship to phenotype or disease. | optional | ANGPTL4 inhibits lipoprotein lipase (LPL), increasing circulating triglycerides and reducing fatty acid uptake. |
Literature | LIT | Narrative text | Human-curated evidence from published studies linking the gene to relevant traits or disease mechanisms. | optional | Zebrafish Tcf7l2 mutant shows hyperglycemia, pancreatic and vascular defects, reduced regeneration. |
Literature | LIT_PMID | PMID list | PubMed identifiers (PMIDs) supporting literature evidence for the gene-trait association. | optional | PMID_28851992 \| PMID_31829936 |
Drug | DRUG | Drug name(s) | Drug(s) known to target or modulate the gene, separated by \| (pipe). Reference databases (e.g., DrugBank) can be cited in metadata. | optional | METFORMIN \| CYCLOSPORINE |
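
Several gene-centric cells (`PERTURB_mouse`, `LIT_PMID`, `DRUG`) hold more than one value separated by a pipe. A minimal Python sketch of reading such a cell is shown here; the helper name `split_multi` is hypothetical, and the sketch assumes that the pipe never occurs inside an individual value.

```python
def split_multi(cell: str) -> list[str]:
    """Split a pipe-separated cell into trimmed, non-empty values."""
    return [part.strip() for part in cell.split("|") if part.strip()]


# Example cells taken from the table above.
drugs = split_multi("METFORMIN | CYCLOSPORINE")
pmids = split_multi("PMID_28851992 | PMID_31829936")
```

Dropping empty parts means an empty cell yields an empty list rather than a list containing one empty string.
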
Column header | Format | Description | Requirement | Example data |
---|---|---|---|---|
INT_pops | float | Population count or weighted metric used in integration scoring. Define precise meaning and provenance in the metadata file. | optional | 9 |
INT_Combined_prediction_author_score | any | Author-provided integrated prediction score. Units, scale, or categories (e.g. STRONG, WEAK) must be described in the metadata file. | optional | STRONG |
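
The example headers in these tables all begin with a short category prefix (`GWAS_`, `QTL_`, `FM_`, `INT_`, and so on). A minimal Python sketch of checking headers against that pattern follows. The set `EXAMPLE_PREFIXES` is collected only from the examples on this page, and the helper names are hypothetical; since projects may define their own columns, an unrecognised header is better treated as a prompt to document it in the metadata than as an error.

```python
# Prefixes that appear in the example headers on this page (not an official list).
EXAMPLE_PREFIXES = {
    "GWAS", "PROX", "QTL", "FUNC", "FM", "COLOC",
    "TPWAS", "EXP", "PERTURB", "KNOW", "LIT", "DRUG", "INT",
}


def prefix_of(header: str) -> str:
    """Return the part of a column header before the first underscore."""
    return header.split("_", 1)[0]


def unrecognised_headers(headers: list[str]) -> list[str]:
    """List headers whose prefix is not among the example prefixes."""
    return [h for h in headers if prefix_of(h) not in EXAMPLE_PREFIXES]


# "MY_custom_score" is a made-up project-specific column.
to_document = unrecognised_headers(
    ["GWAS_pvalue", "FM_PIP", "KNOW", "INT_pops", "MY_custom_score"]
)
```
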