PEG List
The PEG List is a concise summary of author-prioritised genes, linked transparently to the evidence used to reach that conclusion. It distils the PEG Evidence Matrix into a compact output that reports the top gene for each locus according to an author-designated integration stream.
What the PEG List captures
- Shows the gene (identifier and symbol) that the authors consider most likely to be the effector at each locus.
- Indicates which broad evidence categories were considered when reaching that conclusion, offering a fast overview of evidence types.
- Records which integration or scoring method was used to reach the final conclusion, so that the selection logic is transparent and reproducible.
How it is derived
- Authors select the highest-ranked gene at each locus from the full Evidence Matrix using their specified integration approach (i.e. the method corresponding to the column labelled author_conclusion = TRUE).
- An overview of the evidence and/or integration analyses supporting this conclusion is recorded in the PEG List, in addition to the details in the Evidence Matrix, providing a clear link between the authors' conclusion and the underlying evidence for reanalysis and benchmarking.
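The selection step above can be sketched in a few lines of Python. This is an illustrative sketch only, not the reference implementation. It assumes the Evidence Matrix is available as a list of row dictionaries, that the author-designated integration column holds a numeric score where higher means better, and that a category counts as considered when any of its matrix columns has a value for the chosen gene. The names `build_peg_list`, `INT_example_score`, `PROX_distance`, `QTL_eqtl` and the gene labels are hypothetical.

```python
def build_peg_list(matrix_rows, score_column, category_columns):
    """Keep the top-scoring gene per locus and summarise its evidence categories.

    Assumes score_column holds numbers where a higher value ranks higher.
    """
    best_by_locus = {}
    for row in matrix_rows:
        locus = row["PrimaryVariantID"]
        current = best_by_locus.get(locus)
        if current is None or row[score_column] > current[score_column]:
            best_by_locus[locus] = row

    peg_list = []
    for locus, row in best_by_locus.items():
        entry = {"PrimaryVariantID": locus, "GeneSymbol": row["GeneSymbol"]}
        # Assumption: a category is flagged when any of its columns is non-empty.
        for category, columns in category_columns.items():
            entry[category] = any(row.get(c) not in (None, "") for c in columns)
        entry[score_column] = row[score_column]
        peg_list.append(entry)
    return peg_list


# Hypothetical matrix rows; gene labels and values are placeholders.
matrix = [
    {"PrimaryVariantID": "chr1:100000:T:C", "GeneSymbol": "GENE_A",
     "PROX_distance": 1200, "QTL_eqtl": None, "INT_example_score": 0.4},
    {"PrimaryVariantID": "chr1:100000:T:C", "GeneSymbol": "GENE_B",
     "PROX_distance": 56000, "QTL_eqtl": 0.91, "INT_example_score": 0.8},
    {"PrimaryVariantID": "chr2:20000:A:G", "GeneSymbol": "GENE_C",
     "PROX_distance": 300, "QTL_eqtl": None, "INT_example_score": 0.6},
]
categories = {"PROX": ["PROX_distance"], "QTL": ["QTL_eqtl"]}

peg_list = build_peg_list(matrix, "INT_example_score", categories)
for entry in peg_list:
    print(entry)
```

If the integration column holds categorical labels rather than numbers, the comparison would need an explicit ordering of those labels.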
Relationship to the Evidence Matrix
- The PEG List does NOT replace the Evidence Matrix; it communicates the author's prioritisation in a compact, interpretable form.
- Users needing detailed, directional or quantitative interpretation should refer to the underlying matrix, which retains all evaluated genes, evidence streams, and integration logic.
- Different user groups can work at the level that suits their needs: for example, experimental or translational users might focus on the PEG List for hypothesis generation, while computational users may work directly with the matrix for detailed analysis.
Standard fields
| Column header | Data format | Description | Requirement | Example data |
|---|---|---|---|---|
| PrimaryVariantID | chr:pos:ref:alt | Variant identifier | Mandatory | chr10:114754071:T:C |
| GeneSymbol | HGNC | HGNC gene symbol associated with the variant | Mandatory | VTI1A |
| [Evidence_category_abbreviation] | boolean | The column name represents an evidence category abbreviation from our controlled vocabulary. Values indicate whether or not evidence from this category was considered in the author conclusion integration analysis. | Mandatory | TRUE |
| INT_AuthorConclusion | Bespoke (Any data type, as long as it is used consistently within the column.) | Integrated gene prioritisation outcome, derived from the integration column in the matrix with author_conclusion = TRUE in the metadata. | Mandatory | Strong |
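As a rough illustration of how the fields above might be checked, here is a minimal validator sketch. It is not an official tool. It assumes rows arrive as dictionaries of strings, that booleans are written as TRUE or FALSE, that exactly one column carries the `INT_` prefix, and that the variant identifier matches the pattern below, which is an approximation of `chr:pos:ref:alt`. The function name `validate_peg_row` is hypothetical; the category abbreviations are taken from the example table on this page.

```python
import re

# Approximate pattern for chr:pos:ref:alt (an assumption, not the official rule).
VARIANT_ID = re.compile(r"^chr[0-9A-Za-z]+:\d+:[ACGT]+:[ACGT]+$")

EVIDENCE_CATEGORIES = [
    "GWAS", "PROX", "QTL", "FUNC", "FM", "COLOC",
    "TPWAS", "EXP", "PERTURB", "KNOW", "LIT", "DRUG",
]


def validate_peg_row(row):
    """Return a list of human-readable problems found in one PEG List row."""
    problems = []
    if not VARIANT_ID.match(row.get("PrimaryVariantID", "")):
        problems.append("PrimaryVariantID is missing or not chr:pos:ref:alt")
    if not row.get("GeneSymbol"):
        problems.append("GeneSymbol is missing")
    for name, value in row.items():
        if name in EVIDENCE_CATEGORIES and value not in ("TRUE", "FALSE"):
            problems.append(f"{name} must be TRUE or FALSE, got {value!r}")
    # Assumption: the author conclusion lives in a single INT_-prefixed column.
    int_columns = [name for name in row if name.startswith("INT_")]
    if len(int_columns) != 1 or not row[int_columns[0]]:
        problems.append("exactly one non-empty INT_ column is expected")
    return problems


good_row = {
    "PrimaryVariantID": "chr10:114754071:T:C",
    "GeneSymbol": "VTI1A",
    "GWAS": "TRUE",
    "INT_AuthorConclusion": "Strong",
}
bad_row = {"PrimaryVariantID": "10-114754071", "GeneSymbol": "", "GWAS": "yes"}

good_problems = validate_peg_row(good_row)
bad_problems = validate_peg_row(bad_row)
print(good_problems)
print(bad_problems)
```

The check only covers structure; it does not verify that the gene symbol is a valid HGNC symbol or that the variant exists.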
Example
| PrimaryVariantID | GeneSymbol | GWAS | PROX | QTL | FUNC | FM | COLOC | TPWAS | EXP | PERTURB | KNOW | LIT | DRUG | INT_CombinedPrediction_AuthorScore |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| chr1:100000:T:C | VTI1A | | | | | | | | | | | | | STRONG |
| chr2:20000:A:G | ABO | | | | | | | | | | | | | MODERATE |
Tick = data present. Blank = not assessed. Ticks do NOT imply supportive vs negative.