Version: next

PEG List

The PEG List is a concise summary of author-prioritised genes, linked transparently to the evidence used to reach that conclusion. It distils the PEG Evidence Matrix into a compact output that reports the top gene for each locus according to an author-designated integration stream.

What the PEG List captures

  • Shows the gene (identifier and symbol) that the authors consider most likely to be the effector at each locus.
  • Indicates which broad evidence categories were considered when reaching that conclusion, offering a fast overview of evidence types.
  • Records which integration or scoring method was used to reach the final conclusion, so that the selection logic is transparent and reproducible.

How it is derived

  • Authors select the highest-ranked gene at each locus from the full Evidence Matrix using their specified integration approach (i.e. the method corresponding to the column labelled as author_conclusion = TRUE).
  • An overview of the evidence and/or integration analyses supporting this conclusion is recorded in the PEG List, in addition to the details in the Evidence Matrix. This provides a clear link between the authors’ conclusion and the underlying evidence for reanalysis and benchmarking.
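The derivation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's tooling. It assumes the Evidence Matrix is held as one dictionary per variant-gene pair, that the metadata maps each column name to an author_conclusion flag, and that the flagged integration column holds numeric scores where higher means better. The function name derive_peg_list, the INT_Score column, and the GENE_A/GENE_B/GENE_C rows are all hypothetical.

```python
def derive_peg_list(matrix, metadata, evidence_columns):
    """Pick the top gene per locus using the author-designated integration column."""
    # Exactly one column is assumed to carry author_conclusion = TRUE.
    flagged = [col for col, meta in metadata.items()
               if meta.get("author_conclusion") is True]
    if len(flagged) != 1:
        raise ValueError(f"expected one author_conclusion column, found {flagged}")
    int_col = flagged[0]

    # Keep the highest-scoring row per locus (ties keep the first row seen).
    best = {}
    for row in matrix:
        score = row.get(int_col)
        if score is None:
            continue
        locus = row["PrimaryVariantID"]
        if locus not in best or score > best[locus][int_col]:
            best[locus] = row

    # Reduce each winning row to identifiers, presence flags, and the outcome.
    peg_list = []
    for locus, row in best.items():
        entry = {"PrimaryVariantID": locus, "GeneSymbol": row["GeneSymbol"]}
        for col in evidence_columns:
            # TRUE means data was present, not that it was supportive.
            entry[col] = row.get(col) is not None
        entry[int_col] = row[int_col]
        peg_list.append(entry)
    return peg_list


# Hypothetical matrix: values are invented for illustration only.
example_matrix = [
    {"PrimaryVariantID": "chr1:100000:T:C", "GeneSymbol": "GENE_A",
     "QTL": 0.9, "PROX": 1200, "INT_Score": 0.82},
    {"PrimaryVariantID": "chr1:100000:T:C", "GeneSymbol": "GENE_B",
     "QTL": None, "PROX": 300, "INT_Score": 0.41},
    {"PrimaryVariantID": "chr2:20000:A:G", "GeneSymbol": "GENE_C",
     "QTL": None, "PROX": 50, "INT_Score": 0.67},
]
example_metadata = {
    "QTL": {"author_conclusion": False},
    "PROX": {"author_conclusion": False},
    "INT_Score": {"author_conclusion": True},
}

peg_list = derive_peg_list(example_matrix, example_metadata, ["QTL", "PROX"])
for entry in peg_list:
    print(entry)
```

If the author-designated column holds categorical outcomes (for example Strong or Moderate) rather than numbers, the comparison would need an explicit ordering supplied by the authors.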

Relationship to the Evidence Matrix

  • The PEG List does NOT replace the Evidence Matrix; it communicates the authors’ prioritisation in a compact, interpretable form.
  • Users needing detailed, directional or quantitative interpretation should refer to the underlying matrix, which retains all evaluated genes, evidence streams, and integration logic.
  • Different user groups can work at the level that suits their needs. For example, experimental or translational users might focus on the PEG List for hypothesis generation, while computational users may work directly with the matrix for detailed analysis.

Standard fields

| Column header | Data format | Description | Requirement | Example data |
|---|---|---|---|---|
| PrimaryVariantID | chr:pos:ref:alt | Variant identifier | Mandatory | chr10:114754071:T:C |
| GeneSymbol | HGNC | HGNC gene symbol associated with the variant | Mandatory | VTI1A |
| [Evidence_category_abbreviation] | boolean | The column name represents an evidence category abbreviation from our controlled vocabulary. Values indicate whether or not evidence from this category was considered in the author conclusion integration analysis. | Mandatory | TRUE |
| INT_AuthorConclusion | Bespoke (any data type, as long as it is used consistently within the column) | Integrated gene prioritisation outcome, derived from the integration column in the matrix with author_conclusion = TRUE in the metadata. | Mandatory | Strong |
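As a rough illustration of how the standard fields could be checked, the sketch below validates PEG List rows held as Python dictionaries. It is not an official validator. The function name validate_peg_list and the regular expression for chr:pos:ref:alt are assumptions, as is the choice to represent the boolean evidence columns with Python bool values; QTL is used only as a sample evidence category column.

```python
import re

# Assumed reading of the chr:pos:ref:alt format; adjust to the real specification.
VARIANT_ID = re.compile(r"^chr(\d{1,2}|X|Y|MT?):\d+:[ACGT]+:[ACGT]+$")


def validate_peg_list(rows, evidence_columns,
                      conclusion_column="INT_AuthorConclusion"):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    conclusion_types = set()
    for i, row in enumerate(rows):
        variant_id = row.get("PrimaryVariantID")
        if not isinstance(variant_id, str) or not VARIANT_ID.match(variant_id):
            problems.append(
                f"row {i}: PrimaryVariantID {variant_id!r} is not chr:pos:ref:alt")
        if not row.get("GeneSymbol"):
            problems.append(f"row {i}: GeneSymbol is missing")
        for col in evidence_columns:
            # Evidence category columns are mandatory booleans.
            if not isinstance(row.get(col), bool):
                problems.append(f"row {i}: {col} must be TRUE or FALSE")
        if row.get(conclusion_column) is None:
            problems.append(f"row {i}: {conclusion_column} is missing")
        else:
            conclusion_types.add(type(row[conclusion_column]))
    # Bespoke values are allowed, but one data type per column.
    if len(conclusion_types) > 1:
        problems.append(f"{conclusion_column} mixes data types")
    return problems


good_row = {"PrimaryVariantID": "chr10:114754071:T:C", "GeneSymbol": "VTI1A",
            "QTL": True, "INT_AuthorConclusion": "Strong"}
bad_row = {"PrimaryVariantID": "10-114754071-T-C", "GeneSymbol": "VTI1A",
           "QTL": "yes", "INT_AuthorConclusion": 3}

for problem in validate_peg_list([good_row, bad_row], ["QTL"]):
    print(problem)
```

The check on GeneSymbol only tests for presence; confirming that a symbol is a current HGNC symbol would need a lookup against HGNC data, which is outside this sketch.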

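Example

| PrimaryVariantID | GeneSymbol | GWAS | PROX | QTL | FUNC | FM | COLOC | TPWAS | EXP | PERTURB | KNOW | LIT | DRUG | INT_CombinedPrediction_AuthorScore |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| chr1:100000:T:C | VTI1A | | | | | | | | | | | | | STRONG |
| chr2:20000:A:G | ABO | | | | | | | | | | | | | MODERATE |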

Tick = data present. Blank = not assessed. A tick does NOT indicate whether the evidence is supportive or negative.