Version: next

📋 PEG Evidence Matrix Standard

Genomic Identifiers

Column header: PrimaryVariantID
Data format: chr:bp:ref:alt
Description:
  • Is the variant to which variant-centric evidence relates.
  • Is the primary row ID.
  • The primary variant may be the lead variant, a variant in LD with the lead, or a fine-mapped SNP at the locus; this should be defined in the metadata file.
Requirement: Mandatory
Example data: chr10:114754071:T:C

Column header: rsID
Data format: rs[]
Description: The rsID of the primary variant.
Requirement: Optional
Example data: rs1234

Column header: Var_[xyz]
Data format: Bespoke (any data type, as long as it is used consistently within the column)
Description: Other columns relating to variant identification may be added. PEGASUS recommend using the format Var_[xyz]; such columns should be defined in the metadata file.
Requirement: Optional
Example data: bespoke
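
As an illustration of the identifier formats in this section, the following is a minimal Python sketch that checks a PrimaryVariantID and an rsID. The standard only states the shapes chr:bp:ref:alt and rs[]; the accepted chromosome labels, the ACGT allele alphabet, the digits-only rsID suffix, and the names `Variant`, `parse_primary_variant_id`, and `is_rsid` are assumptions made for this example, not part of the standard.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: "chr" + chromosome label, 1-based position, ref allele, alt allele.
# The chromosome labels and allele alphabet are assumptions of this sketch.
_VARIANT_RE = re.compile(
    r"chr(?P<chrom>[1-9]|1[0-9]|2[0-2]|X|Y|MT?)"
    r":(?P<pos>[1-9][0-9]*)"
    r":(?P<ref>[ACGT]+)"
    r":(?P<alt>[ACGT]+)"
)

# Assumed shape for rs[]: the literal "rs" followed by digits only.
_RSID_RE = re.compile(r"rs[0-9]+")


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_primary_variant_id(value: str) -> Optional[Variant]:
    """Return the parsed variant, or None if the string does not match."""
    m = _VARIANT_RE.fullmatch(value)
    if m is None:
        return None
    return Variant(m["chrom"], int(m["pos"]), m["ref"], m["alt"])


def is_rsid(value: str) -> bool:
    """True if the value looks like an rsID under the assumption above."""
    return _RSID_RE.fullmatch(value) is not None


# The example value from the table parses into its four components.
print(parse_primary_variant_id("chr10:114754071:T:C"))
```

A stricter or looser pattern (for example, allowing indel notation or other contig names) may be appropriate depending on what the metadata file declares.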

Evidence — General Pattern

All variant-centric evidence columns are optional. However, PEGASUS suggest including at least TWO pieces of evidence to support the variant-gene-phenotype relationship.

PEGASUS define a general reporting pattern:

Column header: Category_(stream)_[details]
Data format: Bespoke (any data type, as long as it is used consistently within the column)
Description:

Headers follow the format Category_(stream)_[details].

Category: Use the abbreviated category name from the controlled list of evidence categories.

(stream) is optional and is only required when multiple evidence streams are used within a single category (e.g. QTL_eqtl).

[details] is a user-defined suffix that reflects the content of the data.

For any field consisting of multiple words, please use CamelCase. For example, credible set id should be written as CredibleSetID.

If no category in the list is applicable, please use Other_[CustomisedCategory]_(stream)_[details].

Requirement: Optional
Example data: variant-centric evidence examples; gene-centric evidence examples
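
The header pattern described above can be sketched as a small Python parser. This is only an illustration: it assumes that underscores appear solely as separators between components (which the CamelCase rule suggests but the standard does not state outright), it does not check the category against the controlled list, and the names `EvidenceHeader` and `parse_evidence_header` as well as the sample headers `QTL_eqtl_PValue` and `Other_MyAssay_Score` are hypothetical.

```python
from typing import NamedTuple, Optional


class EvidenceHeader(NamedTuple):
    category: str
    stream: Optional[str]
    details: str


def parse_evidence_header(header: str) -> EvidenceHeader:
    """Split a header into category, optional stream, and details.

    Assumes "_" is used only as a component separator.
    """
    parts = header.split("_")
    # Every component must be non-empty and alphanumeric (CamelCase words).
    if any(not p.isalnum() for p in parts):
        raise ValueError(f"malformed header: {header!r}")

    if parts[0] == "Other":
        # Other_[CustomisedCategory]_(stream)_[details]
        if len(parts) == 3:
            return EvidenceHeader(f"Other_{parts[1]}", None, parts[2])
        if len(parts) == 4:
            return EvidenceHeader(f"Other_{parts[1]}", parts[2], parts[3])
    elif len(parts) == 2:
        # Category_[details]
        return EvidenceHeader(parts[0], None, parts[1])
    elif len(parts) == 3:
        # Category_(stream)_[details]
        return EvidenceHeader(parts[0], parts[1], parts[2])

    raise ValueError(f"unexpected number of components: {header!r}")


# Hypothetical headers, used only to exercise the parser.
print(parse_evidence_header("QTL_eqtl_PValue"))
print(parse_evidence_header("Other_MyAssay_Score"))
```

Because the stream is optional, a two-component header is read here as category plus details; a project that needs a different reading should state it in its metadata file.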

PEGASUS only define column name patterns and do not impose strict requirements on the data type. For guidance, PEGASUS provide reference guidelines for the general evidence categories. Each category, whether variant-centric or gene-centric, comes with suggested naming patterns and example formats.
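
Although the data type is left open, the tables state that it should be used consistently within a column. One possible way to check this is sketched below in Python. The interpretation is an assumption: "consistent" is taken to mean that all non-missing values share one Python type, `None` is taken to mark a missing value, and the function name `column_is_consistent` is hypothetical.

```python
from typing import Iterable, Optional


def column_is_consistent(values: Iterable[Optional[object]]) -> bool:
    """True if all non-missing values in a column share a single type.

    Assumption: None marks a missing value and is ignored.
    Note that int and float count as different types under this reading.
    """
    kinds = {type(v) for v in values if v is not None}
    return len(kinds) <= 1


print(column_is_consistent([0.1, 0.5, None]))  # True
print(column_is_consistent([0.1, "high"]))     # False
```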

Integration Evidence — General Pattern

Column header: INT_[tag]_[details]
Data format: Bespoke (any data type, as long as it is used consistently within the column)
Description:

Headers may follow the format INT_[tag]_[details].

INT indicates integration evidence;

[tag] is a customised label defined in the metadata;

[details] is a user-defined suffix that reflects the content of the data.

For a multi-word [details], use CamelCase (e.g., CombinedPredictionAuthorScore).

Requirement: Optional
Example data: Integration evidence example
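
To illustrate the INT_[tag]_[details] pattern, here is a minimal Python sketch that picks the integration columns out of a list of headers and groups their details by tag. It assumes underscores are used only as separators; the function name `group_integration_columns`, the tag `ToolA`, and the sample headers other than the CombinedPredictionAuthorScore suffix are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_integration_columns(columns: Iterable[str]) -> Dict[str, List[str]]:
    """Map each [tag] to the [details] suffixes of its INT_ columns.

    Headers that do not have exactly three non-empty components
    starting with "INT" are skipped.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in columns:
        parts = name.split("_")
        if len(parts) == 3 and parts[0] == "INT" and all(parts):
            groups[parts[1]].append(parts[2])
    return dict(groups)


# Hypothetical header row mixing identifier, evidence, and integration columns.
headers = [
    "PrimaryVariantID",
    "QTL_eqtl_PValue",
    "INT_ToolA_CombinedPredictionAuthorScore",
    "INT_ToolA_Rank",
]
print(group_integration_columns(headers))
# {'ToolA': ['CombinedPredictionAuthorScore', 'Rank']}
```

Each tag found this way would be expected to have a matching entry in the metadata file, since that is where the label is defined.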