
2025 ASHG Ancillary Session

Predicted Effector Gene (PEGASUS) Workshop

Aims of the Workshop

The workshop covers a recap of current landscape analyses, an overview of the new PEGASUS framework, and insights from a benchmarking exercise. Participants will engage in structured discussions using example datasets, provide feedback on the standard, and explore opportunities to adopt PEGASUS in research. The session aims to strengthen community alignment, improve data interoperability, and identify ambassadors to support broader adoption of the PEGASUS standard.

πŸ“ Dates & Locations

  • Time: 11:45 AM – 1:15 PM, October 17, 2025
  • Venue: Room 259A, Thomas M. Menino Convention & Exhibition Center (MCEC, formerly the BCEC), 415 Summer St., Boston, MA 02210

ℹ️ Get more details at the community page


Agenda

  • 11:45 – Assembly
    Participants arrive and settle

  • 11:50 – Welcome & Introduction
    Noël Burtt, Knowledge Portal Network, Broad Institute [slides]

  • 11:55 – Landscape Analysis Recap
    Laura Harris, GWAS Catalog, EMBL-EBI [slides]

  • 12:05 – PEGASUS Framework Overview
    Aoife McMahon, Genetic Data Platform, EMBL-EBI [slides]

  • 12:25 – Q&A (5 mins)
    Facilitated by Aoife

  • 12:30 – Benchmarking Overview (2 mins)
    Aoife McMahon, Genetic Data Platform, EMBL-EBI

  • 12:32 – Benchmarking Results (8 mins)
    Matt Pahl, Children's Hospital of Philadelphia [slides]

  • 12:40 – Discussion Session
    Introduced by Laura Harris & Yue Ji

  • 13:10 – Wrap-up & Ambassador Call
    Noël Burtt, Knowledge Portal Network, Broad Institute
    Timekeeping: Julie Jurgens, Knowledge Portal Network, Broad Institute


Ancillary Session Materials


πŸ“ Ancillary session notes​

Aoife Q&A (introduction)

Q: Can you say more about manual / computational / combination approaches?
A: It is very important to know whether evidence integration was done purely computationally or whether a human was involved (a manual process).

Q: Will you use an ontology for evidence types?
A: Good idea, and we want to do it, but it is a barrier to onboarding new folks and to an initial standard.

Q: Is PEGASUS a system of recommendations for publications going forward, or will you curate/process existing data? Could this process be automated?
A: We used existing lists to evaluate whether PEGASUS could represent varied data types. It is hard working with other people's data. This will not be an ongoing activity (the benchmarking was a one-off). Data generators will hopefully adopt PEGASUS themselves (it is a system).

Matt Q&A (benchmarking)

Q: Would the time spent have been much less if it had been your own paper, instead of curating somebody else's data?
A: It would have been simpler in some respects, but you still need to cross-reference with older datasets.

Laura: Submitting GWAS summary stats is a lot of effort, but when you want your own stats you grab them from the Catalog instead of hunting around a laptop / server. Hopefully the same idea applies here: some effort to generate the data, but it is sustainable long-term.

Adam: The same is true for PEG lists: grabbing lists from knowledge bases instead of a horrible supplementary table. On incentivisation: we make these lists because we want people to use them. Some activation energy is required, but the investment pays off.

Laura: Resources want to provide tools to submitters to simplify processes (e.g. YAML conversion / validation).
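The kind of submitter tooling Laura describes could look roughly like the sketch below. Everything here is hypothetical: the field names (`locus_id`, `gene`, `evidence_type`, `curation_mode`) and allowed values are illustrative placeholders, not the actual PEGASUS schema; in practice the record would be loaded from a submitted YAML file (e.g. via PyYAML's `yaml.safe_load`).

```python
# Hypothetical sketch of a PEGASUS-style submission validator.
# Field names and allowed values are illustrative assumptions,
# not the actual standard.

REQUIRED_FIELDS = {"locus_id", "gene", "evidence_type", "curation_mode"}
ALLOWED_MODES = {"manual", "computational", "combined"}

def validate_record(record: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for field in sorted(REQUIRED_FIELDS - record.keys()):
        errors.append(f"missing required field: {field}")
    mode = record.get("curation_mode")
    if mode is not None and mode not in ALLOWED_MODES:
        errors.append(f"curation_mode must be one of {sorted(ALLOWED_MODES)}")
    return errors

# In practice the record would come from a submitted YAML file,
# e.g. yaml.safe_load(open("peg_list.yaml")).
record = {"locus_id": "rs12345_locus", "gene": "TCF7L2",
          "evidence_type": "eQTL colocalisation", "curation_mode": "combined"}
print(validate_record(record))  # []
```

A resource-side tool of this shape lets submitters catch schema problems locally before upload, which is the "simplify processes" point above.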

Discussion notes

  • Data user comments

    • Useful framework / data!! 🎉
    • Mapping against ontology terms is essential
    • Potential for negative evidence
      • PEGASUS framework can grow / change over time, negative evidence is really hard but a good idea
      • You can dig into the matrix to infer negative evidence
  • Data generators

    • Standard locus id can help authors integrate or organise their own tables / data
    • We’d apply it if there’s an easy way to use it (data resource tooling)
  • Incentivise adoption of standard

    • Journal / funder requirements
      • People do GWAS and might optionally make a PEG list. It’s not every author’s priority, so journals may amplify and encourage but not mandate.
    • Providing evidence that structured lists get more attention / reuse / downloads is itself an incentive
    • Authors could benefit from adopting the framework during their own analysis (e.g. addition of new data types can change the PEG list)
  • Distinction between list and matrix useful?

    • Yes, matrix for advanced users and list for basic checks

(out of time)