
HANCESTRO Ontology Documentation

Welcome to the HANCESTRO documentation!

The Human Ancestry Ontology (HANCESTRO) provides a systematic description of the ancestry concepts used in the NHGRI-EBI Catalog of published genome-wide association studies. It includes a list of countries, regions and major areas (essentially continents), as well as a fairly exhaustive list of defined ancestral categories, uncategorised ancestral categories and population isolates.

The following HANCESTRO release files are provided

Please see our FAQ for more information.

You can find descriptions of the standard ontology engineering workflows here.

What's the difference between ancestry and ethnicity?

Ancestry generally refers to an individual's objectively verifiable genetic background, which can be clustered with reference populations from a certain region. Ethnicity, on the other hand, refers to an individual's self-identified cultural background. For most individuals, ancestry and ethnicity align closely, but an individual may identify with an ethnic background that does not match their genetic ancestry, for example because of gaps or misrepresentations in their family history.

For more information, see Mersha & Abebe, 2015 or https://health.clevelandclinic.org/how-your-ancestry-and-ethnicity-affect-your-health/

Ancestry categories in the Human Ancestry Ontology refer to genetic ancestry and have been defined in this context. They may also be used to annotate self-reported ethnicity data but with the clear caveat that annotation does not guarantee the accuracy of related genetic information.

HANCESTRO root level concepts

Ancestry category

The key purpose of HANCESTRO is to capture the primary ancestry categories defined in Morales et al., 2018. Where additional genetic information is available to assign a smaller population category (e.g. one defined by country) to one of the defined ancestry categories, this is also represented in the ontology hierarchy. However, many country-based categories are either uncategorised, i.e. the available ancestry information does not fall into one of the defined categories, or undefined, i.e. insufficient information is available to allocate the population to a specific ancestral group.

The primary asserted relationship between ancestry categories and countries is the isDemonymOf relationship, which indicates that individuals may be residents or natives of a particular place but makes no assertion about their genetic background.

In exceptional cases, ancestry categories are associated with countries via the hasCountryOfOrigin relationship. This is a much more formal definition, generally reserved for genetically isolated populations where objectively verifiable information is available to relate individuals from the population to a small geographic area within a certain country.
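The difference between the two relationships can be sketched in a few lines of Python. This is only an illustration, not how HANCESTRO is stored or queried: the population and country labels are hypothetical placeholders, and only the relationship names isDemonymOf and hasCountryOfOrigin come from the text above.

```python
from collections import defaultdict

# Hypothetical (subject, relationship, object) assertions.
# The labels are placeholders, not real HANCESTRO terms.
ASSERTIONS = [
    ("Population A", "isDemonymOf", "Country X"),
    ("Population B", "isDemonymOf", "Country Y"),
    ("Isolate C", "hasCountryOfOrigin", "Country X"),
]


def index_by_relationship(assertions):
    """Group (subject, object) pairs under each relationship name."""
    index = defaultdict(list)
    for subject, relationship, obj in assertions:
        index[relationship].append((subject, obj))
    return index


def has_asserted_origin(subject, assertions):
    """Return True only if the subject has a hasCountryOfOrigin assertion.

    An isDemonymOf link says where people live or come from and makes no
    claim about genetic background, so it is deliberately ignored here.
    """
    return any(
        s == subject and r == "hasCountryOfOrigin" for s, r, _ in assertions
    )


index = index_by_relationship(ASSERTIONS)
print(sorted(index))
print(has_asserted_origin("Population A", ASSERTIONS))
print(has_asserted_origin("Isolate C", ASSERTIONS))
```

In this sketch a demonym link alone never counts as evidence of genetic origin, which mirrors the caveat attached to isDemonymOf.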

Ethnicity category

Many population descriptors do not fully match defined ancestry populations, although individuals from these populations may show some level of genetic similarity with a given ancestry category. Ethnicity descriptors are more likely to describe population groups with a similar cultural, religious, tribal, geographic or other social background. This includes terms such as Jewish Israeli, Punjabi Sikh or Hispanic.

HANCESTRO is intended to annotate both self-reported information, usually referred to as self-reported ethnicity, and genetic ancestry inferred from the similarity of individuals' sequence data to reference populations. To make it more broadly useful for both, we created a new ethnicity category (HANCESTRO:0601). This is a grouping class for all terms in HANCESTRO that describe populations for which no genetic reference population is available and that are not geographic descriptors.

Geographic category

One common way of describing individuals is by reference to the location where they were born, where they live or where their family originated, for example with national or regional demonyms. Demonyms designate all people associated with a particular place, regardless of ethnic, linguistic, religious or other cultural considerations.

In order to make it easier for HANCESTRO users to identify all geographical descriptors, we implemented a new geographic category (HANCESTRO:0602) that groups all country and regional demonyms.
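As a rough sketch of how the two grouping classes might be used when processing annotations, the Python below expands the identifiers into IRIs and walks a toy hierarchy up to its grouping class. It assumes the usual OBO-style PURL pattern for IRIs. The `EX:` identifiers and the parent links are hypothetical placeholders, not real HANCESTRO terms.

```python
OBO_BASE = "http://purl.obolibrary.org/obo/"  # assumed OBO-style PURL base

ETHNICITY_CATEGORY = "HANCESTRO:0601"   # identifier from the text above
GEOGRAPHIC_CATEGORY = "HANCESTRO:0602"  # identifier from the text above


def curie_to_iri(curie):
    """Expand a 'PREFIX:LOCAL' identifier into an OBO-style IRI."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a CURIE: {curie!r}")
    return f"{OBO_BASE}{prefix}_{local_id}"


# Hypothetical child -> parent links; the EX: identifiers are placeholders.
PARENTS = {
    "EX:0001": ETHNICITY_CATEGORY,
    "EX:0002": GEOGRAPHIC_CATEGORY,
    "EX:0003": "EX:0002",
}


def grouping_class(term, parents):
    """Follow parent links upwards until a term with no parent is reached."""
    seen = set()
    while term in parents:
        if term in seen:
            raise ValueError("cycle in parent links")
        seen.add(term)
        term = parents[term]
    return term


print(curie_to_iri(GEOGRAPHIC_CATEGORY))
print(grouping_class("EX:0003", PARENTS) == GEOGRAPHIC_CATEGORY)
```

A real workflow would read the hierarchy from a HANCESTRO release file rather than a hand-written dictionary.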

Ancestry status

This class represents the general characterisation of the ancestry of a population or individual. Ancestry status can either be admixed or genetically isolated.

Continent

Formally defined major geographic area.

Country

Defined geographic area, generally corresponding to an independent state or nation.

Region

Convenience grouping of geographic areas larger than countries but smaller than continents. Regions may not have formal geographic definitions.