Any discussion of the world’s languages almost always raises the same fact: most of the world’s languages are spoken by a small fraction of the world’s population, while a large proportion of the population speaks a small fraction of the languages. English, French, Spanish, Arabic, Russian, Hindi (or Urdu), and Chinese are spoken, at least to some extent, in many countries across the world. You would also be presented with a worrying statistic: most of the world’s languages are in danger of going extinct. UNESCO estimates that 90% of the world’s languages may be replaced by dominant languages by the end of the 21st century.
Language loss, without doubt, represents the loss of a unique aspect of a people group’s culture: how they express and organise the world around them, along with their histories and ecological knowledge. This underscores the need to document languages and evaluate which of them are endangered, so that speech communities and organisations dedicated to language revitalisation can organise action to prevent their loss. Today, I want to look into how endangered languages are identified and answer the titular question: how do we assess a language’s vitality?
According to UNESCO (2003), a language is considered endangered when it is on a path towards extinction. The typical terms describing the levels of endangerment you would encounter are:
- Not endangered (also safe, thriving, vigorous etc.)
- Vulnerable
- Definitely endangered
- Severely endangered
- Critically endangered (also moribund?)
- Extinct (and dormant)
However, as we will soon see, the different levels of endangerment entail meeting different criteria, which differ from method to method. As such, a language assessed to be ‘vigorous’ using one framework might even be classified as ‘endangered’ in another framework.
The GIDS
One of the first tools used to assess language vitality is the Graded Intergenerational Disruption Scale, or GIDS, proposed by Joshua Fishman in 1991. It assesses the extent of disruption along two dimensions: the domains in which the language is used, and the extent to which the language is transmitted to younger generations (intergenerational transmission). The greater the disruption along these dimensions, the more endangered the language is likely to be. The GIDS is an eight-level scale. A language at level one is used in most social domains, including work, education, and even government, while a language at level eight has lost most of its usage, and its speakers are predominantly older adults, particularly of the grandparent generation.
This index, while rather simple in presentation, does not cover all the possible statuses of a language, such as dormancy. Fishman’s index also outlines the criteria for each level of disruption in a rather static fashion, and does not distinguish between language shift (to a more commonly spoken language) and language development or revitalisation. Ironically, another drawback is the lack of elaboration of the criteria for the levels where disruption is greatest, despite the scale’s particular focus on disruption to language use and transmission.
UNESCO’s Framework
In 2003, the UNESCO Intangible Cultural Heritage Section’s Ad Hoc Expert Group on Endangered Languages published a framework detailing the evaluation of a language’s endangerment and the urgency for language documentation. They acknowledged that a single factor is insufficient to make a proper assessment, and laid out nine factors that should be considered when evaluating a language’s current situation. These factors are:
- Intergenerational language transmission
- Absolute number of speakers
- Proportion of speakers in total population
- Shifts in domains of language use
- Response to new domains and media (e.g. broadcast media, Internet, schools, new work environments)
- Availability of materials for language education and literacy
- Governmental and institutional language attitudes and policies
- Community members’ attitudes towards their own language
- Type and quality of documentation
Amongst these factors, only the absolute number of speakers is assessed using real numbers rather than a scale of 0 to 5. However, the framework does not outline how these numbers should be interpreted. Rather, it highlights the difficulty of doing so, while remarking that a smaller speech community is generally more at risk than a larger one. For the factors that apply a scale, each grade has its own criteria, and some factors also assign a term to each degree of endangerment.
| Factor | 0 | 1 | 2 | 3 | 4 | 5 |
| Intergenerational transmission | Extinct | Critically endangered | Severely endangered | Definitely endangered | Unsafe | Safe |
| Proportion of speakers in total population | Extinct | Critically endangered | Severely endangered | Definitely endangered | Unsafe | Safe |
| Shifts in domains of language use | Extinct | Highly limited domains | Limited or formal domains | Dwindling domains | Multilingual parity | Universal use |
| Response to new domains and media | Inactive | Minimal | Coping | Receptive | Robust / active | Dynamic |
| Governmental and institutional language attitudes and policies | Prohibition | Forced assimilation | Active assimilation | Passive assimilation | Differentiated support | Equal support |
| Type and quality of documentation | Undocumented | Inadequate | Fragmentary | Fair | Good | Superlative |
The UNESCO framework is unusual in that it includes the type and quality of documentation of the language as a factor of language vitality. This includes the availability of comprehensive grammars, dictionaries, and annotated audio and video recordings, down to grammar sketches and short word-lists. It does help to show which languages are in urgent need of documentation, or which languages have a strong potential to be revitalised, but there does not seem to be a correlation between the type or quality of language documentation and a language’s vitality. The Latin language has been cited as a counterexample in Lee and van Way’s 2016 publication, as it is particularly well documented in writing, yet no longer has any native speakers today.
Because UNESCO’s framework assesses each factor of endangerment separately, working out which languages are endangered is not straightforward. The framework does not give an overall score for a language’s vitality, as it does not combine the factors into a single level of endangerment. It does, however, suggest two ways of assessing a language’s vitality or endangerment. The first is a self-assessment by a speech community to assess the situation and determine if action is needed (and if so, what action). The second is an external evaluation by organisations of an official or voluntary nature which are dedicated to language maintenance, revitalisation, or documentation.
The absence of an overall score makes it difficult to compare languages in a way that incorporates multiple factors of endangerment. An example of this is seen in the Atlas of the World’s Languages in Danger, where language endangerment was determined using only one factor of the UNESCO framework, the extent of intergenerational transmission of the language. This ironically goes against section 4.1 of the framework, which states that “No single factor alone can be used to assess a language’s vitality or its need for documentation”. Another limitation is that this framework does not really address upward trends in language shift, as seen in ongoing language revitalisation movements. It also does not distinguish between extinct and dormant languages in its levels of endangerment, though it does treat a people group’s attitude towards their own language as a separate factor.
Ethnologue’s Former Framework
Before 2010, Ethnologue used its own scale to classify language vitality, according to Lewis and Simons (2010). The main difference is that Ethnologue’s former scale focused primarily on the number of speakers who use a language as their first language. As such, this scale is not all that informative on its own, as it leaves out many factors that impact language vitality, such as intergenerational transmission, which receives particular focus in most of the other scales and indices discussed here.
Here, five categories are used, but many languages that other frameworks would place at levels like ‘definitely endangered’ could be classified as ‘Living’ in Ethnologue’s former framework. There is, however, a special category called ‘Second Language Only’, which consists of languages that are ‘still in use’ but ‘not learned as a first language’. This groups together liturgical languages, some pidgins, and even cants and jargons, and in the 16th edition of Ethnologue it also included languages that were formerly considered ‘extinct’ but are undergoing revitalisation. The ‘Dormant’ category was also added in the 16th edition. It separates languages that are no longer spoken according to whether an ethnic population still identifies itself with the language.
| Category | Description |
| Living | Significant population of first-language speakers |
| Second Language Only | Used only as a second language, may include emerging users |
| Nearly Extinct | <50 speakers, or very small and decreasing fraction of an ethnic population |
| Dormant | No known remaining speakers, but a population links its ethnic identity to the language |
| Extinct | No known remaining speakers, and no population links its ethnic identity to the language |
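The categories in this table can be read as a short series of checks. The sketch below is my own illustration, not Ethnologue’s procedure: the function name, the parameter names, and the order in which the checks run are all assumptions, and the ‘significant population’ wording of the ‘Living’ category is simplified to ‘anything that is not nearly extinct’.

```python
def former_ethnologue_category(l1_speakers, l2_users=0,
                               ethnic_identity=False,
                               small_and_shrinking=False):
    """Assign one of the five categories from the table above.

    All parameter names are hypothetical:
    l1_speakers         -- number of first-language speakers
    l2_users            -- number of people using it as a second language only
    ethnic_identity     -- a population links its ethnic identity to the language
    small_and_shrinking -- speakers are a very small, decreasing fraction
                           of the ethnic population
    """
    if l1_speakers == 0:
        # No first-language speakers: still in use, dormant, or extinct.
        if l2_users > 0:
            return "Second Language Only"
        return "Dormant" if ethnic_identity else "Extinct"
    if l1_speakers < 50 or small_and_shrinking:
        return "Nearly Extinct"
    # Assumption: everything else counts as a 'significant' L1 population.
    return "Living"


print(former_ethnologue_category(l1_speakers=120_000))
print(former_ethnologue_category(l1_speakers=0, l2_users=300))
print(former_ethnologue_category(l1_speakers=0, ethnic_identity=True))
```

Notice that a language with a few thousand first-language speakers and no child learners would still come out as ‘Living’ here, which is exactly the blind spot described above.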
In spite of these limitations, the changes to Ethnologue’s former scale illustrate a developing understanding of trends in language endangerment and revitalisation. However, these modifications are not adequate to capture the bigger picture of language vitality, which is why Ethnologue has since adopted another scale, which we will look at next.
Expanding the GIDS
Recognising these limitations of the GIDS, the UNESCO Framework, and Ethnologue’s framework, M. Paul Lewis and Gary F. Simons worked to expand upon the GIDS. The result still places emphasis on the domains of language usage and the extent of language transmission to younger generations. This 13-level scale, published in 2010, is called the EGIDS, or Expanded GIDS:
| Level | Label | Description | UNESCO Equivalent |
| 0 | International | Internationally used for broad range of functions. | Safe |
| 1 | National | Used nationwide for education, work, mass media, and government. | Safe |
| 2 | Regional | Used in local and regional mass media, and governmental services. | Safe |
| 3 | Trade | Used in local and regional work by insiders and outsiders. | Safe |
| 4 | Educational | Literacy is transmitted through public education systems. | Safe |
| 5 | Written | Oral use of language by all generations, written form is used in parts of the community. | Safe |
| 6a | Vigorous | Oral use of language by all generations, learned by children as their first language. | Safe |
| 6b | Threatened | Oral use of language by all generations, but only some of the child-bearing generation is transmitting the language to their children. | Vulnerable |
| 7 | Shifting | Child-bearing generation use the language amongst themselves, but no transmission to their children. | Definitely Endangered |
| 8a | Moribund | Only remaining active speakers are members of the grandparent generation. | Severely Endangered |
| 8b | Nearly Extinct | Only remaining active speakers are members of the grandparent generation, and have little opportunity to use the language. | Critically Endangered |
| 9 | Dormant | The language serves as a reminder of heritage identity for an ethnic community. No one has more than symbolic proficiency. | Extinct |
| 10 | Extinct | No one retains a sense of ethnic identity associated with the language, even for symbolic purposes. | Extinct |
An advantage of the EGIDS is that it describes the downward trend of language shift, something that is lacking in the original GIDS. Furthermore, the EGIDS also includes a section for language revitalisation, with an adaptation of levels 6a to 9 for this very upward trend. To place a language on this scale, five key questions have to be answered: (1) What is the current identity function of the language? (2) What is the level of official use? (3) Are all parents transmitting the language to their children? (4) What is the literacy status? (5) What is the youngest generation of proficient speakers? The responses to these questions are fed into a decision tree, which shows where on the scale the language sits. This scale is also currently used by Ethnologue, replacing the framework it used prior to 2009.
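To make the decision tree more concrete, here is a minimal sketch of how the five questions could lead to a level in the table above. The answer options (‘vehicular’, ‘home’, ‘heritage’, ‘historical’, and so on) and the branching order are my reconstruction of the tree in Lewis and Simons (2010), so treat them as assumptions and check them against the paper. The function and parameter names are hypothetical, and the revitalisation variants of levels 6a to 9 are left out.

```python
LABELS = {
    "0": "International", "1": "National", "2": "Regional", "3": "Trade",
    "4": "Educational", "5": "Written", "6a": "Vigorous", "6b": "Threatened",
    "7": "Shifting", "8a": "Moribund", "8b": "Nearly Extinct",
    "9": "Dormant", "10": "Extinct",
}


def egids_level(identity_function, official_use=None,
                all_parents_transmitting=None, literacy_status=None,
                youngest_proficient_generation=None):
    """Walk the five key questions and return (level, label).

    Answer vocabularies are assumed, not quoted from the source.
    """
    # Q1: What is the current identity function of the language?
    if identity_function == "historical":
        level = "10"
    elif identity_function == "heritage":
        level = "9"
    elif identity_function == "vehicular":
        # Q2: What is the level of official use?
        level = {"international": "0", "national": "1",
                 "regional": "2", "not official": "3"}[official_use]
    elif identity_function == "home":
        # Q3: Are all parents transmitting the language to their children?
        if all_parents_transmitting:
            # Q4: What is the literacy status?
            level = {"institutional": "4", "incipient": "5",
                     "none": "6a"}[literacy_status]
        else:
            # Q5: What is the youngest generation of proficient speakers?
            level = {"children": "6b", "parents": "7",
                     "grandparents": "8a",
                     "great-grandparents": "8b"}[youngest_proficient_generation]
    else:
        raise ValueError(f"unknown identity function: {identity_function!r}")
    return level, LABELS[level]


print(egids_level("home", all_parents_transmitting=False,
                  youngest_proficient_generation="parents"))
```

In this sketch, a language used in the home whose youngest proficient speakers are of the parent generation lands on level 7 (‘Shifting’), matching the description in the table.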
The Language Endangerment Index (LEI)
The Catalogue of Endangered Languages (ELCat) also applies its own framework for assessing how endangered a language is, called the Language Endangerment Index or the LEI. Like the UNESCO index, the LEI classifies languages into 6 categories (plus one for extinct), but on a 0-5 scale where increasing values indicate greater endangerment. The LEI also uses fewer factors: speaker numbers, transmission, trends, and domains of usage. These factors are not weighted equally. ELCat considers transmission of a language to younger generations to be the most crucial factor in language endangerment (or vitality), and hence gives it twice the weight of each of the other factors.
| Factor | 5 (Critically endangered) | 4 (Severely endangered) | 3 (Endangered) | 2 (Threatened) | 1 (Vulnerable) | 0 (Safe) |
| Intergenerational transmission | Only speakers are of grandparent generation | Most grandparent generation but not younger people are speakers | Some adults and no children are speakers | Most adults but few or no children are speakers | Most adults, some children are speakers | All members of community including children speak the language |
| Absolute number of speakers | 1 – 9 | 10 – 99 | 100 – 999 | 1,000 – 9,999 | 10,000 – 99,999 | ≥100,000 |
| Speaker number trends | – Small percentage of community are speakers – Rapidly decreasing speaker numbers | – <50% of community are speakers – Accelerated decrease in numbers | – ~50% of community are speakers – Steadily but not accelerated decrease in numbers | – Majority of community are speakers – Gradually decreasing numbers | – Most community members are speakers – Very slowly decreasing numbers | – All community members are speakers – Stable or increasing numbers |
| Domains of use | Few specific domains e.g. songs, ceremonies, limited domestic activities | Mainly used at home / in family, may not be main language for many community members | Mainly used at home / in family, main language in many community members | Some non-official domains alongside other languages, main language in household use | Most domains, excluding official ones | Most domains, including official ones e.g. government, education, mass media |
To obtain a language’s level of endangerment, the scores assigned to each factor are summed, with intergenerational transmission given twice the weight. This sum is then divided by the maximum possible total score for the factors assessed, and converted to a percentage. The range in which this percentage falls is the level of endangerment assigned in the LEI.
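The calculation described above can be sketched in a few lines. This is an illustration under stated assumptions rather than ELCat’s own implementation: the function names and factor keys are mine, and the band boundaries between ‘vulnerable’ and ‘critically endangered’ are assumed to be equal 20-point ranges, which should be checked against Lee and van Way (2016). The scores in the example are for a made-up language.

```python
# Transmission counts double; the other factors count once.
WEIGHTS = {"transmission": 2, "speakers": 1, "trends": 1, "domains": 1}
MAX_GRADE = 5


def lei_score(grades):
    """Return the LEI percentage for the factors that were assessed.

    grades maps a factor name to its 0-5 score; unassessed factors
    are simply left out of the dictionary.
    """
    if not grades:
        raise ValueError("at least one factor must be assessed")
    for factor, grade in grades.items():
        if factor not in WEIGHTS:
            raise KeyError(f"unknown factor: {factor!r}")
        if not 0 <= grade <= MAX_GRADE:
            raise ValueError(f"score out of range for {factor!r}: {grade}")
    total = sum(WEIGHTS[f] * g for f, g in grades.items())
    possible = sum(WEIGHTS[f] * MAX_GRADE for f in grades)
    return 100 * total / possible


def lei_level(percentage):
    """Map a percentage to a level (band boundaries above 20% are assumed)."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percentage == 0:
        return "Safe"
    bands = [(20, "Vulnerable"), (40, "Threatened"), (60, "Endangered"),
             (80, "Severely endangered"), (100, "Critically endangered")]
    for upper, label in bands:
        if percentage <= upper:
            return label


# Hypothetical language: (3 * 2) + 2 + 3 + 2 = 13 of 25 possible points.
example = {"transmission": 3, "speakers": 2, "trends": 3, "domains": 2}
print(lei_score(example), lei_level(lei_score(example)))  # 52.0 Endangered
```

Because the denominator only counts the factors that were assessed, the same functions still work when some scores are missing.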
As not all of these factors need to be assessed, the LEI also reports a level of certainty, which reflects the proportion of factors assessed based on the currently available evidence. As with the level of endangerment, intergenerational transmission is given twice the weight of the other factors. For example, the Sentinelese language, for which there is little to no information other than population size estimates, would have a 20% certainty in its assigned level of endangerment: assessing that one factor accounts for a maximum of 5 of the 25 possible points.
An advantage of this index is that not all factors of endangerment need to be assessed. Some languages, like those spoken by uncontacted people groups, often lack information on some or most of these factors, so some of the previously discussed methods might not be as applicable to them. In the LEI, an overall level of endangerment can still be obtained when information is missing for one or more factors, though with a lower level of certainty.
Perhaps one limitation of the LEI is that a language needs to score 0% in its assessment to be considered ‘safe’. A language with a score of 1-20% is considered ‘vulnerable’, which could place languages with 10,000 to 99,999 speakers as ‘vulnerable’ even when they are used in government, education, and many other domains, with strong intergenerational transmission. Under this index, a language like Palauan, an official language of the Republic of Palau with around 17,000 to 18,000 native speakers, would be labelled ‘vulnerable’.
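The level of certainty can be sketched the same way. Again, this is a minimal illustration and not ELCat’s code: the function name and factor keys are assumptions of mine, while the weights and the 25-point maximum come from the description above.

```python
# Transmission counts double; the other factors count once.
WEIGHTS = {"transmission": 2, "speakers": 1, "trends": 1, "domains": 1}
MAX_GRADE = 5


def lei_certainty(assessed_factors):
    """Return the certainty percentage given the names of assessed factors."""
    assessed = set(assessed_factors)
    unknown = assessed - WEIGHTS.keys()
    if unknown:
        raise KeyError(f"unknown factors: {sorted(unknown)}")
    # Points that could have been awarded for the assessed factors ...
    possible = sum(WEIGHTS[f] for f in assessed) * MAX_GRADE
    # ... out of the 25 points available when all four are assessed.
    maximum = sum(WEIGHTS.values()) * MAX_GRADE
    return 100 * possible / maximum


print(lei_certainty(["speakers"]))       # 20.0, as in the Sentinelese example
print(lei_certainty(["transmission"]))   # 40.0, since transmission counts double
print(lei_certainty(WEIGHTS))            # 100.0 when all four factors are assessed
```

The double weighting means that knowing only the state of intergenerational transmission gives twice the certainty of knowing only the speaker count.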
The authors of the publication outlining the LEI framework acknowledged the relatively liberal definition of what constitutes an endangered language. Under their definitions, a language cannot be considered ‘safe’ unless it has a score of 0 for each factor and a 100% score for level of certainty. Nevertheless, the authors concluded that this cautious approach in assessing language vitality is suitable.
The LEI does not assign an ‘extinct’ label to languages. It uses a level called ‘dormant’ instead, out of respect for community members, learners, and potential learners of a language, and to encourage language revitalisation efforts. The authors argued that using the ‘extinct’ label could discourage language revitalisation, and would also become another source of trauma for people groups who are dealing with loss, suffering, and discrimination. It is true that language loss is an extremely sensitive subject, especially when languages are strongly intertwined with the cultures of the people groups that use them. As in the EGIDS, this ‘dormant’ category encompasses languages that are now a reminder of a people group’s heritage and cultural identity. However, the ELCat does not include languages that have not been spoken since the distant past, like Ancient Egyptian.
Building upon this, the LEI publication also addresses the case of languages that are being revitalised. The level assigned to these languages is ‘awakening’, defined more specifically as languages that were once considered ‘dormant’ but now have a targeted revitalisation movement in the community, facilitated and overseen by a coherent group of interested people and organisations, with the primary objective of creating new speakers of the language. Unlike the EGIDS though, the LEI does not appear to offer further detail on the upward trend of language shift that could reflect the current status of the revitalisation process.
This list of frameworks and scales used to assess language vitality and endangerment is not exhaustive; other frameworks and scales have been proposed, each with their own advantages and drawbacks. For instance, Michael Krauss, in the 2007 book Language Diversity Endangered, suggested a schema for this purpose, which interestingly assigns a letter grade to each category of viability, with a+ being ‘safe’ and e being ‘extinct’.
From these overviews, you might be tempted to draw a parallel with the IUCN Red List of Threatened Species. After all, this is a method meant to assess the risk of extinction for a given species, although most of the species this list pertains to are plants, vertebrates, molluscs, and arthropods. For reference, the levels of endangerment for biological species are:
- Least concern
- Near threatened
- Vulnerable
- Endangered
- Critically endangered
- Extinct in the wild
- Extinct
However, the criteria listed to assign IUCN categories consider different factors compared to those by UNESCO, for instance. Among these, the IUCN criteria include population size, range area (or habitat loss), and population declines, as well as a criterion on the probability of extinction in the wild within 20 years or 5 generations. These criteria cannot be applied to languages, as Amano et al. found in their 2014 study, when they applied the IUCN criteria to identify endangered languages.
For example, speaker population size alone is inadequate for assessing language vitality; as we have seen previously, the extent to which a language is transmitted to younger generations is considered a more important factor. Furthermore, language range maps are not really helpful in elucidating language vitality, as communities may be multilingual, while range maps are usually drawn as exclusive areas in which only one local or indigenous language is spoken at any given point. As such, while we may be tempted to draw parallels between how endangered languages and endangered species are identified, we must be aware that the factors which matter for language vitality differ from those which matter for species endangerment.
And so, this has been an introduction to the notable frameworks and scales used to assess language vitality and identify endangered languages. A commonality amongst most of these methods is the treatment of intergenerational transmission as a key factor influencing language vitality, and the application of criteria or decision trees to assign a language to a certain level of endangerment. As we have seen in Ethnologue’s modifications to its scale, and in the expansion of the GIDS, our understanding of the processes affecting language vitality is still developing. We could see further developments to currently used frameworks like the EGIDS and the LEI to better account for these processes, so that they continue to provide a valuable tool for the study of language vitality and endangerment.
Further reading
Amano, T., Sandel, B., Eager, H., Bulteau, E., Svenning, J. C., Dalsgaard, B., Rahbek, C., Davies, R. G. & Sutherland, W. J. (2014) ‘Global distribution and drivers of language extinction risk’, Proceedings of the Royal Society B: Biological Sciences, 281(1793), pp. 20141574.
Bromham, L. (2022) ‘Language endangerment: Using analytical methods from conservation biology to illuminate loss of linguistic diversity’, Cambridge Prisms: Extinction, 1(e3), pp. 1-11.
Fishman, J. A. (1991) ‘Reversing language shift’, Clevedon, UK, Multilingual Matters Ltd.
Krauss, M. (2007) ‘Chapter 1. Classification and Terminology for Degrees of Language Endangerment’, In: Brenzinger, M. ed. Language Diversity Endangered, Berlin, New York: De Gruyter Mouton, pp. 1-8.
Lee, N. H. & van Way, J. (2016) ‘Assessing levels of endangerment in the Catalogue of Endangered Languages (ELCat) using the Language Endangerment Index (LEI)’, Language in Society, 45, pp. 271-292.
Lewis, M. P. & Simons, G. F. (2010) ‘Assessing endangerment: Expanding Fishman’s GIDS’, Revue roumaine de linguistique, 55(2), pp. 103-120.
Moseley, C. (ed.) (2010) ‘Atlas of the world’s languages in danger’, Paris: UNESCO Publishing.
UNESCO ad hoc expert group on endangered languages (Brenzinger, M., Dwyer, A. M., de Graaf, T., Grinevald, C., Krauss, M., Miyaoka, O., Ostler, N., Sakiyama, O., Villalón, M. E., Yamamoto, A. Y., Zepeda, O.) (2003) ‘Language vitality and endangerment’, Document submitted to the International Expert Meeting on UNESCO Programme Safeguarding of Endangered Languages, Paris.