Languages of Taiwan — Introduction to the Formosan Languages

Taiwan, officially the Republic of China (ROC) and historically known as Formosa, is often associated linguistically with Mandarin Chinese, Hokkien and Hakka today. This is understandable, given that about 95% of Taiwan’s population is Han Chinese. However, Taiwan is also known for something quite different: it is arguably the homeland of the Austronesian languages, a language family spoken from Madagascar to Hawai’i. In fact, it has been suggested that the name of one of the indigenous groups, the Taivoan, is the root of the island’s common name, Taiwan.

Evidence from historical linguistics, population genetics and the study of human migration patterns traces many of the world’s Austronesian languages, from Te Reo Maori to Malagasy, from Tokelauan to Bahasa Melayu, back to this very island. Being great seafarers, the early Austronesians spread to the Philippines, Indonesia and the Malay Peninsula, and later to Micronesia, Melanesia and Polynesia. Interestingly, the Austronesian settlement of Madagascar is thought to have preceded the arrival of Bantu-speaking migrants on the island, even though Madagascar is far closer to Africa than it is to Indonesia.

We have often heard about widely spoken Austronesian languages like Bahasa Indonesia, Bahasa Melayu, Tagalog and Cebuano, but not all of us are aware of how closely related these languages are, let alone of their widely agreed-upon geographic origin. The Formosan languages, a branch of the Austronesian language family, are spoken by possibly fewer than 200 000 people today, almost all of them on the island of Taiwan. The branch comprises at least 26 languages, of which more than 10 are extinct and another four are classified as moribund. The remaining languages are not safe from extinction either, as they are all considered endangered to varying degrees. This leaves this branch of the Austronesian language family in relatively dire straits, creating the need for revitalisation efforts and programmes to preserve Taiwan’s linguistic diversity.

The distribution of the Formosan languages (and Tao) in Taiwan

However, it is important to note that languages like these can drastically blur the line between what constitutes a dialect and what constitutes a language. The putative figure of 26 rests on loosely treating each dialect continuum as a single language. If our definitions are adjusted, we might see this number change, as languages like Atayal have a diverse range of dialects which may not always be mutually intelligible. Linguists have also pointed out other challenges in determining the diversity of the Formosan languages, as we cannot be certain about extinct or assimilated indigenous Formosan peoples. So, is the number of Formosan languages that exist today 26? We cannot be truly certain.

While we are familiar with the small sound inventories of Maori, Hawaiian and Samoan, the number of sounds in the Formosan languages is significantly greater. For example, Tanan Rukai has the largest number of phonemes in the Formosan branch, with 23 consonants and 4 vowels which are further differentiated by vowel length, while Kanakanabu and Saaroa have the smallest inventories, with 13 consonants and 4 vowels. Understanding the sounds of the Formosan languages can help historical linguists reconstruct what the ancestral language, termed Proto-Austronesian, could have sounded like thousands of years ago. Atayal, Thao, Saisiyat and Pazih are all examples of Formosan languages, specifically from the Northwest Formosan group, that have helped build the reconstruction of Proto-Austronesian phonemes. Other analyses, such as the study of lenition patterns (how consonant sounds have weakened over time in these languages), also help reconstruct possible sound changes since Proto-Austronesian.

The word order of the Formosan languages is largely conserved, with most of them being verb-initial, where the verb forms the first element in a clause. About half of the Formosan languages place the subject second (verb-subject-object), a word order similar to that of Welsh and other Celtic languages, while the other half place the object second (verb-object-subject), the word order found in Malagasy. Languages like Saisiyat, Thao and Pazih may have been influenced by Mandarin Chinese, as some of their clauses follow a subject-verb-object order, as used in languages like Mandarin Chinese and English.

To wrap it all up, the Formosan languages are indeed a mysterious branch of the Austronesian language family. With debate still going on about how many Formosan languages there are, and about how to revive extinct languages or revitalise moribund ones, there is certainly a lot of room for discovery and research regarding these languages. In due time, we will take a dive into more aspects of the Formosan branch, sharing the special features of languages like Atayal, Seediq or Paiwan.


This was intended to be a short post to introduce what the Formosan languages are. We have commonly heard of Malay, Indonesian, Minangkabau, Maori and Hawaiian all being grouped under the Austronesian language family, but how many of us realise that there are Austronesian languages spoken as far north as Taiwan? Writing about this has definitely made me realise the linguistic diversity of Taiwan, a region so often associated with Mandarin-, Hokkien- or Hakka-speaking people. I hope to write more about some of these languages in future posts, so stay tuned.
