Sinhala as a living language in the Digital Age

Harsha Wijayawardhana B.Sc. in Biochemistry (Miami), CITP (UK), FBCS (UK)

Chairman, Local Language Working Group (LLWG) in collaboration with LK Domain Registry.


On the 21st of February, the world marked International Mother Language Day, and on the 2nd of March, Sri Lanka celebrated Sinhala Language Day. UNESCO first proclaimed International Mother Language Day on the 17th of November, 1999, and the United Nations (UN) General Assembly formally recognized it in 2002 through resolution 56/262, to promote multilingualism and language diversity in the world. Sri Lanka celebrated International Mother Language Day in 2022 with the attendance of the Prime Minister, Hon. Mahinda Rajapaksa. Sri Lanka has several mother languages, two of which are official languages: Sinhala and Tamil. This article delves into the impact of the digitization of Sinhala on its long-term survival as a living language.

The Origins and the present status of Sinhala

The island of Sri Lanka has twenty-one million people, of whom seventeen million speak Sinhala. Sinhala belongs to the Indo-European family of languages and, within it, to the Indo-Aryan sub-family, to which Urdu and Hindi also belong; Persian is a close relative in the neighboring Iranian branch. Although UNESCO does not consider Sinhala an endangered language, some scholars believe that Sinhala will not last long as a living language. Sinhala has a colorful history spanning more than two thousand five hundred years. According to Sri Lankan history, the language of Vijaya, the founder of the Sinhala people, was the ancestral language of Sinhala. Vijaya spoke a language akin to Magadhi, the Prakrit language of the Buddha. On the scanty evidence presently available, scholars identify the Prakrit language that gave birth to Sinhala as Elu. Among scholars, there are two schools of thought on where Elu was originally spoken. One school claims that it was spoken in North India before it came to Sri Lanka with the Arya settlers. The other claims that Elu originated in ancient Sri Lanka and was primarily spoken on the island. The Pali scholar Dr. Rhys Davids called Elu “the Prakrit of Ceylon”. Archaeologists, historians, and linguists broadly agree that Elu later evolved into Sinhala between the fourth and sixth centuries.

Sri Lanka also boasts a long history of human habitation, dating back to the time Homo sapiens ventured out of Africa nearly a hundred thousand years ago. Those original settlers on the island spoke a language that has been lost in time. However, linguists surmise that remnants of the language spoken by those people are preserved to this day in modern Sinhala in words such as “Kata” (mouth), “Bada” (stomach), “Oluva” (head), and “Aliya” (elephant). Vedeha Thera, the author of Sidat Sangarawa, claims in his celebrated work on Sinhala grammar that, since these words originated neither from Sanskrit nor from Tamil, they lack a recognized source or origin (Nipan). Sidat Sangarawa, written in the 13th century, provides the first documented Sinhala alphabet, with thirty letters. Most scholars accept that the language of the original settlers mixed with Elu to create Proto-Sinhala. Elu or Sihala, the ancestral language of Sinhala, also has the rare distinction of being the language into which the commentaries (Atuwa) on the Theravada scriptures (Thripitakaya) were translated. During the Valagamba period in the first century BC, Sri Lankan monks carried out the unprecedented task of committing to Ola (palm leaf) all the Buddhist scriptures memorized by erudite monks. These scriptures included the Sihala Atuwa, which Arahat Mahinda translated from Magadhi commentaries. The recording of the scriptures on Ola can be considered a monumental event in Buddhist history. Because of it, Sri Lanka became the primary center for Buddhist learning and teaching. Sri Lankan monks became highly influential in developing writing scripts and in teaching Theravada Buddhism in Southeast Asian countries. The Indian scholar-monk Buddhagosha Thera translated those Sihala Atuwa back into Magadhi, which became the ecclesiastical language of the Buddhists and came to be known as Pali. Sinhala monks contributed heavily to the development of Pali by codifying its grammar in Sri Lanka, and Pali became the lingua franca of the Theravada Buddhist countries.

The Language of Vedda people in Sri Lanka

Anthropologists classify the Veddas as aborigines of Sri Lanka. According to Sri Lankan history, the Veddas are descendants of the two children of King Vijaya, the North Indian prince, and the local princess Quveni. Some groups of Veddas still live as hunter-gatherers in the remote parts of the island. Many scholars who have studied their language conclude that it is a creole version of Sinhala. It is most probable that the Veddas descend from the original settlers who arrived in Sri Lanka many thousands of years ago and are living descendants of Balangoda Man, who lived thirty-five thousand years ago. The language of the Veddas, the creole version of Sinhala, is classified by UNESCO as an endangered language in Asia.

Sinhala Script

Sinhala has arguably one of the prettiest scripts in the world. Sinhala letters are round-shaped. In the world classification of scripts, Sinhala and the other scripts found in the Indian subcontinent are grouped under Abugida or Alphasyllabary scripts. The Ethiopian or Geez script comes under the same Abugida group, though it belongs to a non-Indic language family. Khmer, Myanmar, Lao, and Thai also fall into the Abugida script family and are thought to have been created originally by Indian Hindu priests. Sinhala script, like other Indic scripts, evolved from Brahmi script. Devanagari script developed from Northern Brahmi, while Sinhala and the South Indic scripts such as Tamil, Malayalam, Kannada, and Telugu developed from Southern Brahmi. South Indian Pallava Grantha later had an immense influence on the shapes of Sinhala and of Southeast Asian scripts such as Khmer, Myanmar, and Thai. Between the 4th and 6th centuries, Sinhala script began taking a cursive shape due to the adoption of Ola as the writing medium. Sinhala script differs from other Indic scripts by having two vowel symbols, ඇ (æ) and ඈ (ǣ), and Sanjaka letters, or pre-nasalized consonants.
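The Abugida principle described above, in which a consonant letter carries an inherent vowel that a dependent sign changes or removes, can be illustrated with the Unicode code points for Sinhala. The following Python sketch is only an illustration: the variable names and the helper `describe` are hypothetical, and the sample letters were chosen for this example rather than taken from any source cited in the article.

```python
import unicodedata

# Sample Sinhala characters, written as escapes so the code points are explicit.
KA = "\u0d9a"            # ක  consonant with inherent vowel: "ka"
VOWEL_SIGN_I = "\u0dd2"  # ි  dependent vowel sign "i"
AL_LAKUNA = "\u0dca"     # ්  sign that suppresses the inherent vowel
AE_SHORT = "\u0d87"      # ඇ  vowel letter (æ)
AE_LONG = "\u0d88"       # ඈ  vowel letter (ǣ)
SANYAKA_GA = "\u0d9f"    # ඟ  a pre-nasalized consonant

# One consonant letter yields several syllables by adding combining signs.
syllables = {
    "ka": KA,                 # bare consonant, inherent "a"
    "ki": KA + VOWEL_SIGN_I,  # vowel sign replaces the inherent vowel
    "k": KA + AL_LAKUNA,      # al-lakuna removes the vowel altogether
}

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in text]

if __name__ == "__main__":
    for label, syllable in syllables.items():
        print(label, describe(syllable))
    for letter in (AE_SHORT, AE_LONG, SANYAKA_GA):
        print(describe(letter))
```

Each written syllable is therefore a short sequence of code points rather than a single character, which is why rendering Sinhala correctly depends on font and operating system support as well as on the encoding itself.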

Brahmi found in Sri Lanka is largely the same as Indian Ashoka Brahmi, with slight differences: for example, Sri Lanka has seventeen rock inscriptions written from right to left. Most archaeologists have claimed that Arahat Mahinda was responsible for introducing Ashoka Brahmi to Sri Lanka. According to those scholars, Ashoka Brahmi arrived in Sri Lanka in the 3rd century BC. The discovery of two pottery sherds significantly challenged the theory that Ashoka Brahmi came with the introduction of Buddhism in the 3rd century BC. The pottery sherd discovered in Anuradhapura by Dr. Siran Deraniyagala was found to be older than the 3rd century BC: carbon dating placed it much earlier, between 600 and 500 BC. Because of these significant findings, Sri Lanka has the oldest Brahmi yet discovered, much earlier than the South Indian discoveries. Further, Sri Lankan Brahmi had been used to inscribe Indo-European Prakrit words. The letters inscribed on the second pottery sherd, which came to be known as the Thissamaharama pottery sherd, were identified as Tamil Brahmi by South Indian experts such as Mahadevan. Recently, Prof. Raj Somadeva disputed the South Indian experts, claiming that those letters indicate an Indo-European word.

Sinhala Digitization and Sinhala on the Internet

Presently, users of digital devices can input Sinhala without much hassle. Sri Lankans use Sinhala for writing blogs, Facebook posts, and so on. All operating systems on digital devices support Sinhala Unicode at present. But it was not the same a few years back. Before Sinhala Unicode came into use in the mid-2000s, users typed Sinhala with proprietary fonts, and Tamil and Sinhala could not exist in a single document. To overcome these difficulties, in the early 1980s, CINTEC, the then apex government body for Information and Communication Technology (ICT), standardized Sinhala input by devising an 8-bit encoding for Sinhala and Tamil, SLASCII, similar to the Indian ISCII. During this standardization, CINTEC had to fix the number of letters in the Sinhala alphabet, since there had been several disputes among Sinhala experts on how many and which letters the alphabet should contain. Finally, the Sinhala experts agreed upon sixty letters, which included the letter for the ‘Fa’ sound.

CINTEC took the lead in encoding Sinhala in Unicode, which assigns every letter in all known scripts in the world a unique code (originally designed as a two-byte code). Unicode made it possible to have characters from different languages in a single document. The late Prof. V. K. Samaranayake, as the chairman of CINTEC, spearheaded the Unicode encoding in ISO/IEC 10646, while Prof. J. B. Dissanayake and Dr. S. T. Nandasara served as the technical experts. Present at the meeting of ISO Working Group 2 held on the island of Crete, Greece, the two technical experts convinced the Unicode Consortium in 1996 to accept the Sri Lankan Government's proposal for the encoding of Sinhala in Unicode Version 3.0.
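The point that Unicode lets several scripts share one document can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the sample words, the constant names, and the helper `scripts_in` are hypothetical choices made for this example and are not part of any Sri Lankan standard. It also shows that the "two-byte code" corresponds to the UTF-16 encoding form, whereas the same Sinhala letters occupy three bytes each in UTF-8.

```python
import unicodedata

# Sample words, written as escapes so the code points are explicit.
SINHALA_WORD = "\u0dc3\u0dd2\u0d82\u0dc4\u0dbd"  # සිංහල ("Sinhala")
TAMIL_WORD = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"    # தமிழ் ("Tamil")

# One string holding three scripts at once, impossible with a single 8-bit font encoding.
document = f"{SINHALA_WORD} / {TAMIL_WORD} / English"

def scripts_in(text):
    """Collect the script word that starts each letter's or mark's Unicode name."""
    found = set()
    for ch in text:
        if ch.isalpha() or unicodedata.category(ch).startswith("M"):
            found.add(unicodedata.name(ch).split()[0])
    return found

# Size of the five-character Sinhala word in two common encoding forms.
utf16_bytes = len(SINHALA_WORD.encode("utf-16-le"))  # 2 bytes per character
utf8_bytes = len(SINHALA_WORD.encode("utf-8"))       # 3 bytes per character

if __name__ == "__main__":
    print(sorted(scripts_in(document)))  # ['LATIN', 'SINHALA', 'TAMIL']
    print(utf16_bytes, utf8_bytes)       # 10 15
```

Because every character carries its own code point, the mixed string survives encoding and decoding unchanged, with no need to switch fonts to tell Sinhala from Tamil.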

Although Sinhala had been encoded, the implementation of Sinhala Unicode took an unprecedented ten years. In the absence of a Sinhala Unicode implementation, some Sri Lankan professionals advocated dropping the Sinhala script and adopting the Latin script to write Sinhala. Those who suggested adopting the Latin script justified dropping the traditional Sinhala script on the grounds of ease of computerization. Prof. Gihan Dias, Dr. Ruvan Weerasinghe, and Mr. Harsha Wijayawardhana led the implementation of Sinhala Unicode. In the early 2000s, these three, who represented two universities in Sri Lanka, the University of Colombo and the University of Moratuwa, worked with several others to standardize the second version of SLS 1134 and, later, the third revision of SLS 1134, which added the encoding of Sinhala numerals in Unicode.


Although Sinhala is not yet classified as an endangered language, it could become extinct within the century if people become reluctant to use it for day-to-day communication. Digitally, however, Sinhala has become a very viable language. Therefore, the danger of Sinhala disappearing from the world as a living script and language is minimal. The latest observation is that Sinhala is widely used on social media, and a new genre of literature and a new writing style seem to be emerging through the new digital media. The government must encourage new translation tools from Sinhala to Tamil and English and vice versa, using Speech to Text and Text to Speech. If machine translation tools were available for English to Sinhala, students who are not well versed in English could learn using the content available on the Internet.

It is encouraging that the government has taken the necessary steps to celebrate International Mother Language Day and Sinhala Language Day. It must also allocate sufficient funding to develop Sinhala and Tamil content on the Internet, while making available software tools to carry out machine translation from English, German, and Chinese to Sinhala and vice versa.