Which Are the Longest Thai Words?


mni

In my search I could not find any words containing more than 5 symbols (e.g. สัตว์, ปล่อย, พาณิช, เมล็ด), unless the word is a combination of several words.

Are there any?

Thanks


Depends on how you define "word". There is no obvious answer.

There are many "words" that correspond to one English word, but several Thai words, such as ความน่าจะเป็น "probability". One Thai word or four?

One could make the case that the longest Thai words are compounds from Sanskrit or Pali, especially those which have some change in pronunciation or spelling that makes them slightly different from the sum of their parts. ประชาธิปไตย "democracy" is one, with 11 characters.

Ultimately, it comes down to the fact that Thai is (historically) an isolating language, which means the majority of morphemes -- meaningful units in the language -- are their own "words", and not just prefixes and suffixes that attach to other words (like, say, plural -s in English, or past tense -ed).

The influence of foreign languages has caused the introduction of many more polysyllabic words and bound prefixes and suffixes that are used to coin technical terminology, etc.

English is also a relatively isolating language. By comparison, so-called synthetic languages create single phonological "words" that would correspond to whole English phrases and clauses. This is the reason behind the commonly believed myth that "Eskimo" languages have many, many words for snow (the number tends to be picked out of a hat--17, 50, whatever). In fact, they don't; it's just that in those languages "freshly fallen snow" and "somewhat slushy snow" and "nearly melted snow" would be single "words", but built from the same root word for snow.

Edited by Rikker

Rikker

Thank you for your explanation.

I can only guess that ประชาธิปไตย can be divided into ประ ชาธิ ปไตย.

I have two questions:

If we exclude Sanskrit or Pali words, can you think of any Thai word longer than 5 symbols?

Is there a place where I can find all words of Sanskrit or Pali origin?

What I am trying to build is a small application that splits Thai sentences into single core words. Example:

เขากำลังพูดโทรศัพท์อยู่ >>> เขา กำลัง พูด โทรศัพท์ อยู่

To do this, I need to know the longest Thai word (including symbols) and, as I can see now, some longer Sanskrit or Pali words.
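For illustration, here is a minimal sketch of the dictionary-based "longest matching" idea this kind of splitter usually relies on. The tiny lexicon and the function name are assumptions made up for the example, not part of any existing tool; the point is only that the search window can be derived from the word list itself, so no fixed limit such as 5 has to be guessed in advance.

```python
# Hypothetical mini-lexicon; a real tool would load a full Thai word list.
LEXICON = {"เขา", "กำลัง", "พูด", "โทรศัพท์", "อยู่"}

# Derive the search window from the lexicon instead of hard-coding a limit.
MAX_LEN = max(len(word) for word in LEXICON)


def segment_longest_match(text, lexicon=LEXICON, max_len=MAX_LEN):
    """Greedy left-to-right longest matching.

    Characters that start no known word are emitted one at a time.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a lexicon word fits.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in lexicon:
                break
        else:
            candidate = text[i]  # no match at all: pass one character through
        words.append(candidate)
        i += len(candidate)
    return words


print(" ".join(segment_longest_match("เขากำลังพูดโทรศัพท์อยู่")))
# → เขา กำลัง พูด โทรศัพท์ อยู่
```

With this lexicon the window is 8 symbols, because โทรศัพท์ is 8 code points long. The result is only as good as the word list, so long Sanskrit or Pali compounds have to be listed as entries to come out in one piece.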

Thanks for your help


ประชาธิปไตย is a compound of ประชา "public" + อธิปไตย "sovereignty". You can't split the word up without knowing its constituent words and the Indic compounding rules that are used to create it. Which is why it's easy to argue that it's one long word, while ความน่าจะเป็น seems more like several.

Thai words longer than five symbols, if we include tone marks, would include things like เปลือย (6) or เปลื้อง, เกลี้ยง (7).
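As a rough check of those counts, Python's `len()` on a string counts Unicode code points, which matches counting "symbols" with vowels and tone marks each taken separately. The helper name below is hypothetical, and treating only the four tone marks (U+0E48 to U+0E4B) as skippable is an assumption made for this sketch.

```python
# Thai tone marks: mai ek, mai tho, mai tri, mai chattawa (U+0E48..U+0E4B).
TONE_MARKS = {"\u0e48", "\u0e49", "\u0e4a", "\u0e4b"}


def symbol_count(word, include_tone_marks=True):
    """Count Unicode code points in a word, optionally skipping tone marks."""
    if include_tone_marks:
        return len(word)
    return sum(1 for ch in word if ch not in TONE_MARKS)


for word in ["เปลือย", "เปลื้อง", "เกลี้ยง", "ประชาธิปไตย"]:
    print(word, symbol_count(word), symbol_count(word, include_tone_marks=False))
# เปลือย 6 6
# เปลื้อง 7 6
# เกลี้ยง 7 6
# ประชาธิปไตย 11 11
```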

I don't know of any simple way to show "all" Sanskrit/Pali words in Thai, except that I could pull out a list from files I have and send it to you.

Tools to segment a Thai text into component "words" include Mike's site, thai2english, Glenn Slayden's bulk lookup on thai-language.com, and a standalone program for this created by Dr. Wirote Aroonmanakun of Chulalongkorn University.

Alternately, you can do this on a new site on SEAlang, the Reader's Helper. It's part of a set of new experimental tools to assist in self-guided improvement of Thai reading, writing, and vocab skills. (Disclosure: I work for SEAlang, and was involved in preparing the underlying data for the project.)

If you go to Reader's Helper, you can use the upload box in the upper right frame to upload an arbitrary text file (or use an internet URL, or any of the provided texts).

After choosing a text, clicking the "R" button to the left will load it into the Reader's Helper. Then use the SEGMENTATION control on the left to segment into "words", "compounds", or "phrases".

The programming on the site is still rough, so expect bugs, but it's meant as a demonstration of ideas for self-study. There are lots of features, but I'll avoid sounding like I'm trying to "sell" the site (it's a free site developed with U.S. tax dollars, though). Frankly, I think it's not-quite-ready-for-prime-time, but many of the features work. :o

Edited by Rikker

Rikker

Thanks again for the info.

My project is small and handy: just input a sentence and get it split. It is not perfect yet because I built it on a maximum of 5 symbols (including tones), and I still have some problems to solve (like words that include การ >>> พิการ).

If you remember any more long words (5+), I would appreciate it if you could post them here.

My project will be free but it will take some time to complete.

Thanks again

mni


Perhaps not a word?

กรุงเทพมหานครอมรรัตนโกสินทร์มหินทรายุธยามหาดิลกภพนพรัตน์ราชธานีบุรีรมย์อุดมราชนิเวศน์มหาสถานอมรพิมานอวตารสถิตสักกะทัตติยะวิษณุกรรมประสิทธิ์

:o

Patrick


This compound word will be divided into:

กรุง เทพ มหา นคร อม รัตนโกสินทร์ ยา มหา ดิลก ภพ นพรัตน์ ราช ธานี บุรี รมย์ อุดม สถาน พิมาน อวตาร สัก กะ ติ ยะ วิษณุ กรรม ประ สิทธิ์


... and at least one of those pieces runs to quite a few more than 5 characters ...


Yes, I know it is not perfect and probably never will be, but I will try to make it as good as I can. In many cases it is difficult to determine whether a letter belongs to the previous or the next word.
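One common way to handle that ambiguity is to look at the whole string and prefer the split with the fewest pieces, rather than committing to the longest word at each step. The sketch below uses dynamic programming for this; the function name and the six-entry lexicon are assumptions chosen only to show the effect. In ไปหามเหสี the letter ม could close หาม or open มเหสี, and a greedy longest-match pass would take หาม first and end up with ไป หาม เห สี.

```python
def segment_fewest_words(text, lexicon):
    """Return the split of text into lexicon words that uses the fewest pieces.

    Returns None when the text cannot be covered by lexicon words alone.
    """
    n = len(text)
    max_len = max(len(word) for word in lexicon)
    # best[i] = (piece count, start index of the last piece) for text[:i].
    best = [None] * (n + 1)
    best[0] = (0, 0)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            if best[start] is not None and text[start:end] in lexicon:
                cost = best[start][0] + 1
                if best[end] is None or cost < best[end][0]:
                    best[end] = (cost, start)
    if best[n] is None:
        return None
    # Walk the stored start indices backwards to recover the pieces.
    pieces = []
    end = n
    while end > 0:
        start = best[end][1]
        pieces.append(text[start:end])
        end = start
    return pieces[::-1]


# Hypothetical lexicon, just large enough to make the ambiguity visible.
lexicon = {"ไป", "หา", "หาม", "มเหสี", "เห", "สี"}
print(" ".join(segment_fewest_words("ไปหามเหสี", lexicon)))
# → ไป หา มเหสี
```

Fewest pieces is only a heuristic. When two splits have the same number of pieces (ตากลม as ตา กลม or ตาก ลม, for example), something else, such as word frequency or context, has to break the tie.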


Hmmm. The name of King Rama I is much longer than Bangkok's name in Thai.

----

How about รัฐประศาสนศาสตร์ (Public Administration) and รัตนโกสินทรศก, Ra-ta-na-ko-sin-sok (Ratanakosin Era)?

And some other words for royalty, such as พระบรมฉายาลักษณ์ and พระมหากรุณาธิคุณ.

Edited by thithi
