Buddhism, HTML and diacritics

If you want to impress your friends (or your blog readers…*ahem*) when you talk about Buddhism, why not use some HTML diacritics?

You see, most of the Buddhist terms you read about derive from one or more non-European languages:

  • Sanskrit: the sacred language of Hinduism and of much Indian religious literature. Now a dead language.
  • Pali: an ancient language of India, best known as the language of the Theravada Buddhist scriptures. It may also have served as a lingua franca. Also a dead language.
  • Classical Chinese: this is how Chinese was in the olden days. There are more Buddhist texts preserved in Classical Chinese than any other language.
  • Japanese: actually, most Japanese Buddhist terms are really just Classical Chinese with Japanese pronunciations, as was the style back then.

None of these languages natively use a Romanized script like Western European languages do, so it’s up to translators to figure out how to Romanize things. To capture all the sounds that don’t exist in English, linguistics experts recycle Roman letters but add extra marks to them: diacritics.

Until fairly recently, it was pretty difficult to print non-standard Roman characters on a webpage. Back then, users had to download special fonts, and their browsers had to be able to read them. Now though, as the Internet becomes more international, you can print pretty much any Romanized character you want using numeric character codes in HTML. These are often called “extended-ASCII” codes, but the numbers are really Unicode code points.

For example, let’s say I want to print an ā character. In the old days, I could use a Character Palette program on Windows or Mac to copy/paste it (if I could find it), but now I can just use the HTML numeric code &#257;. Written as one word with no spaces, that is an ampersand, a pound sign (#), the character’s code number and a semicolon. Put these together and the web browser will automatically translate the code into the letter you want.
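If you are wondering where the number 257 comes from, here is a minimal Python sketch. Python is just my choice for illustration, and the helper name to_reference is made up. The number is the character's Unicode code point, which ord() returns, and html.unescape() from the standard library decodes the code back into a letter, roughly the way a browser does.

```python
import html


def to_reference(char: str) -> str:
    """Build the decimal HTML code for one character (hypothetical helper)."""
    # ord() returns the character's Unicode code point as an integer.
    return f"&#{ord(char)};"


reference = to_reference("\u0101")   # "\u0101" is the letter ā
print(reference)                     # the HTML code for ā
print(html.unescape(reference))      # the code decoded back into the letter
```

The escape "\u0101" is used instead of typing ā directly only so the example does not depend on how your editor stores the accented letter.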

All of these character codes in HTML have the same format:

&#NUMBER;

So, the trick is just remembering what number you want and filling in the blank. Remember that you have to do this for each special letter you want to print.
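Doing this by hand for every special letter gets tedious, so here is a rough sketch of how it could be automated. It assumes Python, and encode_diacritics is a name I made up for this example. It walks through a string, leaves plain ASCII alone, and swaps every other character for its numeric code.

```python
import unicodedata


def encode_diacritics(text: str) -> str:
    """Replace each non-ASCII character with a decimal HTML code."""
    # Normalize first so a letter typed as "a" plus a separate combining
    # macron becomes the single character ā before we look up its number.
    text = unicodedata.normalize("NFC", text)
    pieces = []
    for char in text:
        if ord(char) < 128:
            pieces.append(char)                # plain ASCII stays as typed
        else:
            pieces.append(f"&#{ord(char)};")   # one code per special letter
    return "".join(pieces)


print(encode_diacritics("nirv\u0101\u1E47a"))  # nirvāṇa with both letters encoded
```

Python can also do the basic replacement on its own with text.encode("ascii", "xmlcharrefreplace").decode("ascii"); the loop above just makes each step visible.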

Here’s a helpful chart for some commonly used diacritics and letters for Buddhist terms. Most are for Pali/Sanskrit, but for Japanese, the long vowel sounds are used too (ā, ī, ō, ū):

  • á – 225, the a with an acute mark
  • é – 233, the e with an acute mark
  • ñ – 241, the n with a tilde over it
  • ú – 250, the u with an acute mark
  • ā – 257, the long “ah” sound
  • ī – 299, the long “ee” sound
  • ō – 333, the long “oh” sound
  • ś – 347 (346 for upper case), the s with an acute mark. In practice, this is pronounced much like ṣ, but it is written differently in Sanskrit.
  • ū – 363, the long “oo” sound
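  • ḍ – 7693, a “d” sound in Sanskrit, made with the tongue curled back
  • ḥ – 7717, a breathy “h” at the end of a syllable
  • ḷ – 7735, an “l” sound made with the tongue curled back
  • ṁ – 7745, a nasal sound, and an alternate way of writing ṃ
  • ṃ – 7747, the “ng” sound
  • ṅ – 7749, another “ng” sound, as in “sing”
  • ṇ – 7751, an “n” sound made with the tongue curled back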
  • ṛ – 7771, the vowel-like “r” in Sanskrit, often pronounced “ri”
  • ṣ – 7779 (7778 for upper case), an “sh” sound made with the tongue curled back
  • ṭ – 7789, a “t” sound made with the tongue curled back
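If you would like to double-check the numbers in the chart above, or look up a letter that is not on it, something like the following Python sketch should work. CHART and check_chart are names I made up for this example. It compares each number against the character's Unicode code point and prints the official Unicode name next to the HTML code.

```python
import unicodedata

# Lower-case letters and code numbers copied from the chart above.
CHART = {
    "á": 225, "é": 233, "ñ": 241, "ú": 250,
    "ā": 257, "ī": 299, "ō": 333, "ś": 347, "ū": 363,
    "ḍ": 7693, "ḥ": 7717, "ḷ": 7735, "ṁ": 7745, "ṃ": 7747,
    "ṅ": 7749, "ṇ": 7751, "ṛ": 7771, "ṣ": 7779, "ṭ": 7789,
}


def check_chart(chart: dict) -> list:
    """Return the (letter, number) pairs that do not match Unicode."""
    mismatches = []
    for char, number in chart.items():
        # Normalize so each letter is a single precomposed character.
        composed = unicodedata.normalize("NFC", char)
        if len(composed) != 1 or ord(composed) != number:
            mismatches.append((char, number))
    return mismatches


for char, number in CHART.items():
    composed = unicodedata.normalize("NFC", char)
    print(f"{composed}  &#{number};  {unicodedata.name(composed)}")

print("Mismatches:", check_chart(CHART))
```

To find the number for a letter that is not listed, ord() on the letter gives the value to put between the &# and the semicolon.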

Try it out on your webpages and see if it works well for you. After a few times, it gets much easier to accurately represent Buddhist terms in English. Good luck and happy blogging!

Published by Doug

