Notes on Viewing and Creating Devanagari Documents with Unicode Support

Quick Start

To view Devanagari documents on this Sanskrit Learning site choose the Unicode option for viewing documents in your browser via View->Encoding->Unicode(UTF-8). This will use the default Unicode fonts.

For the best display, please install the Sanskrit 2003 font. To create Devanagari documents, install the program Itranslator 2003. You can also download other Unicode fonts which support the Devanagari script. The fonts that should work are: Sanskrit 2003, Sanskrit 99, Chandas, Uttara, Code2000.

The fonts and programs used to create the documents on this Sanskrit Learning site can be obtained from:
Swami Satchidananda, Omkarananda Ashram Himalayas http://www.omkarananda-ashram.org

Unless you wish to do something very fancy, use Itranslator2003 and forget about everything else. You can create Devanagari text with Itranslator2003, which supports Unicode, and then for fancy formatting copy-and-paste the text into any other application of your choice. This is the easiest way. I know because I have spent a lot of time trying various options, and I have finally come to the conclusion that Itranslator2003 is the best. Itranslator also has a great manual prepared by Ulrich Stiehl.

You will need to enable "International Language Support" in Windows systems. Please see the details in the Itranslator 2003 help file.

If you wish to create documents in other Indian scripts then please read on and maybe try Baraha or Microsoft Indic IME with Word. The rest of this document lists other software to view and create Devanagari documents. Please remember that http://www.sanskritdocuments.org is a rich source of information on Sanskrit-related things.

Viewing Documents in Devanagari

All Windows XP machines come with MS Arial font which has support for Unicode. So even if you don't have a Sanskrit or Devanagari specific font, you should be able to view Devanagari characters. If you cannot, please make sure that you have enabled Unicode via View -> Encoding -> Unicode selection. Basically for XP machines you should be able to see Devanagari characters without much difficulty. To best see the pages on this site please install Sanskrit 2003 font. Please note any Unicode font with Devanagari support will do; it's not necessary (but only recommended) to have Sanskrit 2003 font to see the Devanagari text in the pages on this site.

I think so far Macintosh doesn't come with a default font which supports Devanagari Unicode but as Alan Wood writes, "Mac OS X 10 can use fonts intended for Windows, and comes with an increasing range of Mac Unicode fonts that allow a variety of scripts to be edited and displayed."

Please let me know how the pages on this site are seen in other browsers and platforms.

Avinash Chopde (avinash@acm.org) writes:
Browsers have a setting that says which font to use for which encoding. Firefox has ability to have different fonts for each language, in each encoding. Go to Tools -> Options -> Content, click on Advanced, and in there, for example, you can choose the encoding (Unicode in this case), and then the language ("Devanagari"), and select the correct Unicode font you want.

In Internet Explorer, you can use Tools -> Internet Options, General tab, click on Font, and then select Language Script and font as needed.

These techniques apply only when no font is defined in the web page - if the web page specifies "Arial Unicode MS", then that setting will always (most of the time!) be used...
Also see the section HTML Unicode (UTF-8) Encoding at http://www.aczoom.com/itrans/online/

Creating Documents in Devanagari

Windows XP machines come with the MS Arial and Mangal fonts, which support Devanagari. If someone sends you a document with Devanagari text in UTF-8 encoding, it should be displayed correctly by Outlook Express, Notepad, MS Word, or for that matter most Microsoft programs.

Creating a document in Devanagari needs some software support. Fortunately, good programs are available to help you:

  1. Itranslator2003 - This program creates Devanagari text which you can copy-and-paste directly into your emails or other documents.
  2. Mudgala IME - This is one of the simplest programs to input Devanagari, Roman Transliteration, Bengali, Grantha, Gujarati, Kannada, Malayalam, Oriya, Punjabi, Telugu, and standard Roman characters in all applications. Try it out, it's the simplest tool available. Please read the FAQ. There is an easy to remember mapping between the typed Roman characters and the displayed Devanagari characters.
  3. Aksharamala - This software is now freeware and can be downloaded from the Aksharamala Google groups page. There is an easy to remember mapping between the typed Roman characters and the displayed Devanagari characters. This software processes your keyboard input and converts it to Indian script characters. Once enabled, whatever you type (irrespective of the application you are using) is displayed as the corresponding Indian script character. For example, when you type 'sh' it displays श. It doesn't have its own editor; it processes all keystrokes for the active application. It can be used with all Unicode-compliant applications. The Help has a list of Aksharamala-compatible applications. It is very general and it's excellent software. It uses the ITRANS -> Unicode scheme, which means that it will work for ALL applications which support Unicode encoding, which I suppose is most applications. The free version lets you install only one Indic script keyboard at a time. The commercial version lets you choose between multiple keyboards. इदम् अक्षरमालया लिखितम्| Once the software is installed, CTRL+Shift+T toggles between the standard keyboard and the chosen Aksharamala keyboard.
  4. Baraha - An excellent software package for Indian Scripts (Local copy of the Readme file.)
  5. Kamban Software - A universal editor for English, Hindi, Kannada, Malayalam, Sinhalese, Tamil & Telugu with Unicode base. Help says, "Except English, all other languages have three keyboard layouts - DOE  (Department of Electronics, India) standard Layout,  Typewriter layout (based on Godrej Type writer, India) standard, and Phonetic Layout." You can choose amongst the three layouts and the keyboard layout can be displayed which makes it very easy to type.
  6. Chandas IME (Input Method Editor) and fonts.
  7. Sarasvati IME
  8. Indic IME provided by Microsoft. (Local copy downloaded in March 2006.)
  9. Transliterator - A Java program for typing Devanagari using the Itrans or SLP coding scheme, written by Chetan Pandey. The output can be saved in rtf format.
  10. Technology Development for Indian Languages - I cannot work out how to use their software. It seems very powerful, but I suspect one needs a card called a GIST card; I cannot be sure.

Please see help at Wikipedia or Devanagarii.net on how to do a setup for Devanagari and other Indian Scripts. Please also see below Nandakishorji's Blog entry on other software available to create documents in Indian Scripts.

There is a lot of discussion about phonetic keyboards, but in my extensive search on the Internet I couldn't find even one phonetic keyboard layout. I have saved the Phonetic keyboard layout from the Kamban software, and if you have a need or are just curious, please have a look. The other popular layouts like TypeWriter, Transliteration, and Inscript can be seen in the Microsoft Indic IME Help. The Aksharamala Help is also excellent on the ITRANS encoding scheme.

Devanagari with Microsoft Indic IME

Input Method Editors are basically programs to read your input keystrokes and convert them to appropriate Devanagari or other Indian Script characters. First please read the documentation and installation instructions and help which Microsoft IME or Chandas or Sarasvati have supplied. If you are happy with the fonts which come with windows then try Indic IME as your first choice. The entire process shouldn't take more than fifteen minutes or so. Experiment and very soon you can start creating your own documents effortlessly.
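
As a rough illustration of what an input method editor does, the Python sketch below converts typed Roman keys to Devanagari letters by greedy longest-match lookup. The KEYMAP table and the convert function are hypothetical and heavily simplified (four consonant letters only, no vowel signs or virama); they are not the mapping used by Indic IME, Chandas, or Sarasvati.

```python
# Hypothetical, heavily simplified key table (an assumption for illustration).
# Each Roman key sequence maps to one Devanagari consonant letter.
KEYMAP = {
    "k": "\u0915",   # क
    "kh": "\u0916",  # ख
    "s": "\u0938",   # स
    "sh": "\u0936",  # श
}


def convert(typed):
    """Convert typed Roman text using greedy longest-match lookup in KEYMAP.

    Characters with no entry in KEYMAP are passed through unchanged.
    """
    longest = max(len(key) for key in KEYMAP)
    out = []
    i = 0
    while i < len(typed):
        # Try the longest possible key sequence first, then shorter ones.
        for size in range(longest, 0, -1):
            chunk = typed[i:i + size]
            if chunk in KEYMAP:
                out.append(KEYMAP[chunk])
                i += len(chunk)
                break
        else:
            # No key sequence starts here: keep the character as typed.
            out.append(typed[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(convert("sh"), convert("s"), convert("kh"))
```

Trying the longest key sequence first is what lets 'sh' produce श rather than स followed by a stray 'h'.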

When you install Indic IME from Microsoft Word a help item will be created in Start->All Programs->Indic IME Help. Please read it carefully and do everything it says. From my experience I can say that (a) It has everything and (b) on first reading not everything is clear, hence I give some additional notes which will make Indic IME help clear.

After installing Indic IME an icon will appear in the System Tray on the right side of the Windows taskbar with either EN or HI or GJ or SA depending on what you have installed, etc. This is what Microsoft calls the language indicator. If you don't see the language indicator, right-click on the windows taskbar, select Toolbars menu and then select language bar. This indicator can be floated in the main window or it can appear on the taskbar of applications like Word. Please experiment. There is a down-arrow in that language indicator, clicking on it will give you an option to select the "Settings..."

In the Settings you can choose Key Settings to toggle between different language settings. The standard toggle key combination is Left ALT+Shift. This is a very useful feature. This can help you change your keyboards easily.

Another useful feature is the shortcut toolbar or status bar which appears whenever you choose the Indic IME in Microsoft Office applications. It might appear in the lower right corner or the upper one; please look for it. This feature is not available in applications outside MS Office, which is a pity, and for this reason Aksharamala is very powerful. The free version of Aksharamala lets you input only one script, but Indic IME presently lets you input four scripts.

Status Bar facilitates the use of options provided by Indic IME. The various options available are:

Option                 Description
Language Toggle        To toggle between the Hindi and English Language
Keyboard               To switch over between different keyboards
AutoText               To enable or disable type ahead options
On-the-fly-Help        To enable or disable character Help
Customized Word List   To Add/Update/Delete user-defined words for AutoText display

This status bar will enable you to choose one of the several keyboard options and also let you display the keyboard on the screen. It can do automatic word completion and many useful things. Please experiment with it. I find the transliteration mode of the Hindi IME especially useful. The IME help file gives the key combinations to produce the required Devanagari characters.

The good thing about transliteration mode is that it's easy to learn (it is just the way you would write Hindi or Sanskrit in Roman script) and it can be used for all Indian languages in exactly the same way. For those who have used ITRANS, this keyboard is very close to that. If you enable On-the-fly-Help, a small window opens which displays all the possible keys that can follow the key you just pressed to make meaningful Devanagari characters.
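
One way such a help window could be computed is a prefix lookup over the key table: given what has been typed so far, list every key that extends it towards a known key sequence. The Python sketch below is only a guess at the idea, not Microsoft's implementation; KEY_SEQUENCES and possible_next_keys are hypothetical names, and the table is a tiny made-up sample.

```python
# Hypothetical sample of Roman key sequences an IME might recognise
# (an assumption for illustration, not the real Indic IME table).
KEY_SEQUENCES = {
    "k": "\u0915",   # क
    "kh": "\u0916",  # ख
    "s": "\u0938",   # स
    "sh": "\u0936",  # श
}


def possible_next_keys(prefix):
    """Return the sorted keys that can follow `prefix` in some known sequence."""
    following = set()
    for sequence in KEY_SEQUENCES:
        if sequence.startswith(prefix) and len(sequence) > len(prefix):
            # The character right after the prefix is a valid next key.
            following.add(sequence[len(prefix)])
    return sorted(following)


if __name__ == "__main__":
    print(possible_next_keys("s"))
```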

Fonts

sanskrit99, Sanskrit 2003, chandas, URW Palladio ITU, paitub, paitubi, paitui, paitur, uttara, Code2000

Unicode

Unicode assigns every character a numeric code point, conventionally written as U+ followed by at least four hexadecimal digits. For Devanagari, the range U+0900 to U+097F is reserved.
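
The range check can be sketched in a few lines of Python. The names is_devanagari and describe are made up for this example; unicodedata is part of the Python standard library.

```python
import unicodedata

# Bounds of the Devanagari block as stated above.
DEVANAGARI_FIRST = 0x0900
DEVANAGARI_LAST = 0x097F


def is_devanagari(ch):
    """Return True if the single character `ch` lies in U+0900..U+097F."""
    return DEVANAGARI_FIRST <= ord(ch) <= DEVANAGARI_LAST


def describe(text):
    """List (code point, Unicode name, in-Devanagari-block) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"), is_devanagari(ch))
        for ch in text
    ]


if __name__ == "__main__":
    for row in describe("\u0938a"):  # स followed by Latin a
        print(row)
```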

The Unicode code point for स is U+0938. In HTML, स can be written as the hexadecimal character reference &#x0938; or as the decimal character reference &#2360;.
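
The code point and both HTML notations for a character can be derived from one another. A minimal Python sketch follows; the function name notations is made up for this example, while html.unescape is the standard library call that turns a character reference back into the character.

```python
import html


def notations(ch):
    """Return (U+ notation, hexadecimal HTML reference, decimal HTML reference)."""
    code_point = ord(ch)
    return (
        f"U+{code_point:04X}",
        f"&#x{code_point:04X};",
        f"&#{code_point};",
    )


if __name__ == "__main__":
    u_plus, hex_ref, dec_ref = notations("\u0938")  # स
    print(u_plus, hex_ref, dec_ref)
    # Both references decode back to the same character.
    print(html.unescape(hex_ref) == html.unescape(dec_ref))
```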

There is a mapping from Unicode to UTF-8 (Unicode Transformation Format) encoding. The UTF-8 encoding is understood by most editors and browsers. The mapping from Unicode to UTF-8 is a bit involved. A general discussion about Unicode to UTF-8 mapping can be found in FAQ which is useful to understand concepts in this mapping.
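
To make the mapping concrete, here is a small Python sketch that encodes one code point into UTF-8 bytes by hand and can be compared with Python's built-in encoder. The function name utf8_encode is made up for this example, and the sketch deliberately skips validity checks (for instance, it does not reject surrogate code points).

```python
def utf8_encode(code_point):
    """Encode one Unicode code point as UTF-8 bytes (no validity checks)."""
    if code_point < 0x80:
        # 1 byte: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx (all of Devanagari falls here)
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


if __name__ == "__main__":
    encoded = utf8_encode(0x0938)  # स
    print(encoded.hex(" "))
    print(encoded == "\u0938".encode("utf-8"))
```

Every Devanagari character (U+0900 to U+097F) falls in the three-byte case, so each one takes three bytes in a UTF-8 file.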

There is another mapping from Unicode to HTML entities. Please see the table which gives HTML equivalents for Devanagari Unicodes. This table is a local copy of the original document in UK. When Itranslator saves Devanagari in html format, Devanagari characters are saved in HTML decimal equivalents of Unicode. The Itranslator Manual also gives the mapping between Unicode and its decimal representation for Devanagari.
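
A conversion of this kind can be sketched in Python as follows. This is an illustration of the idea only, not Itranslator's actual code; to_decimal_entities is a made-up name, and leaving plain ASCII untouched is an assumption.

```python
import html


def to_decimal_entities(text):
    """Replace every non-ASCII character with its decimal HTML reference."""
    return "".join(
        ch if ord(ch) < 128 else f"&#{ord(ch)};"
        for ch in text
    )


if __name__ == "__main__":
    original = "sa = \u0938"  # Latin text followed by स
    encoded = to_decimal_entities(original)
    print(encoded)
    # html.unescape reverses the conversion.
    print(html.unescape(encoded) == original)
```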

Tips

Direct UTF-8 in html files

Use charset=utf-8 in the document's meta tag and forget about including specific font names. There are files from Macdonell's dictionary which have this charset=utf-8 statement and no other font statement. They also contain the characters directly in binary form, which is in Unicode.
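
As a minimal sketch of this tip, the Python code below builds a small HTML page that declares charset=utf-8 and writes the Devanagari text to disk as raw UTF-8 bytes. The function names build_page and write_page are assumptions for this example, and the text is inserted without HTML escaping, so it suits plain text only.

```python
from pathlib import Path


def build_page(title, body_text):
    """Return a small HTML page that declares UTF-8 and names no fonts.

    The arguments are inserted as-is (no HTML escaping).
    """
    return (
        "<html>\n<head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        f"<title>{title}</title>\n"
        "</head>\n<body>\n"
        f"<p>{body_text}</p>\n"
        "</body>\n</html>\n"
    )


def write_page(path, title, body_text):
    """Write the page to `path`, storing the characters as UTF-8 bytes."""
    Path(path).write_text(build_page(title, body_text), encoding="utf-8")


if __name__ == "__main__":
    # "example.html" is a placeholder file name.
    write_page("example.html", "Devanagari test", "\u0938")  # स
```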

In 2006-03-09.txt there is a wrapper which can be used to generate HTML files (see 2006-03-09.html) in UTF-8 encoding directly using the itrans command:
itrans -i 2006-03-09.txt -o 2006-03-09.html
See the file 2006-03-09.html in notepad (via View -> source) and you will still see all the Devanagari characters. UTF-8 encoding is understood by most editors.
A better idea on how to fully get the wrapper done can be had from ex_utf8.itx which comes with the Itrans documentation.

Alternate fonts can be specified as:

HR { color: #000000}
BODY, TABLE /* Devanagari */
{
font-size: 16pt;
font-style: normal;
font-weight: normal;
color: #000000;
text-decoration: none;
font-family:"Sanskrit 2003","Sanskrit 99",Uttara;
}

A URL from where the font can be downloaded can also be specified:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<HTML>
<HEAD>
<TITLE>Font test</TITLE>
<STYLE TYPE="text/css" MEDIA="screen, print">
  @font-face {
    font-family: "Robson Celtic";
    src: url("http://site/fonts/rob-celt")
  }
  H1 { font-family: "Robson Celtic", serif }
</STYLE>
</HEAD>
<BODY>
<H1> This heading is displayed using Robson Celtic</H1>
</BODY>
</HTML>

Notes from Sanskrit-Links' Blog of Wednesday, June 15, 2005

Sanskrit Wikipedia Information and Links in Sanskrit to many areas of study. The page is translated in Sanskrit-Devanagari. "vikipIDiyaa ekaH bahubhAShAsu muktaH vishvakoshaH asti | sa.nskR^iktabhAShAyaaH vikipIDiyaa juuna 2004 tame shubhArambhitaH |" This is an interesting development relying on the participation of countless volunteers. The information is expanded in true web manner and can be distracting at times. Volunteers need to train themselves in using unicode editors.

There are quite a few Devanagari Unicode generators available these days. Use itrans, Devanagari Generator using Itrans online, aksharamala, Itranslator, Javascript limited Hindi-unicode, chhahAri Unicode-based Nepali/Devnagari Editor, Hindi keyboard, baraha, Database of Indian Sacred Scriptures site, and TDIL. All these have convenient interfaces. See additional information about Unicode Devanagari at Alan Wood's Unicode Resources, Test for Unicode support in Web browsers: Devanagari, Devanagari writing, and Hindi translation of what is Unicode. Additional links are given in Sanskrit FAQ.

Links

Ulrich Stiehl's Sanskrit Website http://www.sanskritweb.net/
Google Transliteration Tool
South Asian Language Resource Centre - Resource for fonts and input software
Unicode Support in HTML, HTML Editors and Web Browsers - Alan Wood's Unicode Support Pages
W3C I18N Tutorial: Character sets & encodings in XHTML, HTML and CSS
Devanagari on Wikipedia - with keyboard layouts, etc.
Amazing Mozilla Utility to convert between Indian Scripts for Mozilla Firefox
Indic Font List
Unicode help page for Display problems
The Indian Script Converter
Indian Script Input System - From Hyderabad
Keyman – keyboard mapping software from Tavultesoft (A general software providing support for several languages)
Hindi Writer


Himanshu Pota
Last modified: Thursday May 8, 2008 10:56 AM