Monday, May 27, 2013

XeLaTeX for Sanskrit: update

In a post on 5 July 2010, I gave an example of how to use XeLaTeX with various fonts and various ways of inputting text.   Some time later, the commands in the Fontspec and Polyglossia packages were updated, and my example didn't work as advertised any more.  Here is an update that works again.



Here's the output:


And here's the updated input:


% This is a Unicode file.
\documentclass[12pt]{article}
\usepackage{multicol} % just to get narrow columns on one page
\usepackage{polyglossia} % the multilingual support package
% for XeLaTeX - includes Sanskrit.
% Next, from the polyglossia manual:
\setdefaultlanguage{english} % this is mostly going to be English text, with
\setotherlanguage{sanskrit} % some Sanskrit embedded in it.
% These will call appropriate hyphenation.
\usepackage{xltxtra} % standard for nearly all XeLaTeX documents
\defaultfontfeatures{Mapping=tex-text} % ditto
\setmainfont{Gandhari Unicode} %could be any Unicode font

% Now define some Devanagari fonts:
% At least *one* font family must be called \sanskritfont or \devanagarifont,
% so that XeTeX loads all hyphenation and other stuff for Sanskrit. 
% Once the Sanskrit ``intelligence'' is loaded, it can be invoked at
% other places as needed using \selectlanguage{sanskrit}
%
% John Smith's Nakula, input using Velthuis transliteration
\newfontfamily
\sanskritfont [Script=Devanagari,Mapping=velthuis-sanskrit]{Nakula}
% John Smith's Sahadeva, input using Velthuis transliteration
\newfontfamily
\sahadevafont [Script=Devanagari,Mapping=velthuis-sanskrit]{Sahadeva}
% John's Sahadeva, input using scholarly romanisation in Unicode
\newfontfamily
\sahadevaunicodefont [Script=Devanagari,Mapping=iast]{Sahadeva}
% Microsoft's Mangal font (ugh!), input using standard romanisation in Unicode.
\newfontfamily
\mangalfont [Script=Devanagari,Mapping=iast]{Mangal}
% Daniel Stender's iast.map is used above to get the mapping
% from Unicode to Devanāgarī. Zdenek Wagner's velthuis-sanskrit.map
% is used to get the Velthuis to Devanāgarī mapping. These are the files
% that XeTeX uses to make all the conjunct consonants without needing
% any external preprocessor (like the old devnag.c program).
% % Set up the font commands:
%
\newcommand{\velt}[1]{{{\selectlanguage{sanskrit}\sanskritfont #1}}}
\newcommand{\saha}[1]{{{\selectlanguage{sanskrit}\sahadevafont#1}}}
\newcommand{\sahauni}[1]{{{\selectlanguage{sanskrit}\sahadevaunicodefont #1}}}
\newcommand{\mangaluni}[1]{{{\selectlanguage{sanskrit}\mangalfont #1}}}
% \textssanskrit, above, is a Polyglossia command that gets Sanskrit hyphenation right.
% ... and here we go!
\begin{document}
\begin{multicols}{2} % narrow cols to force plenty of hyphenation
\large % ditto.
\begin{enumerate}
\item[1]
With Xe\LaTeX\ it's easy to typeset Velthuis encoded Devanagari like the following example, without using a preprocessor:
\velt{sugataan sasutaan sadharmakaayaan pra.nipatyaadarato 'khilaa.m"sca vandyaan|
sugataatmajasa.mvaraavataara.m kathayi.syaami yathaagama.m samaasaat||} Bodhicaryāvatāra 1,1.
NB: automatic hyphenation.
\item[2]
A different Devanāgarī font:
\saha{sugataan sasutaan sadharmakaayaan pra.nipatyaadarato 'khilaa.m"sca vandyaan|
sugataatmajasa.mvaraavataara.m kathayi.syaami yathaagama.m samaasaat||} Bodhicaryāvatāra 1,1.
\item[3]
Another sentence: \saha{ratnojjvalastambhamanorame.su muktaamayodbhaasivitaanake.su|
svacchojjvalasphaa.tikaku.t.time.su sungandhi.su snaanag.rhe.su te.su||} 2,10.
\item[4]
Now, thanks to Stender's iast.map, we can input in Unicode, using standard scholarly transliteration, and get Devanāgarī generated for us automatically:
\sahauni{āsīdrājā nalo nāma vīrsenasuto balī||\par }
\item[5]
Plain Unicode input, no tricks:
āsīdrājā nalo nāma vīrsenasuto balī||
\item[6]
Plain Unicode romanisation input, no tricks:
\mangaluni{āsīdrājā nalo nāma vīrsenasuto balī||\par }
Plain Unicode Devanāgarī input, no tricks:
{\mangalfont आसीद्राजा नलो नाम वीरसेनसुतो बली|\par}
\end{enumerate}
\end{multicols}

\noindent
English and Devanāgarī are both doing okay. The only thing that isn't hyphenating well yet is Sanskrit in roman transliteration.

Other nice stuff becomes easy. E.g., define a command \verb|\example| that prints a romanised word in Nāgarī, and then repeats it in romanisation, in parentheses:

\verb|\newcommand\example[1]{\sahauni{#1}~(\emph{#1})}|\newcommand\example[1]{\sahauni{#1}~(\emph{#1})}

Input: \verb|\example{ekadhā}|

Output: \example{ekadhā}
\end{document}
%that's all folks!