Tuesday, July 06, 2010

Switching from Devanāgarī to Roman with a single command

I have to admit even I am startled by the success of this.
In the input file below, I changed the single command:
  • \setdefaultlanguage{sanskrit}

to

  • \setdefaultlanguage{english}
and the result was the following:
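A minimal sketch of that one-line switch, cut down from the full input file. As far as I understand it, polyglossia selects \sanskritfont only for text in the Sanskrit language, so with English as the default the same transliterated input is set in the main (Roman) font. The font names are the ones used in the full example and are assumed to be installed; this reduced version is untested.

```latex
% Compile with xelatex. Assumes the Gandhari Unicode and Sahadeva fonts
% are installed and that RomDev.tec is somewhere XeTeX can find it.
\documentclass{article}
\usepackage{polyglossia}
% Keep exactly one of the next two lines active:
\setdefaultlanguage{sanskrit}   % body text set in \sanskritfont: Devanagari
%\setdefaultlanguage{english}   % body text set in the main font: Roman
\setmainfont{Gandhari Unicode}
\newfontfamily\sanskritfont[Script=Devanagari,Mapping=RomDev]{Sahadeva}
\begin{document}
kṛtsnasya tantrasya gṛhītadhāmna-
\end{document}
```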

How do I install RomDev mapping for XeLaTeX (Unicode transliteration -> Devanāgarī)?

[Update, February 2011: Somdev has moved his blog to http://pratibham.blogspot.com/.]

Somdev Vasudev's RomDev mapping is installed as follows:
  1. The actual mapping file is published by Somdev in his blog, here:
    http://sarasvatam.blogspot.com/2010/03/updated-teckit-romdev.html 
    [Update Feb 2011: now at http://pratibham.blogspot.com/2010/03/updated-teckit-romdev.html; update March 2012: now at https://github.com/somadeva/RomDev]
  2. Copy and paste this text, and save it in a Unicode file called RomDev.map.  Save that file in a place which XeTeX can "see," e.g., something like local/texmf/fonts/misc/xetex/fontmapping/
  3. You now need to compile the human-readable *.map file into a binary *.tec file, so that XeTeX can read it directly.  This is done by the program TECkit, which you can get here:
    http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=TECkitDownloads
  4. I'm working with Ubuntu GNU/Linux.  For me, the command is,

    teckit_compile RomDev.map -o RomDev.tec

    I'm afraid I don't know the Windows or Mac command invocation.

  5. Now you have a file in a place like
    local/texmf/fonts/misc/xetex/fontmapping/RomDev.tec

  6. Run the command that rebuilds the database of files that TeX knows about.  In Linux it's
    sudo mktexlsr
  7. That's it!  XeTeX and XeLaTeX can now see and use the RomDev mapping, which converts Unicode transliteration into Devanāgarī, as exemplified in my earlier blog posts below.
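
A quick way to check the installation is a throwaway test file like the sketch below. It assumes the Sahadeva font from the full example is installed (any Unicode Devanāgarī font should do), and \devafont is just a name made up for this test. If the word comes out in Devanāgarī, the mapping was found; if not, look in the log for a complaint about the font mapping.

```latex
% Save as romdev-test.tex (hypothetical name) and compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
% \devafont is a made-up name; Mapping=RomDev refers to the RomDev.tec
% file compiled and installed in the steps above.
\newfontfamily\devafont[Script=Devanagari,Mapping=RomDev]{Sahadeva}
\begin{document}
{\devafont yogaśatakam}
\end{document}
```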

    A minimal edition of a Sanskrit verse, using XeLaTeX and Ledmac


    And here's the input for the above (tested and working in September 2019):


    \documentclass{book}
    % Set up things for XeLaTeX, and Devanagari.
    % Simplified version of http://cikitsa.blogspot.com/2010/07/xelatex-for-sanskrit.html

    \usepackage{polyglossia} % the multilingual support package
    % Next, from the polyglossia manual:
    \setdefaultlanguage{sanskrit} % this is mostly going to be Sanskrit,
    \setotherlanguage{french} % with some French embedded in it,
    \setotherlanguage{english} % and some English.
    % These will call appropriate hyphenation.
    \usepackage{xltxtra} % standard for nearly all XeLaTeX documents
    \defaultfontfeatures{Mapping=tex-text} % ditto
    \setmainfont{Gandhari Unicode} % could be any Unicode font
    % Now define the Devanagari font:
    % John Smith's Sahadeva, input using standard UTF8 transliteration
    \newfontfamily\sanskritfont [Script=Devanagari,Mapping=RomDev]{Sahadeva}

    % Now come the commands for the critical edition formatting:
    \usepackage[noeledmac]{ledmac} %"noeledmac" stops some annoying messages
    % customizations to Ledmac, and macros to make life easier.
    \def\Variant#1{\Afootnote{\relax#1}}
    \def\Lemma#1{\lemma{\relax#1}}
    \let\Reference=\Bfootnote
    \let\Grammatical=\Cfootnote
    \let\Tibetan=\Dfootnote
    % in a real edition, I'd probably also make
    % abbreviations for \textfrench (perhaps \tf) etc.
    \def\Omission#1{$\langle$#1$\rangle$}
    \def\ScribalDeletion#1{{\rm[\kern-.15em[}#1{\rm]\kern-.15em]}}
    \def\hardspace{\texttt{\char`\ }}
    \def\And{{\rm\penalty-1\quad$\mid\mid$~}} % divider between variants to the same lemma
    % more customizations: make the A notes
    % (\Variants and \Lemmas) into two-column format,
    % and make the B notes (\Reference) normal footnotes.
    %
    % changes to stuff cut-and-pasted from ledmac.sty:
    \makeatletter
    \renewcommand*{\twocolfootfmt}[3]{%
    \normal@pars
    % \hsize .45\hsize
    \hsize .49\hsize
    \parindent=0pt
    \tolerance=5000
    \raggedright
    \leavevmode\hangindent1.5em\hangafter1
    \strut{\notenumfont\printlines#1|}\enspace
    {\select@lemmafont#1|#2}\rbracket\enskip
    #3\strut\par\allowbreak}
    \foottwocol{A}
    \renewcommand*{\normalfootfmt}[3]{%
    \normal@pars
    \parindent=0pt \parfillskip=0pt plus 1fil
    \hangindent1.5em\hangafter1
    {\notenumfont\printlines#1|}\strut\enspace
    {\select@lemmafont#1|#2}\rbracket\enskip#3\strut\par}
    \footnormal{B}
    \makeatother
    \firstlinenum{1}
    \linenumincrement{1}


    % and here begins the edition:
    %
    \begin{document}
    \chapter*{yogaśatakam}
    \large


    \section*{\textenglish{The example verse by itself}}

    \textenglish{From \emph{Yogaśataka: Texte m\'edical attribu\'e
    \`a Nāgārjuna\ldots par Jean Filliozat} (Pondich\'ery, 1979), pp.\,1, 59:\par}

    \bigskip

    kṛtsnasya tantrasya gṛhītadhāmna-\\
    ścikitsitādviprasṛtasya dūram|
    vidagdhavaidyapratipūjitasya\\
    kariṣyate yogaśatasya bandhaḥ|| 1||

    \bigskip

    \section*{\textenglish{The example verse, with apparatus}}
    % we could use the \stanza command, but I haven't bothered.

    %
    % I find that the judicious use of indentation
    % and newlines helps enormously to see what's what.
    % Using a good "folding editor" would be even better.
    %

    \begingroup
    \beginnumbering
    \autopar
    \edtext{
    \edtext{kṛtsnasya}{
    \Variant{%
    \textfrench{N1 détruit, C1 }kṛtas tasya,
    \textfrench{C2 }kṛtasya.}
    \Tibetan{\textfrench{T \emph{mth'yas}, ``sans limite, immense''
    traduit }kṛtsnasya.}}
    tantrasya
    \edtext{gṛhītadhāmna-}{
    \Variant{\textfrench{Ca, JK }dhamnā.}}\\
    \edtext{ścikitsitā}{
    \Lemma{cikitsitād} % not ``ścikitsitā'', of course. We're preserving
    % the sandhyakṣaras.
    \Variant{\textfrench{C1, C2 } cikitsitāt.}
    \Tibetan{\textfrench{T \emph{gso-spyad} ``pratique de la
    thérapeutique''. Ordinairement
      \emph{gso spyad} est ``investigation de la th.''}}}% comment sign to stop a break after the conjunct
    \edtext{dviprasṛtasya}{
    \Lemma{viprasṛtasya} % as above with cikitsitād.
    \Variant{\textfrench{Ca} cikitsitārthaprasṛtasya, \textfrench{C1, C2}
    viprasutasya.}}
    \edtext{dūram}{
    \Variant{\textfrench{Ca} dūrāt}}|
    \\ \indent
    %
    % the above line is annoying. Because the whole verse is
    % inside an \edtext{} macro, in order to get the
    % \Grammatical note naming the upajāti verse, we have to
    % avoid having paragraph breaks, which are not allowed
    % inside \edtext{}.
    % instead, we use \\ (newline) and \indent (paragraph indent)
    % to get the same visual effect. A nasty kludge.
    %
    vidagdhavaidyapratipūjitasya\\
    \edtext{kariṣyate}{
    \Variant{\textfrench{N1} karikṣete.}}
    yogaśatasya bandhaḥ|| 1||
    }{\Lemma{}\Grammatical{Upajāti.}}
    \par % necessary to stop \autopar complaining. Thanks to Alessandro Graheli.
    \endgroup
    \end{document}
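
The comment in the preamble above mentions making abbreviations for \textfrench and friends in a real edition. A possible sketch, to go in the preamble: \tf is the name suggested in that comment, while \te is a hypothetical name chosen by analogy. Neither is part of the tested file.

```latex
% Hypothetical short forms for the language-switching commands.
% \newcommand* will raise an error if either name is already taken.
\newcommand*{\tf}[1]{\textfrench{#1}}
\newcommand*{\te}[1]{\textenglish{#1}}
% Usage in the apparatus would then look like:
%   \Variant{\tf{C1, C2 }cikitsitāt.}
```

The gain is purely in readability of the input: the apparatus entries get shorter, and the output should be unchanged.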

    Monday, July 05, 2010

    XeLaTeX for Sanskrit

    This example worked well in July 2010, but some TeX packages have since been updated slightly.  See the new, updated version of this example, posted on 27 May 2013.