YAMultilingualMarkdown
YAMultilingualMarkdown is a utility to convert Yet Another Multilingual Markdown to HTML.
Yet Another Multilingual Markdown (format) is a Markdown dialect designed for hosting multilingual content. YAMultilingualMarkdown (tool) converts Yet Another Multilingual Markdown to HTML while extracting only the content in the specified language(s).
Usage
Synopsis
ya_multilingual_markdown [OPTIONS] [FILE]
Options
Type ya_multilingual_markdown --help to show command line options.
Convert Yet Another Multilingual Markdown to HTML
Usage:
ya_multilingual_markdown [OPTIONS] [FILE]
Options:
--langs=LANG1,LANG2,... Languages to be included in the output (default: ""; empty implies all)
--lang-attr-name=STRING Attribute name for language (default: "lang")
--heading-lang-sep=STRING Languages separator in headings (default: " / ")
--output-type=auto|fragment|document
Output type (default: "auto")
--template-file=FILE Document template file in eRuby format to embed contents
--link-suffixes=FROM:TO,... Link suffixes to rewrite (default: ".md:.html")
--show-default-template Show default document template
--log-level=unknown|fatal|error|warn|info|debug
Log level (default: warn)
--help Show help message
--version Show version number
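The --link-suffixes option takes comma-separated FROM:TO pairs. As a rough illustration of what such a rule could mean, here is a minimal Ruby sketch. The helper names parse_link_suffixes and rewrite_link are hypothetical and are not part of the tool; how the tool itself treats links with fragments or query strings is not covered here.

```ruby
# Hypothetical helpers for illustration only; not ya_multilingual_markdown's API.

# Parse a spec such as ".md:.html,.markdown:.html" into [from, to] pairs.
def parse_link_suffixes(spec)
  spec.split(",").map { |pair| pair.split(":", 2) }
end

# Replace the suffix of a link target with the first matching rule, if any.
def rewrite_link(href, rules)
  rules.each do |from, to|
    return href.delete_suffix(from) + to if href.end_with?(from)
  end
  href # no rule matched: leave the link untouched
end

RULES = parse_link_suffixes(".md:.html") # the documented default value

puts rewrite_link("chapter1.md", RULES)
puts rewrite_link("image.png", RULES)
```

With the default rule, a link to chapter1.md would point to chapter1.html, while links with other suffixes stay as they are.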
Examples
Multilingual contents: headings, paragraphs, and other elements
A simple Yet Another Multilingual Markdown document looks like the following (snow_white.md):
# Schneeweißchen
{: lang="de"}
# Little Snow-white
{: lang="en"}
Es war einmal mitten im Winter,...
{: lang="de"}
Once upon a time in the middle of winter,...
{: lang="en"}
(You can use PHP Markdown Extra style extended syntax ({: name="value"}) to add attributes to block elements.)
Keep all languages
Without language-related options, the output will contain all languages.
ya_multilingual_markdown snow_white.md
Excerpt from the output:
<h1><span lang="de">Schneeweißchen</span> / <span lang="en">Little Snow-white</span></h1>
<p lang="de">Es war einmal mitten im Winter,...</p>
<p lang="en">Once upon a time in the middle of winter,...</p>
In a browser, the above output may look like the following:
Schneeweißchen / Little Snow-white
Es war einmal mitten im Winter, ...
Once upon a time in the middle of winter, ...
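The merged heading in the excerpt above can be pictured as wrapping each language variant in a span and joining the spans with the heading separator (" / " by default, see --heading-lang-sep). The following Ruby sketch only illustrates that idea; merge_heading is a hypothetical helper, not the tool's implementation.

```ruby
require "erb"

# Hypothetical helper, not the tool's API: wrap each language variant of a
# heading in <span lang="..."> and join the spans with a separator.
def merge_heading(variants, sep: " / ", level: 1)
  spans = variants.map do |lang, text|
    %(<span lang="#{lang}">#{ERB::Util.html_escape(text)}</span>)
  end
  "<h#{level}>#{spans.join(sep)}</h#{level}>"
end

HEADING_VARIANTS = [["de", "Schneeweißchen"], ["en", "Little Snow-white"]].freeze

puts merge_heading(HEADING_VARIANTS)
```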
Extract single language
With the option --langs=en, the output will contain only elements whose lang attribute is en (and elements without a lang attribute).
ya_multilingual_markdown --langs=en snow_white.md
Excerpt from the output:
<h1><span lang="en">Little Snow-white</span></h1>
<p lang="en">Once upon a time in the middle of winter, ...</p>
In a browser, the above output may look like the following:
Little Snow-white
Once upon a time in the middle of winter, ...
Extract multiple languages
With option --langs=de,en, the output will contain elements with lang set to de or en (and elements without lang).
ya_multilingual_markdown --langs=de,en snow_white.md
Excerpt from the output:
<h1><span lang="de">Schneeweißchen</span> / <span lang="en">Little Snow-white</span></h1>
<p lang="de">Es war einmal mitten im Winter,...</p>
<p lang="en">Once upon a time in the middle of winter,...</p>
In a browser, the output may look like the following:
Schneeweißchen / Little Snow-white
Es war einmal mitten im Winter, ...
Once upon a time in the middle of winter, ...
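The selection rule described in the three cases above (an empty --langs keeps everything; otherwise keep elements whose lang is listed, plus elements without lang) can be sketched in a few lines of Ruby. Block and select_blocks are hypothetical names used for illustration, and the flat list of blocks is a simplification of whatever document tree the tool actually works on.

```ruby
# Hypothetical model: a block of content with an optional language tag.
Block = Struct.new(:text, :lang)

# Keep every block when no language is requested; otherwise keep blocks
# whose lang is requested, plus blocks that carry no lang at all.
def select_blocks(blocks, langs)
  return blocks if langs.empty?
  blocks.select { |block| block.lang.nil? || langs.include?(block.lang) }
end

BLOCKS = [
  Block.new("Es war einmal mitten im Winter,...", "de"),
  Block.new("Once upon a time in the middle of winter,...", "en"),
  Block.new("* * *", nil) # a block without a lang attribute
].freeze

# Equivalent in spirit to --langs=en
puts select_blocks(BLOCKS, "en".split(",")).map(&:text)
```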
Metadata in YAML front matter
Document metadata can be stored in the document using Jekyll-style YAML front matter.
A simple Yet Another Multilingual Markdown document with YAML front matter looks like the following (snow_white_with_metadata.md):
---
title: Little Snow-white
author:
  - Jacob Ludwig Karl Grimm
  - Wilhelm Carl Grimm
meta:
  - name: original title
    content: Schneeweißchen
    lang: de
  - name: translator
    content: Margaret Hunt
    lang: en
---
...
(The key author is a shortcut to <meta name="author" .../>.)
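To make the mapping from front matter to meta elements concrete, here is a minimal Ruby sketch using only the standard library. The helper meta_tags is hypothetical, and treating every key of a meta entry as an HTML attribute is an assumption made for illustration, not a description of the tool's internals.

```ruby
require "yaml"
require "erb"

FRONT_MATTER = <<~YAML
  title: Little Snow-white
  author:
    - Jacob Ludwig Karl Grimm
    - Wilhelm Carl Grimm
  meta:
    - name: original title
      content: Schneeweißchen
      lang: de
    - name: translator
      content: Margaret Hunt
      lang: en
YAML

# Hypothetical helper: expand the author shortcut, append the meta entries,
# drop entries whose lang is not requested, and render <meta> elements.
def meta_tags(data, langs = [])
  entries = Array(data["author"]).map { |name| { "name" => "author", "content" => name } }
  entries += Array(data["meta"])
  entries = entries.select do |entry|
    langs.empty? || entry["lang"].nil? || langs.include?(entry["lang"])
  end
  entries.map do |entry|
    attrs = entry.map { |key, value| %(#{key}="#{ERB::Util.html_escape(value)}") }.join(" ")
    "<meta #{attrs} />"
  end
end

METADATA = YAML.safe_load(FRONT_MATTER)

puts meta_tags(METADATA, ["en"])
```

Calling meta_tags(METADATA) with no languages keeps all four entries, while passing ["en"] drops the entry tagged de.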
Let us include all languages in the output:
ya_multilingual_markdown snow_white_with_metadata.md
Excerpt from the output:
<title>Little Snow-white</title>
<meta name="author" content="Jacob Ludwig Karl Grimm" />
<meta name="author" content="Wilhelm Carl Grimm" />
<meta name="original title" content="Schneeweißchen" lang="de" />
<meta name="translator" content="Margaret Hunt" lang="en" />
<p>...</p>
You can filter metadata by language.
Let us include en only (thus exclude de) in the output:
ya_multilingual_markdown --langs=en snow_white_with_metadata.md
Excerpt from the output:
<title>Little Snow-white</title>
<meta name="author" content="Jacob Ludwig Karl Grimm" />
<meta name="author" content="Wilhelm Carl Grimm" />
<meta name="translator" content="Margaret Hunt" lang="en" />
<p>...</p>
Output complete document
Use --output-type=document to print a complete HTML document rather than an HTML fragment.
Input:
---
title: Little Snow-white
---
Once upon a time in the middle of winter, ...
Command line:
ya_multilingual_markdown --output-type=document snow_white_with_title.md
Output:
<!DOCTYPE html>
<html>
<head>
<title>Little Snow-white</title>
</head>
<body>
<p>Once upon a time in the middle of winter, ...</p>
</body>
</html>
--output-type=auto will automatically choose document if the input includes front matter.
You can provide a custom template using --template-file=FILE. Templates must be in eRuby format. Use --show-default-template to see the built-in default template.
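For readers unfamiliar with eRuby, the following sketch shows how an eRuby template is filled in with Ruby's standard ERB library. The variable names title and body are assumptions chosen for this example; the names the tool actually exposes to templates should be taken from the output of --show-default-template.

```ruby
require "erb"

# Hypothetical template: "title" and "body" are illustrative variable names,
# not necessarily the ones ya_multilingual_markdown passes to its templates.
TEMPLATE = <<~ERUBY
  <!DOCTYPE html>
  <html>
  <head>
  <title><%= ERB::Util.html_escape(title) %></title>
  </head>
  <body>
  <%= body %>
  </body>
  </html>
ERUBY

# Render the template with the given values bound as local variables.
def render_document(title:, body:)
  ERB.new(TEMPLATE).result_with_hash(title: title, body: body)
end

puts render_document(
  title: "Little Snow-white",
  body: "<p>Once upon a time in the middle of winter, ...</p>"
)
```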
Installation
gem install ya_multilingual_markdown
or
git clone https://github.com/hisashim/ya_multilingual_markdown
cd ya_multilingual_markdown
rake install
Requirements
Runtime requirements:
Development requirements (in addition to runtime requirements):
Notes
Limitations and known problems
- Only a small subset of kramdown's extended syntax is supported, although YAMultilingualMarkdown is built upon kramdown.
- As for multilingual headings, the ALD (Attribute List Definition) for each heading must be placed only after the heading.
Supported:
# Schneeweißchen
{: lang="de"}
# Little Snow-white
{: lang="en"}
Not supported:
{: lang="de"}
# Schneeweißchen
{: lang="en"}
# Little Snow-white
This compromise allows us to write an id at the beginning of headings as well as at the end, with less code.
{: #title}
# Schneeweißchen
{: lang="de"}
# Little Snow-white
{: lang="en"}

# Schneeweißchen
{: lang="de"}
# Little Snow-white
{: lang="en"}
{: #title}
Motivation
Yet Another Multilingual Markdown and its processor were born out of the need for a manuscript format for translated books.
Having a side-by-side version of the galley proof that includes both the original and translated texts helps translators review their work. Being able to search and edit manuscripts in a (sort of) side-by-side format is also useful.
While placing translated text in separate files from the original is a common and effective approach for localization/multilingualization projects, a format allowing multiple languages within a single file comes in handy for small projects with just a few languages. Yet Another Multilingual Markdown is an attempt to develop a proof of concept for such a format and a processing tool.
See also
- Requirements for Japanese Text Layout is an excellent example of a multilingual document in HTML format.
- Lightweight text formats and processing tools that allow multiple languages to be written in a single file (they do not necessarily feature or aim at extracting or representing multiple languages side by side):
License
This software is distributed under the terms of the MIT license.
Acknowledgments
Many thanks to:
- Koichi Sasada, whose manuscript preprocessor inspired me to come up with a lightweight markup format that features multilingualization.
Contributors
- Hisashi Morita - creator and maintainer