
Mathjax v2.2 feature discussion

fred-wang edited this page Nov 29, 2012 · 22 revisions

For the broader document see Mathjax v2.x potential features discussion

Localization

For the full requirements and discussion, see MathJax v2.x potential feature: localization

Let's try to boil down how far we can address the requirements in v2.2.

  • How does MathJax decide which language to use?
    • How much control should the author have?
    • How much control should the visitor have?
    • Can MathJax automatically decide the language?
    • Can other software communicate preferences to MathJax (e.g., AT software, dynamic l10n code)?
    • What happens when the required language is not available?

MathJax will have additional configuration options as well as cookie information on user preferences for locales. Storage and loading of locales will follow the model already in place for configuration files.

The configuration options allow authors to specify (a) which language is the default for a page, (b) whether to activate a submenu so that users can switch languages (authors can restrict it to a subset of languages), and (c) whether to forcibly override user preferences stored in the MathJax cookie.

We will not attempt to auto-detect locales, as this is currently not reliably possible with JavaScript.

Since we want authors to be able to let users switch locales, an API will be needed so that third parties can switch locales as well (e.g., in a dynamic "change locale" situation for web content).

English will be the fallback language.
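
To make the decision order above concrete, here is a minimal sketch of how the author configuration, the cookie preference, and the English fallback might interact. The names `resolveLocale`, `defaultLocale`, and `forceDefault` are hypothetical and only illustrate the discussion; they are not an existing MathJax API.

```javascript
// Hypothetical sketch: none of these names are part of MathJax.
const FALLBACK_LOCALE = "en";

/**
 * Pick the locale to use for a page.
 * authorConfig: { defaultLocale?: string, forceDefault?: boolean } (assumed option names)
 * cookieLocale: visitor preference read from the MathJax cookie, or null
 * available:    locales for which translation data exists
 */
function resolveLocale(authorConfig, cookieLocale, available) {
  const candidates = [];
  // The visitor's stored preference wins unless the author overrides it.
  if (cookieLocale && !authorConfig.forceDefault) {
    candidates.push(cookieLocale);
  }
  if (authorConfig.defaultLocale) {
    candidates.push(authorConfig.defaultLocale);
  }
  for (const locale of candidates) {
    if (available.includes(locale)) {
      return locale;
    }
  }
  // No requested locale is available: English is the fallback.
  return FALLBACK_LOCALE;
}
```

In this sketch a stored visitor preference takes precedence over the page default, which matches option (c) above being an explicit opt-in for authors.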

  • How do we guarantee availability of international characters?
    • How do we deal with non-roman languages?
    • How do we deal with non-Arabic numerals?
    • How do we deal with vertical writing?
    • Do we allow style changes in translations (cf. HTML snippets)? (different language, different need)

Eventually, we will add font-checking features, but for the initial implementation we will rely on system fonts with wide Unicode coverage.

We will need to investigate non-Arabic numerals.

Vertical writing will not be supported for now, since vertical writing systems often have horizontal alternatives (e.g., Asian scripts).

  • Who makes the translation?
    • Do we have enough complexity to need professional translations?
    • How do we communicate to translators what needs to be translated?

Internally, JSON seems to be the natural way to deal with the locale information.
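
As an illustration of the JSON idea, the following sketch shows one possible shape for locale data and a lookup that falls back to English. The message ids, the `%1` placeholder convention, and the `translate` function are assumptions made for this example, not a settled format.

```javascript
// Hypothetical locale data; ids and strings are invented for illustration.
const localeJson = `{
  "en": { "LoadFailed": "File failed to load: %1", "About": "About MathJax" },
  "fr": { "About": "À propos de MathJax" }
}`;
const locales = JSON.parse(localeJson);

// True only for the table's own keys (ignores inherited ones like "toString").
function own(table, id) {
  return Object.prototype.hasOwnProperty.call(table, id);
}

// Look up a message id, falling back to English, then to the id itself.
function translate(locale, id, ...args) {
  const table = own(locales, locale) ? locales[locale] : {};
  let template;
  if (own(table, id)) {
    template = table[id];
  } else if (own(locales.en, id)) {
    template = locales.en[id];
  } else {
    return id;
  }
  // Replace %1, %2, ... with the corresponding argument, if one was given.
  return template.replace(/%(\d+)/g, (match, n) => {
    const value = args[Number(n) - 1];
    return value === undefined ? match : String(value);
  });
}
```

Numbered placeholders are used here because translations often need to reorder arguments.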

Externally (for contributors), we will make sure to offer a file format that is compatible with crowd-sourced platforms such as Transifex or TranslateWiki. The natural candidate is XLIFF; an alternative could be PO (common in PHP projects). For more inspiration see this StackOverflow thread.

  • What languages should we deliver out of the box?

We will provide an initial set of translations as large as possible. French and German will be easy via the MathJax Team. The interest of the community (in particular, assuming help from Wikipedia and TranslateWiki) will help us provide a wide range of translations as soon as we've created the format and during the beta cycle.

  • How should MathJax extensions be localized?

While an API for extensions will be part of a future release (and hence influence the defensive implementation for v2.2), this is outside the scope of v2.2.

  • Can we add translations without making a new release?
    • Where can you find translations?
    • Can you provide private translations?
    • How do we encourage giving translations back?
    • Do we allow alternative or additional translations? (say, redirect to tech support)
    • Do we force CDN users to give their translations back?

The implementation should allow adding translations on the fly. From a practical point of view, we can easily add translations to the latest branch and the CDN, following the vN.Ma release schema.

Locale files will be part of the main MathJax repository but, just like configuration files, can be loaded from arbitrary locations.

We will encourage contributions by using web services such as TranslateWiki.
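
A rough sketch of what "loaded from arbitrary locations" could mean in practice: a locale file is looked up in a default folder under the MathJax root unless the author supplies a custom URL. The function name, the `locales` folder, and the `.js` suffix are all assumptions for the sake of the example.

```javascript
// Hypothetical sketch: folder name, suffix, and option shape are assumptions.
function localeFileUrl(root, locale, customLocations = {}) {
  // An author-supplied location (e.g., a private translation) takes precedence.
  if (Object.prototype.hasOwnProperty.call(customLocations, locale)) {
    return customLocations[locale];
  }
  // Otherwise use the copy shipped with MathJax, relative to its root URL.
  return root.replace(/\/+$/, "") + "/locales/" + locale + ".js";
}
```

For example, `localeFileUrl("https://example.org/mathjax/", "fr")` would point at the bundled French file, while passing `{ fr: "https://example.org/my-fr.js" }` as the third argument would select a private translation instead.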

Technical considerations:

  • 1 file per language vs multiple files per language

As with the idea of an API for extensions and their locales, MathJax will have to deal with multiple files eventually. The MathJax core locale should be small enough (<100 strings) to fit in a single file without performance issues.

  • Defensive programming aspects apply to:
    • MathJax UI strings added in the future
    • API for extensions and their locales
    • security concerns with 3rd party translations
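
Two of the defensive aspects above can be sketched together: a string that a translation does not (yet) contain falls back to the English default passed by the caller, and translated text is HTML-escaped before it is inserted into the page. The function names are hypothetical, and escaping everything is only one possible policy (it would rule out HTML snippets in translations).

```javascript
// Hypothetical sketch of defensive handling for third-party translations.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Return an escaped message, using the English default when the
// translation is missing or is not a string.
function safeMessage(translations, id, englishDefault) {
  const hasOwn = Object.prototype.hasOwnProperty.call(translations, id);
  const value = hasOwn ? translations[id] : englishDefault;
  return escapeHtml(typeof value === "string" ? value : englishDefault);
}
```

Passing the English default at the call site means UI strings added in a future release still display something sensible with older locale files.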

Fonts & Characters

  • Add STIX and Asana support
    • Create fontdata and webfonts
    • Investigate potential for crowdsourcing fontdata generation.
  • Investigate font mixing/switching, using document font for alpha-numeric characters.
  • Investigate DejaVu fonts (planned "MathML 2.0" support).

TeX input enhancements

  • Incorporate Davide's amscd code.
  • Investigate a LaTeX2e extra symbols extension.
  • Offer Instiki syntax as input -> not a priority (Instiki never exposes its TeX syntax but converts to MathML).
    • fred: Actually, Instiki just uses itex2MML and, I think, has an option to use blahtex. Both tools are open source, so the syntax is known.
    • peter: True, but we should probably open up another section for "input processors"?
    • fred: It's probably overkill to implement input processors for itex2MML or blahtex. They basically rely on the same LaTeX-like syntax as the MathJax TeX input processor. Extensions that add LaTeX commands specific to these tools and not included in the default TeX input jax sound more appropriate.
    • peter: Good point. I was also thinking about possibly different syntax, say Maple, but that's out of scope anyway.

MathML support

Missing features (from the documentation)

  • elementary math tags: mstack, mlongdiv, msgroup, msrow, mscarries, and mscarry.
  • alignment groups in tables
  • right-to-left rendering
  • annotation-xml (to include non-MathML content, e.g., SVG; in particular in EPUB?)
  • complete table attributes (e.g., columnspan and rowspan)

Other topics

  • Improve line-breaking
  • Investigate David Carlisle's Content to Presentation XSLT/JavaScript solution.
    • fred: Perhaps we could do the conversion Content MathML => Presentation MathML in the MathML input jax (maybe making this optional), not necessarily using the XSLT (not sure whether it is available in all browsers).
    • peter: Sure, I don't know where it fits best. From what I can google, all our supported browsers do XSLT 1.0, but I have no idea how it performs.
    • fred: If they all support XSLT 1.0, then I think it would be very easy to add a configuration option in the MathML input jax to call David Carlisle's stylesheet before processing. Given that Content MathML is not a top priority, that would just give a convenient option for authors who need it on their pages without impacting other users. And we don't have to worry too much about performance or an imperfect content-to-presentation mapping, I guess. The idea would be to modify the DOM to have: <math><semantics>[presentation MathML output]<annotation-xml encoding="application/mathml-content+xml">[content MathML output]</annotation-xml></semantics></math> (no need to create a new semantics if it already exists).
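
The target structure described above can be sketched on a toy element tree. A real implementation would work on DOM nodes inside the MathML input jax; the `el`, `annotate`, and `serialize` helpers below are hypothetical and exist only to show the wrapping logic, including reuse of an existing semantics element.

```javascript
// Toy element tree standing in for DOM nodes (assumption for illustration).
const el = (name, attrs = {}, children = []) => ({ name, attrs, children });

// Wrap presentation MathML and attach the content MathML as an annotation.
function annotate(presentation, content) {
  const annotation = el(
    "annotation-xml",
    { encoding: "application/mathml-content+xml" },
    [content]
  );
  if (presentation.name === "semantics") {
    // A semantics element already exists: add the annotation to it.
    const merged = el("semantics", presentation.attrs, [
      ...presentation.children,
      annotation,
    ]);
    return el("math", {}, [merged]);
  }
  return el("math", {}, [el("semantics", {}, [presentation, annotation])]);
}

// Minimal serializer; text children are emitted as-is (no escaping).
function serialize(node) {
  if (typeof node === "string") return node;
  const attrs = Object.entries(node.attrs)
    .map(([key, value]) => ` ${key}="${value}"`)
    .join("");
  const inner = node.children.map(serialize).join("");
  return `<${node.name}${attrs}>${inner}</${node.name}>`;
}
```

Here the presentation tree is assumed to be the output of the stylesheet and the content tree the author's original markup.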