MathML parser and builder used in Plurimath.

Mml: MathML parser and builder

Purpose

Mml provides MathML 2, MathML 3, and MathML 4 XML parsing and serialization for Ruby. It maps the full MathML element set into Ruby model classes using the lutaml-model framework and is used by Plurimath for mathematical formula representation.

Key features:

  • Multiple MathML version support: Separate class hierarchies for MathML 2, MathML 3, and MathML 4 with explicit version selection

  • Round-trip fidelity: Parse XML to an object graph, modify, and serialize back

  • Namespace handling: Default xmlns, prefixed mml:, and namespace-less input

  • Opal support: Runs in the browser via Ruby-to-JavaScript compilation

Installation

# In your Gemfile
gem 'mml'

$ bundle install
# or
$ gem install mml

Quick start

require "mml"

# Parse MathML with explicit version
math = Mml.parse('<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math>', version: 3)
math4 = Mml.parse(input, version: 4)

# Serialize back to XML
math.to_xml

# Or use versioned modules directly
Mml::V3.parse(input)
Mml::V4.parse(input)

MathML version architecture

Mml maintains three parallel class hierarchies under Mml::V2, Mml::V3, and Mml::V4. All versions share the same namespace URI (http://www.w3.org/1998/Math/MathML).

                ┌───────────────────────────────────────────┐
                │                    Mml                    │
                │   parse() delegates to V2 / V3 / V4       │
                └────────────────────┬──────────────────────┘
                                     │
           ┌─────────────────────────┼─────────────────────────┐
           │                         │                         │
      ┌────┴────┐               ┌────┴────┐               ┌────┴────┐
      │ Mml::V2 │               │ Mml::V3 │               │ Mml::V4 │
      └────┬────┘               └────┬────┘               └────┬────┘
           │                         │                         │
┌──────────┴──────────┐   ┌──────────┴──────────┐   ┌──────────┴──────────┐
│   V2-only classes   │   │     Shared          │   │   V4-only classes   │
│ (declare, reln, …)  │   │   Base::modules     │   │ + intent, arg,      │
└─────────────────────┘   └─────────────────────┘   │   displaystyle,     │
                                                    │   scriptlevel,      │
                                                    │   + <a> element     │
                                                    └─────────────────────┘

                    ┌───────────────────────────────┐
                    │          lib/mml/base/        │
                    │     (shared attributes)       │
                    └───────────────────────────────┘

V2 (lib/mml/v2/): Standalone class hierarchy with full Content MathML support. Includes deprecated elements like declare and reln not present in V3/V4.

V3 (lib/mml/v3/): Uses shared lib/mml/base/ modules for presentation elements. Adds overflow attribute and V3-specific features.

V4 (lib/mml/v4/): Uses shared lib/mml/base/ modules with V4-only attributes (intent, arg, displaystyle, scriptlevel) and the <a> hyperlink element.

Version selection

Mml.parse(input)              # Default: MathML 3 (Mml::V3)
Mml.parse(input, version: 2)  # Explicit MathML 2
Mml.parse(input, version: 3)  # Explicit MathML 3
Mml.parse(input, version: 4)  # Explicit MathML 4
Mml::V2.parse(input)          # Direct v2 parsing
Mml::V3.parse(input)          # Direct v3 parsing
Mml::V4.parse(input)          # Direct v4 parsing
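
The delegation described above can be pictured with a small stand-alone sketch. This is not the gem's implementation: the DispatchSketch module, its nested V2/V3/V4 stand-ins, and the ArgumentError raised for an unknown version are all assumptions made for illustration. Only the version: keyword and the default of 3 come from the text above.

```ruby
# Hypothetical sketch of version dispatch; names are illustrative only.
module DispatchSketch
  # Stand-ins for the versioned modules. The real modules build model
  # objects; these just tag the input with the version that handled it.
  module V2
    def self.parse(input)
      [:v2, input]
    end
  end

  module V3
    def self.parse(input)
      [:v3, input]
    end
  end

  module V4
    def self.parse(input)
      [:v4, input]
    end
  end

  PARSERS = { 2 => V2, 3 => V3, 4 => V4 }.freeze

  # Delegates to the parser registered for `version` (default: 3).
  # Raising on an unknown version is an assumption of this sketch.
  def self.parse(input, version: 3)
    parser = PARSERS.fetch(version) do
      raise ArgumentError, "unsupported MathML version: #{version.inspect}"
    end
    parser.parse(input)
  end
end

DispatchSketch.parse("<math/>")             # handled by the V3 stand-in
DispatchSketch.parse("<math/>", version: 4) # handled by the V4 stand-in
```

A lookup table keeps the top-level entry point small and makes the default version a single keyword default.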

Key differences between MathML 2, 3 and 4

Feature                  MathML 3 additions            MathML 4 additions
-----------------------  ----------------------------  --------------------------------------
V3-only attributes       overflow, linebreakmultchar   -
V4-universal attributes  -                             intent, arg, displaystyle, scriptlevel
V4 hyperlink element     -                             <a> with href, hreflang
V2 deprecated elements   declare, reln, fn             declare, reln, fn
Deprecated font attrs    fontfamily, fontweight, etc.  removed from strict V4

Migration from previous versions

require and configuration

The Mml module no longer aliases versioned constants. Use the explicit version namespace:

# Before (no longer supported)
require "mml/configuration"
Mml::Configuration.adapter = :nokogiri
Mml::Configuration.create_context(id: :custom_v3)
Mml::Math.new(...)

# After — explicit version
require "mml"
Mml::V3::Configuration.adapter = :nokogiri
Mml::V3::Configuration.create_context(
  id: :custom_v3,
  substitutions: [
    { from_type: Mml::V3::Mi, to_type: MyCustomMi }
  ]
)
Mml::V3.parse(input, context: :custom_v3)

# Or for MathML 4
Mml::V4::Configuration.adapter = :nokogiri
Mml::V4::Configuration.create_context(
  id: :custom_v4,
  substitutions: [
    { from_type: Mml::V4::Mi, to_type: MyCustomMi }
  ]
)
Mml::V4.parse(input, context: :custom_v4)

Element class references

All element classes live under their version namespace:

# Before (no longer supported)
Mml::Mi.new(value: "x")
Mml::Mrow.new(mi_value: [...])

# After
Mml::V3::Mi.new(value: "x")
Mml::V3::Mrow.new(mi_value: [...])

# Or for MathML 4
Mml::V4::Mi.new(value: "x", intent: "$x")
Mml::V4::Mrow.new(mi_value: [...])

Parsing

Mml.parse still works with a version: keyword, defaulting to version 3:

Mml.parse(input)              # unchanged — defaults to MathML 3
Mml.parse(input, version: 4)  # use MathML 4

Parsing and serialization

Parsing

# Default namespace
Mml::V3.parse('<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math>')

# Prefixed namespace
Mml::V3.parse('<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>x</mml:mi></mml:math>')

# No namespace (namespace injected internally)
Mml::V3.parse("<math><mi>x</mi></math>", namespace_exist: false)

# MathML 4
Mml::V4.parse(input)

Serialization

math.to_xml
# => "<math xmlns=\"http://www.w3.org/1998/Math/MathML\"><mi>x</mi></math>"

math.to_xml(prefix: true)
# => "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\"><mml:mi>x</mml:mi></mml:math>"

math.to_xml(declaration: false)
# => "<math xmlns=\"...\"><mi>x</mi></math>"

Round-trip (parse, modify, serialize)

math = Mml.parse('<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math>')
math.display = "block"
math.to_xml
# => "<math xmlns=\"http://www.w3.org/1998/Math/MathML\" display=\"block\"><mi>x</mi></math>"

Element reference

Element types

Token elements: mi, mn, mo, ms, mtext, mspace, mglyph

General layout: mrow, mfrac, msqrt, mroot, mstyle, merror, mpadded, mphantom, mfenced, menclose, maction

Script elements: msub, msup, msubsup, munder, mover, munderover, mmultiscripts, mprescripts

Table elements: mtable, mtr, mtd

Row and stack elements: mstack, msrow, mscarries, mscarry, msline, msgroup, mlongdiv

Semantic elements: mfraction, semantics

v4 only: a (hyperlink)

Deprecated: mlabeledtr, none (the classes exist but are hidden from CommonElements in V4)

Token elements (leaf nodes)

Token elements hold text content in the value attribute:

Mml::V3::Mi.new(value: "x")
Mml::V3::Mn.new(value: "42")
Mml::V3::Mo.new(value: "+")
Mml::V3::Ms.new(value: "text")
Mml::V3::Mtext.new(value: "label")
Mml::V3::Mspace.new(width: "1em")
Mml::V3::Mglyph.new(alt: "symbol")

Container elements

Container elements hold child elements via #{tag}_value collection attributes:

Mml::V3::Mrow.new(
  lutaml_register: Mml::V3::Configuration.context_id,
  mi_value: [Mml::V3::Mi.new(value: "x")],
  mo_value: [Mml::V3::Mo.new(value: "+")],
  mn_value: [Mml::V3::Mn.new(value: "1")],
)
# => <mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow>

Composing expressions

Build an expression tree by nesting elements:

Mml::V3::Math.new(
  lutaml_register: Mml::V3::Configuration.context_id,
  mfrac_value: [
    Mml::V3::Mfrac.new(
      lutaml_register: Mml::V3::Configuration.context_id,
      mi_value: [Mml::V3::Mi.new(value: "a"), Mml::V3::Mi.new(value: "b")],
    ),
  ],
)
# => <math><mfrac><mi>a</mi><mi>b</mi></mfrac></math>

Tables

Mml::V3::Mtable.new(
  lutaml_register: Mml::V3::Configuration.context_id,
  mtr_value: [
    Mml::V3::Mtr.new(
      lutaml_register: Mml::V3::Configuration.context_id,
      mtd_value: [
        Mml::V3::Mtd.new(
          lutaml_register: Mml::V3::Configuration.context_id,
          mi_value: [Mml::V3::Mi.new(value: "a")]
        ),
        Mml::V3::Mtd.new(
          lutaml_register: Mml::V3::Configuration.context_id,
          mi_value: [Mml::V3::Mi.new(value: "b")]
        ),
      ],
    ),
  ],
)

Hyperlinks (V4 only)

Mml::V4::A.new(
  lutaml_register: Mml::V4::Configuration.context_id,
  href: "https://example.com",
  hreflang: "en",
  mi_value: [Mml::V4::Mi.new(value: "click")]
)
# => <a href="https://example.com" hreflang="en"><mi>click</mi></a>

MathML V2 Support

V2 is a standalone implementation with its own class hierarchy in lib/mml/v2/. It includes full Content MathML support with elements not present in V3/V4:

# Parse MathML 2
Mml::V2.parse('<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math>')

# Content elements available in V2 (deprecated in V3/V4)
Mml::V2::Declare.new(...)
Mml::V2::Reln.new(...)
Mml::V2::Fn.new(...)

V2 uses Mml::V2::Configuration for context management:

Mml::V2::Configuration.context_id  # => :mml_v2
Mml::V2::Configuration.context
Mml::V2::Configuration.populate_context!

MathML V4 Compliance

This implementation has been audited against the MathML 4 W3C Recommendation.

Universal V4 Attributes

All MathML 4 presentation elements include intent, arg, displaystyle, scriptlevel, mathcolor, and mathbackground via the shared Base::V4Attributes module.

Legacy Schema Support

For backwards compatibility with existing MathML content, this gem supports both the strict V4 schema and the legacy schema:

Feature                                                           Status
----------------------------------------------------------------  -----------------------------------------------
Universal V4 attributes (intent, arg, displaystyle, scriptlevel)  Full support
mathcolor, mathbackground on all presentation elements            Full support
<a> hyperlink element                                             Full support
Deprecated font attributes (fontfamily, fontweight, etc.)         Legacy support (V3 + V4 legacy schema)
fence, separator on <mo>                                          Legacy support (removed from default V4 schema)
none element                                                      Deprecated in V4 (empty <mrow> recommended)
mlabeledtr element                                                Legacy support (removed from default V4 schema)

Internal architecture

Element class patterns

Shared attributes and mappings live in Base:: modules (lib/mml/base/). V3 and V4 classes include these modules independently — no cross-version inheritance.

  • Leaf elements: inherit Lutaml::Model::Serializable, include Base::ElementName

  • Container elements: inherit CommonElements, include Base::ElementName

Each element self-registers in its version’s built-in GlobalContext context.

# Shared attributes (lib/mml/base/mi.rb)
module Base::Mi
  def self.included(klass)
    klass.class_eval do
      attribute :value, :string
      xml do
        element "mi"
        map_content to: :value
      end
    end
  end
end

# V3 leaf
class V3::Mi < Lutaml::Model::Serializable
  include Base::Mi
end

# V4 leaf — adds V4-only attributes
class V4::Mi < Lutaml::Model::Serializable
  include Base::Mi
  attribute :intent, :string
end

# V3 container
class V3::Mrow < CommonElements
  include Base::Mrow
end

CommonElements

Container elements inherit CommonElements, which defines #{tag}_value collection attributes for all supported child elements. Attribute types use symbols (e.g., :mi, :mfrac) resolved through Lutaml::Model::GlobalContext.

V4’s CommonElements extends the base with the <a> hyperlink element.
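
To make the #{tag}_value convention concrete, here is a minimal plain-Ruby sketch with no lutaml-model dependency. ContainerSketch, ContainerSketchV4, and the CHILD_TAGS subset are hypothetical names chosen for illustration. The real CommonElements declares typed lutaml-model attributes rather than plain accessors.

```ruby
# Hypothetical base class: one `<tag>_value` array accessor per child tag.
class ContainerSketch
  CHILD_TAGS = %w[mi mn mo mrow mfrac].freeze # illustrative subset only

  CHILD_TAGS.each do |tag|
    # Reader returns an empty array until children are assigned.
    define_method("#{tag}_value") do
      instance_variable_get("@#{tag}_value") || []
    end

    # Writer always stores an array, even for a single child.
    define_method("#{tag}_value=") do |children|
      instance_variable_set("@#{tag}_value", Array(children))
    end
  end

  # Assigns each keyword through its writer, e.g. mi_value: [...].
  def initialize(**attrs)
    attrs.each { |name, value| public_send("#{name}=", value) }
  end
end

# A V4-style subclass adds one more collection, mirroring the <a> element.
class ContainerSketchV4 < ContainerSketch
  attr_writer :a_value

  def a_value
    @a_value || []
  end
end

row = ContainerSketch.new(mi_value: ["x"], mo_value: ["+"])
row.mi_value # => ["x"]
row.mn_value # => []
```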

Context and type resolution

When calling from_xml directly (outside of Mml.parse or Mml::V3.parse), pass the version-specific context id for correct type resolution.

Note: lutaml-model still uses the keyword name register: in low-level APIs. In MML, the value passed to that keyword should be a context id.

Mml::V3::Math.from_xml(input, register: Mml::V3::Configuration.context_id)
Mml::V4::Math.from_xml(input, register: Mml::V4::Configuration.context_id)

The parse methods handle this automatically.

When constructing container elements directly, also pass the context id on the instance via lutaml_register: so symbolic child types resolve in the right versioned context:

math = Mml::V3::Math.new(
  lutaml_register: Mml::V3::Configuration.context_id,
  mi_value: [Mml::V3::Mi.new(value: "x")]
)

math.to_xml

Namespace

All elements use the MathML namespace URI (http://www.w3.org/1998/Math/MathML). Three input forms are supported:

  • Default namespace: <math xmlns="http://www.w3.org/1998/Math/MathML">

  • Prefixed: <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">

  • No namespace: namespace is injected before parsing when namespace_exist: false
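
As a rough illustration of the third form, the sketch below adds a default xmlns to a bare root <math> tag by string substitution. The helper name inject_mathml_namespace is hypothetical, and the gem may inject the namespace differently. The sketch only shows the idea of normalizing namespace-less input before it reaches the parser.

```ruby
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical helper: adds a default xmlns to the root <math> tag when the
# root declares no namespace. Input that already has one is left unchanged.
def inject_mathml_namespace(xml)
  # Matches <math ... xmlns="..."> and <mml:math ... xmlns:mml="...">.
  return xml if xml.match?(/<(?:\w+:)?math\b[^>]*\sxmlns(?::\w+)?=/)

  xml.sub(/<math\b/, %(<math xmlns="#{MATHML_NS}"))
end

inject_mathml_namespace("<math><mi>x</mi></math>")
# => "<math xmlns=\"http://www.w3.org/1998/Math/MathML\"><mi>x</mi></math>"
```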

Configuration

# Switch XML adapter (default: :ox, :oga on Opal)
Mml::V3::Configuration.adapter = :nokogiri

# Access the built-in version-specific contexts
Mml::V3::Configuration.context_id # => :mml_v3
Mml::V4::Configuration.context_id # => :mml_v4
Mml::V3::Configuration.context
Mml::V4::Configuration.context

# Rebuild a built-in context after an explicit GlobalContext.reset!
Mml::V3::Configuration.populate_context!
Mml::V4::Configuration.populate_context!

# Create a derived context with substitutions
Mml::V3::Configuration.create_context(
  id: :custom_v3,
  substitutions: [
    { from_type: Mml::V3::Mi, to_type: MyCustomMi }
  ]
)

# Parse using the custom context
Mml::V3.parse(input, context: :custom_v3)

# Low-level APIs still use the upstream keyword name `register:`
Mml::V3::Math.from_xml(input, register: :custom_v3)

The context: keyword is the preferred MML API. The legacy register: keyword is still accepted temporarily in MML parse methods, but it emits a deprecation warning and is normalized to a context id internally.
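
One way to picture that normalization is the sketch below. The helper normalize_context_id, its default: parameter, and the warning text are assumptions for illustration and are not the gem's code. Letting context: win when both keywords are given is also a choice made for this sketch.

```ruby
# Hypothetical helper: prefers `context:`, still accepts legacy `register:`
# with a deprecation warning, and always returns a symbol context id.
def normalize_context_id(context: nil, register: nil, default: :mml_v3)
  if register
    warn "[sketch] `register:` is deprecated; use `context:` instead"
    context ||= register # `context:` wins when both are given
  end
  (context || default).to_sym
end

normalize_context_id(context: :custom_v3)  # => :custom_v3
normalize_context_id(register: :custom_v3) # => :custom_v3, warns on stderr
normalize_context_id                       # => :mml_v3
```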

If you reset global contexts and need the built-in MML contexts restored explicitly, call populate_context! for the version(s) you want to restore.
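
The relationship between built-in contexts, derived contexts, and a reset can be sketched with a toy registry. ContextStoreSketch, SketchMi, and SketchCustomMi are hypothetical names. This is not the lutaml-model GlobalContext API. It only models the idea that a derived context copies the base mapping with substitutions applied, and that a reset leaves nothing to resolve until the built-in mapping is registered again.

```ruby
# Hypothetical stand-in element classes.
class SketchMi; end
class SketchCustomMi < SketchMi; end

# Toy registry mapping context ids to { type_symbol => class }.
class ContextStoreSketch
  def initialize
    @contexts = {}
  end

  def register(id, types)
    @contexts[id] = types.dup
  end

  # Copies `base`, replacing each from_type class with its to_type.
  def derive(id:, base:, substitutions: [])
    types = @contexts.fetch(base).transform_values do |klass|
      match = substitutions.find { |s| s[:from_type] == klass }
      match ? match[:to_type] : klass
    end
    register(id, types)
  end

  # Raises KeyError when the context or the type is unknown.
  def resolve(id, type)
    @contexts.fetch(id).fetch(type)
  end

  def reset!
    @contexts.clear
  end
end

store = ContextStoreSketch.new
store.register(:mml_v3, { mi: SketchMi })
store.derive(
  id: :custom_v3,
  base: :mml_v3,
  substitutions: [{ from_type: SketchMi, to_type: SketchCustomMi }]
)
store.resolve(:custom_v3, :mi) # => SketchCustomMi
store.resolve(:mml_v3, :mi)    # => SketchMi
```

In this sketch, calling reset! and then resolve raises KeyError until the built-in mapping is registered again, which is the role populate_context! plays for the built-in contexts described above.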

Unsupported Features

The following MathML test suite files are intentionally skipped (not failures) because they use features that are not part of the MathML namespace:

Feature                                                                          Reason                                                     Approx. Tests Skipped
-------------------------------------------------------------------------------  ---------------------------------------------------------  ---------------------
HTML attributes (style, class, id, dir, mode, tabindex, data-*, event handlers)  These are HTML/XML attributes, not MathML attributes       Varies by test suite
HTML <span> elements                                                             HTML elements are not part of MathML namespace             Varies by test suite
XML comments                                                                     XML comments inside MathML elements are not supported      Varies by test suite
Foreign content in annotation-xml                                                SVG, XHTML content inside annotation-xml is not supported  Varies by test suite
Entity references                                                                Named entity references other than standard XML entities   Varies by test suite

These tests are filtered out via UNSUPPORTED_PATTERNS in the test configuration and do not represent bugs. They are marked as "pending" in test output because RSpec’s skip directive still records them as pending tests.

Version-Specific Attributes

Some MathML attributes are version-specific and only available on the appropriate version:

Attribute             Element    Notes
--------------------  ---------  --------------------------------------
overflow              math       MathML 3 only (line overflow behavior)
linebreakmultchar     math       MathML 3 only (line break character)
scriptsizemultiplier  mscarries  Float type for fractional scaling

Content Elements

The following Content MathML elements are supported for cross-content markup:

  • cn - numeric content

  • ci - identifier content

  • csymbol - symbolic content (with presentation element support: msub, msup, mrow, etc.)

  • cs - string content

  • cbytes - bytes content with encoding attribute

  • apply, bind, bvar - function application and binding

  • semantics, annotation, annotation-xml - semantic annotations

Test Suite and Fixtures

The gem uses multiple MathML test suites to validate parsing and serialization:

Test Suite                            Version  Description
------------------------------------  -------  --------------------------------------------------------------
spec/fixtures/mml2-testsuite/         V2       W3C MathML 2 test suite with Content and Presentation elements
spec/fixtures/mml3-testsuite/         V3       W3C MathML 3 test suite
spec/fixtures/mmlcore-testsuite/      V4       WPT MathML Core tests (modern browser implementation)
spec/fixtures/v2/, spec/fixtures/v4/  V2/V4    Hand-crafted fixtures for version-specific features

Running Tests

bundle exec rake        # Run all specs + rubocop
bundle exec rspec       # Run all tests
bundle exec rspec spec/mml/v3_spec.rb  # Run specific test file
bundle exec rspec --only-failures       # Run only previously failing tests

Test Fixture Processing

Some test suites require preprocessing to extract clean MathML from HTML wrappers:

# Preprocess test fixtures (strips HTML wrappers, extracts MathML)
rake spec:preprocess_fixtures

# Validate preprocessed fixtures against XSD schemas
rake spec:validate_cleaned_fixtures

# Both in sequence
rake spec:prepare

The preprocessed fixtures are stored in tmp/cleaned_fixtures/ and are excluded from git.

Unsupported Test Patterns

Tests are filtered via UNSUPPORTED_PATTERNS when they contain:

  • HTML elements/attributes (<span>, style=, class=, etc.)

  • XML comments inside elements

  • Foreign content (SVG in annotation-xml)

  • Entity references not handled by the parser

  • Content elements not supported in presentation context (V3/V4)

These are marked as pending, not failures, because they represent features outside the MathML namespace.
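
A pattern-based filter of this kind could look like the sketch below. The constant SKETCH_UNSUPPORTED_PATTERNS and the helper unsupported_fixture? are illustrative assumptions. The gem's actual UNSUPPORTED_PATTERNS list is not reproduced here and may differ.

```ruby
# Illustrative patterns only; not the gem's actual UNSUPPORTED_PATTERNS.
SKETCH_UNSUPPORTED_PATTERNS = [
  /<span\b/,                               # HTML elements
  /\s(?:style|class)=/,                    # HTML attributes
  /<!--/,                                  # XML comments
  /<svg\b/,                                # foreign content such as SVG
  /&(?!(?:amp|lt|gt|quot|apos);)\w+;/      # named entities beyond the XML five
].freeze

# True when the fixture text matches any unsupported pattern.
def unsupported_fixture?(xml)
  SKETCH_UNSUPPORTED_PATTERNS.any? { |pattern| xml.match?(pattern) }
end

unsupported_fixture?("<math><span>x</span></math>") # => true
unsupported_fixture?("<math><mi>x</mi></math>")     # => false
```

In an RSpec suite, a check like this would typically guard a skip call so that filtered fixtures show up as pending rather than failing.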

Known Pending Issues

Issue                                Affected Tests
-----------------------------------  ------------------------------------------------------
lutaml-model Unicode NCR comparison  V2 tests with spacing characters
Parser schema validation             V3 ErrorHandling tests (parser accepts invalid MathML)

Development

rake                    # Run specs + rubocop
bundle exec rspec       # Run tests
bundle exec rubocop     # Lint
bin/console             # Interactive IRB

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/plurimath/mml.

Copyright Ribose Inc.