Organic - Free Range Org-Mode
Organic is an emacs-less implementation of an org-mode parser.
This project is still under HEAVY development. While the version remains v0.1.x the API will be changing often. Once we hit v0.2.x we will start following semver.
Currently, the parser is able to correctly identify the start/end bounds of all the org-mode objects and elements (except table.el tables, org-mode tables are supported) but many of the interior properties are not yet populated.
- We aim to provide perfect parity with the emacs org-mode parser. In that regard, any document that parses differently between Emacs and Organic is considered a bug.
- The parser should be fast. We're not doing anything special, but since this is written in Rust and natively compiled we should be able to beat the existing parsers.
- The parser should have minimal dependencies. This should reduce effort w.r.t.: security audits, legal compliance, portability.
- The parser should be usable everywhere. In the interest of getting org-mode used in as many places as possible, this parser should be usable by everyone everywhere. This means:
- It must have a permissive license for use in proprietary code bases.
- We will investigate compiling to WASM. This is an important goal of the project and will definitely happen, but only after the parser has a more stable API.
- We will investigate compiling to a C library for native linking to other code. This is more of a maybe-goal for the project.
- This project will not include an elisp engine since that would drastically increase the complexity of the code. Any features requiring an elisp engine will not be implemented (for example, Emacs supports embedded eval expressions in documents but this parser will never support that).
- This project is exclusively an org-mode parser. This limits its scope to roughly the output of
(org-element-parse-buffer). It will not render org-mode documents in other formats like HTML or LaTeX.
- table.el support. Currently we support org-mode tables but org-mode also allows table.el tables. So far, their use in org-mode documents seems rather uncommon so this is a low-priority feature.
- Document editing support. I do not anticipate any advanced editing features to make editing ergonomic, but it should be relatively easy to be able to parse an org-mode document and serialize it back into org-mode. This would enable cool features to be built on top of the library like auto-formatters. To accomplish this feature, We'd have to capture all of the various separators and whitespace that we are currently simply throwing away. This would add many additional fields to the parsed structs and it would add more noise to the parsers themselves, so I do not want to approach this feature until the parser is more complete since it would make modifications and refactoring more difficult.
This project targets the version of Emacs and Org-mode that are built into the organic-test docker image. This is newer than the version of Org-mode that shipped with Emacs 29.1. The parser itself does not depend on Emacs or Org-mode though, so this only matters for development purposes when running the automated tests that compare against upstream Org-mode.
Using this library
TODO: Add section on using Organic as a library (which is the intended use for this project). This will be added when we have a bit more API stability since currently the library is under heavy development.
The parse binary
This program takes org-mode input either streamed in on stdin or as paths to files passed in as arguments. It then parses them using Organic and dumps the result to stdout. This program is intended solely as a development tool. Examples:
cat /foo/bar.org | cargo run --bin parse
cargo build --profile release-lto ./target/release-lto/parse /foo/bar.org /lorem/ipsum.org
The compare binary
This program takes org-mode input either streamed in on stdin or as paths to files passed in as arguments. It then parses them using Organic and the official Emacs Org-mode parser and compares the parse result. This program is intended solely as a development tool. Since org-mode is a moving target, it is recommended that you run this through docker since we pin the version of org-mode to a specific revision. Examples:
cat /foo/bar.org | ./scripts/run_docker_compare.bash
./scripts/run_docker_compare.bash /foo/bar.org /lorem/ipsum.org
Not recommended since it is not through docker:
cat /foo/bar.org | cargo run --features compare --bin compare
cargo build --profile release-lto --features compare ./target/release-lto/compare /foo/bar.org /lorem/ipsum.org
Running the tests
There are three levels of tests for this repository: the standard tests, the autogenerated tests, and the foreign document tests.
The standard tests
These are regular hand-written rust tests. These can be run with:
The auto-generated tests
These tests are automatically generated from the files in the
org_mode_samples directory and they are still integrated with the rust/cargo testing framework. For each org-mode document in that folder, a test is generated that will parse the document with both Organic and the official Emacs Org-mode parser and then it will compare the parse results. Any deviation is considered a failure. Since org-mode is a moving target, it is recommended that you run these tests inside docker since the
organic-test docker image is pinned to a specific revision of org-mode. These can be run with:
The foreign document tests
These tests function the same as the auto-generated tests except they are not integrated with the rust/cargo testing framework and they involve comparing the parse of org-mode documents that live outside this repository. This allows us to test against a far greater variety of org-mode input documents without pulling massive sets of org-mode documents into this repository. The recommended way to run these tests is still through docker because it pins org-mode and the test documents to specific git revisions. These can be run with:
This project is released under the public-domain-equivalent 0BSD license, however, this project has a couple permissively licensed non-public-domain-equivalent dependencies which require their copyright notices and/or license texts to be included. I am not a lawyer and this is not legal advice but it is my layperson's understanding that if you distribute a binary statically linking this library, you will need to abide by their terms since their code will also be linked in your binary.