17 Commits

Author SHA1 Message Date
Tom Alexander
dd91e506bd Merge branch 'scan_optimization'
All checks were successful
rustfmt Build rustfmt has succeeded
rust-test Build rust-test has succeeded
rust-build Build rust-build has succeeded
rust-foreign-document-test Build rust-foreign-document-test has succeeded
2023-09-24 03:11:46 -04:00
Tom Alexander
cd781a7dcf Add simple test to prove the scan for in-buffer settings is still working. 2023-09-24 03:09:51 -04:00
Tom Alexander
8cd0e4ec63 Optimize scanning for in-buffer settings by scanning forward for possible keywords.
Previously we stepped through the document character by character which involved a lot of extra processing inside OrgSource. By scanning for possible keywords, we can skip many of the intermediate steps.
2023-09-24 02:58:32 -04:00
Tom Alexander
f9460b88d7 Add a TODO for a performance optimization. 2023-09-24 01:59:26 -04:00
Tom Alexander
0b2a5f4fbf Change all runtime asserts in private functions to debug_assert.
All checks were successful
rustfmt Build rustfmt has succeeded
rust-test Build rust-test has succeeded
rust-build Build rust-build has succeeded
rust-foreign-document-test Build rust-foreign-document-test has succeeded
These functions aren't exposed to the public so we can confidently say that if they work in dev then they will work in production. Removing these asserts theoretically should result in a speedup.
2023-09-23 21:17:58 -04:00
Tom Alexander
6097e4df18 Merge branch 'standard_properties'
All checks were successful
rustfmt Build rustfmt has succeeded
rust-test Build rust-test has succeeded
rust-build Build rust-build has succeeded
rust-foreign-document-test Build rust-foreign-document-test has succeeded
2023-09-23 21:12:40 -04:00
Tom Alexander
d5b1014fe4 Unify the standard properties checks in diff.
Instead of copy+pasting them into each compare function, we now call a shared function from a handful of places.
2023-09-23 21:05:56 -04:00
Tom Alexander
dd8a8207ce Move assert bounds for elements and objects (except PlainText) to the compare element/object functions. 2023-09-23 19:35:12 -04:00
Tom Alexander
b4c985071c Add a GetStandardProperties trait. 2023-09-23 19:13:01 -04:00
Tom Alexander
d4f27ef297 Remove only use of Source trait. 2023-09-23 17:59:13 -04:00
Tom Alexander
f25246556c Rename the existing StandardProperties struct to EmacsStandardProperties. 2023-09-23 17:44:54 -04:00
Tom Alexander
3fe56e9aa3 Implement StandardProperties for all the AST nodes and restrict the Source trait to this crate.
Currently this is a copy of the Source trait but it will grow to more functions. The Source trait is restricted to this crate in anticipation of its removal in favor of StandardProperties.
2023-09-23 17:42:27 -04:00
Tom Alexander
f180412ff3 Introduce a StandardProperties trait. 2023-09-23 17:33:46 -04:00
Tom Alexander
f0e28206ff Add a supported versions section to the README.
All checks were successful
rustfmt Build rustfmt has succeeded
rust-test Build rust-test has succeeded
rust-build Build rust-build has succeeded
rust-foreign-document-test Build rust-foreign-document-test has succeeded
2023-09-23 14:54:56 -04:00
Tom Alexander
1f64e289a2 Add TODOs for all of the properties that need to be compared.
Some checks failed
rust-foreign-document-test Build rust-foreign-document-test has started
rustfmt Build rustfmt has failed
rust-build Build rust-build has failed
rust-test Build rust-test has failed
2023-09-23 14:46:36 -04:00
Tom Alexander
f7690ff64b Remove an allocation for lesser block end. 2023-09-22 00:55:10 -04:00
Tom Alexander
bd5e50d558 Remove TODO.
All checks were successful
rustfmt Build rustfmt has succeeded
rust-test Build rust-test has succeeded
rust-build Build rust-build has succeeded
rust-foreign-document-test Build rust-foreign-document-test has succeeded
I tested and we cannot nest different types of dynamic blocks.
2023-09-21 23:58:41 -04:00
28 changed files with 1271 additions and 1019 deletions


@@ -4,10 +4,31 @@ Organic is an emacs-less implementation of an [org-mode](https://orgmode.org/) p
## Project Status
This project is a personal learning project to grow my experience in [rust](https://www.rust-lang.org/). It is under development and at this time I would not recommend anyone use this code. The goal is to turn this into a project others can use, at which point more information will appear in this README.
This project is still under HEAVY development. While the version remains v0.1.x the API will be changing often. Once we hit v0.2.x we will start following semver.
Currently, the parser is able to correctly identify the start/end bounds of all the org-mode objects and elements (except table.el tables, org-mode tables are supported) but many of the interior properties are not yet populated.
### Project Goals
- We aim to provide perfect parity with the emacs org-mode parser. In that regard, any document that parses differently between Emacs and Organic is considered a bug.
- The parser should be fast. We're not doing anything special, but since this is written in Rust and natively compiled we should be able to beat the existing parsers.
- The parser should have minimal dependencies. This should reduce effort w.r.t.: security audits, legal compliance, portability.
- The parser should be usable everywhere. In the interest of getting org-mode used in as many places as possible, this parser should be usable by everyone everywhere. This means:
  - It must have a permissive license for use in proprietary code bases.
  - We will investigate compiling to WASM. This is an important goal of the project and will definitely happen, but only after the parser has a more stable API.
  - We will investigate compiling to a C library for native linking to other code. This is more of a maybe-goal for the project.
### Project Non-Goals
- This project will not include an elisp engine since that would drastically increase the complexity of the code. Any features requiring an elisp engine will not be implemented (for example, Emacs supports embedded eval expressions in documents but this parser will never support that).
- This project is exclusively an org-mode **parser**. This limits its scope to roughly the output of `(org-element-parse-buffer)`. It will not render org-mode documents in other formats like HTML or LaTeX.
### Project Maybe-Goals
- table.el support. Currently we support org-mode tables but org-mode also allows table.el tables. So far, their use in org-mode documents seems rather uncommon so this is a low-priority feature.
- Document editing support. I do not anticipate any advanced editing features to make editing ergonomic, but it should be relatively easy to be able to parse an org-mode document and serialize it back into org-mode. This would enable cool features to be built on top of the library like auto-formatters. To accomplish this feature, we'd have to capture all of the various separators and whitespace that we are currently simply throwing away. This would add many additional fields to the parsed structs and it would add more noise to the parsers themselves, so I do not want to approach this feature until the parser is more complete since it would make modifications and refactoring more difficult.
### Supported Versions
This project targets the versions of Emacs and Org-mode that are built into the [organic-test docker image](docker/organic_test/Dockerfile). This is newer than the version of Org-mode that shipped with Emacs 29.1. The parser itself does not depend on Emacs or Org-mode though, so this only matters for development purposes when running the automated tests that compare against upstream Org-mode.
## Using this library
TODO: Add section on using Organic as a library (which is the intended use for this project).
TODO: Add section on using Organic as a library (which is the intended use for this project). This will be added when we have a bit more API stability since currently the library is under heavy development.
## Development
### The parse binary
This program takes org-mode input either streamed in on stdin or as paths to files passed in as arguments. It then parses them using Organic and dumps the result to stdout. This program is intended solely as a development tool. Examples:

File diff suppressed because it is too large.

src/compare/elisp_fact.rs (new file, 474 additions)

@@ -0,0 +1,474 @@
use std::borrow::Cow;
use crate::types::AngleLink;
use crate::types::BabelCall;
use crate::types::Bold;
use crate::types::Citation;
use crate::types::CitationReference;
use crate::types::Clock;
use crate::types::Code;
use crate::types::Comment;
use crate::types::CommentBlock;
use crate::types::DiarySexp;
use crate::types::Document;
use crate::types::Drawer;
use crate::types::DynamicBlock;
use crate::types::Element;
use crate::types::Entity;
use crate::types::ExampleBlock;
use crate::types::ExportBlock;
use crate::types::ExportSnippet;
use crate::types::FixedWidthArea;
use crate::types::FootnoteDefinition;
use crate::types::FootnoteReference;
use crate::types::GreaterBlock;
use crate::types::Heading;
use crate::types::HorizontalRule;
use crate::types::InlineBabelCall;
use crate::types::InlineSourceBlock;
use crate::types::Italic;
use crate::types::Keyword;
use crate::types::LatexEnvironment;
use crate::types::LatexFragment;
use crate::types::LineBreak;
use crate::types::NodeProperty;
use crate::types::Object;
use crate::types::OrgMacro;
use crate::types::Paragraph;
use crate::types::PlainLink;
use crate::types::PlainList;
use crate::types::PlainListItem;
use crate::types::PlainText;
use crate::types::Planning;
use crate::types::PropertyDrawer;
use crate::types::RadioLink;
use crate::types::RadioTarget;
use crate::types::RegularLink;
use crate::types::Section;
use crate::types::SrcBlock;
use crate::types::StatisticsCookie;
use crate::types::StrikeThrough;
use crate::types::Subscript;
use crate::types::Superscript;
use crate::types::Table;
use crate::types::TableCell;
use crate::types::TableRow;
use crate::types::Target;
use crate::types::Timestamp;
use crate::types::Underline;
use crate::types::Verbatim;
use crate::types::VerseBlock;
pub(crate) trait ElispFact<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str>;
}
pub(crate) trait GetElispFact<'s> {
fn get_elisp_fact(&'s self) -> &'s dyn ElispFact<'s>;
}
impl<'s, I: ElispFact<'s>> GetElispFact<'s> for I {
fn get_elisp_fact(&'s self) -> &'s dyn ElispFact<'s> {
self
}
}
impl<'s> GetElispFact<'s> for Element<'s> {
fn get_elisp_fact(&'s self) -> &'s dyn ElispFact<'s> {
match self {
Element::Paragraph(inner) => inner,
Element::PlainList(inner) => inner,
Element::GreaterBlock(inner) => inner,
Element::DynamicBlock(inner) => inner,
Element::FootnoteDefinition(inner) => inner,
Element::Comment(inner) => inner,
Element::Drawer(inner) => inner,
Element::PropertyDrawer(inner) => inner,
Element::Table(inner) => inner,
Element::VerseBlock(inner) => inner,
Element::CommentBlock(inner) => inner,
Element::ExampleBlock(inner) => inner,
Element::ExportBlock(inner) => inner,
Element::SrcBlock(inner) => inner,
Element::Clock(inner) => inner,
Element::DiarySexp(inner) => inner,
Element::Planning(inner) => inner,
Element::FixedWidthArea(inner) => inner,
Element::HorizontalRule(inner) => inner,
Element::Keyword(inner) => inner,
Element::BabelCall(inner) => inner,
Element::LatexEnvironment(inner) => inner,
}
}
}
impl<'s> GetElispFact<'s> for Object<'s> {
fn get_elisp_fact(&'s self) -> &'s dyn ElispFact<'s> {
match self {
Object::Bold(inner) => inner,
Object::Italic(inner) => inner,
Object::Underline(inner) => inner,
Object::StrikeThrough(inner) => inner,
Object::Code(inner) => inner,
Object::Verbatim(inner) => inner,
Object::PlainText(inner) => inner,
Object::RegularLink(inner) => inner,
Object::RadioLink(inner) => inner,
Object::RadioTarget(inner) => inner,
Object::PlainLink(inner) => inner,
Object::AngleLink(inner) => inner,
Object::OrgMacro(inner) => inner,
Object::Entity(inner) => inner,
Object::LatexFragment(inner) => inner,
Object::ExportSnippet(inner) => inner,
Object::FootnoteReference(inner) => inner,
Object::Citation(inner) => inner,
Object::CitationReference(inner) => inner,
Object::InlineBabelCall(inner) => inner,
Object::InlineSourceBlock(inner) => inner,
Object::LineBreak(inner) => inner,
Object::Target(inner) => inner,
Object::StatisticsCookie(inner) => inner,
Object::Subscript(inner) => inner,
Object::Superscript(inner) => inner,
Object::Timestamp(inner) => inner,
}
}
}
impl<'s> ElispFact<'s> for Document<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"org-data".into()
}
}
impl<'s> ElispFact<'s> for Section<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"section".into()
}
}
impl<'s> ElispFact<'s> for Heading<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"headline".into()
}
}
impl<'s> ElispFact<'s> for PlainList<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"plain-list".into()
}
}
impl<'s> ElispFact<'s> for PlainListItem<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"item".into()
}
}
impl<'s> ElispFact<'s> for GreaterBlock<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
match self.name.to_lowercase().as_str() {
"center" => "center-block".into(),
"quote" => "quote-block".into(),
_ => "special-block".into(),
}
}
}
impl<'s> ElispFact<'s> for DynamicBlock<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"dynamic-block".into()
}
}
impl<'s> ElispFact<'s> for FootnoteDefinition<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"footnote-definition".into()
}
}
impl<'s> ElispFact<'s> for Drawer<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"drawer".into()
}
}
impl<'s> ElispFact<'s> for PropertyDrawer<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"property-drawer".into()
}
}
impl<'s> ElispFact<'s> for NodeProperty<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"node-property".into()
}
}
impl<'s> ElispFact<'s> for Table<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"table".into()
}
}
impl<'s> ElispFact<'s> for TableRow<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"table-row".into()
}
}
impl<'s> ElispFact<'s> for Paragraph<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"paragraph".into()
}
}
impl<'s> ElispFact<'s> for TableCell<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"table-cell".into()
}
}
impl<'s> ElispFact<'s> for Comment<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"comment".into()
}
}
impl<'s> ElispFact<'s> for VerseBlock<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"verse-block".into()
}
}
impl<'s> ElispFact<'s> for CommentBlock<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"comment-block".into()
}
}
impl<'s> ElispFact<'s> for ExampleBlock<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"example-block".into()
}
}
impl<'s> ElispFact<'s> for ExportBlock<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"export-block".into()
}
}
impl<'s> ElispFact<'s> for SrcBlock<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"src-block".into()
}
}
impl<'s> ElispFact<'s> for Clock<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"clock".into()
}
}
impl<'s> ElispFact<'s> for DiarySexp<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"diary-sexp".into()
}
}
impl<'s> ElispFact<'s> for Planning<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"planning".into()
}
}
impl<'s> ElispFact<'s> for FixedWidthArea<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"fixed-width".into()
}
}
impl<'s> ElispFact<'s> for HorizontalRule<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"horizontal-rule".into()
}
}
impl<'s> ElispFact<'s> for Keyword<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"keyword".into()
}
}
impl<'s> ElispFact<'s> for BabelCall<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"babel-call".into()
}
}
impl<'s> ElispFact<'s> for LatexEnvironment<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"latex-environment".into()
}
}
impl<'s> ElispFact<'s> for Bold<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"bold".into()
}
}
impl<'s> ElispFact<'s> for Italic<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"italic".into()
}
}
impl<'s> ElispFact<'s> for Underline<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"underline".into()
}
}
impl<'s> ElispFact<'s> for StrikeThrough<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"strike-through".into()
}
}
impl<'s> ElispFact<'s> for Code<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"code".into()
}
}
impl<'s> ElispFact<'s> for Verbatim<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"verbatim".into()
}
}
impl<'s> ElispFact<'s> for RegularLink<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"link".into()
}
}
impl<'s> ElispFact<'s> for RadioLink<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"link".into()
}
}
impl<'s> ElispFact<'s> for RadioTarget<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"radio-target".into()
}
}
impl<'s> ElispFact<'s> for PlainLink<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"link".into()
}
}
impl<'s> ElispFact<'s> for AngleLink<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"link".into()
}
}
impl<'s> ElispFact<'s> for OrgMacro<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"macro".into()
}
}
impl<'s> ElispFact<'s> for Entity<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"entity".into()
}
}
impl<'s> ElispFact<'s> for LatexFragment<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"latex-fragment".into()
}
}
impl<'s> ElispFact<'s> for ExportSnippet<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"export-snippet".into()
}
}
impl<'s> ElispFact<'s> for FootnoteReference<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"footnote-reference".into()
}
}
impl<'s> ElispFact<'s> for Citation<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"citation".into()
}
}
impl<'s> ElispFact<'s> for CitationReference<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"citation-reference".into()
}
}
impl<'s> ElispFact<'s> for InlineBabelCall<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"inline-babel-call".into()
}
}
impl<'s> ElispFact<'s> for InlineSourceBlock<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"inline-src-block".into()
}
}
impl<'s> ElispFact<'s> for LineBreak<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"line-break".into()
}
}
impl<'s> ElispFact<'s> for Target<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"target".into()
}
}
impl<'s> ElispFact<'s> for StatisticsCookie<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"statistics-cookie".into()
}
}
impl<'s> ElispFact<'s> for Subscript<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"subscript".into()
}
}
impl<'s> ElispFact<'s> for Superscript<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"superscript".into()
}
}
impl<'s> ElispFact<'s> for Timestamp<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
"timestamp".into()
}
}
impl<'s> ElispFact<'s> for PlainText<'s> {
fn get_elisp_name(&'s self) -> Cow<'s, str> {
// plain text from upstream emacs does not actually have a name but this is included here to make rendering the status diff easier.
"plain-text".into()
}
}
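
The pattern in this file (a blanket impl for every node type, plus hand-written impls for the enums that forward to the inner node) can be hard to see among the repetition. The following is a minimal sketch of the same idea, not the code from this commit: the names `ElispName`, `GetElispName`, `Node`, `Bold`, and `Block` are hypothetical stand-ins, and lifetimes are dropped to keep it short.

```rust
use std::borrow::Cow;

// Hypothetical stand-in for ElispFact: the name upstream org-mode uses for a node.
trait ElispName {
    fn elisp_name(&self) -> Cow<'static, str>;
}

// Hypothetical stand-in for GetElispFact: hands back a trait object for the node.
trait GetElispName {
    fn get_elisp_name_source(&self) -> &dyn ElispName;
}

// Blanket impl: anything that already has a name can return itself.
impl<T: ElispName> GetElispName for T {
    fn get_elisp_name_source(&self) -> &dyn ElispName {
        self
    }
}

struct Bold;
struct Block {
    name: String,
}

// The enum does not implement ElispName itself, so it does not collide with
// the blanket impl above and can forward to whichever variant it holds.
enum Node {
    Bold(Bold),
    Block(Block),
}

impl ElispName for Bold {
    fn elisp_name(&self) -> Cow<'static, str> {
        "bold".into()
    }
}

impl ElispName for Block {
    fn elisp_name(&self) -> Cow<'static, str> {
        // The name depends on runtime data, as with GreaterBlock in the diff.
        match self.name.to_lowercase().as_str() {
            "center" => "center-block".into(),
            "quote" => "quote-block".into(),
            _ => "special-block".into(),
        }
    }
}

impl GetElispName for Node {
    fn get_elisp_name_source(&self) -> &dyn ElispName {
        match self {
            Node::Bold(inner) => inner,
            Node::Block(inner) => inner,
        }
    }
}
```

With this shape, a caller holding either a concrete node or the enum can write `node.get_elisp_name_source().elisp_name()` without matching on the variant first.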


@@ -1,5 +1,6 @@
mod compare;
mod diff;
mod elisp_fact;
mod parse;
mod sexp;
mod util;


@@ -106,8 +106,8 @@ fn is_slice_of(parent: &str, child: &str) -> bool {
}
/// Get a slice of the string that was consumed in a parser using the original input to the parser and the remaining input after the parser.
pub fn get_consumed<'s>(input: &'s str, remaining: &'s str) -> &'s str {
assert!(is_slice_of(input, remaining));
fn get_consumed<'s>(input: &'s str, remaining: &'s str) -> &'s str {
debug_assert!(is_slice_of(input, remaining));
let source = {
let offset = remaining.as_ptr() as usize - input.as_ptr() as usize;
&input[..offset]
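
The hunk above is cut off, so here is a self-contained sketch of the pointer arithmetic it relies on. It assumes `remaining` is always a sub-slice of `input`; the body of `is_slice_of` is reconstructed from the fragment visible elsewhere in this diff and may differ from the real helper.

```rust
/// Check that `child` lies entirely inside the memory covered by `parent`.
fn is_slice_of(parent: &str, child: &str) -> bool {
    let parent_start = parent.as_ptr() as usize;
    let parent_end = parent_start + parent.len();
    let child_start = child.as_ptr() as usize;
    let child_end = child_start + child.len();
    child_start >= parent_start && child_end <= parent_end
}

/// Return the prefix of `input` that a parser consumed, given what it left over.
fn get_consumed<'s>(input: &'s str, remaining: &'s str) -> &'s str {
    // Compiled out in release builds; only checked in debug/test builds.
    debug_assert!(is_slice_of(input, remaining));
    // Distance in bytes between the two slice starts.
    let offset = remaining.as_ptr() as usize - input.as_ptr() as usize;
    &input[..offset]
}
```

Because the function is now private to the crate, every caller is covered by the crate's own tests, which is the stated reason the check can be a `debug_assert!` rather than an `assert!`.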


@@ -1,5 +1,7 @@
use super::elisp_fact::GetElispFact;
use super::sexp::Token;
use crate::types::Source;
use crate::types::GetStandardProperties;
use crate::types::StandardProperties;
/// Check if the child string slice is a slice of the parent string slice.
fn is_slice_of(parent: &str, child: &str) -> bool {
@@ -10,21 +12,38 @@ fn is_slice_of(parent: &str, child: &str) -> bool {
child_start >= parent_start && child_end <= parent_end
}
/// Get the offset into source that the rust object exists at.
/// Get the byte offset into source that the rust object exists at.
///
/// These offsets are zero-based unlike the elisp ones.
fn get_offsets<'s, S: Source<'s>>(source: &'s str, rust_object: &'s S) -> (usize, usize) {
let rust_object_source = rust_object.get_source();
assert!(is_slice_of(source, rust_object_source));
let offset = rust_object_source.as_ptr() as usize - source.as_ptr() as usize;
fn get_rust_byte_offsets<'s, S: StandardProperties<'s> + ?Sized>(
original_document: &'s str,
rust_ast_node: &'s S,
) -> (usize, usize) {
let rust_object_source = rust_ast_node.get_source();
debug_assert!(is_slice_of(original_document, rust_object_source));
let offset = rust_object_source.as_ptr() as usize - original_document.as_ptr() as usize;
let end = offset + rust_object_source.len();
(offset, end)
}
pub(crate) fn assert_name<'s>(
pub(crate) fn compare_standard_properties<
's,
S: GetStandardProperties<'s> + GetElispFact<'s> + ?Sized,
>(
original_document: &'s str,
emacs: &'s Token<'s>,
name: &str,
rust: &'s S,
) -> Result<(), Box<dyn std::error::Error>> {
assert_name(emacs, rust.get_elisp_fact().get_elisp_name())?;
assert_bounds(original_document, emacs, rust.get_standard_properties())?;
Ok(())
}
pub(crate) fn assert_name<'s, S: AsRef<str>>(
emacs: &'s Token<'s>,
name: S,
) -> Result<(), Box<dyn std::error::Error>> {
let name = name.as_ref();
let children = emacs.as_list()?;
let first_child = children
.first()
@@ -32,7 +51,7 @@ pub(crate) fn assert_name<'s>(
.as_atom()?;
if first_child != name {
Err(format!(
"Expected a {expected} cell, but found a {found} cell.",
"AST node name mismatch. Expected a (rust) {expected} cell, but found a (emacs) {found} cell.",
expected = name,
found = first_child
))?;
@@ -40,30 +59,33 @@ pub(crate) fn assert_name<'s>(
Ok(())
}
pub(crate) fn assert_bounds<'s, S: Source<'s>>(
source: &'s str,
/// Assert that the character ranges defined by upstream org-mode's :standard-properties match the slices in Organic's StandardProperties.
///
/// This does **not** handle plain text because plain text is a special case.
pub(crate) fn assert_bounds<'s, S: StandardProperties<'s> + ?Sized>(
original_document: &'s str,
emacs: &'s Token<'s>,
rust: &'s S,
) -> Result<(), Box<dyn std::error::Error>> {
let standard_properties = get_standard_properties(emacs)?;
let standard_properties = get_emacs_standard_properties(emacs)?; // 1-based
let (begin, end) = (
standard_properties
.begin
.ok_or("Token should have a begin.")?,
standard_properties.end.ok_or("Token should have an end.")?,
);
let (rust_begin, rust_end) = get_offsets(source, rust);
let rust_begin_char_offset = (&source[..rust_begin]).chars().count();
let (rust_begin, rust_end) = get_rust_byte_offsets(original_document, rust); // 0-based
let rust_begin_char_offset = (&original_document[..rust_begin]).chars().count() + 1; // 1-based
let rust_end_char_offset =
rust_begin_char_offset + (&source[rust_begin..rust_end]).chars().count();
if (rust_begin_char_offset + 1) != begin || (rust_end_char_offset + 1) != end {
Err(format!("Rust bounds (in chars) ({rust_begin}, {rust_end}) do not match emacs bounds ({emacs_begin}, {emacs_end})", rust_begin = rust_begin_char_offset + 1, rust_end = rust_end_char_offset + 1, emacs_begin=begin, emacs_end=end))?;
rust_begin_char_offset + (&original_document[rust_begin..rust_end]).chars().count(); // 1-based
if rust_begin_char_offset != begin || rust_end_char_offset != end {
Err(format!("Rust bounds (in chars) ({rust_begin}, {rust_end}) do not match emacs bounds ({emacs_begin}, {emacs_end})", rust_begin = rust_begin_char_offset, rust_end = rust_end_char_offset, emacs_begin=begin, emacs_end=end))?;
}
Ok(())
}
struct StandardProperties {
struct EmacsStandardProperties {
begin: Option<usize>,
#[allow(dead_code)]
post_affiliated: Option<usize>,
@@ -76,9 +98,9 @@ struct StandardProperties {
post_blank: Option<usize>,
}
fn get_standard_properties<'s>(
fn get_emacs_standard_properties<'s>(
emacs: &'s Token<'s>,
) -> Result<StandardProperties, Box<dyn std::error::Error>> {
) -> Result<EmacsStandardProperties, Box<dyn std::error::Error>> {
let children = emacs.as_list()?;
let attributes_child = children
.iter()
@@ -97,7 +119,7 @@ fn get_standard_properties<'s>(
let contents_end = maybe_token_to_usize(std_props.next())?;
let end = maybe_token_to_usize(std_props.next())?;
let post_blank = maybe_token_to_usize(std_props.next())?;
StandardProperties {
EmacsStandardProperties {
begin,
post_affiliated,
contents_begin,
@@ -116,7 +138,7 @@ fn get_standard_properties<'s>(
maybe_token_to_usize(attributes_map.get(":post-blank").map(|token| *token))?;
let post_affiliated =
maybe_token_to_usize(attributes_map.get(":post-affiliated").map(|token| *token))?;
StandardProperties {
EmacsStandardProperties {
begin,
post_affiliated,
contents_begin,
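
The bounds check in `assert_bounds` converts between two coordinate systems: Rust slices give zero-based byte offsets, while Emacs reports one-based character positions. A small sketch of that conversion on plain `&str` values follows; `emacs_char_bounds` is a hypothetical helper name, and it assumes `node` is a sub-slice of `document`.

```rust
/// Convert the position of `node` inside `document` into Emacs-style bounds:
/// one-based, counted in characters, with an exclusive end.
fn emacs_char_bounds(document: &str, node: &str) -> (usize, usize) {
    // Zero-based byte offsets, derived from the slice pointers.
    let byte_begin = node.as_ptr() as usize - document.as_ptr() as usize;
    let byte_end = byte_begin + node.len();
    // Count characters (not bytes) before the node, then shift to one-based.
    let begin = document[..byte_begin].chars().count() + 1;
    let end = begin + document[byte_begin..byte_end].chars().count();
    (begin, end)
}
```

The distinction only shows up with multi-byte characters: in `"héllo *bold*"` the bold object starts at byte 7 but at character position 7 one-based (6 characters precede it), so a byte-based comparison would be off by one.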


@@ -6,6 +6,7 @@ pub(crate) struct List<'parent, T> {
parent: Link<'parent, T>,
}
// TODO: Should I be defining a lifetime for T in the generics here? Ref: https://quinedot.github.io/rust-learning/dyn-elision-advanced.html#iteraction-with-type-aliases
type Link<'parent, T> = Option<&'parent List<'parent, T>>;
impl<'parent, T> List<'parent, T> {


@@ -187,7 +187,7 @@ mod tests {
use crate::context::List;
use crate::parser::element_parser::element;
use crate::types::Element;
use crate::types::Source;
use crate::types::GetStandardProperties;
#[test]
fn citation_simple() {
@@ -202,7 +202,10 @@ mod tests {
_ => panic!("Should be a paragraph!"),
};
assert_eq!(Into::<&str>::into(remaining), "");
assert_eq!(first_paragraph.get_source(), "[cite:@foo]");
assert_eq!(
first_paragraph.get_standard_properties().get_source(),
"[cite:@foo]"
);
assert_eq!(first_paragraph.children.len(), 1);
assert_eq!(
first_paragraph


@@ -40,7 +40,6 @@ pub(crate) fn dynamic_block<'b, 'g, 'r, 's>(
context: RefContext<'b, 'g, 'r, 's>,
input: OrgSource<'s>,
) -> Res<OrgSource<'s>, DynamicBlock<'s>> {
// TODO: Do I need to differentiate between different dynamic block types.
if immediate_in_section(context, "dynamic block") {
return Err(nom::Err::Error(CustomError::MyError(MyError(
"Cannot nest objects of the same element".into(),


@@ -107,6 +107,7 @@ fn _element<'b, 'g, 'r, 's>(
match map(paragraph_matcher, Element::Paragraph)(remaining) {
the_ok @ Ok(_) => the_ok,
Err(_) => {
// TODO: Because this function expects a single element, if there are multiple affiliated keywords before an element that cannot have affiliated keywords, we end up re-parsing the affiliated keywords many times.
affiliated_keywords.clear();
map(affiliated_keyword_matcher, Element::Keyword)(input)
}


@@ -129,7 +129,7 @@ mod tests {
use crate::context::Context;
use crate::context::GlobalSettings;
use crate::context::List;
use crate::types::Source;
use crate::types::GetStandardProperties;
#[test]
fn two_paragraphs() {
@@ -150,13 +150,17 @@ line footnote.",
footnote_definition_matcher(remaining).expect("Parse second footnote_definition.");
assert_eq!(Into::<&str>::into(remaining), "");
assert_eq!(
first_footnote_definition.get_source(),
first_footnote_definition
.get_standard_properties()
.get_source(),
"[fn:1] A footnote.
"
);
assert_eq!(
second_footnote_definition.get_source(),
second_footnote_definition
.get_standard_properties()
.get_source(),
"[fn:2] A multi-
line footnote."
@@ -181,7 +185,9 @@ not in the footnote.",
footnote_definition_matcher(input).expect("Parse first footnote_definition");
assert_eq!(Into::<&str>::into(remaining), "not in the footnote.");
assert_eq!(
first_footnote_definition.get_source(),
first_footnote_definition
.get_standard_properties()
.get_source(),
"[fn:2] A multi-
line footnote.


@@ -1,17 +1,15 @@
use nom::branch::alt;
use nom::bytes::complete::is_not;
use nom::bytes::complete::tag_no_case;
use nom::character::complete::anychar;
use nom::bytes::complete::take_until;
use nom::character::complete::space1;
use nom::combinator::map;
use nom::multi::many0;
use nom::multi::many_till;
use nom::multi::separated_list0;
use super::keyword::filtered_keyword;
use super::keyword_todo::todo_keywords;
use super::OrgSource;
use crate::context::HeadlineLevelFilter;
use crate::error::CustomError;
use crate::error::Res;
use crate::types::Keyword;
use crate::GlobalSettings;
@@ -20,13 +18,40 @@ use crate::GlobalSettings;
pub(crate) fn scan_for_in_buffer_settings<'s>(
input: OrgSource<'s>,
) -> Res<OrgSource<'s>, Vec<Keyword<'s>>> {
// TODO: Optimization idea: since this is slicing the OrgSource at each character, it might be more efficient to do a parser that uses a search function like take_until, and wrap it in a function similar to consumed but returning the input along with the normal output, then pass all of that into a verify that confirms we were at the start of a line using the input we just returned.
// TODO: Write some tests to make sure this is functioning properly.
let keywords = many0(map(
many_till(anychar, filtered_keyword(in_buffer_settings_key)),
|(_, kw)| kw,
))(input);
keywords
let mut keywords = Vec::new();
let mut remaining = input;
loop {
// Skip text until possible in_buffer_setting
let start_of_pound = take_until::<_, _, CustomError<_>>("#+")(remaining);
let start_of_pound = if let Ok((start_of_pound, _)) = start_of_pound {
start_of_pound
} else {
break;
};
// Go backwards to the start of the line and run the filtered_keyword parser
let start_of_line = start_of_pound.get_start_of_line();
let (remain, maybe_kw) = match filtered_keyword(in_buffer_settings_key)(start_of_line) {
Ok((remain, kw)) => (remain, Some(kw)),
Err(_) => {
let end_of_line = take_until::<_, _, CustomError<_>>("\n")(start_of_pound);
if let Ok((end_of_line, _)) = end_of_line {
(end_of_line, None)
} else {
break;
}
}
};
if let Some(kw) = maybe_kw {
keywords.push(kw);
}
remaining = remain;
}
Ok((remaining, keywords))
}
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]
@@ -88,3 +113,33 @@ pub(crate) fn apply_in_buffer_settings<'g, 's, 'sf>(
Ok(new_settings)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn scan_test() -> Result<(), Box<dyn std::error::Error>> {
let input = OrgSource::new(
r#"
foo
#+archive: bar
baz #+category: lorem
#+label: ipsum
#+todo: dolar
cat
"#,
);
let (remaining, settings) = scan_for_in_buffer_settings(input)?;
assert_eq!(Into::<&str>::into(remaining), "cat\n");
let keys: Vec<_> = settings.iter().map(|kw| kw.key).collect();
// category is skipped because it is not the first non-whitespace on the line.
//
// label is skipped because it is not an in-buffer setting.
assert_eq!(keys, vec!["archive", "todo"]);
Ok(())
}
}
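
To illustrate the scanning strategy without `nom` or `OrgSource`, here is a rough sketch on plain `&str`: jump to the next `#+`, back up to the start of that line, and accept the keyword only if nothing but whitespace precedes it. The function name `scan_for_settings` and its `wanted` parameter are hypothetical, and the key/value split is far simpler than what `filtered_keyword` does.

```rust
/// Collect `(key, value)` pairs for lines of the form `#+key: value`
/// whose key is in `wanted` (compared ASCII case-insensitively).
fn scan_for_settings<'s>(input: &'s str, wanted: &[&str]) -> Vec<(&'s str, &'s str)> {
    let mut found = Vec::new();
    let mut pos = 0;
    // Skip straight to the next possible keyword instead of stepping per character.
    while let Some(relative) = input[pos..].find("#+") {
        let pound = pos + relative;
        // Go backwards to the start of the line containing the "#+".
        let line_start = input[..pound].rfind('\n').map_or(0, |i| i + 1);
        let line_end = input[pound..].find('\n').map_or(input.len(), |i| pound + i);
        // Only a keyword if it is the first non-whitespace text on the line.
        if input[line_start..pound].trim().is_empty() {
            let body = &input[pound + 2..line_end];
            if let Some((key, value)) = body.split_once(':') {
                if wanted.iter().any(|w| w.eq_ignore_ascii_case(key)) {
                    found.push((key, value.trim()));
                }
            }
        }
        // Resume after this line whether or not it matched.
        pos = line_end;
    }
    found
}
```

The loop always advances past the line it just examined, so a document with no `#+` at all is handled by a single substring search.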


@@ -24,6 +24,7 @@ use crate::error::CustomError;
use crate::error::MyError;
use crate::error::Res;
use crate::parser::util::start_of_line;
use crate::types::BabelCall;
use crate::types::Keyword;
const ORG_ELEMENT_AFFILIATED_KEYWORDS: [&'static str; 13] = [
@@ -103,8 +104,16 @@ pub(crate) fn affiliated_keyword<'b, 'g, 'r, 's>(
pub(crate) fn babel_call_keyword<'b, 'g, 'r, 's>(
_context: RefContext<'b, 'g, 'r, 's>,
input: OrgSource<'s>,
) -> Res<OrgSource<'s>, Keyword<'s>> {
filtered_keyword(babel_call_key)(input)
) -> Res<OrgSource<'s>, BabelCall<'s>> {
let (remaining, kw) = filtered_keyword(babel_call_key)(input)?;
Ok((
remaining,
BabelCall {
source: kw.source,
key: kw.key,
value: kw.value,
},
))
}
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]


@@ -261,11 +261,10 @@ fn data<'s>(input: OrgSource<'s>) -> Res<OrgSource<'s>, OrgSource<'s>> {
is_not("\r\n")(input)
}
fn lesser_block_end(current_name: &str) -> impl ContextMatcher {
let current_name_lower = current_name.to_lowercase();
move |context, input: OrgSource<'_>| {
_lesser_block_end(context, input, current_name_lower.as_str())
}
fn lesser_block_end<'c>(current_name: &'c str) -> impl ContextMatcher + 'c {
// Since the lesser block names are statically defined in code, we can simply assert that the name is lowercase instead of causing an allocation by converting to lowercase.
debug_assert!(current_name == current_name.to_lowercase());
move |context, input: OrgSource<'_>| _lesser_block_end(context, input, current_name)
}
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]
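
The `lesser_block_end` change swaps an eager `to_lowercase()` allocation for a borrowed name plus a `debug_assert!`. The sketch below shows the same pattern in isolation. `block_end_matcher` and its line-matching logic are assumptions invented for this example; the real `_lesser_block_end` is not shown in this diff and takes a context and an `OrgSource`.

```rust
/// Build a matcher for the `#+end_<name>` line of a block.
///
/// The name is borrowed for the lifetime `'c` rather than copied, so the
/// returned closure holds no owned `String`.
fn block_end_matcher<'c>(name: &'c str) -> impl Fn(&str) -> bool + 'c {
    // Checked only in debug builds. Release builds skip both the comparison
    // and the temporary `String` produced by `to_lowercase`.
    debug_assert!(name == name.to_lowercase());
    move |line: &str| {
        let trimmed = line.trim();
        match trimmed.get(..6) {
            Some(prefix) if prefix.eq_ignore_ascii_case("#+end_") => {
                trimmed[6..].eq_ignore_ascii_case(name)
            }
            _ => false,
        }
    }
}
```

The `+ 'c` bound on the return type is what lets the closure keep the borrowed name, mirroring the `impl ContextMatcher + 'c` signature in the diff.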


@@ -1,6 +1,7 @@
use std::ops::RangeBounds;
use nom::Compare;
use nom::FindSubstring;
use nom::InputIter;
use nom::InputLength;
use nom::InputTake;
@@ -72,11 +73,60 @@ impl<'s> OrgSource<'s> {
}
pub(crate) fn get_until(&self, other: OrgSource<'s>) -> OrgSource<'s> {
assert!(other.start >= self.start);
assert!(other.end <= self.end);
debug_assert!(other.start >= self.start);
debug_assert!(other.end <= self.end);
self.slice(..(other.start - self.start))
}
pub(crate) fn get_start_of_line(&self) -> OrgSource<'s> {
let skipped_text = self.text_since_line_break();
let mut bracket_depth = self.bracket_depth;
let mut brace_depth = self.brace_depth;
let mut parenthesis_depth = self.parenthesis_depth;
// Since we're moving backwards over the text, each bracket adjusts the depth in the opposite direction from forward parsing.
for byte in skipped_text.bytes() {
match byte {
b'\n' => {
panic!("Should not hit a line break when only going back to the start of the line.");
}
b'[' => {
bracket_depth -= 1;
}
b']' => {
bracket_depth += 1;
}
b'{' => {
brace_depth -= 1;
}
b'}' => {
brace_depth += 1;
}
b'(' => {
parenthesis_depth -= 1;
}
b')' => {
parenthesis_depth += 1;
}
_ => {}
};
}
OrgSource {
full_source: self.full_source,
start: self.start_of_line,
end: self.end,
start_of_line: self.start_of_line,
preceding_character: if self.start_of_line > 0 {
Some('\n')
} else {
None
},
bracket_depth,
brace_depth,
parenthesis_depth,
}
}
pub(crate) fn get_bracket_depth(&self) -> BracketDepth {
self.bracket_depth
}
@@ -310,6 +360,12 @@ impl<'s> InputTakeAtPosition for OrgSource<'s> {
}
}
impl<'n, 's> FindSubstring<&'n str> for OrgSource<'s> {
fn find_substring(&self, substr: &'n str) -> Option<usize> {
Into::<&str>::into(self).find(substr)
}
}
pub(crate) fn convert_error<'a, I: Into<CustomError<&'a str>>>(
err: nom::Err<I>,
) -> nom::Err<CustomError<&'a str>> {
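
`get_start_of_line` rewinds to the start of the current line and reverses the bracket bookkeeping for the text it steps back over. The sketch below reduces that to one bracket type on a plain `&str`. The `Cursor` struct, the `i32` depth type, and `rewind_to_line_start` are hypothetical stand-ins, not the crate's `OrgSource` or `BracketDepth`.

```rust
/// Hypothetical, simplified cursor: a byte offset plus the bracket depth
/// accumulated between the start of the input and that offset.
#[derive(Debug, PartialEq)]
struct Cursor {
    offset: usize,
    bracket_depth: i32,
}

/// Move a cursor back to the start of its line, undoing the depth changes
/// contributed by the text that is stepped back over.
fn rewind_to_line_start(source: &str, cursor: &Cursor) -> Cursor {
    let line_start = source[..cursor.offset]
        .rfind('\n')
        .map_or(0, |idx| idx + 1);
    let mut bracket_depth = cursor.bracket_depth;
    for byte in source[line_start..cursor.offset].bytes() {
        match byte {
            // Moving forward, '[' raised the depth, so rewinding lowers it.
            b'[' => bracket_depth -= 1,
            // Moving forward, ']' lowered the depth, so rewinding raises it.
            b']' => bracket_depth += 1,
            _ => {}
        }
    }
    Cursor {
        offset: line_start,
        bracket_depth,
    }
}
```

Because the skipped slice starts right after the previous line break, it cannot contain a `\n`, which is the invariant the `panic!` arm in the diff guards.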


@@ -74,7 +74,7 @@ mod tests {
use crate::context::List;
use crate::parser::element_parser::element;
use crate::parser::org_source::OrgSource;
use crate::types::Source;
use crate::types::GetStandardProperties;
#[test]
fn two_paragraphs() {
@@ -87,7 +87,13 @@ mod tests {
let (remaining, second_paragraph) =
paragraph_matcher(remaining).expect("Parse second paragraph.");
assert_eq!(Into::<&str>::into(remaining), "");
assert_eq!(first_paragraph.get_source(), "foo bar baz\n\n");
assert_eq!(second_paragraph.get_source(), "lorem ipsum");
assert_eq!(
first_paragraph.get_standard_properties().get_source(),
"foo bar baz\n\n"
);
assert_eq!(
second_paragraph.get_standard_properties().get_source(),
"lorem ipsum"
);
}
}


@@ -445,7 +445,7 @@ mod tests {
use crate::context::Context;
use crate::context::GlobalSettings;
use crate::context::List;
use crate::types::Source;
use crate::types::GetStandardProperties;
#[test]
fn plain_list_item_empty() {
@@ -456,7 +456,7 @@ mod tests {
let plain_list_item_matcher = parser_with_context!(plain_list_item)(&initial_context);
let (remaining, result) = plain_list_item_matcher(input).unwrap();
assert_eq!(Into::<&str>::into(remaining), "");
assert_eq!(result.source, "1.");
assert_eq!(result.get_standard_properties().get_source(), "1.");
}
#[test]
@@ -468,7 +468,7 @@ mod tests {
let plain_list_item_matcher = parser_with_context!(plain_list_item)(&initial_context);
let (remaining, result) = plain_list_item_matcher(input).unwrap();
assert_eq!(Into::<&str>::into(remaining), "");
assert_eq!(result.source, "1. foo");
assert_eq!(result.get_standard_properties().get_source(), "1. foo");
}
#[test]
@@ -480,7 +480,7 @@ mod tests {
let plain_list_matcher = parser_with_context!(plain_list)(&initial_context);
let (remaining, result) = plain_list_matcher(input).unwrap();
assert_eq!(Into::<&str>::into(remaining), "");
assert_eq!(result.source, "1.");
assert_eq!(result.get_standard_properties().get_source(), "1.");
}
#[test]
@@ -492,7 +492,7 @@ mod tests {
let plain_list_matcher = parser_with_context!(plain_list)(&initial_context);
let (remaining, result) = plain_list_matcher(input).unwrap();
assert_eq!(Into::<&str>::into(remaining), "");
assert_eq!(result.source, "1. foo");
assert_eq!(result.get_standard_properties().get_source(), "1. foo");
}
#[test]
@@ -539,7 +539,7 @@ mod tests {
plain_list_matcher(input).expect("Should parse the plain list successfully.");
assert_eq!(Into::<&str>::into(remaining), " ipsum\n");
assert_eq!(
result.get_source(),
result.get_standard_properties().get_source(),
r#"1. foo
2. bar
baz
@@ -567,7 +567,7 @@ baz"#,
plain_list_matcher(input).expect("Should parse the plain list successfully.");
assert_eq!(Into::<&str>::into(remaining), "baz");
assert_eq!(
result.get_source(),
result.get_standard_properties().get_source(),
r#"1. foo
1. bar
@@ -600,7 +600,7 @@ dolar"#,
plain_list_matcher(input).expect("Should parse the plain list successfully.");
assert_eq!(Into::<&str>::into(remaining), "dolar");
assert_eq!(
result.get_source(),
result.get_standard_properties().get_source(),
r#"1. foo
bar


@@ -146,7 +146,7 @@ mod tests {
use crate::context::GlobalSettings;
use crate::context::List;
use crate::parser::object_parser::detect_standard_set_object_sans_plain_text;
use crate::types::Source;
use crate::types::GetStandardProperties;
#[test]
fn plain_text_simple() {
@@ -159,6 +159,9 @@ mod tests {
))(&initial_context);
let (remaining, result) = map(plain_text_matcher, Object::PlainText)(input).unwrap();
assert_eq!(Into::<&str>::into(remaining), "");
assert_eq!(result.get_source(), Into::<&str>::into(input));
assert_eq!(
result.get_standard_properties().get_source(),
Into::<&str>::into(input)
);
}
}


@@ -151,8 +151,8 @@ mod tests {
use crate::parser::element_parser::element;
use crate::types::Bold;
use crate::types::Element;
use crate::types::GetStandardProperties;
use crate::types::PlainText;
use crate::types::Source;
#[test]
fn plain_text_radio_target() {
@@ -172,7 +172,10 @@ mod tests {
_ => panic!("Should be a paragraph!"),
};
assert_eq!(Into::<&str>::into(remaining), "");
assert_eq!(first_paragraph.get_source(), "foo bar baz");
assert_eq!(
first_paragraph.get_standard_properties().get_source(),
"foo bar baz"
);
assert_eq!(first_paragraph.children.len(), 3);
assert_eq!(
first_paragraph
@@ -208,7 +211,10 @@ mod tests {
_ => panic!("Should be a paragraph!"),
};
assert_eq!(Into::<&str>::into(remaining), "");
assert_eq!(first_paragraph.get_source(), "foo *bar* baz");
assert_eq!(
first_paragraph.get_standard_properties().get_source(),
"foo *bar* baz"
);
assert_eq!(first_paragraph.children.len(), 3);
assert_eq!(
first_paragraph


@@ -1,6 +1,7 @@
use super::Element;
use super::GetStandardProperties;
use super::Object;
use super::Source;
use super::StandardProperties;
pub type PriorityCookie = u8;
pub type HeadlineLevel = u16;
@@ -43,28 +44,28 @@ pub enum TodoKeywordType {
Done,
}
impl<'s> Source<'s> for Document<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for DocumentElement<'s> {
fn get_source(&'s self) -> &'s str {
impl<'s> GetStandardProperties<'s> for DocumentElement<'s> {
fn get_standard_properties(&'s self) -> &'s dyn StandardProperties {
match self {
DocumentElement::Heading(obj) => obj.source,
DocumentElement::Section(obj) => obj.source,
DocumentElement::Heading(inner) => inner,
DocumentElement::Section(inner) => inner,
}
}
}
impl<'s> Source<'s> for Section<'s> {
impl<'s> StandardProperties<'s> for Document<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Heading<'s> {
impl<'s> StandardProperties<'s> for Section<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> StandardProperties<'s> for Heading<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}


@@ -4,6 +4,7 @@ use super::greater_element::GreaterBlock;
use super::greater_element::PlainList;
use super::greater_element::PropertyDrawer;
use super::greater_element::Table;
use super::lesser_element::BabelCall;
use super::lesser_element::Clock;
use super::lesser_element::Comment;
use super::lesser_element::CommentBlock;
@@ -19,8 +20,9 @@ use super::lesser_element::Planning;
use super::lesser_element::SrcBlock;
use super::lesser_element::VerseBlock;
use super::Drawer;
use super::GetStandardProperties;
use super::SetSource;
use super::Source;
use super::StandardProperties;
#[derive(Debug)]
pub enum Element<'s> {
@@ -44,39 +46,10 @@ pub enum Element<'s> {
FixedWidthArea(FixedWidthArea<'s>),
HorizontalRule(HorizontalRule<'s>),
Keyword(Keyword<'s>),
BabelCall(Keyword<'s>),
BabelCall(BabelCall<'s>),
LatexEnvironment(LatexEnvironment<'s>),
}
impl<'s> Source<'s> for Element<'s> {
fn get_source(&'s self) -> &'s str {
match self {
Element::Paragraph(obj) => obj.get_source(),
Element::PlainList(obj) => obj.get_source(),
Element::GreaterBlock(obj) => obj.get_source(),
Element::DynamicBlock(obj) => obj.get_source(),
Element::FootnoteDefinition(obj) => obj.get_source(),
Element::Comment(obj) => obj.get_source(),
Element::Drawer(obj) => obj.get_source(),
Element::PropertyDrawer(obj) => obj.get_source(),
Element::Table(obj) => obj.get_source(),
Element::VerseBlock(obj) => obj.get_source(),
Element::CommentBlock(obj) => obj.get_source(),
Element::ExampleBlock(obj) => obj.get_source(),
Element::ExportBlock(obj) => obj.get_source(),
Element::SrcBlock(obj) => obj.get_source(),
Element::Clock(obj) => obj.get_source(),
Element::DiarySexp(obj) => obj.get_source(),
Element::Planning(obj) => obj.get_source(),
Element::FixedWidthArea(obj) => obj.get_source(),
Element::HorizontalRule(obj) => obj.get_source(),
Element::Keyword(obj) => obj.get_source(),
Element::BabelCall(obj) => obj.get_source(),
Element::LatexEnvironment(obj) => obj.get_source(),
}
}
}
impl<'s> SetSource<'s> for Element<'s> {
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]
fn set_source(&mut self, source: &'s str) {
@@ -106,3 +79,32 @@ impl<'s> SetSource<'s> for Element<'s> {
}
}
}
impl<'s> GetStandardProperties<'s> for Element<'s> {
fn get_standard_properties(&'s self) -> &'s dyn StandardProperties {
match self {
Element::Paragraph(inner) => inner,
Element::PlainList(inner) => inner,
Element::GreaterBlock(inner) => inner,
Element::DynamicBlock(inner) => inner,
Element::FootnoteDefinition(inner) => inner,
Element::Comment(inner) => inner,
Element::Drawer(inner) => inner,
Element::PropertyDrawer(inner) => inner,
Element::Table(inner) => inner,
Element::VerseBlock(inner) => inner,
Element::CommentBlock(inner) => inner,
Element::ExampleBlock(inner) => inner,
Element::ExportBlock(inner) => inner,
Element::SrcBlock(inner) => inner,
Element::Clock(inner) => inner,
Element::DiarySexp(inner) => inner,
Element::Planning(inner) => inner,
Element::FixedWidthArea(inner) => inner,
Element::HorizontalRule(inner) => inner,
Element::Keyword(inner) => inner,
Element::BabelCall(inner) => inner,
Element::LatexEnvironment(inner) => inner,
}
}
}


@@ -0,0 +1,12 @@
use super::StandardProperties;
pub trait GetStandardProperties<'s> {
// TODO: Can I eliminate this dynamic dispatch, perhaps using nominal generic structs? Low priority since this is not used during parsing.
fn get_standard_properties(&'s self) -> &'s dyn StandardProperties;
}
impl<'s, I: StandardProperties<'s>> GetStandardProperties<'s> for I {
fn get_standard_properties(&'s self) -> &'s dyn StandardProperties {
self
}
}
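
The two traits split the work: leaf node types implement `StandardProperties`, a blanket impl gives each of them `GetStandardProperties` for free, and the wrapping enums implement `GetStandardProperties` by hand, returning a trait object for whichever variant they hold. The sketch below reproduces that shape with two toy node types. It drops the `'s` parameter from the traits to stay short, so it is an approximation of the design rather than the crate's exact signatures.

```rust
trait StandardProperties {
    /// Slice of the entire node.
    fn get_source(&self) -> &str;
}

trait GetStandardProperties {
    fn get_standard_properties(&self) -> &dyn StandardProperties;
}

// Blanket impl: anything with standard properties can hand itself out.
impl<I: StandardProperties> GetStandardProperties for I {
    fn get_standard_properties(&self) -> &dyn StandardProperties {
        self
    }
}

struct Bold<'s> {
    source: &'s str,
}

struct PlainText<'s> {
    source: &'s str,
}

impl<'s> StandardProperties for Bold<'s> {
    fn get_source(&self) -> &str {
        self.source
    }
}

impl<'s> StandardProperties for PlainText<'s> {
    fn get_source(&self) -> &str {
        self.source
    }
}

enum Object<'s> {
    Bold(Bold<'s>),
    PlainText(PlainText<'s>),
}

// The enum does not implement `StandardProperties` itself, so this impl
// does not overlap with the blanket impl above.
impl<'s> GetStandardProperties for Object<'s> {
    fn get_standard_properties(&self) -> &dyn StandardProperties {
        match self {
            Object::Bold(inner) => inner,
            Object::PlainText(inner) => inner,
        }
    }
}
```

Callers then write `node.get_standard_properties().get_source()` whether `node` is a leaf struct or an enum, which is the call shape the updated tests in this diff use.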


@@ -2,7 +2,7 @@ use super::element::Element;
use super::lesser_element::TableCell;
use super::Keyword;
use super::Object;
use super::Source;
use super::StandardProperties;
#[derive(Debug)]
pub struct PlainList<'s> {
@@ -85,61 +85,61 @@ pub struct TableRow<'s> {
pub children: Vec<TableCell<'s>>,
}
impl<'s> Source<'s> for PlainList<'s> {
impl<'s> StandardProperties<'s> for PlainList<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for PlainListItem<'s> {
impl<'s> StandardProperties<'s> for PlainListItem<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for GreaterBlock<'s> {
impl<'s> StandardProperties<'s> for GreaterBlock<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for DynamicBlock<'s> {
impl<'s> StandardProperties<'s> for DynamicBlock<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for FootnoteDefinition<'s> {
impl<'s> StandardProperties<'s> for FootnoteDefinition<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Drawer<'s> {
impl<'s> StandardProperties<'s> for Drawer<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for PropertyDrawer<'s> {
impl<'s> StandardProperties<'s> for PropertyDrawer<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for NodeProperty<'s> {
impl<'s> StandardProperties<'s> for NodeProperty<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Table<'s> {
impl<'s> StandardProperties<'s> for Table<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for TableRow<'s> {
impl<'s> StandardProperties<'s> for TableRow<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}


@@ -1,6 +1,6 @@
use super::object::Object;
use super::PlainText;
use super::Source;
use super::StandardProperties;
#[derive(Debug)]
pub struct Paragraph<'s> {
@@ -91,6 +91,13 @@ pub struct Keyword<'s> {
pub value: &'s str,
}
#[derive(Debug)]
pub struct BabelCall<'s> {
pub source: &'s str,
pub key: &'s str,
pub value: &'s str,
}
#[derive(Debug)]
pub struct LatexEnvironment<'s> {
pub source: &'s str,
@@ -107,87 +114,93 @@ impl<'s> Paragraph<'s> {
}
}
impl<'s> Source<'s> for Paragraph<'s> {
impl<'s> StandardProperties<'s> for Paragraph<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for TableCell<'s> {
impl<'s> StandardProperties<'s> for TableCell<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Comment<'s> {
impl<'s> StandardProperties<'s> for Comment<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for VerseBlock<'s> {
impl<'s> StandardProperties<'s> for VerseBlock<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for CommentBlock<'s> {
impl<'s> StandardProperties<'s> for CommentBlock<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for ExampleBlock<'s> {
impl<'s> StandardProperties<'s> for ExampleBlock<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for ExportBlock<'s> {
impl<'s> StandardProperties<'s> for ExportBlock<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for SrcBlock<'s> {
impl<'s> StandardProperties<'s> for SrcBlock<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Clock<'s> {
impl<'s> StandardProperties<'s> for Clock<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for DiarySexp<'s> {
impl<'s> StandardProperties<'s> for DiarySexp<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Planning<'s> {
impl<'s> StandardProperties<'s> for Planning<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for FixedWidthArea<'s> {
impl<'s> StandardProperties<'s> for FixedWidthArea<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for HorizontalRule<'s> {
impl<'s> StandardProperties<'s> for HorizontalRule<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Keyword<'s> {
impl<'s> StandardProperties<'s> for Keyword<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for LatexEnvironment<'s> {
impl<'s> StandardProperties<'s> for BabelCall<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> StandardProperties<'s> for LatexEnvironment<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}


@@ -1,9 +1,11 @@
mod document;
mod element;
mod get_standard_properties;
mod greater_element;
mod lesser_element;
mod object;
mod source;
mod standard_properties;
pub use document::Document;
pub use document::DocumentElement;
pub use document::Heading;
@@ -12,6 +14,7 @@ pub use document::PriorityCookie;
pub use document::Section;
pub use document::TodoKeywordType;
pub use element::Element;
pub use get_standard_properties::GetStandardProperties;
pub use greater_element::CheckboxType;
pub use greater_element::Drawer;
pub use greater_element::DynamicBlock;
@@ -24,6 +27,7 @@ pub use greater_element::PlainListItem;
pub use greater_element::PropertyDrawer;
pub use greater_element::Table;
pub use greater_element::TableRow;
pub use lesser_element::BabelCall;
pub use lesser_element::Clock;
pub use lesser_element::Comment;
pub use lesser_element::CommentBlock;
@@ -68,4 +72,4 @@ pub use object::Timestamp;
pub use object::Underline;
pub use object::Verbatim;
pub(crate) use source::SetSource;
pub use source::Source;
pub use standard_properties::StandardProperties;


@@ -1,4 +1,5 @@
use super::Source;
use super::GetStandardProperties;
use super::StandardProperties;
#[derive(Debug, PartialEq)]
pub enum Object<'s> {
@@ -185,197 +186,197 @@ pub struct Timestamp<'s> {
pub source: &'s str,
}
impl<'s> Source<'s> for Object<'s> {
fn get_source(&'s self) -> &'s str {
impl<'s> GetStandardProperties<'s> for Object<'s> {
fn get_standard_properties(&'s self) -> &'s dyn StandardProperties {
match self {
Object::Bold(obj) => obj.source,
Object::Italic(obj) => obj.source,
Object::Underline(obj) => obj.source,
Object::StrikeThrough(obj) => obj.source,
Object::Code(obj) => obj.source,
Object::Verbatim(obj) => obj.source,
Object::PlainText(obj) => obj.source,
Object::RegularLink(obj) => obj.source,
Object::RadioLink(obj) => obj.source,
Object::RadioTarget(obj) => obj.source,
Object::PlainLink(obj) => obj.source,
Object::AngleLink(obj) => obj.source,
Object::OrgMacro(obj) => obj.source,
Object::Entity(obj) => obj.source,
Object::LatexFragment(obj) => obj.source,
Object::ExportSnippet(obj) => obj.source,
Object::FootnoteReference(obj) => obj.source,
Object::Citation(obj) => obj.source,
Object::CitationReference(obj) => obj.source,
Object::InlineBabelCall(obj) => obj.source,
Object::InlineSourceBlock(obj) => obj.source,
Object::LineBreak(obj) => obj.source,
Object::Target(obj) => obj.source,
Object::Timestamp(obj) => obj.source,
Object::StatisticsCookie(obj) => obj.source,
Object::Subscript(obj) => obj.source,
Object::Superscript(obj) => obj.source,
Object::Bold(inner) => inner,
Object::Italic(inner) => inner,
Object::Underline(inner) => inner,
Object::StrikeThrough(inner) => inner,
Object::Code(inner) => inner,
Object::Verbatim(inner) => inner,
Object::PlainText(inner) => inner,
Object::RegularLink(inner) => inner,
Object::RadioLink(inner) => inner,
Object::RadioTarget(inner) => inner,
Object::PlainLink(inner) => inner,
Object::AngleLink(inner) => inner,
Object::OrgMacro(inner) => inner,
Object::Entity(inner) => inner,
Object::LatexFragment(inner) => inner,
Object::ExportSnippet(inner) => inner,
Object::FootnoteReference(inner) => inner,
Object::Citation(inner) => inner,
Object::CitationReference(inner) => inner,
Object::InlineBabelCall(inner) => inner,
Object::InlineSourceBlock(inner) => inner,
Object::LineBreak(inner) => inner,
Object::Target(inner) => inner,
Object::StatisticsCookie(inner) => inner,
Object::Subscript(inner) => inner,
Object::Superscript(inner) => inner,
Object::Timestamp(inner) => inner,
}
}
}
impl<'s> Source<'s> for Bold<'s> {
impl<'s> StandardProperties<'s> for Bold<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Italic<'s> {
impl<'s> StandardProperties<'s> for Italic<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Underline<'s> {
impl<'s> StandardProperties<'s> for Underline<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for StrikeThrough<'s> {
impl<'s> StandardProperties<'s> for StrikeThrough<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Code<'s> {
impl<'s> StandardProperties<'s> for Code<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Verbatim<'s> {
impl<'s> StandardProperties<'s> for Verbatim<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for RegularLink<'s> {
impl<'s> StandardProperties<'s> for RegularLink<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for RadioLink<'s> {
impl<'s> StandardProperties<'s> for RadioLink<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for RadioTarget<'s> {
impl<'s> StandardProperties<'s> for RadioTarget<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for PlainLink<'s> {
impl<'s> StandardProperties<'s> for PlainLink<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for AngleLink<'s> {
impl<'s> StandardProperties<'s> for AngleLink<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for OrgMacro<'s> {
impl<'s> StandardProperties<'s> for OrgMacro<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Entity<'s> {
impl<'s> StandardProperties<'s> for Entity<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for LatexFragment<'s> {
impl<'s> StandardProperties<'s> for LatexFragment<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for ExportSnippet<'s> {
impl<'s> StandardProperties<'s> for ExportSnippet<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for FootnoteReference<'s> {
impl<'s> StandardProperties<'s> for FootnoteReference<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Citation<'s> {
impl<'s> StandardProperties<'s> for Citation<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for CitationReference<'s> {
impl<'s> StandardProperties<'s> for CitationReference<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for InlineBabelCall<'s> {
impl<'s> StandardProperties<'s> for InlineBabelCall<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for InlineSourceBlock<'s> {
impl<'s> StandardProperties<'s> for InlineSourceBlock<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for LineBreak<'s> {
impl<'s> StandardProperties<'s> for LineBreak<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Target<'s> {
impl<'s> StandardProperties<'s> for Target<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for StatisticsCookie<'s> {
impl<'s> StandardProperties<'s> for StatisticsCookie<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Subscript<'s> {
impl<'s> StandardProperties<'s> for Subscript<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Superscript<'s> {
impl<'s> StandardProperties<'s> for Superscript<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for Timestamp<'s> {
impl<'s> StandardProperties<'s> for Timestamp<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}
}
impl<'s> Source<'s> for PlainText<'s> {
impl<'s> StandardProperties<'s> for PlainText<'s> {
fn get_source(&'s self) -> &'s str {
self.source
}


@@ -1,6 +1,3 @@
pub trait Source<'s> {
fn get_source(&'s self) -> &'s str;
}
pub(crate) trait SetSource<'s> {
fn set_source(&mut self, source: &'s str);
}


@@ -0,0 +1,58 @@
// TODO: What is an anonymous AST node and how can I trigger one?
pub trait StandardProperties<'s> {
/// Get the slice of the entire AST node.
///
/// This corresponds to :begin to :end in upstream org-mode's standard properties.
fn get_source(&'s self) -> &'s str;
// Get the slice of the AST node's contents.
//
// This corresponds to :contents-begin to :contents-end
// fn get_contents(&'s self) -> &'s str;
}
// TODO: Write some debugging code to alert when any of the unknown fields below are non-nil in our test data so we can see what these fields represent.
// Order of upstream org-mode's standard properties array:
//
// :begin :post-affiliated :contents-begin :contents-end :end :post-blank :secondary :mode :granularity :cached :org-element--cache-sync-key :robust-begin :robust-end :true-level :buffer :deferred :structure :parent
//
// Per-field notes: (Leading character: 'X' for not going to include, 'Y' for going to include but not included yet ("Yes"), 'D' for already included ("Done"), '?' for undecided)
//
// D :begin - Number of characters (NOT bytes!) since the beginning of the file.
//
// ? :post-affiliated - ?
//
// Y :contents-begin - Number of characters (NOT bytes!) since the beginning of the file.
//
// Y :contents-end - Number of characters (NOT bytes!) since the beginning of the file.
//
// D :end - Number of characters (NOT bytes!) since the beginning of the file.
//
// Y :post-blank - Number of characters after :contents-end but before :end. This is the trailing whitespace.
//
// X :secondary - List of properties that may contain AST nodes. This will be important to reference for implementing TokenIter properly, but I see no value in including this in the StandardProperties trait since which properties contain AST nodes will be self-evident in the struct definition.
//
// ? :mode - ?
//
// ? :granularity - ?
//
// X :cached - ? Based on the name, I'm guessing this is a runtime-optimization rather than something relevant to export from a parser, so (unless I'm wrong about the purpose) I see no reason to include this.
//
// X :org-element--cache-sync-key - ? Based on the name, I'm guessing this is a runtime-optimization rather than something relevant to export from a parser, so (unless I'm wrong about the purpose) I see no reason to include this.
//
// ? :robust-begin - ? uhh what? What makes this begin/end "robust" and the others not? I have no idea.
//
// ? :robust-end - ? uhh what? What makes this begin/end "robust" and the others not? I have no idea.
//
// ? :true-level - This seems to correspond to the REAL star count for headlines (as opposed to the headline level we set for when "odd" is enabled instead of the default "oddeven"). This is great information to have, but is this a "standard" property? Does anything other than headlines have this set? I don't know, so I need to investigate. If it is headline-specific then we will not be including this in the StandardProperties trait even though it is in the :standard-properties array in org-mode.
//
// X :buffer - This is the Emacs buffer name containing the org-mode document. This seems more like a runtime thing than something we would want to export from our parser so this will not be included.
//
// X :deferred - Seems to be a runtime optimization about only calculating some properties when requested.
//
// ? :structure - ?
//
// X :parent - Some weird numeric reference to the containing object. Since we output a tree structure, I do not see any value in including this, especially considering the back-references would be a nightmare in rust.
// Special case: Plain text. Plain text counts :begin and :end from the start of the text (so :begin is always 0 AFAICT) and instead of including the full set of standard properties, it only includes :begin, :end, and :parent.
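
The notes above stress that `:begin` and `:end` count characters, not bytes, while Rust string slices are indexed by bytes. A minimal sketch of the conversion follows. `char_range` is a hypothetical helper, not part of the crate, and it returns zero-based counts; Emacs buffer positions start at 1, so comparing against upstream output would presumably need an adjustment that is left out here.

```rust
/// Convert a byte range within `full_source` into a (begin, end) pair counted
/// in characters from the start of the input.
///
/// Both byte offsets must lie on UTF-8 character boundaries, otherwise the
/// slicing below panics.
fn char_range(full_source: &str, byte_start: usize, byte_end: usize) -> (usize, usize) {
    // Characters before the node.
    let begin = full_source[..byte_start].chars().count();
    // Characters inside the node, added to the starting count.
    let end = begin + full_source[byte_start..byte_end].chars().count();
    (begin, end)
}
```

For pure ASCII input the two ranges coincide; they diverge as soon as a multi-byte character appears before or inside the node.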