28 Commits

Author SHA1 Message Date
Tom Alexander
8450785186 Add test showing we are not handling the odd startup option for headline depth.
Some checks failed
rust-test Build rust-test has failed
rust-build Build rust-build has succeeded
rust-foreign-document-test Build rust-foreign-document-test has failed
2023-09-15 22:08:42 -04:00
Tom Alexander
d443dbd468 Introduce the tab_width setting and give tabs a greater value when counting indentation level. 2023-09-15 21:59:48 -04:00
Tom Alexander
c9ce32c881 Remve redundant org_spaces functions.
Turns out the nom space0/space1 parsers accept tab characters already.
2023-09-15 21:28:40 -04:00
Tom Alexander
85454a0a27 Fix footnote reference function label matcher.
Previously when a label started with a number but contained other characters, this parser would fail because it would not match the entire label.
2023-09-15 21:14:44 -04:00
Tom Alexander
fdebf6dec5 Delete already solved TODO. 2023-09-15 21:08:52 -04:00
Tom Alexander
444d6758aa Handle leading blank lines in greater blocks.
Some checks failed
rust-test Build rust-test has succeeded
rust-build Build rust-build has succeeded
rust-foreign-document-test Build rust-foreign-document-test has failed
2023-09-15 21:03:35 -04:00
Tom Alexander
6c7203410e Add a test showing we're not handling leading blank lines in greater blocks. 2023-09-15 17:02:41 -04:00
Tom Alexander
bfe67b1f75 Parse plain list item checkboxes. 2023-09-15 16:09:57 -04:00
Tom Alexander
fd41ad9c29 Pretend dos line endings do not exist. 2023-09-15 14:13:17 -04:00
Tom Alexander
7f751d4f28 Allow no digit in repeater in timestamp. 2023-09-15 13:12:54 -04:00
Tom Alexander
52a4dab67c Use the timestamp parser in planning.
Previously we did not support inactive timestamps in planning. This fixes that.
2023-09-15 12:45:19 -04:00
Tom Alexander
3d86e75059 Always match the entire entity name.
Some checks failed
rust-test Build rust-test has succeeded
rust-build Build rust-build has succeeded
rust-foreign-document-test Build rust-foreign-document-test has failed
2023-09-14 04:29:50 -04:00
Tom Alexander
ca6fdf1924 Support different cases in radio links. 2023-09-14 04:04:21 -04:00
Tom Alexander
66d16d89ed Support interchangeable whitespace in re-matching plain text. 2023-09-14 04:00:34 -04:00
Tom Alexander
ee5e0698b1 Add an optimization idea. 2023-09-14 03:25:12 -04:00
Tom Alexander
22681b6a58 Support trailing whitespace in fixed-width areas. 2023-09-14 03:20:44 -04:00
Tom Alexander
876d33239e Allow any character to be escaped in the path for links. 2023-09-14 03:05:11 -04:00
Tom Alexander
87941271a4 Handle headlines with trailing spaces without tags. 2023-09-14 02:43:40 -04:00
Tom Alexander
32b19d68d0 Support todo keywords with fast access. 2023-09-14 02:24:06 -04:00
Tom Alexander
830097b0a9 Add a test showing we are not handling fast access states in todo keywords. 2023-09-14 02:18:49 -04:00
Tom Alexander
44e9f708c9 Handle the possibility of a title-less headline. 2023-09-14 02:01:24 -04:00
Tom Alexander
fc4ff97c14 Add a test showing we are not handling empty headlines properly. 2023-09-14 00:50:31 -04:00
Tom Alexander
33372429dd Add a config option for org-list-allow-alphabetical.
This fixes an issue where lines in a paragraph were incorrectly getting identified as lists because I had defaulted to assuming alphabetical bullets were allowed.
2023-09-14 00:27:54 -04:00
Tom Alexander
ac0db64081 Add cargo directive to rebuild the auto-generated tests when files under org_mode_samples get updated.
Some checks failed
rust-test Build rust-test has failed
rust-build Build rust-build has succeeded
rust-foreign-document-test Build rust-foreign-document-test has failed
2023-09-13 21:28:44 -04:00
Tom Alexander
b8a4876779 Disable auto-aligning tables when Emacs loads Org-mode.
Emacs will auto-align tables when org-mode is loaded if the document contains "#+STARTUP: align". Since Organic is just a parser, it has no business editing the input it receives so we are disabling this auto-align in Emacs to make the tests work properly.
2023-09-13 21:02:38 -04:00
Tom Alexander
925c42c8fb Add test showing we currently are letting emacs align tables at startup. 2023-09-13 21:02:38 -04:00
Tom Alexander
7d4100d956 Add worg to the foreign document test.
A lot of the documents are failing so there are going to be a lot of bug fixes in this branch.
2023-09-13 20:10:50 -04:00
Tom Alexander
53d90a2949 Update the README to have instructions on running the tests and development programs.
All checks were successful
rustfmt Build rustfmt has succeeded
rust-test Build rust-test has succeeded
rust-build Build rust-build has succeeded
rust-foreign-document-test Build rust-foreign-document-test has succeeded
2023-09-13 20:10:14 -04:00
33 changed files with 464 additions and 134 deletions

View File

@@ -2,12 +2,63 @@
Organic is an emacs-less implementation of an [org-mode](https://orgmode.org/) parser. Organic is an emacs-less implementation of an [org-mode](https://orgmode.org/) parser.
## Project Status ## Project Status
This project is a personal learning project to grow my experience in [rust](https://www.rust-lang.org/). It is under development and at this time I would not recommend anyone use this code. The goal is to turn this into a project others can use, at which point more information will appear in this README. This project is a personal learning project to grow my experience in [rust](https://www.rust-lang.org/). It is under development and at this time I would not recommend anyone use this code. The goal is to turn this into a project others can use, at which point more information will appear in this README.
## Using this library
TODO: Add section on using Organic as a library (which is the intended use for this project).
### The parse binary
This program takes org-mode input either streamed in on stdin or as paths to files passed in as arguments. It then parses them using Organic and dumps the result to stdout. This program is intended solely as a development tool. Examples:
```bash
cat /foo/bar.org | cargo run --bin parse
```
```bash
cargo build --profile release-lto
./target/release-lto/parse /foo/bar.org /lorem/ipsum.org
```
### The compare binary
This program takes org-mode input either streamed in on stdin or as paths to files passed in as arguments. It then parses them using Organic and the official Emacs Org-mode parser and compares the parse result. This program is intended solely as a development tool. Since org-mode is a moving target, it is recommended that you run this through docker since we pin the version of org-mode to a specific revision. Examples:
```bash
cat /foo/bar.org | ./scripts/run_docker_compare.bash
```
```bash
./scripts/run_docker_compare.bash /foo/bar.org /lorem/ipsum.org
```
Not recommended since it is not through docker:
```bash
cat /foo/bar.org | cargo run --features compare --bin compare
```
```bash
cargo build --profile release-lto --features compare
./target/release-lto/compare /foo/bar.org /lorem/ipsum.org
```
## Running the tests
There are three levels of tests for this repository: the standard tests, the autogenerated tests, and the foreign document tests.
### The standard tests
These are regular hand-written rust tests. These can be run with:
```bash
make unittest
```
### The auto-generated tests
These tests are automatically generated from the files in the `org_mode_samples` directory and they are still integrated with the rust/cargo testing framework. For each org-mode document in that folder, a test is generated that will parse the document with both Organic and the official Emacs Org-mode parser and then it will compare the parse results. Any deviation is considered a failure. Since org-mode is a moving target, it is recommended that you run these tests inside docker since the `organic-test` docker image is pinned to a specific revision of org-mode. These can be run with:
```bash
make dockertest
```
### The foreign document tests
These tests function the same as the auto-generated tests except they are **not** integrated with the rust/cargo testing framework and they involve comparing the parse of org-mode documents that live outside this repository. This allows us to test against a far greater variety of org-mode input documents without pulling massive sets of org-mode documents into this repository. The recommended way to run these tests is still through docker because it pins org-mode and the test documents to specific git revisions. These can be run with:
```bash
make foreign_document_test
```
## License ## License
This project is released under the public-domain-equivalent [0BSD license](https://www.tldrlegal.com/license/bsd-0-clause-license). This license puts no restrictions on the use of this code (you do not even have to include the copyright notice or license text when using it). HOWEVER, this project has a couple permissively licensed dependencies which do require their copyright notices and/or license texts to be included. I am not a lawyer and this is not legal advice but it is my layperson's understanding that if you distribute a binary with this library linked in, you will need to abide by their terms since their code will also be linked in your binary. I try to keep the dependencies to a minimum and the most restrictive dependency I will ever include is a permissively licensed one. This project is released under the public-domain-equivalent [0BSD license](https://www.tldrlegal.com/license/bsd-0-clause-license), however, this project has a couple permissively licensed non-public-domain-equivalent dependencies which require their copyright notices and/or license texts to be included. I am not a lawyer and this is not legal advice but it is my layperson's understanding that if you distribute a binary statically linking this library, you will need to abide by their terms since their code will also be linked in your binary.

View File

@@ -16,6 +16,9 @@ fn main() {
let destination = Path::new(&out_dir).join("tests.rs"); let destination = Path::new(&out_dir).join("tests.rs");
let mut test_file = File::create(&destination).unwrap(); let mut test_file = File::create(&destination).unwrap();
// Re-generate the tests if any org-mode files change
println!("cargo:rerun-if-changed=org_mode_samples");
write_header(&mut test_file); write_header(&mut test_file);
let test_files = WalkDir::new("org_mode_samples") let test_files = WalkDir::new("org_mode_samples")

View File

@@ -88,14 +88,20 @@ ARG DOOMEMACS_PATH=/foreign_documents/doomemacs
ARG DOOMEMACS_REPO=https://github.com/doomemacs/doomemacs.git ARG DOOMEMACS_REPO=https://github.com/doomemacs/doomemacs.git
RUN mkdir -p $DOOMEMACS_PATH && git -C $DOOMEMACS_PATH init --initial-branch=main && git -C $DOOMEMACS_PATH remote add origin $DOOMEMACS_REPO && git -C $DOOMEMACS_PATH fetch origin $DOOMEMACS_VERSION && git -C $DOOMEMACS_PATH checkout FETCH_HEAD RUN mkdir -p $DOOMEMACS_PATH && git -C $DOOMEMACS_PATH init --initial-branch=main && git -C $DOOMEMACS_PATH remote add origin $DOOMEMACS_REPO && git -C $DOOMEMACS_PATH fetch origin $DOOMEMACS_VERSION && git -C $DOOMEMACS_PATH checkout FETCH_HEAD
ARG WORG_VERSION=74e80b0f7600801b1d1594542602394c085cc2f9
ARG WORG_PATH=/foreign_documents/worg
ARG WORG_REPO=https://git.sr.ht/~bzg/worg
RUN mkdir -p $WORG_PATH && git -C $WORG_PATH init --initial-branch=main && git -C $WORG_PATH remote add origin $WORG_REPO && git -C $WORG_PATH fetch origin $WORG_VERSION && git -C $WORG_PATH checkout FETCH_HEAD
FROM tester as foreign-document-test FROM tester as foreign-document-test
RUN apk add --no-cache bash coreutils RUN apk add --no-cache bash coreutils
RUN mkdir /foreign_documents RUN mkdir /foreign_documents
COPY --from=build-org-mode /root/org-mode /foreign_documents/org-mode
COPY --from=build-emacs /root/emacs /foreign_documents/emacs
COPY --from=foreign-document-gather /foreign_documents/howardabrams /foreign_documents/howardabrams COPY --from=foreign-document-gather /foreign_documents/howardabrams /foreign_documents/howardabrams
COPY --from=foreign-document-gather /foreign_documents/doomemacs /foreign_documents/doomemacs COPY --from=foreign-document-gather /foreign_documents/doomemacs /foreign_documents/doomemacs
COPY --from=foreign-document-gather /foreign_documents/worg /foreign_documents/worg
COPY --from=build-org-mode /root/org-mode /foreign_documents/org-mode
COPY --from=build-emacs /root/emacs /foreign_documents/emacs
COPY foreign_document_test_entrypoint.sh /entrypoint.sh COPY foreign_document_test_entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"] ENTRYPOINT ["/entrypoint.sh"]

View File

@@ -32,6 +32,8 @@ function main {
if [ "$?" -ne 0 ]; then all_status=1; fi if [ "$?" -ne 0 ]; then all_status=1; fi
(run_compare_function "emacs" compare_all_org_document "/foreign_documents/emacs") (run_compare_function "emacs" compare_all_org_document "/foreign_documents/emacs")
if [ "$?" -ne 0 ]; then all_status=1; fi if [ "$?" -ne 0 ]; then all_status=1; fi
(run_compare_function "worg" compare_all_org_document "/foreign_documents/worg")
if [ "$?" -ne 0 ]; then all_status=1; fi
(run_compare_function "howard_abrams" compare_howard_abrams) (run_compare_function "howard_abrams" compare_howard_abrams)
if [ "$?" -ne 0 ]; then all_status=1; fi if [ "$?" -ne 0 ]; then all_status=1; fi
(run_compare_function "doomemacs" compare_all_org_document "/foreign_documents/doomemacs") (run_compare_function "doomemacs" compare_all_org_document "/foreign_documents/doomemacs")

View File

@@ -25,3 +25,4 @@ This could significantly reduce our calls to exit matchers.
I think targets would break this. I think targets would break this.
The exit matchers are already implicitly building this behavior since they should all exit very early when the starting character is wrong. Putting this logic in a centralized place, far away from where those characters are actually going to be used, is unfortunate for readability. The exit matchers are already implicitly building this behavior since they should all exit very early when the starting character is wrong. Putting this logic in a centralized place, far away from where those characters are actually going to be used, is unfortunate for readability.
** Use exit matcher to cut off trailing whitespace instead of re-matching in plain lists.

View File

@@ -0,0 +1,5 @@
#+begin_quote
foo
#+end_quote

View File

@@ -0,0 +1,3 @@
# These are only allowed by configuring org-list-allow-alphabetical which the automated tests are not currently set up to do, so this will parse as a paragraph:
a. foo
b. bar

View File

@@ -0,0 +1,6 @@
# The STARTUP directive here instructs org-mode to align tables which emacs normally does when opening the file. Since Organic is solely a parser, we have no business editing the org-mode document so Organic does not handle aligning tables, so in order for this test to pass, we have to avoid that behavior in Emacs.
#+STARTUP: align
|foo|bar|
|-
|lorem|ipsum|

View File

@@ -0,0 +1,3 @@
<<<Foo Bar Baz>>>
foo bar baz

View File

@@ -0,0 +1,6 @@
<<<foo bar baz>>>
foo
bar
baz

View File

@@ -0,0 +1 @@
[[elisp:(local-set-key "\M-\x" 'foo-bar-baz)]]

View File

@@ -0,0 +1,2 @@
* DONE
*

View File

@@ -0,0 +1,6 @@
#+TODO: TODO(t) INPROGRESS(i/!) | DONE(d!) CANCELED(c@/!)
# ! : Log changes leading to this state.
# @ : Log changes leading to this state and prompt for a comment to include.
# /! : Log changes leaving this state if and only if to a state that does not log. This can be combined with the above like WAIT(w!/!) or DELAYED(d@/!)
* INPROGRESS
- State "TODO" from "INPROGRESS" [2023-09-14 Thu 02:13]

View File

@@ -0,0 +1,6 @@
#+STARTUP: odd
* Foo
***** Bar
* Baz
*** Lorem
* Ipsum

View File

@@ -14,7 +14,9 @@ use crate::LocalFileAccessInterface;
pub fn run_anonymous_compare<P: AsRef<str>>( pub fn run_anonymous_compare<P: AsRef<str>>(
org_contents: P, org_contents: P,
) -> Result<(), Box<dyn std::error::Error>> { ) -> Result<(), Box<dyn std::error::Error>> {
let org_contents = org_contents.as_ref(); // TODO: This is a work-around to pretend that dos line endings do not exist. It would be better to handle the difference in line endings.
let org_contents = org_contents.as_ref().replace("\r\n", "\n");
let org_contents = org_contents.as_str();
eprintln!("Using emacs version: {}", get_emacs_version()?.trim()); eprintln!("Using emacs version: {}", get_emacs_version()?.trim());
eprintln!("Using org-mode version: {}", get_org_mode_version()?.trim()); eprintln!("Using org-mode version: {}", get_org_mode_version()?.trim());
let rust_parsed = parse(org_contents)?; let rust_parsed = parse(org_contents)?;
@@ -44,6 +46,8 @@ pub fn run_compare_on_file<P: AsRef<Path>>(org_path: P) -> Result<(), Box<dyn st
.parent() .parent()
.ok_or("Should be contained inside a directory.")?; .ok_or("Should be contained inside a directory.")?;
let org_contents = std::fs::read_to_string(org_path)?; let org_contents = std::fs::read_to_string(org_path)?;
// TODO: This is a work-around to pretend that dos line endings do not exist. It would be better to handle the difference in line endings.
let org_contents = org_contents.replace("\r\n", "\n");
let org_contents = org_contents.as_str(); let org_contents = org_contents.as_str();
let file_access_interface = LocalFileAccessInterface { let file_access_interface = LocalFileAccessInterface {
working_directory: Some(parent_directory.to_path_buf()), working_directory: Some(parent_directory.to_path_buf()),

View File

@@ -8,6 +8,7 @@ use super::util::assert_name;
use super::util::get_property; use super::util::get_property;
use crate::types::AngleLink; use crate::types::AngleLink;
use crate::types::Bold; use crate::types::Bold;
use crate::types::CheckboxType;
use crate::types::Citation; use crate::types::Citation;
use crate::types::CitationReference; use crate::types::CitationReference;
use crate::types::Clock; use crate::types::Clock;
@@ -546,14 +547,26 @@ fn compare_heading<'s>(
}; };
// Compare title // Compare title
let title = get_property(emacs, ":title")?.ok_or("Missing :title attribute.")?; let title = get_property(emacs, ":title")?;
let title_status = title match (title, rust.title.len()) {
.as_list()? (None, 0) => {}
.iter() (None, _) => {
.zip(rust.title.iter()) this_status = DiffStatus::Bad;
.map(|(emacs_child, rust_child)| compare_object(source, emacs_child, rust_child)) message = Some(format!(
.collect::<Result<Vec<_>, _>>()?; "Titles do not match (emacs != rust): {:?} != {:?}",
child_status.push(artificial_diff_scope("title".to_owned(), title_status)?); title, rust.title
))
}
(Some(title), _) => {
let title_status = title
.as_list()?
.iter()
.zip(rust.title.iter())
.map(|(emacs_child, rust_child)| compare_object(source, emacs_child, rust_child))
.collect::<Result<Vec<_>, _>>()?;
child_status.push(artificial_diff_scope("title".to_owned(), title_status)?);
}
};
// Compare priority // Compare priority
let priority = get_property(emacs, ":priority")?; let priority = get_property(emacs, ":priority")?;
@@ -787,7 +800,26 @@ fn compare_plain_list_item<'s>(
contents_status, contents_status,
)?); )?);
// TODO: compare :bullet :checkbox :counter :pre-blank // TODO: compare :bullet :counter :pre-blank
// Compare checkbox
let checkbox = get_property(emacs, ":checkbox")?
.map(Token::as_atom)
.map_or(Ok(None), |r| r.map(Some))?
.unwrap_or("nil");
match (checkbox, &rust.checkbox) {
("nil", None) => {}
("off", Some((CheckboxType::Off, _))) => {}
("trans", Some((CheckboxType::Trans, _))) => {}
("on", Some((CheckboxType::On, _))) => {}
_ => {
this_status = DiffStatus::Bad;
message = Some(format!(
"Checkbox mismatch (emacs != rust) {:?} != {:?}",
checkbox, rust.checkbox
));
}
};
Ok(DiffResult { Ok(DiffResult {
status: this_status, status: this_status,
@@ -1914,6 +1946,8 @@ fn compare_regular_link<'s>(
Ok(_) => {} Ok(_) => {}
}; };
// TODO: Compare :type :path :format :raw-link :application :search-option
Ok(DiffResult { Ok(DiffResult {
status: this_status, status: this_status,
name: emacs_name.to_owned(), name: emacs_name.to_owned(),

View File

@@ -11,6 +11,8 @@ where
let elisp_script = format!( let elisp_script = format!(
r#"(progn r#"(progn
(erase-buffer) (erase-buffer)
(require 'org)
(defun org-table-align () t)
(insert "{escaped_file_contents}") (insert "{escaped_file_contents}")
(org-mode) (org-mode)
(message "%s" (pp-to-string (org-element-parse-buffer))) (message "%s" (pp-to-string (org-element-parse-buffer)))
@@ -42,6 +44,8 @@ where
))?; ))?;
let elisp_script = format!( let elisp_script = format!(
r#"(progn r#"(progn
(require 'org)
(defun org-table-align () t)
(org-mode) (org-mode)
(message "%s" (pp-to-string (org-element-parse-buffer))) (message "%s" (pp-to-string (org-element-parse-buffer)))
)"# )"#

View File

@@ -2,6 +2,7 @@ use std::collections::BTreeSet;
use super::FileAccessInterface; use super::FileAccessInterface;
use super::LocalFileAccessInterface; use super::LocalFileAccessInterface;
use crate::types::IndentationLevel;
use crate::types::Object; use crate::types::Object;
// TODO: Ultimately, I think we'll need most of this: https://orgmode.org/manual/In_002dbuffer-Settings.html // TODO: Ultimately, I think we'll need most of this: https://orgmode.org/manual/In_002dbuffer-Settings.html
@@ -12,6 +13,15 @@ pub struct GlobalSettings<'g, 's> {
pub file_access: &'g dyn FileAccessInterface, pub file_access: &'g dyn FileAccessInterface,
pub in_progress_todo_keywords: BTreeSet<String>, pub in_progress_todo_keywords: BTreeSet<String>,
pub complete_todo_keywords: BTreeSet<String>, pub complete_todo_keywords: BTreeSet<String>,
/// Set to true to allow for plain lists using single letters as the bullet in the same way that numbers are used.
///
/// Corresponds to the org-list-allow-alphabetical elisp variable.
pub org_list_allow_alphabetical: bool,
/// How many spaces a tab should be equal to.
///
/// Corresponds to the tab-width elisp variable.
pub tab_width: IndentationLevel,
} }
impl<'g, 's> GlobalSettings<'g, 's> { impl<'g, 's> GlobalSettings<'g, 's> {
@@ -23,6 +33,8 @@ impl<'g, 's> GlobalSettings<'g, 's> {
}, },
in_progress_todo_keywords: BTreeSet::new(), in_progress_todo_keywords: BTreeSet::new(),
complete_todo_keywords: BTreeSet::new(), complete_todo_keywords: BTreeSet::new(),
org_list_allow_alphabetical: false,
tab_width: 8,
} }
} }
} }

View File

@@ -141,7 +141,7 @@ fn _detect_element<'b, 'g, 'r, 's>(
can_be_paragraph: bool, can_be_paragraph: bool,
) -> Res<OrgSource<'s>, ()> { ) -> Res<OrgSource<'s>, ()> {
if alt(( if alt((
detect_plain_list, parser_with_context!(detect_plain_list)(context),
detect_footnote_definition, detect_footnote_definition,
detect_diary_sexp, detect_diary_sexp,
detect_comment, detect_comment,

View File

@@ -1,10 +1,10 @@
use nom::branch::alt; use nom::branch::alt;
use nom::bytes::complete::tag; use nom::bytes::complete::tag;
use nom::bytes::complete::tag_no_case;
use nom::character::complete::satisfy; use nom::character::complete::satisfy;
use nom::combinator::eof; use nom::combinator::eof;
use nom::combinator::peek; use nom::combinator::peek;
use nom::combinator::recognize; use nom::combinator::recognize;
use nom::sequence::tuple;
use super::org_source::OrgSource; use super::org_source::OrgSource;
use super::util::maybe_consume_object_trailing_whitespace_if_not_exiting; use super::util::maybe_consume_object_trailing_whitespace_if_not_exiting;
@@ -439,7 +439,7 @@ pub(crate) fn entity<'b, 'g, 'r, 's>(
) -> Res<OrgSource<'s>, Entity<'s>> { ) -> Res<OrgSource<'s>, Entity<'s>> {
let (remaining, _) = tag("\\")(input)?; let (remaining, _) = tag("\\")(input)?;
let (remaining, entity_name) = name(context, remaining)?; let (remaining, entity_name) = name(context, remaining)?;
let (remaining, _) = alt((tag("{}"), peek(recognize(entity_end))))(remaining)?;
let (remaining, _trailing_whitespace) = let (remaining, _trailing_whitespace) =
maybe_consume_object_trailing_whitespace_if_not_exiting(context, remaining)?; maybe_consume_object_trailing_whitespace_if_not_exiting(context, remaining)?;
@@ -460,9 +460,12 @@ fn name<'b, 'g, 'r, 's>(
) -> Res<OrgSource<'s>, OrgSource<'s>> { ) -> Res<OrgSource<'s>, OrgSource<'s>> {
// TODO: This should be defined by org-entities and optionally org-entities-user // TODO: This should be defined by org-entities and optionally org-entities-user
for entity in ORG_ENTITIES { for entity in ORG_ENTITIES {
let result = tag_no_case::<_, _, CustomError<_>>(entity)(input); let result = tuple((
tag::<_, _, CustomError<_>>(entity),
alt((tag("{}"), peek(recognize(entity_end)))),
))(input);
match result { match result {
Ok((remaining, ent)) => { Ok((remaining, (ent, _))) => {
return Ok((remaining, ent)); return Ok((remaining, ent));
} }
Err(_) => {} Err(_) => {}

View File

@@ -6,12 +6,13 @@ use nom::character::complete::space0;
use nom::character::complete::space1; use nom::character::complete::space1;
use nom::combinator::eof; use nom::combinator::eof;
use nom::combinator::not; use nom::combinator::not;
use nom::combinator::opt; use nom::combinator::recognize;
use nom::multi::many0; use nom::multi::many0;
use nom::sequence::preceded; use nom::sequence::preceded;
use nom::sequence::tuple; use nom::sequence::tuple;
use super::org_source::OrgSource; use super::org_source::OrgSource;
use super::util::org_line_ending;
use crate::context::parser_with_context; use crate::context::parser_with_context;
use crate::context::RefContext; use crate::context::RefContext;
use crate::error::Res; use crate::error::Res;
@@ -47,10 +48,10 @@ fn fixed_width_area_line<'b, 'g, 'r, 's>(
) -> Res<OrgSource<'s>, OrgSource<'s>> { ) -> Res<OrgSource<'s>, OrgSource<'s>> {
start_of_line(input)?; start_of_line(input)?;
let (remaining, _indent) = space0(input)?; let (remaining, _indent) = space0(input)?;
let (remaining, (_colon, _leading_whitespace_and_content, _line_ending)) = tuple(( let (remaining, _) = tuple((
tag(":"), tag(":"),
opt(tuple((space1, is_not("\r\n")))), alt((recognize(tuple((space1, is_not("\r\n")))), space0)),
alt((line_ending, eof)), org_line_ending,
))(remaining)?; ))(remaining)?;
let source = get_consumed(input, remaining); let source = get_consumed(input, remaining);
Ok((remaining, source)) Ok((remaining, source))

View File

@@ -2,7 +2,6 @@ use nom::branch::alt;
use nom::bytes::complete::tag; use nom::bytes::complete::tag;
use nom::bytes::complete::tag_no_case; use nom::bytes::complete::tag_no_case;
use nom::bytes::complete::take_while; use nom::bytes::complete::take_while;
use nom::character::complete::digit1;
use nom::character::complete::space0; use nom::character::complete::space0;
use nom::combinator::opt; use nom::combinator::opt;
use nom::combinator::recognize; use nom::combinator::recognize;
@@ -94,10 +93,7 @@ pub(crate) fn footnote_definition<'b, 'g, 'r, 's>(
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))] #[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]
pub(crate) fn label<'s>(input: OrgSource<'s>) -> Res<OrgSource<'s>, OrgSource<'s>> { pub(crate) fn label<'s>(input: OrgSource<'s>) -> Res<OrgSource<'s>, OrgSource<'s>> {
alt(( take_while(|c| WORD_CONSTITUENT_CHARACTERS.contains(c) || "-_".contains(c))(input)
digit1,
take_while(|c| WORD_CONSTITUENT_CHARACTERS.contains(c) || "-_".contains(c)),
))(input)
} }
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))] #[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]

View File

@@ -4,11 +4,14 @@ use nom::bytes::complete::tag_no_case;
use nom::character::complete::line_ending; use nom::character::complete::line_ending;
use nom::character::complete::space0; use nom::character::complete::space0;
use nom::character::complete::space1; use nom::character::complete::space1;
use nom::combinator::consumed;
use nom::combinator::eof; use nom::combinator::eof;
use nom::combinator::not; use nom::combinator::not;
use nom::combinator::opt; use nom::combinator::opt;
use nom::combinator::verify; use nom::combinator::verify;
use nom::multi::many0;
use nom::multi::many_till; use nom::multi::many_till;
use nom::sequence::preceded;
use nom::sequence::tuple; use nom::sequence::tuple;
use super::org_source::OrgSource; use super::org_source::OrgSource;
@@ -80,25 +83,23 @@ pub(crate) fn greater_block<'b, 'g, 'r, 's>(
let element_matcher = parser_with_context!(element(true))(&parser_context); let element_matcher = parser_with_context!(element(true))(&parser_context);
let exit_matcher = parser_with_context!(exit_matcher_parser)(&parser_context); let exit_matcher = parser_with_context!(exit_matcher_parser)(&parser_context);
// Check for a completely empty block not(exit_matcher)(remaining)?;
let (remaining, children) = match tuple(( let (remaining, leading_blank_lines) = opt(consumed(tuple((
not(exit_matcher),
blank_line, blank_line,
many_till(blank_line, exit_matcher), many0(preceded(not(exit_matcher), blank_line)),
))(remaining) ))))(remaining)?;
{ let leading_blank_lines =
Ok((remain, (_not_immediate_exit, first_line, (_trailing_whitespace, _exit_contents)))) => { leading_blank_lines.map(|(source, (first_line, _remaining_lines))| {
let mut element = Element::Paragraph(Paragraph::of_text(first_line.into())); let mut element = Element::Paragraph(Paragraph::of_text(first_line.into()));
let source = get_consumed(remaining, remain);
element.set_source(source.into()); element.set_source(source.into());
(remain, vec![element]) element
} });
Err(_) => { let (remaining, (mut children, _exit_contents)) =
let (remaining, (children, _exit_contents)) = many_till(element_matcher, exit_matcher)(remaining)?;
many_till(element_matcher, exit_matcher)(remaining)?; if let Some(lines) = leading_blank_lines {
(remaining, children) children.insert(0, lines);
} }
};
let (remaining, _end) = exit_with_name(&parser_context, remaining)?; let (remaining, _end) = exit_with_name(&parser_context, remaining)?;
// Not checking if parent exit matcher is causing exit because the greater_block_end matcher asserts we matched a full greater block // Not checking if parent exit matcher is causing exit because the greater_block_end matcher asserts we matched a full greater block
@@ -126,7 +127,6 @@ fn parameters<'s>(input: OrgSource<'s>) -> Res<OrgSource<'s>, OrgSource<'s>> {
} }
fn greater_block_end<'c>(name: &'c str) -> impl ContextMatcher + 'c { fn greater_block_end<'c>(name: &'c str) -> impl ContextMatcher + 'c {
// TODO: Can this be done without making an owned copy?
move |context, input: OrgSource<'_>| _greater_block_end(context, input, name) move |context, input: OrgSource<'_>| _greater_block_end(context, input, name)
} }

View File

@@ -1,13 +1,12 @@
use nom::branch::alt; use nom::branch::alt;
use nom::bytes::complete::tag; use nom::bytes::complete::tag;
use nom::character::complete::anychar; use nom::character::complete::anychar;
use nom::character::complete::line_ending;
use nom::character::complete::space0; use nom::character::complete::space0;
use nom::character::complete::space1; use nom::character::complete::space1;
use nom::combinator::eof;
use nom::combinator::map; use nom::combinator::map;
use nom::combinator::not; use nom::combinator::not;
use nom::combinator::opt; use nom::combinator::opt;
use nom::combinator::peek;
use nom::combinator::recognize; use nom::combinator::recognize;
use nom::combinator::verify; use nom::combinator::verify;
use nom::multi::many0; use nom::multi::many0;
@@ -19,6 +18,9 @@ use nom::sequence::tuple;
use super::org_source::OrgSource; use super::org_source::OrgSource;
use super::section::section; use super::section::section;
use super::util::get_consumed; use super::util::get_consumed;
use super::util::org_line_ending;
use super::util::org_space;
use super::util::org_space_or_line_ending;
use super::util::start_of_line; use super::util::start_of_line;
use crate::context::parser_with_context; use crate::context::parser_with_context;
use crate::context::ContextElement; use crate::context::ContextElement;
@@ -81,10 +83,10 @@ fn _heading<'b, 'g, 'r, 's>(
Heading { Heading {
source: source.into(), source: source.into(),
stars: star_count, stars: star_count,
todo_keyword: maybe_todo_keyword.map(|((todo_keyword_type, todo_keyword), _ws)| { todo_keyword: maybe_todo_keyword.map(|(todo_keyword_type, todo_keyword)| {
(todo_keyword_type, Into::<&str>::into(todo_keyword)) (todo_keyword_type, Into::<&str>::into(todo_keyword))
}), }),
priority_cookie: maybe_priority.map(|(priority, _)| priority), priority_cookie: maybe_priority.map(|(_, priority)| priority),
title, title,
tags: heading_tags, tags: heading_tags,
children, children,
@@ -109,9 +111,9 @@ fn headline<'b, 'g, 'r, 's>(
OrgSource<'s>, OrgSource<'s>,
( (
usize, usize,
Option<((TodoKeywordType, OrgSource<'s>), OrgSource<'s>)>, Option<(TodoKeywordType, OrgSource<'s>)>,
Option<(PriorityCookie, OrgSource<'s>)>, Option<(OrgSource<'s>, PriorityCookie)>,
Option<(OrgSource<'s>, OrgSource<'s>)>, Option<OrgSource<'s>>,
Vec<Object<'s>>, Vec<Object<'s>>,
Vec<&'s str>, Vec<&'s str>,
), ),
@@ -122,45 +124,45 @@ fn headline<'b, 'g, 'r, 's>(
}); });
let parser_context = context.with_additional_node(&parser_context); let parser_context = context.with_additional_node(&parser_context);
let ( let (remaining, (_, star_count, _)) = tuple((
remaining,
(
_,
star_count,
_,
maybe_todo_keyword,
maybe_priority,
maybe_comment,
title,
maybe_tags,
_,
_,
),
) = tuple((
start_of_line, start_of_line,
verify(many1_count(tag("*")), |star_count| { verify(many1_count(tag("*")), |star_count| {
*star_count > parent_stars *star_count > parent_stars
}), }),
space1, peek(org_space),
opt(tuple((
parser_with_context!(heading_keyword)(&parser_context),
space1,
))),
opt(tuple((priority_cookie, space1))),
opt(tuple((tag("COMMENT"), space1))),
many1(parser_with_context!(standard_set_object)(&parser_context)),
opt(tuple((space0, tags))),
space0,
alt((line_ending, eof)),
))(input)?; ))(input)?;
let (remaining, maybe_todo_keyword) = opt(tuple((
space1,
parser_with_context!(heading_keyword)(&parser_context),
peek(org_space_or_line_ending),
)))(remaining)?;
let (remaining, maybe_priority) = opt(tuple((space1, priority_cookie)))(remaining)?;
let (remaining, maybe_comment) = opt(tuple((
space1,
tag("COMMENT"),
peek(org_space_or_line_ending),
)))(remaining)?;
let (remaining, maybe_title) = opt(tuple((
space1,
many1(parser_with_context!(standard_set_object)(&parser_context)),
)))(remaining)?;
let (remaining, maybe_tags) = opt(tuple((space0, tags)))(remaining)?;
let (remaining, _) = tuple((space0, org_line_ending))(remaining)?;
Ok(( Ok((
remaining, remaining,
( (
star_count, star_count,
maybe_todo_keyword, maybe_todo_keyword.map(|(_, todo, _)| todo),
maybe_priority, maybe_priority,
maybe_comment, maybe_comment.map(|(_, comment, _)| comment),
title, maybe_title.map(|(_, title)| title).unwrap_or(Vec::new()),
maybe_tags maybe_tags
.map(|(_ws, tags)| { .map(|(_ws, tags)| {
tags.into_iter() tags.into_iter()
@@ -177,10 +179,7 @@ fn headline_title_end<'b, 'g, 'r, 's>(
_context: RefContext<'b, 'g, 'r, 's>, _context: RefContext<'b, 'g, 'r, 's>,
input: OrgSource<'s>, input: OrgSource<'s>,
) -> Res<OrgSource<'s>, OrgSource<'s>> { ) -> Res<OrgSource<'s>, OrgSource<'s>> {
recognize(tuple(( recognize(tuple((space0, opt(tuple((tags, space0))), org_line_ending)))(input)
opt(tuple((space0, tags, space0))),
alt((line_ending, eof)),
)))(input)
} }
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))] #[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]

View File

@@ -44,9 +44,17 @@ pub(crate) fn todo_keywords<'s>(input: &'s str) -> Res<&'s str, (Vec<&'s str>, V
} }
fn todo_keyword_word<'s>(input: &'s str) -> Res<&'s str, &'s str> { fn todo_keyword_word<'s>(input: &'s str) -> Res<&'s str, &'s str> {
verify(take_till(|c| " \t\r\n|".contains(c)), |result: &str| { let (remaining, keyword) = verify(take_till(|c| "( \t\r\n|".contains(c)), |result: &str| {
!result.is_empty() !result.is_empty()
})(input) })(input)?;
let (remaining, _) = opt(tuple((
tag("("),
take_till(|c| "() \t\r\n|".contains(c)),
tag(")"),
)))(remaining)?;
Ok((remaining, keyword))
} }
#[cfg(test)] #[cfg(test)]
mod tests { mod tests {

View File

@@ -7,6 +7,7 @@ use nom::character::complete::one_of;
use nom::character::complete::space0; use nom::character::complete::space0;
use nom::character::complete::space1; use nom::character::complete::space1;
use nom::combinator::eof; use nom::combinator::eof;
use nom::combinator::map;
use nom::combinator::not; use nom::combinator::not;
use nom::combinator::opt; use nom::combinator::opt;
use nom::combinator::peek; use nom::combinator::peek;
@@ -21,6 +22,7 @@ use super::element_parser::element;
use super::object_parser::standard_set_object; use super::object_parser::standard_set_object;
use super::org_source::OrgSource; use super::org_source::OrgSource;
use super::util::include_input; use super::util::include_input;
use super::util::indentation_level;
use super::util::non_whitespace_character; use super::util::non_whitespace_character;
use crate::context::parser_with_context; use crate::context::parser_with_context;
use crate::context::ContextElement; use crate::context::ContextElement;
@@ -35,18 +37,24 @@ use crate::parser::util::blank_line;
use crate::parser::util::exit_matcher_parser; use crate::parser::util::exit_matcher_parser;
use crate::parser::util::get_consumed; use crate::parser::util::get_consumed;
use crate::parser::util::maybe_consume_trailing_whitespace_if_not_exiting; use crate::parser::util::maybe_consume_trailing_whitespace_if_not_exiting;
use crate::parser::util::org_space;
use crate::parser::util::start_of_line; use crate::parser::util::start_of_line;
use crate::types::CheckboxType;
use crate::types::IndentationLevel;
use crate::types::Object; use crate::types::Object;
use crate::types::PlainList; use crate::types::PlainList;
use crate::types::PlainListItem; use crate::types::PlainListItem;
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))] #[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]
pub(crate) fn detect_plain_list<'s>(input: OrgSource<'s>) -> Res<OrgSource<'s>, ()> { pub(crate) fn detect_plain_list<'b, 'g, 'r, 's>(
context: RefContext<'b, 'g, 'r, 's>,
input: OrgSource<'s>,
) -> Res<OrgSource<'s>, ()> {
if verify( if verify(
tuple(( tuple((
start_of_line, start_of_line,
space0, space0,
bullet, parser_with_context!(bullet)(context),
alt((space1, line_ending, eof)), alt((space1, line_ending, eof)),
)), )),
|(_start, indent, bull, _after_whitespace)| { |(_start, indent, bull, _after_whitespace)| {
@@ -81,7 +89,7 @@ pub(crate) fn plain_list<'b, 'g, 'r, 's>(
let parser_context = parser_context.with_additional_node(&contexts[2]); let parser_context = parser_context.with_additional_node(&contexts[2]);
// children stores tuple of (input string, parsed object) so we can re-parse the final item // children stores tuple of (input string, parsed object) so we can re-parse the final item
let mut children = Vec::new(); let mut children = Vec::new();
let mut first_item_indentation: Option<usize> = None; let mut first_item_indentation: Option<IndentationLevel> = None;
let mut remaining = input; let mut remaining = input;
// The final list item does not consume trailing blank lines (which instead get consumed by the list). We have three options here: // The final list item does not consume trailing blank lines (which instead get consumed by the list). We have three options here:
@@ -142,17 +150,20 @@ fn plain_list_item<'b, 'g, 'r, 's>(
input: OrgSource<'s>, input: OrgSource<'s>,
) -> Res<OrgSource<'s>, PlainListItem<'s>> { ) -> Res<OrgSource<'s>, PlainListItem<'s>> {
start_of_line(input)?; start_of_line(input)?;
let (remaining, leading_whitespace) = space0(input)?; let (remaining, (indent_level, _leading_whitespace)) = indentation_level(context, input)?;
// It is fine that we get the indent level using the number of bytes rather than the number of characters because nom's space0 only matches space and tab (0x20 and 0x09) let (remaining, bull) = verify(
let indent_level = leading_whitespace.len(); parser_with_context!(bullet)(context),
let (remaining, bull) = verify(bullet, |bull: &OrgSource<'_>| { |bull: &OrgSource<'_>| Into::<&str>::into(bull) != "*" || indent_level > 0,
Into::<&str>::into(bull) != "*" || indent_level > 0 )(remaining)?;
})(remaining)?;
let (remaining, _maybe_counter_set) = let (remaining, _maybe_counter_set) = opt(tuple((
opt(tuple((space1, tag("[@"), counter, tag("]"))))(remaining)?; space1,
tag("[@"),
parser_with_context!(counter)(context),
tag("]"),
)))(remaining)?;
// TODO: parse checkbox let (remaining, maybe_checkbox) = opt(tuple((space1, item_checkbox)))(remaining)?;
let (remaining, maybe_tag) = let (remaining, maybe_tag) =
opt(tuple((space1, parser_with_context!(item_tag)(context))))(remaining)?; opt(tuple((space1, parser_with_context!(item_tag)(context))))(remaining)?;
@@ -170,6 +181,7 @@ fn plain_list_item<'b, 'g, 'r, 's>(
source: source.into(), source: source.into(),
indentation: indent_level, indentation: indent_level,
bullet: bull.into(), bullet: bull.into(),
checkbox: None,
tag: maybe_tag tag: maybe_tag
.map(|(_ws, item_tag)| item_tag) .map(|(_ws, item_tag)| item_tag)
.unwrap_or(Vec::new()), .unwrap_or(Vec::new()),
@@ -219,6 +231,8 @@ fn plain_list_item<'b, 'g, 'r, 's>(
source: source.into(), source: source.into(),
indentation: indent_level, indentation: indent_level,
bullet: bull.into(), bullet: bull.into(),
checkbox: maybe_checkbox
.map(|(_, (checkbox_type, source))| (checkbox_type, Into::<&str>::into(source))),
tag: maybe_tag tag: maybe_tag
.map(|(_ws, item_tag)| item_tag) .map(|(_ws, item_tag)| item_tag)
.unwrap_or(Vec::new()), .unwrap_or(Vec::new()),
@@ -228,18 +242,36 @@ fn plain_list_item<'b, 'g, 'r, 's>(
} }
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))] #[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]
fn bullet<'s>(i: OrgSource<'s>) -> Res<OrgSource<'s>, OrgSource<'s>> { fn bullet<'b, 'g, 'r, 's>(
context: RefContext<'b, 'g, 'r, 's>,
input: OrgSource<'s>,
) -> Res<OrgSource<'s>, OrgSource<'s>> {
alt(( alt((
tag("*"), tag("*"),
tag("-"), tag("-"),
tag("+"), tag("+"),
recognize(tuple((counter, alt((tag("."), tag(")")))))), recognize(tuple((
))(i) parser_with_context!(counter)(context),
alt((tag("."), tag(")"))),
))),
))(input)
} }
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))] #[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]
fn counter<'s>(i: OrgSource<'s>) -> Res<OrgSource<'s>, OrgSource<'s>> { fn counter<'b, 'g, 'r, 's>(
alt((recognize(one_of("abcdefghijklmnopqrstuvwxyz")), digit1))(i) context: RefContext<'b, 'g, 'r, 's>,
input: OrgSource<'s>,
) -> Res<OrgSource<'s>, OrgSource<'s>> {
if context.get_global_settings().org_list_allow_alphabetical {
alt((
recognize(one_of(
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ",
)),
digit1,
))(input)
} else {
digit1(input)
}
} }
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))] #[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]
@@ -255,7 +287,7 @@ fn plain_list_end<'b, 'g, 'r, 's>(
)))(input) )))(input)
} }
const fn plain_list_item_end(indent_level: usize) -> impl ContextMatcher { const fn plain_list_item_end(indent_level: IndentationLevel) -> impl ContextMatcher {
let line_indented_lte_matcher = line_indented_lte(indent_level); let line_indented_lte_matcher = line_indented_lte(indent_level);
move |context, input: OrgSource<'_>| { move |context, input: OrgSource<'_>| {
_plain_list_item_end(context, input, &line_indented_lte_matcher) _plain_list_item_end(context, input, &line_indented_lte_matcher)
@@ -278,20 +310,23 @@ fn _plain_list_item_end<'b, 'g, 'r, 's>(
)))(input) )))(input)
} }
const fn line_indented_lte(indent_level: usize) -> impl ContextMatcher { const fn line_indented_lte(indent_level: IndentationLevel) -> impl ContextMatcher {
move |context, input: OrgSource<'_>| _line_indented_lte(context, input, indent_level) move |context, input: OrgSource<'_>| _line_indented_lte(context, input, indent_level)
} }
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))] #[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]
fn _line_indented_lte<'b, 'g, 'r, 's>( fn _line_indented_lte<'b, 'g, 'r, 's>(
_context: RefContext<'b, 'g, 'r, 's>, context: RefContext<'b, 'g, 'r, 's>,
input: OrgSource<'s>, input: OrgSource<'s>,
indent_level: usize, indent_level: IndentationLevel,
) -> Res<OrgSource<'s>, OrgSource<'s>> { ) -> Res<OrgSource<'s>, OrgSource<'s>> {
let matched = recognize(verify( let matched = recognize(verify(
tuple((space0::<OrgSource<'_>, _>, non_whitespace_character)), tuple((
parser_with_context!(indentation_level)(context),
non_whitespace_character,
)),
// It is fine that we get the indent level using the number of bytes rather than the number of characters because nom's space0 only matches space and tab (0x20 and 0x09) // It is fine that we get the indent level using the number of bytes rather than the number of characters because nom's space0 only matches space and tab (0x20 and 0x09)
|(_space0, _anychar)| _space0.len() <= indent_level, |((indentation_level, _leading_whitespace), _anychar)| *indentation_level <= indent_level,
))(input)?; ))(input)?;
Ok(matched) Ok(matched)
@@ -363,6 +398,18 @@ fn item_tag_post_gap<'b, 'g, 'r, 's>(
)(input) )(input)
} }
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]
fn item_checkbox<'s>(input: OrgSource<'s>) -> Res<OrgSource<'s>, (CheckboxType, OrgSource<'s>)> {
alt((
map(
recognize(tuple((tag("["), org_space, tag("]")))),
|capture| (CheckboxType::Off, capture),
),
map(tag("[-]"), |capture| (CheckboxType::Trans, capture)),
map(tag("[X]"), |capture| (CheckboxType::On, capture)),
))(input)
}
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))] #[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]
fn detect_contentless_item_contents<'b, 'g, 'r, 's>( fn detect_contentless_item_contents<'b, 'g, 'r, 's>(
context: RefContext<'b, 'g, 'r, 's>, context: RefContext<'b, 'g, 'r, 's>,
@@ -558,21 +605,30 @@ dolar"#,
r#"+ r#"+
"#, "#,
); );
let result = detect_plain_list(input); let global_settings = GlobalSettings::default();
let initial_context = ContextElement::document_context();
let initial_context = Context::new(&global_settings, List::new(&initial_context));
let result = detect_plain_list(&initial_context, input);
assert!(result.is_ok()); assert!(result.is_ok());
} }
#[test] #[test]
fn detect_eof() { fn detect_eof() {
let input = OrgSource::new(r#"+"#); let input = OrgSource::new(r#"+"#);
let result = detect_plain_list(input); let global_settings = GlobalSettings::default();
let initial_context = ContextElement::document_context();
let initial_context = Context::new(&global_settings, List::new(&initial_context));
let result = detect_plain_list(&initial_context, input);
assert!(result.is_ok()); assert!(result.is_ok());
} }
#[test] #[test]
fn detect_no_gap() { fn detect_no_gap() {
let input = OrgSource::new(r#"+foo"#); let input = OrgSource::new(r#"+foo"#);
let result = detect_plain_list(input); let global_settings = GlobalSettings::default();
let initial_context = ContextElement::document_context();
let initial_context = Context::new(&global_settings, List::new(&initial_context));
let result = detect_plain_list(&initial_context, input);
// Since there is no whitespace after the '+' this is a paragraph, not a plain list. // Since there is no whitespace after the '+' this is a paragraph, not a plain list.
assert!(result.is_err()); assert!(result.is_err());
} }
@@ -580,7 +636,10 @@ dolar"#,
#[test] #[test]
fn detect_with_gap() { fn detect_with_gap() {
let input = OrgSource::new(r#"+ foo"#); let input = OrgSource::new(r#"+ foo"#);
let result = detect_plain_list(input); let global_settings = GlobalSettings::default();
let initial_context = ContextElement::document_context();
let initial_context = Context::new(&global_settings, List::new(&initial_context));
let result = detect_plain_list(&initial_context, input);
assert!(result.is_ok()); assert!(result.is_ok());
} }
} }

View File

@@ -1,17 +1,24 @@
use nom::branch::alt; use nom::branch::alt;
use nom::bytes::complete::tag; use nom::bytes::complete::is_not;
use nom::bytes::complete::tag_no_case;
use nom::character::complete::anychar; use nom::character::complete::anychar;
use nom::combinator::map; use nom::character::complete::line_ending;
use nom::character::complete::one_of;
use nom::combinator::peek; use nom::combinator::peek;
use nom::combinator::recognize; use nom::combinator::recognize;
use nom::combinator::verify; use nom::combinator::verify;
use nom::multi::many1;
use nom::multi::many_till; use nom::multi::many_till;
use super::org_source::OrgSource; use super::org_source::OrgSource;
use super::radio_link::RematchObject; use super::radio_link::RematchObject;
use super::util::exit_matcher_parser; use super::util::exit_matcher_parser;
use super::util::get_consumed;
use super::util::org_space_or_line_ending;
use crate::context::parser_with_context; use crate::context::parser_with_context;
use crate::context::RefContext; use crate::context::RefContext;
use crate::error::CustomError;
use crate::error::MyError;
use crate::error::Res; use crate::error::Res;
use crate::types::Object; use crate::types::Object;
use crate::types::PlainText; use crate::types::PlainText;
@@ -72,11 +79,52 @@ impl<'x> RematchObject<'x> for PlainText<'x> {
_context: RefContext<'b, 'g, 'r, 's>, _context: RefContext<'b, 'g, 'r, 's>,
input: OrgSource<'s>, input: OrgSource<'s>,
) -> Res<OrgSource<'s>, Object<'s>> { ) -> Res<OrgSource<'s>, Object<'s>> {
map(tag(self.source), |s| { let mut remaining = input;
let mut goal = self.source;
loop {
if goal.is_empty() {
break;
}
// let is_whitespace = recognize(many1(org_space_or_line_ending))(input);
let is_not_whitespace = is_not::<&str, &str, CustomError<_>>(" \t\r\n")(goal);
match is_not_whitespace {
Ok((new_goal, payload)) => {
let (new_remaining, _) = tag_no_case(payload)(remaining)?;
remaining = new_remaining;
goal = new_goal;
continue;
}
Err(_) => {}
};
let is_whitespace = recognize(many1(alt((
recognize(one_of::<&str, &str, CustomError<_>>(" \t")),
line_ending,
))))(goal);
match is_whitespace {
Ok((new_goal, _)) => {
let (new_remaining, _) = many1(org_space_or_line_ending)(remaining)?;
remaining = new_remaining;
goal = new_goal;
continue;
}
Err(_) => {}
};
return Err(nom::Err::Error(CustomError::MyError(MyError(
"Target does not match.".into(),
))));
}
let source = get_consumed(input, remaining);
Ok((
remaining,
Object::PlainText(PlainText { Object::PlainText(PlainText {
source: Into::<&str>::into(s), source: Into::<&str>::into(source),
}) }),
})(input) ))
} }
} }

View File

@@ -1,16 +1,16 @@
use nom::branch::alt; use nom::branch::alt;
use nom::bytes::complete::is_not;
use nom::bytes::complete::tag; use nom::bytes::complete::tag;
use nom::bytes::complete::tag_no_case; use nom::bytes::complete::tag_no_case;
use nom::character::complete::line_ending;
use nom::character::complete::space0; use nom::character::complete::space0;
use nom::character::complete::space1; use nom::character::complete::space1;
use nom::combinator::eof; use nom::multi::many1;
use nom::multi::separated_list1;
use nom::sequence::tuple; use nom::sequence::tuple;
use super::org_source::OrgSource; use super::org_source::OrgSource;
use super::timestamp::timestamp;
use super::util::maybe_consume_trailing_whitespace_if_not_exiting; use super::util::maybe_consume_trailing_whitespace_if_not_exiting;
use super::util::org_line_ending;
use crate::context::parser_with_context;
use crate::context::RefContext; use crate::context::RefContext;
use crate::error::Res; use crate::error::Res;
use crate::parser::util::get_consumed; use crate::parser::util::get_consumed;
@@ -24,8 +24,9 @@ pub(crate) fn planning<'b, 'g, 'r, 's>(
) -> Res<OrgSource<'s>, Planning<'s>> { ) -> Res<OrgSource<'s>, Planning<'s>> {
start_of_line(input)?; start_of_line(input)?;
let (remaining, _leading_whitespace) = space0(input)?; let (remaining, _leading_whitespace) = space0(input)?;
let (remaining, _planning_parameters) = separated_list1(space1, planning_parameter)(remaining)?; let (remaining, _planning_parameters) =
let (remaining, _trailing_ws) = tuple((space0, alt((line_ending, eof))))(remaining)?; many1(parser_with_context!(planning_parameter)(context))(remaining)?;
let (remaining, _trailing_ws) = tuple((space0, org_line_ending))(remaining)?;
let (remaining, _trailing_ws) = let (remaining, _trailing_ws) =
maybe_consume_trailing_whitespace_if_not_exiting(context, remaining)?; maybe_consume_trailing_whitespace_if_not_exiting(context, remaining)?;
@@ -40,15 +41,17 @@ pub(crate) fn planning<'b, 'g, 'r, 's>(
} }
#[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))] #[cfg_attr(feature = "tracing", tracing::instrument(ret, level = "debug"))]
fn planning_parameter<'s>(input: OrgSource<'s>) -> Res<OrgSource<'s>, OrgSource<'s>> { fn planning_parameter<'b, 'g, 'r, 's>(
context: RefContext<'b, 'g, 'r, 's>,
input: OrgSource<'s>,
) -> Res<OrgSource<'s>, OrgSource<'s>> {
let (remaining, _planning_type) = alt(( let (remaining, _planning_type) = alt((
tag_no_case("DEADLINE"), tag_no_case("DEADLINE"),
tag_no_case("SCHEDULED"), tag_no_case("SCHEDULED"),
tag_no_case("CLOSED"), tag_no_case("CLOSED"),
))(input)?; ))(input)?;
let (remaining, _gap) = tuple((tag(":"), space1))(remaining)?; let (remaining, _gap) = tuple((tag(":"), space1))(remaining)?;
// TODO: Make this invoke the real timestamp parser. let (remaining, _timestamp) = timestamp(context, remaining)?;
let (remaining, _timestamp) = tuple((tag("<"), is_not("\r\n>"), tag(">")))(remaining)?;
let source = get_consumed(input, remaining); let source = get_consumed(input, remaining);
Ok((remaining, source)) Ok((remaining, source))
} }

View File

@@ -2,7 +2,7 @@ use nom::branch::alt;
use nom::bytes::complete::escaped; use nom::bytes::complete::escaped;
use nom::bytes::complete::tag; use nom::bytes::complete::tag;
use nom::bytes::complete::take_till1; use nom::bytes::complete::take_till1;
use nom::character::complete::one_of; use nom::character::complete::anychar;
use nom::combinator::verify; use nom::combinator::verify;
use nom::multi::many_till; use nom::multi::many_till;
@@ -82,7 +82,7 @@ fn pathreg<'b, 'g, 'r, 's>(
_ => false, _ => false,
}), }),
'\\', '\\',
one_of(r#"]"#), anychar,
)(input)?; )(input)?;
Ok((remaining, path)) Ok((remaining, path))
} }

View File

@@ -1,6 +1,7 @@
use nom::branch::alt; use nom::branch::alt;
use nom::bytes::complete::tag; use nom::bytes::complete::tag;
use nom::character::complete::anychar; use nom::character::complete::anychar;
use nom::character::complete::digit0;
use nom::character::complete::digit1; use nom::character::complete::digit1;
use nom::character::complete::one_of; use nom::character::complete::one_of;
use nom::character::complete::space1; use nom::character::complete::space1;
@@ -414,7 +415,7 @@ fn repeater<'b, 'g, 'r, 's>(
// ++ for catch-up type // ++ for catch-up type
// .+ for restart type // .+ for restart type
let (remaining, _mark) = alt((tag("++"), tag("+"), tag(".+")))(input)?; let (remaining, _mark) = alt((tag("++"), tag("+"), tag(".+")))(input)?;
let (remaining, _value) = digit1(remaining)?; let (remaining, _value) = digit0(remaining)?;
// h = hour, d = day, w = week, m = month, y = year // h = hour, d = day, w = week, m = month, y = year
let (remaining, _unit) = recognize(one_of("hdwmy"))(remaining)?; let (remaining, _unit) = recognize(one_of("hdwmy"))(remaining)?;
let source = get_consumed(input, remaining); let source = get_consumed(input, remaining);
@@ -429,7 +430,7 @@ fn warning_delay<'b, 'g, 'r, 's>(
// - for all type // - for all type
// -- for first type // -- for first type
let (remaining, _mark) = alt((tag("--"), tag("-")))(input)?; let (remaining, _mark) = alt((tag("--"), tag("-")))(input)?;
let (remaining, _value) = digit1(remaining)?; let (remaining, _value) = digit0(remaining)?;
// h = hour, d = day, w = week, m = month, y = year // h = hour, d = day, w = week, m = month, y = year
let (remaining, _unit) = recognize(one_of("hdwmy"))(remaining)?; let (remaining, _unit) = recognize(one_of("hdwmy"))(remaining)?;
let source = get_consumed(input, remaining); let source = get_consumed(input, remaining);

View File

@@ -20,6 +20,7 @@ use crate::context::RefContext;
use crate::error::CustomError; use crate::error::CustomError;
use crate::error::MyError; use crate::error::MyError;
use crate::error::Res; use crate::error::Res;
use crate::types::IndentationLevel;
pub(crate) const WORD_CONSTITUENT_CHARACTERS: &str = pub(crate) const WORD_CONSTITUENT_CHARACTERS: &str =
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"; "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
@@ -212,6 +213,9 @@ fn text_until_eol<'r, 's>(
Ok(line.trim()) Ok(line.trim())
} }
/// Return a tuple of (input, output) from a nom parser.
///
/// This is similar to recognize except it returns the input instead of the portion of the input that was consumed.
pub(crate) fn include_input<'s, F, O>( pub(crate) fn include_input<'s, F, O>(
mut inner: F, mut inner: F,
) -> impl FnMut(OrgSource<'s>) -> Res<OrgSource<'s>, (OrgSource<'s>, O)> ) -> impl FnMut(OrgSource<'s>) -> Res<OrgSource<'s>, (OrgSource<'s>, O)>
@@ -223,3 +227,43 @@ where
Ok((remaining, (input, output))) Ok((remaining, (input, output)))
} }
} }
/// Match single space or tab.
///
/// In org-mode syntax, spaces and tabs are interchangeable.
pub(crate) fn org_space<'s>(input: OrgSource<'s>) -> Res<OrgSource<'s>, char> {
one_of(" \t")(input)
}
/// Matches a single space, tab, line ending, or end of file.
///
/// In org-mode syntax there are often delimiters that could be any whitespace at all or the end of file.
pub(crate) fn org_space_or_line_ending<'s>(
input: OrgSource<'s>,
) -> Res<OrgSource<'s>, OrgSource<'s>> {
alt((recognize(org_space), org_line_ending))(input)
}
/// Match a line break or the end of the file.
///
/// In org-mode syntax, the end of the file can serve the same purpose as a line break syntactically.
pub(crate) fn org_line_ending<'s>(input: OrgSource<'s>) -> Res<OrgSource<'s>, OrgSource<'s>> {
alt((line_ending, eof))(input)
}
/// Match the whitespace at the beginning of a line and give it an indentation level.
pub(crate) fn indentation_level<'b, 'g, 'r, 's>(
context: RefContext<'b, 'g, 'r, 's>,
input: OrgSource<'s>,
) -> Res<OrgSource<'s>, (IndentationLevel, OrgSource<'s>)> {
let (remaining, leading_whitespace) = space0(input)?;
let indentation_level = Into::<&str>::into(leading_whitespace)
.chars()
.map(|c| match c {
' ' => 1,
'\t' => context.get_global_settings().tab_width,
_ => unreachable!(),
})
.sum();
Ok((remaining, (indentation_level, leading_whitespace)))
}

View File

@@ -10,15 +10,26 @@ pub struct PlainList<'s> {
pub children: Vec<PlainListItem<'s>>, pub children: Vec<PlainListItem<'s>>,
} }
/// The width that something is indented. For example, a single tab character could be a value of 4 or 8.
pub type IndentationLevel = u16;
#[derive(Debug)] #[derive(Debug)]
pub struct PlainListItem<'s> { pub struct PlainListItem<'s> {
pub source: &'s str, pub source: &'s str,
pub indentation: usize, pub indentation: IndentationLevel,
pub bullet: &'s str, pub bullet: &'s str,
pub checkbox: Option<(CheckboxType, &'s str)>,
pub tag: Vec<Object<'s>>, pub tag: Vec<Object<'s>>,
pub children: Vec<Element<'s>>, pub children: Vec<Element<'s>>,
} }
#[derive(Debug)]
pub enum CheckboxType {
On,
Trans,
Off,
}
#[derive(Debug)] #[derive(Debug)]
pub struct GreaterBlock<'s> { pub struct GreaterBlock<'s> {
pub source: &'s str, pub source: &'s str,

View File

@@ -11,10 +11,12 @@ pub use document::PriorityCookie;
pub use document::Section; pub use document::Section;
pub use document::TodoKeywordType; pub use document::TodoKeywordType;
pub use element::Element; pub use element::Element;
pub use greater_element::CheckboxType;
pub use greater_element::Drawer; pub use greater_element::Drawer;
pub use greater_element::DynamicBlock; pub use greater_element::DynamicBlock;
pub use greater_element::FootnoteDefinition; pub use greater_element::FootnoteDefinition;
pub use greater_element::GreaterBlock; pub use greater_element::GreaterBlock;
pub use greater_element::IndentationLevel;
pub use greater_element::NodeProperty; pub use greater_element::NodeProperty;
pub use greater_element::PlainList; pub use greater_element::PlainList;
pub use greater_element::PlainListItem; pub use greater_element::PlainListItem;