organic/notes/optimization_ideas.org
2023-09-14 03:25:12 -04:00

Analysis

Parse start per character

It might help analysis to record how often we start each type of parse at each character position. For example, with a count of how often a plain list parse started at the first character of a plain list, we could see how often that list is being re-parsed.
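A minimal sketch of such a counter, assuming parses can be labeled with a kind and a character offset. The names ParseKind, ParseStartCounter, record, count, and reparsed are hypothetical and only illustrate the idea.

```rust
use std::collections::HashMap;

/// Hypothetical label for the kind of parse being attempted.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ParseKind {
    PlainList,
    Paragraph,
}

/// Counts how many times each kind of parse was started at each offset.
#[derive(Debug, Default)]
pub struct ParseStartCounter {
    counts: HashMap<(ParseKind, usize), usize>,
}

impl ParseStartCounter {
    /// Record one parse attempt of `kind` starting at character `offset`.
    pub fn record(&mut self, kind: ParseKind, offset: usize) {
        *self.counts.entry((kind, offset)).or_insert(0) += 1;
    }

    /// Number of times a parse of `kind` was started at `offset`.
    pub fn count(&self, kind: ParseKind, offset: usize) -> usize {
        self.counts.get(&(kind, offset)).copied().unwrap_or(0)
    }

    /// (offset, count) pairs where `kind` was started more than once,
    /// sorted by offset. These are the re-parse candidates.
    pub fn reparsed(&self, kind: ParseKind) -> Vec<(usize, usize)> {
        let mut hits = Vec::new();
        for (&(k, offset), &n) in &self.counts {
            if k == kind && n > 1 {
                hits.push((offset, n));
            }
        }
        hits.sort();
        hits
    }
}
```

Calling record at the top of each element parser and dumping reparsed(ParseKind::PlainList) after a run would show which list starts are parsed repeatedly.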

Optimizations

Edit whitespace for list items

Whether a list item owns its trailing whitespace depends on whether it is the last item in its list. Since we do not know ahead of time whether an item is the last one, we have to either re-parse the list item or modify it after parsing.

For

We already modify the source of some elements after the fact with set_source(), so this would be more of the same.

Against

I'd like to phase out such modifications because they seem hacky and fragile.
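A rough sketch of the modify-after-parsing option, for comparison. It assumes every item is first parsed as if it owned its trailing blank lines, and that the final item is the one that gives them up once the list is complete. ListItem, release_trailing_whitespace, and trim_trailing_blank_lines are hypothetical names, not the existing set_source() code.

```rust
/// Hypothetical parsed list item: a slice of the original input.
#[derive(Debug, PartialEq)]
pub struct ListItem<'s> {
    pub source: &'s str,
}

/// Assumption: each item was parsed as if it owned its trailing blank lines.
/// Once the list is complete, shrink the final item's source and return the
/// blank lines it gave up, so the enclosing list can claim them.
pub fn release_trailing_whitespace<'s>(items: &mut [ListItem<'s>]) -> &'s str {
    match items.last_mut() {
        Some(last) => {
            let full: &'s str = last.source;
            let kept = trim_trailing_blank_lines(full);
            last.source = kept;
            &full[kept.len()..]
        }
        None => "",
    }
}

/// Keep everything up to and including the newline that ends the last
/// non-blank line; drop the blank lines after it.
fn trim_trailing_blank_lines(source: &str) -> &str {
    let content_end = source.trim_end().len();
    match source[content_end..].find('\n') {
        Some(newline) => &source[..content_end + newline + 1],
        None => source,
    }
}
```

The fix-up is a single slice adjustment on the last item, which is cheap, but it is still a mutation after the fact, which is the concern raised above.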

Make detect element function

Some exit matchers are based on when the next element is found. Some elements do not need to be fully parsed to identify that they are a valid element. For example, "1. foo" can already be identified as the start of a plain list (in the right context) without needing to parse the entire element.

For

Avoiding parsing the entire element for an exit matcher would reduce redundant parses.

Against

This adds code complexity and introduces the potential for bugs.

How many elements can reasonably be early-detected? For example, "#+begin_src foo" is not enough to detect the start of a source block, because without the "#+end_src" it is just plain text.
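A sketch of what a detect function for plain lists might look like. It is deliberately simplified: it accepts only "-" and "+" bullets and numeric counters followed by "." or ")", and it does not check the surrounding context. The name detect_plain_list_start is hypothetical.

```rust
/// Hypothetical cheap check: could the first line of `input` begin a plain
/// list item? Simplified on purpose: other bullet forms (such as "*" bullets
/// and alphabetical counters) and all context checks are left out.
pub fn detect_plain_list_start(input: &str) -> bool {
    let line = input.lines().next().unwrap_or("");
    let line = line.trim_start_matches(|c: char| c == ' ' || c == '\t');

    let after_bullet = if let Some(rest) =
        line.strip_prefix('-').or_else(|| line.strip_prefix('+'))
    {
        rest
    } else {
        // Numeric counter: one or more digits, then "." or ")".
        let digits = line.bytes().take_while(u8::is_ascii_digit).count();
        if digits == 0 {
            return false;
        }
        let tail = &line[digits..];
        match tail.strip_prefix('.').or_else(|| tail.strip_prefix(')')) {
            Some(rest) => rest,
            None => return false,
        }
    };

    // The bullet must be followed by whitespace or the end of the line.
    after_bullet.is_empty() || after_bullet.starts_with(' ') || after_bullet.starts_with('\t')
}
```

An exit matcher could call a check like this instead of running the full plain list parser. An element such as a source block has no equivalent one-line check, since its validity depends on a closing line that may be far away.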

Grab multiple characters in plaintext parser before checking exit matcher

Currently we check the exit matcher after each character inside the plain text parser (and many others). Are there character sequences within which we can assume no exit matcher will trigger? For example, a contiguous run of Latin-alphabet letters?

For

This could significantly reduce our calls to exit matchers.

Against

I think targets would break this.

The exit matchers are already implicitly building this behavior since they should all exit very early when the starting character is wrong. Putting this logic in a centralized place, far away from where those characters are actually going to be used, is unfortunate for readability.
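To make the trade-off concrete, here is a sketch of a plain text loop that takes a whole run of ASCII letters between exit matcher checks. The function plain_text and its closure-based exit matcher are hypothetical, and the sketch simply assumes that no exit matcher can trigger inside a letter run, which is exactly the assumption questioned above.

```rust
/// Hypothetical plain-text scanner. `exit_matcher` returns true when the
/// remaining input it is given should end the plain text. Returns the
/// consumed text and the number of exit matcher calls that were made.
pub fn plain_text<'s>(
    input: &'s str,
    mut exit_matcher: impl FnMut(&str) -> bool,
) -> (&'s str, usize) {
    let mut offset = 0;
    let mut calls = 0;
    while offset < input.len() {
        calls += 1;
        if exit_matcher(&input[offset..]) {
            break;
        }
        // Assumption: no exit matcher triggers inside a run of ASCII
        // letters, so consume the whole run with a single check.
        let run = input[offset..]
            .bytes()
            .take_while(u8::is_ascii_alphabetic)
            .count();
        if run > 0 {
            offset += run;
        } else {
            // Any other character is consumed one at a time.
            offset += input[offset..].chars().next().map_or(1, char::len_utf8);
        }
    }
    (&input[..offset], calls)
}
```

For the input "foo bar *baz*" with an exit matcher that fires on "*", this makes 5 exit matcher calls where a per-character loop would make 9.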

Use exit matcher to cut off trailing whitespace instead of re-matching in plain lists.