Compare commits

32 Commits

Author SHA1 Message Date
Tom Alexander
3760358783 Compile the elisp ahead of time so it is not done on every docker container launch. 2023-09-20 02:30:58 -04:00
Tom Alexander
024b2ade03 Update org-mode version. 2023-09-16 14:46:02 -04:00
Tom Alexander
55e5c31368 Update org-mode version. 2023-09-08 13:11:28 -04:00
Tom Alexander
4a556bc84f Use read-only root for docker containers. 2023-08-31 21:21:14 -04:00
Tom Alexander
9bf2a912d6 Enable unicode in the docker container. 2023-08-31 20:23:00 -04:00
Tom Alexander
e8f262727d Add a script to build and launch the docker container in one step. 2023-08-31 15:18:25 -04:00
Tom Alexander
b4170dda1f Update org-mode version. 2023-08-25 04:58:45 -04:00
Tom Alexander
bd99fbc4c4 Get the versions of emacs and org-mode and write them to stdout. 2023-08-25 02:30:52 -04:00
Tom Alexander
79c834a1e6 Add the init flag to the docker run command. 2023-08-25 02:10:03 -04:00
Tom Alexander
2505a10275 Parameterize the emacs and org-mode versions in the dockerfile. 2023-08-25 02:02:57 -04:00
Tom Alexander
cfc9153c28 Handle nodes that do not have a contents begin like fixed width areas. 2023-08-20 16:25:57 -04:00
Tom Alexander
13a73efdcf Handle line numbers properly when selected node does not end in a line break. 2023-08-20 15:55:54 -04:00
Tom Alexander
cba1d1e988 Add notes from plain list ownership investigation. 2023-08-19 01:10:42 -04:00
Tom Alexander
e8a89dfeca Remove log statement. 2023-08-18 23:33:48 -04:00
Tom Alexander
367dfaa146 Update org-mode version. 2023-08-18 23:32:44 -04:00
Tom Alexander
c4762510f4 Handle unicode.
Turns out javascript iterates over strings by code point, but string functions like slicing, lastIndexOf, and indexing with [] are all based on UTF-16 code units, so they do not take into account surrogate pairs like the orange heart emoji. It would have been nice if that was mentioned in the documentation...
2023-08-18 23:32:21 -04:00
Tom Alexander
372542d914 Add a print to announce the server is running. 2023-08-18 22:32:01 -04:00
Tom Alexander
0d6621d389 Add docker. 2023-08-18 22:26:42 -04:00
Tom Alexander
e96c39e3e0 Add a README. 2023-08-18 21:35:39 -04:00
Tom Alexander
9032b00e1b Fix handling of plain text. 2023-08-18 21:22:53 -04:00
Tom Alexander
acdc8b8993 Highlighting characters. 2023-08-18 21:06:43 -04:00
Tom Alexander
676dffa15f Rendering ast tree. 2023-08-18 19:23:31 -04:00
Tom Alexander
ab836f2794 Switch to returning the whole tree from rust instead of just the lists. 2023-08-18 19:11:51 -04:00
Tom Alexander
0ee33949e9 Beginning of rendering the ast list. 2023-08-18 18:32:23 -04:00
Tom Alexander
27a2bea705 Split the output so I can have a tree. 2023-08-18 17:40:19 -04:00
Tom Alexander
4fb203c1db Putting in new-line characters in the empty lines has fixed copy+paste and made the min-height css unnecessary. 2023-08-18 17:20:45 -04:00
Tom Alexander
51b4eed034 Beginning to render the parsed content. 2023-08-18 17:10:55 -04:00
Tom Alexander
c3be0f249d Minor style improvements. 2023-08-18 16:26:05 -04:00
Tom Alexander
13fab742e5 Add a sample output code block. 2023-08-18 16:19:37 -04:00
Tom Alexander
893de9a65e Set cache control headers for the static files. 2023-08-18 15:50:22 -04:00
Tom Alexander
bff0a62291 Change response to impl IntoResponse. 2023-08-18 15:41:23 -04:00
Tom Alexander
c24c5ee54e POSTing the body to the server. 2023-08-18 15:41:06 -04:00
17 changed files with 946 additions and 159 deletions
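
Commit c4762510f4 in the list above is about a mismatch between character positions and JavaScript string indices. A minimal Rust sketch of that mismatch follows; the function names are hypothetical and are not part of this change set. It assumes, as the Rust code in this diff does when it counts with `.chars()`, that the positions reported by emacs count whole characters.

```rust
/// Number of Unicode scalar values (code points) in the string.
fn code_point_len(text: &str) -> usize {
    text.chars().count()
}

/// Number of UTF-16 code units, which is what JavaScript's `length`,
/// `slice`, and `[]` indexing count.
fn utf16_len(text: &str) -> usize {
    text.encode_utf16().count()
}

/// Slice by 0-based code point positions, with an exclusive end.
/// (Hypothetical helper, similar in spirit to a unicode-aware slice.)
fn slice_by_code_points(text: &str, start: usize, end: usize) -> String {
    text.chars()
        .skip(start)
        .take(end.saturating_sub(start))
        .collect()
}
```

For the string `"a\u{1F9E1}b"` (an orange heart between two letters), the code point length is 3 while the UTF-16 length is 4, so any index past the emoji is off by one unless the slice is done by code point.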

7
.dockerignore Normal file

@@ -0,0 +1,7 @@
**/.git
target/
docker/
LICENSE
readme/
README.md
notes/

1
Cargo.lock generated

@@ -394,6 +394,7 @@ dependencies = [
"nom",
"serde",
"tokio",
"tower",
"tower-http",
]

View File

@@ -8,4 +8,10 @@ axum = { git = "https://github.com/tokio-rs/axum.git", rev = "52a90390195e884bcc
 nom = "7.1.1"
 serde = { version = "1.0.183", features = ["derive"] }
 tokio = { version = "1.30.0", default-features = false, features = ["macros", "process", "rt", "rt-multi-thread"] }
-tower-http = { version = "0.4.3", features = ["fs"] }
+tower = "0.4.13"
+tower-http = { version = "0.4.3", features = ["fs", "set-header"] }
+[profile.release-lto]
+inherits = "release"
+lto = true
+strip = "symbols"

33
README.md Normal file

@@ -0,0 +1,33 @@
# Org-Mode AST Investigation Tool
This repository contains a slapdash tool to make visualizing the abstract syntax tree of an org-mode document easier. Write your org-mode source into the top text box, and below on the right it will create a clickable tree of the AST. When you click on a node, the contents of that node will be highlighted on the left.
![Screenshot showing the interface to the org-mode abstract syntax tree investigation tool.](readme/screenshot.png?raw=true "Org-mode investigation tool interface")
## Running
Docker is the recommended way to run this tool. It provides a consistent working environment without affecting (or requiring you to install) emacs, org-mode, or rust.
### Docker
First we need to build the docker image. The first build pulls and compiles the emacs and org-mode source code, so it will take a while. Subsequent builds should be fast because docker caches the layers.
```bash
# from the root of this repository:
make --directory=docker
```
Next we need to launch the server:
```bash
docker run --init --rm --publish 3000:3000/tcp --read-only --mount type=tmpfs,destination=/tmp org-investigation
```
This launches a server listening on port 3000, so pop open your browser to http://127.0.0.1:3000/ to access the web interface.
Alternatively, you can run the `scripts/launch_docker.bash` script, which performs both of these steps.
### No docker
You will need a fully functional rust setup with the nightly toolchain installed (the code uses the unstable `exit_status_error` feature). Then, from the root of this repo, you can launch the server by running:
```bash
cargo run --release
```
This uses your installed versions of emacs and org-mode, which may differ from the versions in the docker image.
This launches a server listening on port 3000, so pop open your browser to http://127.0.0.1:3000/ to access the web interface.

44
docker/Dockerfile Normal file

@@ -0,0 +1,44 @@
FROM alpine:3.17 AS build
RUN apk add --no-cache build-base musl-dev git autoconf make texinfo gnutls-dev ncurses-dev gawk libgccjit-dev
FROM build AS build-emacs
ARG EMACS_VERSION=emacs-29.1
RUN git clone --depth 1 --branch $EMACS_VERSION https://git.savannah.gnu.org/git/emacs.git /root/emacs
WORKDIR /root/emacs
RUN mkdir /root/dist
RUN ./autogen.sh
RUN ./configure --prefix /usr --without-x --without-sound --with-native-compilation=aot
RUN make
RUN make DESTDIR="/root/dist" install
FROM build AS build-org-mode
ARG ORG_VERSION=c703541ffcc14965e3567f928de1683a1c1e33f6
COPY --from=build-emacs /root/dist/ /
RUN mkdir /root/dist
# Savannah does not allow fetching specific revisions, so we're going to have to put unnecessary load on their server by cloning main and then checking out the revision we want.
RUN git clone https://git.savannah.gnu.org/git/emacs/org-mode.git /root/org-mode && git -C /root/org-mode checkout $ORG_VERSION
# RUN mkdir /root/org-mode && git -C /root/org-mode init --initial-branch=main && git -C /root/org-mode remote add origin https://git.savannah.gnu.org/git/emacs/org-mode.git && git -C /root/org-mode fetch origin $ORG_VERSION && git -C /root/org-mode checkout FETCH_HEAD
WORKDIR /root/org-mode
RUN make compile
RUN make DESTDIR="/root/dist" install
FROM rustlang/rust:nightly-alpine3.17 AS build-org-investigation
RUN apk add --no-cache musl-dev
RUN mkdir /root/org-investigation
WORKDIR /root/org-investigation
COPY . .
RUN CARGO_TARGET_DIR=/target cargo build --profile release-lto
FROM alpine:3.17 AS run
ENV LANG=en_US.UTF-8
RUN apk add --no-cache ncurses gnutls libgccjit
COPY --from=build-emacs /root/dist/ /
COPY --from=build-org-mode /root/dist/ /
COPY --from=build-org-investigation /target/release-lto/org_ownership_investigation /usr/bin/
COPY static /opt/org-investigation/static
WORKDIR /opt/org-investigation
CMD ["/usr/bin/org_ownership_investigation"]

9
docker/Makefile Normal file

@@ -0,0 +1,9 @@
IMAGE_NAME:=org-investigation
.PHONY: build
build:
docker build -t $(IMAGE_NAME) -f Dockerfile ../
.PHONY: clean
clean:
docker rmi $(IMAGE_NAME)

View File

@@ -0,0 +1,130 @@
* Test 1
** Source
#+begin_src org
1. foo
1. bar
2. baz
2. lorem
ipsum
#+end_src
** Ownership
This table is just showing ownership for the plain list items, not the containing plain list nor the elements inside each item.
| Plain List *Item* | Owns trailing blank lines |
|------------------------+---------------------------|
| foo (includes bar baz) | Yes |
| bar | Yes |
| baz | Yes |
| lorem | No |
** Analysis
In this test case, we see that the only list item that doesn't own its trailing blank lines is "lorem", the final list item of the outer-most list.
* Test 2
We add "cat" as a paragraph at the end of foo which makes "baz" lose its trailing blank lines.
** Source
#+begin_src org
1. foo
1. bar
2. baz
cat
2. lorem
ipsum
#+end_src
** Ownership
| Plain List *Item* | Owns trailing blank lines |
|-------------------------------+---------------------------|
| foo -> cat (includes bar baz) | Yes |
| bar | Yes |
| baz | No |
| lorem | No |
** Analysis
In isolation, this implies that the final plain list item does not own its trailing blank lines, which conflicts with "baz" from test 1.
New theory: List items own their trailing blank lines unless they are both the final list item and not the final element of a list item.
| Plain List *Item* | Owns trailing blank lines | Why |
|-------------------------------+---------------------------+-----------------------------------------------------------|
| foo -> cat (includes bar baz) | Yes | Not the final list item |
| bar | Yes | Not the final list item |
| baz | No | Final item of bar->baz and not the final element of "foo" |
| lorem | No | Final item of foo->lorem and not contained in a list item |
* Test 3
So if that theory is true, taking the entire (foo -> lorem) list from test 1 and nesting it inside a list should coerce "lorem" to own its trailing blank lines since it would then be a final list item (of foo -> lorem) and the final element of the new list.
** Source
#+begin_src org
1. cat
1. foo
1. bar
2. baz
2. lorem
ipsum
#+end_src
** Ownership
| Plain List *Item* | Owns trailing blank lines |
|-----------------------------+---------------------------|
| cat (includes foo -> lorem) | No |
| foo (includes bar baz) | Yes |
| bar | Yes |
| baz | Yes |
| lorem | No |
** Analysis
Against expectations, we did not coerce lorem to consume its trailing blank lines. What is different between "baz" and "lorem"? Well, "baz" is contained within "foo" which has a "lorem" after it, whereas "lorem" is contained within "cat" which does not have any list items after it.
New theory: List items own their trailing blank lines unless they are both the final list item and not the final element of a non-final list item.
| Plain List *Item* | Owns trailing blank lines | Why |
|-----------------------------+---------------------------+------------------------------------------------------|
| cat (includes foo -> lorem) | No | Final list item and not contained in a list item |
| foo (includes bar baz) | Yes | Not the final list item |
| bar | Yes | Not the final list item |
| baz | Yes | Final element of non-final list item |
| lorem | No | Final list item and final element of final list item |
* Test 4
So if that theory is true, then we should be able to coerce lorem to consume its trailing blank lines by adding a second item to the cat list.
** Source
#+begin_src org
1. cat
1. foo
1. bar
2. baz
2. lorem
2. dog
ipsum
#+end_src
** Ownership
| Plain List *Item* | Owns trailing blank lines |
|-----------------------------+---------------------------|
| cat (includes foo -> lorem) | Yes |
| foo (includes bar baz) | Yes |
| bar | Yes |
| baz | Yes |
| lorem | Yes |
| dog | No |
** Analysis
For the first time our expectations were met!
Enduring theory: List items own their trailing blank lines unless they are both the final list item and not the final element of a non-final list item.
| Plain List *Item* | Owns trailing blank lines | Why |
|-----------------------------+---------------------------+--------------------------------------------------|
| cat (includes foo -> lorem) | Yes | Not the final list item |
| foo (includes bar baz) | Yes | Not the final list item |
| bar | Yes | Not the final list item |
| baz | Yes | Final element of non-final list item |
| lorem | Yes | Final element of non-final list item |
| dog | No | Final list item and not contained in a list item |
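
The enduring theory above can be sketched as a small recursive check. This is only an illustration: the `Element` and `Item` types and the function names are hypothetical, and it reads "final element of a non-final list item" as referring to the immediate parent item only, which is an assumption the four tests do not distinguish from other readings.

```rust
/// An element inside a list item: a paragraph or a nested plain list.
enum Element {
    Paragraph,
    List(Vec<Item>),
}

/// A plain list item with a label and its contained elements.
struct Item {
    name: &'static str,
    elements: Vec<Element>,
}

fn item(name: &'static str, elements: Vec<Element>) -> Item {
    Item { name, elements }
}

/// Records (item name, owns trailing blank lines) in document order.
/// `final_element_of_non_final_item` is true when this list is the last
/// element of a parent item that is not the last item of its own list.
fn ownership(
    items: &[Item],
    final_element_of_non_final_item: bool,
    out: &mut Vec<(&'static str, bool)>,
) {
    let last_item = items.len().saturating_sub(1);
    for (index, current) in items.iter().enumerate() {
        let is_final_item = index == last_item;
        // Theory: an item owns its trailing blank lines unless it is the final
        // item and NOT the final element of a non-final list item.
        let owns = !is_final_item || final_element_of_non_final_item;
        out.push((current.name, owns));

        let last_element = current.elements.len().saturating_sub(1);
        for (element_index, element) in current.elements.iter().enumerate() {
            if let Element::List(children) = element {
                let child_flag = !is_final_item && element_index == last_element;
                ownership(children, child_flag, out);
            }
        }
    }
}

/// The list from test 4: cat (foo (bar, baz), lorem), dog.
fn test_4_list() -> Vec<Item> {
    vec![
        item("cat", vec![
            Element::Paragraph,
            Element::List(vec![
                item("foo", vec![
                    Element::Paragraph,
                    Element::List(vec![
                        item("bar", vec![Element::Paragraph]),
                        item("baz", vec![Element::Paragraph]),
                    ]),
                ]),
                item("lorem", vec![Element::Paragraph]),
            ]),
        ]),
        item("dog", vec![Element::Paragraph]),
    ]
}
```

Calling `ownership(&test_4_list(), false, &mut out)` reproduces the Yes/No column of the test 4 table. Building the test 2 list (with a trailing paragraph inside "foo") yields "No" for "baz", matching that table as well.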

BIN
readme/screenshot.png Normal file

Binary file not shown.

12
scripts/launch_docker.bash Executable file

@@ -0,0 +1,12 @@
#!/usr/bin/env bash
#
set -euo pipefail
IFS=$'\n\t'
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
function main {
make --directory "$DIR/../docker"
exec docker run --init --rm --read-only --mount type=tmpfs,destination=/tmp --publish 3000:3000/tcp org-investigation
}
main "${@}"

View File

@@ -1,36 +1,57 @@
#![feature(exit_status_error)]
use axum::http::header::CACHE_CONTROL;
use axum::http::HeaderValue;
use axum::response::IntoResponse;
use axum::{http::StatusCode, routing::post, Json, Router};
use owner_tree::{build_owner_tree, OwnerTree};
use parse::emacs_parse_org_document;
use owner_tree::build_owner_tree;
use parse::{emacs_parse_org_document, get_emacs_version};
use tower::ServiceBuilder;
use tower_http::services::{ServeDir, ServeFile};
use tower_http::set_header::SetResponseHeaderLayer;
use crate::parse::get_org_mode_version;
mod error;
mod owner_tree;
mod parse;
mod rtrim_iterator;
mod sexp;
#[tokio::main]
async fn main() {
let serve_dir = ServeDir::new("static").not_found_service(ServeFile::new("static/index.html"));
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let static_files_service = {
let serve_dir =
ServeDir::new("static").not_found_service(ServeFile::new("static/index.html"));
ServiceBuilder::new()
.layer(SetResponseHeaderLayer::if_not_present(
CACHE_CONTROL,
HeaderValue::from_static("public, max-age=120"),
))
.service(serve_dir)
};
let app = Router::new()
.route("/parse", post(parse_org_mode))
.fallback_service(serve_dir);
.fallback_service(static_files_service);
let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
axum::serve(listener, app).await.unwrap();
let (emacs_version, org_mode_version) =
tokio::join!(get_emacs_version(), get_org_mode_version());
println!("Using emacs version: {}", emacs_version?.trim());
println!("Using org-mode version: {}", org_mode_version?.trim());
let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await?;
println!("Listening on port 3000. Pop open your browser to http://127.0.0.1:3000/ .");
axum::serve(listener, app).await?;
Ok(())
}
async fn parse_org_mode(
body: String,
) -> Result<(StatusCode, Json<OwnerTree>), (StatusCode, String)> {
async fn parse_org_mode(body: String) -> Result<impl IntoResponse, (StatusCode, String)> {
_parse_org_mode(body)
.await
.map_err(|e| (StatusCode::BAD_REQUEST, e.to_string()))
}
async fn _parse_org_mode(
body: String,
) -> Result<(StatusCode, Json<OwnerTree>), Box<dyn std::error::Error>> {
async fn _parse_org_mode(body: String) -> Result<impl IntoResponse, Box<dyn std::error::Error>> {
let ast = emacs_parse_org_document(&body).await?;
let owner_tree = build_owner_tree(body.as_str(), ast.as_str()).map_err(|e| e.to_string())?;
Ok((StatusCode::OK, Json(owner_tree)))

View File

@@ -1,18 +1,22 @@
use serde::Serialize;
use crate::sexp::{sexp_with_padding, Token};
use crate::{
rtrim_iterator::RTrimIterator,
sexp::{sexp_with_padding, Token},
};
pub fn build_owner_tree<'a>(
body: &'a str,
ast_raw: &'a str,
) -> Result<OwnerTree, Box<dyn std::error::Error + 'a>> {
let (_remaining, parsed_sexp) = sexp_with_padding(ast_raw)?;
let lists = find_lists_in_document(body, &parsed_sexp)?;
assert_name(&parsed_sexp, "org-data")?;
let ast_node = build_ast_node(body, None, &parsed_sexp)?;
Ok(OwnerTree {
input: body.to_owned(),
ast: ast_raw.to_owned(),
lists,
tree: ast_node,
})
}
@@ -20,7 +24,14 @@ pub fn build_owner_tree<'a>(
pub struct OwnerTree {
input: String,
ast: String,
lists: Vec<PlainList>,
tree: AstNode,
}
#[derive(Serialize)]
pub struct AstNode {
name: String,
position: SourceRange,
children: Vec<AstNode>,
}
#[derive(Serialize)]
@@ -37,108 +48,79 @@ pub struct PlainListItem {
#[derive(Serialize)]
pub struct SourceRange {
start_line: u32,
end_line: u32, // Exclusive
start_character: u32,
end_character: u32, // Exclusive
start_line: usize,
end_line: usize, // Exclusive
start_character: usize,
end_character: usize, // Exclusive
}
fn find_lists_in_document<'a>(
fn build_ast_node<'a>(
original_source: &str,
parent_contents_begin: Option<usize>,
current_token: &Token<'a>,
) -> Result<Vec<PlainList>, Box<dyn std::error::Error>> {
// DFS looking for top-level lists
let mut found_lists = Vec::new();
let children = current_token.as_list()?;
let token_name = "org-data";
assert_name(current_token, token_name)?;
// skip 2 to skip token name and standard properties
for child_token in children.iter().skip(2) {
found_lists.extend(recurse_token(original_source, child_token)?);
}
Ok(found_lists)
}
fn recurse_token<'a>(
original_source: &str,
current_token: &Token<'a>,
) -> Result<Vec<PlainList>, Box<dyn std::error::Error>> {
match current_token {
Token::Atom(_) | Token::TextWithProperties(_) => Ok(Vec::new()),
Token::List(_) => {
let new_lists = find_lists_in_list(original_source, current_token)?;
Ok(new_lists)
) -> Result<AstNode, Box<dyn std::error::Error>> {
let maybe_plain_text = current_token.as_text();
let ast_node = match maybe_plain_text {
Ok(plain_text) => {
let parent_contents_begin = parent_contents_begin
.ok_or("parent_contents_begin should be set for all plain text nodes.")?;
let mut parameters = plain_text.properties.iter();
let begin = parent_contents_begin
+ maybe_token_to_usize(parameters.next())?
.ok_or("Missing first element past the text.")?;
let end = parent_contents_begin
+ maybe_token_to_usize(parameters.next())?
.ok_or("Missing second element past the text.")?;
let (start_line, end_line) = get_line_numbers(original_source, begin, end)?;
AstNode {
name: "plain-text".to_owned(),
position: SourceRange {
start_line,
end_line,
start_character: begin,
end_character: end,
},
children: Vec::new(),
}
}
Token::Vector(_) => {
let new_lists = find_lists_in_vector(original_source, current_token)?;
Ok(new_lists)
Err(_) => {
// Not plain text, so it must be a list
let parameters = current_token.as_list()?;
let name = parameters
.first()
.ok_or("Should have at least one child.")?
.as_atom()?;
let position = get_bounds(original_source, current_token)?;
let mut children = Vec::new();
let original_contents_begin = get_contents_begin(current_token);
match original_contents_begin {
Ok(original_contents_begin) => {
let mut contents_begin = original_contents_begin;
for child in parameters.into_iter().skip(2) {
let new_ast_node =
build_ast_node(original_source, Some(contents_begin), child)?;
contents_begin = new_ast_node.position.end_character;
children.push(new_ast_node);
}
}
Err(_) => {
// Some nodes don't have a contents begin, so hopefully plain text can't be inside them.
for child in parameters.into_iter().skip(2) {
let new_ast_node = build_ast_node(original_source, None, child)?;
children.push(new_ast_node);
}
}
};
AstNode {
name: name.to_owned(),
position,
children,
}
}
}
}
};
fn find_lists_in_list<'a>(
original_source: &str,
current_token: &Token<'a>,
) -> Result<Vec<PlainList>, Box<dyn std::error::Error>> {
let mut found_lists = Vec::new();
let children = current_token.as_list()?;
if assert_name(current_token, "plain-list").is_ok() {
// Found a list!
let mut found_items = Vec::new();
// skip 2 to skip token name and standard properties
for child_token in children.iter().skip(2) {
found_items.push(get_item_in_list(original_source, child_token)?);
}
found_lists.push(PlainList {
position: get_bounds(original_source, current_token)?,
items: found_items,
});
} else {
// skip 2 to skip token name and standard properties
for child_token in children.iter().skip(2) {
found_lists.extend(recurse_token(original_source, child_token)?);
}
}
Ok(found_lists)
}
fn find_lists_in_vector<'a>(
original_source: &str,
current_token: &Token<'a>,
) -> Result<Vec<PlainList>, Box<dyn std::error::Error>> {
let mut found_lists = Vec::new();
let children = current_token.as_vector()?;
for child_token in children.iter() {
found_lists.extend(recurse_token(original_source, child_token)?);
}
Ok(found_lists)
}
fn get_item_in_list<'a>(
original_source: &str,
current_token: &Token<'a>,
) -> Result<PlainListItem, Box<dyn std::error::Error>> {
let mut found_lists = Vec::new();
let children = current_token.as_list()?;
let token_name = "item";
assert_name(current_token, token_name)?;
// skip 2 to skip token name and standard properties
for child_token in children.iter().skip(2) {
found_lists.extend(recurse_token(original_source, child_token)?);
}
Ok(PlainListItem {
position: get_bounds(original_source, current_token)?,
lists: found_lists,
})
Ok(ast_node)
}
fn assert_name<'s>(emacs: &'s Token<'s>, name: &str) -> Result<(), Box<dyn std::error::Error>> {
@@ -161,39 +143,35 @@ fn get_bounds<'s>(
original_source: &'s str,
emacs: &'s Token<'s>,
) -> Result<SourceRange, Box<dyn std::error::Error>> {
let children = emacs.as_list()?;
let attributes_child = children
.iter()
.nth(1)
.ok_or("Should have an attributes child.")?;
let attributes_map = attributes_child.as_map()?;
let standard_properties = attributes_map.get(":standard-properties");
let (begin, end) = if standard_properties.is_some() {
let std_props = standard_properties
.expect("if statement proves its Some")
.as_vector()?;
let begin = std_props
.get(0)
.ok_or("Missing first element in standard properties")?
.as_atom()?;
let end = std_props
.get(1)
.ok_or("Missing first element in standard properties")?
.as_atom()?;
(begin, end)
} else {
let begin = attributes_map
.get(":begin")
.ok_or("Missing :begin attribute.")?
.as_atom()?;
let end = attributes_map
.get(":end")
.ok_or("Missing :end attribute.")?
.as_atom()?;
(begin, end)
};
let begin = begin.parse::<u32>()?;
let end = end.parse::<u32>()?;
let standard_properties = get_standard_properties(emacs)?;
let (begin, end) = (
standard_properties
.begin
.ok_or("Token should have a begin.")?,
standard_properties.end.ok_or("Token should have an end.")?,
);
let (start_line, end_line) = get_line_numbers(original_source, begin, end)?;
Ok(SourceRange {
start_line,
end_line,
start_character: begin,
end_character: end,
})
}
fn get_contents_begin<'s>(emacs: &'s Token<'s>) -> Result<usize, Box<dyn std::error::Error>> {
let standard_properties = get_standard_properties(emacs)?;
Ok(standard_properties
.contents_begin
.ok_or("Token should have a contents-begin.")?)
}
fn get_line_numbers<'s>(
original_source: &'s str,
begin: usize,
end: usize,
) -> Result<(usize, usize), Box<dyn std::error::Error>> {
// This is used for highlighting which lines contain text relevant to the token, so even if a token does not extend all the way to the end of the line, the end_line figure will be the following line number (since the range is exclusive, not inclusive).
let start_line = original_source
.chars()
.into_iter()
@@ -201,17 +179,96 @@ fn get_bounds<'s>(
.filter(|x| *x == '\n')
.count()
+ 1;
let end_line = original_source
.chars()
.into_iter()
.take(usize::try_from(end)? - 1)
.filter(|x| *x == '\n')
.count()
+ 1;
Ok(SourceRange {
start_line: u32::try_from(start_line)?,
end_line: u32::try_from(end_line)?,
start_character: begin,
end_character: end,
let end_line = {
let content_up_to_and_including_token = original_source
.chars()
.into_iter()
.take(usize::try_from(end)? - 1);
// Remove the trailing newline (if there is one) because we're going to add an extra line regardless of whether or not this ends with a new line.
let without_trailing_newline = RTrimIterator::new(content_up_to_and_including_token, '\n');
without_trailing_newline.filter(|x| *x == '\n').count() + 2
};
Ok((usize::try_from(start_line)?, usize::try_from(end_line)?))
}
struct StandardProperties {
begin: Option<usize>,
#[allow(dead_code)]
post_affiliated: Option<usize>,
#[allow(dead_code)]
contents_begin: Option<usize>,
#[allow(dead_code)]
contents_end: Option<usize>,
end: Option<usize>,
#[allow(dead_code)]
post_blank: Option<usize>,
}
fn get_standard_properties<'s>(
emacs: &'s Token<'s>,
) -> Result<StandardProperties, Box<dyn std::error::Error>> {
let children = emacs.as_list()?;
let attributes_child = children
.iter()
.nth(1)
.ok_or("Should have an attributes child.")?;
let attributes_map = attributes_child.as_map()?;
let standard_properties = attributes_map.get(":standard-properties");
Ok(if standard_properties.is_some() {
let mut std_props = standard_properties
.expect("if statement proves it's Some")
.as_vector()?
.into_iter();
let begin = maybe_token_to_usize(std_props.next())?;
let post_affiliated = maybe_token_to_usize(std_props.next())?;
let contents_begin = maybe_token_to_usize(std_props.next())?;
let contents_end = maybe_token_to_usize(std_props.next())?;
let end = maybe_token_to_usize(std_props.next())?;
let post_blank = maybe_token_to_usize(std_props.next())?;
StandardProperties {
begin,
post_affiliated,
contents_begin,
contents_end,
end,
post_blank,
}
} else {
let begin = maybe_token_to_usize(attributes_map.get(":begin").map(|token| *token))?;
let end = maybe_token_to_usize(attributes_map.get(":end").map(|token| *token))?;
let contents_begin =
maybe_token_to_usize(attributes_map.get(":contents-begin").map(|token| *token))?;
let contents_end =
maybe_token_to_usize(attributes_map.get(":contents-end").map(|token| *token))?;
let post_blank =
maybe_token_to_usize(attributes_map.get(":post-blank").map(|token| *token))?;
let post_affiliated =
maybe_token_to_usize(attributes_map.get(":post-affiliated").map(|token| *token))?;
StandardProperties {
begin,
post_affiliated,
contents_begin,
contents_end,
end,
post_blank,
}
})
}
fn maybe_token_to_usize(
token: Option<&Token<'_>>,
) -> Result<Option<usize>, Box<dyn std::error::Error>> {
Ok(token
.map(|token| token.as_atom())
.map_or(Ok(None), |r| r.map(Some))?
.map(|val| {
if val == "nil" {
None
} else {
Some(val.parse::<usize>())
}
})
.flatten() // Outer option is whether or not the param exists, inner option is whether or not it is nil
.map_or(Ok(None), |r| r.map(Some))?)
}
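
The nil handling in `maybe_token_to_usize` above boils down to turning an optional atom into an optional number, where both a missing atom and the atom `nil` mean "no value". A simplified sketch on plain string slices is shown here; `parse_optional_position` is a hypothetical name, and it works on `&str` rather than on the `Token` type from this diff. It uses `Option::filter` and `Option::transpose` from the standard library in place of the `map_or` chain.

```rust
use std::num::ParseIntError;

/// Converts an optional atom into an optional position.
/// `None` and `Some("nil")` both become `Ok(None)`; anything else must parse
/// as an unsigned integer or the parse error is returned.
fn parse_optional_position(atom: Option<&str>) -> Result<Option<usize>, ParseIntError> {
    atom.filter(|value| *value != "nil")
        .map(|value| value.parse::<usize>())
        // Option<Result<T, E>> -> Result<Option<T>, E>
        .transpose()
}
```

For example, `parse_optional_position(Some("42"))` gives `Ok(Some(42))`, while `Some("nil")` and `None` both give `Ok(None)`.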

View File

@@ -51,3 +51,40 @@ where
}
output
}
pub async fn get_emacs_version() -> Result<String, Box<dyn std::error::Error>> {
let elisp_script = r#"(progn
(message "%s" (version))
)"#;
let mut cmd = Command::new("emacs");
let proc = cmd
.arg("-q")
.arg("--no-site-file")
.arg("--no-splash")
.arg("--batch")
.arg("--eval")
.arg(elisp_script);
let out = proc.output().await?;
out.status.exit_ok()?;
Ok(String::from_utf8(out.stderr)?)
}
pub async fn get_org_mode_version() -> Result<String, Box<dyn std::error::Error>> {
let elisp_script = r#"(progn
(org-mode)
(message "%s" (org-version nil t nil))
)"#;
let mut cmd = Command::new("emacs");
let proc = cmd
.arg("-q")
.arg("--no-site-file")
.arg("--no-splash")
.arg("--batch")
.arg("--eval")
.arg(elisp_script);
let out = proc.output().await?;
out.status.exit_ok()?;
Ok(String::from_utf8(out.stderr)?)
}
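
The two version functions above differ only in the elisp they evaluate, and both read the result from stderr, which is where `message` output ends up under `--batch`. A sketch of a shared helper follows. It is not what the diff does: it swaps in the blocking `std::process::Command` for the async command so that it stays self-contained, and it checks `status.success()` instead of the nightly-only `exit_ok()`. The names `emacs_batch_command` and `run_elisp` are hypothetical.

```rust
use std::process::Command;

/// Builds the emacs invocation used to evaluate a snippet of elisp in batch mode.
fn emacs_batch_command(elisp: &str) -> Command {
    let mut cmd = Command::new("emacs");
    cmd.arg("-q")
        .arg("--no-site-file")
        .arg("--no-splash")
        .arg("--batch")
        .arg("--eval")
        .arg(elisp);
    cmd
}

/// Runs the elisp and returns whatever emacs wrote to stderr.
fn run_elisp(elisp: &str) -> Result<String, Box<dyn std::error::Error>> {
    let out = emacs_batch_command(elisp).output()?;
    if !out.status.success() {
        // Stable alternative to the unstable `exit_ok()`.
        return Err(format!("emacs exited with {}", out.status).into());
    }
    Ok(String::from_utf8(out.stderr)?)
}
```

With such a helper, each version lookup would reduce to one `run_elisp` call with its own elisp string. Whether dropping the nightly requirement is worth the change is a separate decision.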

86
src/rtrim_iterator.rs Normal file

@@ -0,0 +1,86 @@
/// Removes 1 character from the end of an iterator if it matches needle
pub struct RTrimIterator<I> {
iter: I,
needle: char,
buffer: Option<char>,
}
impl<I> Iterator for RTrimIterator<I>
where
I: Iterator<Item = char>,
{
type Item = char;
fn next(&mut self) -> Option<I::Item> {
loop {
match (self.buffer, self.iter.next()) {
(None, None) => {
// We reached the end of the list and have an empty buffer, meaning the string did not end with the needle character.
return None;
}
(None, Some(chr)) if chr == self.needle => {
// We came across an instance of needle, buffer it and loop again because we do not know if this is the end of the string.
self.buffer = Some(chr);
}
(None, Some(chr)) => {
// We have an empty buffer and the next character is not the needle character, return it immediately.
return Some(chr);
}
(Some(buf), None) if buf == self.needle => {
// We reached the end of the list and have the specified needle in the buffer where it will stay forever.
return None;
}
(Some(_), None) => {
// We reached the end of the list and the buffered character is not the needle character, so write it out.
return self.buffer.take();
}
(Some(_), Some(chr)) => {
// We have a buffered character, but it is not the end of the string, so regardless of its contents we can write it out.
return self.buffer.replace(chr);
}
};
}
}
}
impl<I> RTrimIterator<I> {
pub fn new(iter: I, needle: char) -> RTrimIterator<I> {
RTrimIterator {
iter,
needle,
buffer: None,
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn no_match() {
let input = "abcd";
let output: String = RTrimIterator::new(input.chars(), '\n').collect();
assert_eq!(output, input);
}
#[test]
fn middle_match() {
let input = "ab\ncd";
let output: String = RTrimIterator::new(input.chars(), '\n').collect();
assert_eq!(output, input);
}
#[test]
fn end_match() {
let input = "abcd\n";
let output: String = RTrimIterator::new(input.chars(), '\n').collect();
assert_eq!(output, "abcd");
}
#[test]
fn double_match() {
let input = "abcd\n\n";
let output: String = RTrimIterator::new(input.chars(), '\n').collect();
assert_eq!(output, "abcd\n");
}
}
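
`RTrimIterator` exists to support the end-line calculation in `get_line_numbers`: one trailing newline is dropped before counting, so a node gets the same exclusive end line whether or not it ends in a line break. A simplified, self-contained sketch of that calculation follows. It buffers into a `Vec<char>` instead of streaming, and it assumes 1-based character positions with an exclusive end, as the surrounding code suggests; `line_numbers` is a hypothetical name.

```rust
/// Returns (start_line, end_line) for a node spanning [begin, end),
/// where positions are 1-based character offsets and end_line is exclusive.
fn line_numbers(source: &str, begin: usize, end: usize) -> (usize, usize) {
    // Count the newlines that occur before the node starts.
    let start_line = source
        .chars()
        .take(begin.saturating_sub(1))
        .filter(|c| *c == '\n')
        .count()
        + 1;

    // Everything up to and including the node.
    let mut up_to_end: Vec<char> = source.chars().take(end.saturating_sub(1)).collect();
    // Drop a single trailing newline, mirroring RTrimIterator.
    if up_to_end.last() == Some(&'\n') {
        up_to_end.pop();
    }
    // +1 for 1-based line numbers, +1 because the end line is exclusive.
    let end_line = up_to_end.iter().filter(|c| **c == '\n').count() + 2;

    (start_line, end_line)
}
```

For the source `"foo\nbar\nbaz\n"`, the node `bar\n` (positions 5 to 9) and the node `bar` (positions 5 to 8) both map to lines 2 to 3.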

View File

@@ -1,5 +1,21 @@
<!doctype html>
<html>
<head>
<link rel="stylesheet" href="reset.css">
<link rel="stylesheet" href="style.css">
<script type="text/javascript" src="script.js" defer></script>
</head>
<body>
Test html file.
<h2>Input org-mode source:</h2>
<textarea id="org-input" rows="24" cols="80"></textarea>
<hr/>
<div class="output_container">
<div>
<div id="parse-output" class="code_block" style="counter-set: code_line_number 0;"></div>
</div>
<div>
<div id="ast-tree" class="ast_tree"></div>
</div>
</div>
</body>
</html>

48
static/reset.css Normal file

@@ -0,0 +1,48 @@
/* http://meyerweb.com/eric/tools/css/reset/
v2.0 | 20110126
License: none (public domain)
*/
html, body, div, span, applet, object, iframe,
h1, h2, h3, h4, h5, h6, p, blockquote, pre,
a, abbr, acronym, address, big, cite, code,
del, dfn, em, img, ins, kbd, q, s, samp,
small, strike, strong, sub, sup, tt, var,
b, u, i, center,
dl, dt, dd, ol, ul, li,
fieldset, form, label, legend,
table, caption, tbody, tfoot, thead, tr, th, td,
article, aside, canvas, details, embed,
figure, figcaption, footer, header, hgroup,
menu, nav, output, ruby, section, summary,
time, mark, audio, video {
margin: 0;
padding: 0;
border: 0;
font-size: 100%;
font: inherit;
vertical-align: baseline;
}
/* HTML5 display-role reset for older browsers */
article, aside, details, figcaption, figure,
footer, header, hgroup, menu, nav, section {
display: block;
}
body {
line-height: 1;
}
ol, ul {
list-style: none;
}
blockquote, q {
quotes: none;
}
blockquote:before, blockquote:after,
q:before, q:after {
content: '';
content: none;
}
table {
border-collapse: collapse;
border-spacing: 0;
}

194
static/script.js Normal file

@@ -0,0 +1,194 @@
let inFlightRequest = null;
const inputElement = document.querySelector("#org-input");
const outputElement = document.querySelector("#parse-output");
const astTreeElement = document.querySelector("#ast-tree");
function abortableFetch(request, options) {
const controller = new AbortController();
const signal = controller.signal;
return {
abort: () => controller.abort(),
ready: fetch(request, { ...options, signal })
};
}
function clearOutput() {
clearActiveAstNode();
outputElement.innerHTML = "";
astTreeElement.innerHTML = "";
}
function renderParseResponse(response) {
clearOutput();
renderSourceBox(response);
renderAstTree(response);
}
function renderSourceBox(response) {
const lines = response.input.split(/\r?\n/);
const numLines = lines.length;
const numDigits = Math.floor(Math.log10(numLines)) + 1;
outputElement.style.paddingLeft = `calc(${numDigits + 1}ch + 10px)`;
for (let line of lines) {
let wrappedLine = document.createElement("code");
if (line !== "" && line !== null) {
for (let chr of line) {
// Wrap every character (code point) in its own span so highlightCharacters can target it individually.
let wrappedCharacter = document.createElement("span");
wrappedCharacter.textContent = chr;
wrappedLine.appendChild(wrappedCharacter);
}
} else {
let wrappedCharacter = document.createElement("span");
wrappedCharacter.textContent = "\n";
wrappedLine.appendChild(wrappedCharacter);
}
outputElement.appendChild(wrappedLine);
}
}
function renderAstTree(response) {
renderAstNode(response.input, 0, response.tree);
}
function renderAstNode(originalSource, depth, astNode) {
const nodeElem = document.createElement("div");
nodeElem.classList.add("ast_node");
let sourceForNode = unicodeAwareSlice(originalSource, astNode.position.start_character - 1, astNode.position.end_character - 1);
// Since sourceForNode is a string, JSON.stringify escapes it with backslashes and wraps it in quotation marks, so the text always ends up on a single line. Coincidentally, this is the behavior we want.
let escapedSource = JSON.stringify(sourceForNode);
nodeElem.innerText = `${astNode.name}: ${escapedSource}`;
nodeElem.style.marginLeft = `${depth * 20}px`;
nodeElem.dataset.startLine = astNode.position.start_line;
nodeElem.dataset.endLine = astNode.position.end_line;
nodeElem.dataset.startCharacter = astNode.position.start_character;
nodeElem.dataset.endCharacter = astNode.position.end_character;
nodeElem.addEventListener("click", () => {
setActiveAstNode(nodeElem, originalSource);
});
astTreeElement.appendChild(nodeElem);
for (let child of astNode.children) {
renderAstNode(originalSource, depth + 1, child);
}
}
function clearActiveAstNode() {
for (let elem of document.querySelectorAll("#ast-tree .ast_node.highlighted")) {
elem.classList.remove("highlighted");
}
for (let elem of document.querySelectorAll("#parse-output > code.highlighted")) {
elem.classList.remove("highlighted");
}
for (let elem of document.querySelectorAll("#parse-output > code > span")) {
elem.classList.remove("highlighted");
}
}
function setActiveAstNode(elem, originalSource) {
clearActiveAstNode();
elem.classList.add("highlighted");
let startLine = parseInt(elem.dataset.startLine, 10);
let endLine = parseInt(elem.dataset.endLine, 10);
let startCharacter = parseInt(elem.dataset.startCharacter, 10);
let endCharacter = parseInt(elem.dataset.endCharacter, 10);
for (let line = startLine; line < endLine; ++line) {
highlightLine("parse-output", line - 1);
}
highlightCharacters("parse-output", originalSource, startCharacter, endCharacter);
}
inputElement.addEventListener("input", async () => {
let orgSource = inputElement.value;
if (inFlightRequest != null) {
inFlightRequest.abort();
inFlightRequest = null;
}
clearOutput();
let newRequest = abortableFetch("/parse", {
method: "POST",
cache: "no-cache",
body: orgSource,
});
inFlightRequest = newRequest;
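try {
const response = await newRequest.ready;
renderParseResponse(await response.json());
}
catch (err) {
// A newer input event aborted this request (possibly while the body was still being read), so there is nothing to render.
if (err.name === "AbortError") return;
// Surface every other failure instead of silently continuing without a response.
throw err;
}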
});
function highlightLine(htmlName, lineOffset) {
const childOffset = lineOffset + 1;
const codeLineElement = document.querySelector(`#${htmlName} > code:nth-child(${childOffset})`);
codeLineElement?.classList.add("highlighted");
}
function highlightCharacters(htmlName, originalSource, startCharacter, endCharacter) {
let sourceBefore = unicodeAwareSlice(originalSource, 0, startCharacter - 1);
let precedingLineBreak = unicodeAwareLastIndexOfCharacter(sourceBefore, "\n");
let characterIndexOnLine = precedingLineBreak !== -1 ? startCharacter - precedingLineBreak - 1 : startCharacter;
let lineNumber = (sourceBefore.match(/\r?\n/g) || []).length + 1;
for (let characterIndex = startCharacter; characterIndex < endCharacter; ++characterIndex) {
document.querySelector(`#${htmlName} > code:nth-child(${lineNumber}) > span:nth-child(${characterIndexOnLine})`)?.classList.add("highlighted");
if (unicodeAwareCharAtOffset(originalSource, characterIndex - 1) == "\n") {
++lineNumber;
characterIndexOnLine = 1;
} else {
++characterIndexOnLine;
}
}
}
function unicodeAwareSlice(text, start, end) {
// String.prototype.slice counts UTF-16 code units, so iterate by code point to keep surrogate pairs intact.
let i = 0;
let output = "";
for (const chr of text) {
if (i >= end) {
break;
}
if (i >= start) {
output += chr;
}
++i;
}
return output;
}
function unicodeAwareLastIndexOfCharacter(haystack, needle) {
// Returns the code point index of the last occurrence of needle, or -1 if it is absent.
let i = 0;
let found = -1;
for (const chr of haystack) {
if (chr == needle) {
found = i;
}
++i;
}
return found;
}
function unicodeAwareCharAtOffset(text, offset) {
// Returns the code point at the given code point offset, or undefined if the offset is out of range.
let i = offset;
for (const chr of text) {
if (i == 0) {
return chr;
}
--i;
}
}
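
The unicodeAware helpers above exist because JavaScript's built-in string indexing, length, and slice operate on UTF-16 code units, while for...of and Array.from iterate by code point. The following is a minimal standalone sketch of that difference, not part of the original file; the sample string and the codePointSlice helper are hypothetical names chosen for illustration.

```javascript
// Hypothetical demo: "a", an orange heart (U+1F9E1), then "b".
const sample = "a\u{1F9E1}b";

// length counts UTF-16 code units, so the heart takes two positions.
const codeUnitLength = sample.length;
// Array.from iterates by code point, so the heart counts once.
const codePointLength = Array.from(sample).length;

// slice(1, 2) cuts the surrogate pair in half and yields a lone surrogate.
const brokenSlice = sample.slice(1, 2);

// Illustrative helper: slice by code point instead of by code unit.
function codePointSlice(text, start, end) {
  return Array.from(text).slice(start, end).join("");
}
const safeSlice = codePointSlice(sample, 1, 2);

console.log(codeUnitLength, codePointLength, safeSlice === "\u{1F9E1}");
```

Array.from allocates an array of the whole string on every call, so for large inputs a single for...of pass like the one in unicodeAwareSlice may be the cheaper choice.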

86
static/style.css Normal file

@@ -0,0 +1,86 @@
h1, h2, h3, h4, h5, h6, h7 {
font-weight: 700;
}
h1 {
font-size: 28px;
}
h2 {
font-size: 24px;
}
h3 {
font-size: 22px;
}
h4 {
font-size: 20px;
}
h5 {
font-size: 18px;
}
h6 {
font-size: 18px;
}
h7 {
font-size: 18px;
}
.code_block {
font: 14px/1.4 "Cascadia Mono", monospace;
background: #272822ff;
color: #f8f8f2ff;
display: table;
white-space: break-spaces;
padding: 5px;
}
.code_block > code {
display: table;
counter-increment: code_line_number;
}
.code_block > code::before {
content: counter(code_line_number) " ";
display: inline-block;
position: absolute;
transform: translateX(-100%);
padding-right: 5px;
color: #eeeeee;
}
.code_block > code.highlighted {
/* We aren't using this because we are going to highlight individual characters, but we still need to set the highlighted class on the code elem so the line numbers on the left get highlighted to make empty lines more obvious. */
/* background: #307351ff; */
}
.code_block > code.highlighted::before {
background: #307351ff;
}
.code_block > code > span.highlighted {
background: #307351ff;
}
.output_container {
display: flex;
flex-direction: row;
}
.output_container > * {
flex: 1 0 0;
}
.ast_tree {
padding: 5px;
}
.ast_node {
cursor: pointer;
background: #eeeeee;
margin-bottom: 5px;
border: 1px solid #000000;
padding: 2px;
}
.ast_node.highlighted {
background: #307351ff;
color: #ffffff;
}