mirror of https://git.savannah.gnu.org/git/emacs.git synced 2024-11-23 07:19:15 +00:00

(Parsing Expressions): Split up node.

(Motion via Parsing, Position Parse, Parser State)
(Low-Level Parsing, Control Parsing): New subnodes.
(Parser State): Document syntax-ppss-toplevel-pos.
Richard M. Stallman 2006-12-17 22:02:52 +00:00
parent 07af30248a
commit fe963f844b
2 changed files with 189 additions and 139 deletions


@ -1,5 +1,12 @@
2006-12-17 Richard Stallman <rms@gnu.org>
* syntax.texi (Parsing Expressions): Split up node.
(Motion via Parsing, Position Parse, Parser State)
(Low-Level Parsing, Control Parsing): New subnodes.
(Parser State): Document syntax-ppss-toplevel-pos.
* positions.texi (List Motion): Punctuation fix.
* files.texi (File Name Completion): Document PREDICATE arg
to file-name-completion.


@ -597,26 +597,26 @@ expression prefix syntax class, and characters with the @samp{p} flag.
@end defun
@node Parsing Expressions
@section Parsing Balanced Expressions
@section Parsing Expressions
Here are several functions for parsing and scanning balanced
This section describes functions for parsing and scanning balanced
expressions, also known as @dfn{sexps}. Basically, a sexp is either a
balanced parenthetical grouping, or a symbol name (a sequence of
characters whose syntax is either word constituent or symbol
constituent). However, characters whose syntax is expression prefix
are treated as part of the sexp if they appear next to it.
balanced parenthetical grouping, a string, or a symbol name (a
sequence of characters whose syntax is either word constituent or
symbol constituent). However, characters whose syntax is expression
prefix are treated as part of the sexp if they appear next to it.
The syntax table controls the interpretation of characters, so these
functions can be used for Lisp expressions when in Lisp mode and for C
expressions when in C mode. @xref{List Motion}, for convenient
higher-level functions for moving over balanced expressions.
A syntax table only describes how each character changes the state
of the parser, rather than describing the state itself. For example,
a string delimiter character toggles the parser state between
``in-string'' and ``in-code'' but the characters inside the string do
not have any particular syntax to identify them as such. For example
(note that 15 is the syntax code for generic string delimiters),
A character's syntax controls how it changes the state of the
parser, rather than describing the state itself. For example, a
string delimiter character toggles the parser state between
``in-string'' and ``in-code,'' but the syntax of characters does not
directly say whether they are inside a string. For example (note that
15 is the syntax code for generic string delimiters),
@example
(put-text-property 1 9 'syntax-table '(15 . nil))
@ -627,46 +627,128 @@ does not tell Emacs that the first eight chars of the current buffer
are a string, but rather that they are all string delimiters. As a
result, Emacs treats them as four consecutive empty string constants.
Every time you use the parser, you specify it a starting state as
well as a starting position. If you omit the starting state, the
default is ``top level in parenthesis structure,'' as it would be at
the beginning of a function definition. (This is the case for
@code{forward-sexp}, which blindly assumes that the starting point is
in such a state.)
@menu
* Motion via Parsing:: Motion functions that work by parsing.
* Position Parse:: Determining the syntactic state of a position.
* Parser State:: How Emacs represents a syntactic state.
* Low-Level Parsing:: Parsing across a specified region.
* Control Parsing:: Parameters that affect parsing.
@end menu
@defun parse-partial-sexp start limit &optional target-depth stop-before state stop-comment
This function parses a sexp in the current buffer starting at
@var{start}, not scanning past @var{limit}. It stops at position
@var{limit} or when certain criteria described below are met, and sets
point to the location where parsing stops. It returns a value
describing the status of the parse at the point where it stops.
@node Motion via Parsing
@subsection Motion Commands Based on Parsing
If @var{state} is @code{nil}, @var{start} is assumed to be at the top
level of parenthesis structure, such as the beginning of a function
definition. Alternatively, you might wish to resume parsing in the
middle of the structure. To do this, you must provide a @var{state}
argument that describes the initial status of parsing.
This section describes simple point-motion functions that operate
based on parsing expressions.
@cindex parenthesis depth
If the third argument @var{target-depth} is non-@code{nil}, parsing
stops if the depth in parentheses becomes equal to @var{target-depth}.
The depth starts at 0, or at whatever is given in @var{state}.
@defun scan-lists from count depth
This function scans forward @var{count} balanced parenthetical groupings
from position @var{from}. It returns the position where the scan stops.
If @var{count} is negative, the scan moves backwards.
If the fourth argument @var{stop-before} is non-@code{nil}, parsing
stops when it comes to any character that starts a sexp. If
@var{stop-comment} is non-@code{nil}, parsing stops when it comes to the
start of a comment. If @var{stop-comment} is the symbol
@code{syntax-table}, parsing stops after the start of a comment or a
string, or the end of a comment or a string, whichever comes first.
If @var{depth} is nonzero, parenthesis depth counting begins from that
value. The only candidates for stopping are places where the depth in
parentheses becomes zero; @code{scan-lists} counts @var{count} such
places and then stops. Thus, a positive value for @var{depth} means go
out @var{depth} levels of parenthesis.
@cindex parse state
The fifth argument @var{state} is a ten-element list of the same form
as the value of this function, described below. The return value of
one call may be used to initialize the state of the parse on another
call to @code{parse-partial-sexp}.
Scanning ignores comments if @code{parse-sexp-ignore-comments} is
non-@code{nil}.
The result is a list of ten elements describing the final state of
the parse:
If the scan reaches the beginning or end of the buffer (or its
accessible portion), and the depth is not zero, an error is signaled.
If the depth is zero but the count is not used up, @code{nil} is
returned.
@end defun
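As a rough illustration of how @var{depth} affects @code{scan-lists},
the sketch below scans a small temporary buffer. The buffer text and
the positions are invented for this example, and it assumes that
@code{emacs-lisp-mode-syntax-table} is a suitable syntax table.

@example
(with-temp-buffer
  (insert "(a (b c) d) (e)")
  (with-syntax-table emacs-lisp-mode-syntax-table
    (list
     ;; Depth 0: skip one whole list, starting before it.
     (scan-lists 1 1 0)
     ;; Depth 1: start inside the inner list and move out one level.
     (scan-lists 5 1 1))))
     @result{} (12 9)
@end example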
@defun scan-sexps from count
This function scans forward @var{count} sexps from position @var{from}.
It returns the position where the scan stops. If @var{count} is
negative, the scan moves backwards.
Scanning ignores comments if @code{parse-sexp-ignore-comments} is
non-@code{nil}.
If the scan reaches the beginning or end of (the accessible part of) the
buffer while in the middle of a parenthetical grouping, an error is
signaled. If it reaches the beginning or end between groupings but
before @var{count} is used up, @code{nil} is returned.
@end defun
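The following sketch shows @code{scan-sexps} moving forward and
backward, and returning @code{nil} when the buffer ends before the
count is used up. The buffer text and the positions are made up for
illustration, and the use of @code{emacs-lisp-mode-syntax-table} is an
assumption of the example.

@example
(with-temp-buffer
  (insert "foo (bar baz)")
  (with-syntax-table emacs-lisp-mode-syntax-table
    (list (scan-sexps 1 1)      ; forward over the symbol
          (scan-sexps 4 1)      ; forward over the list
          (scan-sexps 14 -1)    ; backward over the list
          (scan-sexps 14 1))))  ; nothing left to scan
     @result{} (4 14 5 nil)
@end example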
@defun forward-comment count
This function moves point forward across @var{count} complete comments
(that is, including the starting delimiter and the terminating
delimiter if any), plus any whitespace encountered on the way. It
moves backward if @var{count} is negative. If it encounters anything
other than a comment or whitespace, it stops, leaving point at the
place where it stopped. This includes (for instance) finding the end
of a comment when moving forward and expecting the beginning of one.
The function also stops immediately after moving over the specified
number of complete comments. If @var{count} comments are found as
expected, with nothing except whitespace between them, it returns
@code{t}; otherwise it returns @code{nil}.
This function cannot tell whether the ``comments'' it traverses are
embedded within a string. If they look like comments, it treats them
as comments.
@end defun
To move forward over all comments and whitespace following point, use
@code{(forward-comment (buffer-size))}. @code{(buffer-size)} is a good
argument to use, because the number of comments in the buffer cannot
exceed that many.
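
Here is a minimal sketch of that idiom, run on an invented buffer with
Lisp-style comments. It assumes @code{emacs-lisp-mode-syntax-table}
is in effect. The call returns @code{nil} because it stops at code
before @code{(buffer-size)} comments have been found, and point is
left at the first character that is neither comment nor whitespace.

@example
(with-temp-buffer
  (insert ";; first comment\n  ;; second\n(code)")
  (with-syntax-table emacs-lisp-mode-syntax-table
    (goto-char (point-min))
    (list (forward-comment (buffer-size))
          (looking-at "(code)"))))
     @result{} (nil t)
@end example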
@node Position Parse
@subsection Finding the Parse State for a Position
For syntactic analysis, such as in indentation, often the useful
thing is to compute the syntactic state corresponding to a given buffer
position. This function does that conveniently.
@defun syntax-ppss &optional pos
This function returns the parser state (see next section) that the
parser would reach at position @var{pos} starting from the beginning
of the buffer. This is equivalent to @code{(parse-partial-sexp
(point-min) @var{pos})}, except that @code{syntax-ppss} uses a cache
to speed up the computation. Due to this optimization, element 2
(previous complete subexpression) and element 6 (minimum parenthesis
depth) of the returned parser state are not meaningful.
@end defun
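A common use of @code{syntax-ppss} is to ask whether a position lies
inside a string or a comment. The sketch below is one possible way to
do that; the function name @code{my-in-string-or-comment-p} is
hypothetical. It relies on elements 3 and 4 of the parser state,
which are non-@code{nil} inside a string and inside a comment
respectively, and it wraps the call in @code{save-excursion} in case
@code{syntax-ppss} moves point.

@example
(defun my-in-string-or-comment-p (&optional pos)
  "Return non-nil if POS (default point) is in a string or comment."
  (let ((state (save-excursion (syntax-ppss pos))))
    (or (nth 3 state)       ; inside a string
        (nth 4 state))))    ; inside a comment
@end example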
@code{syntax-ppss} automatically hooks itself to
@code{before-change-functions} to keep its cache consistent. But
updating can fail if @code{syntax-ppss} is called while
@code{before-change-functions} is temporarily let-bound, or if the
buffer is modified without obeying the hook, such as when using
@code{inhibit-modification-hooks}. For this reason, it is sometimes
necessary to flush the cache manually.
@defun syntax-ppss-flush-cache beg
This function flushes the cache used by @code{syntax-ppss}, starting at
position @var{beg}.
@end defun
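For instance, code that modifies the buffer with
@code{inhibit-modification-hooks} bound can flush the cache itself
afterward. This is only a sketch; the function name
@code{my-insert-quietly} is hypothetical.

@example
(defun my-insert-quietly (text)
  "Insert TEXT at point without running change hooks."
  (let ((start (point))
        (inhibit-modification-hooks t))
    (insert text)
    ;; The change hooks did not run, so the cache may be stale
    ;; from START onward.
    (syntax-ppss-flush-cache start)))
@end example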
Major modes can make @code{syntax-ppss} run faster by specifying
where it needs to start parsing.
@defvar syntax-begin-function
If this is non-@code{nil}, it should be a function that moves to an
earlier buffer position where the parser state is equivalent to
@code{nil}---in other words, a position outside of any comment,
string, or parenthesis. @code{syntax-ppss} uses it to further
optimize its computations, when the cache gives no help.
@end defvar
@node Parser State
@subsection Parser State
@cindex parser state
A @dfn{parser state} is a list of ten elements describing the final
state of parsing text syntactically as part of an expression. The
parsing functions in the following sections return a parser state as
the value, and in some cases accept one as an argument also, so that
you can resume parsing after it stops. Here are the meanings of the
elements of the parser state:
@enumerate 0
@item
@ -721,81 +803,65 @@ data is subject to change; it is used if you pass this list
as the @var{state} argument to another call.
@end enumerate
Elements 1, 2, and 6 are ignored in the argument @var{state}. Element
8 is used only to set the corresponding element of the return value,
in certain simple cases. Element 9 is used only to set element 1 of
the return value, in trivial cases where parsing starts and stops
within the same pair of parentheses.
Elements 1, 2, and 6 are ignored in a state which you pass as an
argument to continue parsing, and elements 8 and 9 are used only in
trivial cases. Those elements serve primarily to convey information
to the Lisp program which does the parsing.
@cindex indenting with parentheses
This function is most often used to compute indentation for languages
that have nested parentheses.
One additional piece of useful information is available from a
parser state using this function:
@defun syntax-ppss-toplevel-pos state
This function extracts, from parser state @var{state}, the last
position scanned in the parse which was at top level in grammatical
structure. ``At top level'' means outside of any parentheses,
comments, or strings.
The value is @code{nil} if @var{state} represents a parse which has
arrived at a top level position.
@end defun
@defun syntax-ppss &optional pos
This function returns the state that the parser would have at position
@var{pos}, if it were started with a default start state at the
beginning of the buffer. Thus, it is equivalent to
@code{(parse-partial-sexp (point-min) @var{pos})}, except that
@code{syntax-ppss} uses a cache to speed up the computation. Also,
the 2nd value (previous complete subexpression) and 6th value (minimum
parenthesis depth) of the returned state are not meaningful.
We have provided this access function rather than document how the
data is represented in the state, because we plan to change the
representation in the future.
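
As a sketch of how this function might be used, the hypothetical
command below moves point to the start of the top-level construct
that contains it, and does nothing when point is already at top
level. The name @code{my-goto-toplevel-start} is an assumption made
for the example.

@example
(defun my-goto-toplevel-start ()
  "Move point to the start of the enclosing top-level construct."
  (let ((top (save-excursion
               (syntax-ppss-toplevel-pos (syntax-ppss)))))
    (when top
      (goto-char top))))
@end example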
@node Low-Level Parsing
@subsection Low-Level Parsing
The most basic way to use the expression parser is to tell it
to start at a given position with a certain state, and parse up to
a specified end position.
@defun parse-partial-sexp start limit &optional target-depth stop-before state stop-comment
This function parses a sexp in the current buffer starting at
@var{start}, not scanning past @var{limit}. It stops at position
@var{limit} or when certain criteria described below are met, and sets
point to the location where parsing stops. It returns a parser state
describing the status of the parse at the point where it stops.
@cindex parenthesis depth
If the third argument @var{target-depth} is non-@code{nil}, parsing
stops if the depth in parentheses becomes equal to @var{target-depth}.
The depth starts at 0, or at whatever is given in @var{state}.
If the fourth argument @var{stop-before} is non-@code{nil}, parsing
stops when it comes to any character that starts a sexp. If
@var{stop-comment} is non-@code{nil}, parsing stops when it comes to the
start of a comment. If @var{stop-comment} is the symbol
@code{syntax-table}, parsing stops after the start of a comment or a
string, or the end of a comment or a string, whichever comes first.
If @var{state} is @code{nil}, @var{start} is assumed to be at the top
level of parenthesis structure, such as the beginning of a function
definition. Alternatively, you might wish to resume parsing in the
middle of the structure. To do this, you must provide a @var{state}
argument that describes the initial status of parsing. The value
returned by a previous call to @code{parse-partial-sexp} will do
nicely.
@end defun
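
The following sketch parses an invented buffer in two steps, passing
the state returned by the first call as the @var{state} argument of
the second. It assumes @code{emacs-lisp-mode-syntax-table} and looks
only at element 0 of each state, the depth in parentheses.

@example
(with-temp-buffer
  (insert "(defun f (x) (list x")
  (with-syntax-table emacs-lisp-mode-syntax-table
    (let* ((mid 10)
           ;; Parse from the start of the buffer up to MID.
           (state (parse-partial-sexp (point-min) mid))
           ;; Resume from MID, starting in the state reached there.
           (final (parse-partial-sexp mid (point-max)
                                      nil nil state)))
      (list (car state) (car final)))))
     @result{} (1 2)
@end example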
@defun syntax-ppss-flush-cache beg
This function flushes the cache used by @code{syntax-ppss}, starting at
position @var{beg}.
When @code{syntax-ppss} is called, it automatically hooks itself
to @code{before-change-functions} to keep its cache consistent.
But this can fail if @code{syntax-ppss} is called while
@code{before-change-functions} is temporarily let-bound, or if the
buffer is modified without obeying the hook, such as when using
@code{inhibit-modification-hooks}. For this reason, it is sometimes
necessary to flush the cache manually.
@end defun
@defvar syntax-begin-function
If this is non-@code{nil}, it should be a function that moves to an
earlier buffer position where the parser state is equivalent to
@code{nil}---in other words, a position outside of any comment,
string, or parenthesis. @code{syntax-ppss} uses it to supplement its
cache.
@end defvar
@defun scan-lists from count depth
This function scans forward @var{count} balanced parenthetical groupings
from position @var{from}. It returns the position where the scan stops.
If @var{count} is negative, the scan moves backwards.
If @var{depth} is nonzero, parenthesis depth counting begins from that
value. The only candidates for stopping are places where the depth in
parentheses becomes zero; @code{scan-lists} counts @var{count} such
places and then stops. Thus, a positive value for @var{depth} means go
out @var{depth} levels of parenthesis.
Scanning ignores comments if @code{parse-sexp-ignore-comments} is
non-@code{nil}.
If the scan reaches the beginning or end of the buffer (or its
accessible portion), and the depth is not zero, an error is signaled.
If the depth is zero but the count is not used up, @code{nil} is
returned.
@end defun
@defun scan-sexps from count
This function scans forward @var{count} sexps from position @var{from}.
It returns the position where the scan stops. If @var{count} is
negative, the scan moves backwards.
Scanning ignores comments if @code{parse-sexp-ignore-comments} is
non-@code{nil}.
If the scan reaches the beginning or end of (the accessible part of) the
buffer while in the middle of a parenthetical grouping, an error is
signaled. If it reaches the beginning or end between groupings but
before count is used up, @code{nil} is returned.
@end defun
@node Control Parsing
@subsection Parameters to Control Parsing
@defvar multibyte-syntax-as-symbol
If this variable is non-@code{nil}, @code{scan-sexps} treats all
@ -817,29 +883,6 @@ The behavior of @code{parse-partial-sexp} is also affected by
You can use @code{forward-comment} to move forward or backward over
one comment or several comments.
@defun forward-comment count
This function moves point forward across @var{count} complete comments
(that is, including the starting delimiter and the terminating
delimiter if any), plus any whitespace encountered on the way. It
moves backward if @var{count} is negative. If it encounters anything
other than a comment or whitespace, it stops, leaving point at the
place where it stopped. This includes (for instance) finding the end
of a comment when moving forward and expecting the beginning of one.
The function also stops immediately after moving over the specified
number of complete comments. If @var{count} comments are found as
expected, with nothing except whitespace between them, it returns
@code{t}; otherwise it returns @code{nil}.
This function cannot tell whether the ``comments'' it traverses are
embedded within a string. If they look like comments, it treats them
as comments.
@end defun
To move forward over all comments and whitespace following point, use
@code{(forward-comment (buffer-size))}. @code{(buffer-size)} is a good
argument to use, because the number of comments in the buffer cannot
exceed that many.
@node Standard Syntax Tables
@section Some Standard Syntax Tables