mirror of
https://git.savannah.gnu.org/git/emacs.git
synced 2024-12-15 09:47:20 +00:00
997 lines
44 KiB
Plaintext
997 lines
44 KiB
Plaintext
@c This is part of the Emacs manual.
|
|
@c Copyright (C) 1985, 86, 87, 93, 94, 95, 97, 2000
|
|
@c Free Software Foundation, Inc.
|
|
@c See file emacs.texi for copying conditions.
|
|
@node Search, Fixit, Display, Top
|
|
@chapter Searching and Replacement
|
|
@cindex searching
|
|
@cindex finding strings within text
|
|
|
|
Like other editors, Emacs has commands for searching for occurrences of
|
|
a string. The principal search command is unusual in that it is
|
|
@dfn{incremental}; it begins to search before you have finished typing the
|
|
search string. There are also nonincremental search commands more like
|
|
those of other editors.
|
|
|
|
Besides the usual @code{replace-string} command that finds all
|
|
occurrences of one string and replaces them with another, Emacs has a fancy
|
|
replacement command called @code{query-replace} which asks interactively
|
|
which occurrences to replace.
|
|
|
|
@menu
|
|
* Incremental Search:: Search happens as you type the string.
|
|
* Nonincremental Search:: Specify entire string and then search.
|
|
* Word Search:: Search for sequence of words.
|
|
* Regexp Search:: Search for match for a regexp.
|
|
* Regexps:: Syntax of regular expressions.
|
|
* Search Case:: To ignore case while searching, or not.
|
|
* Replace:: Search, and replace some or all matches.
|
|
* Other Repeating Search:: Operating on all matches for some regexp.
|
|
@end menu
|
|
|
|
@node Incremental Search, Nonincremental Search, Search, Search
|
|
@section Incremental Search
|
|
|
|
@cindex incremental search
|
|
An incremental search begins searching as soon as you type the first
|
|
character of the search string. As you type in the search string, Emacs
|
|
shows you where the string (as you have typed it so far) would be
|
|
found. When you have typed enough characters to identify the place you
|
|
want, you can stop. Depending on what you plan to do next, you may or
|
|
may not need to terminate the search explicitly with @key{RET}.
|
|
|
|
@c WideCommands
|
|
@table @kbd
|
|
@item C-s
|
|
Incremental search forward (@code{isearch-forward}).
|
|
@item C-r
|
|
Incremental search backward (@code{isearch-backward}).
|
|
@end table
|
|
|
|
@kindex C-s
|
|
@findex isearch-forward
|
|
@kbd{C-s} starts an incremental search. @kbd{C-s} reads characters from
|
|
the keyboard and positions the cursor at the first occurrence of the
|
|
characters that you have typed. If you type @kbd{C-s} and then @kbd{F},
|
|
the cursor moves right after the first @samp{F}. Type an @kbd{O}, and see
|
|
the cursor move to after the first @samp{FO}. After another @kbd{O}, the
|
|
cursor is after the first @samp{FOO} after the place where you started the
|
|
search. At each step, the buffer text that matches the search string is
|
|
highlighted, if the terminal can do that; at each step, the current search
|
|
string is updated in the echo area.
|
|
|
|
If you make a mistake in typing the search string, you can cancel
|
|
characters with @key{DEL}. Each @key{DEL} cancels the last character of
|
|
search string. This does not happen until Emacs is ready to read another
|
|
input character; first it must either find, or fail to find, the character
|
|
you want to erase. If you do not want to wait for this to happen, use
|
|
@kbd{C-g} as described below.
|
|
|
|
When you are satisfied with the place you have reached, you can type
|
|
@key{RET}, which stops searching, leaving the cursor where the search
|
|
brought it. Also, any command not specially meaningful in searches
|
|
stops the searching and is then executed. Thus, typing @kbd{C-a} would
|
|
exit the search and then move to the beginning of the line. @key{RET}
|
|
is necessary only if the next command you want to type is a printing
|
|
character, @key{DEL}, @key{RET}, or another control character that is
|
|
special within searches (@kbd{C-q}, @kbd{C-w}, @kbd{C-r}, @kbd{C-s},
|
|
@kbd{C-y}, @kbd{M-y}, @kbd{M-r}, or @kbd{M-s}).
|
|
|
|
Sometimes you search for @samp{FOO} and find it, but not the one you
|
|
expected to find. There was a second @samp{FOO} that you forgot about,
|
|
before the one you were aiming for. In this event, type another @kbd{C-s}
|
|
to move to the next occurrence of the search string. This can be done any
|
|
number of times. If you overshoot, you can cancel some @kbd{C-s}
|
|
characters with @key{DEL}.
|
|
|
|
After you exit a search, you can search for the same string again by
|
|
typing just @kbd{C-s C-s}: the first @kbd{C-s} is the key that invokes
|
|
incremental search, and the second @kbd{C-s} means ``search again.''
|
|
|
|
To reuse earlier search strings, use the @dfn{search ring}. The
|
|
commands @kbd{M-p} and @kbd{M-n} move through the ring to pick a search
|
|
string to reuse. These commands leave the selected search ring element
|
|
in the minibuffer, where you can edit it. Type @kbd{C-s} or @kbd{C-r}
|
|
to terminate editing the string and search for it.
|
|
|
|
If your string is not found at all, the echo area says @samp{Failing
|
|
I-Search}. The cursor is after the place where Emacs found as much of your
|
|
string as it could. Thus, if you search for @samp{FOOT}, and there is no
|
|
@samp{FOOT}, you might see the cursor after the @samp{FOO} in @samp{FOOL}.
|
|
At this point there are several things you can do. If your string was
|
|
mistyped, you can rub some of it out and correct it. If you like the place
|
|
you have found, you can type @key{RET} or some other Emacs command to
|
|
``accept what the search offered.'' Or you can type @kbd{C-g}, which
|
|
removes from the search string the characters that could not be found (the
|
|
@samp{T} in @samp{FOOT}), leaving those that were found (the @samp{FOO} in
|
|
@samp{FOOT}). A second @kbd{C-g} at that point cancels the search
|
|
entirely, returning point to where it was when the search started.
|
|
|
|
An upper-case letter in the search string makes the search
|
|
case-sensitive. If you delete the upper-case character from the search
|
|
string, it ceases to have this effect. @xref{Search Case}.
|
|
|
|
If a search is failing and you ask to repeat it by typing another
|
|
@kbd{C-s}, it starts again from the beginning of the buffer. Repeating
|
|
a failing reverse search with @kbd{C-r} starts again from the end. This
|
|
is called @dfn{wrapping around}. @samp{Wrapped} appears in the search
|
|
prompt once this has happened. If you keep on going past the original
|
|
starting point of the search, it changes to @samp{Overwrapped}, which
|
|
means that you are revisiting matches that you have already seen.
|
|
|
|
@cindex quitting (in search)
|
|
The @kbd{C-g} ``quit'' character does special things during searches;
|
|
just what it does depends on the status of the search. If the search has
|
|
found what you specified and is waiting for input, @kbd{C-g} cancels the
|
|
entire search. The cursor moves back to where you started the search. If
|
|
@kbd{C-g} is typed when there are characters in the search string that have
|
|
not been found---because Emacs is still searching for them, or because it
|
|
has failed to find them---then the search string characters which have not
|
|
been found are discarded from the search string. With them gone, the
|
|
search is now successful and waiting for more input, so a second @kbd{C-g}
|
|
will cancel the entire search.
|
|
|
|
To search for a newline, type @kbd{C-j}. To search for another
|
|
control character, such as control-S or carriage return, you must quote
|
|
it by typing @kbd{C-q} first. This function of @kbd{C-q} is analogous
|
|
to its use for insertion (@pxref{Inserting Text}): it causes the
|
|
following character to be treated the way any ``ordinary'' character is
|
|
treated in the same context. You can also specify a character by its
|
|
octal code: enter @kbd{C-q} followed by a sequence of octal digits.
|
|
|
|
You can change to searching backwards with @kbd{C-r}. If a search fails
|
|
because the place you started was too late in the file, you should do this.
|
|
Repeated @kbd{C-r} keeps looking for more occurrences backwards. A
|
|
@kbd{C-s} starts going forwards again. @kbd{C-r} in a search can be canceled
|
|
with @key{DEL}.
|
|
|
|
@kindex C-r
|
|
@findex isearch-backward
|
|
If you know initially that you want to search backwards, you can use
|
|
@kbd{C-r} instead of @kbd{C-s} to start the search, because @kbd{C-r} as
|
|
a key runs a command (@code{isearch-backward}) to search backward. A
|
|
backward search finds matches that are entirely before the starting
|
|
point, just as a forward search finds matches that begin after it.
|
|
|
|
The characters @kbd{C-y} and @kbd{C-w} can be used in incremental
|
|
search to grab text from the buffer into the search string. This makes
|
|
it convenient to search for another occurrence of text at point.
|
|
@kbd{C-w} copies the word after point as part of the search string,
|
|
advancing point over that word. Another @kbd{C-s} to repeat the search
|
|
will then search for a string including that word. @kbd{C-y} is similar
|
|
to @kbd{C-w} but copies all the rest of the current line into the search
|
|
string. Both @kbd{C-y} and @kbd{C-w} convert the text they copy to
|
|
lower case if the search is currently not case-sensitive; this is so the
|
|
search remains case-insensitive.
|
|
|
|
The character @kbd{M-y} copies text from the kill ring into the search
|
|
string. It uses the same text that @kbd{C-y} as a command would yank.
|
|
@kbd{mouse-2} in the echo area does the same.
|
|
@xref{Yanking}.
|
|
|
|
When you exit the incremental search, it sets the mark to where point
|
|
@emph{was}, before the search. That is convenient for moving back
|
|
there. In Transient Mark mode, incremental search sets the mark without
|
|
activating it, and does so only if the mark is not already active.
|
|
|
|
@cindex lazy search highlighting
|
|
By default, Isearch uses @dfn{lazy highlighting}. All matches for
|
|
the current search string in the buffer after the point where searching
|
|
starts are highlighted. The extra highlighting makes it easier to
|
|
anticipate where the cursor will end up each time you press @kbd{C-s} or
|
|
@kbd{C-r} to repeat a pending search. Highlighting of these additional
|
|
matches happens in a deferred fashion so as not to rob Isearch of its
|
|
usual snappy response.
|
|
@vindex isearch-lazy-highlight-cleanup
|
|
By default the highlighting of matches is cleared when you end the
|
|
search. Customize the variable @code{isearch-lazy-highlight-cleanup} to
|
|
avoid cleaning up automatically. The command @kbd{M-x
|
|
isearch-lazy-highlight-cleanup} can be used to clean up manually.
|
|
@vindex isearch-lazy-highlight
|
|
Customize the variable @code{isearch-lazy-highlight} to turn off this
|
|
feature.
|
|
|
|
@vindex isearch-mode-map
|
|
To customize the special characters that incremental search understands,
|
|
alter their bindings in the keymap @code{isearch-mode-map}. For a list
|
|
of bindings, look at the documentation of @code{isearch-mode} with
|
|
@kbd{C-h f isearch-mode @key{RET}}.
|
|
|
|
@subsection Slow Terminal Incremental Search
|
|
|
|
Incremental search on a slow terminal uses a modified style of display
|
|
that is designed to take less time. Instead of redisplaying the buffer at
|
|
each place the search gets to, it creates a new single-line window and uses
|
|
that to display the line that the search has found. The single-line window
|
|
comes into play as soon as point gets outside of the text that is already
|
|
on the screen.
|
|
|
|
When you terminate the search, the single-line window is removed.
|
|
Then Emacs redisplays the window in which the search was done, to show
|
|
its new position of point.
|
|
|
|
@ignore
|
|
The three dots at the end of the search string, normally used to indicate
|
|
that searching is going on, are not displayed in slow style display.
|
|
@end ignore
|
|
|
|
@vindex search-slow-speed
|
|
The slow terminal style of display is used when the terminal baud rate is
|
|
less than or equal to the value of the variable @code{search-slow-speed},
|
|
initially 1200.
|
|
|
|
@vindex search-slow-window-lines
|
|
The number of lines to use in slow terminal search display is controlled
|
|
by the variable @code{search-slow-window-lines}. Its normal value is 1.
|
|
|
|
@node Nonincremental Search, Word Search, Incremental Search, Search
|
|
@section Nonincremental Search
|
|
@cindex nonincremental search
|
|
|
|
Emacs also has conventional nonincremental search commands, which require
|
|
you to type the entire search string before searching begins.
|
|
|
|
@table @kbd
|
|
@item C-s @key{RET} @var{string} @key{RET}
|
|
Search for @var{string}.
|
|
@item C-r @key{RET} @var{string} @key{RET}
|
|
Search backward for @var{string}.
|
|
@end table
|
|
|
|
To do a nonincremental search, first type @kbd{C-s @key{RET}}. This
|
|
enters the minibuffer to read the search string; terminate the string
|
|
with @key{RET}, and then the search takes place. If the string is not
|
|
found, the search command gets an error.
|
|
|
|
The way @kbd{C-s @key{RET}} works is that the @kbd{C-s} invokes
|
|
incremental search, which is specially programmed to invoke nonincremental
|
|
search if the argument you give it is empty. (Such an empty argument would
|
|
otherwise be useless.) @kbd{C-r @key{RET}} also works this way.
|
|
|
|
However, nonincremental searches performed using @kbd{C-s @key{RET}} do
|
|
not call @code{search-forward} right away. The first thing done is to see
|
|
if the next character is @kbd{C-w}, which requests a word search.
|
|
@ifinfo
|
|
@xref{Word Search}.
|
|
@end ifinfo
|
|
|
|
@findex search-forward
|
|
@findex search-backward
|
|
Forward and backward nonincremental searches are implemented by the
|
|
commands @code{search-forward} and @code{search-backward}. These
|
|
commands may be bound to keys in the usual manner. The feature that you
|
|
can get to them via the incremental search commands exists for
|
|
historical reasons, and to avoid the need to find suitable key sequences
|
|
for them.
|
|
|
|
@node Word Search, Regexp Search, Nonincremental Search, Search
|
|
@section Word Search
|
|
@cindex word search
|
|
|
|
Word search searches for a sequence of words without regard to how the
|
|
words are separated. More precisely, you type a string of many words,
|
|
using single spaces to separate them, and the string can be found even if
|
|
there are multiple spaces, newlines or other punctuation between the words.
|
|
|
|
Word search is useful for editing a printed document made with a text
|
|
formatter. If you edit while looking at the printed, formatted version,
|
|
you can't tell where the line breaks are in the source file. With word
|
|
search, you can search without having to know them.
|
|
|
|
@table @kbd
|
|
@item C-s @key{RET} C-w @var{words} @key{RET}
|
|
Search for @var{words}, ignoring details of punctuation.
|
|
@item C-r @key{RET} C-w @var{words} @key{RET}
|
|
Search backward for @var{words}, ignoring details of punctuation.
|
|
@end table
|
|
|
|
Word search is a special case of nonincremental search and is invoked
|
|
with @kbd{C-s @key{RET} C-w}. This is followed by the search string,
|
|
which must always be terminated with @key{RET}. Being nonincremental,
|
|
this search does not start until the argument is terminated. It works
|
|
by constructing a regular expression and searching for that; see
|
|
@ref{Regexp Search}.
|
|
|
|
Use @kbd{C-r @key{RET} C-w} to do backward word search.
|
|
|
|
@findex word-search-forward
|
|
@findex word-search-backward
|
|
Forward and backward word searches are implemented by the commands
|
|
@code{word-search-forward} and @code{word-search-backward}. These
|
|
commands may be bound to keys in the usual manner. The feature that you
|
|
can get to them via the incremental search commands exists for historical
|
|
reasons, and to avoid the need to find suitable key sequences for them.
|
|
|
|
@node Regexp Search, Regexps, Word Search, Search
|
|
@section Regular Expression Search
|
|
@cindex regular expression
|
|
@cindex regexp
|
|
|
|
A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern that
|
|
denotes a class of alternative strings to match, possibly infinitely
|
|
many. In GNU Emacs, you can search for the next match for a regexp
|
|
either incrementally or not.
|
|
|
|
@kindex C-M-s
|
|
@findex isearch-forward-regexp
|
|
@kindex C-M-r
|
|
@findex isearch-backward-regexp
|
|
Incremental search for a regexp is done by typing @kbd{C-M-s}
|
|
(@code{isearch-forward-regexp}). This command reads a search string
|
|
incrementally just like @kbd{C-s}, but it treats the search string as a
|
|
regexp rather than looking for an exact match against the text in the
|
|
buffer. Each time you add text to the search string, you make the
|
|
regexp longer, and the new regexp is searched for. Invoking @kbd{C-s}
|
|
with a prefix argument (its value does not matter) is another way to do
|
|
a forward incremental regexp search. To search backward for a regexp,
|
|
use @kbd{C-M-r} (@code{isearch-backward-regexp}), or @kbd{C-r} with a
|
|
prefix argument.
|
|
|
|
All of the control characters that do special things within an
|
|
ordinary incremental search have the same function in incremental regexp
|
|
search. Typing @kbd{C-s} or @kbd{C-r} immediately after starting the
|
|
search retrieves the last incremental search regexp used; that is to
|
|
say, incremental regexp and non-regexp searches have independent
|
|
defaults. They also have separate search rings that you can access with
|
|
@kbd{M-p} and @kbd{M-n}.
|
|
|
|
If you type @key{SPC} in incremental regexp search, it matches any
|
|
sequence of whitespace characters, including newlines. If you want
|
|
to match just a space, type @kbd{C-q @key{SPC}}.
|
|
|
|
Note that adding characters to the regexp in an incremental regexp
|
|
search can make the cursor move back and start again. For example, if
|
|
you have searched for @samp{foo} and you add @samp{\|bar}, the cursor
|
|
backs up in case the first @samp{bar} precedes the first @samp{foo}.
|
|
|
|
@findex re-search-forward
|
|
@findex re-search-backward
|
|
Nonincremental search for a regexp is done by the functions
|
|
@code{re-search-forward} and @code{re-search-backward}. You can invoke
|
|
these with @kbd{M-x}, or bind them to keys, or invoke them by way of
|
|
incremental regexp search with @kbd{C-M-s @key{RET}} and @kbd{C-M-r
|
|
@key{RET}}.
|
|
|
|
If you use the incremental regexp search commands with a prefix
|
|
argument, they perform ordinary string search, like
|
|
@code{isearch-forward} and @code{isearch-backward}. @xref{Incremental
|
|
Search}.
|
|
|
|
@node Regexps, Search Case, Regexp Search, Search
|
|
@section Syntax of Regular Expressions
|
|
@cindex regexp syntax
|
|
|
|
Regular expressions have a syntax in which a few characters are
|
|
special constructs and the rest are @dfn{ordinary}. An ordinary
|
|
character is a simple regular expression which matches that same
|
|
character and nothing else. The special characters are @samp{$},
|
|
@samp{^}, @samp{.}, @samp{*}, @samp{+}, @samp{?}, @samp{[}, @samp{]} and
|
|
@samp{\}. Any other character appearing in a regular expression is
|
|
ordinary, unless a @samp{\} precedes it.
|
|
|
|
For example, @samp{f} is not a special character, so it is ordinary, and
|
|
therefore @samp{f} is a regular expression that matches the string
|
|
@samp{f} and no other string. (It does @emph{not} match the string
|
|
@samp{ff}.) Likewise, @samp{o} is a regular expression that matches
|
|
only @samp{o}. (When case distinctions are being ignored, these regexps
|
|
also match @samp{F} and @samp{O}, but we consider this a generalization
|
|
of ``the same string,'' rather than an exception.)
|
|
|
|
Any two regular expressions @var{a} and @var{b} can be concatenated. The
|
|
result is a regular expression which matches a string if @var{a} matches
|
|
some amount of the beginning of that string and @var{b} matches the rest of
|
|
the string.@refill
|
|
|
|
As a simple example, we can concatenate the regular expressions @samp{f}
|
|
and @samp{o} to get the regular expression @samp{fo}, which matches only
|
|
the string @samp{fo}. Still trivial. To do something nontrivial, you
|
|
need to use one of the special characters. Here is a list of them.
|
|
|
|
@table @kbd
|
|
@item .@: @r{(Period)}
|
|
is a special character that matches any single character except a newline.
|
|
Using concatenation, we can make regular expressions like @samp{a.b}, which
|
|
matches any three-character string that begins with @samp{a} and ends with
|
|
@samp{b}.@refill
|
|
|
|
@item *
|
|
is not a construct by itself; it is a postfix operator that means to
|
|
match the preceding regular expression repetitively as many times as
|
|
possible. Thus, @samp{o*} matches any number of @samp{o}s (including no
|
|
@samp{o}s).
|
|
|
|
@samp{*} always applies to the @emph{smallest} possible preceding
|
|
expression. Thus, @samp{fo*} has a repeating @samp{o}, not a repeating
|
|
@samp{fo}. It matches @samp{f}, @samp{fo}, @samp{foo}, and so on.
|
|
|
|
The matcher processes a @samp{*} construct by matching, immediately,
|
|
as many repetitions as can be found. Then it continues with the rest
|
|
of the pattern. If that fails, backtracking occurs, discarding some
|
|
of the matches of the @samp{*}-modified construct in case that makes
|
|
it possible to match the rest of the pattern. For example, in matching
|
|
@samp{ca*ar} against the string @samp{caaar}, the @samp{a*} first
|
|
tries to match all three @samp{a}s; but the rest of the pattern is
|
|
@samp{ar} and there is only @samp{r} left to match, so this try fails.
|
|
The next alternative is for @samp{a*} to match only two @samp{a}s.
|
|
With this choice, the rest of the regexp matches successfully.@refill
|
|
|
|
@item +
|
|
is a postfix operator, similar to @samp{*} except that it must match
|
|
the preceding expression at least once. So, for example, @samp{ca+r}
|
|
matches the strings @samp{car} and @samp{caaaar} but not the string
|
|
@samp{cr}, whereas @samp{ca*r} matches all three strings.
|
|
|
|
@item ?
|
|
is a postfix operator, similar to @samp{*} except that it can match the
|
|
preceding expression either once or not at all. For example,
|
|
@samp{ca?r} matches @samp{car} or @samp{cr}; nothing else.
|
|
|
|
@item *?, +?, ??
|
|
@cindex non-greedy regexp matching
|
|
are non-greedy variants of the operators above. The normal operators
|
|
@samp{*}, @samp{+}, @samp{?} are @dfn{greedy} in that they match as much
|
|
as they can, while if you append a @samp{?} after them, it makes them
|
|
non-greedy: they will match as little as possible.
|
|
|
|
@item \@{@var{n},@var{m}\@}
|
|
is another postfix operator that specifies an interval of iteration:
|
|
the preceding regular expression must match between @var{n} and
|
|
@var{m} times. If @var{m} is omitted, then there is no upper bound
|
|
and if @samp{,@var{m}} is omitted, then the regular expression must match
|
|
exactly @var{n} times. @*
|
|
@samp{\@{0,1\@}} is equivalent to @samp{?}. @*
|
|
@samp{\@{0,\@}} is equivalent to @samp{*}. @*
|
|
@samp{\@{1,\@}} is equivalent to @samp{+}. @*
|
|
@samp{\@{@var{n}\@}} is equivalent to @samp{\@{@var{n},@var{n}\@}}.
|
|
|
|
@item [ @dots{} ]
|
|
is a @dfn{character set}, which begins with @samp{[} and is terminated
|
|
by @samp{]}. In the simplest case, the characters between the two
|
|
brackets are what this set can match.
|
|
|
|
Thus, @samp{[ad]} matches either one @samp{a} or one @samp{d}, and
|
|
@samp{[ad]*} matches any string composed of just @samp{a}s and @samp{d}s
|
|
(including the empty string), from which it follows that @samp{c[ad]*r}
|
|
matches @samp{cr}, @samp{car}, @samp{cdr}, @samp{caddaar}, etc.
|
|
|
|
You can also include character ranges in a character set, by writing the
|
|
starting and ending characters with a @samp{-} between them. Thus,
|
|
@samp{[a-z]} matches any lower-case ASCII letter. Ranges may be
|
|
intermixed freely with individual characters, as in @samp{[a-z$%.]},
|
|
which matches any lower-case ASCII letter or @samp{$}, @samp{%} or
|
|
period.
|
|
|
|
Note that the usual regexp special characters are not special inside a
|
|
character set. A completely different set of special characters exists
|
|
inside character sets: @samp{]}, @samp{-} and @samp{^}.
|
|
|
|
To include a @samp{]} in a character set, you must make it the first
|
|
character. For example, @samp{[]a]} matches @samp{]} or @samp{a}. To
|
|
include a @samp{-}, write @samp{-} as the first or last character of the
|
|
set, or put it after a range. Thus, @samp{[]-]} matches both @samp{]}
|
|
and @samp{-}.
|
|
|
|
To include @samp{^} in a set, put it anywhere but at the beginning of
|
|
the set.
|
|
|
|
When you use a range in case-insensitive search, you should write both
|
|
ends of the range in upper case, or both in lower case, or both should
|
|
be non-letters. The behavior of a mixed-case range such as @samp{A-z}
|
|
is somewhat ill-defined, and it may change in future Emacs versions.
|
|
|
|
@item [^ @dots{} ]
|
|
@samp{[^} begins a @dfn{complemented character set}, which matches any
|
|
character except the ones specified. Thus, @samp{[^a-z0-9A-Z]} matches
|
|
all characters @emph{except} letters and digits.
|
|
|
|
@samp{^} is not special in a character set unless it is the first
|
|
character. The character following the @samp{^} is treated as if it
|
|
were first (in other words, @samp{-} and @samp{]} are not special there).
|
|
|
|
A complemented character set can match a newline, unless newline is
|
|
mentioned as one of the characters not to match. This is in contrast to
|
|
the handling of regexps in programs such as @code{grep}.
|
|
|
|
@item ^
|
|
is a special character that matches the empty string, but only at the
|
|
beginning of a line in the text being matched. Otherwise it fails to
|
|
match anything. Thus, @samp{^foo} matches a @samp{foo} that occurs at
|
|
the beginning of a line.
|
|
|
|
@item $
|
|
is similar to @samp{^} but matches only at the end of a line. Thus,
|
|
@samp{x+$} matches a string of one @samp{x} or more at the end of a line.
|
|
|
|
@item \
|
|
has two functions: it quotes the special characters (including
|
|
@samp{\}), and it introduces additional special constructs.
|
|
|
|
Because @samp{\} quotes special characters, @samp{\$} is a regular
|
|
expression that matches only @samp{$}, and @samp{\[} is a regular
|
|
expression that matches only @samp{[}, and so on.
|
|
@end table
|
|
|
|
Note: for historical compatibility, special characters are treated as
|
|
ordinary ones if they are in contexts where their special meanings make no
|
|
sense. For example, @samp{*foo} treats @samp{*} as ordinary since there is
|
|
no preceding expression on which the @samp{*} can act. It is poor practice
|
|
to depend on this behavior; it is better to quote the special character anyway,
|
|
regardless of where it appears.@refill
|
|
|
|
For the most part, @samp{\} followed by any character matches only that
|
|
character. However, there are several exceptions: two-character
|
|
sequences starting with @samp{\} that have special meanings. The second
|
|
character in the sequence is always an ordinary character when used on
|
|
its own. Here is a table of @samp{\} constructs.
|
|
|
|
@table @kbd
|
|
@item \|
|
|
specifies an alternative. Two regular expressions @var{a} and @var{b}
|
|
with @samp{\|} in between form an expression that matches some text if
|
|
either @var{a} matches it or @var{b} matches it. It works by trying to
|
|
match @var{a}, and if that fails, by trying to match @var{b}.
|
|
|
|
Thus, @samp{foo\|bar} matches either @samp{foo} or @samp{bar}
|
|
but no other string.@refill
|
|
|
|
@samp{\|} applies to the largest possible surrounding expressions. Only a
|
|
surrounding @samp{\( @dots{} \)} grouping can limit the grouping power of
|
|
@samp{\|}.@refill
|
|
|
|
Full backtracking capability exists to handle multiple uses of @samp{\|}.
|
|
|
|
@item \( @dots{} \)
|
|
is a grouping construct that serves three purposes:
|
|
|
|
@enumerate
|
|
@item
|
|
To enclose a set of @samp{\|} alternatives for other operations.
|
|
Thus, @samp{\(foo\|bar\)x} matches either @samp{foox} or @samp{barx}.
|
|
|
|
@item
|
|
To enclose a complicated expression for the postfix operators @samp{*},
|
|
@samp{+} and @samp{?} to operate on. Thus, @samp{ba\(na\)*} matches
|
|
@samp{bananana}, etc., with any (zero or more) number of @samp{na}
|
|
strings.@refill
|
|
|
|
@item
|
|
To record a matched substring for future reference.
|
|
@end enumerate
|
|
|
|
This last application is not a consequence of the idea of a
|
|
parenthetical grouping; it is a separate feature that is assigned as a
|
|
second meaning to the same @samp{\( @dots{} \)} construct. In practice
|
|
there is almost no conflict between the two meanings.
|
|
|
|
@item \(?: @dots{} \)
|
|
is another grouping construct (often called ``shy'') that serves the same
|
|
first two purposes, but not the third:
|
|
it cannot be referred to later on by number. This is only useful
|
|
for mechanically constructed regular expressions where grouping
|
|
constructs need to be introduced implicitly and hence risk changing the
|
|
numbering of subsequent groups.
|
|
|
|
@item \@var{d}
|
|
matches the same text that matched the @var{d}th occurrence of a
|
|
@samp{\( @dots{} \)} construct.
|
|
|
|
After the end of a @samp{\( @dots{} \)} construct, the matcher remembers
|
|
the beginning and end of the text matched by that construct. Then,
|
|
later on in the regular expression, you can use @samp{\} followed by the
|
|
digit @var{d} to mean ``match the same text matched the @var{d}th time
|
|
by the @samp{\( @dots{} \)} construct.''
|
|
|
|
The strings matching the first nine @samp{\( @dots{} \)} constructs
|
|
appearing in a regular expression are assigned numbers 1 through 9 in
|
|
the order that the open-parentheses appear in the regular expression.
|
|
So you can use @samp{\1} through @samp{\9} to refer to the text matched
|
|
by the corresponding @samp{\( @dots{} \)} constructs.
|
|
|
|
For example, @samp{\(.*\)\1} matches any newline-free string that is
|
|
composed of two identical halves. The @samp{\(.*\)} matches the first
|
|
half, which may be anything, but the @samp{\1} that follows must match
|
|
the same exact text.
|
|
|
|
If a particular @samp{\( @dots{} \)} construct matches more than once
|
|
(which can easily happen if it is followed by @samp{*}), only the last
|
|
match is recorded.
|
|
|
|
@item \`
|
|
matches the empty string, but only at the beginning
|
|
of the buffer or string being matched against.
|
|
|
|
@item \'
|
|
matches the empty string, but only at the end of
|
|
the buffer or string being matched against.
|
|
|
|
@item \=
|
|
matches the empty string, but only at point.
|
|
|
|
@item \b
|
|
matches the empty string, but only at the beginning or
|
|
end of a word. Thus, @samp{\bfoo\b} matches any occurrence of
|
|
@samp{foo} as a separate word. @samp{\bballs?\b} matches
|
|
@samp{ball} or @samp{balls} as a separate word.@refill
|
|
|
|
@samp{\b} matches at the beginning or end of the buffer
|
|
regardless of what text appears next to it.
|
|
|
|
@item \B
|
|
matches the empty string, but @emph{not} at the beginning or
|
|
end of a word.
|
|
|
|
@item \<
|
|
matches the empty string, but only at the beginning of a word.
|
|
@samp{\<} matches at the beginning of the buffer only if a
|
|
word-constituent character follows.
|
|
|
|
@item \>
|
|
matches the empty string, but only at the end of a word. @samp{\>}
|
|
matches at the end of the buffer only if the contents end with a
|
|
word-constituent character.
|
|
|
|
@item \w
|
|
matches any word-constituent character. The syntax table
|
|
determines which characters these are. @xref{Syntax}.
|
|
|
|
@item \W
|
|
matches any character that is not a word-constituent.
|
|
|
|
@item \s@var{c}
|
|
matches any character whose syntax is @var{c}. Here @var{c} is a
|
|
character that represents a syntax code: thus, @samp{w} for word
|
|
constituent, @samp{-} for whitespace, @samp{(} for open parenthesis,
|
|
etc. Represent a character of whitespace (which can be a newline) by
|
|
either @samp{-} or a space character.
|
|
|
|
@item \S@var{c}
|
|
matches any character whose syntax is not @var{c}.
|
|
@end table
|
|
|
|
The constructs that pertain to words and syntax are controlled by the
|
|
setting of the syntax table (@pxref{Syntax}).
|
|
|
|
Here is a complicated regexp, used by Emacs to recognize the end of a
|
|
sentence together with any whitespace that follows. It is given in Lisp
|
|
syntax to enable you to distinguish the spaces from the tab characters. In
|
|
Lisp syntax, the string constant begins and ends with a double-quote.
|
|
@samp{\"} stands for a double-quote as part of the regexp, @samp{\\} for a
|
|
backslash as part of the regexp, @samp{\t} for a tab and @samp{\n} for a
|
|
newline.
|
|
|
|
@example
|
|
"[.?!][]\"')]*\\($\\|\t\\| \\)[ \t\n]*"
|
|
@end example
|
|
|
|
@noindent
|
|
This contains four parts in succession: a character set matching period,
|
|
@samp{?}, or @samp{!}; a character set matching close-brackets, quotes,
|
|
or parentheses, repeated any number of times; an alternative in
|
|
backslash-parentheses that matches end-of-line, a tab, or two spaces;
|
|
and a character set matching whitespace characters, repeated any number
|
|
of times.
|
|
|
|
To enter the same regexp interactively, you would type @key{TAB} to
|
|
enter a tab, and @kbd{C-j} to enter a newline. You would also type
|
|
single backslashes as themselves, instead of doubling them for Lisp syntax.
|
|
|
|
@node Search Case, Replace, Regexps, Search
|
|
@section Searching and Case
|
|
|
|
@vindex case-fold-search
|
|
Incremental searches in Emacs normally ignore the case of the text
|
|
they are searching through, if you specify the text in lower case.
|
|
Thus, if you specify searching for @samp{foo}, then @samp{Foo} and
|
|
@samp{foo} are also considered a match. Regexps, and in particular
|
|
character sets, are included: @samp{[ab]} would match @samp{a} or
|
|
@samp{A} or @samp{b} or @samp{B}.@refill
|
|
|
|
An upper-case letter anywhere in the incremental search string makes
|
|
the search case-sensitive. Thus, searching for @samp{Foo} does not find
|
|
@samp{foo} or @samp{FOO}. This applies to regular expression search as
|
|
well as to string search. The effect ceases if you delete the
|
|
upper-case letter from the search string.
|
|
|
|
If you set the variable @code{case-fold-search} to @code{nil}, then
|
|
all letters must match exactly, including case. This is a per-buffer
|
|
variable; altering the variable affects only the current buffer, but
|
|
there is a default value which you can change as well. @xref{Locals}.
|
|
This variable applies to nonincremental searches also, including those
|
|
performed by the replace commands (@pxref{Replace}) and the minibuffer
|
|
history matching commands (@pxref{Minibuffer History}).
|
|
|
|
@node Replace, Other Repeating Search, Search Case, Search
|
|
@section Replacement Commands
|
|
@cindex replacement
|
|
@cindex search-and-replace commands
|
|
@cindex string substitution
|
|
@cindex global substitution
|
|
|
|
Global search-and-replace operations are not needed as often in Emacs
|
|
as they are in other editors@footnote{In some editors,
|
|
search-and-replace operations are the only convenient way to make a
|
|
single change in the text.}, but they are available. In addition to the
|
|
simple @kbd{M-x replace-string} command which is like that found in most
|
|
editors, there is a @kbd{M-x query-replace} command which asks you, for
|
|
each occurrence of the pattern, whether to replace it.
|
|
|
|
The replace commands normally operate on the text from point to the
|
|
end of the buffer; however, in Transient Mark mode, when the mark is
|
|
active, they operate on the region. The replace commands all replace
|
|
one string (or regexp) with one replacement string. It is possible to
|
|
perform several replacements in parallel using the command
|
|
@code{expand-region-abbrevs} (@pxref{Expanding Abbrevs}).
|
|
|
|
@menu
|
|
* Unconditional Replace:: Replacing all matches for a string.
|
|
* Regexp Replace:: Replacing all matches for a regexp.
|
|
* Replacement and Case:: How replacements preserve case of letters.
|
|
* Query Replace:: How to use querying.
|
|
@end menu
|
|
|
|
@node Unconditional Replace, Regexp Replace, Replace, Replace
|
|
@subsection Unconditional Replacement
|
|
@findex replace-string
|
|
@findex replace-regexp
|
|
|
|
@table @kbd
|
|
@item M-x replace-string @key{RET} @var{string} @key{RET} @var{newstring} @key{RET}
|
|
Replace every occurrence of @var{string} with @var{newstring}.
|
|
@item M-x replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET}
|
|
Replace every match for @var{regexp} with @var{newstring}.
|
|
@end table
|
|
|
|
To replace every instance of @samp{foo} after point with @samp{bar},
|
|
use the command @kbd{M-x replace-string} with the two arguments
|
|
@samp{foo} and @samp{bar}. Replacement happens only in the text after
|
|
point, so if you want to cover the whole buffer you must go to the
|
|
beginning first. All occurrences up to the end of the buffer are
|
|
replaced; to limit replacement to part of the buffer, narrow to that
|
|
part of the buffer before doing the replacement (@pxref{Narrowing}).
|
|
In Transient Mark mode, when the region is active, replacement is
|
|
limited to the region (@pxref{Transient Mark}).
|
|
|
|
When @code{replace-string} exits, it leaves point at the last
|
|
occurrence replaced. It sets the mark to the prior position of point
|
|
(where the @code{replace-string} command was issued); use @kbd{C-u
|
|
C-@key{SPC}} to move back there.
|
|
|
|
A numeric argument restricts replacement to matches that are surrounded
|
|
by word boundaries. The argument's value doesn't matter.
|
|
|
|
@node Regexp Replace, Replacement and Case, Unconditional Replace, Replace
|
|
@subsection Regexp Replacement
|
|
|
|
The @kbd{M-x replace-string} command replaces exact matches for a
|
|
single string. The similar command @kbd{M-x replace-regexp} replaces
|
|
any match for a specified pattern.
|
|
|
|
In @code{replace-regexp}, the @var{newstring} need not be constant: it
|
|
can refer to all or part of what is matched by the @var{regexp}.
|
|
@samp{\&} in @var{newstring} stands for the entire match being replaced.
|
|
@samp{\@var{d}} in @var{newstring}, where @var{d} is a digit, stands for
|
|
whatever matched the @var{d}th parenthesized grouping in @var{regexp}.
|
|
To include a @samp{\} in the text to replace with, you must enter
|
|
@samp{\\}. For example,
|
|
|
|
@example
|
|
M-x replace-regexp @key{RET} c[ad]+r @key{RET} \&-safe @key{RET}
|
|
@end example
|
|
|
|
@noindent
|
|
replaces (for example) @samp{cadr} with @samp{cadr-safe} and @samp{cddr}
|
|
with @samp{cddr-safe}.
|
|
|
|
@example
|
|
M-x replace-regexp @key{RET} \(c[ad]+r\)-safe @key{RET} \1 @key{RET}
|
|
@end example
|
|
|
|
@noindent
|
|
performs the inverse transformation.
|
|
|
|
@node Replacement and Case, Query Replace, Regexp Replace, Replace
|
|
@subsection Replace Commands and Case
|
|
|
|
If the first argument of a replace command is all lower case, the
|
|
commands ignores case while searching for occurrences to
|
|
replace---provided @code{case-fold-search} is non-@code{nil}. If
|
|
@code{case-fold-search} is set to @code{nil}, case is always significant
|
|
in all searches.
|
|
|
|
@vindex case-replace
|
|
In addition, when the @var{newstring} argument is all or partly lower
|
|
case, replacement commands try to preserve the case pattern of each
|
|
occurrence. Thus, the command
|
|
|
|
@example
|
|
M-x replace-string @key{RET} foo @key{RET} bar @key{RET}
|
|
@end example
|
|
|
|
@noindent
|
|
replaces a lower case @samp{foo} with a lower case @samp{bar}, an
|
|
all-caps @samp{FOO} with @samp{BAR}, and a capitalized @samp{Foo} with
|
|
@samp{Bar}. (These three alternatives---lower case, all caps, and
|
|
capitalized, are the only ones that @code{replace-string} can
|
|
distinguish.)
|
|
|
|
If upper-case letters are used in the replacement string, they remain
|
|
upper case every time that text is inserted. If upper-case letters are
|
|
used in the first argument, the second argument is always substituted
|
|
exactly as given, with no case conversion. Likewise, if either
|
|
@code{case-replace} or @code{case-fold-search} is set to @code{nil},
|
|
replacement is done without case conversion.
|
|
|
|
@node Query Replace,, Replacement and Case, Replace
|
|
@subsection Query Replace
|
|
@cindex query replace
|
|
|
|
@table @kbd
|
|
@item M-% @var{string} @key{RET} @var{newstring} @key{RET}
|
|
@itemx M-x query-replace @key{RET} @var{string} @key{RET} @var{newstring} @key{RET}
|
|
Replace some occurrences of @var{string} with @var{newstring}.
|
|
@item C-M-% @var{regexp} @key{RET} @var{newstring} @key{RET}
|
|
@itemx M-x query-replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET}
|
|
Replace some matches for @var{regexp} with @var{newstring}.
|
|
@end table
|
|
|
|
@kindex M-%
|
|
@findex query-replace
|
|
If you want to change only some of the occurrences of @samp{foo} to
|
|
@samp{bar}, not all of them, then you cannot use an ordinary
|
|
@code{replace-string}. Instead, use @kbd{M-%} (@code{query-replace}).
|
|
This command finds occurrences of @samp{foo} one by one, displays each
|
|
occurrence and asks you whether to replace it. A numeric argument to
|
|
@code{query-replace} tells it to consider only occurrences that are
|
|
bounded by word-delimiter characters. This preserves case, just like
|
|
@code{replace-string}, provided @code{case-replace} is non-@code{nil},
|
|
as it normally is.
|
|
|
|
@kindex C-M-%
|
|
@findex query-replace-regexp
|
|
Aside from querying, @code{query-replace} works just like
|
|
@code{replace-string}, and @code{query-replace-regexp} works just like
|
|
@code{replace-regexp}. This command is run by @kbd{C-M-%}.
|
|
|
|
The things you can type when you are shown an occurrence of @var{string}
|
|
or a match for @var{regexp} are:
|
|
|
|
@ignore @c Not worth it.
|
|
@kindex SPC @r{(query-replace)}
|
|
@kindex DEL @r{(query-replace)}
|
|
@kindex , @r{(query-replace)}
|
|
@kindex RET @r{(query-replace)}
|
|
@kindex . @r{(query-replace)}
|
|
@kindex ! @r{(query-replace)}
|
|
@kindex ^ @r{(query-replace)}
|
|
@kindex C-r @r{(query-replace)}
|
|
@kindex C-w @r{(query-replace)}
|
|
@kindex C-l @r{(query-replace)}
|
|
@end ignore
|
|
|
|
@c WideCommands
|
|
@table @kbd
|
|
@item @key{SPC}
|
|
to replace the occurrence with @var{newstring}.
|
|
|
|
@item @key{DEL}
|
|
to skip to the next occurrence without replacing this one.
|
|
|
|
@item , @r{(Comma)}
|
|
to replace this occurrence and display the result. You are then asked
|
|
for another input character to say what to do next. Since the
|
|
replacement has already been made, @key{DEL} and @key{SPC} are
|
|
equivalent in this situation; both move to the next occurrence.
|
|
|
|
You can type @kbd{C-r} at this point (see below) to alter the replaced
|
|
text. You can also type @kbd{C-x u} to undo the replacement; this exits
|
|
the @code{query-replace}, so if you want to do further replacement you
|
|
must use @kbd{C-x @key{ESC} @key{ESC} @key{RET}} to restart
|
|
(@pxref{Repetition}).
|
|
|
|
@item @key{RET}
|
|
to exit without doing any more replacements.
|
|
|
|
@item .@: @r{(Period)}
|
|
to replace this occurrence and then exit without searching for more
|
|
occurrences.
|
|
|
|
@item !
|
|
to replace all remaining occurrences without asking again.
|
|
|
|
@item ^
|
|
to go back to the position of the previous occurrence (or what used to
|
|
be an occurrence), in case you changed it by mistake. This works by
|
|
popping the mark ring. Only one @kbd{^} in a row is meaningful, because
|
|
only one previous replacement position is kept during @code{query-replace}.
|
|
|
|
@item C-r
|
|
to enter a recursive editing level, in case the occurrence needs to be
|
|
edited rather than just replaced with @var{newstring}. When you are
|
|
done, exit the recursive editing level with @kbd{C-M-c} to proceed to
|
|
the next occurrence. @xref{Recursive Edit}.
|
|
|
|
@item C-w
|
|
to delete the occurrence, and then enter a recursive editing level as in
|
|
@kbd{C-r}. Use the recursive edit to insert text to replace the deleted
|
|
occurrence of @var{string}. When done, exit the recursive editing level
|
|
with @kbd{C-M-c} to proceed to the next occurrence.
|
|
|
|
@item C-l
|
|
to redisplay the screen. Then you must type another character to
|
|
specify what to do with this occurrence.
|
|
|
|
@item C-h
|
|
to display a message summarizing these options. Then you must type
|
|
another character to specify what to do with this occurrence.
|
|
@end table
|
|
|
|
Some other characters are aliases for the ones listed above: @kbd{y},
|
|
@kbd{n} and @kbd{q} are equivalent to @key{SPC}, @key{DEL} and
|
|
@key{RET}.
|
|
|
|
Aside from this, any other character exits the @code{query-replace},
|
|
and is then reread as part of a key sequence. Thus, if you type
|
|
@kbd{C-k}, it exits the @code{query-replace} and then kills to end of
|
|
line.
|
|
|
|
To restart a @code{query-replace} once it is exited, use @kbd{C-x
|
|
@key{ESC} @key{ESC}}, which repeats the @code{query-replace} because it
|
|
used the minibuffer to read its arguments. @xref{Repetition, C-x ESC
|
|
ESC}.
|
|
|
|
See also @ref{Transforming File Names}, for Dired commands to rename,
|
|
copy, or link files by replacing regexp matches in file names.
|
|
|
|
@node Other Repeating Search,, Replace, Search
|
|
@section Other Search-and-Loop Commands
|
|
|
|
Here are some other commands that find matches for a regular
|
|
expression. They all operate from point to the end of the buffer, and
|
|
all ignore case in matching, if the pattern contains no upper-case
|
|
letters and @code{case-fold-search} is non-@code{nil}.
|
|
|
|
@findex list-matching-lines
|
|
@findex occur
|
|
@findex count-matches
|
|
@findex delete-non-matching-lines
|
|
@findex delete-matching-lines
|
|
@findex flush-lines
|
|
@findex keep-lines
|
|
|
|
@table @kbd
|
|
@item M-x occur @key{RET} @var{regexp} @key{RET}
|
|
Display a list showing each line in the buffer that contains a match for
|
|
@var{regexp}. A numeric argument specifies the number of context lines
|
|
to print before and after each matching line; the default is none.
|
|
To limit the search to part of the buffer, narrow to that part
|
|
(@pxref{Narrowing}).
|
|
|
|
@kindex RET @r{(Occur mode)}
|
|
The buffer @samp{*Occur*} containing the output serves as a menu for
|
|
finding the occurrences in their original context. Click @kbd{Mouse-2}
|
|
on an occurrence listed in @samp{*Occur*}, or position point there and
|
|
type @key{RET}; this switches to the buffer that was searched and
|
|
moves point to the original of the chosen occurrence.
|
|
|
|
@item M-x list-matching-lines
|
|
Synonym for @kbd{M-x occur}.
|
|
|
|
@item M-x count-matches @key{RET} @var{regexp} @key{RET}
|
|
Print the number of matches for @var{regexp} after point.
|
|
|
|
@item M-x flush-lines @key{RET} @var{regexp} @key{RET}
|
|
Delete each line that follows point and contains a match for
|
|
@var{regexp}.
|
|
|
|
@item M-x keep-lines @key{RET} @var{regexp} @key{RET}
|
|
Delete each line that follows point and @emph{does not} contain a match
|
|
for @var{regexp}.
|
|
@end table
|
|
|
|
Searching and replacing can be performed under the control of tags
|
|
files (@pxref{Tags Search}) and Dired (@pxref{Operating on Files}).
|
|
|
|
In addition, you can use @code{grep} from Emacs to search a collection
|
|
of files for matches for a regular expression, then visit the matches
|
|
either sequentially or in arbitrary order. @xref{Grep Searching}.
|