mirror of https://git.savannah.gnu.org/git/emacs.git (synced 2025-01-16 17:19:41 +00:00)
Non-ASCII in regexp ranges.
parent 40ad3db491
commit 6cc089d2ad
@@ -311,10 +311,17 @@ matches both @samp{]} and @samp{-}.
 To include @samp{^} in a character alternative, put it anywhere but at
 the beginning.
 
-The beginning and end of a range must be in the same character set
-(@pxref{Character Sets}). Thus, @samp{[a-\x8e0]} is invalid because
-@samp{a} is in the @sc{ascii} character set but the character 0x8e0
-(@samp{a} with grave accent) is in the Emacs character set for Latin-1.
+The beginning and end of a range of multibyte characters must be in the
+same character set (@pxref{Character Sets}). Thus, @samp{[\x8e0-\x97c]}
+is invalid because character 0x8e0 (@samp{a} with grave accent) is in
+the Emacs character set for Latin-1 but the character 0x97c (@samp{u}
+with diaeresis) is in the Emacs character set for Latin-2.
+
+If a range starts with a unibyte character @var{c} and ends with a
+multibyte character @var{c2}, the range is divided into two parts: one
+is @samp{@var{c}..?\377}, the other is @samp{@var{c1}..@var{c2}}, where
+@var{c1} is the first character of the charset to which @var{c2}
+belongs.
 
 You cannot always match all non-@sc{ascii} characters with the regular
 expression @samp{[\200-\377]}. This works when searching a unibyte
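The range-splitting rule in the added paragraph (a range from a unibyte character c to a multibyte character c2 becomes c..?\377 plus c1..c2) can be sketched outside Emacs as follows. This is only an illustration, not Emacs code: the `CHARSETS` table, its code-point bounds, and the function names are hypothetical, chosen so that 0x8e0 falls in the Latin-1 charset and 0x97c in the Latin-2 charset as the text states.

```python
# Illustrative sketch only; not how Emacs implements regexp ranges.
# Hypothetical charset table: (name, first code, last code). The bounds are
# assumptions, picked so that 0x8e0 is in "latin-1" and 0x97c in "latin-2".
CHARSETS = [
    ("latin-1", 0x8A0, 0x8FF),
    ("latin-2", 0x920, 0x97F),
]

UNIBYTE_MAX = 0o377  # the character written ?\377 in the text


def charset_of(ch):
    """Return (name, first_code) of the assumed charset containing ch."""
    for name, first, last in CHARSETS:
        if first <= ch <= last:
            return name, first
    raise ValueError(f"no charset in the sample table contains {ch:#x}")


def split_range(c, c2):
    """Split a range from unibyte c to multibyte c2 into its two parts."""
    if c > UNIBYTE_MAX:
        raise ValueError("c must be a unibyte character")
    if c2 <= UNIBYTE_MAX:
        raise ValueError("c2 must be a multibyte character")
    _, c1 = charset_of(c2)  # c1: first character of c2's charset
    return [(c, UNIBYTE_MAX), (c1, c2)]


def same_charset(a, b):
    """True if two multibyte characters belong to the same assumed charset."""
    return charset_of(a)[0] == charset_of(b)[0]


if __name__ == "__main__":
    # The range a-\x8e0 is divided into a..?\377 and c1..\x8e0.
    for lo, hi in split_range(ord("a"), 0x8E0):
        print(f"{lo:#x}..{hi:#x}")
    # \x8e0-\x97c spans two charsets, which the text says is invalid.
    print(same_charset(0x8E0, 0x97C))
```

Under these assumed bounds, `same_charset(0x8E0, 0x97C)` is false, which corresponds to the text's statement that `[\x8e0-\x97c]` is an invalid range.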