mirror of
https://git.savannah.gnu.org/git/emacs.git
synced 2024-11-24 07:20:37 +00:00
Update the bidirectional reordering engine for Unicode 6.3 and 7.0.
src/bidi.c (bidi_ignore_explicit_marks_for_paragraph_level): Remove variable.
(bidi_get_type): Return the isolate initiators and terminator types.
(bidi_isolate_fmt_char, bidi_paired_bracket_type)
(bidi_fetch_char_skip_isolates, find_first_strong_char)
(bidi_find_bracket_pairs, bidi_resolve_brackets): New functions.
(bidi_set_sos_type): Renamed from bidi_set_sor_type and updated for the new features.
(bidi_push_embedding_level, bidi_pop_embedding_level): Update to push and pop correctly for isolates.
(bidi_remember_char): Modified to accept an additional argument and record the bidi type according to its value.
(bidi_cache_iterator_state): Accept an additional argument to only update an existing state. Handle the new members of struct bidi_it.
(bidi_cache_find): Arguments changed: no longer accepts a level, instead accepts a flag telling it whether it is okay to return unresolved neutrals.
(bidi_initialize): Initiate and staticpro the bracket-type uniprop table. Initialize new isolate-related members.
(bidi_paragraph_init): Some code factored out into find_first_strong_char.
(bidi_resolve_explicit_1): Function deleted, its code incorporated into bidi_resolve_explicit.
(bidi_resolve_explicit): Support the isolate initiators and terminator. Fix handling of embeddings and overrides according to new UBA requirements. Record information about previously seen characters here (moved from bidi_level_of_next_char).
(bidi_resolve_weak): Adapt to changes in struct members.
(FLAG_EMBEDDING_INSIDE, FLAG_OPPOSITE_INSIDE, MAX_BPA_STACK)
(STORE_BRACKET_CHARPOS, PUSH_BPA_STACK): New macros.
(bidi_resolve_neutral): Call bidi_resolve_brackets to handle the paired bracket resolution. Handle isolate initiators and terminator.
(bidi_type_of_next_char): Remove unneeded code for BN limit.
(bidi_level_of_next_char): Move the code that records information about previous characters to bidi_resolve_explicit. Fix logic of resolving neutrals and make sure their cache entries are updated. Remove now unneeded special handling of PDF level.

src/dispextern.h (struct glyph): Enlarge the width of resolved_level.
(BIDI_MAXDEPTH): New macro, renamed from BIDI_MAXLEVEL and enlarged per Unicode 6.3.
(enum bidi_bracket_type_t): New data type.
(struct bidi_saved_info): Leave only 2 type members out of 4. Remove bytepos.
(struct bidi_stack): Add members necessary to support isolating sequences.
(struct bidi_it): Add new members necessary to support isolating sequences and bracket pair resolution.

src/xdisp.c (Fbidi_resolved_levels): New function.
(syms_of_xdisp): Defsubr it.
(append_glyph, append_composite_glyph, produce_image_glyph)
(append_stretch_glyph, append_glyphless_glyph): Convert aborts to assertions.
(syms_of_xdisp) <inhibit-bidi-mirroring>: New variable.

src/term.c (append_glyph, append_composite_glyph)
(append_glyphless_glyph): Convert aborts to assertions.

src/.gdbinit (pgx): Display the character codepoint, resolved level, and bidi type also for glyphless glyphs.

lisp/simple.el (what-cursor-position): Update to support the new bidi characters.

lisp/descr-text.el (describe-char): Update to support the new bidi characters.

admin/unidata/unidata-gen.el (unidata-prop-alist): New properties 'paired-bracket' and 'bracket-type', in support of the UBA 6.3.
(unidata-gen-table): Support PROP-IDX being a function.
(unidata-describe-bidi-bracket-type, unidata-gen-brackets-list)
(unidata-gen-bracket-type-list): New functions.
(unidata-check): Support checking the 'bracket-type' attribute.
(unidata-gen-files): Don't create backups for uni-*.el files.

admin/unidata/Makefile.in (${unidir}/charprop.el): Depend on BidiMirroring.txt and BidiBrackets.txt.

admin/unidata/BidiBrackets.txt: New file, from Unicode.

etc/NEWS: Mention the UBA implementation update.

etc/HELLO: Remove now unneeded directional control characters.

doc/lispref/nonascii.texi (Character Properties): Document the new properties 'bracket-type' and 'paired-bracket'.

doc/lispref/display.texi (Bidirectional Display): Update the version of the UBA to which we are conforming.

test/BidiCharacterTest.txt: New file, from Unicode.

test/biditest.el: New file.
commit
ed7ebd933a
1
.gitignore
vendored
@@ -21,3 +21,4 @@ etc/refcards/*.aux
 etc/refcards/*.log
 info/dir
 info/*.info
+test/biditest.txt
admin/ChangeLog
@@ -1,3 +1,18 @@
+2014-10-15  Eli Zaretskii  <eliz@gnu.org>
+
+        * unidata/unidata-gen.el (unidata-prop-alist): New properties
+        'paired-bracket' and 'bracket-type', in support of the UBA 6.3.
+        (unidata-gen-table): Support PROP-IDX being a function.
+        (unidata-describe-bidi-bracket-type, unidata-gen-brackets-list)
+        (unidata-gen-bracket-type-list): New functions.
+        (unidata-check): Support checking the 'bracket-type' attribute.
+        (unidata-gen-files): Don't create backups for uni-*.el files.
+
+        * unidata/Makefile.in (${unidir}/charprop.el): Depend on
+        BidiMirroring.txt and BidiBrackets.txt.
+
+        * unidata/BidiBrackets.txt: New file, from Unicode.
+
 2014-10-13  Glenn Morris  <rgm@gnu.org>

         * authors.el (authors-aliases, authors-fixed-case)
176
admin/unidata/BidiBrackets.txt
Normal file
@@ -0,0 +1,176 @@
+# BidiBrackets-7.0.0.txt
+# Date: 2014-01-21, 02:30:00 GMT [AG, LI, KW]
+#
+# Bidi_Paired_Bracket and Bidi_Paired_Bracket_Type Properties
+#
+# This file is a normative contributory data file in the Unicode
+# Character Database.
+#
+# Copyright (c) 1991-2014 Unicode, Inc.
+# For terms of use, see http://www.unicode.org/terms_of_use.html
+#
+# Bidi_Paired_Bracket is a normative property of type Miscellaneous,
+# which establishes a mapping between characters that are treated as
+# bracket pairs by the Unicode Bidirectional Algorithm.
+#
+# Bidi_Paired_Bracket_Type is a normative property of type Enumeration,
+# which classifies characters into opening and closing paired brackets
+# for the purposes of the Unicode Bidirectional Algorithm.
+#
+# This file lists the set of code points with Bidi_Paired_Bracket_Type
+# property values Open and Close. The set is derived from the character
+# properties General_Category (gc), Bidi_Class (bc), Bidi_Mirrored (Bidi_M),
+# and Bidi_Mirroring_Glyph (bmg), as follows: two characters, A and B,
+# form a bracket pair if A has gc=Ps and B has gc=Pe, both have bc=ON and
+# Bidi_M=Y, and bmg of A is B. Bidi_Paired_Bracket (bpb) maps A to B and
+# vice versa, and their Bidi_Paired_Bracket_Type (bpt) property values are
+# Open (o) and Close (c), respectively.
+#
+# For legacy reasons, the characters U+FD3E ORNATE LEFT PARENTHESIS and
+# U+FD3F ORNATE RIGHT PARENTHESIS do not mirror in bidirectional display
+# and therefore do not form a bracket pair.
+#
+# The Unicode property value stability policy guarantees that characters
+# which have bpt=o or bpt=c also have bc=ON and Bidi_M=Y. As a result, an
+# implementation can optimize the lookup of the Bidi_Paired_Bracket_Type
+# property values Open and Close by restricting the processing to characters
+# with bc=ON.
+#
+# The format of the file is three fields separated by a semicolon.
+# Field 0: Unicode code point value, represented as a hexadecimal value
+# Field 1: Bidi_Paired_Bracket property value, a code point value or <none>
+# Field 2: Bidi_Paired_Bracket_Type property value, one of the following:
+#     o Open
+#     c Close
+#     n None
+# The names of the characters in field 0 are given in comments at the end
+# of each line.
+#
+# For information on bidirectional paired brackets, see UAX #9: Unicode
+# Bidirectional Algorithm, at http://www.unicode.org/unicode/reports/tr9/
+#
+# This file was originally created by Andrew Glass and Laurentiu Iancu
+# for Unicode 6.3.
+
+0028; 0029; o # LEFT PARENTHESIS
+0029; 0028; c # RIGHT PARENTHESIS
+005B; 005D; o # LEFT SQUARE BRACKET
+005D; 005B; c # RIGHT SQUARE BRACKET
+007B; 007D; o # LEFT CURLY BRACKET
+007D; 007B; c # RIGHT CURLY BRACKET
+0F3A; 0F3B; o # TIBETAN MARK GUG RTAGS GYON
+0F3B; 0F3A; c # TIBETAN MARK GUG RTAGS GYAS
+0F3C; 0F3D; o # TIBETAN MARK ANG KHANG GYON
+0F3D; 0F3C; c # TIBETAN MARK ANG KHANG GYAS
+169B; 169C; o # OGHAM FEATHER MARK
+169C; 169B; c # OGHAM REVERSED FEATHER MARK
+2045; 2046; o # LEFT SQUARE BRACKET WITH QUILL
+2046; 2045; c # RIGHT SQUARE BRACKET WITH QUILL
+207D; 207E; o # SUPERSCRIPT LEFT PARENTHESIS
+207E; 207D; c # SUPERSCRIPT RIGHT PARENTHESIS
+208D; 208E; o # SUBSCRIPT LEFT PARENTHESIS
+208E; 208D; c # SUBSCRIPT RIGHT PARENTHESIS
+2308; 2309; o # LEFT CEILING
+2309; 2308; c # RIGHT CEILING
+230A; 230B; o # LEFT FLOOR
+230B; 230A; c # RIGHT FLOOR
+2329; 232A; o # LEFT-POINTING ANGLE BRACKET
+232A; 2329; c # RIGHT-POINTING ANGLE BRACKET
+2768; 2769; o # MEDIUM LEFT PARENTHESIS ORNAMENT
+2769; 2768; c # MEDIUM RIGHT PARENTHESIS ORNAMENT
+276A; 276B; o # MEDIUM FLATTENED LEFT PARENTHESIS ORNAMENT
+276B; 276A; c # MEDIUM FLATTENED RIGHT PARENTHESIS ORNAMENT
+276C; 276D; o # MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT
+276D; 276C; c # MEDIUM RIGHT-POINTING ANGLE BRACKET ORNAMENT
+276E; 276F; o # HEAVY LEFT-POINTING ANGLE QUOTATION MARK ORNAMENT
+276F; 276E; c # HEAVY RIGHT-POINTING ANGLE QUOTATION MARK ORNAMENT
+2770; 2771; o # HEAVY LEFT-POINTING ANGLE BRACKET ORNAMENT
+2771; 2770; c # HEAVY RIGHT-POINTING ANGLE BRACKET ORNAMENT
+2772; 2773; o # LIGHT LEFT TORTOISE SHELL BRACKET ORNAMENT
+2773; 2772; c # LIGHT RIGHT TORTOISE SHELL BRACKET ORNAMENT
+2774; 2775; o # MEDIUM LEFT CURLY BRACKET ORNAMENT
+2775; 2774; c # MEDIUM RIGHT CURLY BRACKET ORNAMENT
+27C5; 27C6; o # LEFT S-SHAPED BAG DELIMITER
+27C6; 27C5; c # RIGHT S-SHAPED BAG DELIMITER
+27E6; 27E7; o # MATHEMATICAL LEFT WHITE SQUARE BRACKET
+27E7; 27E6; c # MATHEMATICAL RIGHT WHITE SQUARE BRACKET
+27E8; 27E9; o # MATHEMATICAL LEFT ANGLE BRACKET
+27E9; 27E8; c # MATHEMATICAL RIGHT ANGLE BRACKET
+27EA; 27EB; o # MATHEMATICAL LEFT DOUBLE ANGLE BRACKET
+27EB; 27EA; c # MATHEMATICAL RIGHT DOUBLE ANGLE BRACKET
+27EC; 27ED; o # MATHEMATICAL LEFT WHITE TORTOISE SHELL BRACKET
+27ED; 27EC; c # MATHEMATICAL RIGHT WHITE TORTOISE SHELL BRACKET
+27EE; 27EF; o # MATHEMATICAL LEFT FLATTENED PARENTHESIS
+27EF; 27EE; c # MATHEMATICAL RIGHT FLATTENED PARENTHESIS
+2983; 2984; o # LEFT WHITE CURLY BRACKET
+2984; 2983; c # RIGHT WHITE CURLY BRACKET
+2985; 2986; o # LEFT WHITE PARENTHESIS
+2986; 2985; c # RIGHT WHITE PARENTHESIS
+2987; 2988; o # Z NOTATION LEFT IMAGE BRACKET
+2988; 2987; c # Z NOTATION RIGHT IMAGE BRACKET
+2989; 298A; o # Z NOTATION LEFT BINDING BRACKET
+298A; 2989; c # Z NOTATION RIGHT BINDING BRACKET
+298B; 298C; o # LEFT SQUARE BRACKET WITH UNDERBAR
+298C; 298B; c # RIGHT SQUARE BRACKET WITH UNDERBAR
+298D; 2990; o # LEFT SQUARE BRACKET WITH TICK IN TOP CORNER
+298E; 298F; c # RIGHT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
+298F; 298E; o # LEFT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
+2990; 298D; c # RIGHT SQUARE BRACKET WITH TICK IN TOP CORNER
+2991; 2992; o # LEFT ANGLE BRACKET WITH DOT
+2992; 2991; c # RIGHT ANGLE BRACKET WITH DOT
+2993; 2994; o # LEFT ARC LESS-THAN BRACKET
+2994; 2993; c # RIGHT ARC GREATER-THAN BRACKET
+2995; 2996; o # DOUBLE LEFT ARC GREATER-THAN BRACKET
+2996; 2995; c # DOUBLE RIGHT ARC LESS-THAN BRACKET
+2997; 2998; o # LEFT BLACK TORTOISE SHELL BRACKET
+2998; 2997; c # RIGHT BLACK TORTOISE SHELL BRACKET
+29D8; 29D9; o # LEFT WIGGLY FENCE
+29D9; 29D8; c # RIGHT WIGGLY FENCE
+29DA; 29DB; o # LEFT DOUBLE WIGGLY FENCE
+29DB; 29DA; c # RIGHT DOUBLE WIGGLY FENCE
+29FC; 29FD; o # LEFT-POINTING CURVED ANGLE BRACKET
+29FD; 29FC; c # RIGHT-POINTING CURVED ANGLE BRACKET
+2E22; 2E23; o # TOP LEFT HALF BRACKET
+2E23; 2E22; c # TOP RIGHT HALF BRACKET
+2E24; 2E25; o # BOTTOM LEFT HALF BRACKET
+2E25; 2E24; c # BOTTOM RIGHT HALF BRACKET
+2E26; 2E27; o # LEFT SIDEWAYS U BRACKET
+2E27; 2E26; c # RIGHT SIDEWAYS U BRACKET
+2E28; 2E29; o # LEFT DOUBLE PARENTHESIS
+2E29; 2E28; c # RIGHT DOUBLE PARENTHESIS
+3008; 3009; o # LEFT ANGLE BRACKET
+3009; 3008; c # RIGHT ANGLE BRACKET
+300A; 300B; o # LEFT DOUBLE ANGLE BRACKET
+300B; 300A; c # RIGHT DOUBLE ANGLE BRACKET
+300C; 300D; o # LEFT CORNER BRACKET
+300D; 300C; c # RIGHT CORNER BRACKET
+300E; 300F; o # LEFT WHITE CORNER BRACKET
+300F; 300E; c # RIGHT WHITE CORNER BRACKET
+3010; 3011; o # LEFT BLACK LENTICULAR BRACKET
+3011; 3010; c # RIGHT BLACK LENTICULAR BRACKET
+3014; 3015; o # LEFT TORTOISE SHELL BRACKET
+3015; 3014; c # RIGHT TORTOISE SHELL BRACKET
+3016; 3017; o # LEFT WHITE LENTICULAR BRACKET
+3017; 3016; c # RIGHT WHITE LENTICULAR BRACKET
+3018; 3019; o # LEFT WHITE TORTOISE SHELL BRACKET
+3019; 3018; c # RIGHT WHITE TORTOISE SHELL BRACKET
+301A; 301B; o # LEFT WHITE SQUARE BRACKET
+301B; 301A; c # RIGHT WHITE SQUARE BRACKET
+FE59; FE5A; o # SMALL LEFT PARENTHESIS
+FE5A; FE59; c # SMALL RIGHT PARENTHESIS
+FE5B; FE5C; o # SMALL LEFT CURLY BRACKET
+FE5C; FE5B; c # SMALL RIGHT CURLY BRACKET
+FE5D; FE5E; o # SMALL LEFT TORTOISE SHELL BRACKET
+FE5E; FE5D; c # SMALL RIGHT TORTOISE SHELL BRACKET
+FF08; FF09; o # FULLWIDTH LEFT PARENTHESIS
+FF09; FF08; c # FULLWIDTH RIGHT PARENTHESIS
+FF3B; FF3D; o # FULLWIDTH LEFT SQUARE BRACKET
+FF3D; FF3B; c # FULLWIDTH RIGHT SQUARE BRACKET
+FF5B; FF5D; o # FULLWIDTH LEFT CURLY BRACKET
+FF5D; FF5B; c # FULLWIDTH RIGHT CURLY BRACKET
+FF5F; FF60; o # FULLWIDTH LEFT WHITE PARENTHESIS
+FF60; FF5F; c # FULLWIDTH RIGHT WHITE PARENTHESIS
+FF62; FF63; o # HALFWIDTH LEFT CORNER BRACKET
+FF63; FF62; c # HALFWIDTH RIGHT CORNER BRACKET
+
+# EOF
|
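The three-field format described in the BidiBrackets.txt header (code point; paired bracket; type) can be read with a few lines of C. This is only a minimal sketch and not code from this commit, whose own reader is the Lisp function unidata-gen-brackets-list. The names bracket_entry, bracket_kind, and parse_bracket_line are hypothetical, and the sketch assumes that only 'o' and 'c' lines matter, as in the data above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical types for illustration; not taken from the Emacs sources.  */
enum bracket_kind { BRACKET_KIND_OPEN, BRACKET_KIND_CLOSE };

struct bracket_entry
{
  unsigned codepoint;        /* field 0: the character itself */
  unsigned paired;           /* field 1: its Bidi_Paired_Bracket value */
  enum bracket_kind kind;    /* field 2: 'o' or 'c' */
};

/* Parse one line of BidiBrackets.txt.  Return 1 and fill *OUT for a
   data line whose type field is 'o' or 'c'; return 0 for comment
   lines, blank lines, and anything else that does not match.  */
static int
parse_bracket_line (const char *line, struct bracket_entry *out)
{
  unsigned cp, paired;
  char type;

  assert (line != NULL && out != NULL);

  /* "%x" skips leading whitespace and reads a hexadecimal number; a
     line starting with '#' fails to match, so sscanf returns 0.  */
  if (sscanf (line, "%x; %x; %c", &cp, &paired, &type) != 3)
    return 0;
  if (type != 'o' && type != 'c')
    return 0;

  out->codepoint = cp;
  out->paired = paired;
  out->kind = type == 'o' ? BRACKET_KIND_OPEN : BRACKET_KIND_CLOSE;
  return 1;
}
```

A caller would feed each line of the file to parse_bracket_line and keep only the entries for which it returns 1; the trailing "# NAME" comment on each data line is ignored because parsing stops after the type field.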
admin/unidata/Makefile.in
@@ -54,7 +54,9 @@ FORCE =
 FORCE:
 .PHONY: FORCE

-${unidir}/charprop.el: ${FORCE} ${srcdir}/unidata-gen.el ${srcdir}/UnicodeData.txt | \
+${unidir}/charprop.el: ${FORCE} ${srcdir}/unidata-gen.el \
+  ${srcdir}/UnicodeData.txt ${srcdir}/BidiMirroring.txt \
+  ${srcdir}/BidiBrackets.txt | \
   ${srcdir}/unidata-gen.elc unidata.txt
 	-if [ -f "$@" ]; then \
 	  cd ${unidir} && chmod +w charprop.el `sed -n 's/^;; FILE: //p' < charprop.el`; \
admin/unidata/unidata-gen.el
@@ -154,7 +154,8 @@
 ;; PROP: character property
 ;; INDEX: index to each element of unidata-list for PROP.
 ;;   It may be a function that generates an alist of character codes
-;;   vs. the corresponding property values.
+;;   vs. the corresponding property values.  Currently, only character
+;;   codepoints or symbol values are supported in this case.
 ;; GENERATOR: function to generate a char-table
 ;; FILENAME: filename to store the char-table
 ;; DOCSTRING: docstring for the property
@@ -273,7 +274,23 @@ is the character itself."
      "Unicode bidi-mirroring characters.
 Property value is a character that has the corresponding mirroring image or nil.
 The value nil means that the actual property value of a character
-is the character itself.")))
+is the character itself.")
+    (paired-bracket
+     unidata-gen-brackets-list unidata-gen-table-character "uni-brackets.el"
+     "Unicode bidi paired-bracket characters.
+Property value is the paired bracket character, or nil.
+The value nil means that the character is neither an opening nor
+a closing paired bracket."
+     string)
+    (bracket-type
+     unidata-gen-bracket-type-list unidata-gen-table-symbol "uni-brackets.el"
+     "Unicode bidi paired-bracket type.
+Property value is a symbol `o' (Open), `c' (Close), or `n' (None)."
+     unidata-describe-bidi-bracket-type
+     n
+     ;; The order of elements must be in sync with bidi_bracket_type_t
+     ;; in src/dispextern.h.
+     (n o c))))

 ;; Functions to access the above data.
 (defsubst unidata-prop-index (prop) (nth 1 (assq prop unidata-prop-alist)))
@@ -451,7 +468,10 @@ is the character itself.")))
             (unidata-encode-val val-list (nth 2 elm)))
         (set-char-table-range table (cons (car elm) (nth 1 elm)) (nth 2 elm)))

-      (setq tail unidata-list)
+      (if (functionp prop-idx)
+          (setq tail (funcall prop-idx)
+                prop-idx 1)
+        (setq tail unidata-list))
       (while tail
         (setq elt (car tail) tail (cdr tail))
         (setq range (car elt)
@@ -1157,6 +1177,12 @@ is the character itself.")))
                    (string ?'))))
              val " "))

+(defun unidata-describe-bidi-bracket-type (val)
+  (cdr (assq val
+             '((n . "Not a paired bracket character.")
+               (o . "Opening paired bracket character.")
+               (c . "Closing paired bracket character.")))))
+
 (defun unidata-gen-mirroring-list ()
   (let ((head (list nil))
         tail)
@@ -1170,6 +1196,36 @@ is the character itself.")))
           (setq tail (setcdr tail (list (list char mirror)))))))
     (cdr head)))

+(defun unidata-gen-brackets-list ()
+  (let ((head (list nil))
+        tail)
+    (with-temp-buffer
+      (insert-file-contents (expand-file-name "BidiBrackets.txt" unidata-dir))
+      (goto-char (point-min))
+      (setq tail head)
+      (while (re-search-forward
+              "^\\([0-9A-F]+\\);\\s +\\([0-9A-F]+\\);\\s +\\([oc]\\)"
+              nil t)
+        (let ((char (string-to-number (match-string 1) 16))
+              (paired (match-string 2)))
+          (setq tail (setcdr tail (list (list char paired)))))))
+    (cdr head)))
+
+(defun unidata-gen-bracket-type-list ()
+  (let ((head (list nil))
+        tail)
+    (with-temp-buffer
+      (insert-file-contents (expand-file-name "BidiBrackets.txt" unidata-dir))
+      (goto-char (point-min))
+      (setq tail head)
+      (while (re-search-forward
+              "^\\([0-9A-F]+\\);\\s +\\([0-9A-F]+\\);\\s +\\([oc]\\)"
+              nil t)
+        (let ((char (string-to-number (match-string 1) 16))
+              (type (match-string 3)))
+          (setq tail (setcdr tail (list (list char type)))))))
+    (cdr head)))
+
 ;; Verify if we can retrieve correct values from the generated
 ;; char-tables.
 ;;
@@ -1218,7 +1274,9 @@ is the character itself.")))
                     ((eq generator 'unidata-gen-table-decomposition)
                      (setq val1 (unidata-split-decomposition val1))))
               (cond ((eq prop 'decomposition)
-                     (setq val1 (list char)))))
+                     (setq val1 (list char)))
+                    ((eq prop 'bracket-type)
+                     (setq val1 'n))))
             (when (>= char check)
               (message "%S %04X" prop check)
               (setq check (+ check #x400)))
@@ -1261,6 +1319,9 @@ is the character itself.")))
              (describer (unidata-prop-describer prop))
              (default-value (unidata-prop-default prop))
              (val-list (unidata-prop-val-list prop))
+             ;; Avoid creating backup files for those uni-*.el files
+             ;; that hold more than one table.
+             (backup-inhibited t)
              table)
         ;; Filename in this comment line is extracted by sed in
         ;; Makefile.
doc/lispref/ChangeLog
@@ -1,3 +1,11 @@
+2014-10-15  Eli Zaretskii  <eliz@gnu.org>
+
+        * nonascii.texi (Character Properties): Document the new
+        properties 'bracket-type' and 'paired-bracket'.
+
+        * display.texi (Bidirectional Display): Update the version of the
+        UBA to which we are conforming.
+
 2014-10-13  Glenn Morris  <rgm@gnu.org>

         * Makefile.in (dist): Update for new output variables.
doc/lispref/display.texi
@@ -6613,10 +6613,9 @@ positions do not increase monotonically with string or buffer
 position.  In performing this @dfn{bidirectional reordering}, Emacs
 follows the Unicode Bidirectional Algorithm (a.k.a.@: @acronym{UBA}),
 which is described in Annex #9 of the Unicode standard
-(@url{http://www.unicode.org/reports/tr9/}).  Emacs currently provides
-a ``Non-isolate Bidirectionality'' class implementation of the
-@acronym{UBA}: it does not yet support the isolate directional
-formatting characters introduced with Unicode Standard v6.3.0.
+(@url{http://www.unicode.org/reports/tr9/}).  Emacs provides a ``Full
+Bidirectionality'' class implementation of the @acronym{UBA},
+consistent with the requirements of the Unicode Standard v7.0.

 @defvar bidi-display-reordering
 If the value of this buffer-local variable is non-@code{nil} (the
doc/lispref/nonascii.texi
@@ -520,6 +520,24 @@ property to display mirror images of characters when appropriate
 (@pxref{Bidirectional Display}).  For unassigned codepoints, the value
 is @code{nil}.

+@item paired-bracket
+Corresponds to the Unicode @code{Bidi_Paired_Bracket} property.  The
+value of this property is the codepoint of a character's @dfn{paired
+bracket}, or @code{nil} if the character is not a bracket character.
+This establishes a mapping between characters that are treated as
+bracket pairs by the Unicode Bidirectional Algorithm; Emacs uses this
+property when it decides how to reorder for display parentheses,
+braces, and other similar characters (@pxref{Bidirectional Display}).
+
+@item bracket-type
+Corresponds to the Unicode @code{Bidi_Paired_Bracket_Type} property.
+For characters whose @code{paired-bracket} property is non-@code{nil},
+the value of this property is a symbol, either @code{o} (for opening
+bracket characters) or @code{c} (for closing bracket characters).  For
+characters whose @code{paired-bracket} property is @code{nil}, the
+value is the symbol @code{n} (None).  Like @code{paired-bracket}, this
+property is used for bidirectional display.
+
 @item old-name
 Corresponds to the Unicode @code{Unicode_1_Name} property.  The value
 is a string.  Unassigned codepoints, and characters that have no value
@@ -574,6 +592,14 @@ This function returns the value of @var{char}'s @var{propname} property.
 (get-char-code-property ?\u2163 'numeric-value)
      @result{} 4
 @end group
+@group
+(get-char-code-property ?\( 'paired-bracket)
+     @result{} 41 ;; closing parenthesis
+@end group
+@group
+(get-char-code-property ?\) 'bracket-type)
+     @result{} c
+@end group
 @end example
 @end defun

etc/ChangeLog
@@ -1,3 +1,9 @@
+2014-10-15  Eli Zaretskii  <eliz@gnu.org>
+
+        * NEWS: Mention the UBA implementation update.
+
+        * HELLO: Remove now unneeded directional control characters.
+
 2014-10-13  Jan Djärv  <jan.h.d@swipnet.se>

         * NEWS: Move and clarify OSX >= 10.6.
@ -18,7 +18,7 @@ Non-ASCII examples:
|
||||
LANGUAGE (NATIVE NAME) HELLO
|
||||
---------------------- -----
|
||||
Amharic ($,1O M[MmN{(B) $,1M`MKM](B
|
||||
Arabic $,1ro(B($,1-g.$-y-q-h.*.1-i(B) $,1-g.$-s.1.$-g.%(B $,1-y.$.*.#.%(B
|
||||
Arabic ($,1-g.$-y-q-h.*.1-i(B) $,1-g.$-s.1.$-g.%(B $,1-y.$.*.#.%(B
|
||||
Bengali ($,17,7>6b727>(B) $,17(7.787M6u7>70(B
|
||||
Braille $,2(3(1('('(5(B
|
||||
Burmese ($,1H9H\H4HZH9HL(B) $,1H9H$HZHYH"H<HLH5HK(B
|
||||
@ -37,7 +37,7 @@ German (Deutsch) Guten Tag / Gr,A|_(B Gott
|
||||
Greek (,Fekkgmij\(B) ,FCei\(B ,Fsar(B
|
||||
Greek, ancient ($,1p1,Fkkgmij^(B) ,FO$,1pv,Fk](B ,Fte(B ,Fja$,1q6(B ,Fl]ca(B ,Fwa$,1r6,Fqe(B
|
||||
Gujarati ($,19W:!9\9p9~9d: (B) $,19h9n9x:-9d:'(B
|
||||
Hebrew $,1ro(B($,1-",q-(,y-*(B) ,Hylem(B
|
||||
Hebrew ($,1-",q-(,y-*(B) ,Hylem(B
|
||||
Hungarian (magyar) Sz,Bi(Bp j,Bs(B napot!
|
||||
Hindi ($,15y55B5f6 (B) $,15h5n5x6-5d6'(B / $,15h5n5x6-5U5~5p(B $,16D(B
|
||||
Italian (italiano) Ciao / Buon giorno
|
||||
|
8
etc/NEWS
@@ -110,6 +110,14 @@ character in the pasted text as actual user input.  This results in a
 paste experience similar to that under a window system, and significant
 performance improvements when pasting large amounts of text.

+** Emacs now supports the latest version of the UBA.
+The Emacs implementation of the Unicode Bidirectional Algorithm (UBA)
+was updated to support all the latest additions and changes introduced
+in Unicode Standard versions 6.3 and 7.0, and a few changes suggested
+for Unicode 8.0.  This includes full support for directional isolates
+and the Bidirectional Parentheses Algorithm (BPA) specified by these
+Unicode standards.
+

 * Changes in Specialized Modes and Packages in Emacs 25.1

|
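The Bidirectional Parentheses Algorithm mentioned in the NEWS entry starts by identifying bracket pairs with a stack (rule BD16 of UAX #9). The sketch below illustrates that pairing step only; it is not the code of bidi_find_bracket_pairs, whose diff is suppressed on this page. All identifiers are hypothetical. The lookup covers ASCII brackets only, the 63-element limit is taken from the current text of BD16 rather than from this commit's MAX_BPA_STACK, and the restrictions to characters of bidi type ON and to a single isolating run sequence are left out.

```c
#include <assert.h>
#include <stddef.h>

#define BPA_SKETCH_DEPTH 63   /* assumed limit, per the current BD16 text */

/* A matched pair, as positions into the text.  */
struct bpa_pair { size_t open_pos, close_pos; };

/* Stand-in for the paired-bracket and bracket-type properties: ASCII
   brackets only.  Return the paired bracket of CH and set *IS_OPEN,
   or return 0 if CH is not a bracket.  */
static unsigned
bpa_paired_bracket (unsigned ch, int *is_open)
{
  switch (ch)
    {
    case '(': *is_open = 1; return ')';
    case ')': *is_open = 0; return '(';
    case '[': *is_open = 1; return ']';
    case ']': *is_open = 0; return '[';
    case '{': *is_open = 1; return '}';
    case '}': *is_open = 0; return '{';
    default: return 0;
    }
}

/* Find bracket pairs in TEXT[0..LEN).  Store at most MAX_PAIRS of
   them in PAIRS, in the order their closing brackets are met, and
   return the number stored.  */
static size_t
bpa_find_pairs (const unsigned *text, size_t len,
                struct bpa_pair *pairs, size_t max_pairs)
{
  struct { unsigned closer; size_t pos; } stack[BPA_SKETCH_DEPTH];
  size_t depth = 0, npairs = 0;

  for (size_t i = 0; i < len; i++)
    {
      int is_open = 0;
      unsigned paired = bpa_paired_bracket (text[i], &is_open);

      if (!paired)
        continue;
      if (is_open)
        {
          if (depth == BPA_SKETCH_DEPTH)
            break;              /* stack full: stop pairing */
          stack[depth].closer = paired;
          stack[depth].pos = i;
          depth++;
        }
      else
        /* Search from the top of the stack for an opener that expects
           this closing bracket.  */
        for (size_t j = depth; j > 0; j--)
          if (stack[j - 1].closer == text[i])
            {
              if (npairs < max_pairs)
                {
                  pairs[npairs].open_pos = stack[j - 1].pos;
                  pairs[npairs].close_pos = i;
                  npairs++;
                }
              depth = j - 1;    /* pop the match and everything above it */
              break;
            }
    }
  return npairs;
}
```

Popping everything above the matched opener is what leaves an interleaved bracket unpaired: in "a ( b [ c ) ]" only the parentheses form a pair. BD16 additionally sorts the resulting pairs by the position of the opening bracket, which this sketch leaves to the caller.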
lisp/ChangeLog
@@ -1,5 +1,11 @@
 2014-10-15  Eli Zaretskii  <eliz@gnu.org>

+        * simple.el (what-cursor-position): Update to support the new bidi
+        characters.
+
+        * descr-text.el (describe-char): Update to support the new bidi
+        characters.
+
         * emacs-lisp/tabulated-list.el (tabulated-list-mode): Force
         bidi-paragraph-direction to 'left-to-right'.  This fixes
         buffer-menu display when the first buffer happens to start with
lisp/descr-text.el
@@ -434,13 +434,26 @@ relevant to POS."
                 code (encode-char char charset)))
         (setq code char))
       (cond
-       ;; Append a PDF character to directional embeddings and
-       ;; overrides, to prevent potential messup of the following
-       ;; text.
-       ((memq char '(?\x202a ?\x202b ?\x202d ?\x202e))
+       ;; Append a PDF character to left-to-right directional
+       ;; embeddings and overrides, to prevent potential messup of the
+       ;; following text.
+       ((memq char '(?\x202a ?\x202d))
         (setq char-description
               (concat char-description
                       (propertize (string ?\x202c) 'invisible t))))
+       ;; Append a PDF character followed by LRM to right-to-left
+       ;; directional embeddings and overrides, to prevent potential
+       ;; messup of the following numerical text.
+       ((memq char '(?\x202b ?\x202e))
+        (setq char-description
+              (concat char-description
+                      (propertize (string ?\x202c ?\x200e) 'invisible t))))
+       ;; Append a PDI character to directional isolate initiators, to
+       ;; prevent potential messup of the following numerical text
+       ((memq char '(?\x2066 ?\x2067 ?\x2068))
+        (setq char-description
+              (concat char-description
+                      (propertize (string ?\x2069) 'invisible t))))
        ;; Append a LRM character to any strong character to avoid
        ;; messing up the numerical codepoint.
        ((memq (get-char-code-property char 'bidi-class) '(R AL))
lisp/simple.el
@@ -1223,15 +1223,21 @@ in *Help* buffer.  See also the command `describe-char'."
   (interactive "P")
   (let* ((char (following-char))
          (bidi-fixer
-          (cond ((memq char '(?\x202a ?\x202b ?\x202d ?\x202e))
-                 ;; If the character is one of LRE, LRO, RLE, RLO, it
-                 ;; will start a directional embedding, which could
-                 ;; completely disrupt the rest of the line (e.g., RLO
-                 ;; will display the rest of the line right-to-left).
-                 ;; So we put an invisible PDF character after these
-                 ;; characters, to end the embedding, which eliminates
-                 ;; any effects on the rest of the line.
+          ;; If the character is one of LRE, LRO, RLE, RLO, it will
+          ;; start a directional embedding, which could completely
+          ;; disrupt the rest of the line (e.g., RLO will display the
+          ;; rest of the line right-to-left).  So we put an invisible
+          ;; PDF character after these characters, to end the
+          ;; embedding, which eliminates any effects on the rest of
+          ;; the line.  For RLE and RLO we also append an invisible
+          ;; LRM, to avoid reordering the following numerical
+          ;; characters.  For LRI/RLI/FSI we append a PDI.
+          (cond ((memq char '(?\x202a ?\x202d))
                  (propertize (string ?\x202c) 'invisible t))
+                ((memq char '(?\x202b ?\x202e))
+                 (propertize (string ?\x202c ?\x200e) 'invisible t))
+                ((memq char '(?\x2066 ?\x2067 ?\x2068))
+                 (propertize (string ?\x2069) 'invisible t))
                 ;; Strong right-to-left characters cause reordering of
                 ;; the following numerical characters which show the
                 ;; codepoint, so append LRM to countermand that.
|
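Both Lisp hunks above apply the same rule: each directional control character gets an invisible terminator appended, so that it cannot reorder the text that follows. The C sketch below restates that mapping for reference. It is an illustration only: the function name bidi_fixer_for is hypothetical, the terminators are returned as UTF-8 byte strings, and the additional case for strong right-to-left characters (bidi class R or AL) is omitted because it needs a character property lookup.

```c
#include <assert.h>
#include <string.h>

/* Return the UTF-8 encoding of the characters to append after the
   directional control CH, or "" if CH needs no terminator here.
   Hypothetical helper that mirrors the `cond' clauses in the hunks.  */
static const char *
bidi_fixer_for (unsigned ch)
{
  switch (ch)
    {
    case 0x202A:                /* LRE */
    case 0x202D:                /* LRO */
      return "\xE2\x80\xAC";    /* PDF (U+202C) */
    case 0x202B:                /* RLE */
    case 0x202E:                /* RLO */
      /* PDF (U+202C) followed by LRM (U+200E).  */
      return "\xE2\x80\xAC\xE2\x80\x8E";
    case 0x2066:                /* LRI */
    case 0x2067:                /* RLI */
    case 0x2068:                /* FSI */
      return "\xE2\x81\xA9";    /* PDI (U+2069) */
    default:
      return "";
    }
}
```

The split between the first two cases is the point of the change: a right-to-left embedding or override needs the extra LRM so that the numeric codepoint shown after it is not reordered, while the isolate initiators introduced in Unicode 6.3 are closed with PDI rather than PDF.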
12
src/.gdbinit
@@ -468,18 +468,18 @@ define pgx
   end
   # GLYPHLESS_GLYPH
   if ($g.type == 2)
-    printf "GLYPHLESS["
+    printf "G-LESS["
     if ($g.u.glyphless.method == 0)
-      printf "THIN]"
+      printf "THIN;0x%x]", $g.u.glyphless.ch
     end
     if ($g.u.glyphless.method == 1)
-      printf "EMPTY]"
+      printf "EMPTY;0x%x]", $g.u.glyphless.ch
     end
     if ($g.u.glyphless.method == 2)
-      printf "ACRO]"
+      printf "ACRO;0x%x]", $g.u.glyphless.ch
     end
     if ($g.u.glyphless.method == 3)
-      printf "HEX]"
+      printf "HEX;0x%x]", $g.u.glyphless.ch
     end
   end
   # IMAGE_GLYPH
@@ -498,7 +498,7 @@ define pgx
     printf " pos=%d", $g.charpos
   end
   # For characters, print their resolved level and bidi type
-  if ($g.type == 0)
+  if ($g.type == 0 || $g.type == 2)
     printf " blev=%d,btyp=", $g.resolved_level
     pbiditype $g.bidi_type
   end
src/ChangeLog
@@ -1,3 +1,70 @@
+2014-10-15  Eli Zaretskii  <eliz@gnu.org>
+
+        Update the bidirectional reordering engine for Unicode 6.3 and 7.0.
+        * bidi.c (bidi_ignore_explicit_marks_for_paragraph_level): Remove
+        variable.
+        (bidi_get_type): Return the isolate initiators and terminator
+        types.
+        (bidi_isolate_fmt_char, bidi_paired_bracket_type)
+        (bidi_fetch_char_skip_isolates, find_first_strong_char)
+        (bidi_find_bracket_pairs, bidi_resolve_brackets): New functions.
+        (bidi_set_sos_type): Renamed from bidi_set_sor_type and updated
+        for the new features.
+        (bidi_push_embedding_level, bidi_pop_embedding_level): Update to
+        push and pop correctly for isolates.
+        (bidi_remember_char): Modified to accept an additional argument
+        and record the bidi type according to its value.
+        (bidi_cache_iterator_state): Accept an additional argument to only
+        update an existing state.  Handle the new members of struct bidi_it.
+        (bidi_cache_find): Arguments changed: no longer accepts a level,
+        instead accepts a flag telling it whether it is okay to return
+        unresolved neutrals.
+        (bidi_initialize): Initiate and staticpro the bracket-type uniprop
+        table.  Initialize new isolate-related members.
+        (bidi_paragraph_init): Some code factored out into
+        find_first_strong_char.
+        (bidi_resolve_explicit_1): Function deleted, its code incorporated
+        into bidi_resolve_explicit.
+        (bidi_resolve_explicit): Support the isolate initiators and
+        terminator.  Fix handling of embeddings and overrides according to
+        new UBA requirements.  Record information about previously seen
+        characters here (moved from bidi_level_of_next_char).
+        (bidi_resolve_weak): Adapt to changes in struct members.
+        (FLAG_EMBEDDING_INSIDE, FLAG_OPPOSITE_INSIDE, MAX_BPA_STACK)
+        (STORE_BRACKET_CHARPOS, PUSH_BPA_STACK): New macros.
+        (bidi_resolve_neutral): Call bidi_resolve_brackets to handle the
+        paired bracket resolution.  Handle isolate initiators and
+        terminator.
+        (bidi_type_of_next_char): Remove unneeded code for BN limit.
+        (bidi_level_of_next_char): Move the code that records information
+        about previous characters to bidi_resolve_explicit.  Fix logic of
+        resolving neutrals and make sure their cache entries are updated.
+        Remove now unneeded special handling of PDF level.
+
+        * dispextern.h (struct glyph): Enlarge the width of resolved_level.
+        (BIDI_MAXDEPTH): New macro, renamed from BIDI_MAXLEVEL and
+        enlarged per Unicode 6.3.
+        (enum bidi_bracket_type_t): New data type.
+        (struct bidi_saved_info): Leave only 2 type members out of 4.
+        Remove bytepos.
+        (struct bidi_stack): Add members necessary to support isolating
+        sequences.
+        (struct bidi_it): Add new members necessary to support isolating
+        sequences and bracket pair resolution.
+
+        * xdisp.c (Fbidi_resolved_levels): New function.
+        (syms_of_xdisp): Defsubr it.
+        (append_glyph, append_composite_glyph, produce_image_glyph)
+        (append_stretch_glyph, append_glyphless_glyph): Convert aborts to
+        assertions.
+        (syms_of_xdisp) <inhibit-bidi-mirroring>: New variable.
+
+        * term.c (append_glyph, append_composite_glyph)
+        (append_glyphless_glyph): Convert aborts to assertions.
+
+        * .gdbinit (pgx): Display the character codepoint, resolved level,
+        and bidi type also for glyphless glyphs.
+
 2014-10-15  Dmitry Antipov  <dmantipov@yandex.ru>

         Avoid unwanted point motion in Fline_beginning_position.
1634
src/bidi.c
File diff suppressed because it is too large
@ -445,8 +445,8 @@ struct glyph
|
||||
/* True means don't display cursor here. */
|
||||
bool_bf avoid_cursor_p : 1;
|
||||
|
||||
/* Resolved bidirectional level of this character [0..63]. */
|
||||
unsigned resolved_level : 5;
|
||||
/* Resolved bidirectional level of this character [0..127]. */
|
||||
unsigned resolved_level : 7;
|
||||
|
||||
/* Resolved bidirectional type of this character, see enum
|
||||
bidi_type_t below. Note that according to UAX#9, only some
|
||||
@ -1857,7 +1857,9 @@ GLYPH_CODE_P (Lisp_Object gc)
|
||||
extern int face_change_count;
|
||||
|
||||
/* For reordering of bidirectional text. */
|
||||
#define BIDI_MAXLEVEL 64
|
||||
|
||||
/* UAX#9's max_depth value. */
|
||||
#define BIDI_MAXDEPTH 125
|
||||
|
||||
/* Data type for describing the bidirectional character types. The
|
||||
first 7 must be at the beginning, because they are the only values
|
||||
@ -1894,23 +1896,39 @@ typedef enum {
|
||||
NEUTRAL_ON /* other neutrals */
|
||||
} bidi_type_t;
|
||||
|
||||
/* Data type for describing the Bidi Paired Bracket Type of a character.
|
||||
|
||||
The order of members must be in sync with the 8th element of the
|
||||
member of unidata-prop-alist (in admin/unidata/unidata-gen.el) for
|
||||
Unicode character property `bracket-type'. */
|
||||
typedef enum {
|
||||
BIDI_BRACKET_NONE = 1,
|
||||
BIDI_BRACKET_OPEN,
|
||||
BIDI_BRACKET_CLOSE
|
||||
} bidi_bracket_type_t;
|
||||
|
||||
/* The basic directionality data type. */
|
||||
typedef enum { NEUTRAL_DIR, L2R, R2L } bidi_dir_t;
|
||||
|
||||
/* Data type for storing information about characters we need to
|
||||
remember. */
|
||||
struct bidi_saved_info {
|
||||
ptrdiff_t bytepos, charpos; /* character's buffer position */
|
||||
ptrdiff_t charpos; /* character's buffer position */
|
||||
bidi_type_t type; /* character's resolved bidi type */
|
||||
bidi_type_t type_after_w1; /* original type of the character, after W1 */
|
||||
bidi_type_t orig_type; /* type as we found it in the buffer */
|
||||
bidi_type_t orig_type; /* bidi type as we found it in the buffer */
|
||||
};
|
||||
|
||||
/* Data type for keeping track of saved embedding levels and override
|
||||
status information. */
|
||||
/* Data type for keeping track of information about saved embedding
|
||||
levels, override status, isolate status, and isolating sequence
|
||||
runs. */
|
||||
struct bidi_stack {
|
||||
int level;
|
||||
bidi_dir_t override;
|
||||
struct bidi_saved_info last_strong;
|
||||
struct bidi_saved_info next_for_neutral;
|
||||
struct bidi_saved_info prev_for_neutral;
|
||||
unsigned level : 7;
|
||||
bool_bf isolate_status : 1;
|
||||
unsigned override : 2;
|
||||
unsigned sos : 2;
|
||||
};
|
||||
|
||||
/* Data type for storing information about a string being iterated on. */
|
||||
@ -1935,22 +1953,24 @@ struct bidi_it {
|
||||
ptrdiff_t nchars; /* its "length", usually 1; it's > 1 for a run
|
||||
of characters covered by a display string */
|
||||
ptrdiff_t ch_len; /* its length in bytes */
|
||||
bidi_type_t type; /* bidi type of this character, after
|
||||
bidi_type_t type; /* final bidi type of this character, after
|
||||
resolving weak and neutral types */
|
||||
bidi_type_t type_after_w1; /* original type, after overrides and W1 */
|
||||
bidi_type_t orig_type; /* original type, as found in the buffer */
|
||||
int resolved_level; /* final resolved level of this character */
|
||||
int invalid_levels; /* how many PDFs to ignore */
|
||||
int invalid_rl_levels; /* how many PDFs from RLE/RLO to ignore */
|
||||
bidi_type_t type_after_wn; /* bidi type after overrides and Wn */
|
||||
bidi_type_t orig_type; /* original bidi type, as found in the buffer */
|
||||
char resolved_level; /* final resolved level of this character */
|
||||
char isolate_level; /* count of isolate initiators unmatched by PDI */
|
||||
ptrdiff_t invalid_levels; /* how many PDFs to ignore */
|
||||
ptrdiff_t invalid_isolates; /* how many PDIs to ignore */
|
||||
struct bidi_saved_info prev; /* info about previous character */
|
||||
struct bidi_saved_info last_strong; /* last-seen strong directional char */
|
||||
struct bidi_saved_info next_for_neutral; /* surrounding characters for... */
|
||||
struct bidi_saved_info prev_for_neutral; /* ...resolving neutrals */
|
||||
struct bidi_saved_info next_for_ws; /* character after sequence of ws */
|
||||
ptrdiff_t bracket_pairing_pos; /* position of pairing bracket */
|
||||
bidi_type_t bracket_enclosed_type; /* type for bracket resolution */
|
||||
ptrdiff_t next_en_pos; /* pos. of next char for determining ET type */
|
||||
bidi_type_t next_en_type; /* type of char at next_en_pos */
|
||||
ptrdiff_t ignore_bn_limit; /* position until which to ignore BNs */
|
||||
bidi_dir_t sor; /* direction of start-of-run in effect */
|
||||
bidi_dir_t sos; /* direction of start-of-sequence in effect */
|
||||
int scan_dir; /* direction of text scan, 1: forw, -1: back */
|
||||
ptrdiff_t disp_pos; /* position of display string after ch */
|
||||
int disp_prop; /* if non-zero, there really is a
|
||||
@ -1960,12 +1980,11 @@ struct bidi_it {
|
||||
/* Note: Everything from here on is not copied/saved when the bidi
|
||||
iterator state is saved, pushed, or popped. So only put here
|
||||
stuff that is not part of the bidi iterator's state! */
|
||||
struct bidi_stack level_stack[BIDI_MAXLEVEL]; /* stack of embedding levels */
|
||||
struct bidi_stack level_stack[BIDI_MAXDEPTH+2+1]; /* directional status stack */
|
||||
struct bidi_string_data string; /* string to reorder */
|
||||
struct window *w; /* the window being displayed */
|
||||
bidi_dir_t paragraph_dir; /* current paragraph direction */
|
||||
ptrdiff_t separator_limit; /* where paragraph separator should end */
|
||||
bool_bf prev_was_pdf : 1; /* if true, previous char was PDF */
|
||||
bool_bf first_elt : 1; /* if true, examine current char first */
|
||||
bool_bf new_paragraph : 1; /* if true, we expect a new paragraph */
|
||||
bool_bf frame_window_p : 1; /* true if displaying on a GUI frame */
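The widened `resolved_level` fields in these hunks follow from the new `BIDI_MAXDEPTH` value of 125. The following is a minimal standalone sketch of that arithmetic, not code from the commit: `bits_needed`, `struct level_field` and `roundtrip_level` are hypothetical names, and the sketch assumes the highest resolved level is `BIDI_MAXDEPTH + 1`, since the implicit-level rules can raise an explicit level by one.

```c
#include <assert.h>

#define BIDI_MAXDEPTH 125	/* value taken from the hunk above */

/* Hypothetical helper: smallest number of bits that can hold VALUE.  */
static unsigned
bits_needed (unsigned value)
{
  unsigned bits = 0;

  while (value > 0)
    {
      bits++;
      value >>= 1;
    }
  return bits ? bits : 1;
}

/* Hypothetical stand-in for the widened bit-field in struct glyph.  */
struct level_field
{
  unsigned resolved_level : 7;
};

/* Store LEVEL in the 7-bit field and read it back.  */
static unsigned
roundtrip_level (unsigned level)
{
  struct level_field f;

  assert (level <= 127);	/* largest value a 7-bit field can hold */
  f.resolved_level = level;
  return f.resolved_level;
}
```

Under that assumption, `bits_needed (BIDI_MAXDEPTH + 1)` yields 7, which matches the new field width, while a 5-bit field tops out at 31.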
|
||||
|
@ -1513,8 +1513,7 @@ append_glyph (struct it *it)
|
||||
if (it->bidi_p)
|
||||
{
|
||||
glyph->resolved_level = it->bidi_it.resolved_level;
|
||||
if ((it->bidi_it.type & 7) != it->bidi_it.type)
|
||||
emacs_abort ();
|
||||
eassert ((it->bidi_it.type & 7) == it->bidi_it.type);
|
||||
glyph->bidi_type = it->bidi_it.type;
|
||||
}
|
||||
else
|
||||
@ -1710,8 +1709,7 @@ append_composite_glyph (struct it *it)
|
||||
if (it->bidi_p)
|
||||
{
|
||||
glyph->resolved_level = it->bidi_it.resolved_level;
|
||||
if ((it->bidi_it.type & 7) != it->bidi_it.type)
|
||||
emacs_abort ();
|
||||
eassert ((it->bidi_it.type & 7) == it->bidi_it.type);
|
||||
glyph->bidi_type = it->bidi_it.type;
|
||||
}
|
||||
else
|
||||
@ -1795,8 +1793,7 @@ append_glyphless_glyph (struct it *it, int face_id, const char *str)
|
||||
if (it->bidi_p)
|
||||
{
|
||||
glyph->resolved_level = it->bidi_it.resolved_level;
|
||||
if ((it->bidi_it.type & 7) != it->bidi_it.type)
|
||||
emacs_abort ();
|
||||
eassert ((it->bidi_it.type & 7) == it->bidi_it.type);
|
||||
glyph->bidi_type = it->bidi_it.type;
|
||||
}
|
||||
else
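The condition kept by these conversions, `(type & 7) == type`, holds exactly when the value lies in 0..7, that is, when it fits in three bits. A small standalone sketch of the check follows; it uses the standard `assert` in place of Emacs's `eassert`, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when TYPE is unchanged by masking with 7,
   i.e. it lies in 0..7 and fits in a 3-bit field.  */
static bool
fits_in_three_bits (int type)
{
  return (type & 7) == type;
}

/* Hypothetical stand-in for copying the type into a glyph: check the
   invariant first, then hand the value back.  */
static int
store_bidi_type (int type)
{
  assert (fits_in_three_bits (type));
  return type;
}
```

The practical difference between the two forms is that an abort is always compiled in, whereas an assertion can be compiled out of builds that do not enable checking.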
|
||||
|
133
src/xdisp.c
@ -6935,7 +6935,8 @@ get_next_display_element (struct it *it)
|
||||
is R..." */
|
||||
/* FIXME: Do we need an exception for characters from display
|
||||
tables? */
|
||||
if (it->bidi_p && it->bidi_it.type == STRONG_R)
|
||||
if (it->bidi_p && it->bidi_it.type == STRONG_R
|
||||
&& !inhibit_bidi_mirroring)
|
||||
it->c = bidi_mirror_char (it->c);
|
||||
/* Map via display table or translate control characters.
|
||||
IT->c, IT->len etc. have been set to the next character by
|
||||
@ -21468,6 +21469,114 @@ Value is the new character position of point. */)
#undef ROW_GLYPH_NEWLINE_P
}

DEFUN ("bidi-resolved-levels", Fbidi_resolved_levels,
       Sbidi_resolved_levels, 0, 1, 0,
       doc: /* Return the resolved bidirectional levels of characters at VPOS.

The resolved levels are produced by the Emacs bidi reordering engine
that implements the UBA, the Unicode Bidirectional Algorithm. Please
read the Unicode Standard Annex 9 (UAX#9) for background information
about these levels.

VPOS is the zero-based number of the current window's screen line
for which to produce the resolved levels. If VPOS is nil or omitted,
it defaults to the screen line of point. If the window displays a
header line, VPOS of zero will report on the header line, and first
line of text in the window will have VPOS of 1.

Value is an array of resolved levels, indexed by glyph number.
Glyphs are numbered from zero starting from the beginning of the
screen line, i.e. the left edge of the window for left-to-right lines
and from the right edge for right-to-left lines. The resolved levels
are produced only for the window's text area; text in display margins
is not included.

If the selected window's display is not up-to-date, or if the specified
screen line does not display text, this function returns nil. It is
highly recommended to bind this function to some simple key, like F8,
in order to avoid these problems.

This function exists mainly for testing the correctness of the
Emacs UBA implementation, in particular with the test suite. */)
  (Lisp_Object vpos)
{
  struct window *w = XWINDOW (selected_window);
  struct buffer *b = XBUFFER (w->contents);
  int nrow;
  struct glyph_row *row;

  if (NILP (vpos))
    {
      int d1, d2, d3, d4, d5;

      pos_visible_p (w, PT, &d1, &d2, &d3, &d4, &d5, &nrow);
    }
  else
    {
      CHECK_NUMBER_COERCE_MARKER (vpos);
      nrow = XINT (vpos);
    }

  /* We require up-to-date glyph matrix for this window. */
  if (w->window_end_valid
      && !windows_or_buffers_changed
      && b
      && !b->clip_changed
      && !b->prevent_redisplay_optimizations_p
      && !window_outdated (w)
      && nrow >= 0
      && nrow < w->current_matrix->nrows
      && (row = MATRIX_ROW (w->current_matrix, nrow))->enabled_p
      && MATRIX_ROW_DISPLAYS_TEXT_P (row))
    {
      struct glyph *g, *e, *g1;
      int nglyphs, i;
      Lisp_Object levels;

      if (!row->reversed_p) /* Left-to-right glyph row. */
        {
          g = g1 = row->glyphs[TEXT_AREA];
          e = g + row->used[TEXT_AREA];

          /* Skip over glyphs at the start of the row that were
             generated by redisplay for its own needs. */
          while (g < e
                 && INTEGERP (g->object)
                 && g->charpos < 0)
            g++;
          g1 = g;

          /* Count the "interesting" glyphs in this row. */
          for (nglyphs = 0; g < e && !INTEGERP (g->object); g++)
            nglyphs++;

          /* Create and fill the array. */
          levels = make_uninit_vector (nglyphs);
          for (i = 0; g1 < g; i++, g1++)
            ASET (levels, i, make_number (g1->resolved_level));
        }
      else /* Right-to-left glyph row. */
        {
          g = row->glyphs[TEXT_AREA] + row->used[TEXT_AREA] - 1;
          e = row->glyphs[TEXT_AREA] - 1;
          while (g > e
                 && INTEGERP (g->object)
                 && g->charpos < 0)
            g--;
          g1 = g;
          for (nglyphs = 0; g > e && !INTEGERP (g->object); g--)
            nglyphs++;
          levels = make_uninit_vector (nglyphs);
          for (i = 0; g1 > g; i++, g1--)
            ASET (levels, i, make_number (g1->resolved_level));
        }
      return levels;
    }
  else
    return Qnil;
}
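
The glyph walk in the function above can be tried outside Emacs with a mock row. The sketch below is only an illustration of the same three steps (skip the leading glyphs that redisplay inserted, stop at the next such glyph, copy the levels in between); `struct mock_glyph`, its `inserted_by_redisplay` flag (standing in for `INTEGERP (g->object)`) and `collect_levels` are hypothetical names, and a plain array replaces the Lisp vector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few glyph members the walk uses.  */
struct mock_glyph
{
  bool inserted_by_redisplay;	/* stands in for INTEGERP (g->object) */
  long charpos;
  unsigned resolved_level;
};

/* Copy the resolved levels of the text glyphs of a row into OUT,
   starting from the left edge, or from the right edge when REVERSED.
   OUT must have room for USED entries.  Returns the number of levels
   stored.  */
static size_t
collect_levels (const struct mock_glyph *glyphs, size_t used,
                bool reversed, unsigned *out)
{
  size_t start = 0, n = 0;

  assert (glyphs != NULL && out != NULL);

  /* Skip glyphs at the start of the row that redisplay generated for
     its own needs.  */
  while (start < used)
    {
      const struct mock_glyph *g
        = &glyphs[reversed ? used - 1 - start : start];
      if (!(g->inserted_by_redisplay && g->charpos < 0))
        break;
      start++;
    }

  /* Copy levels until the next redisplay-generated glyph.  */
  for (size_t i = start; i < used; i++)
    {
      const struct mock_glyph *g = &glyphs[reversed ? used - 1 - i : i];
      if (g->inserted_by_redisplay)
        break;
      out[n++] = g->resolved_level;
    }
  return n;
}
```

Indexing from either end with one loop is a simplification for the sketch; the original keeps two explicit pointer loops, one per row direction.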
|
||||
|
||||
|
||||
|
||||
/***********************************************************************
|
||||
Menu Bar
|
||||
@ -25198,8 +25307,7 @@ append_glyph (struct it *it)
|
||||
if (it->bidi_p)
|
||||
{
|
||||
glyph->resolved_level = it->bidi_it.resolved_level;
|
||||
if ((it->bidi_it.type & 7) != it->bidi_it.type)
|
||||
emacs_abort ();
|
||||
eassert ((it->bidi_it.type & 7) == it->bidi_it.type);
|
||||
glyph->bidi_type = it->bidi_it.type;
|
||||
}
|
||||
else
|
||||
@ -25282,8 +25390,7 @@ append_composite_glyph (struct it *it)
|
||||
if (it->bidi_p)
|
||||
{
|
||||
glyph->resolved_level = it->bidi_it.resolved_level;
|
||||
if ((it->bidi_it.type & 7) != it->bidi_it.type)
|
||||
emacs_abort ();
|
||||
eassert ((it->bidi_it.type & 7) == it->bidi_it.type);
|
||||
glyph->bidi_type = it->bidi_it.type;
|
||||
}
|
||||
++it->glyph_row->used[area];
|
||||
@ -25471,8 +25578,7 @@ produce_image_glyph (struct it *it)
|
||||
if (it->bidi_p)
|
||||
{
|
||||
glyph->resolved_level = it->bidi_it.resolved_level;
|
||||
if ((it->bidi_it.type & 7) != it->bidi_it.type)
|
||||
emacs_abort ();
|
||||
eassert ((it->bidi_it.type & 7) == it->bidi_it.type);
|
||||
glyph->bidi_type = it->bidi_it.type;
|
||||
}
|
||||
++it->glyph_row->used[area];
|
||||
@ -25560,8 +25666,7 @@ append_stretch_glyph (struct it *it, Lisp_Object object,
|
||||
if (it->bidi_p)
|
||||
{
|
||||
glyph->resolved_level = it->bidi_it.resolved_level;
|
||||
if ((it->bidi_it.type & 7) != it->bidi_it.type)
|
||||
emacs_abort ();
|
||||
eassert ((it->bidi_it.type & 7) == it->bidi_it.type);
|
||||
glyph->bidi_type = it->bidi_it.type;
|
||||
}
|
||||
else
|
||||
@ -26020,8 +26125,7 @@ append_glyphless_glyph (struct it *it, int face_id, int for_no_font, int len,
|
||||
if (it->bidi_p)
|
||||
{
|
||||
glyph->resolved_level = it->bidi_it.resolved_level;
|
||||
if ((it->bidi_it.type & 7) != it->bidi_it.type)
|
||||
emacs_abort ();
|
||||
eassert ((it->bidi_it.type & 7) == it->bidi_it.type);
|
||||
glyph->bidi_type = it->bidi_it.type;
|
||||
}
|
||||
++it->glyph_row->used[area];
|
||||
@ -30437,6 +30541,7 @@ syms_of_xdisp (void)
|
||||
|
||||
DEFSYM (Qright_to_left, "right-to-left");
|
||||
DEFSYM (Qleft_to_right, "left-to-right");
|
||||
defsubr (&Sbidi_resolved_levels);
|
||||
|
||||
#ifdef HAVE_WINDOW_SYSTEM
|
||||
DEFVAR_BOOL ("x-stretch-cursor", x_stretch_cursor_p,
|
||||
@ -30843,6 +30948,12 @@ To add a prefix to continuation lines, use `wrap-prefix'. */);
|
||||
doc: /* Non-nil means don't free realized faces. Internal use only. */);
|
||||
inhibit_free_realized_faces = 0;
|
||||
|
||||
DEFVAR_BOOL ("inhibit-bidi-mirroring", inhibit_bidi_mirroring,
|
||||
doc: /* Non-nil means don't mirror characters even when bidi context requires that.
|
||||
Intended for use during debugging and for testing bidi display;
|
||||
see biditest.el in the test suite. */);
|
||||
inhibit_bidi_mirroring = 0;
|
||||
|
||||
#ifdef GLYPH_DEBUG
|
||||
DEFVAR_BOOL ("inhibit-try-window-id", inhibit_try_window_id,
|
||||
doc: /* Inhibit try_window_id display optimization. */);
|
||||
|
96392
test/BidiCharacterTest.txt
Normal file
File diff suppressed because it is too large
@ -1,3 +1,9 @@
2014-10-15 Eli Zaretskii <eliz@gnu.org>

	* BidiCharacterTest.txt: New file, from Unicode.

	* biditest.el: New file.

2014-10-08 Leo Liu <sdl.web@gmail.com>

	* automated/print-tests.el: New file.

121
test/biditest.el
Normal file
@ -0,0 +1,121 @@
;;; biditest.el --- test bidi reordering in GNU Emacs display engine.

;; Copyright (C) 2013-2014 Free Software Foundation, Inc.

;; Author: Eli Zaretskii
;; Maintainer: FSF
;; Package: emacs

;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.

;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with GNU Emacs. If not, see <http://www.gnu.org/licenses/>.

;;; Commentary:

;; Produce a specially-formatted text file from BidiCharacterTest.txt
;; file that is part of the Unicode Standard's UCD package. The file
;; shows the expected results of reordering according to the UBA. The
;; file is supposed to be visited in Emacs, and the resulting display
;; compared with the expected one.

;;; Code:

(defun biditest-generate-testfile (input-file output-file)
  "Generate a bidi test file OUTPUT-FILE from data in INPUT-FILE.

INPUT-FILE should be in the format of the BidiCharacterTest.txt file
available from the Unicode site, as part of the UCD database, see
http://www.unicode.org/Public/UCD/latest/ucd/BidiCharacterTest.txt.

The resulting file should be viewed with `inhibit-bidi-mirroring' set to t."
  (let ((output-buf (get-buffer-create "*biditest-output*"))
        (lnum 1)
        tbuf)
    (with-temp-buffer
      (message "Generating output in %s ..." output-file)
      (setq tbuf (current-buffer))
      (insert-file-contents input-file)
      (goto-char (point-min))
      (while (not (eobp))
        (when (looking-at "^\\([0-9A-F ]+\\);\\([012]\\);\\([01]\\);\\([0-9 ]+\\);\\([0-9 ]+\\)$")
          (let ((codes (match-string 1))
                (default-paragraph (match-string 2))
                (resolved-paragraph (match-string 3))
                ;; FIXME: Should compare LEVELS with what the display
                ;; engine actually produced.
                (levels (match-string 4))
                (indices (match-string 5)))
            (setq codes (split-string codes " ")
                  indices (split-string indices " "))
            (switch-to-buffer output-buf)
            (insert (format "Test on line %d:\n\n" lnum))
            ;; Force paragraph direction to what the UCD test
            ;; specifies.
            (insert (cond
                     ((string= default-paragraph "0") ;L2R
                      #x200e)
                     ((string= default-paragraph "1") ;R2L
                      #x200f)
                     (t ""))) ; dynamic
            ;; Insert the characters
            (mapc (lambda (code)
                    (insert (string-to-number code 16)))
                  codes)
            (insert "\n\n")
            ;; Insert the expected results
            (insert "Expected result:\n\n")
            ;; We want the expected results displayed exactly as
            ;; specified in the test file, without any reordering, so
            ;; we override the directional properties of all of the
            ;; characters in the expected result by prepending
            ;; LRO/RLO.
            (cond ((string= resolved-paragraph "0")
                   (insert #x200e #x202d))
                  ((string= resolved-paragraph "1")
                   (insert #x200f #x202e)
                   ;; We need to reverse the list of indices for R2L
                   ;; paragraphs, so that their logical order on
                   ;; display matches user expectations.
                   (setq indices (nreverse indices))))
            (mapc (lambda (index)
                    (insert (string-to-number
                             (nth (string-to-number index 10) codes)
                             16)))
                  indices)
            (insert #x202c) ; end the embedding
            (insert "\n\n"))
          (switch-to-buffer tbuf))
        (forward-line 1)
        (setq lnum (1+ lnum)))
      (switch-to-buffer output-buf)
      (let ((coding-system-for-write 'utf-8-unix))
        (write-file output-file))
      (message "Generating output in %s ... done" output-file))))

(defun biditest-create-test ()
  "Create a test file for testing the Emacs bidirectional display.

The resulting file should be viewed with `inhibit-bidi-mirroring' set to t."
  (biditest-generate-testfile (pop command-line-args-left)
                              (or (pop command-line-args-left)
                                  "biditest.txt")))

;; A handy function for displaying the resolved bidi levels.
(defun bidi-levels ()
  "Display the resolved bidirectional levels of characters on current line.

The results can be compared with the levels stated in the
BidiCharacterTest.txt file."
  (interactive)
  (message "%s" (bidi-resolved-levels)))

(define-key global-map [f8] 'bidi-levels)
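
The regular expression in `biditest-generate-testfile` accepts lines made of five semicolon-separated fields: hexadecimal code points, the paragraph direction (0, 1 or 2), the resolved paragraph level, the resolved levels, and the visual-order indices. As a rough illustration of that line format only, here is a C sketch that splits such a line and parses its numeric fields; `split_fields` and `parse_numbers` are hypothetical helpers and are not part of Emacs or of the Unicode test data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split LINE (modified in place) at ';' into at
   most MAX fields and return the number of fields found.  If LINE has
   more than MAX fields, the last one keeps the unsplit remainder.  */
static int
split_fields (char *line, char *fields[], int max)
{
  int n = 0;
  char *p = line;

  assert (line != NULL && fields != NULL);
  while (n < max)
    {
      fields[n++] = p;
      p = strchr (p, ';');
      if (!p)
        break;
      *p++ = '\0';
    }
  return n;
}

/* Hypothetical helper: parse the space-separated numbers in FIELD,
   written in BASE, into OUT.  Returns how many were stored, at most
   MAX.  */
static int
parse_numbers (const char *field, int base, long out[], int max)
{
  int n = 0;

  while (n < max)
    {
      char *end;
      long v = strtol (field, &end, base);

      if (end == field)		/* no further digits to convert */
        break;
      out[n++] = v;
      field = end;
    }
  return n;
}
```

For a made-up line such as `05D0 0028 0029;1;1;1 1 1;2 1 0`, `split_fields` yields five fields; parsing the first with base 16 gives the three code points, and parsing the last with base 10 gives the indices that the Lisp code uses to pick characters for the expected-result line.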