mirror of https://git.savannah.gnu.org/git/emacs.git synced 2025-01-04 11:40:22 +00:00

Update Unicode support to Unicode version 14.0.0

* admin/unidata/copyright.html:
* admin/unidata/UnicodeData.txt:
* admin/unidata/Blocks.txt:
* admin/unidata/BidiBrackets.txt:
* admin/unidata/BidiMirroring.txt:
* admin/unidata/IVD_Sequences.txt:
* admin/unidata/NormalizationTest.txt:
* admin/unidata/SpecialCasing.txt:
* test/manual/BidiCharacterTest.txt: Updated files from Unicode
14.0.

* lisp/international/fontset.el (script-representative-chars): Add
new scripts.
(otf-script-alist): Update from latest version.
(setup-default-fontset): Add new scripts.
* lisp/international/characters.el: Update syntax and category
tables for new characters and scripts.
(char-width-table): Update for changes in Unicode 14.0.
* lisp/international/mule-cmds.el (ucs-names): Update used and
unused ranges per Unicode 14.0.

* test/lisp/international/ucs-normalize-tests.el
(ucs-normalize-tests--failing-lines-part1)
(ucs-normalize-tests--failing-lines-part2): Update per the test
results.

* doc/lispref/nonascii.texi (Character Properties): Update Unicode
version number.

* etc/NEWS: Announce support for Unicode 14.0.

* admin/notes/unicode: Minor copyedits.
Eli Zaretskii 2021-09-15 14:40:13 +03:00
parent bce1013883
commit 83557511a7
16 changed files with 1814 additions and 560 deletions

@ -12,11 +12,11 @@ Emacs uses the following files from the Unicode Character Database
. UnicodeData.txt
. Blocks.txt
. BidiBrackets.txt
. BidiCharacterTest.txt
. BidiMirroring.txt
. IVD_Sequences.txt
. NormalizationTest.txt
. SpecialCasing.txt
. BidiCharacterTest.txt
First, the first 7 files need to be copied into admin/unidata/, and
the file https://www.unicode.org/copyright.html should be copied over
@ -81,7 +81,12 @@ regarding failing lines.
The file BidiCharacterTest.txt should be copied to the test suite, and
if its format has changed, the file biditest.el there should be
modified to follow suit.
modified to follow suit. If there's trailing whitespace in
BidiCharacterTest.txt, it should be removed before committing the new
version.
etc/NEWS should be updated to announce the support for the new Unicode
version.
Problems, fixmes and other unicode-related issues
-------------------------------------------------------------
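The trailing-whitespace cleanup described in the note above can be sketched as follows. This is only an illustration, not the procedure actually used for this commit; the function name strip_trailing_whitespace is hypothetical, and the sketch assumes the file is read as text and that dropping carriage returns is acceptable.

```python
def strip_trailing_whitespace(text: str) -> str:
    """Remove trailing spaces, tabs, and carriage returns from every line.

    Assumption: the caller reads the whole file as one string and writes
    the result back; line breaks are normalized to "\n".
    """
    return "\n".join(line.rstrip(" \t\r") for line in text.split("\n"))


# Hypothetical two-line input with a trailing space and a trailing tab.
cleaned = strip_trailing_whitespace("0061 0062; x \n05D0 05D1; x\t\n")
```

Applied to the contents of BidiCharacterTest.txt before committing, the cleaned text differs from the original only in the whitespace at the ends of lines.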

@ -1,11 +1,11 @@
# BidiBrackets-13.0.0.txt
# Date: 2019-09-09, 19:31:00 GMT [AG, LI, KW]
# © 2019 Unicode®, Inc.
# BidiBrackets-14.0.0.txt
# Date: 2021-06-30, 23:59:00 GMT [AG, LI, KW]
# © 2021 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use, see http://www.unicode.org/terms_of_use.html
# For terms of use, see https://www.unicode.org/terms_of_use.html
#
# Unicode Character Database
# For documentation, see http://www.unicode.org/reports/tr44/
# For documentation, see https://www.unicode.org/reports/tr44/
#
# Bidi_Paired_Bracket and Bidi_Paired_Bracket_Type Properties
#
@ -56,7 +56,7 @@
# of each line.
#
# For information on bidirectional paired brackets, see UAX #9: Unicode
# Bidirectional Algorithm, at http://www.unicode.org/unicode/reports/tr9/
# Bidirectional Algorithm, at https://www.unicode.org/reports/tr9/
#
# This file was originally created by Andrew Glass and Laurentiu Iancu
# for Unicode 6.3.
@ -147,6 +147,14 @@
2E27; 2E26; c # RIGHT SIDEWAYS U BRACKET
2E28; 2E29; o # LEFT DOUBLE PARENTHESIS
2E29; 2E28; c # RIGHT DOUBLE PARENTHESIS
2E55; 2E56; o # LEFT SQUARE BRACKET WITH STROKE
2E56; 2E55; c # RIGHT SQUARE BRACKET WITH STROKE
2E57; 2E58; o # LEFT SQUARE BRACKET WITH DOUBLE STROKE
2E58; 2E57; c # RIGHT SQUARE BRACKET WITH DOUBLE STROKE
2E59; 2E5A; o # TOP HALF LEFT PARENTHESIS
2E5A; 2E59; c # TOP HALF RIGHT PARENTHESIS
2E5B; 2E5C; o # BOTTOM HALF LEFT PARENTHESIS
2E5C; 2E5B; c # BOTTOM HALF RIGHT PARENTHESIS
3008; 3009; o # LEFT ANGLE BRACKET
3009; 3008; c # RIGHT ANGLE BRACKET
300A; 300B; o # LEFT DOUBLE ANGLE BRACKET
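For readers unfamiliar with the layout of the entries in the hunk above, here is a minimal parsing sketch. It assumes each data line has three semicolon-separated fields (code point, paired bracket, and bracket type "o" or "c") followed by a "#" comment, as the lines shown here do. The name parse_bidi_brackets is hypothetical and this is not how Emacs itself reads the file.

```python
def parse_bidi_brackets(text):
    """Map each code point to (paired code point, bracket type).

    Assumption: fields are separated by ";" and everything after "#"
    is a comment, as in the entries shown in the hunk.
    """
    table = {}
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()
        if not data:
            continue  # blank line or comment-only line
        code, paired, kind = (field.strip() for field in data.split(";"))
        table[int(code, 16)] = (int(paired, 16), kind)
    return table


BRACKETS_SAMPLE = """\
2E55; 2E56; o # LEFT SQUARE BRACKET WITH STROKE
2E56; 2E55; c # RIGHT SQUARE BRACKET WITH STROKE
2E59; 2E5A; o # TOP HALF LEFT PARENTHESIS
2E5A; 2E59; c # TOP HALF RIGHT PARENTHESIS
"""

brackets = parse_bidi_brackets(BRACKETS_SAMPLE)
```

In the sample, every opening bracket ("o") points at a closing bracket ("c") that points back at it, which is the pairing the new Unicode 14.0 entries add.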

@ -1,10 +1,10 @@
# BidiMirroring-13.0.0.txt
# Date: 2019-09-09, 19:34:00 GMT [KW, LI, RP]
# © 2019 Unicode®, Inc.
# For terms of use, see http://www.unicode.org/terms_of_use.html
# BidiMirroring-14.0.0.txt
# Date: 2021-08-08, 22:55:00 GMT [KW, RP]
# © 2021 Unicode®, Inc.
# For terms of use, see https://www.unicode.org/terms_of_use.html
#
# Unicode Character Database
# For documentation, see http://www.unicode.org/reports/tr44/
# For documentation, see https://www.unicode.org/reports/tr44/
#
# Bidi_Mirroring_Glyph Property
#
@ -15,7 +15,7 @@
# value, for which there is another Unicode character that typically has a glyph
# that is the mirror image of the original character's glyph.
#
# The repertoire covered by the file is Unicode 13.0.0.
# The repertoire covered by the file is Unicode 14.0.0.
#
# The file contains a list of lines with mappings from one code point
# to another one for character-based mirroring.
@ -40,7 +40,7 @@
# for character-based mirroring.
#
# For information on bidi mirroring, see UAX #9: Unicode Bidirectional Algorithm,
# at http://www.unicode.org/unicode/reports/tr9/
# at https://www.unicode.org/reports/tr9/
#
# This file was originally created by Markus Scherer.
# Extended for Unicode 3.2, 4.0, 4.1, 5.0, 5.1, 5.2, and 6.0 by Ken Whistler,
@ -96,10 +96,10 @@
208D; 208E # SUBSCRIPT LEFT PARENTHESIS
208E; 208D # SUBSCRIPT RIGHT PARENTHESIS
2208; 220B # ELEMENT OF
2209; 220C # NOT AN ELEMENT OF
2209; 220C # [BEST FIT] NOT AN ELEMENT OF
220A; 220D # SMALL ELEMENT OF
220B; 2208 # CONTAINS AS MEMBER
220C; 2209 # DOES NOT CONTAIN AS MEMBER
220C; 2209 # [BEST FIT] DOES NOT CONTAIN AS MEMBER
220D; 220A # SMALL CONTAINS AS MEMBER
2215; 29F5 # DIVISION SLASH
221F; 2BFE # RIGHT ANGLE
@ -453,6 +453,14 @@
2E27; 2E26 # RIGHT SIDEWAYS U BRACKET
2E28; 2E29 # LEFT DOUBLE PARENTHESIS
2E29; 2E28 # RIGHT DOUBLE PARENTHESIS
2E55; 2E56 # LEFT SQUARE BRACKET WITH STROKE
2E56; 2E55 # RIGHT SQUARE BRACKET WITH STROKE
2E57; 2E58 # LEFT SQUARE BRACKET WITH DOUBLE STROKE
2E58; 2E57 # RIGHT SQUARE BRACKET WITH DOUBLE STROKE
2E59; 2E5A # TOP HALF LEFT PARENTHESIS
2E5A; 2E59 # TOP HALF RIGHT PARENTHESIS
2E5B; 2E5C # BOTTOM HALF LEFT PARENTHESIS
2E5C; 2E5B # BOTTOM HALF RIGHT PARENTHESIS
3008; 3009 # LEFT ANGLE BRACKET
3009; 3008 # RIGHT ANGLE BRACKET
300A; 300B # LEFT DOUBLE ANGLE BRACKET
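A quick consistency check on mirroring data like the entries above might look like the sketch below. It assumes two semicolon-separated code points per line with an optional "#" comment, where the comment may carry a "[BEST FIT]" marker. The helper names are hypothetical and nothing here reflects how Emacs consumes the file.

```python
def parse_bidi_mirroring(text):
    """Return a dict mapping each code point to its mirror code point."""
    mirrors = {}
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()
        if not data:
            continue  # blank line or comment-only line
        source, target = (int(field.strip(), 16) for field in data.split(";"))
        mirrors[source] = target
    return mirrors


def one_way_mappings(mirrors):
    """List code points whose mirror does not map back to them."""
    return sorted(cp for cp, m in mirrors.items() if mirrors.get(m) != cp)


MIRROR_SAMPLE = """\
2209; 220C # [BEST FIT] NOT AN ELEMENT OF
220C; 2209 # [BEST FIT] DOES NOT CONTAIN AS MEMBER
2E55; 2E56 # LEFT SQUARE BRACKET WITH STROKE
2E56; 2E55 # RIGHT SQUARE BRACKET WITH STROKE
"""

mirrors = parse_bidi_mirroring(MIRROR_SAMPLE)
unpaired = one_way_mappings(mirrors)
```

Because the "[BEST FIT]" marker lives in the comment, it does not affect the parsed mapping. A tool that needs to distinguish best-fit pairs would have to inspect the comment text separately.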

@ -1,6 +1,6 @@
# Blocks-13.0.0.txt
# Date: 2019-07-10, 19:06:00 GMT [KW]
# © 2019 Unicode®, Inc.
# Blocks-14.0.0.txt
# Date: 2021-01-22, 23:29:00 GMT [KW]
# © 2021 Unicode®, Inc.
# For terms of use, see http://www.unicode.org/terms_of_use.html
#
# Unicode Character Database
@ -52,6 +52,7 @@
0800..083F; Samaritan
0840..085F; Mandaic
0860..086F; Syriac Supplement
0870..089F; Arabic Extended-B
08A0..08FF; Arabic Extended-A
0900..097F; Devanagari
0980..09FF; Bengali
@ -215,7 +216,9 @@ FFF0..FFFF; Specials
104B0..104FF; Osage
10500..1052F; Elbasan
10530..1056F; Caucasian Albanian
10570..105BF; Vithkuqi
10600..1077F; Linear A
10780..107BF; Latin Extended-F
10800..1083F; Cypriot Syllabary
10840..1085F; Imperial Aramaic
10860..1087F; Palmyrene
@ -240,6 +243,7 @@ FFF0..FFFF; Specials
10E80..10EBF; Yezidi
10F00..10F2F; Old Sogdian
10F30..10F6F; Sogdian
10F70..10FAF; Old Uyghur
10FB0..10FDF; Chorasmian
10FE0..10FFF; Elymaic
11000..1107F; Brahmi
@ -259,13 +263,14 @@ FFF0..FFFF; Specials
11600..1165F; Modi
11660..1167F; Mongolian Supplement
11680..116CF; Takri
11700..1173F; Ahom
11700..1174F; Ahom
11800..1184F; Dogra
118A0..118FF; Warang Citi
11900..1195F; Dives Akuru
119A0..119FF; Nandinagari
11A00..11A4F; Zanabazar Square
11A50..11AAF; Soyombo
11AB0..11ABF; Unified Canadian Aboriginal Syllabics Extended-A
11AC0..11AFF; Pau Cin Hau
11C00..11C6F; Bhaiksuki
11C70..11CBF; Marchen
@ -277,11 +282,13 @@ FFF0..FFFF; Specials
12000..123FF; Cuneiform
12400..1247F; Cuneiform Numbers and Punctuation
12480..1254F; Early Dynastic Cuneiform
12F90..12FFF; Cypro-Minoan
13000..1342F; Egyptian Hieroglyphs
13430..1343F; Egyptian Hieroglyph Format Controls
14400..1467F; Anatolian Hieroglyphs
16800..16A3F; Bamum Supplement
16A40..16A6F; Mro
16A70..16ACF; Tangsa
16AD0..16AFF; Bassa Vah
16B00..16B8F; Pahawh Hmong
16E40..16E9F; Medefaidrin
@ -290,13 +297,15 @@ FFF0..FFFF; Specials
17000..187FF; Tangut
18800..18AFF; Tangut Components
18B00..18CFF; Khitan Small Script
18D00..18D8F; Tangut Supplement
18D00..18D7F; Tangut Supplement
1AFF0..1AFFF; Kana Extended-B
1B000..1B0FF; Kana Supplement
1B100..1B12F; Kana Extended-A
1B130..1B16F; Small Kana Extension
1B170..1B2FF; Nushu
1BC00..1BC9F; Duployan
1BCA0..1BCAF; Shorthand Format Controls
1CF00..1CFCF; Znamenny Musical Notation
1D000..1D0FF; Byzantine Musical Symbols
1D100..1D1FF; Musical Symbols
1D200..1D24F; Ancient Greek Musical Notation
@ -305,9 +314,12 @@ FFF0..FFFF; Specials
1D360..1D37F; Counting Rod Numerals
1D400..1D7FF; Mathematical Alphanumeric Symbols
1D800..1DAAF; Sutton SignWriting
1DF00..1DFFF; Latin Extended-G
1E000..1E02F; Glagolitic Supplement
1E100..1E14F; Nyiakeng Puachue Hmong
1E290..1E2BF; Toto
1E2C0..1E2FF; Wancho
1E7E0..1E7FF; Ethiopic Extended-B
1E800..1E8DF; Mende Kikakui
1E900..1E95F; Adlam
1EC70..1ECBF; Indic Siyaq Numbers
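The block ranges above lend themselves to a simple lookup. The following sketch assumes lines of the form "START..END; Name" with hexadecimal bounds, as shown in the hunk. The names parse_blocks and block_of are hypothetical, and the sample holds only a few of the Unicode 14.0 lines.

```python
import bisect


def parse_blocks(text):
    """Return a sorted list of (start, end, name) tuples."""
    blocks = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()
        if not data:
            continue  # blank line or comment-only line
        span, name = (field.strip() for field in data.split(";", 1))
        start, end = (int(bound, 16) for bound in span.split(".."))
        blocks.append((start, end, name))
    blocks.sort()
    return blocks


def block_of(blocks, code_point):
    """Return the block name containing code_point, or None."""
    starts = [start for start, _, _ in blocks]
    index = bisect.bisect_right(starts, code_point) - 1
    if index >= 0 and blocks[index][0] <= code_point <= blocks[index][1]:
        return blocks[index][2]
    return None


BLOCKS_SAMPLE = """\
0870..089F; Arabic Extended-B
08A0..08FF; Arabic Extended-A
11700..1174F; Ahom
18D00..18D7F; Tangut Supplement
"""

blocks = parse_blocks(BLOCKS_SAMPLE)
```

With these sample lines, a code point such as U+0880 resolves to the newly added Arabic Extended-B block, while a code point that falls between listed ranges yields None.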

@ -2,6 +2,9 @@
#
# History:
#
# 2020-11-06 Registration of additional sequences in the MSARG
# collection.
#
# 2017-12-12 Registration of additional sequences in the Adobe-Japan1
# collection. Combined registration of the KRName collection
# and of sequences in that collection. Registration of
@ -27,10 +30,10 @@
#
# This file is part of the Unicode Ideographic Variation Database (IVD).
# For more details on the IVD, see UTS #37:
# http://www.unicode.org/reports/tr37/
# https://www.unicode.org/reports/tr37/
#
# Copyright 2006-2017 Unicode, Inc.
# For terms of use, see: http://www.unicode.org/terms_of_use.html
# Copyright 2006-2020 Unicode, Inc.
# For terms of use, see: https://www.unicode.org/copyright.html#8
#
3402 E0100; Adobe-Japan1; CID+13698
3402 E0101; Adobe-Japan1; CID+13697
@ -864,6 +867,8 @@
4054 E0100; Hanyo-Denshi; IA3507
4054 E0101; Hanyo-Denshi; TK01062970
4058 E0100; Adobe-Japan1; CID+18198
4058 E0101; MSARG; MD_4058
4058 E0102; MSARG; ME_4058_001
4071 E0100; Hanyo-Denshi; IA0509
4071 E0100; Moji_Joho; MJ002897
4071 E0101; Hanyo-Denshi; JTB661
@ -1094,6 +1099,8 @@
4359 E0101; Moji_Joho; MJ003650
4395 E0100; Hanyo-Denshi; IA4278
4395 E0101; Hanyo-Denshi; TK01074010
4397 E0100; MSARG; MA_9967
4397 E0101; MSARG; ME_4397_001
43A8 E0100; Hanyo-Denshi; IA4294
43A8 E0101; Hanyo-Denshi; TK01074340
43A9 E0100; Hanyo-Denshi; IA0588
@ -1407,6 +1414,8 @@
460D E0101; Moji_Joho; MJ004388
460F E0100; Adobe-Japan1; CID+18634
4610 E0100; Adobe-Japan1; CID+19136
4615 E0100; MSARG; MA_8FEB
4615 E0101; MSARG; ME_4615_001
462F E0100; Hanyo-Denshi; IA4925
462F E0101; Hanyo-Denshi; TK01083800
4635 E0100; Hanyo-Denshi; IA4929
@ -1560,6 +1569,8 @@
4921 E0101; Hanyo-Denshi; TK01092060
4938 E0100; Hanyo-Denshi; IA5637
4938 E0101; Hanyo-Denshi; TK01093220
493E E0100; MSARG; MA_97E1
493E E0101; MSARG; ME_493E_001
493F E0100; Hanyo-Denshi; IA5643
493F E0100; Moji_Joho; MJ005179
493F E0101; Hanyo-Denshi; KS462830
@ -1608,6 +1619,8 @@
4A28 E0101; Hanyo-Denshi; IB1033
4A28 E0101; Moji_Joho; MJ005391
4A29 E0100; Adobe-Japan1; CID+18910
4A29 E0101; MSARG; MD_4A29
4A29 E0102; MSARG; ME_4A29_001
4A3C E0100; Hanyo-Denshi; IB1041
4A3C E0100; Moji_Joho; MJ005411
4A3C E0101; Hanyo-Denshi; JTBEBD
@ -2016,6 +2029,8 @@
4E75 E0101; Moji_Joho; MJ006419
4E75 E0102; Hanyo-Denshi; JTAD26
4E75 E0102; Moji_Joho; MJ006420
4E78 E0100; MSARG; MA_9AFB
4E78 E0101; MSARG; ME_4E78_001
4E79 E0100; Adobe-Japan1; CID+19143
4E7E E0100; Adobe-Japan1; CID+1505
4E7F E0100; Adobe-Japan1; CID+14306
@ -2633,6 +2648,8 @@
5029 E0101; Moji_Joho; MJ006862
5029 E0102; Hanyo-Denshi; FT2068
5029 E0102; Moji_Joho; MJ006863
5029 E0103; MSARG; MB_ADC5
5029 E0104; MSARG; ME_5029_001
502A E0100; Adobe-Japan1; CID+4157
502B E0100; Adobe-Japan1; CID+3993
502B E0101; Hanyo-Denshi; JA4649
@ -3511,6 +3528,8 @@
51CD E0100; Adobe-Japan1; CID+3162
51CF E0100; Adobe-Japan1; CID+19177
51D1 E0100; Adobe-Japan1; CID+19178
51D1 E0101; MSARG; MA_FAA4
51D1 E0102; MSARG; ME_51D1_001
51D2 E0100; Adobe-Japan1; CID+21186
51D3 E0100; Adobe-Japan1; CID+19179
51D4 E0100; Adobe-Japan1; CID+19180
@ -4443,6 +4462,8 @@
53D6 E0100; Adobe-Japan1; CID+2324
53D7 E0100; Adobe-Japan1; CID+2337
53D7 E0101; Adobe-Japan1; CID+13813
53D8 E0100; MSARG; MA_895A
53D8 E0101; MSARG; ME_53D8_001
53D9 E0100; Adobe-Japan1; CID+2432
53DA E0100; Adobe-Japan1; CID+14372
53DB E0100; Adobe-Japan1; CID+3412
@ -4943,6 +4964,8 @@
555A E0102; Hanyo-Denshi; KS044550
555A E0102; Moji_Joho; MJ008401
555A E0103; Moji_Joho; MJ057180
555A E0104; MSARG; MD_555A
555A E0105; MSARG; ME_555A_001
555B E0100; Adobe-Japan1; CID+21282
555C E0100; Adobe-Japan1; CID+4392
555D E0100; Adobe-Japan1; CID+4398
@ -4951,6 +4974,8 @@
555D E0102; Hanyo-Denshi; JTAEDA
555D E0102; Moji_Joho; MJ008405
555E E0100; Adobe-Japan1; CID+7633
555F E0100; MSARG; MB_B1D2
555F E0101; MSARG; ME_555F_001
5560 E0100; Adobe-Japan1; CID+14392
5561 E0100; Adobe-Japan1; CID+20308
5561 E0101; Adobe-Japan1; CID+14393
@ -4961,6 +4986,8 @@
5568 E0100; Moji_Joho; MJ008415
5568 E0101; Hanyo-Denshi; IB1035
5568 E0101; Moji_Joho; MJ008416
556B E0100; MSARG; MA_94DC
556B E0101; MSARG; ME_556B_001
557B E0100; Adobe-Japan1; CID+4404
557C E0100; Adobe-Japan1; CID+4409
557C E0101; Hanyo-Denshi; JA5138
@ -5428,6 +5455,8 @@
56A0 E0102; Hanyo-Denshi; JTAF22
56A0 E0102; Moji_Joho; MJ008750
56A2 E0100; Adobe-Japan1; CID+3311
56A4 E0100; MSARG; MA_97A3
56A4 E0101; MSARG; ME_56A4_001
56A5 E0100; Adobe-Japan1; CID+4446
56A5 E0101; Adobe-Japan1; CID+7822
56A5 E0102; Hanyo-Denshi; JA5175
@ -6654,6 +6683,8 @@
59F6 E0100; Adobe-Japan1; CID+1132
59F7 E0100; Adobe-Japan1; CID+21401
59F8 E0100; Adobe-Japan1; CID+19312
59F8 E0101; MSARG; MA_9D55
59F8 E0102; MSARG; ME_59F8_001
59FB E0100; Adobe-Japan1; CID+1213
59FF E0100; Adobe-Japan1; CID+2207
59FF E0101; Adobe-Japan1; CID+13792
@ -6870,6 +6901,8 @@
5ACC E0103; Hanyo-Denshi; FT1791S
5ACC E0103; Moji_Joho; MJ009909
5ACF E0100; Adobe-Japan1; CID+21424
5ACF E0101; MSARG; MA_92F4
5ACF E0102; MSARG; ME_5ACF_001
5AD0 E0100; Adobe-Japan1; CID+4603
5AD6 E0100; Adobe-Japan1; CID+4596
5AD7 E0100; Adobe-Japan1; CID+4593
@ -7540,6 +7573,8 @@
5C8D E0100; Moji_Joho; MJ010418
5C8D E0101; Hanyo-Denshi; JTB06C
5C8D E0101; Moji_Joho; MJ010419
5C8D E0102; MSARG; MB_CAC0
5C8D E0103; MSARG; ME_5C8D_001
5C8F E0100; Adobe-Japan1; CID+16840
5C90 E0100; Adobe-Japan1; CID+1584
5C91 E0100; Adobe-Japan1; CID+4663
@ -7569,6 +7604,8 @@
5CAB E0100; Adobe-Japan1; CID+4666
5CAC E0100; Adobe-Japan1; CID+3764
5CAD E0100; Adobe-Japan1; CID+17557
5CAD E0101; MSARG; MB_CC64
5CAD E0102; MSARG; ME_5CAD_001
5CB1 E0100; Adobe-Japan1; CID+2866
5CB2 E0100; Adobe-Japan1; CID+21464
5CB3 E0100; Adobe-Japan1; CID+1463
@ -10095,6 +10132,8 @@
62D0 E0103; Moji_Joho; MJ012273
62D0 E0104; Hanyo-Denshi; IB1917
62D0 E0104; Moji_Joho; MJ012274
62D0 E0105; MSARG; MB_A9E4
62D0 E0106; MSARG; ME_62D0_001
62D1 E0100; Adobe-Japan1; CID+4961
62D2 E0100; Adobe-Japan1; CID+1675
62D2 E0101; Adobe-Japan1; CID+13715
@ -12633,6 +12672,8 @@
690D E0104; Hanyo-Denshi; KS170180
690D E0104; Moji_Joho; MJ057753
690D E0105; Hanyo-Denshi; TK01044570
690D E0106; MSARG; MB_B4D3
690D E0107; MSARG; ME_690D_001
690E E0100; Adobe-Japan1; CID+3043
690F E0100; Adobe-Japan1; CID+5206
6910 E0100; Adobe-Japan1; CID+21777
@ -13038,6 +13079,8 @@
6A0B E0102; Moji_Joho; MJ014416
6A0B E0103; Hanyo-Denshi; FT1907
6A0B E0103; Moji_Joho; MJ014417
6A0B E0104; MSARG; MA_FD42
6A0B E0105; MSARG; ME_6A0B_001
6A0C E0100; Adobe-Japan1; CID+5295
6A0F E0100; Adobe-Japan1; CID+14658
6A11 E0100; Adobe-Japan1; CID+17851
@ -13898,6 +13941,8 @@
6C67 E0102; Hanyo-Denshi; IB2201
6C67 E0102; Moji_Joho; MJ015090
6C67 E0103; Hanyo-Denshi; TK01049100
6C67 E0104; MSARG; MB_CB4C
6C67 E0105; MSARG; ME_6C67_001
6C68 E0100; Adobe-Japan1; CID+5392
6C6A E0100; Adobe-Japan1; CID+5385
6C6A E0101; Hanyo-Denshi; JA6174
@ -14318,6 +14363,8 @@
6DCD E0100; Hanyo-Denshi; IP6DCD
6DCD E0101; Hanyo-Denshi; TK01050160
6DCE E0100; Adobe-Japan1; CID+17952
6DCE E0101; MSARG; MD_6DCE
6DCE E0102; MSARG; ME_6DCE_001
6DCF E0100; Adobe-Japan1; CID+8520
6DCF E0101; Hanyo-Denshi; JB3957
6DCF E0101; Moji_Joho; MJ015459
@ -15270,6 +15317,8 @@
701E E0106; Hanyo-Denshi; IB2283
701E E0106; Moji_Joho; MJ016159
701E E0107; Hanyo-Denshi; TK01053770
701E E0108; MSARG; MA_96EE
701E E0109; MSARG; ME_701E_001
701F E0100; Adobe-Japan1; CID+5546
701F E0101; Hanyo-Denshi; JA6347
701F E0101; Moji_Joho; MJ016166
@ -15692,6 +15741,8 @@
71A8 E0100; Adobe-Japan1; CID+5580
71AC E0100; Adobe-Japan1; CID+5581
71AE E0100; Adobe-Japan1; CID+18037
71AE E0101; MSARG; MD_71AE
71AE E0102; MSARG; ME_71AE_001
71AF E0100; Adobe-Japan1; CID+18038
71B0 E0100; Adobe-Japan1; CID+21938
71B1 E0100; Adobe-Japan1; CID+3300
@ -15704,6 +15755,8 @@
71B3 E0103; Moji_Joho; MJ016595
71B9 E0100; Adobe-Japan1; CID+5583
71BA E0100; Adobe-Japan1; CID+16965
71BB E0100; MSARG; MD_71BB
71BB E0101; MSARG; ME_71BB_001
71BE E0100; Adobe-Japan1; CID+5584
71BE E0101; Hanyo-Denshi; JA6385
71BE E0101; Moji_Joho; MJ016605
@ -15827,6 +15880,8 @@
7210 E0100; Adobe-Japan1; CID+5597
7213 E0100; Adobe-Japan1; CID+21946
7215 E0100; Adobe-Japan1; CID+16967
7215 E0101; MSARG; MA_FE41
7215 E0102; MSARG; ME_7215_001
7217 E0100; Adobe-Japan1; CID+19521
7217 E0101; Hanyo-Denshi; JB4235
7217 E0101; Moji_Joho; MJ016708
@ -17688,6 +17743,8 @@
771F E0105; Moji_Joho; MJ018174
771F E0106; Hanyo-Denshi; TK01062480
771F E0107; Moji_Joho; MJ018175
771F E0108; MSARG; MB_AF75
771F E0109; MSARG; ME_771F_001
7720 E0100; Adobe-Japan1; CID+3774
7722 E0100; Adobe-Japan1; CID+19578
7724 E0100; Adobe-Japan1; CID+5815
@ -18016,6 +18073,8 @@
784F E0101; Moji_Joho; MJ018495
784F E0102; Hanyo-Denshi; JTB674
784F E0102; Moji_Joho; MJ018496
784F E0103; MSARG; MD_784F
784F E0104; MSARG; ME_784F_001
7851 E0100; Adobe-Japan1; CID+15420
7852 E0100; Adobe-Japan1; CID+22078
785C E0100; Adobe-Japan1; CID+19603
@ -19233,6 +19292,8 @@
7AAE E0103; Moji_Joho; MJ019272
7AAF E0100; Adobe-Japan1; CID+3900
7AB0 E0100; Adobe-Japan1; CID+5938
7AB0 E0101; MSARG; MA_8E50
7AB0 E0102; MSARG; ME_7AB0_001
7AB1 E0100; Hanyo-Denshi; IP7AB1
7AB1 E0101; Hanyo-Denshi; TK01068710
7AB3 E0100; Adobe-Japan1; CID+14931
@ -19444,6 +19505,8 @@
7B51 E0103; Moji_Joho; MJ019458
7B51 E0104; Hanyo-Denshi; JTB7A5
7B51 E0104; Moji_Joho; MJ019456
7B51 E0105; MSARG; MB_B5AE
7B51 E0106; MSARG; ME_7B51_001
7B52 E0100; Adobe-Japan1; CID+3189
7B53 E0100; Adobe-Japan1; CID+14173
7B53 E0101; Hanyo-Denshi; IP7B53
@ -20362,6 +20425,8 @@
7D86 E0103; Moji_Joho; MJ020134
7D88 E0100; Adobe-Japan1; CID+18353
7D89 E0100; Adobe-Japan1; CID+6084
7D89 E0101; MSARG; MA_8EA7
7D89 E0102; MSARG; ME_7D89_001
7D8B E0100; Adobe-Japan1; CID+14980
7D8C E0100; Adobe-Japan1; CID+14981
7D8D E0100; Adobe-Japan1; CID+19665
@ -20427,6 +20492,8 @@
7DAA E0102; Hanyo-Denshi; IB0846
7DAA E0102; Moji_Joho; MJ020184
7DAB E0100; Adobe-Japan1; CID+6095
7DAB E0101; MSARG; MA_8EA8
7DAB E0102; MSARG; ME_7DAB_001
7DAC E0100; Adobe-Japan1; CID+2342
7DAD E0100; Adobe-Japan1; CID+1185
7DAD E0101; Hanyo-Denshi; JA1661
@ -21226,6 +21293,8 @@
7FEB E0103; Moji_Joho; MJ020716
7FEB E0104; Moji_Joho; MJ020718
7FEC E0100; Adobe-Japan1; CID+15007
7FEC E0101; MSARG; MB_E6F8
7FEC E0102; MSARG; ME_7FEC_001
7FEE E0100; Adobe-Japan1; CID+15008
7FEF E0100; Adobe-Japan1; CID+15009
7FF0 E0100; Adobe-Japan1; CID+1545
@ -21238,6 +21307,8 @@
7FF0 E0104; Moji_Joho; MJ020725
7FF0 E0105; Hanyo-Denshi; TK01074020
7FF0 E0106; Hanyo-Denshi; TK01074040
7FF1 E0100; MSARG; MB_BFAC
7FF1 E0101; MSARG; ME_7FF1_001
7FF2 E0100; Adobe-Japan1; CID+18402
7FF3 E0100; Adobe-Japan1; CID+6199
7FF3 E0101; Hanyo-Denshi; JA7042
@ -21250,6 +21321,9 @@
7FF9 E0102; Hanyo-Denshi; FT2443
7FF9 E0102; Moji_Joho; MJ020737
7FFA E0100; Adobe-Japan1; CID+15010
7FFA E0101; MSARG; MA_8ECB
7FFA E0102; MSARG; ME_7FFA_001
7FFA E0103; MSARG; ME_7FFA_002
7FFB E0100; Adobe-Japan1; CID+3723
7FFB E0101; Adobe-Japan1; CID+14040
7FFB E0102; Hanyo-Denshi; JA4361
@ -23080,6 +23154,8 @@
833A E0101; Moji_Joho; MJ021843
833A E0102; Hanyo-Denshi; KS346690S
833A E0102; Moji_Joho; MJ021844
833A E0103; MSARG; MB_D072
833A E0104; MSARG; ME_833A_001
833C E0100; Adobe-Japan1; CID+18489
833C E0101; Hanyo-Denshi; JB5581
833C E0101; Moji_Joho; MJ021846
@ -23543,6 +23619,8 @@
83C1 E0104; Hanyo-Denshi; FT2479
83C1 E0104; Moji_Joho; MJ022062
83C1 E0105; Hanyo-Denshi; TK01078230
83C1 E0106; MSARG; MB_B5D7
83C1 E0107; MSARG; ME_83C1_001
83C2 E0100; Hanyo-Denshi; KS350410
83C2 E0101; Hanyo-Denshi; TK01078690
83C5 E0100; Adobe-Japan1; CID+2625
@ -24360,6 +24438,8 @@
84A8 E0103; Hanyo-Denshi; IB0865
84A8 E0103; Moji_Joho; MJ022481
84A8 E0104; Hanyo-Denshi; TK01080090
84A8 E0105; MSARG; MB_E3C8
84A8 E0106; MSARG; ME_84A8_001
84A9 E0100; Adobe-Japan1; CID+22350
84A9 E0101; Hanyo-Denshi; JB5680
84A9 E0101; Moji_Joho; MJ022484
@ -24871,6 +24951,8 @@
8534 E0101; Moji_Joho; MJ022738
8534 E0102; Hanyo-Denshi; KS360300
8534 E0102; Moji_Joho; MJ022739
8534 E0103; MSARG; MA_8F77
8534 E0104; MSARG; ME_8534_001
8535 E0100; Adobe-Japan1; CID+2818
8535 E0101; Hanyo-Denshi; JA3402
8535 E0102; Hanyo-Denshi; TK01081050
@ -26682,6 +26764,7 @@
8846 E0106; Hanyo-Denshi; TK01083450
8846 E0107; MSARG; MA_8FBC
8846 E0108; MSARG; ME_8846_001
8846 E0109; MSARG; ME_8846_002
8848 E0100; Adobe-Japan1; CID+22465
8849 E0100; Adobe-Japan1; CID+22466
884A E0100; Adobe-Japan1; CID+18635
@ -27969,6 +28052,8 @@
8B66 E0101; Moji_Joho; MJ024795
8B66 E0102; Hanyo-Denshi; KS408750S
8B66 E0102; Moji_Joho; MJ024796
8B67 E0100; MSARG; MB_F4D4
8B67 E0101; MSARG; ME_8B67_001
8B69 E0100; Adobe-Japan1; CID+17130
8B69 E0101; Hanyo-Denshi; JC9220
8B69 E0101; Moji_Joho; MJ024799
@ -28664,6 +28749,8 @@
8E26 E0100; Adobe-Japan1; CID+19850
8E27 E0100; Adobe-Japan1; CID+18732
8E2A E0100; Adobe-Japan1; CID+6824
8E2D E0100; MSARG; MA_9E5A
8E2D E0101; MSARG; ME_8E2D_001
8E30 E0100; Adobe-Japan1; CID+6813
8E30 E0101; Hanyo-Denshi; JA7692
8E30 E0101; Moji_Joho; MJ025364
@ -30340,6 +30427,8 @@
90A8 E0103; Moji_Joho; MJ026250
90A8 E0104; Hanyo-Denshi; TK01090660
90A8 E0105; Hanyo-Denshi; TK01090690
90A8 E0106; MSARG; MA_9068
90A8 E0107; MSARG; ME_90A8_001
90AA E0100; Adobe-Japan1; CID+2309
90AA E0101; Adobe-Japan1; CID+13454
90AA E0102; Adobe-Japan1; CID+13806
@ -30615,6 +30704,8 @@
9175 E0101; Moji_Joho; MJ026482
9175 E0102; Hanyo-Denshi; JTBDA9
9175 E0102; Moji_Joho; MJ026483
9176 E0100; MSARG; MA_9E4A
9176 E0101; MSARG; ME_9176_001
9177 E0100; Adobe-Japan1; CID+2053
9177 E0101; Adobe-Japan1; CID+13776
9177 E0102; Hanyo-Denshi; JA2583
@ -31073,6 +31164,8 @@
92B7 E0102; Moji_Joho; MJ026858
92B8 E0100; Adobe-Japan1; CID+22751
92B9 E0100; Adobe-Japan1; CID+6997
92B9 E0101; MSARG; MA_F9D7
92B9 E0102; MSARG; ME_92B9_001
92BA E0100; Adobe-Japan1; CID+22752
92BB E0100; Adobe-Japan1; CID+19920
92BC E0100; Adobe-Japan1; CID+19921
@ -31310,6 +31403,8 @@
936E E0101; Moji_Joho; MJ027061
936E E0102; Hanyo-Denshi; FT2650
936E E0102; Moji_Joho; MJ027062
936E E0103; MSARG; MA_A05F
936E E0104; MSARG; ME_936E_001
936F E0100; Adobe-Japan1; CID+22777
9370 E0100; Adobe-Japan1; CID+8676
9371 E0100; Adobe-Japan1; CID+18852
@ -32094,6 +32189,8 @@
96B6 E0101; Moji_Joho; MJ027706
96B6 E0102; Hanyo-Denshi; KS475490
96B6 E0102; Moji_Joho; MJ027705
96B6 E0103; MSARG; MA_90C4
96B6 E0104; MSARG; ME_96B6_001
96B7 E0100; Adobe-Japan1; CID+4020
96B8 E0100; Adobe-Japan1; CID+7114
96B9 E0100; Adobe-Japan1; CID+7115
@ -32370,6 +32467,8 @@
9759 E0102; Hanyo-Denshi; JA3237
9759 E0103; Hanyo-Denshi; TK01097430
9759 E0104; Hanyo-Denshi; TK01097490
9759 E0105; MSARG; MD_9759
9759 E0106; MSARG; ME_9759_001
975A E0100; Adobe-Japan1; CID+15275
975A E0101; Hanyo-Denshi; JB7121
975A E0101; Moji_Joho; MJ027909
@ -32392,6 +32491,8 @@
975C E0105; Moji_Joho; MJ027915
975C E0106; Hanyo-Denshi; TK01097530
975C E0107; Hanyo-Denshi; TK01097550
975C E0108; MSARG; MB_C052
975C E0109; MSARG; ME_975C_001
975D E0100; Hanyo-Denshi; IB1040
975D E0100; Moji_Joho; MJ027919
975D E0101; Hanyo-Denshi; IP975D
@ -32439,6 +32540,8 @@
976D E0103; Moji_Joho; MJ027942
976D E0104; Hanyo-Denshi; HG1633
976D E0104; Moji_Joho; MJ027941
976D E0105; MSARG; MA_9E46
976D E0106; MSARG; ME_976D_001
976E E0100; Adobe-Japan1; CID+15277
9771 E0100; Adobe-Japan1; CID+7152
9771 E0101; Adobe-Japan1; CID+7710
@ -32593,6 +32696,8 @@
97C8 E0103; Moji_Joho; MJ028052
97C8 E0104; Hanyo-Denshi; FT2682
97C8 E0104; Moji_Joho; MJ028054
97C8 E0105; MSARG; MA_9F76
97C8 E0106; MSARG; ME_97C8_001
97C9 E0100; Adobe-Japan1; CID+17192
97C9 E0101; Hanyo-Denshi; JB7160
97C9 E0101; Moji_Joho; MJ028055
@ -33066,6 +33171,7 @@
98EB E0103; Moji_Joho; MJ028358
98EC E0100; MSARG; MA_914B
98EC E0101; MSARG; ME_98EC_001
98EC E0102; MSARG; ME_98EC_002
98ED E0100; Adobe-Japan1; CID+4289
98ED E0101; Hanyo-Denshi; JA5012
98ED E0101; Moji_Joho; MJ028362
@ -33366,6 +33472,8 @@
9936 E0100; Moji_Joho; MJ028491
9936 E0101; Hanyo-Denshi; IP9936
9936 E0101; Moji_Joho; MJ028492
9938 E0100; MSARG; MA_9652
9938 E0101; MSARG; ME_9938_001
9939 E0100; Adobe-Japan1; CID+22892
9939 E0101; Hanyo-Denshi; JB7268
9939 E0101; Moji_Joho; MJ028494
@ -33844,6 +33952,8 @@
9A5F E0103; Hanyo-Denshi; KS510550
9A5F E0103; Moji_Joho; MJ028831
9A62 E0100; Adobe-Japan1; CID+7261
9A63 E0100; MSARG; MA_9557
9A63 E0101; MSARG; ME_9A63_001
9A64 E0100; Adobe-Japan1; CID+7263
9A65 E0100; Adobe-Japan1; CID+7262
9A65 E0101; Adobe-Japan1; CID+14268
@ -35128,6 +35238,8 @@
9ED8 E0102; Hanyo-Denshi; FT2329
9ED8 E0102; Moji_Joho; MJ029902
9ED9 E0100; Adobe-Japan1; CID+3815
9ED9 E0101; MSARG; MD_9ED9
9ED9 E0102; MSARG; ME_9ED9_001
9EDB E0100; Adobe-Japan1; CID+2883
9EDB E0101; Adobe-Japan1; CID+7729
9EDB E0102; Hanyo-Denshi; JA3467
@ -35813,6 +35925,8 @@ FA29 E0100; Adobe-Japan1; CID+8687
206EE E0101; Moji_Joho; MJ031295
206F9 E0100; Moji_Joho; MJ031302
206F9 E0101; Moji_Joho; MJ031303
2070E E0100; MSARG; MA_92C3
2070E E0101; MSARG; ME_2070E_001
2071B E0100; Hanyo-Denshi; KS023680
2071B E0101; Hanyo-Denshi; TK01009920
2074F E0100; Adobe-Japan1; CID+17312
@ -36093,6 +36207,8 @@ FA29 E0100; Adobe-Japan1; CID+8687
21764 E0100; Moji_Joho; MJ033638
21764 E0101; Hanyo-Denshi; KS073700
21764 E0101; Moji_Joho; MJ057303
217B5 E0100; MSARG; MA_96FD
217B5 E0101; MSARG; ME_217B5_001
21800 E0100; Hanyo-Denshi; TK01020690
21800 E0101; Hanyo-Denshi; TK01020760
21898 E0100; Hanyo-Denshi; KS077190
@ -36168,6 +36284,8 @@ FA29 E0100; Adobe-Japan1; CID+8687
21D45 E0100; Adobe-Japan1; CID+17545
21D58 E0100; Hanyo-Denshi; KS089870
21D58 E0101; Hanyo-Denshi; TK01024500
21D5E E0100; MSARG; MA_876E
21D5E E0101; MSARG; ME_21D5E_001
21D62 E0100; Adobe-Japan1; CID+17547
21D78 E0100; Adobe-Japan1; CID+17546
21D92 E0100; Adobe-Japan1; CID+17556
@ -36660,6 +36778,8 @@ FA29 E0100; Adobe-Japan1; CID+8687
23780 E0101; Hanyo-Denshi; JTB3B8
23780 E0101; Moji_Joho; MJ038377
23780 E0102; Hanyo-Denshi; TK01046760
237C2 E0100; MSARG; MA_FCF0
237C2 E0101; MSARG; ME_237C2_001
237E7 E0100; Adobe-Japan1; CID+17875
237E7 E0101; Hanyo-Denshi; JD1574
237E7 E0101; Moji_Joho; MJ038420
@ -36986,6 +37106,8 @@ FA29 E0100; Adobe-Japan1; CID+8687
254C9 E0101; Hanyo-Denshi; TK01063840
254D9 E0100; Adobe-Japan1; CID+18217
2550E E0100; Adobe-Japan1; CID+17009
25584 E0100; MSARG; MA_93C3
25584 E0101; MSARG; ME_25584_001
255A7 E0100; Adobe-Japan1; CID+18229
25607 E0100; Hanyo-Denshi; IB0328
25607 E0100; Moji_Joho; MJ042965
@ -37297,6 +37419,8 @@ FA29 E0100; Adobe-Japan1; CID+8687
263C1 E0101; Hanyo-Denshi; JTC0CA
263C1 E0101; Moji_Joho; MJ045051
263C1 E0102; Moji_Joho; MJ045050
263F9 E0100; MSARG; MD_263F9
263F9 E0101; MSARG; ME_263F9_001
26402 E0100; Adobe-Japan1; CID+18398
26405 E0100; Hanyo-Denshi; KS319810
26405 E0101; Hanyo-Denshi; TK01073730
@ -37311,12 +37435,18 @@ FA29 E0100; Adobe-Japan1; CID+8687
26408 E0102; Hanyo-Denshi; TK01073790
26409 E0100; Hanyo-Denshi; KS319910
26409 E0101; Hanyo-Denshi; TK01073770
26410 E0100; MSARG; MA_90CC
26410 E0101; MSARG; ME_26410_001
26439 E0100; MSARG; MD_26439
26439 E0101; MSARG; ME_26439_001
26462 E0100; Hanyo-Denshi; KS321040
26462 E0100; Moji_Joho; MJ045176
26462 E0101; Hanyo-Denshi; KS321270
26462 E0101; Moji_Joho; MJ045177
26489 E0100; Hanyo-Denshi; TK01007100
26489 E0101; Hanyo-Denshi; TK01074030
26489 E0102; MSARG; MA_8ECC
26489 E0103; MSARG; ME_26489_001
264B3 E0100; Hanyo-Denshi; KS322190
264B3 E0100; Moji_Joho; MJ058361
264B3 E0101; Hanyo-Denshi; KS322200S
@ -37581,6 +37711,8 @@ FA29 E0100; Adobe-Japan1; CID+8687
27088 E0101; Hanyo-Denshi; TK01082110
270F0 E0100; Hanyo-Denshi; KS369820
270F0 E0101; Hanyo-Denshi; TK01082380
270F0 E0102; MSARG; MA_8FA8
270F0 E0103; MSARG; ME_270F0_001
270F4 E0100; Adobe-Japan1; CID+17103
270F4 E0101; Hanyo-Denshi; JC9141
270F4 E0101; Moji_Joho; MJ047259
@ -37698,6 +37830,8 @@ FA29 E0100; Adobe-Japan1; CID+8687
2770F E0100; Moji_Joho; MJ048373
2770F E0101; Moji_Joho; MJ048374
27723 E0100; Adobe-Japan1; CID+18652
27741 E0100; MSARG; MA_94C7
27741 E0101; MSARG; ME_27741_001
27752 E0100; Adobe-Japan1; CID+18656
27753 E0100; Hanyo-Denshi; KS393290
27753 E0100; Moji_Joho; MJ048418
@ -39019,6 +39153,8 @@ FA29 E0100; Adobe-Japan1; CID+8687
2CF4C E0101; Moji_Joho; MJ056893
2CF4C E0102; Moji_Joho; MJ056894
2CF4C E0103; Moji_Joho; MJ056898
2CF7A E0100; MSARG; MC_00045
2CF7A E0101; MSARG; ME_2CF7A_001
2D020 E0100; Moji_Joho; MJ056969
2D020 E0101; Moji_Joho; MJ056970
2D028 E0100; Moji_Joho; MJ059338
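To see how many sequences each collection contributes, entries like the ones above can be tallied with a short script. This sketch assumes each data line reads "BASE SELECTOR; Collection; Identifier", with both code points in hexadecimal, and that comment lines start with "#". The function name parse_ivd_sequences is hypothetical.

```python
from collections import Counter


def parse_ivd_sequences(text):
    """Return a list of (base, selector, collection, identifier) tuples."""
    entries = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        sequence, collection, identifier = (
            field.strip() for field in line.split(";")
        )
        base, selector = (int(cp, 16) for cp in sequence.split())
        entries.append((base, selector, collection, identifier))
    return entries


IVD_SAMPLE = """\
4058 E0100; Adobe-Japan1; CID+18198
4058 E0101; MSARG; MD_4058
4058 E0102; MSARG; ME_4058_001
2CF7A E0100; MSARG; MC_00045
"""

entries = parse_ivd_sequences(IVD_SAMPLE)
per_collection = Counter(collection for _, _, collection, _ in entries)
```

Run over the full IVD_Sequences.txt, the same tally would show how many entries the 2020-11-06 MSARG registration added alongside the existing collections.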

File diff suppressed because it is too large.

@ -1,6 +1,6 @@
# SpecialCasing-13.0.0.txt
# Date: 2019-09-08, 23:31:24 GMT
# © 2019 Unicode®, Inc.
# SpecialCasing-14.0.0.txt
# Date: 2021-03-08, 19:35:55 GMT
# © 2021 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use, see http://www.unicode.org/terms_of_use.html
#

File diff suppressed because it is too large.

@ -118,7 +118,7 @@ <h1>Unicode® Copyright and Terms of Use</h1>
<ol type="A">
<li><u><a name="1"></a>Unicode Copyright</u>
<ol>
<li>Copyright © 1991-2020 Unicode, Inc. All rights reserved.</li>
<li>Copyright © 1991-2021 Unicode, Inc. All rights reserved.</li>
</ol>
</li>

@ -459,7 +459,7 @@ of character properties. In particular, Emacs supports the
@uref{https://www.unicode.org/reports/tr23/, Unicode Character Property
Model}, and the Emacs character property database is derived from the
Unicode Character Database (@acronym{UCD}). See the
@uref{https://www.unicode.org/versions/Unicode12.1.0/ch04.pdf, Character
@uref{https://www.unicode.org/versions/Unicode14.0.0/ch04.pdf, Character
Properties chapter of the Unicode Standard}, for a detailed
description of Unicode character properties and their meaning. This
section assumes you are already familiar with that chapter of the

@ -128,6 +128,9 @@ of files visited via 'C-x C-f' and other commands.
* Changes in Emacs 28.1
---
** Emacs now supports Unicode Standard version 14.0.
+++
** New command 'execute-extended-command-for-buffer'.
This new command, bound to 'M-S-x', works like

@ -214,6 +214,9 @@ with L, LRE, or LRO Unicode bidi character type.")
(modify-category-entry '(#x31F0 . #x31FF) ?K)
(modify-category-entry '(#x30A0 . #x30FA) ?\|)
(modify-category-entry #x30FF ?\|)
(modify-category-entry '(#x1AFF0 . #x1B000) ?K)
(modify-category-entry '(#x1B120 . #x1B122) ?K)
(modify-category-entry '(#x1B164 . #x1B167) ?K)
;; Hiragana block
(modify-category-entry '(#x3040 . #x309F) ?H)
@ -221,8 +224,12 @@ with L, LRE, or LRO Unicode bidi character type.")
(modify-category-entry #x309F ?\|)
(modify-category-entry #x30A0 ?H)
(modify-category-entry #x30FC ?H)
(modify-category-entry #x1B001 ?H)
(modify-category-entry #x1B11F ?H)
(modify-category-entry '(#x1B150 . #x1B152) ?H)
(modify-category-entry '(#x1B002 . #x1B11E) ?H) ; Hentaigana
(modify-category-entry '(#x1B000 . #x1B1FF) ?j)
(modify-category-entry '(#x1AFF0 . #x1B1FF) ?j)
;; JISX0208
@ -295,7 +302,7 @@ with L, LRE, or LRO Unicode bidi character type.")
(map-charset-chars #'modify-category-entry (car charsets) ?b)
(setq charsets (cdr charsets))))
(modify-category-entry '(#x600 . #x6ff) ?b)
(modify-category-entry '(#x8a0 . #x8ff) ?b)
(modify-category-entry '(#x870 . #x8ff) ?b)
(modify-category-entry '(#xfb50 . #xfdff) ?b)
(modify-category-entry '(#xfe70 . #xfefe) ?b)
@ -306,7 +313,9 @@ with L, LRE, or LRO Unicode bidi character type.")
;; Ethiopic character set
(modify-category-entry '(#x1200 . #x1399) ?e)
(modify-category-entry '(#x2d80 . #x2dde) ?e)
(modify-category-entry '(#X2D80 . #X2DDE) ?e)
(modify-category-entry '(#xAB01 . #xAB2E) ?e)
(modify-category-entry '(#x1E7E0 . #x1E7FE) ?e)
(let ((chars '(?፡ ?። ?፣ ?፤ ?፥ ?፦ ?፧ ?፨)))
(while chars
(modify-syntax-entry (car chars) ".")
@ -580,6 +589,12 @@ with L, LRE, or LRO Unicode bidi character type.")
(modify-category-entry c ?l)
(setq c (1+ c)))
;; Latin Extended-G
(setq c #x1DF00)
(while (<= c #x1DFFF)
(modify-category-entry c ?l)
(setq c (1+ c)))
;; Greek
(modify-category-entry '(#x0370 . #x03FF) ?g)
@ -1016,7 +1031,7 @@ with L, LRE, or LRO Unicode bidi character type.")
(#x0D41 . #x0D44)
(#x0D4D . #x0D4D)
(#x0D62 . #x0D63)
(#x0D81 . #x0D81)
(#x0D81 . #x0D81)
(#x0DCA . #x0DCA)
(#x0DD2 . #x0DD6)
(#x0E31 . #x0E31)
@ -1045,7 +1060,7 @@ with L, LRE, or LRO Unicode bidi character type.")
(#x1085 . #x1086)
(#x108D . #x108D)
(#x109D . #x109D)
(#x1160 . #x11FF)
(#x1160 . #x11FF)
(#x135D . #x135F)
(#x1712 . #x1714)
(#x1732 . #x1734)
@ -1111,7 +1126,7 @@ with L, LRE, or LRO Unicode bidi character type.")
(#xA806 . #xA806)
(#xA80B . #xA80B)
(#xA825 . #xA826)
(#xA82C . #xA82C)
(#xA82C . #xA82C)
(#xA8C4 . #xA8C5)
(#xA8E0 . #xA8F1)
(#xA926 . #xA92D)
@ -1136,7 +1151,7 @@ with L, LRE, or LRO Unicode bidi character type.")
(#xABE5 . #xABE5)
(#xABE8 . #xABE8)
(#xABED . #xABED)
(#xD7B0 . #xD7FB)
(#xD7B0 . #xD7FB)
(#xFB1E . #xFB1E)
(#xFE00 . #xFE0F)
(#xFE20 . #xFE2F)
@ -1148,7 +1163,7 @@ with L, LRE, or LRO Unicode bidi character type.")
(#x10A01 . #x10A0F)
(#x10A38 . #x10A3F)
(#x10AE5 . #x10AE6)
(#x10EAB . #x10EAC)
(#x10EAB . #x10EAC)
(#x11001 . #x11001)
(#x11038 . #x11046)
(#x1107F . #x11081)
@ -1162,7 +1177,7 @@ with L, LRE, or LRO Unicode bidi character type.")
(#x11180 . #x11181)
(#x111B6 . #x111BE)
(#x111CA . #x111CC)
(#x111CF . #x111CF)
(#x111CF . #x111CF)
(#x1122F . #x11231)
(#x11234 . #x11234)
(#x11236 . #x11237)
@ -1194,9 +1209,9 @@ with L, LRE, or LRO Unicode bidi character type.")
(#x1171D . #x1171F)
(#x11722 . #x11725)
(#x11727 . #x1172B)
(#x1193B . #x1193C)
(#x1193E . #x1193E)
(#x11943 . #x11943)
(#x1193B . #x1193C)
(#x1193E . #x1193E)
(#x11943 . #x11943)
(#x11C30 . #x11C36)
(#x11C38 . #x11C3D)
(#x11C92 . #x11CA7)
@ -1206,7 +1221,7 @@ with L, LRE, or LRO Unicode bidi character type.")
(#x16AF0 . #x16AF4)
(#x16B30 . #x16B36)
(#x16F8F . #x16F92)
(#x16FE4 . #x16FE4)
(#x16FE4 . #x16FE4)
(#x1BC9D . #x1BC9E)
(#x1BCA0 . #x1BCA3)
(#x1D167 . #x1D169)
@ -1280,18 +1295,19 @@ with L, LRE, or LRO Unicode bidi character type.")
(#xFF01 . #xFF60)
(#xFFE0 . #xFFE6)
(#x16FE0 . #x16FE4)
(#x16FF0 . #x16FF1)
(#x16FF0 . #x16FF1)
(#x17000 . #x187F7)
(#x18800 . #x18AFF)
(#x18B00 . #x18CD5)
(#x18B00 . #x18CD5)
(#x1AFF0 . #x1AFFF)
(#x1B000 . #x1B152)
(#x1B164 . #x1B167)
(#x1B170 . #x1B2FB)
(#x1B164 . #x1B167)
(#x1B170 . #x1B2FB)
(#x1F004 . #x1F004)
(#x1F0CF . #x1F0CF)
(#x1F18E . #x1F18E)
(#x1F191 . #x1F19A)
(#x1F1AD . #x1F1AD)
(#x1F1AD . #x1F1AD)
(#x1F200 . #x1F320)
(#x1F32D . #x1F335)
(#x1F337 . #x1F37C)
@ -1316,27 +1332,26 @@ with L, LRE, or LRO Unicode bidi character type.")
(#x1F680 . #x1F6C5)
(#x1F6CC . #x1F6CC)
(#x1F6D0 . #x1F6D2)
(#x1F6D5 . #x1F6D7)
(#x1F6D5 . #x1F6D7)
(#x1F6DD . #x1F6DF)
(#x1F6EB . #x1F6EC)
(#x1F6F4 . #x1F6FC)
(#x1F7E0 . #x1F7EB)
(#x1F7E0 . #x1F7F0)
(#x1F90C . #x1F93A)
(#x1F93C . #x1F945)
(#x1F947 . #x1F978)
(#x1F97A . #x1F9CB)
(#x1F9A5 . #x1F9AA)
(#x1F9AE . #x1F9CA)
(#x1F9CD . #x1F9FF)
(#x1FA00 . #x1FA53)
(#x1FA60 . #x1FA6D)
(#x1FA70 . #x1FA74)
(#x1FA78 . #x1FA7A)
(#x1FA80 . #x1FA86)
(#x1FA90 . #x1FAA8)
(#x1FAB0 . #x1FAB6)
(#x1FAC0 . #x1FAC2)
(#x1FAD0 . #x1FAD6)
(#x1FB00 . #x1FB92)
(#x1F93C . #x1F945)
(#x1F947 . #x1F9FF)
(#x1FA00 . #x1FA53)
(#x1FA60 . #x1FA6D)
(#x1FA70 . #x1FA74)
(#x1FA78 . #x1FA7C)
(#x1FA80 . #x1FA86)
(#x1FA90 . #x1FAAC)
(#x1FAB0 . #x1FABA)
(#x1FAC0 . #x1FAC5)
(#x1FAD0 . #x1FAD9)
(#x1FAE0 . #x1FAE7)
(#x1FAF0 . #x1FAF6)
(#x1FB00 . #x1FB92)
(#x20000 . #x2FFFF)
(#x30000 . #x3FFFF))))
(dolist (elt l)


@ -191,7 +191,7 @@
(kanbun #x319D)
(han #x5B57)
(yi #xA288)
(javanese #xA980)
(javanese #xA980)
(cham #xAA00)
(tai-viet #xAA80)
(hangul #xAC00)
@ -209,9 +209,10 @@
(deseret #x10400)
(shavian #x10450)
(osmanya #x10480)
(osage #x104B0)
(osage #x104B0)
(elbasan #x10500)
(caucasian-albanian #x10530)
(vithkuqi #x10570)
(linear-a #x10600)
(cypriot-syllabary #x10800)
(palmyrene #x10860)
@ -220,79 +221,84 @@
(lydian #x10920)
(kharoshthi #x10A00)
(manichaean #x10AC0)
(hanifi-rohingya #x10D00)
(yezidi #x10E80)
(old-sogdian #x10F00)
(sogdian #x10F30)
(chorasmian #x10FB0)
(elymaic #x10FE0)
(hanifi-rohingya #x10D00)
(yezidi #x10E80)
(old-sogdian #x10F00)
(sogdian #x10F30)
(chorasmian #x10FB0)
(elymaic #x10FE0)
(old-uyghur #x10F70)
(mahajani #x11150)
(sinhala-archaic-number #x111E1)
(khojki #x11200)
(khudawadi #x112B0)
(grantha #x11305)
(newa #x11400)
(newa #x11400)
(tirhuta #x11481)
(siddham #x11580)
(modi #x11600)
(takri #x11680)
(dogra #x11800)
(dogra #x11800)
(warang-citi #x118A1)
(dives-akuru #x11900)
(nandinagari #x119a0)
(zanabazar-square #x11A00)
(soyombo #x11A50)
(dives-akuru #x11900)
(nandinagari #x119a0)
(zanabazar-square #x11A00)
(soyombo #x11A50)
(pau-cin-hau #x11AC0)
(bhaiksuki #x11C00)
(marchen #x11C72)
(masaram-gondi #x11D00)
(gunjala-gondi #x11D60)
(makasar #x11EE0)
(bhaiksuki #x11C00)
(marchen #x11C72)
(masaram-gondi #x11D00)
(gunjala-gondi #x11D60)
(makasar #x11EE0)
(cuneiform #x12000)
(cuneiform-numbers-and-punctuation #x12400)
(cypro-minoan #x12F90)
(egyptian #x13000)
(mro #x16A40)
(tangsa #x16A70 #x16AC0)
(bassa-vah #x16AD0)
(pahawh-hmong #x16B11)
(medefaidrin #x16E40)
(tangut #x17000)
(tangut-components #x18800)
(khitan-small-script #x18B00)
(nushu #x1B170)
(medefaidrin #x16E40)
(tangut #x17000)
(tangut-components #x18800)
(khitan-small-script #x18B00)
(nushu #x1B170)
(duployan-shorthand #x1BC20)
(znamenny-musical-notation #x1CF00 #x1CF42 #x1CF50)
(byzantine-musical-symbol #x1D000)
(musical-symbol #x1D100)
(ancient-greek-musical-notation #x1D200)
(tai-xuan-jing-symbol #x1D300)
(counting-rod-numeral #x1D360)
(nyiakeng-puachue-hmong #x1e100)
(wancho #x1e2c0)
(nyiakeng-puachue-hmong #x1e100)
(toto #x1E290)
(wancho #x1e2c0)
(mende-kikakui #x1E810)
(adlam #x1E900)
(indic-siyaq-number #x1ec71)
(ottoman-siyaq-number #x1ed01)
(adlam #x1E900)
(indic-siyaq-number #x1ec71)
(ottoman-siyaq-number #x1ed01)
(mahjong-tile #x1F000)
(domino-tile #x1F030)))
(defvar otf-script-alist)
;; The below was synchronized with the latest Aug 16, 2018 version of
;; The below was synchronized with the latest Oct 8, 2020 version of
;; https://docs.microsoft.com/en-us/typography/opentype/spec/scripttags
(setq otf-script-alist
'((adlm . adlam)
(ahom . ahom)
(hluw . anatolian)
(arab . arabic)
(ahom . ahom)
(hluw . anatolian)
(arab . arabic)
(armi . aramaic)
(armn . armenian)
(avst . avestan)
(bali . balinese)
(bamu . bamum)
(bass . bassa-vah)
(bass . bassa-vah)
(batk . batak)
(bng2 . bengali)
(beng . bengali)
(bhks . bhaiksuki)
(bhks . bhaiksuki)
(bopo . bopomofo)
(brah . brahmi)
(brai . braille)
@ -301,10 +307,11 @@
(byzm . byzantine-musical-symbol)
(cans . canadian-aboriginal)
(cari . carian)
(aghb . caucasian-albanian)
(aghb . caucasian-albanian)
(cakm . chakma)
(cham . cham)
(cher . cherokee)
(chrs . chorasmian)
(copt . coptic)
(xsux . cuneiform)
(cprt . cypriot)
@ -312,29 +319,31 @@
(dsrt . deseret)
(deva . devanagari)
(dev2 . devanagari)
(dogr . dogra)
(dupl . duployan-shorthand)
(diak . dives-akuru)
(dogr . dogra)
(dupl . duployan-shorthand)
(egyp . egyptian)
(elba . elbasan)
(elba . elbasan)
(elym . elymaic)
(ethi . ethiopic)
(geor . georgian)
(glag . glagolitic)
(goth . gothic)
(gran . grantha)
(gran . grantha)
(grek . greek)
(gujr . gujarati)
(gjr2 . gujarati)
(gong . gunjala-gondi)
(gong . gunjala-gondi)
(guru . gurmukhi)
(gur2 . gurmukhi)
(hani . han)
(hang . hangul)
(jamo . hangul)
(rohg . hanifi-rohingya)
(rohg . hanifi-rohingya)
(hano . hanunoo)
(hatr . hatran)
(hatr . hatran)
(hebr . hebrew)
(hung . old-hungarian)
(hung . old-hungarian)
(phli . inscriptional-pahlavi)
(prti . inscriptional-parthian)
(java . javanese)
@ -344,77 +353,79 @@
(kana . kana) ; Hiragana
(kali . kayah-li)
(khar . kharoshthi)
(kits . khitan-small-script)
(khmr . khmer)
(khoj . khojki)
(sind . khudawadi)
(khoj . khojki)
(sind . khudawadi)
(lao\ . lao)
(latn . latin)
(lepc . lepcha)
(limb . limbu)
(lina . linear_a)
(linb . linear_b)
(lisu . lisu)
(lyci . lycian)
(lydi . lydian)
(mahj . mahajani)
(maka . makasar)
(marc . marchen)
(lisu . lisu)
(lyci . lycian)
(lydi . lydian)
(mahj . mahajani)
(maka . makasar)
(marc . marchen)
(mlym . malayalam)
(mlm2 . malayalam)
(mand . mandaic)
(mani . manichaean)
(gonm . masaram-gondi)
(mani . manichaean)
(gonm . masaram-gondi)
(math . mathematical)
(medf . medefaidrin)
(medf . medefaidrin)
(mtei . meetei-mayek)
(mend . mende-kikakui)
(mend . mende-kikakui)
(merc . meroitic)
(mero . meroitic)
(plrd . miao)
(modi . modi)
(plrd . miao)
(modi . modi)
(mong . mongolian)
(mroo . mro)
(mult . multani)
(mroo . mro)
(mult . multani)
(musc . musical-symbol)
(mym2 . burmese)
(mymr . burmese)
(nbat . nabataean)
(newa . newa)
(nbat . nabataean)
(newa . newa)
(nko\ . nko)
(nshu . nushu)
(nshu . nushu)
(hmnp . nyiakeng-puachue-hmong)
(ogam . ogham)
(olck . ol-chiki)
(ital . old_italic)
(xpeo . old_persian)
(narb . old-north-arabian)
(perm . old-permic)
(sogo . old-sogdian)
(ital . old-italic)
(xpeo . old-persian)
(narb . old-north-arabian)
(perm . old-permic)
(sogo . old-sogdian)
(sarb . old-south-arabian)
(orkh . old-turkic)
(orya . oriya)
(ory2 . oriya)
(osge . osage)
(osge . osage)
(osma . osmanya)
(hmng . pahawh-hmong)
(palm . palmyrene)
(pauc . pau-cin-hau)
(hmng . pahawh-hmong)
(palm . palmyrene)
(pauc . pau-cin-hau)
(phag . phags-pa)
(phli . inscriptional-pahlavi)
(phli . inscriptional-pahlavi)
(phnx . phoenician)
(phlp . psalter-pahlavi)
(prti . inscriptional-parthian)
(phlp . psalter-pahlavi)
(prti . inscriptional-parthian)
(rjng . rejang)
(runr . runic)
(samr . samaritan)
(saur . saurashtra)
(shrd . sharada)
(shaw . shavian)
(sidd . siddham)
(sgnw . sutton-sign-writing)
(sidd . siddham)
(sgnw . sutton-sign-writing)
(sinh . sinhala)
(sogd . sogdian)
(sogd . sogdian)
(sora . sora-sompeng)
(soyo . soyombo)
(soyo . soyombo)
(sund . sundanese)
(sylo . syloti_nagri)
(syrc . syriac)
@ -427,19 +438,21 @@
(takr . takri)
(taml . tamil)
(tml2 . tamil)
(tang . tangut)
(tang . tangut)
(telu . telugu)
(tel2 . telugu)
(thaa . thaana)
(thai . thai)
(tibt . tibetan)
(tfng . tifinagh)
(tirh . tirhuta)
(tirh . tirhuta)
(ugar . ugaritic)
(vai\ . vai)
(wara . warang-citi)
(yi\ \ . yi)
(zanb . zanabazar-square)))
(wcho . wancho)
(wara . warang-citi)
(yezi . yezidi)
(yi\ \ . yi)
(zanb . zanabazar-square)))
;; Set standard fontname specification of characters in the default
;; fontset to find an appropriate font for each script/charset. The
@ -740,6 +753,7 @@
shavian
osmanya
osage
vithkuqi
cypriot-syllabary
phoenician
lydian
@ -748,19 +762,23 @@
manichaean
chorasmian
elymaic
old-uyghur
makasar
dives-akuru
cuneiform-numbers-and-punctuation
cuneiform
egyptian
tangsa
bassa-vah
pahawh-hmong
medefaidrin
znamenny-musical-notation
byzantine-musical-symbol
musical-symbol
ancient-greek-musical-notation
tai-xuan-jing-symbol
counting-rod-numeral
toto
adlam
mahjong-tile
domino-tile))


@ -3094,15 +3094,15 @@ on encoding."
;; (#x18800 . #x18AFF) Tangut Components
;; (#x18B00 . #x18CFF) Khitan Small Script
;; (#x18D00 . #x18D0F) Tangut Ideograph Supplement
;; (#x18D10 . #x1AFFF) unused
(#x1B000 . #x1B11F)
;; (#x1B120 . #x1B14F) unused
;; (#x18D10 . #x1AFEF) unused
(#x1AFF0 . #x1B12F)
;; (#x1B130 . #x1B14F) unused
(#x1B150 . #x1B16F)
(#x1B170 . #x1B2FF)
;; (#x1B300 . #x1BBFF) unused
(#x1BC00 . #x1BCAF)
;; (#x1BCB0 . #x1CFFF) unused
(#x1D000 . #x1FFFF)
;; (#x1BCB0 . #x1CEFF) unused
(#x1CF00 . #x1FFFF)
;; (#x20000 . #xDFFFF) CJK Ideograph Extension A, B, etc, unused
(#xE0000 . #xE01FF)))
(gc-cons-threshold (max gc-cons-threshold 10000000))


@ -181,27 +181,34 @@ implementations:
(should-not (ucs-normalize-tests--rule1-failing-for-partX 0)))
(defconst ucs-normalize-tests--failing-lines-part1
(list 2152 2418 15133 15134 15135 15136 15137 15138
15139 15140 15141 15142 16152 16153 16154 16155
16156 16157 16158 16159 16160 16161 16162 16163
16164 16165 16166 16167 16168 16169 16170 16171
16172 16173 16174 16175 16176 16177 16178 16179
16180 16181 16182 16183 16184 16185 16186 16187
16188 16189 16190 16191 16192 16193 16194 16195
16196 16197 16198 16199 16200 16201 16202 16203
16204 16205 16206 16207 16208 16209 16210 16211
16212 16213 16214 16215 16216 16217 16218 16219
16220 16221 16222 16223 16224 16225 16226 16227
16228 16229 16230 16231 16232 16233 16234 16235
16236 16237 16238 16239 16240 16241 16242 16243
16244 16245 16246 16247 16248 16249 16250 16251
16252 16253 16254 16255 16256 16257 16258 16259
16260 16261 16262 16263 16264 16265 16266 16267
16268 16269 16270 16271 16272 16273 16274 16275
16276 16277 16278 16279 16280 16281 16282 16283
16284 16285 16286 16287 16288 16289 16290 16291
16292 16429 16430 16431 16432 16433 16434 16435
16436 16437 16438))
(list 2412 2413 2414 15133 15134 15135 15136 15137
15138 15139 15140 15141 15142 15143 15144 15145
15146 15147 15148 15149 15150 15151 15152 15153
15154 15155 15156 15157 15158 15159 15160 15161
15162 15163 15164 15165 15166 15167 15168 15169
15170 15171 15172 15173 15174 15175 15176 15177
15178 15179 15180 15181 15182 15183 15184 15185
15186 15187 15188 15192 15193 15194 15195 15196
15197 15198 15199 15200 15201 16211 16212 16213
16214 16215 16216 16217 16218 16219 16220 16221
16222 16223 16224 16225 16226 16227 16228 16229
16230 16231 16232 16233 16234 16235 16236 16237
16238 16239 16240 16241 16242 16243 16244 16245
16246 16247 16248 16249 16250 16251 16252 16253
16254 16255 16256 16257 16258 16259 16260 16261
16262 16263 16264 16265 16266 16267 16268 16269
16270 16271 16272 16273 16274 16275 16276 16277
16278 16279 16280 16281 16282 16283 16284 16285
16286 16287 16288 16289 16290 16291 16292 16293
16294 16295 16296 16297 16298 16299 16300 16301
16302 16303 16304 16305 16306 16307 16308 16309
16310 16311 16312 16313 16314 16315 16316 16317
16318 16319 16320 16321 16322 16323 16324 16325
16326 16327 16328 16329 16330 16331 16332 16333
16334 16335 16336 16337 16338 16339 16340 16341
16342 16343 16344 16345 16346 16347 16348 16349
16350 16351 16488 16489 16490 16491 16492 16493
16494 16495 16496 16497))
;; Keep a record of failures, for consulting afterwards (the ert
;; backtrace only shows a truncated version of these lists).
@ -260,28 +267,76 @@ implementations:
ucs-normalize-tests--failing-lines-part1)))
(defconst ucs-normalize-tests--failing-lines-part2
(list 17634 17635 17646 17647 17652 17653 17656 17657
17660 17661 17672 17673 17750 17751 17832 17834
17836 17837 17862 17863 17868 17869 18222 18270
18271 18368 18370 18400 18401 18402 18404 18406
18408 18410 18412 18413 18414 18416 18417 18418
18420 18421 18422 18423 18424 18426 18427 18428
18429 18430 18432 18434 18436 18438 18440 18442
18444 18446 18448 18450 18452 18454 18456 18458
18459 18460 18462 18464 18465 18466 18468 18469
18470 18472 18474 18475 18476 18478 18480 18481
18482 18484 18486 18487 18488 18490 18492 18494
18496 18498 18499 18500 18502 18504 18506 18508
18510 18512 18514 18516 18518 18520 18522 18524
18526 18528 18530 18531 18532 18533 18534 18602
18604 18606 18608 18610 18612 18614 18616 18618
18620 18622 18624 18626 18628 18630 18632 18634
18636 18638 18640 18642 18644 18646 18648 18650
18652 18654 18656 18658 18660 18662 18664 18666
18668 18670 18672 18674 18676 18678 18680 18682
18684 18686 18688 18690 18692 18694 18696 18698
18700 18702 18704 18706 18708 18710 18712 18714
18716 18718 18720 18722 18724 18726 18727))
(list 17087 17088 17089 17090 17091 17092 17093 17094
17098 17099 17100 17101 17102 17103 17104 17105
17106 17107 17108 17113 17114 17115 17116 17117
17118 17119 17120 17125 17126 17127 17128 17129
17130 17131 17132 17133 17134 17135 17136 17137
17138 17139 17140 17141 17142 17143 17144 17145
17146 17157 17158 17159 17160 17161 17162 17163
17164 17185 17186 17187 17188 17189 17190 17197
17198 17199 17200 17207 17208 17209 17210 17211
17212 17213 17214 17219 17220 17221 17222 17275
17276 17285 17286 17295 17296 17309 17310 17311
17312 17313 17314 17315 17316 17317 17318 17319
17320 17325 17326 17373 17374 17419 17420 17421
17422 17433 17434 17439 17440 17465 17466 17473
17474 17479 17480 17485 17486 17491 17492 17497
17498 17499 17500 17501 17502 17505 17506 17507
17508 17511 17512 17519 17520 17523 17524 17527
17528 17531 17532 17551 17552 17555 17556 17599
17600 17601 17602 17603 17604 17605 17607 17608
17609 17610 17611 17612 17613 17615 17617 17619
17621 17623 17625 17627 17629 17631 17632 17633
17634 17635 17636 17637 17638 17639 17640 17669
17670 17675 17676 17681 17682 17689 17690 17691
17692 17693 17694 17707 17708 17713 17714 17715
17716 17727 17728 17733 17734 17739 17740 17745
17746 17749 17750 17753 17754 17759 17760 17767
17768 17807 17808 17809 17810 17811 17812 17813
17814 17816 17843 17844 17845 17846 17851 17852
17861 17875 17876 17879 17880 17899 17900 17911
17912 17913 17914 17915 17916 17917 17918 17919
17920 17921 17922 17927 17928 17929 17930 17931
17932 17933 17935 17937 17938 17939 17940 17941
17943 17945 17947 17949 17951 17952 17953 17955
17957 17959 17961 17962 17967 17968 17987 17988
17993 17994 18003 18004 18005 18006 18007 18008
18009 18010 18011 18012 18017 18018 18019 18020
18021 18022 18023 18024 18041 18042 18053 18054
18069 18070 18079 18080 18163 18164 18165 18166
18171 18172 18175 18176 18211 18212 18219 18220
18221 18222 18223 18224 18225 18226 18301 18302
18389 18390 18391 18392 18393 18394 18397 18398
18407 18408 18439 18440 18441 18442 18443 18444
18445 18446 18447 18448 18449 18450 18451 18452
18457 18458 18459 18460 18471 18472 18479 18480
18485 18486 18499 18500 18501 18502 18509 18510
18513 18514 18515 18516 18517 18518 18519 18520
18521 18523 18524 18525 18527 18528 18531 18537
18538 18539 18541 18543 18545 18547 18549 18550
18551 18553 18554 18555 18557 18558 18559 18560
18561 18563 18564 18565 18566 18567 18569 18571
18573 18575 18577 18579 18581 18583 18585 18587
18589 18591 18593 18595 18596 18597 18599 18601
18602 18603 18605 18606 18607 18609 18611 18612
18613 18615 18617 18618 18619 18621 18623 18624
18625 18627 18629 18631 18633 18635 18636 18637
18639 18641 18643 18645 18647 18649 18651 18653
18655 18657 18659 18661 18663 18665 18667 18668
18669 18670 18671 18674 18676 18686 18688 18690
18692 18694 18695 18696 18697 18698 18699 18700
18701 18702 18703 18704 18705 18706 18707 18708
18709 18710 18721 18722 18723 18724 18739 18741
18743 18745 18747 18749 18751 18753 18755 18757
18759 18761 18763 18765 18767 18769 18771 18773
18775 18777 18779 18781 18783 18785 18787 18789
18791 18793 18795 18797 18799 18801 18803 18805
18807 18809 18811 18813 18815 18817 18819 18821
18823 18825 18827 18829 18831 18833 18835 18837
18839 18840 18841 18842 18843 18844 18845 18846
18847 18848 18849 18850 18851 18852 18853 18855
18857 18859 18861 18863 18865 18866))
(ert-deftest ucs-normalize-part2 ()
:tags '(:expensive-test)


@ -1,6 +1,6 @@
# BidiCharacterTest-13.0.0.txt
# Date: 2019-09-09, 19:32:00 GMT [LI]
# © 2019 Unicode®, Inc.
# BidiCharacterTest-14.0.0.txt
# Date: 2020-03-30, 23:56:00 GMT [LI]
# © 2020 Unicode®, Inc.
# For terms of use, see http://www.unicode.org/terms_of_use.html
#
# Unicode Character Database
@ -87,6 +87,32 @@
0661 0028 0662 0029 0331;0;0;2 1 2 1 1;4 3 2 1 0
0661 0028 0332 0662 0029 0333;0;0;2 1 1 2 1 1;5 4 3 2 1 0
# Nonspacing marks applied to paired brackets [added to test cases for Unicode 14.0]
# These cases exercise the ignoring of bc=BN characters (such as ZWJ or ZWSP)
# that appear between the base bracket character and the nonspacing mark,
# in a context where the brackets have been forced to a strong R direction.
#
# Note that due to an implementation error in the N0 rule in the Bidi Reference C
# test code for UBA 8.0, versions of that reference test code through UBA 12.0 will fail for
# precisely these newly added tests. The bug in the implementation of the N0 rule in the Bidi Reference C
# test code was fixed for Unicode 13.0, and that updated test code now performs correctly
# for all versions of UBA.
#
# These test cases first test a combining mark following a ZWJ after the trailing bracket of a pair:
0041 200F 005B 05D0 005D 200D 20D6;0;0;0 1 1 1 1 x 1;0 6 4 3 2 1
0041 200F 005B 05D0 005D 200D 20D6;1;1;2 1 1 1 1 x 1;6 4 3 2 1 0
# Then a combining mark following a ZWJ after the leading bracket of a pair:
0041 200F 005B 200D 20D6 05D0 005D;0;0;0 1 1 x 1 1 1;0 6 5 4 2 1
0041 200F 005B 200D 20D6 05D0 005D;1;1;2 1 1 x 1 1 1;6 5 4 2 1 0
# Then a combining mark following a ZWJ after both brackets of a pair:
0041 200F 005B 200D 20D6 05D0 005D 200D 20D6;0;0;0 1 1 x 1 1 1 x 1;0 8 6 5 4 2 1
0041 200F 005B 200D 20D6 05D0 005D 200D 20D6;1;1;2 1 1 x 1 1 1 x 1;8 6 5 4 2 1 0
# Then the intervention of a ZWSP in these same sequences.
# (The ZWSP formally breaks the combining character sequence, but should
# not block the identification of the combining mark for the application of rule N0.)
0041 200F 005B 200D 200B 20D6 05D0 005D 200B 200D 20D6;0;0;0 1 1 x x 1 1 1 x x 1;0 10 7 6 5 2 1
0041 200F 005B 200D 200B 20D6 05D0 005D 200B 200D 20D6;1;1;2 1 1 x x 1 1 1 x x 1;10 7 6 5 2 1 0
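For readers unfamiliar with the data lines above, here is a minimal Python sketch of how one line of BidiCharacterTest.txt splits into its five semicolon-separated fields. It is not part of this commit, and it is not how Emacs's biditest.el reads the file. The field meanings follow the header of the Unicode data file: code points, paragraph direction (0 = LTR, 1 = RTL, 2 = auto-LTR), resolved paragraph embedding level, resolved levels (with `x` for characters the algorithm removes), and the visual order of the remaining characters. The function name `parse_bidi_test_line` and the dictionary keys are hypothetical, chosen only for illustration.

```python
def parse_bidi_test_line(line):
    """Parse one data line of BidiCharacterTest.txt into a dict.

    Assumes the five-field layout described in the file's header;
    comment lines (starting with '#') and blank lines are not handled.
    """
    fields = line.strip().split(";")
    if len(fields) != 5:
        raise ValueError("expected 5 semicolon-separated fields")

    # Field 0: the input text as hexadecimal code points.
    code_points = [int(cp, 16) for cp in fields[0].split()]
    # Field 1: requested paragraph direction (0, 1, or 2).
    paragraph_direction = int(fields[1])
    # Field 2: resolved paragraph embedding level.
    paragraph_level = int(fields[2])
    # Field 3: one resolved level per input character; 'x' marks a
    # character that has no level, stored here as None.
    levels = [None if tok == "x" else int(tok) for tok in fields[3].split()]
    # Field 4: logical indices listed in visual order, left to right;
    # characters whose level is 'x' are omitted from this list.
    visual_order = [int(i) for i in fields[4].split()]

    # Sanity checks on the internal consistency of the line.
    if len(levels) != len(code_points):
        raise ValueError("level count does not match code point count")
    kept = sum(1 for lvl in levels if lvl is not None)
    if len(visual_order) != kept:
        raise ValueError("visual order does not match non-'x' level count")

    return {
        "code_points": code_points,
        "paragraph_direction": paragraph_direction,
        "paragraph_level": paragraph_level,
        "levels": levels,
        "visual_order": visual_order,
    }
```

Applied to the first of the new test lines, the ZWJ (U+200D) at logical index 5 gets level `x`, so index 5 is absent from the visual-order field, which lists six indices for seven code points.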
# Nested bracket pairs that reach and exceed the fixed capacity of the bracket stack
# a ( ( ... ( b ) ) ... ) with 62, 63, and 64 nested bracket pairs
0061 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0028 0062 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029 0029;1;1;2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2;0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125