Some files in this directory are taken from the Unicode Character
Database and the Unicode Ideographic Variation Database. These files
are governed by the Unicode Terms of Use contained in the file
copyright.html.

The names and URLs for these files are as follows. Each file (with the
exception of UnicodeData.txt) contains the date at which Unicode last
updated it.

BidiBrackets.txt
https://www.unicode.org/Public/UNIDATA/BidiBrackets.txt
BidiMirroring.txt
https://www.unicode.org/Public/UNIDATA/BidiMirroring.txt
Blocks.txt
https://www.unicode.org/Public/UNIDATA/Blocks.txt
IVD_Sequences.txt (accessed via the date in the 'Version' column)
https://www.unicode.org/ivd/
NormalizationTest.txt
https://www.unicode.org/Public/UNIDATA/NormalizationTest.txt
SpecialCasing.txt
https://unicode.org/Public/UNIDATA/SpecialCasing.txt
UnicodeData.txt
https://www.unicode.org/Public/UNIDATA/UnicodeData.txt
emoji-data.txt
https://www.unicode.org/Public/UNIDATA/emoji/emoji-data.txt
emoji-zwj-sequences.txt
https://www.unicode.org/Public/emoji/latest/emoji-zwj-sequences.txt
emoji-sequences.txt
https://www.unicode.org/Public/emoji/latest/emoji-sequences.txt
emoji-test.txt
https://www.unicode.org/Public/emoji/latest/emoji-test.txt
emoji-variation-sequences.txt
https://www.unicode.org/Public/UNIDATA/emoji/emoji-variation-sequences.txt
ScriptExtensions.txt
https://www.unicode.org/Public/UCD/latest/ucd/ScriptExtensions.txt
Scripts.txt
https://www.unicode.org/Public/UCD/latest/ucd/Scripts.txt
PropertyValueAliases.txt
https://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt
IdnaMappingTable.txt
https://www.unicode.org/Public/idna/latest/IdnaMappingTable.txt