emacs

mirror of git://git.sv.gnu.org/emacs.git synced 2026-05-25 22:59:44 -07:00

Author	SHA1	Message	Date
Robert Pluim	de289d58a4	Support for Unicode emoji sequences This covers both sequences using Zero-Width-Joiner codepoints and those without. Bug#39799, I hope. * .gitignore: Add emoji-zwj.el * admin/notes/unicode: Add emoji-zwj-sequences.txt and emoji-sequences.txt references. Describe how to test after updating to a newer Unicode version. * admin/unidata/Makefile.in (all): add emoji-zwj.el as a dependency. (emoji-zwj.el): Add target plus rules for building. (gen-clean): Add emoji-zwj.el. * admin/unidata/README: Add emoji-zwj-sequences.txt and emoji-sequences.txt references. * admin/unidata/blocks.awk: Force emoji script to be used for certain codepoints that are used by the Unicode sequences. * admin/unidata/emoji-sequences.txt: New file. * admin/unidata/emoji-zwj-sequences.txt: New file. * admin/unidata/emoji-zwj.awk: New file. Derives composition-function-table rules from emoji-zwj-sequences.txt, plus hardcodes some derived manually from emoji-sequences.txt. * etc/NEWS: Announce change. * lisp/international/characters.el: Load the generated emoji-zwj.el * src/Makefile.in (emoji-zwj): New target. (temacs): Add emoji-zwj as a dependency.	2021-09-20 22:35:34 +02:00
Glenn Morris	6825f5660f	; admin/unidata/README: remove mistaken addition of local file	2021-09-20 10:47:02 -07:00
Glenn Morris	ab676214bd	; admin/unidata/README: sort entries	2021-09-20 08:43:10 -07:00
Glenn Morris	58f3370091	; admin/unidata/README: update file dates I'm not sure how useful it is to keep this information in the README. Also, add missing EastAsianWidth.txt.	2021-09-20 08:41:56 -07:00
Robert Pluim	12d2fb58c4	Split Unicode emoji into their own script * admin/notes/unicode: Describe how to update emoji for new Unicode release. * admin/unidata/Makefile.in: Pass emoji-data.txt to blocks.awk script. * admin/unidata/README: Add pointer to emoji-data.txt file. * admin/unidata/blocks.awk: Parse emoji-data.txt, add emoji codepoints to the 'emoji' script (except for the ASCII ones). * admin/unidata/emoji-data.txt: New file. * etc/NEWS: Describe new 'emoji' script. * etc/TODO: Update item about 'emoji' script. * lisp/international/fontset.el (script-representative-chars): Add 'emoji' script. (setup-default-fontset): Add 'emoji' script. Use "Noto Color Emoji" as default font for it.	2021-09-17 14:45:44 +02:00
Eli Zaretskii	fd3bcfa36e	Update Unicode data and files to Unicode 10.0 * admin/notes/unicode: * admin/unidata/README: * admin/unidata/BidiBrackets.txt: * admin/unidata/BidiMirroring.txt: * admin/unidata/Blocks.txt: * admin/unidata/IVD_Sequences.txt: * admin/unidata/NormalizationTest.txt: * admin/unidata/SpecialCasing.txt: * admin/unidata/UnicodeData.txt: * lisp/international/characters.el: * lisp/international/fontset.el (script-representative-chars): * lisp/international/mule-cmds.el (ucs-names): Update per Unicode 10.0.	2017-07-08 13:02:47 +03:00
Michal Nazarewicz	b3b9b258c4	Support casing characters which map into multiple code points (bug#24603) Implement unconditional special casing rules defined in Unicode standard. Among other things, they deal with cases when a single code point is replaced by multiple ones because single character does not exist (e.g. ‘ﬁ’ ligature turning into ‘FI’) or is not commonly used (e.g. ß turning into SS). * admin/unidata/SpecialCasing.txt: New data file pulled from Unicode standard distribution. * admin/unidata/README: Mention SpecialCasing.txt. * admin/unidata/unidata-gen.el (unidata-gen-table-special-casing, unidata-gen-table-special-casing--do-load): New functions generating ‘special-uppercase’, ‘special-lowercase’ and ‘special-titlecase’ character Unicode properties built from the SpecialCasing.txt Unicode data file. * src/casefiddle.c (struct casing_str_buf): New structure for representing short strings used to handle one-to-many character mappings. (case_character_impl): New function which can handle one-to-many character mappings. (case_character, case_single_character): Wrappers for the above functions. The former may map one character to multiple (or no) code points while the latter does what the former used to do (i.e. handles one-to-one mappings only). (do_casify_natnum, do_casify_unibyte_string, do_casify_unibyte_region): Use case_single_character. (do_casify_multibyte_string, do_casify_multibyte_region): Support new features of case_character. * (do_casify_region): Updated to reflect do_casify_multibyte_string changes. (casify_word): Handle situation when one character-length of a word can change affecting where end of the word is. (upcase, capitalize, upcase-initials): Update documentation to mention limitations when working on characters. * test/src/casefiddle-tests.el (casefiddle-tests-char-properties): Add test cases for the newly introduced character properties. (casefiddle-tests-casing): Update test cases which are now passing. * test/lisp/char-fold-tests.el (char-fold--ascii-upcase, char-fold--ascii-downcase): New functions which behave like old ‘upcase’ and ‘downcase’. (char-fold--test-match-exactly): Use the new functions. This is needed because otherwise ﬁ and similar characters are turned into their multi-character representation. * doc/lispref/strings.texi: Describe issue with casing characters versus strings. * doc/lispref/nonascii.texi: Describe the new character properties.	2017-04-06 20:54:58 +02:00
Noam Postavsky	eed3b46ca1	Add tests for ucs-normalize.el Some tests are marked as expected to fail. * test/lisp/international/ucs-normalize-tests.el: New tests. * admin/unidata/NormalizationTest.txt: Add data for tests. * admin/unidata/README: Add URL for NormalizationTest.txt. * admin/notes/unicode: Add note about running (and updating the data for) the new tests. Remove note about normalization being unsupported.	2016-07-16 13:01:04 -04:00
Glenn Morris	d67d49ceb3	Generate char-script-table from Unicode source. (Bug#20789) * admin/unidata/Makefile.in (AWK): New, set by configure. (all): Add charscript.el. (blocks): New variable. (charscript.el, ${unidir}/charscript.el): New targets. (extraclean): Also remove generated charscript.el. * admin/unidata/blocks.awk: New script. * admin/unidata/Blocks.txt: New data file, from unicode.org. * lisp/international/characters.el: Load charscript. * src/Makefile.in (charscript): New variable. (${charscript}): New target. (${lispintdir}/characters.elc): Depend on charscript.elc. (temacs$(EXEEXT)): Depend on charscript. ; * admin/unidata/README: Mention Blocks.txt. ; * .gitignore: Add lisp/international/charscript.el.	2015-06-16 23:43:03 -07:00
Glenn Morris	38852a7695	Update admin/unidata data files to latest versions * admin/unidata/BidiMirroring.txt: Update to 7.0.0 (only comment changes). * admin/unidata/UnicodeData.txt: Update to 7.0.0. * admin/unidata/IVD_Sequences.txt: Update to 2014-05-16 version. * admin/unidata/README: Update for above changes.	2014-06-21 15:06:04 -07:00
Paul Eggert	cf2f54c4e3	Include sources used to create macuvs.h. * admin/unidata/IVD_Sequences.txt: New file. * admin/unidata/Makefile.in (${top_srcdir}/src/macuvs.h): New rule. (all): Build it. (extraclean): Remove it. * admin/unidata/README: Mention BidiMirroring.txt and IVD_Sequences.txt. * admin/unidata/copyright.html: Update to current version from Unicode Consortium. * admin/unidata/uvs.el: Rename from admin/mac/uvs.el. (uvs-print-table-ivd): Output a header in the form that unidata-gen.el generates. * lisp/international/README: Refer to the Unicode Terms of Use rather than copying it bodily here, as that simplifies maintenance. * src/Makefile.in ($(srcdir)/macuvs.h): New rule. * src/macuvs.h: Use automatically-generated header.	2014-05-26 08:48:28 -07:00
Eli Zaretskii	b142f1584c	Update the Unicode database and derived files for Unicode 6.1. admin/unidata/README: admin/unidata/copyright.html: admin/unidata/BidiMirroring.txt: admin/unidata/UnicodeData.txt: Update for the latest version 6.1 of the Unicode Standard. lisp/international/uni-bidi.el: lisp/international/uni-category.el: lisp/international/uni-combining.el: lisp/international/uni-decimal.el: lisp/international/uni-decomposition.el: lisp/international/uni-digit.el: lisp/international/uni-lowercase.el: lisp/international/uni-mirrored.el: lisp/international/uni-name.el: lisp/international/uni-numeric.el: lisp/international/uni-titlecase.el: lisp/international/uni-uppercase.el: Update for Unicode 6.1.	2012-04-07 17:26:14 +03:00
Kenichi Handa	5ccee76964	* empty log message *	2009-10-13 05:17:40 +00:00
Kenichi Handa	203c8e22b8	Adjusted for Unicode 5.0.	2006-08-21 03:18:42 +00:00
Kenichi Handa	bf903420b4	* empty log message *	2005-05-07 02:55:01 +00:00
Kenichi Handa	f600cf3af9	New file.	2005-01-30 11:22:05 +00:00

16 commits