* src/character.c (syms_of_character) <ambiguous-width-chars>: New
char-table.
* lisp/international/characters.el (ambiguous-width-chars): Fill
the table.
(update-cjk-ambiguous-char-widths): New function.
(cjk-ambiguous-chars-are-wide): New defcustom, uses
'update-cjk-ambiguous-char-widths' as its :set function.
(use-cjk-char-width-table): Obey 'cjk-ambiguous-chars-are-wide' by
adding another child char-table for ambiguous-width characters,
where the width is set according to the option.
* lisp/language/chinese.el ("Chinese-GB", "Chinese-BIG5")
("Chinese-CNS", "Chinese-EUC-TW", "Chinese-GBK"):
* lisp/language/japanese.el ("Japanese"):
* lisp/language/korean.el ("Korean"): Add new language-info slot
'cjk-locale-symbol'.
Bug#64420
* src/character.c (Fmax_char): Accept an optional argument
UNICODE, and, if non-nil, return the maximum codepoint defined by
Unicode.
* lisp/emacs-lisp/comp.el (comp-known-type-specifiers): Update the
signature of 'max-char'.
* etc/NEWS:
* doc/lispref/nonascii.texi (Character Codes): Update the
documentation of 'max-char'.
* src/character.h (str_to_multibyte):
* src/character.c (str_to_multibyte): Change signature and simplify;
the conversion is no longer done in-place.
* src/fns.c (string_to_multibyte): Drop temporary buffer and memcpy;
adapt to new str_to_multibyte signature.
* src/print.c (print_string): Drop memcpy; adapt call to str_to_multibyte.
* test/src/fns-tests.el (fns--string-to-unibyte): Rename to...
(fns--string-to-unibyte-multibyte): ... this and strengthen, so that
the test covers string-to-multibyte reasonably well.
This function is used in many places to calculate the length of
a unibyte string converted to multibyte.
* src/character.c (count_size_as_multibyte): Move the overflow test
outside the loop, which makes it much faster. Standard compilers
will even vectorise it if asked to (-O2 in Clang, -O3 in GCC).
* src/composite.c (find_automatic_composition): Accept one
additional argument BACKLIM; don't look back in buffer or string
farther than that. Add an assertion for BACKLIM.
(composition_adjust_point, Ffind_composition_internal): Callers
adjusted.
* src/composite.h (find_automatic_composition): Adjust prototype.
* src/character.c (lisp_string_width): Call
'find_automatic_composition' with the value of BACKLIM equal to POS,
to avoid costly and unnecessary search back in the string, since
those previous characters were already checked for automatic
compositions. (Bug#48734) (Bug#48839)
'lisp_string_width' is called from 'format' and 'format-message',
which can be called both very early into Emacs initialization and in
other contexts where using the font backend is impossible or
undesirable. So this commit changes 'lisp_string_width' to try
accounting for automatic compositions only when explicitly requested,
and only 'string-width' does that; 'format' and 'format-message'
don't.
* src/character.c (lisp_string_width): Accept an additional
argument AUTO_COMP; attempt accounting for auto-compositions only
if that argument is non-zero. (Bug#48732)
* src/editfns.c (styled_format):
* src/character.c (Fstring_width): Callers of 'lisp_string_width'
adjusted.
* src/character.c (lisp_string_width): Compute C pointer to data
of STRING immediately before using it, since STRING could be
relocated by GC triggered by processing compositions. (Bug#48711)
* src/character.c (lisp_string_width): Compute the width when
automatic compositions can happen more accurately, by using the
pixel widths of the grapheme clusters, divided by the default
face's font width. Disregard the current state of
'auto-composition-mode', for consistency with 'current-column' .
* src/composite.c (find_automatic_composition): Now extern.
(char_composable_p): Don't assume 'unicode-category-table' is
always available.
* src/composite.h (find_automatic_composition): Add prototype.
* src/character.c (lisp_string_width): Support automatic
compositions; call 'find_automatic_composition' when
'auto-composition-mode' is ON.
* src/character.c (Fstring_width, lisp_string_width): Accept two
optional arguments FROM and TO, to indicate the substring to be
considered.
(Fstring_width): Add caveats in the doc string about display
features ignored by the function. (Bug#47712)
* src/character.h (lisp_string_width): Update prototype.
* src/editfns.c (styled_format): Adjust call of lisp_string_width
to its changed signature.
* test/src/character-tests.el (character-test-string-width): New
file with tests for 'string-width'.
* doc/lispref/display.texi (Size of Displayed Text): Document
caveats of using 'string-width'.
* etc/NEWS: Announce the change.
The variable now only controls whether characters are printed, not
the radix. Control chars are printed in human-readable syntax
only when special escapes such as ?\n are available. Spaces,
formatting and combining chars are excluded (bug#44155).
Done in collaboration with Juri Linkov.
* src/character.c (graphic_base_p):
* src/print.c (named_escape): New functions.
(print_object): Change semantics as described above.
(syms_of_print): Rename integer-output-format. Update doc string.
* doc/lispref/streams.texi (Output Variables):
* etc/NEWS:
* test/src/print-tests.el (print-integers-as-characters):
Rename and update according to new semantics. The test now passes.
This tweak improved the CPU time performance of
‘make compile-always’ by about 1.7% on my platform.
* src/character.c (string_char): Remove; no longer used.
* src/character.h (string_char_and_length): Redo so that it
needn’t call string_char. This helps the caller, which can now
become a leaf function.
* src/character.c (parse_str_as_multibyte, str_as_multibyte):
Let the fast loop run a little longer, fixing what appears
to be an off-by-1 performance nit.
* src/character.h (MULTIBYTE_LENGTH, MULTIBYTE_LENGTH_NO_CHECK):
Remove, replacing with ...
(multibyte_length): ... this new function. All callers changed.
The new function rejects overlong multibyte forms.
* test/src/buffer-tests.el (buffer-multibyte-overlong-sequences):
New test.
* src/buffer.h (fetch_char_advance, fetch_char_advance_no_check)
(buf_next_char_len, next_char_len, buf_prev_char_len)
(prev_char_len, inc_both, dec_both): New inline functions,
replacing the old character.h macros FETCH_CHAR_ADVANCE,
FETCH_CHAR_ADVANCE_NO_CHECK, BUF_INC_POS, INC_POS, BUF_DEC_POS,
DEC_POS, INC_BOTH, DEC_BOTH respectively. All callers changed.
These new functions all assume buffer primitives and so need
to be here rather than in character.h.
* src/casefiddle.c (make_char_unibyte): New static function,
replacing the old MAKE_CHAR_UNIBYTE macro. All callers changed.
(do_casify_unibyte_string): Use SINGLE_BYTE_CHAR_P instead
of open-coding it.
* src/ccl.c (GET_TRANSLATION_TABLE): New static function,
replacing the old macro of the same name.
* src/character.c (string_char): Omit 2nd arg. 3rd arg can no
longer be NULL. All callers changed.
* src/character.h (SINGLE_BYTE_CHAR_P): Move up.
(MAKE_CHAR_UNIBYTE, MAKE_CHAR_MULTIBYTE, PREV_CHAR_BOUNDARY)
(STRING_CHAR_AND_LENGTH, STRING_CHAR_ADVANCE)
(FETCH_STRING_CHAR_ADVANCE)
(FETCH_STRING_CHAR_AS_MULTIBYTE_ADVANCE)
(FETCH_STRING_CHAR_ADVANCE_NO_CHECK, FETCH_CHAR_ADVANCE)
(FETCH_CHAR_ADVANCE_NO_CHECK, INC_POS, DEC_POS, INC_BOTH)
(DEC_BOTH, BUF_INC_POS, BUF_DEC_POS): Remove.
(make_char_multibyte): New static function, replacing
the old macro MAKE_CHAR_MULTIBYTE. All callers changed.
(CHAR_STRING_ADVANCE): Remove; all callers changed to use
CHAR_STRING.
(NEXT_CHAR_BOUNDARY): Remove; it was unused.
(raw_prev_char_len): New inline function, replacing the
old PREV_CHAR_BOUNDARY macro. All callers changed.
(string_char_and_length): New inline function, replacing the
old STRING_CHAR_AND_LENGTH macro. All callers changed.
(STRING_CHAR): Rewrite in terms of string_char_and_length.
(string_char_advance): New inline function, replacing the old
STRING_CHAR_ADVANCE macro. All callers changed.
(fetch_string_char_advance): New inline function, replacing the
old FETCH_STRING_CHAR_ADVANCE macro. All callers changed.
(fetch_string_char_as_multibyte_advance): New inline function,
replacing the old FETCH_STRING_CHAR_AS_MULTIBYTE_ADVANCE macro.
All callers changed.
(fetch_string_char_advance_no_check): New inline function,
replacing the old FETCH_STRING_CHAR_ADVANCE_NO_CHECK macro. All
callers changed.
* src/regex-emacs.c (HEAD_ADDR_VSTRING): Remove; no longer used.
* src/syntax.c (scan_lists): Use dec_bytepos instead of
open-coding it.
* src/xdisp.c (string_char_and_length): Rename from
string_char_and_length to avoid name conflict with new function in
character.h. All callers changed.
* src/bignum.c (check_integer_range, check_uinteger_max)
(check_int_nonnegative): New functions.
* src/frame.c (check_frame_pixels): New function.
(Fset_frame_height, Fset_frame_width, Fset_frame_size): Use it.
* src/lisp.h (CHECK_RANGED_INTEGER, CHECK_TYPE_RANGED_INTEGER):
Remove these macros. Unless otherwise specified, all callers
replaced by calls to check_integer_range, check_uinteger_range,
check_int_nonnegative.
* src/frame.c (gui_set_right_divider_width)
(gui_set_bottom_divider_width):
* src/nsfns.m (ns_set_internal_border_width):
* src/xfns.c (x_set_internal_border_width):
Using check_int_nonnegative means these functions no longer
incorrectly reject negative bignums; they treat them as 0,
just like negative fixnums.
* src/character.c (Fstring, Funibyte_string):
Redo to avoid the need for a temporary array allocation
and then a copying from that array to the destination.
If a position argument to get-byte etc. is an out-of-range integer,
treat it the same regardless of whether it is a fixnum or a bignum.
* src/buffer.c (fix_position): New function.
* src/buffer.c (validate_region):
* src/character.c (Fget_byte):
* src/coding.c (Ffind_coding_systems_region_internal)
(Fcheck_coding_systems_region):
* src/composite.c (Ffind_composition_internal):
* src/editfns.c (Fposition_bytes, Fchar_after, Fchar_before)
(Finsert_buffer_substring, Fcompare_buffer_substrings)
(Fnarrow_to_region):
* src/fns.c (Fsecure_hash_algorithms):
* src/font.c (Finternal_char_font, Ffont_at):
* src/fringe.c (Ffringe_bitmaps_at_pos):
* src/search.c (search_command):
* src/textprop.c (get_char_property_and_overlay):
* src/window.c (Fpos_visible_in_window_p):
* src/xdisp.c (Fwindow_text_pixel_size):
Use it instead of CHECK_FIXNUM_COERCE_MARKER, so that
the code is simpler and treats bignums consistently with fixnums.
* src/buffer.h (CHECK_FIXNUM_COERCE_MARKER): Define here
rather than in lisp.h, and reimplement in terms of fix_position
so that it treats bignums consistently with fixnums.
* src/lisp.h (CHECK_FIXNUM_COERCE_MARKER): Move to buffer.h.
* src/textprop.c (validate_interval_range): Signal with original
bounds rather than modified ones.
The old code assumed that hexdigit initialization (needed by
non-GCC) could be done in syms_of_character, but that is no longer
true with pdumper. Instead, simplify hexdigit init so that it can
be done statically on all C99 platforms. Problem discovered on
Solaris 10 sparc + Oracle Solaris Studio 12.6.
* src/character.c (hexdigit): Add 1 to every value; all uses
changed. This simplifies the initialization so that it can be
done statically on any C99 compiler. hexdigit is now always const.
(syms_of_character): Omit no-longer-necessary initialization.
* src/character.h (HEXDIGIT_CONST, HEXDIGIT_IS_CONST):
Remove. All uses removed.
This mostly changes http: to https: in URLs. It also updates
some URLs that have moved, removes some URLs that no longer
work, recommends against using procmail (procmail.org no
longer works), and removes some mentions of the
no-longer-existing Gmane, LPF and VTW.
It doesn't update all URLs, just the ones I had time for.
* GNUmakefile (help):
* admin/admin.el (manual-doctype-string):
* admin/charsets/Makefile.in (${charsetdir}/ALTERNATIVNYJ.map):
* admin/charsets/mapconv:
* lisp/net/soap-client.el (soap-create-envelope):
* lisp/org/org.el (org-doi-server-url):
* lisp/textmodes/bibtex.el (bibtex-generate-url-list):
Prefer https: to http: un URLs.