* fns.c (hash_string): New function, taken from sxhash_string.
Do not discard information about ASCII character case; this
discarding is no longer needed.
(sxhash-string): Use it. Change sig to match it. Caller changed.
* lisp.h: Declare it.
* lread.c (hash_string): Remove, since we now use fns.c's version.
The fns.c version returns a wider integer if --with-wide-int is
specified, so this should help the quality of the hashing a bit.
* fns.c (Frandom): Use GET_EMACS_TIME for random seed, and add the
subseconds part to the entropy, as that's a bit more random.
Prefer signed to unsigned, since the signedness doesn't matter and
in general we prefer signed. When given a limit, use a
denominator equal to INTMASK + 1, not to VALMASK + 1, because the
latter isn't right if USE_2_TAGS_FOR_INTS.
* sysdep.c (get_random): Return a value in the range 0..INTMASK,
not 0..VALMASK. Don't discard "excess" bits that random () returns.
* composite.c (find_automatic_composition): Omit needless 'return 0;'
that Sun C diagnosed.
* fns.c (secure_hash): Fix pointer signedness issue.
* intervals.c (static_offset_intervals): New function.
(offset_intervals): Use it.
(Fsafe_length): Return a float if the value is not representable
as a fixnum. This shouldn't happen except in contrived situations.
Use same QUIT_COUNT_HEURISTIC as Flength now does.
Use EMACS_INT, not int, to avoid unwanted truncation on 64-bit hosts.
Check for QUIT every 1024 entries rather than every other entry;
that's faster and is responsive enough. Report an error instead of
overflowing an integer.
The previous code was bogus. For example, next_almost_prime (32)
returned 39, which is undesirable as it is a multiple of 3; and
next_almost_prime (24) returned 25, which is a multiple of 5 so
why was the code bothering to check for multiples of 7?
This partly undoes my 2011-03-30 change, which replaced int with size_t.
Back then I didn't know that the Emacs coding style prefers signed int.
Also, in the meantime I found a few more instances where arguments
were being counted with int, which may truncate counts on 64-bit
machines, or EMACS_INT, which may be unnecessarily wide.
* lisp.h (struct Lisp_Subr.function.aMANY)
(DEFUN_ARGS_MANY, internal_condition_case_n, safe_call):
Arg counts are now ptrdiff_t, not size_t.
All variadic functions and their callers changed accordingly.
(struct gcpro.nvars): Now size_t, not size_t. All uses changed.
* bytecode.c (exec_byte_code): Check maxdepth for overflow,
to avoid potential buffer overrun. Don't assume arg counts fit in 'int'.
* callint.c (Fcall_interactively): Check arg count for overflow,
to avoid potential buffer overrun. Use signed char, not 'int',
for 'varies' array, so that we needn't bother to check its size
calculation for overflow.
* editfns.c (Fformat): Use ptrdiff_t, not EMACS_INT, to count args.
* eval.c (apply_lambda):
* fns.c (Fmapconcat): Use XFASTINT, not XINT, to get args length.
(struct textprop_rec.argnum): Now ptrdiff_t, not int. All uses changed.
(mapconcat): Use ptrdiff_t, not int and EMACS_INT, to count args.
This doesn't fix any bugs. Use int to hold character, instead
of constantly refetching from Emacs object. Use XFASTINT, not
XINT, for value known to be a character. Don't bother comparing
a single byte to 0400, as it's always less.
Otherwise, CHAR_STRING would do the wrong thing on a 64-bit platform,
by silently ignoring the top 32 bits, allowing some values
that were far too large to be valid characters.
* character.h: Include <verify.h>.
(CHAR_STRING, CHAR_STRING_ADVANCE): Verify that the character
arguments are no wider than unsigned, as a compile-time check
to prevent future regressions in this area.
* data.c (Faset):
* editfns.c (Fchar_to_string, general_insert_function, Finsert_char):
(Fsubst_char_in_region):
* fns.c (concat):
* xdisp.c (decode_mode_spec_coding):
Adjust to CHAR_STRING's new requirement.
* editfns.c (Finsert_char, Fsubst_char_in_region):
* fns.c (concat): Check that character args are actually
characters. Without this test, these functions did the wrong
thing with wildly out-of-range values on 64-bit hosts.
* category.c (hash_get_category_set): Use 'EMACS_UINT' and 'EMACS_INT'
for hashes and hash indexes, instead of 'unsigned' and 'int'.
* ccl.c (ccl_driver): Likewise.
* charset.c (Fdefine_charset_internal): Likewise.
* charset.h (struct charset.hash_index): Likewise.
* composite.c (get_composition_id, gstring_lookup_cache):
(composition_gstring_put_cache): Likewise.
* composite.h (struct composition.hash_index): Likewise.
* dispextern.h (struct image.hash): Likewise.
* fns.c (next_almost_prime, larger_vector, cmpfn_eql):
(cmpfn_equal, cmpfn_user_defined, hashfn_eq, hashfn_eql):
(hashfn_equal, hashfn_user_defined, make_hash_table):
(maybe_resize_hash_table, hash_lookup, hash_put):
(hash_remove_from_table, hash_clear, sweep_weak_table, SXHASH_COMBINE):
(sxhash_string, sxhash_list, sxhash_vector, sxhash_bool_vector):
(Fsxhash, Fgethash, Fputhash, Fmaphash): Likewise.
* image.c (make_image, search_image_cache, lookup_image):
(xpm_put_color_table_h): Likewise.
* lisp.h (struct Lisp_Hash_Table): Likewise, for 'count', 'cmpfn',
and 'hashfn' members.
* minibuf.c (Ftry_completion, Fall_completions, Ftest_completion):
Likewise.
* print.c (print): Likewise.
* alloc.c (allocate_vectorlike): Check for overflow in vector size
calculations.
* ccl.c (ccl_driver): Check for overflow when converting EMACS_INT
to int.
* fns.c, image.c: Remove unnecessary static decls that would otherwise
need to be updated by these changes.
* fns.c (make_hash_table, maybe_resize_hash_table): Check for integer
overflow with large hash tables.
(make_hash_table, maybe_resize_hash_table, Fmake_hash_table):
Prefer the faster XFLOAT_DATA to XFLOATINT where either will do.
(SXHASH_REDUCE): New macro.
(sxhash_string, sxhash_list, sxhash_vector, sxhash_bool_vector):
Use it instead of discarding useful hash info with large hash values.
(sxhash_float): New function.
(sxhash): Use it. No more need for "& INTMASK" due to above changes.
* lisp.h (FIXNUM_BITS): New macro, useful for SXHASH_REDUCE etc.
(MOST_NEGATIVE_FIXNUM, MOST_POSITIVE_FIXNUM, INTMASK): Rewrite
to use FIXNUM_BITS, as this simplifies things.
(next_almost_prime, larger_vector, sxhash, hash_lookup, hash_put):
Adjust signatures to match updated version of code.
(consing_since_gc): Now EMACS_INT, since a single hash table can
use more than INT_MAX bytes.
* fns.c (to_uchar): Define.
(crypto_hash_function): Use it to convert some newly-signed
variables to unsigned, to avoid sign-extension bugs. For example,
without this change, (md5 "truc") would evaluate to
45723a2aff78ff4fff7fff1114760e62 rather than the expected
45723a2af3788c4ff17f8d1114760e62. Reported by Antoine Levitt in
http://thread.gmane.org/gmane.emacs.devel/139824
* character.c, character.h (count_size_as_multibyte):
Renamed from parse_str_to_multibyte; all uses changed.
Check for integer overflow.
* insdel.c, lisp.h (count_size_as_multibyte): Remove,
since it's now a duplicate of the other. This is more of
a character than a buffer op, so better that it's in character.c.
* fns.c, print.c: Adjust to above changes.
GCC 4.6.0 optimizes based on type-based alias analysis. For
example, if b is of type struct buffer * and v of type struct
Lisp_Vector *, then gcc -O2 was incorrectly assuming that &b->size
!= &v->size, and therefore "v->size = 1; b->size = 2; return
v->size;" must therefore return 1. This assumption is incorrect
for Emacs, since it type-puns struct Lisp_Vector * with many other
types. To fix this problem, this patch adds a new type struct
vector_header that documents the constraints on layout of vectors
and pseudovectors, and helps optimizing compilers not get fooled
by Emacs's type punning. It also adds the macros XSETTYPED_PVECTYPE
XSETTYPED_PSEUDOVECTOR, TYPED_PSEUDOVECTORP, for similar reasons.
* lisp.h (XVECTOR_SIZE): New convenience macro. All previous uses of
XVECTOR (foo)->size replaced to use this macro, to avoid the hassle
of writing XVECTOR (foo)->header.size.
(XVECTOR_HEADER_SIZE): New macro, for use in XSETPSEUDOVECTOR.
(XSETTYPED_PVECTYPE): New macro, specifying the name of the size
member.
(XSETPVECTYPE): Rewrite in terms of new macro.
(XSETPVECTYPESIZE): New macro, specifying both type and size.
This is a bit clearer, and further avoids the possibility of
undesirable aliasing.
(XSETTYPED_PSEUDOVECTOR): New macro, specifying the size.
(XSETPSEUDOVECTOR): Rewrite in terms of XSETTYPED_PSEUDOVECTOR
and XVECTOR_HEADER_SIZE.
(XSETSUBR): Rewrite in terms of XSETTYPED_PSEUDOVECTOR and XSIZE,
since Lisp_Subr is a special case (no "next" field).
(ASIZE): Rewrite in terms of XVECTOR_SIZE.
(struct vector_header): New type.
(TYPED_PSEUDOVECTORP): New macro, also specifying the C type of the
object, to help avoid aliasing.
(PSEUDOVECTORP): Rewrite in terms of TYPED_PSEUDOVECTORP.
(SUBRP): Likewise, since Lisp_Subr is a special case.
* lisp.h (struct Lisp_Vector, struct Lisp_Char_Table):
(struct Lisp_Sub_Char_Table, struct Lisp_Bool_Vector):
(struct Lisp_Hash_Table): Combine first two members into a single
struct vector_header member. All uses of "size" and "next" members
changed to be "header.size" and "header.next".
* buffer.h (struct buffer): Likewise.
* font.h (struct font_spec, struct font_entity, struct font): Likewise.
* frame.h (struct frame): Likewise.
* process.h (struct Lisp_Process): Likewise.
* termhooks.h (struct terminal): Likewise.
* window.c (struct save_window_data, struct saved_window): Likewise.
* window.h (struct window): Likewise.
* alloc.c (allocate_buffer, Fmake_bool_vector, allocate_pseudovector):
Use XSETPVECTYPESIZE, not XSETPVECTYPE, to avoid aliasing problems.
* buffer.c (init_buffer_once): Likewise.
* lread.c (defsubr): Use XSETTYPED_PVECTYPE, since Lisp_Subr is a
special case.
* process.c (Fformat_network_address): Use local var for size,
for brevity.
* lisp.h (pI): New macro, generalizing old pEd macro to other
conversion specifiers. For example, use "...%"pI"d..." rather
than "...%"pEd"...".
(pEd): Remove. All uses replaced with similar uses of pI.
* src/m/amdx86-64.h, src/m/ia64.h, src/m/ibms390x.h: Likewise.
* alloc.c (check_pure_size): Don't overflow by converting size to int.
* bidi.c (bidi_dump_cached_states): Use pI to avoid cast.
* data.c (Fnumber_to_string): Use pI instead of if-then-else-abort.
* dbusbind.c (xd_append_arg): Use pI to avoid cast.
(Fdbus_method_return_internal, Fdbus_method_error_internal): Likewise.
* font.c (font_unparse_xlfd): Avoid potential buffer overrun on
64-bit hosts.
(font_unparse_xlfd, font_unparse_fcname): Use pI to avoid casts.
* keyboard.c (record_char, modify_event_symbol): Use pI to avoid casts.
* print.c (safe_debug_print, print_object): Likewise.
(print_object): Don't overflow by converting EMACS_INT or EMACS_UINT
to int.
Use pI instead of if-then-else-abort. Use %p to avoid casts.
* process.c (Fmake_network_process): Use pI to avoid cast.
* region-cache.c (pp_cache): Likewise.
* xdisp.c (decode_mode_spec): Likewise.
* xrdb.c (x_load_resources) [USE_MOTIF]: Use pI to avoid undefined
behavior on 64-bit hosts with printf arg.
* xselect.c (x_queue_event): Use %p to avoid casts.
(x_stop_queuing_selection_requests): Likewise.
(x_get_window_property): Don't truncate byte count to an 'int'
when tracing.