This should help when building with --enable-checking and
compiling with gcc -O0. Problem reorted by Stefan Monnier in:
https://lists.gnu.org/r/emacs-devel/2024-01/msg00770.html
* src/lisp.h (lisp_h_builtin_lisp_symbol): New macro,
with a body equivalent in effect to the old ‘builtin_lisp_symbol’
but faster when not optimizing.
(builtin_lisp_symbol): Use it.
If DEFINE_KEY_OPS_AS_MACROS, also define as macro.
This removes hacks from code that had to be careful not to use
Qunbound as a hash table key, at the cost of a minor hack in
the GC marker.
* src/lisp.h (INVALID_LISP_VALUE, HASH_UNUSED_ENTRY_KEY):
Define as a null-pointer float.
* src/alloc.c (process_mark_stack): Add hack to ignore that value.
* src/pdumper.c (dump_object_needs_dumping_p)
(pdumper_init_symbol_unbound, pdumper_load):
* src/print.c (PRINT_CIRCLE_CANDIDATE_P): Remove hacks for Qunbound.
* src/lisp.h (make_lisp_symbol): In eassert use XBARE_SYMBOL
rather than XSYMBOL. This is safe because the symbol must be
bare. The change speeds up make_lisp_symbol when debugging.
* src/lisp.h (XSYMBOL): Simplify and tune. There is no need to
examine symbols_with_pos_enabled here, since the arg must be a symbol
so if it's not a bare symbol then it must be a symbol_with_pos;
and checking whether a symbol is bare is cheap.
With Ubuntu 23.10 on a Xeon W-1350, this shrank Emacs’s executable
text size by 0.1% and sped up a default build of all *.elc files by
0.4%.
Remove unnecessary eassert, since XBARE_SYMBOL and XSYMBOL_WITH_POS
have easserts that suffice.
Be more systematic about putting space before paren in calls,
and in avoiding unnecessary parentheses in macros.
This was partly inspired by my wading through gcc -E output
while debugging something else, and seeing too many parens.
This patch does not change the generated .o files on my platform.
* src/lisp.h (XBARE_SYMBOL, XSYMBOL): Omit parentheses that are no
longer needed now that we have symbols with positions and these
symbols are never macros.
This saves a lot of memory and is quite sufficient. Hash functions
are adapted to produce a hash_hash_t eventually, which eliminates some
useless and information-destroying intermediate hash reduction steps.
We still use EMACS_UINT for most of the actual hashing steps before
producing the final value; this may be slightly wasteful on 32-bit
platforms with 64-bit EMACS_UINT.
* src/lisp.h (hash_hash_t): Change to uint32_t.
* src/fns.c (reduce_emacs_uint_to_hash_hash): New.
(hashfn_eq, hashfn_equal, hashfn_user_defined): Reduce return values
to hash_hash_t.
(sxhash_string): Remove. Caller changed to hash_string.
(sxhash_float, sxhash_list, sxhash_vector, sxhash_bool_vector)
(sxhash_bignum): Remove wasteful calls to SXHASH_REDUCE.
(hash_hash_to_fixnum): New.
(Fsxhash_eq, Fsxhash_eql, Fsxhash_equal)
(Fsxhash_equal_including_properties): Convert return values to fixnum.
Previously, free hash table entries were indicated by both hash value
hash_unused and key Qunbound; we now rely on the latter only.
This allows us to change the hash representation to one that does not
have an unused value.
* src/lisp.h (hash_unused): Remove.
All uses adapted to calling hash_unused_entry_key_p on the key instead.
The hash values for unused hash table entries are now undefined; all
initialisation and assignment to hash_unused has been removed.
The algorithms no longer use the rehash_threshold and rehash_size
float constants, but vary depending on size. In particular, the table
now grows faster, especially from smaller sizes.
The default size is now 0, starting empty, which effectively postpones
allocation until the first insertion (unless make-hash-table was
called with a positive :size); this is a clear gain as long as the
table remains empty. The first inserted item will use an initial size
of 8 because most tables are small.
* src/fns.c (std_rehash_size, std_rehash_threshold): Remove.
(hash_index_size): Integer-only computation.
(maybe_resize_hash_table): Grow more aggressively.
(Fhash_table_rehash_size, Fhash_table_rehash_threshold):
Use the constants directly.
* src/lisp.h (DEFAULT_HASH_SIZE): New value.
This saves several words in the hash table object at the cost of an
indirection at runtime. This seems to be a gain in overall
performance.
FIXME: We cache hash test objects in a rather clumsy way. A better
solution is sought.
* src/lisp.h (struct Lisp_Hash_Table): Use a pointer to the test
struct. All references adapted.
* src/alloc.c (garbage_collect):
* src/fns.c (struct hash_table_user_test, hash_table_user_tests)
(mark_fns, get_hash_table_user_test): New state for caching test
structs, and functions managing it.
Now hash_idx_t is a typedef for ptrdiff_t so there is no actual code
change, but this allows us to decouple the index width from the Lisp
word size.
* src/lisp.h (hash_idx_t): New typedef for ptrdiff_t.
(struct Lisp_Hash_Table): Use it for indices and sizes:
index, next, table_size, index_size, count and next_free.
All uses adapted.
This improves performance in several ways. Separate functions are
used depending on whether the caller has a hash value computed or not.
* src/fns.c (hash_lookup_with_hash, hash_lookup_get_hash): New.
(hash_lookup): Remove hash return argument.
All callers adapted.
hash_lookup_with_hash hash_hash_t arg
This improves typing, saves pointless tagging and untagging, and
prepares for further changes. The new typedef hash_hash_t is an alias
for EMACS_UINT, and hash values are still limited to the fixnum range.
We now use hash_unused instead of Qnil to mark unused entries.
* src/lisp.h (hash_hash_t): New typedef for EMACS_UINT.
(hash_unused): New constant.
(struct hash_table_test): `hashfn` now returns
hash_hash_t. All callers and implementations changed.
(struct Lisp_Hash_Table): Retype hash vector to an array of
hash_hash_t. All code using it changed accordingly.
(HASH_HASH, hash_from_key):
* src/fns.c (set_hash_index_slot, hash_index_index)
(hash_lookup_with_hash, hash_lookup_get_hash, hash_put):
(hash_lookup, hash_put): Retype hash value arguments
and return values. All callers adapted.
Using xmalloc for allocating these arrays is much cheaper than using
Lisp vectors since they are no longer marked or swept by the GC, and
deallocated much sooner. This makes GC faster and less frequent, and
improves temporal locality.
Zero-sized tables use NULL for their (0-length) vectors except the
index vector which has size 1 and uses a shared constant static vector
since it cannot be modified anyway. This makes creation and
destruction of zero-sized hash tables very fast; they consume no
memory outside the base object.
* src/lisp.h (struct Lisp_Hash_Table): Retype the index, next, hash
and key_and_value vectors from Lisp_Object to appropriately typed
arrays (although hash values are still stored as Lisp fixnums). Add
explicit table_size and index_size members. All users updated.
* src/alloc.c (gcstat): Add total_hash_table_bytes.
(hash_table_allocated_bytes): New.
(cleanup_vector): Free hash table vectors when sweeping
the object.
(hash_table_alloc_bytes, hash_table_free_bytes): New.
(sweep_vectors): Update gcstat.total_hash_table_bytes.
(total_bytes_of_live_objects): Use it.
(purecopy_hash_table): Adapt allocation of hash table vectors.
(process_mark_stack): No more Lisp slots in the struct to trace.
* src/fns.c (empty_hash_index_vector): New.
(allocate_hash_table): Allocate without automatically GCed slots.
(alloc_larger_vector): Remove.
(make_hash_table, copy_hash_table, maybe_resize_hash_table):
Adapt vector allocation and initialisation.
* src/pdumper.c (hash_table_freeze, hash_table_thaw, dump_hash_table)
(dump_hash_table_contents):
Adapt dumping and loading to field changes.
This avoids any extra allocation for such vectors, including empty
tables read by the Lisp reader, and provides extra safety essentially
for free.
* src/fns.c (make_hash_table): Allow tables to be 0-sized. The index
will always have at least one entry, to avoid extra look-up costs.
* src/alloc.c (process_mark_stack): Don't mark pure objects,
because empty vectors are pure.
Only dump the actual data, and the test encoded as an enum. This
simplifies dumping, makes dump files smaller and saves space at run
time.
* src/lisp.h (hash_table_std_test_t): New enum.
(struct Lisp_Hash_Table): Add frozen_test member, consuming no extra space.
* src/fns.c (hashfn_user_defined): Now static.
(hash_table_test_from_std): New.
(hash_table_rehash): Rename to...
(hash_table_thaw): ...this and rewrite.
* src/pdumper.c (hash_table_contents): Only include actual data, not
unused space.
(hash_table_std_test): New.
(hash_table_freeze): Set frozen_test from test.
(dump_hash_table): Dump frozen_test, not the whole test struct.
Don't bother other dumping fields that can be derived.
These parameters have no visible semantics and are hardly ever used,
so just use the default values for all hash tables. This saves
memory, shrinks the external representation, and will improve
performance.
* src/fns.c (std_rehash_size, std_rehash_threshold): New.
(hash_index_size): Use std_rehash_threshold. Remove table argument.
All callers updated.
(make_hash_table): Remove rehash_size and rehash_threshold args.
All callers updated.
(maybe_resize_hash_table)
(Fhash_table_rehash_size, Fhash_table_rehash_threshold):
Use std_rehash_size and std_rehash_threshold.
(Fmake_hash_table): Ignore :rehash-size and :rehash-threshold args.
* src/lisp.h (struct Lisp_Hash_Table):
Remove rehash_size and rehash_threshold fields.
(DEFAULT_REHASH_THRESHOLD, DEFAULT_REHASH_SIZE): Remove.
* src/lread.c (hash_table_from_plist): Don't read rehash-size or
rehash-threshold.
(syms_of_lread): Remove unused symbols.
* src/print.c (print_object): Don't print rehash-size or rehash-threshold.
* src/pdumper.c (dump_hash_table): Don't dump removed fields.
This takes less space (saves an entire word) and is more type-safe.
No change in behaviour.
* src/lisp.h (hash_table_weakness_t): New.
(struct Lisp_Hash_Table): Replace Lisp object `weak` with enum
`weakness`.
* src/fns.c
(keep_entry_p, hash_table_weakness_symbol): New.
(make_hash_table): Retype argument. All callers updated.
(sweep_weak_table, Fmake_hash_table, Fhash_table_weakness):
* src/alloc.c (purecopy_hash_table, purecopy, process_mark_stack):
* src/pdumper.c (dump_hash_table):
* src/print.c (print_object): Use retyped field.
Qunbound is used for many things; using a predicate and constant for
the specific purpose of unused hash entry keys allows us to locate
them and make changes much more easily.
* src/lisp.h (HASH_UNUSED_ENTRY_KEY, hash_unused_entry_key_p):
New constant and function.
* src/comp.c (compile_function, Fcomp__compile_ctxt_to_file):
* src/composite.c (composition_gstring_cache_clear_font):
* src/emacs-module.c (module_global_reference_p):
* src/fns.c (make_hash_table, maybe_resize_hash_table, hash_put)
(hash_remove_from_table, hash_clear, sweep_weak_table, Fmaphash):
* src/json.c (lisp_to_json_nonscalar_1):
* src/minibuf.c (Ftry_completion, Fall_completions, Ftest_completion):
* src/print.c (print, print_object):
Use them.
The profiler stored data being collected in Lisp hash tables but
relied heavily on their exact internal representation, which made it
difficult and error-prone to change the hash table implementation.
In particular, the profiler has special run-time requirements that are
not easily met using standard Lisp data structures: accesses and
updates are made from async signal handlers in almost any messy
context you can think of and are therefore very constrained in what
they can do.
The new profiler tables are designed specifically for their purpose
and are more efficient and, by not being coupled to Lisp hash tables,
easier to keep safe.
The old profiler morphed internal hash tables to ones usable from Lisp
and thereby made them impossible to use internally; now export_log
just makes new hash table objects for Lisp. The Lisp part of the
profiler remains entirely unchanged.
* src/alloc.c (garbage_collect): Mark profiler tables.
* src/eval.c (get_backtrace): Fill an array of Lisp values instead of
a Lisp vector.
* src/profiler.c (log_t): No longer a Lisp hash table but a custom
data structure: a fully associative fixed-sized cache that maps
fixed-size arrays of Lisp objects to counts.
(make_log): Build new struct.
(mark_log, free_log, get_log_count, set_log_count, get_key_vector)
(log_hash_index, remove_log_entry, trace_equal, trace_hash)
(make_profiler_log, free_profiler_log, mark_profiler): New.
(cmpfn_profiler, hashtest_profiler, hashfn_profiler)
(syms_of_profiler_for_pdumper): Remove.
(approximate_median, evict_lower_half, record_backtrace, export_log)
(Fprofiler_cpu_log, Fprofiler_memory_log, syms_of_profiler):
Adapt to the new data structure.
Reimplement `backtrace-on-redisplay-error` using `push_handler_bind`.
This moves the code from `signal_or_quit` to `xdisp.c` and
`debug-early.el`.
* lisp/emacs-lisp/debug-early.el (debug-early-backtrace):
Add `base` arg to strip "internal" frames.
(debug--early): New function, extracted from `debug-early`.
(debug-early, debug-early--handler): Use it.
(debug-early--muted): New function, extracted (translated) from
`signal_or_quit`; trim the buffer to a max of 10 backtraces.
* src/xdisp.c (funcall_with_backtraces): New function.
(dsafe_calln): Use it.
(syms_of_xdisp): Defsym `Qdebug_early__muted`.
* src/eval.c (redisplay_deep_handler): Delete var.
(init_eval, internal_condition_case_n): Don't set it any more.
(backtrace_yet): Delete var.
(signal_or_quit): Remove special case for `backtrace_on_redisplay_error`.
* src/keyboard.c (command_loop_1): Don't set `backtrace_yet` any more.
* src/lisp.h (backtrace_yet): Don't declare.
Move ad-hoc code meant to ease debugging of bootstrap (and batch mode)
to `top_level_2` so it doesn't pollute `signal_or_quit`.
* src/lisp.h (pop_handler, push_handler_bind): Declare.
* src/keyboard.c (top_level_2): Setup an error handler to call
`debug-early` when noninteractive.
* src/eval.c (pop_handler): Not static any more.
(signal_or_quit): Remove special case for noninteractive use.
(push_handler_bind): New function, extracted from `Fhandler_bind_1`.
(Fhandler_bind_1): Use it.
(syms_of_eval): Declare `Qdebug_early__handler`.
* lisp/emacs-lisp/debug-early.el (debug-early-backtrace): Weed out
frames below `debug-early`.
(debug-early--handler): New function.
AFAIK, this provides the same semantics as Common Lisp's `handler-bind`,
modulo the differences about how error objects and conditions are
represented.
* lisp/subr.el (handler-bind): New macro.
* src/eval.c (pop_handler): New function.
(Fhandler_Bind_1): New function.
(signal_or_quit): Handle new handlertypes `HANDLER` and `SKIP_CONDITIONS`.
(find_handler_clause): Simplify.
(syms_of_eval): Defsubr `Fhandler_bind_1`.
* doc/lispref/control.texi (Handling Errors): Add `handler-bind`.
* test/src/eval-tests.el (eval-tests--handler-bind): New test.
* lisp/emacs-lisp/lisp-mode.el (lisp-font-lock-keywords):
Move 'handler-bind' from CL-only to generic Lisp.
(handler-bind): Remove indentation setting, it now lives in the macro
definition.
* src/buffer.c (Fget_truename_buffer): Expose `get_truename_buffer' to
Elisp.
(Ffind_buffer): New subr searching for a live buffer with a given
value of buffer-local variable.
(syms_of_buffer): Register the new added subroutines.
* src/filelock.c (lock_file): Use the new `Fget_truename_buffer' name.
* src/lisp.h:
* test/manual/etags/c-src/emacs/src/lisp.h: Remove no-longer-necessary
extern declarations for `get_truename_buffer'.
* lisp/files.el (find-buffer-visiting): Refactor, using subroutines to
search for buffers instead of slow manual Elisp iterations.
The `safe_call/eval` family of functions started its life in `xdisp.c`
for the needs of redisplay but quickly became popular outside of it.
This is not ideal because despite their name, they are somewhat
specific to the needs of redisplay.
So we split them into `safe_call/eval` (in `eval.c`) and `dsafe_call/eval`
(in `xdisp.c`). We took this opportunity to slightly change their
calling convention to be friendly to the CALLN-style macros.
While at it, we introduce a new `calln` macro as well which does
all that `call[1-8]` used to do.
* src/eval.c (safe_eval_handler, safe_funcall, safe_eval): New functions,
Copied from `xdisp.c`. Don't obey `inhibit_eval_during_redisplay` any more.
Adjust error message to not claim it happened during redisplay.
* src/lisp.h (calln): New macro.
(call1, call2, call3, call4, call5, call6, call7, call8): Turn them
into aliases of `calln`.
(safe_funcall): Declare.
(safe_calln): New macro.
(safe_call1, safe_call2): Redefine as compatibility macros.
(safe_call, safe_call1, safe_call2): Delete.
Replace all callers with calls to `safe_calln`.
* src/xdisp.c (dsafe_eval_handler): Rename from `safe_eval_handler`.
Adjust all users.
(dsafe__call): Rename from `safe_call` and change calling convention to
work with something like CALLMANY. Adjust all users.
(safe_call, safe__call1, safe_call2): Delete functions.
(SAFE_CALLMANY, dsafe_calln): New macros.
(dsafe_call1, dsafe_eval): Rename from `safe_call1` and `safe_eval`,
and rewrite using them. Adjust all users.
(clear_message, prepare_menu_bars, redisplay_window): Use `dsafe_calln`.
(run_window_scroll_functions): Don't let-bind `Qinhibit_quit`
since `safe_run_hooks_2` does it for us.
* src/indent.c (line_number_display_width): No longer static.
* src/lisp.h (line_number_display_width): Add prototype.
* src/keyboard.c (save_line_number_display_width)
(line_number_mode_hscroll): New functions.
(make_lispy_event): Call 'save_line_number_display_width' and
'line_number_mode_hscroll' to avoid interpreting up-event as drag
event when redisplay scrolls the text horizontally between the
down- and up-event to account for the changed width of the
line-number display. (Bug#66655)
This makes for safer code when tagging null pointers in particular,
since pointer arithmetic on NULL is undefined and therefore can be
assumed, by the compiler, not to occur.
* src/lisp.h (untagged_ptr): Remove.
(TAG_PTR): Cast to uintptr_t instead of untagged_ptr.
* src/lisp.h (XUNTAG):
Instead of casting a Lisp value to char * and subtracting the tag,
cast it to a suitable integral type and work on that.
This should result in identical or at least equivalent code, except
that it avoids potential problems arising from the restrictions on
pointer arithmetic in C. In particular, a null pointer can be neither
an operand in nor the result of pointer arithmetic.
C compilers know this and would, prior to this change, optimise
XUNTAG(obj, Lisp_Int0, mytype) != NULL
to 1. This means, for example, that make_pointer_integer and
XFIXNUMPTR could not be entrusted with null pointers, and
next_vector in alloc.c was unsafe to use.
* src/lisp.h (XUNTAG):
Instead of casting a Lisp value to char * and subtracting the tag,
cast it to a suitable integral type and work on that.
This should result in identical or at least equivalent code, except
that it avoids potential problems arising from the restrictions on
pointer arithmetic in C. In particular, a null pointer can be neither
an operand in nor the result of pointer arithmetic.
C compilers know this and would, prior to this change, optimise
XUNTAG(obj, Lisp_Int0, mytype) != NULL
to 1. This means, for example, that make_pointer_integer and
XFIXNUMPTR could not be entrusted with null pointers, and
next_vector in alloc.c was unsafe to use.
* src/lisp.h (union vectorlike_header): Rewrite the description of the
header word layout, with some useful added precision and the customary
ASCII art for bit fields.
Problem reported by Po Lu in:
https://lists.gnu.org/r/emacs-devel/2023-08/msg00195.html
* src/comp.c (comp_hash_source_file, Fcomp__release_ctxt):
* src/sysdep.c (get_up_time, procfs_ttyname, procfs_get_total_memory):
Be more systematic about using emacs_fclose on streams that were
opened with emacs_fopen or emacs_fdopen. Do this even if not
Android, as this simplifies checking that it's done consistently.
* src/lisp.h (emacs_fclose): If it’s just fclose,
make it a macro rather than a function, to avoid confusing gcc
-Wmismatched-dealloc.
(emacs_fopen): Move decl here from sysstdio.h, because sysstdio.h
is included from non-Emacs executables and emacs_fopen is good
only inside Emacs.
* src/sysdep.c (emacs_fclose): Define as a function only if Android.
* src/lisp.h (emacs_fdopen): Now ATTRIBUTE_MALLOC
ATTRIBUTE_DEALLOC (emacs_fclose, 1), to pacify gcc
-Wsuggest-attribute=malloc on non-Android platforms.