Previously, a unibyte target buffer could be put in an incorrect state
if json-insert was used to insert non-ASCII characters.
* src/json.c (Fjson_insert): Simplify. Don't attempt to decode the data
being inserted: it is guaranteed to be correct UTF-8 and is correct for
both unibyte and multibyte buffers.
* test/src/json-tests.el (json-serialize/roundtrip)
(json-serialize/roundtrip-scalars): Extend tests.
* src/json.c (Fjson_insert): Clarify the behaviour when the current
buffer is multibyte and unibyte, respectively.
* doc/lispref/text.texi (Parsing JSON): Refer to the right function.
* src/json.c (Fjson_serialize, Fjson_insert, Fjson_parse_string)
(Fjson_parse_buffer): Make the text more readable, fix minor
errors and avoid terminology confusion.
Speed up JSON parsing substantially by UTF-8-decoding only string
literals, and each of them exactly once. Previously, json-parse-string always
first parsed the entire input and copied it to a new string, and then
validated each string literal twice.
We no longer create an extra new string when interning an alist key,
nor do we garble plist keys with Unicode characters.
* src/lread.c (intern_c_multibyte): New.
* src/json.c (json_encode): Remove.
(utf8_error): New.
(json_parse_string): Faster and more careful UTF-8 decoding.
Create and return a new multibyte string or symbol without extra
decoding. All callers adapted.
(Fjson_parse_string): Skip expensive input pre-decoding.
* test/src/json-tests.el (json-parse-string/object-unicode-keys)
(json-parse-string/short): New.
(json-parse-string/string, json-parse-string/invalid-unicode):
Adapt tests.
* etc/NEWS: Mention change in errors.
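As an illustration of what strict validation of a single string literal
involves, here is a minimal standalone sketch. The name utf8_valid_p is
hypothetical and this is not the code in src/json.c; it only shows the
checks (lead byte, continuation bytes, overlong forms, surrogates, range)
that a one-pass decoder has to make.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from src/json.c: return true if the
   LEN bytes at S are well-formed UTF-8 (no overlong forms, no
   surrogates, nothing above U+10FFFF).  */
static bool
utf8_valid_p (const unsigned char *s, size_t len)
{
  size_t i = 0;
  while (i < len)
    {
      unsigned char c = s[i];
      size_t n;                 /* number of continuation bytes */
      unsigned long min, cp;    /* smallest legal value, code point */

      if (c < 0x80)
        {
          i++;                  /* ASCII byte */
          continue;
        }
      else if ((c & 0xE0) == 0xC0)
        {
          n = 1; min = 0x80; cp = c & 0x1F;
        }
      else if ((c & 0xF0) == 0xE0)
        {
          n = 2; min = 0x800; cp = c & 0x0F;
        }
      else if ((c & 0xF8) == 0xF0)
        {
          n = 3; min = 0x10000; cp = c & 0x07;
        }
      else
        return false;           /* stray continuation or invalid lead */

      if (len - i <= n)
        return false;           /* truncated sequence */
      for (size_t k = 1; k <= n; k++)
        {
          unsigned char cc = s[i + k];
          if ((cc & 0xC0) != 0x80)
            return false;       /* not a continuation byte */
          cp = (cp << 6) | (cc & 0x3F);
        }
      if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return false;           /* overlong, out of range or surrogate */
      i += n + 1;
    }
  return true;
}
```

A parser built this way would call such a check only on the bytes between
the quotes of each string literal, rather than on the whole input.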
* src/json.c (json_parse_args, json_out_t, symset_t, symset_size)
(Fjson_serialize, Fjson_insert): Tabify and make all sentences
end with two spaces.
* src/Makefile.in (base_obj): Add the missing json.o. Without
this, we get a link error.
* src/json.c (json_serialize): Don't use overly sophisticated C99
features, as they confuse make-docfile. Initialize all the
members explicitly.
It is in general at least 2x faster than the old encoder and does not
depend on any external library. Using our own code also gives us
control over translation details: for example, we now have full
bignum support and tighter float formatting.
* src/json.c (json_delete, json_initialized, init_json_functions)
(json_malloc, json_free, init_json, json_out_of_memory)
(json_release_object, check_string_without_embedded_nulls, json_check)
(json_check_utf8, lisp_to_json_nonscalar_1, lisp_to_json_nonscalar)
(lisp_to_json, json_available_p, ensure_json_available, json_insert)
(json_handle_nonlocal_exit, json_insert_callback):
Remove. Remaining uses updated.
* src/json.c (json_out_t, symset_t, struct symset_tbl)
(symset_size, make_symset_table, push_symset, pop_symset)
(cleanup_symset_tables, symset_hash, symset_expand, symset_add)
(json_out_grow_buf, cleanup_json_out, json_make_room, JSON_OUT_STR)
(json_out_str, json_out_byte, json_out_fixnum, string_not_unicode)
(json_plain_char, json_out_string, json_out_nest, json_out_unnest)
(json_out_object_cons, json_out_object_hash, json_out_array)
(json_out_float, json_out_bignum, json_out_something)
(json_out_to_string, json_serialize): New.
(Fjson_serialize, Fjson_insert):
New JSON encoder implementation.
* test/src/json-tests.el (json-serialize/object-with-duplicate-keys)
(json-serialize/string): Update tests.
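The general shape of an encoder that writes into its own growable buffer
can be sketched as follows. Everything here (struct out_buf,
out_make_room, out_bytes, out_json_string) is a hypothetical, simplified
stand-in and not the json_out_t code in src/json.c; in particular it
escapes every control character as \uXXXX and aborts on allocation
failure, which the real encoder need not do.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical output buffer; zero-initialize before first use.  */
struct out_buf
{
  char *data;
  size_t len, cap;
};

/* Ensure room for N more bytes, doubling the capacity as needed.  */
static void
out_make_room (struct out_buf *o, size_t n)
{
  if (o->cap - o->len >= n)
    return;
  size_t cap = o->cap ? o->cap : 64;
  while (cap - o->len < n)
    cap *= 2;
  char *p = realloc (o->data, cap);
  if (!p)
    abort ();                   /* sketch only: no error recovery */
  o->data = p;
  o->cap = cap;
}

/* Append the N bytes at S.  */
static void
out_bytes (struct out_buf *o, const char *s, size_t n)
{
  out_make_room (o, n);
  memcpy (o->data + o->len, s, n);
  o->len += n;
}

/* Append the NUL-terminated UTF-8 string S as a JSON string literal.
   Only '"', '\\' and control characters below 0x20 are escaped; all
   other bytes, including multibyte sequences, are copied verbatim.  */
static void
out_json_string (struct out_buf *o, const char *s)
{
  out_bytes (o, "\"", 1);
  for (; *s; s++)
    {
      unsigned char c = (unsigned char) *s;
      if (c == '"' || c == '\\')
        {
          char esc[2] = { '\\', (char) c };
          out_bytes (o, esc, 2);
        }
      else if (c < 0x20)
        {
          char esc[7];
          int n = snprintf (esc, sizeof esc, "\\u%04x", (unsigned int) c);
          out_bytes (o, esc, (size_t) n);
        }
      else
        out_bytes (o, s, 1);
    }
  out_bytes (o, "\"", 1);
}
```

The caller owns the buffer and releases it with free (o.data) once the
accumulated bytes have been copied or inserted.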
This leads to simpler code in the users, and more efficient machine
code because we don't repeatedly need to fetch the `table_size`
and `key_and_value` fields of the hash table object.
* src/lisp.h (DOHASH): Rewrite.
* src/composite.c (composition_gstring_lookup_cache): Simplify.
(composition_gstring_cache_clear_font):
* src/print.c (print):
* src/pdumper.c (hash_table_contents):
* src/minibuf.c (Ftest_completion):
* src/json.c (lisp_to_json_nonscalar_1):
* src/emacs-module.c (module_global_reference_p):
* src/comp.c (compile_function, Fcomp__compile_ctxt_to_file):
* src/fns.c (Fmaphash): Adjust to new calling convention.
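The idea of an iteration macro that reads the table size and data pointer
once, before the loop, can be sketched on a toy table. The names
toy_table, TOY_UNUSED and TOY_DOHASH are hypothetical; this is not the
DOHASH definition in src/lisp.h, only an illustration of the technique
described above.

```c
#include <stddef.h>

/* Toy stand-in for a hash table: keys and values are interleaved in
   KEY_AND_VALUE, and a key equal to TOY_UNUSED marks an empty slot.  */
#define TOY_UNUSED (-1)

struct toy_table
{
  ptrdiff_t table_size;
  int *key_and_value;           /* 2 * table_size elements */
};

/* Run the following statement once per used entry of H, with K and V
   bound to the entry's key and value.  The data pointer and the end
   pointer are computed once, in the loop initializer, so the body
   never has to reload fields of H.  */
#define TOY_DOHASH(h, k, v)                                     \
  for (int *toy_kv = (h)->key_and_value,                        \
           *toy_end = toy_kv + 2 * (h)->table_size,             \
           k, v;                                                \
       toy_kv < toy_end && (k = toy_kv[0], v = toy_kv[1], 1);   \
       toy_kv += 2)                                             \
    if (k == TOY_UNUSED)                                        \
      ;                                                         \
    else

/* Example user: add up the values of all used entries.  */
static int
toy_sum_values (const struct toy_table *h)
{
  int sum = 0;
  TOY_DOHASH (h, key, value)
    {
      (void) key;               /* only the values are needed here */
      sum += value;
    }
  return sum;
}
```

Because the macro declares the key and value variables itself, callers
such as toy_sum_values need no index variable or accessor calls.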
This improves performance in several ways. Separate functions are
used depending on whether the caller has a hash value computed or not.
* src/fns.c (hash_lookup_with_hash, hash_lookup_get_hash): New.
(hash_lookup): Remove hash return argument.
All callers adapted.
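The split between a lookup that takes a precomputed hash and one that
computes and returns it can be sketched on a toy open-addressing set.
All names below (toy_set, toy_hash, toy_lookup_with_hash,
toy_lookup_get_hash, toy_put) are hypothetical and much simpler than the
functions in src/fns.c; the point is only that a miss followed by an
insertion hashes the key once.

```c
#include <stddef.h>
#include <stdint.h>

enum { TOY_SLOTS = 8 };         /* must be a power of two */
#define TOY_EMPTY (-1)          /* keys must differ from this value */

struct toy_set
{
  int keys[TOY_SLOTS];
};

static void
toy_init (struct toy_set *s)
{
  for (size_t i = 0; i < TOY_SLOTS; i++)
    s->keys[i] = TOY_EMPTY;
}

static uint32_t
toy_hash (int key)
{
  return (uint32_t) key * 2654435761u;
}

/* Look up KEY using an already computed HASH.  Return its slot index,
   or -1 if KEY is absent.  */
static ptrdiff_t
toy_lookup_with_hash (const struct toy_set *s, int key, uint32_t hash)
{
  for (size_t probe = 0; probe < TOY_SLOTS; probe++)
    {
      size_t i = (hash + probe) & (TOY_SLOTS - 1);
      if (s->keys[i] == key)
        return (ptrdiff_t) i;
      if (s->keys[i] == TOY_EMPTY)
        break;
    }
  return -1;
}

/* Look up KEY, computing its hash and storing it in *HASH so that a
   later insertion can reuse it.  */
static ptrdiff_t
toy_lookup_get_hash (const struct toy_set *s, int key, uint32_t *hash)
{
  *hash = toy_hash (key);
  return toy_lookup_with_hash (s, key, *hash);
}

/* Insert KEY, known to be absent, reusing HASH.  Return its slot
   index, or -1 if the set is full.  */
static ptrdiff_t
toy_put (struct toy_set *s, int key, uint32_t hash)
{
  for (size_t probe = 0; probe < TOY_SLOTS; probe++)
    {
      size_t i = (hash + probe) & (TOY_SLOTS - 1);
      if (s->keys[i] == TOY_EMPTY)
        {
          s->keys[i] = key;
          return (ptrdiff_t) i;
        }
    }
  return -1;
}
```

A caller that only tests membership would use a plain lookup, while a
caller that inserts on a miss would call toy_lookup_get_hash and then
pass the returned hash to toy_put.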
hash_lookup_with_hash hash_hash_t arg
This improves typing, saves pointless tagging and untagging, and
prepares for further changes. The new typedef hash_hash_t is an alias
for EMACS_UINT, and hash values are still limited to the fixnum range.
We now use hash_unused instead of Qnil to mark unused entries.
* src/lisp.h (hash_hash_t): New typedef for EMACS_UINT.
(hash_unused): New constant.
(struct hash_table_test): `hashfn` now returns
hash_hash_t. All callers and implementations changed.
(struct Lisp_Hash_Table): Retype hash vector to an array of
hash_hash_t. All code using it changed accordingly.
(HASH_HASH, hash_from_key):
* src/fns.c (set_hash_index_slot, hash_index_index)
(hash_lookup_with_hash, hash_lookup_get_hash, hash_lookup)
(hash_put): Retype hash value arguments
and return values. All callers adapted.
Qunbound is used for many things; using a predicate and constant for
the specific purpose of unused hash entry keys allows us to locate
them and make changes much more easily.
* src/lisp.h (HASH_UNUSED_ENTRY_KEY, hash_unused_entry_key_p):
New constant and function.
* src/comp.c (compile_function, Fcomp__compile_ctxt_to_file):
* src/composite.c (composition_gstring_cache_clear_font):
* src/emacs-module.c (module_global_reference_p):
* src/fns.c (make_hash_table, maybe_resize_hash_table, hash_put)
(hash_remove_from_table, hash_clear, sweep_weak_table, Fmaphash):
* src/json.c (lisp_to_json_nonscalar_1):
* src/minibuf.c (Ftry_completion, Fall_completions, Ftest_completion):
* src/print.c (print, print_object):
Use them.
* src/json.c (json_available_p): Use original code. Always return
true for !WINDOWSNT.
(ensure_json_available): Now defined only on WINDOWSNT.
(Fjson_serialize, Fjson_insert, Fjson_parse_string)
(Fjson_parse_buffer): Call ensure_json_available only on
WINDOWSNT.
* lisp/subr.el (json-available-p): Simplify.
* configure.ac (HAVE_TREE_SITTER, TREE_SITTER_OBJ): New variables.
(DYNAMIC_LIB_SUFFIX): New variable. The code is copied from
MODULES_SUFFIX, which is why the diff looks this way.
* doc/lispref/elisp.texi (Top): Add tree-sitter manual.
* doc/lispref/modes.texi (Font Lock Mode): Mention tree-sitter.
(Parser-based Font Lock): New section.
(Auto-Indentation): Mention tree-sitter.
(Parser-based Indentation): New section.
* doc/lispref/parsing.texi (Parsing Program Source): New chapter.
* lisp/emacs-lisp/cl-preloaded.el (cl--typeof-types): Add
treesit-parser and treesit-node types.
* lisp/treesit.el: New file.
* src/Makefile.in (TREE_SITTER_LIBS, TREE_SITTER_FLAGS,
TREE_SITTER_OBJ): New variables.
* src/alloc.c (cleanup_vector): Add cleanup code for treesit-parser
and treesit-node.
* src/casefiddle.c (casify_region): Notify tree-sitter parser of
buffer change.
* src/data.c (Ftype_of): Add treesit-parser and treesit-node types.
(Qtreesit_parser, Qtreesit_node): New symbols.
* src/emacs.c (main): Add symbols in treesit.c.
* src/eval.c (define_error): Move the function here.
* src/insdel.c (insert_1_both, insert_from_string_1, insert_from_gap,
insert_from_buffer_1, replace_range, del_range_2): Notify tree-sitter
parser of buffer change.
* src/json.c (define_error): Move this function out.
* src/lisp.h (DEFINE_GDB_SYMBOL_BEGIN): Add treesit-parser and
treesit-node.
* src/lread.c (Vdynamic_library_suffixes): New variable.
* src/print.c (print_vectorlike): Add code for printing
treesit-parser and treesit-node.
* src/treesit.c: New file.
* src/treesit.h: New file.
* test/src/treesit-tests.el: New file.
* src/json.c (Fjson_serialize, Fjson_insert)
(Fjson_parse_string, Fjson_parse_buffer, syms_of_json): Signal
`json-unavailable' if jansson isn't available (bug#48228).
The JSON serialization and parsing functions don't need to modify
these structures.
* src/json.c (lisp_to_json_nonscalar_1, lisp_to_json_nonscalar)
(lisp_to_json, json_to_lisp): Mark configuration object parameter as
const.
Newer standards like RFC 8259, which obsoletes the earlier RFC 4627,
now allow any top-level value unconditionally, so Emacs should too.
* src/json.c (Fjson_serialize, Fjson_insert): Pass JSON_ENCODE_ANY to
allow serialization of any JSON value. Call 'lisp_to_json' instead of
'lisp_to_json_toplevel'. Remove obsolete comments
(neither JSON_DECODE_ANY nor JSON_ALLOW_NUL are allowed here). Reword
documentation strings.
(Fjson_parse_string, Fjson_parse_buffer): Pass JSON_DECODE_ANY to
allow deserialization of any JSON value. Reword documentation
strings.
(lisp_to_json_nonscalar, lisp_to_json_nonscalar_1): Rename from
"toplevel" to avoid confusion.
(lisp_to_json): Adapt caller.
* test/src/json-tests.el (json-serialize/roundtrip-scalars): New unit
test.
* doc/lispref/text.texi (Parsing JSON): Update documentation.
Now that decode_string_utf_8 is available, we can use it to signal
errors on invalid input.
* src/coding.c (syms_of_coding): Move Qutf_8_string_p from json.c
since it’s now used outside json.c.
* src/emacs-module.c (module_decode_utf_8): New helper function.
(module_make_function, module_copy_string_contents): Use it.
* src/coding.c (get_char_bytes, encode_string_utf_8)
(decode_string_utf_8): Fix commentary.
(encode_string_utf_8): Return the original ASCII string only
if NOCOPY is non-zero.
(decode_string_utf_8): Accept 2 additional arguments STR and
STR_LEN, which allow passing the input text as a C string.
(make_string_from_utf8): Delegate the job to decode_string_utf_8.
* src/coding.h: Update the prototype of decode_string_utf_8.
* src/json.c (json_encode): Call encode_string_utf_8.