mirror of
git://git.sv.gnu.org/emacs.git
synced 2025-12-06 06:20:55 -08:00
Add line-column tracking for tree-sitter parsers. Copied from comments in treesit.c: Technically we had to send tree-sitter the line and column position of each edit. But in practice we just send it dummy values, because tree-sitter doesn't use it for parsing and mostly just carries the line and column positions around and return it when e.g. reporting node positions[1]. This has been working fine until we encountered grammars that actually utilizes the line and column information for parsing (Haskell)[2]. [1] https://github.com/tree-sitter/tree-sitter/issues/445 [2] https://github.com/tree-sitter/tree-sitter/issues/4001 So now we have to keep track of line and column positions and pass valid values to tree-sitter. (It adds quite some complexity, but only linearly; one can ignore all the linecol stuff when trying to understand treesit code and then come back to it later.) Eli convinced me to disable tracking by default, and only enable it for languages that needs it. So the buffer starts out not tracking linecol. And when a parser is created, if the language is in treesit-languages-require-line-column-tracking, we enable tracking in the buffer, and enable tracking for the parser. To simplify things, once a buffer starts tracking linecol, it never disables tracking, even if parsers that need tracking are all deleted; and for parsers, tracking is determined at creation time, if it starts out tracking/non-tracking, it stays that way, regardless of later changes to treesit-languages-require-line-column-tracking. To make calculating line/column positons fast, we store linecol caches for begv, point, and zv in the buffer (buf->ts_linecol_cache_xxx); and in the parser object, we store linecol cache for visible beg/end of that parser. In buffer editing functions, we need the linecol for start/old_end/new_end, those can be calculated by scanning newlines (treesit_linecol_of_pos) from the buffer point cache, which should be always near the point. And we usually set the calculated linecol of new_end back to the buffer point cache. We also need to calculate linecol for the visible_beg/end for each parser, and linecol for the buffer's begv/zv, these positions are usually far from point, so we have caches for all of them (in either the parser object or the buffer). These positions are far from point, so it's inefficient to scan newlines from point to there to get up-to-date linecol for them; but in the same time, because they're far and outside the changed region, we can calculate their change in line and column number by simply counting how much newlines are added/removed in the changed region (compute_new_linecol_by_change). * doc/lispref/parsing.texi (Using Parser): Mention line-column tracking in manual. * etc/NEWS: Add news. * lisp/treesit.el: (treesit-languages-need-line-column-tracking): New variable. * src/buffer.c: Include treesit.h (for TREESIT_EMPTY_LINECOL). (Fget_buffer_create): (Fmake_indirect_buffer): Initialize new buffer fields. (Fbuffer_swap_text): Add new buffer fields. * src/buffer.h (ts_linecol): New struct. (buffer): New buffer fields. (BUF_TS_LINECOL_BEGV): (BUF_TS_LINECOL_POINT): (BUF_TS_LINECOL_ZV): (SET_BUF_TS_LINECOL_BEGV): (SET_BUF_TS_LINECOL_POINT): (SET_BUF_TS_LINECOL_ZV): New inline functions. * src/casefiddle.c (casify_region): Record linecol info. * src/editfns.c (Fsubst_char_in_region): (Ftranslate_region_internal): (Ftranspose_regions): Record linecol info. * src/insdel.c (insert_1_both): (insert_from_string_1): (insert_from_gap_1): (insert_from_buffer): (replace_range): (del_range_2): Record linecol info. * src/treesit.c (TREESIT_BOB_LINECOL): (TREESIT_EMPTY_LINECOL): (TREESIT_TS_POINT_1_0): New constants. (treesit_debug_print_linecol): (treesit_buf_tracks_linecol_p): (restore_restriction_and_selective_display): (treesit_count_lines): (treesit_debug_validate_linecol): (treesit_linecol_of_pos): (treesit_make_ts_point): (Ftreesit_tracking_line_column_p): (Ftreesit_parser_tracking_line_column_p): New functions. (treesit_tree_edit_1): Accept real TSPoint and pass to tree-sitter. (compute_new_linecol_by_change): New function. (treesit_record_change_1): Rename from treesit_record_change, handle linecol if tracking is enabled. (treesit_linecol_maybe): New function. (treesit_record_change): New wrapper around treesit_record_change_1 that handles some boilerplate and sets buffer state. (treesit_sync_visible_region): Handle linecol if tracking is enabled. (make_treesit_parser): Setup parser's linecol cache if tracking is enabled. (Ftreesit_parser_create): Enable tracking if the parser's language requires it. (Ftreesit__linecol_at): (Ftreesit__linecol_cache_set): (Ftreesit__linecol_cache): New functions for debugging and testing. (syms_of_treesit): New variable Vtreesit_languages_require_line_column_tracking. * src/treesit.h (Lisp_TS_Parser): New fields. (TREESIT_BOB_LINECOL): (TREESIT_EMPTY_LINECOL): New constants. * test/src/treesit-tests.el (treesit-linecol-basic): (treesit-linecol-search-back-across-newline): (treesit-linecol-col-same-line): (treesit-linecol-enable-disable): New tests. * src/lisp.h: Declare display_count_lines. * src/xdisp.c (display_count_lines): Remove static keyword. |
||
|---|---|---|
| .. | ||
| abbrevs.texi | ||
| anti.texi | ||
| back.texi | ||
| backups.texi | ||
| book-spine.texi | ||
| buffers.texi | ||
| ChangeLog.1 | ||
| commands.texi | ||
| compile.texi | ||
| control.texi | ||
| customize.texi | ||
| debugging.texi | ||
| display.texi | ||
| doclicense.texi | ||
| edebug.texi | ||
| elisp.texi | ||
| elisp_type_hierarchy.jpg | ||
| elisp_type_hierarchy.txt | ||
| errors.texi | ||
| eval.texi | ||
| files.texi | ||
| frames.texi | ||
| functions.texi | ||
| gpl.texi | ||
| hash.texi | ||
| help.texi | ||
| hooks.texi | ||
| index.texi | ||
| internals.texi | ||
| intro.texi | ||
| keymaps.texi | ||
| lay-flat.texi | ||
| lists.texi | ||
| loading.texi | ||
| macros.texi | ||
| Makefile.in | ||
| maps.texi | ||
| markers.texi | ||
| minibuf.texi | ||
| modes.texi | ||
| nonascii.texi | ||
| numbers.texi | ||
| objects.texi | ||
| os.texi | ||
| package.texi | ||
| parsing.texi | ||
| peg.texi | ||
| positions.texi | ||
| processes.texi | ||
| README | ||
| records.texi | ||
| searching.texi | ||
| sequences.texi | ||
| spellfile | ||
| streams.texi | ||
| strings.texi | ||
| symbols.texi | ||
| syntax.texi | ||
| text.texi | ||
| threads.texi | ||
| tips.texi | ||
| two-volume-cross-refs.txt | ||
| two-volume.make | ||
| variables.texi | ||
| windows.texi | ||
Copyright (C) 2001-2025 Free Software Foundation, Inc. -*- outline -*-
See the end of the file for license conditions.
README for the Emacs Lisp Reference Manual.
* This directory contains the texinfo source files for the Emacs Lisp
Reference Manual.
* Report bugs in the Lisp Manual (or in Emacs) using M-x report-emacs-bug.
To ask questions, use the help-gnu-emacs mailing list.
* The Emacs Lisp Reference Manual is quite large. It totals around
1100 pages in smallbook format; the info files total around 3.0 megabytes.
* You can format this manual for Info, for printing hardcopy using TeX,
or for HTML.
* You can buy nicely printed copies from the Free Software Foundation.
Buying a manual from the Free Software Foundation helps support our GNU
development work. See <https://shop.fsf.org/>.
(At time of writing, this manual is out of print.)
* The master file for formatting this manual for Tex is called 'elisp.texi'.
It contains @include commands to include all the chapters that make up
the manual.
* This distribution contains a Makefile that you can use with GNU Make.
** To make an Info file, you need to install Texinfo, then run 'make info'.
** Use 'make elisp.pdf' or 'make elisp.html' to create PDF or HTML versions.
This file is part of GNU Emacs.
GNU Emacs is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
GNU Emacs is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>.