mirror of
git://git.sv.gnu.org/emacs.git
synced 2026-01-30 12:21:25 -08:00
Add line-column tracking for tree-sitter parsers.

Copied from comments in treesit.c:

Technically we had to send tree-sitter the line and column position of each edit. But in practice we just sent it dummy values, because tree-sitter doesn't use them for parsing and mostly just carries the line and column positions around and returns them when e.g. reporting node positions[1]. This worked fine until we encountered grammars that actually utilize the line and column information for parsing (Haskell)[2].

[1] https://github.com/tree-sitter/tree-sitter/issues/445
[2] https://github.com/tree-sitter/tree-sitter/issues/4001

So now we have to keep track of line and column positions and pass valid values to tree-sitter. (It adds quite some complexity, but only linearly; one can ignore all the linecol stuff when trying to understand treesit code and then come back to it later.)

Eli convinced me to disable tracking by default, and only enable it for languages that need it. So the buffer starts out not tracking linecol. And when a parser is created, if the language is in treesit-languages-require-line-column-tracking, we enable tracking in the buffer, and enable tracking for the parser. To simplify things, once a buffer starts tracking linecol, it never disables tracking, even if parsers that need tracking are all deleted; and for parsers, tracking is determined at creation time: if a parser starts out tracking/non-tracking, it stays that way, regardless of later changes to treesit-languages-require-line-column-tracking.

To make calculating line/column positions fast, we store linecol caches for begv, point, and zv in the buffer (buf->ts_linecol_cache_xxx); and in the parser object, we store linecol caches for the visible beg/end of that parser.

In buffer editing functions, we need the linecol for start/old_end/new_end. Those can be calculated by scanning newlines (treesit_linecol_of_pos) from the buffer point cache, which should always be near point. And we usually set the calculated linecol of new_end back to the buffer point cache.

We also need to calculate linecol for the visible_beg/end of each parser, and linecol for the buffer's begv/zv. These positions are usually far from point, so we have caches for all of them (in either the parser object or the buffer). Because they are far from point, it's inefficient to scan newlines from point to there to get up-to-date linecol for them; but at the same time, because they're far away and outside the changed region, we can calculate their change in line and column number by simply counting how many newlines are added/removed in the changed region (compute_new_linecol_by_change).

* doc/lispref/parsing.texi (Using Parser): Mention line-column tracking in manual.
* etc/NEWS: Add news.
* lisp/treesit.el:
(treesit-languages-need-line-column-tracking): New variable.
* src/buffer.c: Include treesit.h (for TREESIT_EMPTY_LINECOL).
(Fget_buffer_create):
(Fmake_indirect_buffer): Initialize new buffer fields.
(Fbuffer_swap_text): Add new buffer fields.
* src/buffer.h (ts_linecol): New struct.
(buffer): New buffer fields.
(BUF_TS_LINECOL_BEGV):
(BUF_TS_LINECOL_POINT):
(BUF_TS_LINECOL_ZV):
(SET_BUF_TS_LINECOL_BEGV):
(SET_BUF_TS_LINECOL_POINT):
(SET_BUF_TS_LINECOL_ZV): New inline functions.
* src/casefiddle.c (casify_region): Record linecol info.
* src/editfns.c (Fsubst_char_in_region):
(Ftranslate_region_internal):
(Ftranspose_regions): Record linecol info.
* src/insdel.c (insert_1_both):
(insert_from_string_1):
(insert_from_gap_1):
(insert_from_buffer):
(replace_range):
(del_range_2): Record linecol info.
* src/treesit.c (TREESIT_BOB_LINECOL):
(TREESIT_EMPTY_LINECOL):
(TREESIT_TS_POINT_1_0): New constants.
(treesit_debug_print_linecol):
(treesit_buf_tracks_linecol_p):
(restore_restriction_and_selective_display):
(treesit_count_lines):
(treesit_debug_validate_linecol):
(treesit_linecol_of_pos):
(treesit_make_ts_point):
(Ftreesit_tracking_line_column_p):
(Ftreesit_parser_tracking_line_column_p): New functions.
(treesit_tree_edit_1): Accept real TSPoint and pass to tree-sitter.
(compute_new_linecol_by_change): New function.
(treesit_record_change_1): Rename from treesit_record_change, handle linecol if tracking is enabled.
(treesit_linecol_maybe): New function.
(treesit_record_change): New wrapper around treesit_record_change_1 that handles some boilerplate and sets buffer state.
(treesit_sync_visible_region): Handle linecol if tracking is enabled.
(make_treesit_parser): Setup parser's linecol cache if tracking is enabled.
(Ftreesit_parser_create): Enable tracking if the parser's language requires it.
(Ftreesit__linecol_at):
(Ftreesit__linecol_cache_set):
(Ftreesit__linecol_cache): New functions for debugging and testing.
(syms_of_treesit): New variable Vtreesit_languages_require_line_column_tracking.
* src/treesit.h (Lisp_TS_Parser): New fields.
(TREESIT_BOB_LINECOL):
(TREESIT_EMPTY_LINECOL): New constants.
* test/src/treesit-tests.el (treesit-linecol-basic):
(treesit-linecol-search-back-across-newline):
(treesit-linecol-col-same-line):
(treesit-linecol-enable-disable): New tests.
* src/lisp.h: Declare display_count_lines.
* src/xdisp.c (display_count_lines): Remove static keyword.
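
The two strategies described in the commit message (scanning newlines from a nearby cache, and shifting a far-away cached position by the change in the edited region) can be sketched in standalone C. This is only an illustration under stated assumptions, not the code in treesit.c: `struct linecol`, `linecol_of_pos`, and `shift_linecol` are hypothetical names, the text is a plain NUL-terminated string rather than an Emacs buffer, positions are 0-based byte offsets, lines are 1-based, and columns are 0-based byte counts.

```c
#include <stddef.h>

/* Hypothetical stand-in for a cached line/column pair; the real
   struct in src/buffer.h may be laid out differently.  */
struct linecol
{
  ptrdiff_t pos;   /* 0-based byte offset into the text */
  ptrdiff_t line;  /* 1-based line number */
  ptrdiff_t col;   /* 0-based byte column */
};

/* Compute the line/column of TARGET by counting newlines between a
   cached position and TARGET.  Cheap when the cache is nearby; works
   whether TARGET is before or after the cache.  */
static struct linecol
linecol_of_pos (const char *text, struct linecol cache, ptrdiff_t target)
{
  ptrdiff_t line = cache.line;

  if (target >= cache.pos)
    {
      for (ptrdiff_t i = cache.pos; i < target; i++)
        if (text[i] == '\n')
          line++;
    }
  else
    {
      for (ptrdiff_t i = target; i < cache.pos; i++)
        if (text[i] == '\n')
          line--;
    }

  /* The column is the distance back to the beginning of the line.  */
  ptrdiff_t bol = target;
  while (bol > 0 && text[bol - 1] != '\n')
    bol--;

  return (struct linecol) { target, line, target - bol };
}

/* Update a cached position DISTANT that lies at or after the changed
   region, without rescanning the text between the change and DISTANT.
   OLD_END is the end of the replaced region before the edit, NEW_END
   the end of the replacement after the edit.  */
static struct linecol
shift_linecol (struct linecol distant, struct linecol old_end,
               struct linecol new_end)
{
  struct linecol out;

  out.pos = distant.pos + (new_end.pos - old_end.pos);
  /* The line delta equals newlines added minus newlines removed.  */
  out.line = distant.line + (new_end.line - old_end.line);
  /* The column only changes if DISTANT shared a line with OLD_END.  */
  out.col = (distant.line == old_end.line
             ? new_end.col + (distant.col - old_end.col)
             : distant.col);
  return out;
}
```

In this sketch the cost of `shift_linecol` is constant no matter how far the cached position is from the edit, which is the reason the commit message gives for keeping separate caches for positions far from point.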
||
|---|---|---|
| .. | ||
| charsets | ||
| e | ||
| forms | ||
| gnus | ||
| images | ||
| nxml | ||
| org | ||
| refcards | ||
| schema | ||
| srecode | ||
| themes | ||
| tutorials | ||
| AUTHORS | ||
| CALC-NEWS | ||
| ChangeLog.1 | ||
| compilation.txt | ||
| COPYING | ||
| copyright-assign.txt | ||
| DEBUG | ||
| DEVEL.HUMOR | ||
| DISTRIB | ||
| edt-user.el | ||
| EGLOT-NEWS | ||
| emacs-buffer.gdb | ||
| emacs-mail.desktop | ||
| emacs.desktop | ||
| emacs.icon | ||
| emacs.metainfo.xml | ||
| emacs.service | ||
| emacs_lldb.py | ||
| emacsclient-mail.desktop | ||
| emacsclient.desktop | ||
| enriched.txt | ||
| ERC-NEWS | ||
| ETAGS.EBNF | ||
| ETAGS.README | ||
| future-bug | ||
| gnus-tut.txt | ||
| grep.txt | ||
| HELLO | ||
| HISTORY | ||
| JOKES | ||
| MACHINES | ||
| MH-E-NEWS | ||
| NEWS | ||
| NEWS.1-17 | ||
| NEWS.18 | ||
| NEWS.19 | ||
| NEWS.20 | ||
| NEWS.21 | ||
| NEWS.22 | ||
| NEWS.23 | ||
| NEWS.24 | ||
| NEWS.25 | ||
| NEWS.26 | ||
| NEWS.27 | ||
| NEWS.28 | ||
| NEWS.29 | ||
| NEWS.30 | ||
| NEXTSTEP | ||
| NXML-NEWS | ||
| ORG-NEWS | ||
| org.gnu.emacs.defaults.gschema.xml | ||
| package-keyring.gpg | ||
| PROBLEMS | ||
| ps-prin0.ps | ||
| ps-prin1.ps | ||
| publicsuffix.txt | ||
| README | ||
| rgb.txt | ||
| ses-example.ses | ||
| spook.lines | ||
| symbol-releases.eld | ||
| TERMS | ||
| TODO | ||
| w32-feature.el | ||
| yow.lines | ||
This directory contains the architecture-independent files used by or with Emacs. This includes some text files of documentation for GNU Emacs or of interest to Emacs users, and the file of dumped docstrings for Emacs functions and variables.

COPYRIGHT AND LICENSE INFORMATION FOR IMAGE FILES

File: emacs.icon
  Author: Sun Microsystems, Inc
  Copyright (C) 1999, 2001-2025 Free Software Foundation, Inc.
  License: GNU General Public License version 3 or later (see COPYING)