mirror of git://git.sv.gnu.org/emacs.git synced 2025-12-15 10:30:25 -08:00

; Fix typos

Stefan Kangas 2024-07-07 17:40:31 +02:00
parent 41dc28244f
commit a6cab228d4
74 changed files with 250 additions and 256 deletions

@ -3,14 +3,14 @@ TREE-SITTER PERFORMANCE NOTES -*- org -*-
* Facts
Incremental parsing of a few characters worth of edit usually takes
less than 0.1ms. If it takes longer than that, something is wrong.
There's one time where I found tree-sitter-c takes ~30ms to
incremental parse. Updating to the latest version of tree-sitter-c
solves it, so I didn't investigate further.
The ranges set for a parser doesn't grow when you insert text into a
range, so you have to update the ranges every time before
parsing. Fortunately, changing ranges doesn't invalidate incremental
parsing, so there isn't any performance lost in update ranges
frequently.

@ -35,8 +35,8 @@ merged) and rebuild Emacs.
* Install language definitions
Tree-sitter by itself doesn't know how to parse any particular
language. We need to install language definitions (or “grammars”) for
a language to be able to parse it. There are a couple of ways to get
them.
You can use this script that I put together here:
@ -45,7 +45,7 @@ You can use this script that I put together here:
This script automatically pulls and builds language definitions for C,
C++, Rust, JSON, Go, HTML, JavaScript, CSS, Python, Typescript,
C#, etc. Better yet, I pre-built these language definitions for
GNU/Linux and macOS, they can be downloaded here:
https://github.com/casouri/tree-sitter-module/releases/tag/v2.1
@ -56,19 +56,19 @@ To build them yourself, run
cd tree-sitter-module
./batch.sh
and language definitions will be in the /dist directory. You can
either copy them to standard dynamic library locations of your system,
-eg, /usr/local/lib, or leave them in /dist and later tell Emacs where
+e.g., /usr/local/lib, or leave them in /dist and later tell Emacs where
to find language definitions by setting treesit-extra-load-path.
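As a minimal sketch of the second option, assuming the grammars were built
into a hypothetical ~/src/tree-sitter-module/dist directory, an init file
could contain something like:

#+begin_src elisp
;; Hypothetical location of the built grammars; adjust to your setup.
(setq treesit-extra-load-path
      (list (expand-file-name "~/src/tree-sitter-module/dist")))

;; Returns non-nil if Emacs can find and load the Python grammar.
(treesit-language-available-p 'python)
#+end_src
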
Language definition sources can be found on GitHub under
tree-sitter/xxx, like tree-sitter/tree-sitter-python. The tree-sitter
organization has all the "official" language definitions:
https://github.com/tree-sitter
Alternatively, you can use treesit-install-language-grammar command
and follow its instructions. If everything goes right, it should
automatically download and compile the language grammar for you.
* Setting up for adding major mode features
@ -91,7 +91,7 @@ Tree-sitter modes should be separate major modes, so other modes
inheriting from the original mode don't break if tree-sitter is
enabled. For example js2-mode inherits js-mode, we can't enable
tree-sitter in js-mode, lest js-mode would not setup things that
js2-mode expects to inherit from. So it's best to use separate major
modes.
If the tree-sitter variant and the "native" variant could share some
@ -115,12 +115,12 @@ symbol (variable, function).
Tree-sitter works like this: You provide a query made of patterns and
capture names, tree-sitter finds the nodes that match these patterns,
tag the corresponding capture names onto the nodes and return them to
you. The query function returns a list of (capture-name . node). For
font-lock, we use face names as capture names. And the captured node
will be fontified in their capture name.
The capture name could also be a function, in which case (NODE
OVERRIDE START END) is passed to the function for fontification. START
and END are the start and end of the region to be fontified. The
function should only fontify within that region. The function should
also allow more optional arguments with (&rest _), for future
@ -131,11 +131,11 @@ treesit-font-lock-rules.
There are two types of nodes, named, like (identifier),
(function_definition), and anonymous, like "return", "def", "(",
"}". Parent-child relationship is expressed as
(parent (child) (child) (child (grand_child)))
-Eg, an argument list (1, "3", 1) could be:
+For example, an argument list (1, "3", 1) could be:
(argument_list "(" (number) (string) (number) ")")
@ -167,7 +167,7 @@ But how do one come up with the queries? Take python for an example,
open any python source file, type M-x treesit-explore-mode RET. Now
you should see the parse-tree in a separate window, automatically
updated as you select text or edit the buffer. Besides this, you can
consult the grammar of the language definition. For example, Python's
grammar file is at
https://github.com/tree-sitter/tree-sitter-python/blob/master/grammar.js
@ -182,24 +182,24 @@ The manual explains how to read grammar files in the bottom of section
** Debugging queries
If your query has problems, use treesit-query-validate to debug the
query. It will pop a buffer containing the query (in text format) and
mark the offending part in red.
** Code
To enable tree-sitter font-lock, set treesit-font-lock-settings and
treesit-font-lock-feature-list buffer-locally and call
treesit-major-mode-setup. For example, see
python--treesit-settings in python.el. Below is a snippet of it.
Just like the current font-lock, if the to-be-fontified region already
has a face (ie, an earlier match fontified part/all of the region),
the new face is discarded rather than applied. If you want later
matches always override earlier matches, use the :override keyword.
Each rule should have a :feature, like function-name,
string-interpolation, builtin, etc. Users can then enable/disable each
feature individually. See Appendix 1 at the bottom for a set of common
features names.
#+begin_src elisp
@ -267,17 +267,17 @@ Indent works like this: We have a bunch of rules that look like
(MATCHER ANCHOR OFFSET)
When the indentation process starts, point is at the BOL of a line, we
want to know which column to indent this line to. Let NODE be the node
at point, we pass this node to the MATCHER of each rule, one of them
-will match the node (eg, "this node is a closing bracket!"). Then we
-pass the node to the ANCHOR, which returns a point, eg, the BOL of the
-previous line. We find the column number of that point (eg, 4), add
-OFFSET to it (eg, 0), and that is the column we want to indent the
+will match the node (e.g., "this node is a closing bracket!"). Then we
+pass the node to the ANCHOR, which returns a point, e.g., the BOL of the
+previous line. We find the column number of that point (e.g., 4), add
+OFFSET to it (e.g., 0), and that is the column we want to indent the
current line to (4 + 0 = 4).
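A rule set in this shape might look like the sketch below. The language
symbol foo and the node type names are assumptions made up for
illustration, not taken from a real grammar; node-is, parent-is and
parent-bol are among the builtin matchers and anchors.

#+begin_src elisp
;; Illustrative rules for a hypothetical C-like language `foo'.
(setq-local treesit-simple-indent-rules
            '((foo
               ;; A closing brace lines up with the BOL of its parent.
               ((node-is "}") parent-bol 0)
               ;; Anything inside a block is indented 4 columns further.
               ((parent-is "block") parent-bol 4))))
#+end_src

The first rule whose matcher succeeds is used, so more specific rules
should generally come first.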
Matchers and anchors are functions that takes (NODE PARENT BOL &rest
_). Matches return nil/non-nil for no match/match, and anchors return
the anchor point. Below are some convenient builtin matchers and anchors.
For MATCHER we have
@ -289,8 +289,8 @@ For MATCHER we have
(match NODE-TYPE PARENT-TYPE NODE-FIELD
NODE-INDEX-MIN NODE-INDEX-MAX)
-=> checks everything. If an argument is nil, don't match that. Eg,
-(match nil TYPE) is the same as (parent-is TYPE)
+=> checks everything. If an argument is nil, don't match that.
+E.g., (match nil TYPE) is the same as (parent-is TYPE)
For ANCHOR we have
@ -305,8 +305,8 @@ For ANCHOR we have
There is also a manual section for indent: "Parser-based Indentation".
When writing indent rules, you can use treesit-check-indent to
check if your indentation is correct. To debug what went wrong, set
treesit--indent-verbose to non-nil. Then when you indent, Emacs
tells you which rule is applied in the echo area.
#+begin_src elisp
@ -355,7 +355,7 @@ Set treesit-simple-imenu-settings and call
* Navigation
Set treesit-defun-type-regexp and call
treesit-major-mode-setup. You can additionally set
treesit-defun-name-function.
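For instance, a mode for a Python-like grammar might do something along
these lines. The node type names are an assumption here and must match
the grammar actually in use.

#+begin_src elisp
;; Treat function and class definitions as defuns (assumed node
;; types: "function_definition" and "class_definition").
(setq-local treesit-defun-type-regexp
            (rx (or "function" "class") "_definition"))
;; Then, in the major mode body:
(treesit-major-mode-setup)
#+end_src
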
* Which-func
@ -370,7 +370,7 @@ find the current function by treesit-defun-at-point.
Obviously this list is just a starting point, if there are features in
the major mode that would benefit from a parse tree, adding tree-sitter
support for that would be great. But in the minimal case, just adding
font-lock is awesome.
* Common tasks
@ -403,12 +403,12 @@ BTW treesit-node-string does different things.
* Manual
I suggest you read the manual section for tree-sitter in Info. The
section is Parsing Program Source. Typing
C-h i d m elisp RET g Parsing Program Source RET
will bring you to that section. You don't need to read through every
sentence, just read the text paragraphs and glance over function
names.
@ -439,13 +439,13 @@ error highlight parse error
Abstract features:
-assignment: the LHS of an assignment (thing being assigned to), eg:
+assignment: the LHS of an assignment (thing being assigned to), e.g.:
a = b <--- highlight a
a.b = c <--- highlight b
a[1] = d <--- highlight a
-definition: the thing being defined, eg:
+definition: the thing being defined, e.g.:
int a(int b) { <--- highlight a
return 0

@ -47,7 +47,7 @@ EXCEPTIONS
There are a couple of functions that replaces characters in-place
rather than insert/delete. They are in casefiddle.c and editfns.c.
In casefiddle.c, do_casify_unibyte_region and
do_casify_multibyte_region modifies buffer, but they are static
@ -177,7 +177,7 @@ all safe.
json.c:790: signal_after_change (PT, 0, inserted);
Called in json-insert, calls either decode_coding_gap or
insert_from_gap_1, both are safe. Calls memmove but it's for
decode_coding_gap.
keymap.c:2873: /* Insert calls signal_after_change which may GC. */