TiddlyWiki5/plugins/tiddlywiki/freelinks
s793016 cda8d7ca8c
Aho-Corasick Freelinks Enhancement for Large Wikis and Non-Latin Titles (#9084)
* Enhance Freelinks with Aho-Corasick for long titles and large wikis

Replaces regex with Aho-Corasick, adds chunking (100 titles/chunk), cache toggle, and Chinese full-width symbol support. Tested with 11,714 tiddlers.

* delete comment

* Create AhoCorasick.js

* Update text.js

* Update text.js

* Update AhoCorasick.js

* Update text.js

* Update text.js

* move AhoCorasick to AhoCorasick.js

* update AhoCorasick.js

* Delete core/modules/utils/AhoCorasick.js

wrong place

* Update text.js

* indentation modify

* remove function {}

* remove function {}

* Rename AhoCorasick.js to aho-corasick.js

correct filename

* Update tiddlywiki.info add freelink

* missing a comma here

* clean up comments & use old style

* try add it to editions/tw5.com-server for testing

* try add it to editions/prerelease for testing

* optimized

* optimized

* add setting for "Persist AhoCorasick cache"

* add dynamic limits

* remove comment

* revert to 5f0b98d1fd

* try sort alphabet

* try sort alphabet

* try sort alphabet

* typo freelink -> freelinks

* typo freelink -> freelinks

* typo freelink -> freelinks

* Update readme.tid

* Update aho-corasick.js

Dynamically adjust limit parameters to avoid problems caused by hard-coded limits.

* Update text.js

Dynamically adjust limit parameters to avoid problems caused by hard-coded limits.

* Update tiddlywiki.info

remove other plugin for test plugin conflict

* Update tiddlywiki.info

* Update tiddlywiki.info

* Update aho-corasick.js

Description of major changes

Improve state transition logic - Ensure the automaton returns to the root node correctly on a mismatch, and check the root node for a transition on the current character
Fix failure link traversal - Add the condition node !== this.trie to avoid an infinite loop at the root node
Enhance output collection - Collect output not only from the current node but also from all nodes on the failure link path, which is key to the Aho-Corasick algorithm
Add safety limit - collectCount < 10 to prevent failure link loops

* Update aho-corasick.js

Word Boundary Check - The isWordBoundaryMatch function checks whether a match falls on a word boundary:

Characters matching [a-zA-Z0-9_] are treated as word characters
A match is valid only if the characters immediately before and after it are non-word characters

* Update text.js

* Update settings.tid

* fix Word Boundary logic

* remove PersistentCache @ text.js

* remove PersistentCache @settings.tid

* Update readme.tid for Word Boundary Check

* Update aho-corasick.js Organize and delete comments

* Initial commit of freelinks plugin

* Update settings.tid Organize and delete comments

* Update tiddlywiki.info add back other plugin

* Update tiddlywiki.info alphabet sort

* Update readme.tid for new features

The plugin supports non-Western language tiddler titles (e.g., Chinese) and prioritizes longer tiddler titles for matching, ensuring accurate linking in diverse contexts. 

Furthermore, the current tiddler title within its own content is excluded from generating links to avoid self-referencing.

* Update readme.tid

* Update plugins/tiddlywiki/freelinks/text.js

Co-authored-by: Mario Pietsch <pmariojo@gmail.com>

* Update plugins/tiddlywiki/freelinks/aho-corasick.js

Co-authored-by: Mario Pietsch <pmariojo@gmail.com>

* Update plugins/tiddlywiki/freelinks/aho-corasick.js

Co-authored-by: Mario Pietsch <pmariojo@gmail.com>

* Update plugins/tiddlywiki/freelinks/text.js

Co-authored-by: Mario Pietsch <pmariojo@gmail.com>

* Update plugins/tiddlywiki/freelinks/aho-corasick.js

Co-authored-by: Mario Pietsch <pmariojo@gmail.com>

* Update text.js

Added locale configuration support - Added LOCALE_CONFIG_TIDDLER constant to make the sorting locale configurable instead of hardcoded "zh"
Optimized title processing - Combined the filtering and escaping logic into a single pass to reduce duplication
Added trim() for ignoreCase - Applied .trim() to the ignore case variable for consistency
Enhanced refresh logic - Added locale configuration tiddler to the refresh check
Improved comments - Added explanation for why sorting is necessary (prioritizing longer titles)

* Update text.js

we don't need to specify 'zh' at all

* Update aho-corasick.js

This single line change would add support for:

Accented letters: á, é, í, ó, ú, à, è, ì, ò, ù, ä, ë, ï, ö, ü, ñ, ç, etc.
Most Western European languages (Spanish, French, German, Italian, Portuguese, etc.)

* Update aho-corasick.js usage

* Update readme.tid for Writing Style

* Update tiddlywiki.info

revert all the changes

* Update tiddlywiki.info

revert all the changes

* Update tiddlywiki.info

revert all the changes

* Update tiddlywiki.info

revert

* Update text.js

plugins/tiddlywiki/freelinks/text.js#L25
[ESLint PR code] reported by reviewdog 🐶
Strings must use doublequote.

* Update aho-corasick.js

plugins/tiddlywiki/freelinks/aho-corasick.js#L193
[ESLint PR code] reported by reviewdog 🐶
Strings must use doublequote.

Raw Output:
{"ruleId":"@stylistic/quotes","severity":2,"message":"Strings must use doublequote.","line":193,"column":50,"nodeType":"Literal","messageId":"wrongQuotes","endLine":193,"endColumn":52,"fix":{"range":[5743,5745],"text":"\"\""}}

---------

Co-authored-by: Mario Pietsch <pmariojo@gmail.com>
2025-10-29 17:41:35 +00:00
aho-corasick.js Aho-Corasick Freelinks Enhancement for Large Wikis and Non-Latin Titles (#9084) 2025-10-29 17:41:35 +00:00
config-Freelinks-Enable.tid Initial commit of freelinks plugin 2020-01-03 10:40:35 +00:00
config-Freelinks-WordBoundary.tid Aho-Corasick Freelinks Enhancement for Large Wikis and Non-Latin Titles (#9084) 2025-10-29 17:41:35 +00:00
macros-view.tid Freelinks plugin: Add support for ignoring case 2020-05-02 14:07:39 +01:00
plain-text.js Remove module function wrapper and add matching configurations for dprint and eslint (#7596) 2025-03-21 17:22:57 +00:00
plugin.info Add plugin stability badges (#8198) 2024-05-21 11:22:39 +01:00
readme.tid Aho-Corasick Freelinks Enhancement for Large Wikis and Non-Latin Titles (#9084) 2025-10-29 17:41:35 +00:00
settings.tid Aho-Corasick Freelinks Enhancement for Large Wikis and Non-Latin Titles (#9084) 2025-10-29 17:41:35 +00:00
styles.tid Add a faint background to freelinks 2020-01-04 16:33:52 +00:00
text.js Aho-Corasick Freelinks Enhancement for Large Wikis and Non-Latin Titles (#9084) 2025-10-29 17:41:35 +00:00

title: $:/plugins/tiddlywiki/freelinks/readme

This plugin adds automatic generation of links to tiddler titles.

The plugin uses the Aho-Corasick algorithm for efficient pattern matching with large numbers of tiddlers.

!! Configuration

Freelinking is activated for runs of text that have the following variables set:

* `tv-wikilinks` is NOT equal to `no`
* `tv-freelinks` is set to `yes`

Freelinks are case sensitive by default but can be configured to ignore case in the settings panel.

Word boundary checking can be configured in the settings panel. When enabled (default for Western languages), links are created only for complete words. When disabled, partial matches within words are also linked.
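
As a rough illustration of the word boundary check, the sketch below accepts a match only when the characters on either side are not word characters. The helper name `isWholeWordMatch` is hypothetical, the word character set `[a-zA-Z0-9_]` is taken from the plugin's commit notes and may be narrower than what the plugin now uses, and treating the start and end of the text as boundaries is an assumption.

```javascript
// Hypothetical helper, not the plugin's own code.
// Assumption: word characters are [a-zA-Z0-9_].
var WORD_CHAR = /[a-zA-Z0-9_]/;

function isWholeWordMatch(text, start, end) {
	// Assumption: the start and end of the text count as boundaries
	var before = start > 0 ? text.charAt(start - 1) : "",
		after = end < text.length ? text.charAt(end) : "";
	return !WORD_CHAR.test(before) && !WORD_CHAR.test(after);
}
```

Under these assumptions, a title "cat" is accepted in "a cat sat" but rejected inside "concatenate". Chinese text has no such word characters around a title, so a match there is always accepted.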

When multiple tiddler titles match the same text, longer titles take precedence over shorter ones.
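
One way to picture this precedence rule is as a selection step over overlapping matches. The sketch below is an assumption about how such a step could work, not a description of the plugin's code; `pickLongestMatches` and the `{start, end, title}` match shape are hypothetical.

```javascript
// Hypothetical sketch: keep the longest match wherever matches overlap.
function pickLongestMatches(matches) {
	// Longest first; ties are broken by the earliest start position
	var sorted = matches.slice().sort(function(a, b) {
		return (b.end - b.start) - (a.end - a.start) || a.start - b.start;
	});
	var chosen = [];
	sorted.forEach(function(m) {
		// Skip any match that overlaps one already chosen
		var overlaps = chosen.some(function(c) {
			return m.start < c.end && c.start < m.end;
		});
		if(!overlaps) {
			chosen.push(m);
		}
	});
	// Return the survivors in document order
	return chosen.sort(function(a, b) {
		return a.start - b.start;
	});
}
```

With titles "New" and "New York" both matching at the start of "New York City", only "New York" survives, so the text is linked once to the longer title.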

Tiddlers do not create links to themselves.

Use `$:/config/Freelinks/TargetFilter` to define which tiddlers are eligible for auto-linking.

Within view templates, the variable `tv-freelinks` is automatically set to the content of `$:/config/Freelinks/Enable`, which can be set via the settings panel of this plugin.

!! Notes

Changing which tiddlers freelinking occurs in requires customising the shadow tiddler [[$:/plugins/tiddlywiki/freelinks/macros/view]]. This tiddler is tagged `$:/tags/Macro/View`, which means that it is included as a local macro in each view template. By default, its content is:

```
<$set name="tv-freelinks" value={{$:/config/Freelinks/Enable}}/>
```

This means that for each tiddler the variable `tv-freelinks` is set to the text of the tiddler `$:/config/Freelinks/Enable`, which is set to "yes" or "no" by the settings in the control panel.

Instead, we can use a filter expression to, say, only freelink within the tiddler with the title "HelloThere":

```
<$set name="tv-freelinks" value={{{ [<currentTiddler>match[HelloThere]then[yes]else[no]] }}}/>
```

Or, we can make a filter that will only freelink within tiddlers with the tag "MyTag":

```
<$set name="tv-freelinks" value={{{ [<currentTiddler>tag[MyTag]then[yes]else[no]] }}}/>
```

Or we can combine both approaches:

```
<$set name="tv-freelinks" value={{{ [<currentTiddler>match[HelloThere]] ~[<currentTiddler>tag[MyTag]] +[then[yes]else[no]] }}}/>
```

!! Implementation Details

The Aho-Corasick algorithm implementation includes:

* Unicode character support for international text
* Prevention of self-referential links within the current tiddler
* Performance safeguards including depth protection and result limiting
* Graceful fallback handling for invalid patterns

Longer tiddler titles take precedence over shorter ones when multiple matches are possible.
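
For readers unfamiliar with the algorithm, here is a minimal sketch of Aho-Corasick matching: a trie of titles, failure links built breadth-first, and a scan that collects matches from the current node and from every node on its failure path. It is an illustration only. The names `buildAutomaton` and `search` and the node layout are hypothetical, and the plugin's depth protection, result limits and word boundary handling are left out.

```javascript
// Hypothetical sketch of Aho-Corasick; not the plugin's aho-corasick.js.
function buildAutomaton(patterns) {
	// Each node holds child transitions, a failure link and the patterns ending here
	var root = {next: Object.create(null), fail: null, out: []};
	patterns.forEach(function(pattern) {
		var node = root;
		for(var i = 0; i < pattern.length; i++) {
			var ch = pattern.charAt(i);
			if(!node.next[ch]) {
				// New nodes fail to the root until the breadth-first pass refines them
				node.next[ch] = {next: Object.create(null), fail: root, out: []};
			}
			node = node.next[ch];
		}
		node.out.push(pattern);
	});
	// Breadth-first pass: a child's failure link is derived from its parent's
	var queue = Object.keys(root.next).map(function(ch) {
		return root.next[ch];
	});
	while(queue.length > 0) {
		var parent = queue.shift();
		Object.keys(parent.next).forEach(function(ch) {
			var child = parent.next[ch],
				fail = parent.fail;
			while(fail !== root && !fail.next[ch]) {
				fail = fail.fail;
			}
			child.fail = fail.next[ch] || root;
			queue.push(child);
		});
	}
	return root;
}

function search(root, text) {
	var matches = [],
		node = root;
	for(var i = 0; i < text.length; i++) {
		var ch = text.charAt(i);
		// On a mismatch follow failure links, stopping at the root
		while(node !== root && !node.next[ch]) {
			node = node.fail;
		}
		node = node.next[ch] || root;
		// Collect outputs from this node and from every node on its failure path
		for(var n = node; n !== root; n = n.fail) {
			for(var k = 0; k < n.out.length; k++) {
				var pattern = n.out[k];
				matches.push({start: i - pattern.length + 1, end: i + 1, title: pattern});
			}
		}
	}
	return matches;
}
```

For example, `search(buildAutomaton(["he","she","hers"]),"ushers")` reports "she", "he" and "hers" in a single pass over the text. Because the scan visits each character once regardless of how many titles exist, the approach suits wikis with many tiddlers better than testing one pattern per title.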