mirror of
https://github.com/Jermolene/TiddlyWiki5.git
synced 2025-12-05 18:20:38 -08:00
Aho-Corasick Freelinks Enhancement for Large Wikis and Non-Latin Titles (#9084)
* Enhance Freelinks with Aho-Corasick for long titles and large wikis
Replaces regex with Aho-Corasick, adds chunking (100 titles/chunk), cache toggle, and Chinese full-width symbol support. Tested with 11,714 tiddlers.
* delete comment
* Create AhoCorasick.js
* Update text.js
* Update text.js
* Update AhoCorasick.js
* Update text.js
* Update text.js
* move AhoCorasick to AhoCorasick.js
* update AhoCorasick.js
* Delete core/modules/utils/AhoCorasick.js
wrong place
* Update text.js
* indentation modify
* remove function {}
* remove function {}
* Rename AhoCorasick.js to aho-corasick.js
correct filename
* Update tiddlywiki.info add freelink
* missing a comma here
* clean up comments & use old style
* try add it to editions/tw5.com-server for testing
* try add it to editions/prerelease for testing
* optimized
* optimized
* add setting for "Persist AhoCorasick cache"
* add dynamic limits
* remove comment
* revert to 5f0b98d1fd
* try sort alphabet
* try sort alphabet
* try sort alphabet
* typo freelink -> freelinks
* typo freelink -> freelinks
* typo freelink -> freelinks
* Update readme.tid
* Update aho-corasick.js
Dynamically adjust limit parameters to avoid problems caused by hard-coded limits.
* Update text.js
Dynamically adjust limit parameters to avoid problems caused by hard-coded limits.
* Update tiddlywiki.info
remove other plugin for test plugin conflict
* Update tiddlywiki.info
* Update tiddlywiki.info
* Update aho-corasick.js
Description of major changes
Improve state transition logic - Ensure the search returns to the root node correctly on a mismatch, and check the root node for a transition on the current character
Fix failure link traversal - Add the condition node !== this.trie to avoid an infinite loop at the root node
Enhance output collection - Collect output not only from the current node but also from all nodes on the failure link path, which is key to the Aho-Corasick algorithm
Add safety limit - collectCount < 10 to prevent failure link loops
* Update aho-corasick.js
Word Boundary Check - The isWordBoundaryMatch function checks if the match is on a word boundary:
Alphanumeric characters and the underscore [a-zA-Z0-9_] are regarded as word characters
At least one non-word character must be present before and after the match for it to be considered valid.
* Update text.js
Word Boundary Check - The isWordBoundaryMatch function checks if the match is on a word boundary:
Alphanumeric characters and the underscore [a-zA-Z0-9_] are regarded as word characters
At least one non-word character must be present before and after the match for it to be considered valid.
* Update settings.tid
Word Boundary Check - The isWordBoundaryMatch function checks if the match is on a word boundary:
Alphanumeric characters and the underscore [a-zA-Z0-9_] are regarded as word characters
At least one non-word character must be present before and after the match for it to be considered valid.
* fix Word Boundary logic
* remove PersistentCache @ text.js
* remove PersistentCache @settings.tid
* Update readme.tid for Word Boundary Check
* Update aho-corasick.js Organize and delete comments
* Initial commit of freelinks plugin
* Update settings.tid Organize and delete comments
* Update tiddlywiki.info add back other plugin
* Update tiddlywiki.info alphabet sort
* Update readme.tid for new feature
The plugin supports non-Western language tiddler titles (e.g., Chinese) and prioritizes longer tiddler titles for matching, ensuring accurate linking in diverse contexts.
Furthermore, the current tiddler title within its own content is excluded from generating links to avoid self-referencing.
* Update readme.tid
* Update plugins/tiddlywiki/freelinks/text.js
Co-authored-by: Mario Pietsch <pmariojo@gmail.com>
* Update plugins/tiddlywiki/freelinks/aho-corasick.js
Co-authored-by: Mario Pietsch <pmariojo@gmail.com>
* Update plugins/tiddlywiki/freelinks/aho-corasick.js
Co-authored-by: Mario Pietsch <pmariojo@gmail.com>
* Update plugins/tiddlywiki/freelinks/text.js
Co-authored-by: Mario Pietsch <pmariojo@gmail.com>
* Update plugins/tiddlywiki/freelinks/aho-corasick.js
Co-authored-by: Mario Pietsch <pmariojo@gmail.com>
* Update text.js
Added locale configuration support - Added LOCALE_CONFIG_TIDDLER constant to make the sorting locale configurable instead of hardcoded "zh"
Optimized title processing - Combined the filtering and escaping logic into a single pass to reduce duplication
Added trim() for ignoreCase - Applied .trim() to the ignore case variable for consistency
Enhanced refresh logic - Added locale configuration tiddler to the refresh check
Improved comments - Added explanation for why sorting is necessary (prioritizing longer titles)
* Update text.js
we don't need to specify 'zh' at all
* Update aho-corasick.js
This single line change would add support for:
Accented letters: á, é, í, ó, ú, à, è, ì, ò, ù, ä, ë, ï, ö, ü, ñ, ç, etc.
Most Western European languages (Spanish, French, German, Italian, Portuguese, etc.)
* Update aho-corasick.js usage
* Update readme.tid for Writing Style
* Update tiddlywiki.info
revert all the changes
* Update tiddlywiki.info
revert all the changes
* Update tiddlywiki.info
revert all the changes
* Update tiddlywiki.info
revert
* Update text.js
plugins/tiddlywiki/freelinks/text.js#L25
[ESLint PR code] reported by reviewdog 🐶
Strings must use doublequote.
* Update aho-corasick.js
plugins/tiddlywiki/freelinks/aho-corasick.js#L193
[ESLint PR code] reported by reviewdog 🐶
Strings must use doublequote.
Raw Output:
{"ruleId":"@stylistic/quotes","severity":2,"message":"Strings must use doublequote.","line":193,"column":50,"nodeType":"Literal","messageId":"wrongQuotes","endLine":193,"endColumn":52,"fix":{"range":[5743,5745],"text":"\"\""}}
---------
Co-authored-by: Mario Pietsch <pmariojo@gmail.com>
This commit is contained in:
parent
f38e9f0822
commit
cda8d7ca8c
5 changed files with 507 additions and 113 deletions
242
plugins/tiddlywiki/freelinks/aho-corasick.js
Normal file
|
|
@ -0,0 +1,242 @@
/*\
title: $:/core/modules/utils/aho-corasick.js
type: application/javascript
module-type: utils

Optimized Aho-Corasick string matching algorithm implementation with enhanced performance
and error handling for TiddlyWiki freelinking functionality.

Usage:

Initialization:
Create an AhoCorasick instance: var ac = new AhoCorasick();
After initialization, the trie and failure structures are automatically created to store patterns and failure links.

Adding Patterns:
Call addPattern(pattern, index) to add a pattern, e.g., ac.addPattern("[[Link]]", 0);.
pattern is the string to match, and index is an identifier for tracking results.
Multiple patterns can be added, stored in the trie structure.

Building Failure Links:
Call buildFailureLinks() to construct failure links for efficient multi-pattern matching.
Includes a maximum node limit (the larger of 100,000 and 15 times the pattern count) to prevent excessive computation; an error is thrown if the limit is reached.

Performing Search:
Use search(text, useWordBoundary) to find pattern matches in the text.
text is the input string, and useWordBoundary (boolean) controls whether to enforce word boundary checks.
Returns an array of match results, each containing pattern (matched pattern), index (start position), length (pattern length), and titleIndex (pattern identifier).

Word Boundary Check:
If useWordBoundary is true, only matches that are not directly preceded or followed by a word character (a letter, digit, underscore, or a character in the range U+00C0 to U+00FF) are returned.

Cleanup and Statistics:
Use clear() to reset the trie and failure links, freeing memory.
Use getStats() to retrieve statistics, including node count (nodeCount), pattern count (patternCount), and failure link count (failureLinks).

Notes
Performance Considerations: The Aho-Corasick trie may consume significant memory with a large number of patterns. Limit the number of patterns (e.g., <10,000) for optimal performance.
Error Handling: The module includes maximum node and failure depth limits (maxFailureDepth) to prevent infinite loops or memory overflow.
Word Boundary: Enabling useWordBoundary ensures more precise matches, ideal for link detection scenarios.
Compatibility: Ensure compatibility with other TiddlyWiki modules (e.g., wikiparser.js) when processing WikiText.
Debugging: Use getStats() to inspect the trie structure's size and ensure it does not overload browser memory.

\*/

"use strict";

function AhoCorasick() {
	this.trie = {};
	this.failure = {};
	this.maxFailureDepth = 100;
	this.patternCount = 0;
}

AhoCorasick.prototype.addPattern = function(pattern, index) {
	if(!pattern || typeof pattern !== "string" || pattern.length === 0) {
		return;
	}

	var node = this.trie;

	for(var i = 0; i < pattern.length; i++) {
		var char = pattern[i];
		if(!node[char]) {
			node[char] = {};
		}
		node = node[char];
	}

	if(!node.$) {
		node.$ = [];
	}
	node.$.push({
		pattern: pattern,
		index: index,
		length: pattern.length
	});

	this.patternCount++;
};

AhoCorasick.prototype.buildFailureLinks = function() {
	var queue = [];
	var root = this.trie;
	this.failure[root] = root;

	for(var char in root) {
		if(root[char] && char !== "$") {
			this.failure[root[char]] = root;
			queue.push(root[char]);
		}
	}

	var processedNodes = 0;
	var maxNodes = Math.max(100000, this.patternCount * 15);

	while(queue.length > 0 && processedNodes < maxNodes) {
		var node = queue.shift();
		processedNodes++;

		for(var char in node) {
			if(node[char] && char !== "$") {
				var child = node[char];
				var fail = this.failure[node];
				var failureDepth = 0;

				while(fail && !fail[char] && failureDepth < this.maxFailureDepth) {
					fail = this.failure[fail];
					failureDepth++;
				}

				var failureLink = (fail && fail[char]) ? fail[char] : root;
				this.failure[child] = failureLink;

				var failureOutput = this.failure[child];
				if(failureOutput && failureOutput.$) {
					if(!child.$) {
						child.$ = [];
					}
					child.$.push.apply(child.$, failureOutput.$);
				}

				queue.push(child);
			}
		}
	}

	if(processedNodes >= maxNodes) {
		throw new Error("Aho-Corasick: buildFailureLinks exceeded maximum nodes (" + maxNodes + ")");
	}
};

AhoCorasick.prototype.search = function(text, useWordBoundary) {
	if(!text || typeof text !== "string" || text.length === 0) {
		return [];
	}

	var matches = [];
	var node = this.trie;
	var textLength = text.length;
	var maxMatches = Math.min(textLength * 2, 10000);

	for(var i = 0; i < textLength; i++) {
		var char = text[i];
		var transitionCount = 0;

		while(node && !node[char] && node !== this.trie && transitionCount < this.maxFailureDepth) {
			node = this.failure[node] || this.trie;
			transitionCount++;
		}

		if(node && node[char]) {
			node = node[char];
		} else {
			node = this.trie;
			if(this.trie[char]) {
				node = this.trie[char];
			}
		}

		var currentNode = node;
		var collectCount = 0;
		while(currentNode && collectCount < 10) {
			if(currentNode.$) {
				var outputs = currentNode.$;
				for(var j = 0; j < outputs.length && matches.length < maxMatches; j++) {
					var output = outputs[j];
					var matchStart = i - output.length + 1;
					var matchEnd = i + 1;

					if(useWordBoundary && !this.isWordBoundaryMatch(text, matchStart, matchEnd)) {
						continue;
					}

					matches.push({
						pattern: output.pattern,
						index: matchStart,
						length: output.length,
						titleIndex: output.index
					});
				}
			}
			currentNode = this.failure[currentNode];
			if(currentNode === this.trie) break;
			collectCount++;
		}
	}

	return matches;
};

AhoCorasick.prototype.isWordBoundaryMatch = function(text, start, end) {
	var beforeChar = start > 0 ? text[start - 1] : "";
	var afterChar = end < text.length ? text[end] : "";

	var isWordChar = function(char) {
		return /[a-zA-Z0-9_\u00C0-\u00FF]/.test(char);
	};

	var beforeIsWord = beforeChar && isWordChar(beforeChar);
	var afterIsWord = afterChar && isWordChar(afterChar);

	return !beforeIsWord && !afterIsWord;
};

AhoCorasick.prototype.clear = function() {
	this.trie = {};
	this.failure = {};
	this.patternCount = 0;
};

AhoCorasick.prototype.getStats = function() {
	var nodeCount = 0;
	var patternCount = 0;
	var failureCount = 0;

	function countNodes(node) {
		if(!node) return;
		nodeCount++;
		if(node.$) {
			patternCount += node.$.length;
		}
		for(var key in node) {
			if(node[key] && typeof node[key] === "object" && key !== "$") {
				countNodes(node[key]);
			}
		}
	}

	countNodes(this.trie);

	for(var key in this.failure) {
		failureCount++;
	}

	return {
		nodeCount: nodeCount,
		patternCount: this.patternCount,
		failureLinks: failureCount
	};
};

exports.AhoCorasick = AhoCorasick;
|
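For readers who want to see the technique in isolation, here is a minimal, self-contained sketch of Aho-Corasick matching. It is not the plugin's code: the names `createNode`, `buildAutomaton` and `findAll` are hypothetical, and the sketch assumes each node carries its own `fail` reference and a `Map` of children, because plain JavaScript objects coerce object keys to strings and so cannot be keyed by node identity. The match shape (`pattern`, `index`, `length`, `titleIndex`) follows the description in the file header above.

```javascript
// Hypothetical helper names; an illustrative sketch, not the plugin's implementation.
function createNode() {
	return {next: new Map(), fail: null, out: []};
}

function buildAutomaton(patterns) {
	var root = createNode();
	// Insert every pattern into the trie, remembering its position in the input array
	patterns.forEach(function(pattern, titleIndex) {
		var node = root;
		for(var i = 0; i < pattern.length; i++) {
			var ch = pattern[i];
			if(!node.next.has(ch)) {
				node.next.set(ch, createNode());
			}
			node = node.next.get(ch);
		}
		node.out.push({pattern: pattern, titleIndex: titleIndex});
	});
	// Breadth-first pass: depth-1 nodes fail to the root
	var queue = [];
	root.fail = root;
	root.next.forEach(function(child) {
		child.fail = root;
		queue.push(child);
	});
	for(var head = 0; head < queue.length; head++) {
		var node = queue[head];
		node.next.forEach(function(child, ch) {
			// Follow the parent's failure chain until a node with a transition on ch is found
			var fail = node.fail;
			while(fail !== root && !fail.next.has(ch)) {
				fail = fail.fail;
			}
			child.fail = fail.next.has(ch) ? fail.next.get(ch) : root;
			// Inherit the outputs of the failure target so suffix patterns are reported too
			child.out = child.out.concat(child.fail.out);
			queue.push(child);
		});
	}
	return root;
}

function findAll(root, text) {
	var matches = [];
	var node = root;
	for(var i = 0; i < text.length; i++) {
		var ch = text[i];
		// On a mismatch, fall back along failure links (stopping at the root)
		while(node !== root && !node.next.has(ch)) {
			node = node.fail;
		}
		if(node.next.has(ch)) {
			node = node.next.get(ch);
		}
		node.out.forEach(function(output) {
			matches.push({
				pattern: output.pattern,
				index: i - output.pattern.length + 1,
				length: output.pattern.length,
				titleIndex: output.titleIndex
			});
		});
	}
	return matches;
}
```

With the patterns `he`, `she`, `his` and `hers`, searching the text `ushers` reports `she` at index 1 and both `he` and `hers` at index 2. Because outputs are inherited while the failure links are built, the search loop does not need to walk the failure chain again for each character.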
|
@ -0,0 +1,2 @@
title: $:/config/Freelinks/WordBoundary
text: yes
|
|
@ -2,26 +2,36 @@ title: $:/plugins/tiddlywiki/freelinks/readme
|
|||
|
||||
This plugin adds automatic generation of links to tiddler titles.
|
||||
|
||||
''Note that automatic link generation can be very slow when there are a large number of tiddlers''.
|
||||
The plugin uses the Aho-Corasick algorithm for efficient pattern matching with large numbers of tiddlers.
|
||||
|
||||
!! Configuration
|
||||
|
||||
Freelinking is activated for runs of text that have the following variables set:
|
||||
|
||||
* `tv-wikilinks` is NOT equal to `no`
|
||||
* `tv-freelinks` is set to `yes`
|
||||
|
||||
Freelinks are case sensitive by default but can be configured to ignore case in the settings tab.
|
||||
Freelinks are case sensitive by default but can be configured to ignore case in the settings panel.
|
||||
|
||||
Within view templates, the variable `tv-freelinks` is automatically set to the content of $:/config/Freelinks/Enable, which can be set via the settings panel of this plugin.
|
||||
Word boundary checking can be configured in the settings panel. When enabled (default for Western languages), links are created only for complete words. When disabled, partial matches within words are also linked.
|
||||
|
||||
When multiple tiddler titles match the same text, longer titles take precedence over shorter ones.
|
||||
|
||||
Tiddlers do not create links to themselves.
|
||||
|
||||
Use `$:/config/Freelinks/TargetFilter` to define which tiddlers are eligible for auto-linking.
|
||||
|
||||
Within view templates, the variable `tv-freelinks` is automatically set to the content of `$:/config/Freelinks/Enable`, which can be set via the settings panel of this plugin.
|
||||
|
||||
!! Notes
|
||||
|
||||
To change within which tiddlers freelinking occurs requires customising the shadow tiddler [[$:/plugins/tiddlywiki/freelinks/macros/view]]. This tiddler is tagged $:/tags/Macro/View which means that it will be included as a local macro in each view template. By default, its content is:
|
||||
To change within which tiddlers freelinking occurs requires customising the shadow tiddler [[$:/plugins/tiddlywiki/freelinks/macros/view]]. This tiddler is tagged `$:/tags/Macro/View` which means that it will be included as a local macro in each view template. By default, its content is:
|
||||
|
||||
```
|
||||
<$set name="tv-freelinks" value={{$:/config/Freelinks/Enable}}/>
|
||||
```
|
||||
|
||||
That means that for each tiddler the variable tv-freelinks will be set to the tiddler $:/config/Freelinks/Enable, which is set to "yes" or "no" by the settings in control panel.
|
||||
That means that for each tiddler the variable `tv-freelinks` will be set to the tiddler `$:/config/Freelinks/Enable`, which is set to "yes" or "no" by the settings in control panel.
|
||||
|
||||
Instead, we can use a filter expression to, say, only freelink within the tiddler with the title "HelloThere":
|
||||
|
||||
|
|
@ -40,3 +50,14 @@ Or we can combine both approaches:
|
|||
```
|
||||
<$set name="tv-freelinks" value={{{ [<currentTiddler>match[HelloThere]] ~[<currentTiddler>tag[MyTag]] +[then[yes]else[no]] }}}/>
|
||||
```
|
||||
|
||||
!! Implementation Details
|
||||
|
||||
The Aho-Corasick algorithm implementation includes:
|
||||
|
||||
* Unicode character support for international text
|
||||
* Prevention of self-referential links within the current tiddler
|
||||
* Performance safeguards including depth protection and result limiting
|
||||
* Graceful fallback handling for invalid patterns
|
||||
|
||||
Longer tiddler titles take precedence over shorter ones when multiple matches are possible.
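
As a rough sketch of how longer titles can take precedence when matches overlap: sort the matches by start position, break ties by length (longest first), and keep a match only if it starts at or after the end of the last match kept. The plugin itself tracks covered positions in a set; this version assumes a simpler running end offset, which should give the same result once the matches are sorted. `selectNonOverlapping` is a hypothetical name, and the match objects use the `index` and `length` fields returned by the search.

```javascript
// Hypothetical helper: keeps the earliest, longest matches and drops anything overlapping them
function selectNonOverlapping(matches) {
	// Copy before sorting so the caller's array is left untouched
	var sorted = matches.slice().sort(function(a, b) {
		return a.index !== b.index ? a.index - b.index : b.length - a.length;
	});
	var selected = [];
	var lastEnd = 0;
	sorted.forEach(function(match) {
		if(match.index >= lastEnd) {
			selected.push(match);
			lastEnd = match.index + match.length;
		}
	});
	return selected;
}

// Matches for the titles "JavaScript" (0), "Java" (1) and "Script" (2)
// in the text "JavaScript or Java"
var example = selectNonOverlapping([
	{index: 0, length: 4, titleIndex: 1},
	{index: 0, length: 10, titleIndex: 0},
	{index: 4, length: 6, titleIndex: 2},
	{index: 14, length: 4, titleIndex: 1}
]);
console.log(example.map(function(m) { return m.titleIndex; }));
```

Here the ten-character match for "JavaScript" is kept, the overlapping "Java" and "Script" matches are dropped, and the later standalone "Java" is kept.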
|
||||
|
|
|
|||
|
|
@ -6,4 +6,6 @@ Filter defining tiddlers to which freelinks are made: <$edit-text tiddler="$:/co
|
|||
|
||||
<$checkbox tiddler="$:/config/Freelinks/Enable" field="text" checked="yes" unchecked="no" default="no"> <$link to="$:/config/Freelinks/Enable">Enable freelinking within tiddler view templates</$link> </$checkbox>
|
||||
|
||||
<$checkbox tiddler="$:/config/Freelinks/WordBoundary" field="text" checked="yes" unchecked="no" default="yes"> <$link to="$:/config/Freelinks/WordBoundary">Word Boundary Check</$link> </$checkbox>
|
||||
|
||||
<$checkbox tiddler="$:/config/Freelinks/IgnoreCase" field="text" checked="yes" unchecked="no" default="no"> <$link to="$:/config/Freelinks/IgnoreCase">Ignore case</$link> </$checkbox>
|
||||
|
|
|
|||
|
|
@ -3,31 +3,49 @@ title: $:/core/modules/widgets/text.js
|
|||
type: application/javascript
|
||||
module-type: widget
|
||||
|
||||
An override of the core text widget that automatically linkifies the text
|
||||
An optimized override of the core text widget that automatically linkifies the text, with support for non-Latin languages like Chinese, prioritizing longer titles, skipping processed matches, excluding the current tiddler title from linking, and handling large title sets with enhanced Aho-Corasick algorithm.
|
||||
|
||||
\*/
|
||||
|
||||
"use strict";
|
||||
|
||||
var TITLE_TARGET_FILTER = "$:/config/Freelinks/TargetFilter";
|
||||
var WORD_BOUNDARY_TIDDLER = "$:/config/Freelinks/WordBoundary";
|
||||
|
||||
var Widget = require("$:/core/modules/widgets/widget.js").widget,
|
||||
LinkWidget = require("$:/core/modules/widgets/link.js").link,
|
||||
ButtonWidget = require("$:/core/modules/widgets/button.js").button,
|
||||
ElementWidget = require("$:/core/modules/widgets/element.js").element;
|
||||
ElementWidget = require("$:/core/modules/widgets/element.js").element,
|
||||
AhoCorasick = require("$:/core/modules/utils/aho-corasick.js").AhoCorasick;
|
||||
|
||||
var ESCAPE_REGEX = /[\\^$*+?.()|[\]{}]/g;
|
||||
|
||||
function escapeRegExp(str) {
|
||||
try {
|
||||
return str.replace(ESCAPE_REGEX, "\\$&");
|
||||
} catch(e) {
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
function FastPositionSet() {
|
||||
this.set = new Set();
|
||||
}
|
||||
|
||||
FastPositionSet.prototype.add = function(pos) {
|
||||
this.set.add(pos);
|
||||
};
|
||||
|
||||
FastPositionSet.prototype.has = function(pos) {
|
||||
return this.set.has(pos);
|
||||
};
|
||||
|
||||
var TextNodeWidget = function(parseTreeNode,options) {
|
||||
this.initialise(parseTreeNode,options);
|
||||
};
|
||||
|
||||
/*
|
||||
Inherit from the base widget class
|
||||
*/
|
||||
TextNodeWidget.prototype = new Widget();
|
||||
|
||||
/*
|
||||
Render this widget into the DOM
|
||||
*/
|
||||
TextNodeWidget.prototype.render = function(parent,nextSibling) {
|
||||
this.parentDomNode = parent;
|
||||
this.computeAttributes();
|
||||
|
|
@ -35,128 +53,237 @@ TextNodeWidget.prototype.render = function(parent,nextSibling) {
|
|||
this.renderChildren(parent,nextSibling);
|
||||
};
|
||||
|
||||
/*
|
||||
Compute the internal state of the widget
|
||||
*/
|
||||
TextNodeWidget.prototype.execute = function() {
|
||||
var self = this,
|
||||
ignoreCase = self.getVariable("tv-freelinks-ignore-case",{defaultValue:"no"}).trim() === "yes";
|
||||
// Get our parameters
|
||||
|
||||
var childParseTree = [{
|
||||
type: "plain-text",
|
||||
text: this.getAttribute("text",this.parseTreeNode.text || "")
|
||||
}];
|
||||
// Only process links if not disabled and we're not within a button or link widget
|
||||
if(this.getVariable("tv-wikilinks",{defaultValue:"yes"}).trim() !== "no" && this.getVariable("tv-freelinks",{defaultValue:"no"}).trim() === "yes" && !this.isWithinButtonOrLink()) {
|
||||
// Get the information about the current tiddler titles, and construct a regexp
|
||||
this.tiddlerTitleInfo = this.wiki.getGlobalCache("tiddler-title-info-" + (ignoreCase ? "insensitive" : "sensitive"),function() {
|
||||
var targetFilterText = self.wiki.getTiddlerText(TITLE_TARGET_FILTER),
|
||||
titles = !!targetFilterText ? self.wiki.filterTiddlers(targetFilterText,$tw.rootWidget) : self.wiki.allTitles(),
|
||||
sortedTitles = titles.sort(function(a,b) {
|
||||
var lenA = a.length,
|
||||
lenB = b.length;
|
||||
// First sort by length, so longer titles are first
|
||||
if(lenA !== lenB) {
|
||||
if(lenA < lenB) {
|
||||
return +1;
|
||||
} else {
|
||||
return -1;
|
||||
}
|
||||
} else {
|
||||
// Then sort alphabetically within titles of the same length
|
||||
if(a < b) {
|
||||
return -1;
|
||||
} else if(a > b) {
|
||||
return +1;
|
||||
} else {
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
}),
|
||||
titles = [],
|
||||
reparts = [];
|
||||
$tw.utils.each(sortedTitles,function(title) {
|
||||
if(title.substring(0,3) !== "$:/") {
|
||||
titles.push(title);
|
||||
reparts.push("(" + $tw.utils.escapeRegExp(title) + ")");
|
||||
}
|
||||
});
|
||||
var regexpStr = "\\b(?:" + reparts.join("|") + ")\\b";
|
||||
return {
|
||||
titles: titles,
|
||||
regexp: new RegExp(regexpStr,ignoreCase ? "i" : "")
|
||||
};
|
||||
type: "plain-text",
|
||||
text: this.getAttribute("text",this.parseTreeNode.text || "")
|
||||
}];
|
||||
|
||||
var text = childParseTree[0].text;
|
||||
|
||||
if(!text || text.length < 2) {
|
||||
this.makeChildWidgets(childParseTree);
|
||||
return;
|
||||
}
|
||||
|
||||
if(this.getVariable("tv-wikilinks",{defaultValue:"yes"}) !== "no" &&
|
||||
this.getVariable("tv-freelinks",{defaultValue:"no"}) === "yes" &&
|
||||
!this.isWithinButtonOrLink()) {
|
||||
|
||||
var currentTiddlerTitle = this.getVariable("currentTiddler") || "";
|
||||
var useWordBoundary = self.wiki.getTiddlerText(WORD_BOUNDARY_TIDDLER, "no") === "yes";
|
||||
|
||||
var cacheKey = "tiddler-title-info-" + (ignoreCase ? "insensitive" : "sensitive");
|
||||
|
||||
this.tiddlerTitleInfo = this.wiki.getGlobalCache(cacheKey, function() {
|
||||
return computeTiddlerTitleInfo(self, ignoreCase);
|
||||
});
|
||||
// Repeatedly linkify
|
||||
|
||||
if(this.tiddlerTitleInfo.titles.length > 0) {
|
||||
var index,text,match,matchEnd;
|
||||
do {
|
||||
index = childParseTree.length - 1;
|
||||
text = childParseTree[index].text;
|
||||
match = this.tiddlerTitleInfo.regexp.exec(text);
|
||||
if(match) {
|
||||
// Make a text node for any text before the match
|
||||
if(match.index > 0) {
|
||||
childParseTree[index].text = text.substring(0,match.index);
|
||||
index += 1;
|
||||
}
|
||||
// Make a link node for the match
|
||||
childParseTree[index] = {
|
||||
type: "link",
|
||||
attributes: {
|
||||
to: {type: "string", value: ignoreCase ? this.tiddlerTitleInfo.titles[match.indexOf(match[0],1) - 1] : match[0]},
|
||||
"class": {type: "string", value: "tc-freelink"}
|
||||
},
|
||||
children: [{
|
||||
type: "plain-text", text: match[0]
|
||||
}]
|
||||
};
|
||||
index += 1;
|
||||
// Make a text node for any text after the match
|
||||
matchEnd = match.index + match[0].length;
|
||||
if(matchEnd < text.length) {
|
||||
childParseTree[index] = {
|
||||
type: "plain-text",
|
||||
text: text.substring(matchEnd)
|
||||
};
|
||||
}
|
||||
}
|
||||
} while(match && childParseTree[childParseTree.length - 1].type === "plain-text");
|
||||
var newParseTree = this.processTextWithMatches(text, currentTiddlerTitle, ignoreCase, useWordBoundary);
|
||||
if(newParseTree.length > 1 || newParseTree[0].type !== "plain-text") {
|
||||
childParseTree = newParseTree;
|
||||
}
|
||||
}
|
||||
}
|
||||
// Make the child widgets
|
||||
|
||||
this.makeChildWidgets(childParseTree);
|
||||
};
|
||||
|
||||
TextNodeWidget.prototype.isWithinButtonOrLink = function() {
|
||||
var withinButtonOrLink = false,
|
||||
widget = this.parentWidget;
|
||||
while(!withinButtonOrLink && widget) {
|
||||
withinButtonOrLink = widget instanceof ButtonWidget || widget instanceof LinkWidget || ((widget instanceof ElementWidget) && widget.parseTreeNode.tag === "a");
|
||||
widget = widget.parentWidget;
|
||||
TextNodeWidget.prototype.processTextWithMatches = function(text, currentTiddlerTitle, ignoreCase, useWordBoundary) {
|
||||
var searchText = ignoreCase ? text.toLowerCase() : text;
|
||||
var matches;
|
||||
|
||||
try {
|
||||
matches = this.tiddlerTitleInfo.ac.search(searchText, useWordBoundary);
|
||||
} catch(e) {
|
||||
return [{type: "plain-text", text: text}];
|
||||
}
|
||||
return withinButtonOrLink;
|
||||
|
||||
if(!matches || matches.length === 0) {
|
||||
return [{type: "plain-text", text: text}];
|
||||
}
|
||||
|
||||
matches.sort(function(a, b) {
|
||||
var posDiff = a.index - b.index;
|
||||
return posDiff !== 0 ? posDiff : b.length - a.length;
|
||||
});
|
||||
|
||||
var processedPositions = new FastPositionSet();
|
||||
var validMatches = [];
|
||||
|
||||
for(var i = 0; i < matches.length; i++) {
|
||||
var match = matches[i];
|
||||
var matchStart = match.index;
|
||||
var matchEnd = matchStart + match.length;
|
||||
|
||||
var hasOverlap = false;
|
||||
for(var pos = matchStart; pos < matchEnd && !hasOverlap; pos++) {
|
||||
if(processedPositions.has(pos)) {
|
||||
hasOverlap = true;
|
||||
}
|
||||
}
|
||||
|
||||
if(!hasOverlap) {
|
||||
for(var pos = matchStart; pos < matchEnd; pos++) {
|
||||
processedPositions.add(pos);
|
||||
}
|
||||
validMatches.push(match);
|
||||
}
|
||||
}
|
||||
|
||||
if(validMatches.length === 0) {
|
||||
return [{type: "plain-text", text: text}];
|
||||
}
|
||||
|
||||
var newParseTree = [];
|
||||
var currentPos = 0;
|
||||
|
||||
for(var i = 0; i < validMatches.length; i++) {
|
||||
var match = validMatches[i];
|
||||
var matchStart = match.index;
|
||||
var matchEnd = matchStart + match.length;
|
||||
|
||||
if(matchStart > currentPos) {
|
||||
newParseTree.push({
|
||||
type: "plain-text",
|
||||
text: text.slice(currentPos, matchStart)
|
||||
});
|
||||
}
|
||||
|
||||
var matchedTitle = this.tiddlerTitleInfo.titles[match.titleIndex];
|
||||
|
||||
if(matchedTitle === currentTiddlerTitle) {
|
||||
newParseTree.push({
|
||||
type: "plain-text",
|
||||
text: text.slice(matchStart, matchEnd)
|
||||
});
|
||||
} else {
|
||||
newParseTree.push({
|
||||
type: "link",
|
||||
attributes: {
|
||||
to: {type: "string", value: matchedTitle},
|
||||
"class": {type: "string", value: "tc-freelink"}
|
||||
},
|
||||
children: [{
|
||||
type: "plain-text",
|
||||
text: text.slice(matchStart, matchEnd)
|
||||
}]
|
||||
});
|
||||
}
|
||||
currentPos = matchEnd;
|
||||
}
|
||||
|
||||
if(currentPos < text.length) {
|
||||
newParseTree.push({
|
||||
type: "plain-text",
|
||||
text: text.slice(currentPos)
|
||||
});
|
||||
}
|
||||
|
||||
return newParseTree;
|
||||
};
|
||||
|
||||
function computeTiddlerTitleInfo(self, ignoreCase) {
|
||||
var targetFilterText = self.wiki.getTiddlerText(TITLE_TARGET_FILTER),
|
||||
titles = !!targetFilterText ?
|
||||
self.wiki.filterTiddlers(targetFilterText,$tw.rootWidget) :
|
||||
self.wiki.allTitles();
|
||||
|
||||
if(!titles || titles.length === 0) {
|
||||
return {
|
||||
titles: [],
|
||||
ac: new AhoCorasick()
|
||||
};
|
||||
}
|
||||
|
||||
var validTitles = [];
|
||||
var ac = new AhoCorasick();
|
||||
|
||||
// Process titles in a single pass to avoid duplication
|
||||
for(var i = 0; i < titles.length; i++) {
|
||||
var title = titles[i];
|
||||
if(title && title.length > 0 && title.substring(0,3) !== "$:/") {
|
||||
var escapedTitle = escapeRegExp(title);
|
||||
if(escapedTitle) {
|
||||
validTitles.push(title);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Sort by length (descending) then alphabetically
|
||||
// Longer titles are prioritized to avoid partial matches (e.g., "JavaScript" before "Java")
|
||||
var sortedTitles = validTitles.sort(function(a,b) {
|
||||
var lenDiff = b.length - a.length;
|
||||
return lenDiff !== 0 ? lenDiff : (a < b ? -1 : a > b ? 1 : 0);
|
||||
});
|
||||
|
||||
// Build Aho-Corasick automaton
|
||||
for(var i = 0; i < sortedTitles.length; i++) {
|
||||
var title = sortedTitles[i];
|
||||
ac.addPattern(ignoreCase ? title.toLowerCase() : title, i);
|
||||
}
|
||||
|
||||
try {
|
||||
ac.buildFailureLinks();
|
||||
} catch(e) {
|
||||
return {
|
||||
titles: [],
|
||||
ac: new AhoCorasick()
|
||||
};
|
||||
}
|
||||
|
||||
return {
|
||||
titles: sortedTitles,
|
||||
ac: ac
|
||||
};
|
||||
}
|
||||
|
||||
TextNodeWidget.prototype.isWithinButtonOrLink = function() {
|
||||
var widget = this.parentWidget;
|
||||
while(widget) {
|
||||
if(widget instanceof ButtonWidget ||
|
||||
widget instanceof LinkWidget ||
|
||||
((widget instanceof ElementWidget) && widget.parseTreeNode.tag === "a")) {
|
||||
return true;
|
||||
}
|
||||
widget = widget.parentWidget;
|
||||
}
|
||||
return false;
|
||||
};
|
||||
|
||||
/*
|
||||
Selectively refreshes the widget if needed. Returns true if the widget or any of its children needed re-rendering
|
||||
*/
|
||||
TextNodeWidget.prototype.refresh = function(changedTiddlers) {
|
||||
var self = this,
|
||||
changedAttributes = this.computeAttributes(),
|
||||
titlesHaveChanged = false;
|
||||
$tw.utils.each(changedTiddlers,function(change,title) {
|
||||
if(change.isDeleted) {
|
||||
titlesHaveChanged = true;
|
||||
} else {
|
||||
titlesHaveChanged = titlesHaveChanged || !self.tiddlerTitleInfo || self.tiddlerTitleInfo.titles.indexOf(title) === -1;
|
||||
|
||||
if(changedTiddlers) {
|
||||
$tw.utils.each(changedTiddlers,function(change,title) {
|
||||
if(change.isDeleted) {
|
||||
titlesHaveChanged = true;
|
||||
} else {
|
||||
titlesHaveChanged = titlesHaveChanged ||
|
||||
!self.tiddlerTitleInfo ||
|
||||
self.tiddlerTitleInfo.titles.indexOf(title) === -1;
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
if(changedAttributes.text || titlesHaveChanged ||
|
||||
(changedTiddlers && changedTiddlers[WORD_BOUNDARY_TIDDLER])) {
|
||||
if(titlesHaveChanged) {
|
||||
var ignoreCase = self.getVariable("tv-freelinks-ignore-case",{defaultValue:"no"}).trim() === "yes";
|
||||
var cacheKey = "tiddler-title-info-" + (ignoreCase ? "insensitive" : "sensitive");
|
||||
self.wiki.clearCache(cacheKey);
|
||||
}
|
||||
});
|
||||
if(changedAttributes.text || titlesHaveChanged) {
|
||||
|
||||
this.refreshSelf();
|
||||
return true;
|
||||
} else {
|
||||
return false;
|
||||
return this.refreshChildren(changedTiddlers);
|
||||
}
|
||||
};
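
The step that turns matches into parse tree nodes can be sketched on its own. This is an illustration under stated assumptions rather than the widget's code: `buildParseTree` is a hypothetical free function, the matches are assumed to be already sorted and non-overlapping, and the node shapes (`plain-text`, `link`, the `tc-freelink` class) are copied from the diff above. A match whose title equals the current tiddler is emitted as plain text, so a tiddler does not link to itself.

```javascript
// Hypothetical helper: converts sorted, non-overlapping matches into parse tree nodes
function buildParseTree(text, matches, titles, currentTitle) {
	var tree = [];
	var pos = 0;
	matches.forEach(function(match) {
		var end = match.index + match.length;
		// Plain text between the previous match and this one
		if(match.index > pos) {
			tree.push({type: "plain-text", text: text.slice(pos, match.index)});
		}
		var title = titles[match.titleIndex];
		var label = {type: "plain-text", text: text.slice(match.index, end)};
		if(title === currentTitle) {
			// Do not link a tiddler to itself
			tree.push(label);
		} else {
			tree.push({
				type: "link",
				attributes: {
					to: {type: "string", value: title},
					"class": {type: "string", value: "tc-freelink"}
				},
				children: [label]
			});
		}
		pos = end;
	});
	// Any text remaining after the last match
	if(pos < text.length) {
		tree.push({type: "plain-text", text: text.slice(pos)});
	}
	return tree;
}

var tree = buildParseTree(
	"See Java and Rust",
	[{index: 4, length: 4, titleIndex: 0}, {index: 13, length: 4, titleIndex: 1}],
	["Java", "Rust"],
	"Rust"
);
console.log(tree.map(function(node) { return node.type; }));
```

Using the original text for the link label, while linking to the title from the `titles` array, keeps the reader's capitalisation intact when case is ignored.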