(character-fold-to-regexp): Remove special code for
case-folding. Char-fold search still respects the
`case-fold-search' variable (i.e., f matches F). This only
removes the code that was added to ensure that f also matched
all chars that F matched. For instance, after this commit, f
no longer matches 𝔽.
This was necessary because the logic created a regexp with
2^(length of the string) redundant paths. So, when a very
long string "almost" matched, Emacs took a very long time to
figure out that it didn't. This became particularly relevant
because isearch's lazy-highlight does a search bounded by (1-
match-end) (which, in most circumstances, is a search that
almost matches). A recipe for this can be found in bug#22090.
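
The blow-up described above can be illustrated with a minimal Python
sketch. This is not the Emacs Lisp code that was removed: the helper
names `redundant_fold` and `path_count` are hypothetical, and giving
every character two overlapping branches is only an assumed stand-in
for the removed case-folding logic.

```python
import re

def redundant_fold(text):
    """Return a pattern in which each character has two overlapping branches."""
    parts = []
    for ch in text:
        literal = re.escape(ch)
        # The class holds the character itself plus both of its cases.
        variants = literal + re.escape(ch.lower()) + re.escape(ch.upper())
        # Branch 1 matches `ch` literally; branch 2 also accepts `ch`,
        # so a backtracking matcher has two ways to consume it.
        parts.append("(?:" + literal + "|[" + variants + "])")
    return "".join(parts)

def path_count(text):
    """Number of distinct ways the pattern above can match `text` itself."""
    return 2 ** len(text)

pattern = redundant_fold("find")
print(bool(re.fullmatch(pattern, "FIND")))    # matched through the class branch
print(re.fullmatch(pattern, "finx") is None)  # an "almost" match that fails
print(path_count("find"))
```

Because both branches accept the same character, a backtracking
matcher may retry every combination of branches before it can report
that an almost-matching string fails.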
(character-fold-to-regexp): Uncomment recently commented code
and make the algorithm "dumber" by not checking every possible
combination. This will miss some possible matches, but it
greatly reduces regexp size.
* test/automated/character-fold-tests.el
(character-fold--test-fold-to-regexp): Comment out test of
functionality no longer supported.
(character-fold-to-regexp): Comment out code that uses multi-char
table. The branching caused by this induces absurdly long regexps,
up to 10k chars for as few as 25 input characters.
Warn about using long strings.
* test/automated/character-fold-tests.el
(character-fold--test-lax-whitespace)
(character-fold--test-consistency): Reduce string size for tests.
(character-fold-table): Now has an extra slot. This is a second
char-table that holds multi-character matches. See docstring for
details.
(character-fold-to-regexp): Can build branching regexps when a
character's entry in the extra slot of `character-fold-table' matches
the characters that follow it.
(character-fold-table): Reduce the scope of a variable.
(character-fold-to-regexp): Change logic to work directly on the
input string. It's a little easier to understand, probably
faster, and sets us up for implementing multi-char matches.
* test/automated/character-fold-tests.el
(character-fold--test-fold-to-regexp): New test.
(character-fold-table): When a character's decomposition does not
involve a formatting tag (i.e., if it has an "exact" description via
other characters), then this character is allowed to match the
decomposition.
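
The distinction between tagged and "exact" decompositions can be
sketched in Python. This is an illustration only: it uses Python's
`unicodedata` module as a stand-in for the character data Emacs
consults, and the helper name `exact_decomposition` is hypothetical.

```python
import unicodedata

def exact_decomposition(ch):
    """Return the decomposition of `ch` as a string, or None.

    None is returned when `ch` has no decomposition, or when the
    decomposition starts with a formatting tag such as <compat>.
    """
    fields = unicodedata.decomposition(ch).split()
    if not fields or fields[0].startswith("<"):
        return None
    # Each remaining field is a hexadecimal code point.
    return "".join(chr(int(code, 16)) for code in fields)

# U+00E9 (e with acute) decomposes exactly into "e" + U+0301.
print(exact_decomposition("\u00e9") == "e\u0301")
# U+FB01 (fi ligature) carries a <compat> tag, so it is skipped.
print(exact_decomposition("\ufb01") is None)
```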
(character-fold-to-regexp): Rework internals to play nice with
lax-whitespacing.
When the user types a space, we want to match the table entry for
?\s, which is generally a regexp like "[ ...]". However, the
`search-spaces-regexp' variable doesn't "see" spaces inside these
regexp constructs, so we need to use "\\( \\|[ ...]\\)" instead (to
manually expose a space).
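
The rewrite of a typed space described above can be sketched in
Python as plain string building. This is an assumption-laden
illustration, not the Emacs Lisp implementation: `expose_spaces` is a
hypothetical helper, non-space characters are copied through
unfolded, and "[ ...]" is kept as the same elided placeholder used in
this entry.

```python
import itertools

# Placeholder copied from the entry; the real character class is elided there.
SPACE_CLASS = "[ ...]"

def expose_spaces(text, space_class=SPACE_CLASS):
    """Wrap each run of spaces so a literal space stays visible in the regexp."""
    out = []
    for is_space, run in itertools.groupby(text, key=lambda ch: ch == " "):
        run = "".join(run)
        if is_space:
            # A run of spaces stays in one group: first the literal spaces,
            # then one copy of the class per space.
            out.append("\\(" + run + "\\|" + space_class * len(run) + "\\)")
        else:
            out.append(run)
    return "".join(out)

print(expose_spaces("a b"))
```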
Furthermore, the lax search engine acts on a bunch of spaces, not
on individual spaces, so if the string contains sequential spaces
like "  ", we need to keep them grouped together like this:
"\\(  \\|[ ...][ ...]\\)".
(character-fold-search-forward, character-fold-search-backward):
New commands.
(character-fold-to-regexp): Remove lax-whitespace hack.
(character-fold-search): Remove variable. Only isearch and
query-replace use char-folding, and they both have their own
variables to configure that.
Remove usage of `isearch-lax-whitespace' inside the `isearch-word'
clause of `isearch-search-fun-default'. That lax variable does not
refer to lax-whitespacing. Related to bug#21777.
This reverts commit a5bdb872ed.
* character-fold.el (character-fold-search): Default to nil for
now, until someone implements proper lax-whitespacing with
char-fold searching.
(character-fold-to-regexp): New function.
* lisp/replace.el (replace-search): Check value of
`character-fold-search'.
* lisp/isearch.el: Move character-folding code to
character-fold.el.
(isearch-toggle-character-fold): New command.
(isearch-mode-map): Bind it to "\M-sf".
(isearch-mode): Check value of `character-fold-search'.