beginning and ends of symbols.
* regex.c (enum syntaxcode): Add Ssymbol.
(init_syntax_once): Set the syntax for '_' to Ssymbol, not Sword.
(re_opcode_t): New opcodes `symbeg' and `symend'.
(print_partial_compiled_pattern): Print the new opcodes properly.
(regex_compile): Parse the new operators.
(analyse_first): Skip symbeg and symend (they match only the empty string).
(mutually_exclusive_p): `symend' is mutually exclusive with \s_ and
\sw; `symbeg' is mutually exclusive with \S_ and \Sw.
(re_match_2_internal): Match symbeg and symend.
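As a rough illustration of the symbol-boundary tests that `symbeg' and `symend' perform, the sketch below checks whether a position sits at the start or end of a symbol. The names (toy_syntax_of, toy_at_symbeg, toy_at_symend) and the classification (alphanumerics as word constituents, '_' as a symbol constituent) are hypothetical simplifications; regex.c consults a real syntax table.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, simplified syntax classes standing in for Sword/Ssymbol.  */
enum toy_syntax { TOY_OTHER, TOY_WORD, TOY_SYMBOL };

static enum toy_syntax toy_syntax_of(unsigned char c)
{
    if (isalnum(c))
        return TOY_WORD;
    if (c == '_')
        return TOY_SYMBOL;      /* '_' is a symbol constituent, not a word one.  */
    return TOY_OTHER;
}

/* A symbol is made of word and symbol constituents.  */
static int toy_in_symbol(unsigned char c)
{
    return toy_syntax_of(c) != TOY_OTHER;
}

/* Nonzero if POS is where a symbol begins: the next character belongs to a
   symbol and the previous one (if any) does not.  Matches the empty string.  */
static int toy_at_symbeg(const char *s, size_t len, size_t pos)
{
    assert(pos <= len);
    if (pos == len || !toy_in_symbol((unsigned char) s[pos]))
        return 0;
    return pos == 0 || !toy_in_symbol((unsigned char) s[pos - 1]);
}

/* Nonzero if POS is where a symbol ends: the previous character belongs to a
   symbol and the next one (if any) does not.  */
static int toy_at_symend(const char *s, size_t len, size_t pos)
{
    assert(pos <= len);
    if (pos == 0 || !toy_in_symbol((unsigned char) s[pos - 1]))
        return 0;
    return pos == len || !toy_in_symbol((unsigned char) s[pos]);
}
```

With the text "foo_bar baz", position 0 is a symbol beginning and position 7 a symbol end, while position 3 (before the '_') is neither, because '_' continues the symbol.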
call EXTEND_RANGE_TABLE and return a proper value.
(set_image_of_range): Don't call set_image_of_range_1
if no TRANSLATE or if range includes all of Latin-1.
Only call it for the Latin-1 part of the range.
For other cases, make two separate ranges,
one for the original specified characters and one for
their case-conversions.
(set_image_of_range): Use set_image_of_range_1 for Latin-1.
Return a value to indicate running out of memory.
(SET_RANGE_TABLE_WORK_AREA): Check value from set_image_of_range.
(extend_range_table_work_area): New subroutine.
(EXTEND_RANGE_TABLE): Replaces EXTEND_RANGE_TABLE_WORK_AREA.
Different calling conventions, and used from set_image_of_range{,_1}.
(IMMEDIATE_QUIT_CHECK): Definitions moved.
(PATFETCH_RAW): Rename to PATFETCH.
(set_image_of_range): New function.
(SET_RANGE_TABLE_WORK_AREA): Use it.
(regex_compile): Don't translate the pattern chars so eagerly.
Only do it when inserting an `exactn' bytecode or when handling a char-range.
(mutually_exclusive_p): Avoid empty statement.
(CHECK_INFINITE_LOOP): Use DISCARD_FAILURE_REG_OR_COUNT
when jumping to `fail' to avoid undoing reg changes in the
last iteration of the loop.
(GET_UNSIGNED_NUMBER): Skip spaces around the number.
Also change several `int' into `re_wchar_t'.
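The space-skipping behavior noted for GET_UNSIGNED_NUMBER can be sketched as a plain function. This is not the macro from regex.c: toy_get_unsigned_number and TOY_NUMBER_MAX are hypothetical names, and the limit value is an assumption chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound; the real code has its own limit.  */
#define TOY_NUMBER_MAX 0xffff

/* Parse an unsigned decimal number between *PP and PEND, skipping spaces
   before and after it.  Returns -1 if no digit was found, otherwise the
   value (a result above TOY_NUMBER_MAX signals "too large").  On return,
   *PP points just past the number and any trailing spaces.  */
static long toy_get_unsigned_number(const char **pp, const char *pend)
{
    const char *p = *pp;
    long num = -1;

    assert(p <= pend);

    while (p < pend && *p == ' ')       /* Leading spaces.  */
        p++;

    /* Compare against '0'..'9' directly rather than using a ctype macro.  */
    while (p < pend && *p >= '0' && *p <= '9') {
        if (num < 0)
            num = 0;
        if (num <= TOY_NUMBER_MAX)      /* Stop growing once over the limit.  */
            num = num * 10 + (*p - '0');
        p++;
    }

    while (p < pend && *p == ' ')       /* Trailing spaces.  */
        p++;

    *pp = p;
    return num;
}
```

Under these assumptions, an interval written as `\{ 3 , 5 \}' would parse the same way as `\{3,5\}', since each bound is read with surrounding spaces ignored.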
(PATTERN_STACK_EMPTY, PUSH_PATTERN_OP, POP_PATTERN_OP): Remove.
(PUSH_FAILURE_POINTER): Don't cast any more.
(POP_FAILURE_REG_OR_COUNT): Remove the cast that strips `const'.
We want GCC to complain, since this piece of code makes
re_match non-reentrant, which *should* be fixed.
(GET_BUFFER_SPACE): Use size_t rather than unsigned long.
(EXTEND_BUFFER): Use RETALLOC.
(SET_LIST_BIT): Don't cast.
(re_wchar_t): New type.
(re_iswctype, re_wctype_to_bit): Make it crystal clear to GCC
that those two functions will always properly return.
(IMMEDIATE_QUIT_CHECK): Cast to void.
(analyse_first): Use recursion rather than an explicit stack.
(re_compile_fastmap): Can't fail anymore.
(re_search_2): Don't check re_compile_fastmap for failure.
(PUSH_NUMBER): Renamed from PUSH_FAILURE_COUNT.
Now also sets the new value (passed in a new argument).
(re_match_2_internal): Use it.
Also, use a new variable `reg' of type size_t when looping through regs
rather than reuse the inappropriate `mcnt'.
(btowc, iswctype, wctype) [_LIBC]: Redefine to __<fun>.
(BIT_ALPHA, BIT_ALNUM, BIT_ASCII, BIT_NONASCII, BIT_GRAPH, BIT_PRINT)
(BIT_UNIBYTE): Remove.
(re_match_2_internal): Delete corresponding code and streamline the
BIT_MULTIBYTE case to not bother checking ISUNIBYTE.
(CHAR_CLASS_MAX_LENGTH) [!WIDE_CHAR_SUPPORT]: Set to 9 rather than 6.
(re_wctype_t): New type.
(re_wctype, re_iswctype, re_wctype_to_bit): New functions.
(regex_compile): Use them and fix handling of overly long char classes.
(struct re_pattern_buffer): Remove newline_anchor.
* regex.c: Keep namespace clean for GNU libc by renaming <fun>
to __<fun> and using `weak_alias (__<fun>, <fun>)'.
(re_max_failures, fail_stack): Use size_t rather than unsigned.
(regex_compile): For ^ and $, choose between buffer and line (beg|end)
depending on the new RE_NO_NEWLINE_ANCHOR syntax flag.
(print_compiled_pattern, re_search_2, mutually_exclusive_p)
(re_match_2_internal, re_compile_pattern, re_comp, regcomp):
Get rid of references to newline_anchor.
(regcomp): Allocate and precompute a fastmap.
(bcopy, bcmp, REGEX_REALLOCATE, re_match_2_internal):
Use memcmp and memcpy instead of bcopy and bcmp.
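When converting from the BSD calls to the ISO C ones, the main pitfalls are the swapped argument order and the different return value. A minimal sketch follows; the wrapper names toy_copy_bytes and toy_bytes_equal are hypothetical and do not appear in regex.c.

```c
#include <assert.h>
#include <string.h>

/* Copy N bytes from SRC to DST.  The legacy spelling was
   bcopy (src, dst, n): source first.  memcpy takes the destination first.
   Unlike bcopy, memcpy must not be given overlapping regions; memmove is
   the overlap-safe equivalent.  */
static void toy_copy_bytes(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
}

/* Nonzero if the first N bytes of A and B are identical.  bcmp only
   promised zero/nonzero; memcmp returns an ordering, so compare its
   result against 0 explicitly.  */
static int toy_bytes_equal(const void *a, const void *b, size_t n)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, n) == 0;
}
```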
(init_syntax_once): Use ISALNUM.
(PUSH_FAILURE_POINT, re_match_2_internal): Remove failure_id.
(REG_UNSET_VALUE): Remove. Use NULL instead.
(REG_UNSET, re_match_2_internal): Use NULL.
(SET_HIGH_BOUND, MOVE_BUFFER_POINTER, ELSE_EXTEND_BUFFER_HIGH_BOUND):
New macros.
(EXTEND_BUFFER): Use them (to work with BOUNDED_POINTERS).
(GET_UNSIGNED_NUMBER): Don't use ISDIGIT.
(regex_compile): In handle_interval, return an error rather than try to
unfetch the interval if we can't find the closing brace.
Obey the RE_NO_GNU_OPS syntax bit.
(TOLOWER): New macro.
(regcomp): Use it.
(regexec): Allocate regs.start and regs.end as one block.
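One way to allocate the start and end arrays as a single block is sketched below. The struct and function names are hypothetical, and `int' stands in for whatever index type the real registers use; the point is only that one malloc and one free cover both arrays.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the register structure.  */
struct toy_registers {
    size_t num_regs;
    int *start;     /* Points at the single allocated block.  */
    int *end;       /* Points into the same block, N elements in.  */
};

/* Allocate room for N start and N end positions in one block.
   Returns 0 on success, -1 on failure.  */
static int toy_alloc_registers(struct toy_registers *regs, size_t n)
{
    int *block;

    assert(regs != NULL);
    if (n == 0 || n > SIZE_MAX / (2 * sizeof(int)))
        return -1;              /* Reject empty or overflowing requests.  */

    block = malloc(2 * n * sizeof *block);
    if (block == NULL)
        return -1;

    regs->num_regs = n;
    regs->start = block;
    regs->end = block + n;      /* Second half of the same allocation.  */
    return 0;
}

/* Release both arrays; only `start' owns memory.  */
static void toy_free_registers(struct toy_registers *regs)
{
    assert(regs != NULL);
    free(regs->start);
    regs->start = NULL;
    regs->end = NULL;
    regs->num_regs = 0;
}
```

A single allocation leaves one failure path to handle instead of two, and the caller must free only `start'.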
(PTR_TO_OFFSET, POS_AS_IN_BUFFER): Move to a better place.
(ISDIGIT, ISCNTRL, ISXDIGIT) [!emacs]: Remove duplicate definition.
(regex_compile): Use RE_FRUGAL instead of RE_ALL_GREEDY.
(re_compile_pattern): Use size_t for length.
(init_syntax_once): Move to a better place.
* regex.h: Merge changes from GNU libc. Indent cpp directives.
(RE_FRUGAL): Replaces RE_ALL_GREEDY (inverted meaning).
(POP_FAILURE_REG_OR_COUNT): Renamed from POP_FAILURE_REG.
Handle popping of a register's or a counter's data.
(POP_FAILURE_POINT): Use the new name.
(re_match_2_internal): Push counter data on the stack for succeed_n,
jump_n and set_number_at and remove misleading dead code in succeed_n.