mirror of git://git.sv.gnu.org/emacs.git synced 2025-12-06 06:20:55 -08:00

calc: Allow strings with character codes above Latin-1

The current behavior of the functions 'calc-display-strings',
'string', and 'bstring' is to skip any vector containing
integers outside the Latin-1 range (0x00-0xFF).  We introduce a
custom variable 'calc-string-maximum-character' to replace this
hard-coded maximum, and to allow vectors containing higher
character codes to be displayed as strings.  The default value
of 0xFF preserves the existing behavior.

* lisp/calc/calc.el (calc-string-maximum-character): Add custom
variable 'calc-string-maximum-character'.
* lisp/calc/calccomp.el (math-vector-is-string): Replace hard-coded
maximum with 'calc-string-maximum-character', and the 'natnump'
assertion with 'characterp'.  The latter guards against the
maximum being larger than '(max-char)', but not against invalid
types of the maximum, such as strings.

* test/lisp/calc/calc-tests.el (calc-math-vector-is-string): Add
tests for 'math-vector-is-string' using different values of
'calc-string-maximum-character'.

* doc/misc/calc.texi (Quick Calculator, Strings, Customizing Calc):
Add variable definition for 'calc-string-maximum-character' and
reference thereof when discussing 'calc-display-strings'.
Generalize a comment about string display and availability of 8-bit
fonts.
(Bug#78528)
Jacob S. Gordon 2025-05-19 15:05:37 -04:00 committed by Eli Zaretskii
parent 82766b71a4
commit 5bd9fa084d
5 changed files with 164 additions and 17 deletions
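
As a usage note, the new option might be set from an init file as sketched below. This is a minimal configuration fragment, not part of the commit; it assumes an Emacs build that already includes this change (the option is tagged for version 31.1), and the choice of `#x10FFFF` is only one of the values the documentation suggests.

```elisp
;; Assumption: this runs in an Emacs that defines
;; `calc-string-maximum-character' (added by this commit).
;; `setopt' is the standard way to set a user option since Emacs 29.
(setopt calc-string-maximum-character #x10FFFF) ; accept any Unicode code point

;; With the option raised, `d "' (`calc-display-strings') in Calc should
;; also display vectors such as [955, 8364] as quoted strings, subject
;; to font coverage.  The default, #xFF, keeps the old Latin-1 limit.
```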

View file

@@ -10179,7 +10179,7 @@ result @samp{[120]} (because 120 is the ASCII code of the lower-case
 is displayed only according to the current mode settings. But
 running Quick Calc again and entering @samp{120} will produce the
 result @samp{120 (16#78, 8#170, x)} which shows the number in its
-decimal, hexadecimal, octal, and ASCII forms.
+decimal, hexadecimal, octal, and character forms.
 Please note that the Quick Calculator is not any faster at loading
 or computing the answer than the full Calculator; the name ``quick''
@@ -10836,11 +10836,11 @@ from 1 to @samp{n}.
 @cindex Strings
 @cindex Character strings
 Character strings are not a special data type in the Calculator.
-Rather, a string is represented simply as a vector all of whose
-elements are integers in the range 0 to 255 (ASCII codes). You can
-enter a string at any time by pressing the @kbd{"} key. Quotation
-marks and backslashes are written @samp{\"} and @samp{\\}, respectively,
-inside strings. Other notations introduced by backslashes are:
+Rather, a string is represented simply as a vector all of whose elements
+are integers in the Latin-1 range 0 to 255. You can enter a string at
+any time by pressing the @kbd{"} key. Quotation marks and backslashes
+are written @samp{\"} and @samp{\\}, respectively, inside strings.
+Other notations introduced by backslashes are:
 @example
 @group
@@ -10857,21 +10857,24 @@ inside strings. Other notations introduced by backslashes are:
 @noindent
 Finally, a backslash followed by three octal digits produces any
-character from its ASCII code.
+character from its code.
 @kindex d "
 @pindex calc-display-strings
 Strings are normally displayed in vector-of-integers form. The
 @w{@kbd{d "}} (@code{calc-display-strings}) command toggles a mode in
 which any vectors of small integers are displayed as quoted strings
-instead.
+instead. The display of strings containing higher character codes can
+be enabled by increasing the custom variable
+@code{calc-string-maximum-character} (@pxref{Customizing Calc}).
 The backslash notations shown above are also used for displaying
-strings. Characters 128 and above are not translated by Calc; unless
-you have an Emacs modified for 8-bit fonts, these will show up in
-backslash-octal-digits notation. For characters below 32, and
-for character 127, Calc uses the backslash-letter combination if
-there is one, or otherwise uses a @samp{\^} sequence.
+strings. For ASCII control characters (below 32), and for the
+@code{DEL} character (127), Calc uses the backslash-letter combination
+if there is one, or otherwise uses a @samp{\^} sequence. Control
+characters above 127 are not translated by Calc, and will show up in
+backslash-octal-digits notation. The display of higher character codes
+will depend on your display settings and system font coverage.
 The only Calc feature that uses strings is @dfn{compositions};
 @pxref{Compositions}. Strings also provide a convenient
@@ -35684,6 +35687,33 @@ choose from, or the user can enter their own date.
 The default value of @code{calc-gregorian-switch} is @code{nil}.
 @end defvar
+@defvar calc-string-maximum-character
+@xref{Strings}.@*
+The variable @code{calc-string-maximum-character} is the maximum value
+of a vector's elements for @code{calc-display-strings}, @code{string},
+and @code{bstring} to display the vector as a string. This maximum
+@emph{must} represent a character, i.e. it's a non-negative integer less
+than or equal to @code{(max-char)} or @code{0x3FFFFF}. Any negative
+value effectively disables the display of strings, and for values larger
+than @code{0x3FFFFF} the display acts as if the maximum were
+@code{0x3FFFFF}. Some natural choices (and their resulting ranges) are:
+@itemize
+@item
+@code{0x7F} or 127 (ASCII),
+@item
+@code{0xFF} or 255 (Latin-1, the default),
+@item
+@code{0x10FFFF} (Unicode),
+@item
+@code{0x3FFFFF} (Emacs).
+@end itemize
+The default value of @code{calc-string-maximum-character} is @code{0xFF}
+or 255.
+@end defvar
 @node Reporting Bugs
 @appendix Reporting Bugs

View file

@@ -2182,7 +2182,19 @@ modifier, it scrolls by year.
 The month and year navigation key bindings 'M-}', 'M-{', 'C-x ]' and
 'C-x [' now have the alternative keys '}', '{', ']' and '['.
+** Calc
+*** New user option 'calc-string-maximum-character'.
+Previously, the 'calc-display-strings', 'string', and 'bstring'
+functions only considered integer vectors whose elements are all in the
+Latin-1 range 0-255. This hard-coded maximum is replaced by
+'calc-string-maximum-character', and setting it to a higher value allows
+the display of matching vectors as Unicode strings. The default value
+is '0xFF' or '255' to preserve the existing behavior.
 * New Modes and Packages in Emacs 31.1
 ** New minor mode 'delete-trailing-whitespace-mode'.

View file

@@ -628,6 +628,37 @@ Otherwise, 1 / 0 is changed to uinf (undirected infinity).")
 (defcalcmodevar calc-display-strings nil
   "If non-nil, display vectors of byte-sized integers as strings.")
+(defcustom calc-string-maximum-character #xFF
+  "Maximum value of vector contents to be displayed as a string.
+If a vector consists of characters up to this maximum value, the
+function `calc-display-strings' will toggle displaying the vector as a
+string. This maximum value must represent a character (see `characterp').
+Some natural choices (and their resulting ranges) are:
+- `0x7F' (`ASCII'),
+- `0xFF' (`Latin-1', the default),
+- `0x10FFFF' (`Unicode'),
+- `0x3FFFFF' (`Emacs').
+Characters for low control codes are either caret or backslash escaped,
+while others without a glyph are displayed in backslash-octal notation.
+The display of strings containing higher character codes will depend on
+your display settings and system font coverage.
+See the following for further information:
+- info node `(calc)Strings',
+- info node `(elisp)Text Representations',
+- info node `(emacs)Text Display'."
+  :version "31.1"
+  :type '(choice (restricted-sexp :tag "Character Code"
+                                  :match-alternatives (characterp))
+                 (const :tag "ASCII" #x7F)
+                 (const :tag "Latin-1" #xFF)
+                 (const :tag "Unicode" #x10FFFF)
+                 (const :tag "Emacs" #x3FFFFF)))
 (defcalcmodevar calc-matrix-just 'center
   "If nil, vector elements are left-justified.
 If `right', vector elements are right-justified.

View file

@@ -907,13 +907,20 @@
 (concat " " math-comp-right-bracket)))))
 (defun math-vector-is-string (a)
+  "Return t if A can be displayed as a string, and nil otherwise.
+Elements of A must either be a character (see `characterp') or a complex
+number with only a real character part, each with a value less than or
+equal to the custom variable `calc-string-maximum-character'."
   (while (and (setq a (cdr a))
-              (or (and (natnump (car a))
-                       (<= (car a) 255))
+              (or (and (characterp (car a))
+                       (<= (car a)
+                           calc-string-maximum-character))
                   (and (eq (car-safe (car a)) 'cplx)
-                       (natnump (nth 1 (car a)))
+                       (characterp (nth 1 (car a)))
                        (eq (nth 2 (car a)) 0)
-                       (<= (nth 1 (car a)) 255)))))
+                       (<= (nth 1 (car a))
+                           calc-string-maximum-character)))))
   (null a))
 (defconst math-vector-to-string-chars '( ( ?\" . "\\\"" )

View file

@@ -879,5 +879,72 @@ An existing calc stack is reused, otherwise a new one is created."
   (should-error (math-read-preprocess-string nil))
   (should-error (math-read-preprocess-string 42)))
+(ert-deftest calc-math-vector-is-string ()
+  "Test `math-vector-is-string' with varying `calc-string-maximum-character'.
+All tests operate on both an integer vector and the corresponding
+complex vector. The sets covered are:
+1. `calc-string-maximum-character' is a valid character. The last case
+   with `0x3FFFFF' is borderline, as integers above it will not make it
+   past the `characterp' test.
+2. `calc-string-maximum-character' is negative, so the test always fails.
+3. `calc-string-maximum-character' is above `(max-char)', so only the
+   first `characterp' test is active.
+4. `calc-string-maximum-character' has an invalid type, which triggers
+   an error in the comparison."
+  (cl-flet* ((make-vec (lambda (contents) (append (list 'vec) contents)))
+             (make-cplx (lambda (x) (list 'cplx x 0)))
+             (make-cplx-vec (lambda (contents)
+                              (make-vec (mapcar #'make-cplx contents)))))
+    ;; 1: calc-string-maximum-character is a valid character
+    (dolist (maxchar '(#x7F #xFF #x10FFFF #x3FFFFD #x3FFFFF))
+      (let* ((calc-string-maximum-character maxchar)
+             (small-chars (number-sequence (- maxchar 2) maxchar))
+             (large-chars (number-sequence maxchar (+ maxchar 2)))
+             (small-real-vec (make-vec small-chars))
+             (large-real-vec (make-vec large-chars))
+             (small-cplx-vec (make-cplx-vec small-chars))
+             (large-cplx-vec (make-cplx-vec large-chars)))
+        (should (math-vector-is-string small-real-vec))
+        (should-not (math-vector-is-string large-real-vec))
+        (should (math-vector-is-string small-cplx-vec))
+        (should-not (math-vector-is-string large-cplx-vec))))
+    ;; 2: calc-string-maximum-character is negative
+    (let* ((maxchar -1)
+           (calc-string-maximum-character maxchar)
+           (valid-contents (number-sequence 0 2))
+           (invalid-contents (number-sequence (- maxchar 2) maxchar))
+           (valid-real-vec (make-vec valid-contents))
+           (invalid-real-vec (make-vec invalid-contents))
+           (valid-cplx-vec (make-cplx-vec valid-contents))
+           (invalid-cplx-vec (make-cplx-vec invalid-contents)))
+      (should-not (math-vector-is-string valid-real-vec))
+      (should-not (math-vector-is-string invalid-real-vec))
+      (should-not (math-vector-is-string valid-cplx-vec))
+      (should-not (math-vector-is-string invalid-cplx-vec)))
+    ;; 3: calc-string-maximum-character is larger than (max-char)
+    (let* ((maxchar (+ (max-char) 3))
+           (calc-string-maximum-character maxchar)
+           (valid-chars (number-sequence (- (max-char) 2) (max-char)))
+           (invalid-chars (number-sequence (1+ (max-char)) maxchar))
+           (valid-real-vec (make-vec valid-chars))
+           (invalid-real-vec (make-vec invalid-chars))
+           (valid-cplx-vec (make-cplx-vec valid-chars))
+           (invalid-cplx-vec (make-cplx-vec invalid-chars)))
+      (should (math-vector-is-string valid-real-vec))
+      (should-not (math-vector-is-string invalid-real-vec))
+      (should (math-vector-is-string valid-cplx-vec))
+      (should-not (math-vector-is-string invalid-cplx-vec)))
+    ;; 4: calc-string-maximum-character has the wrong type
+    (let* ((calc-string-maximum-character "wrong type")
+           (contents (number-sequence 0 2))
+           (real-vec (make-vec contents))
+           (cplx-vec (make-cplx-vec contents)))
+      (should-error (math-vector-is-string real-vec)
+                    :type 'wrong-type-argument)
+      (should-error (math-vector-is-string cplx-vec)
+                    :type 'wrong-type-argument))))
 (provide 'calc-tests)
 ;;; calc-tests.el ends here
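
The bounds that the documentation and test cases 1 to 3 rely on can be checked directly with the standard functions `characterp` and `max-char`. The snippet below is an illustrative sketch rather than part of the commit; it uses only built-in Emacs Lisp functions and does not depend on the new option.

```elisp
;; `max-char' and `characterp' are standard Emacs Lisp functions.
(list (max-char)                    ; largest character code Emacs accepts
      (characterp #x10FFFF)         ; top of the Unicode range
      (characterp (max-char))       ; top of the Emacs range, #x3FFFFF
      (characterp (1+ (max-char)))  ; one past the Emacs range
      (characterp -1))              ; negative integers are not characters
;; => (4194303 t t nil nil)
```

This is why a maximum above `(max-char)` behaves as if it were `(max-char)`: elements beyond that value already fail the `characterp` check before the comparison against the maximum is reached.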