mirror of git://git.sv.gnu.org/emacs.git synced 2025-12-06 06:20:55 -08:00

calc: Allow strings with character codes above Latin-1

The current behavior of the functions 'calc-display-strings',
'string', and 'bstring' is to skip any vector containing
integers outside the Latin-1 range (0x00-0xFF).  We introduce a
custom variable 'calc-string-maximum-character' to replace this
hard-coded maximum, and to allow vectors containing higher
character codes to be displayed as strings.  The default value
of 0xFF preserves the existing behavior.

* lisp/calc/calc.el (calc-string-maximum-character): Add custom
variable 'calc-string-maximum-character'.
* lisp/calc/calccomp.el (math-vector-is-string): Replace hard-coded
maximum with 'calc-string-maximum-character', and the 'natnump'
assertion with 'characterp'.  The latter guards against the
maximum being larger than '(max-char)', but not against a maximum of
an invalid type, such as a string.

* test/lisp/calc/calc-tests.el (calc-math-vector-is-string): Add
tests for 'math-vector-is-string' using different values of
'calc-string-maximum-character'.

* doc/misc/calc.texi (Quick Calculator, Strings, Customizing Calc):
Add variable definition for 'calc-string-maximum-character' and
reference thereof when discussing 'calc-display-strings'.
Generalize a comment about string display and availability of 8-bit
fonts.
(Bug#78528)
Jacob S. Gordon 2025-05-19 15:05:37 -04:00 committed by Eli Zaretskii
parent 82766b71a4
commit 5bd9fa084d
5 changed files with 164 additions and 17 deletions
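
To make the effect of the new option concrete, here is a minimal Emacs Lisp sketch. It assumes an Emacs build that already contains this commit (so that `calc-string-maximum-character` exists), and it uses Calc's internal `(vec ...)` list form, as the tests in this commit do. The variable name `my-demo-vec` is hypothetical and serves only as an illustration.

```elisp
;; Sketch only: assumes an Emacs that includes this commit, so that
;; `calc-string-maximum-character' is defined.
(require 'calc)
(require 'calccomp)   ; provides `math-vector-is-string'

;; Hypothetical demo value: Calc's internal form of the vector
;; [955, 120], i.e. the character codes of "λ" (U+03BB) and "x".
(defvar my-demo-vec '(vec 955 120))

;; With the default maximum (#xFF), 955 is out of range, so the
;; vector is not treated as a string.
(let ((calc-string-maximum-character #xFF))
  (math-vector-is-string my-demo-vec))      ; => nil

;; Raising the maximum to the Unicode range lets the same vector
;; qualify for string display.
(let ((calc-string-maximum-character #x10FFFF))
  (math-vector-is-string my-demo-vec))      ; => t
```

The `let` bindings only keep the example free of side effects. For a lasting change, the option would normally be set through the Customize interface, since it is declared with `defcustom`.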

doc/misc/calc.texi

@@ -10179,7 +10179,7 @@ result @samp{[120]} (because 120 is the ASCII code of the lower-case
 is displayed only according to the current mode settings. But
 running Quick Calc again and entering @samp{120} will produce the
 result @samp{120 (16#78, 8#170, x)} which shows the number in its
-decimal, hexadecimal, octal, and ASCII forms.
+decimal, hexadecimal, octal, and character forms.
 Please note that the Quick Calculator is not any faster at loading
 or computing the answer than the full Calculator; the name ``quick''
@@ -10836,11 +10836,11 @@ from 1 to @samp{n}.
 @cindex Strings
 @cindex Character strings
 Character strings are not a special data type in the Calculator.
-Rather, a string is represented simply as a vector all of whose
-elements are integers in the range 0 to 255 (ASCII codes). You can
-enter a string at any time by pressing the @kbd{"} key. Quotation
-marks and backslashes are written @samp{\"} and @samp{\\}, respectively,
-inside strings. Other notations introduced by backslashes are:
+Rather, a string is represented simply as a vector all of whose elements
+are integers in the Latin-1 range 0 to 255. You can enter a string at
+any time by pressing the @kbd{"} key. Quotation marks and backslashes
+are written @samp{\"} and @samp{\\}, respectively, inside strings.
+Other notations introduced by backslashes are:
 @example
 @group
@@ -10857,21 +10857,24 @@ inside strings. Other notations introduced by backslashes are:
 @noindent
 Finally, a backslash followed by three octal digits produces any
-character from its ASCII code.
+character from its code.
 @kindex d "
 @pindex calc-display-strings
 Strings are normally displayed in vector-of-integers form. The
 @w{@kbd{d "}} (@code{calc-display-strings}) command toggles a mode in
 which any vectors of small integers are displayed as quoted strings
-instead.
+instead. The display of strings containing higher character codes can
+be enabled by increasing the custom variable
+@code{calc-string-maximum-character} (@pxref{Customizing Calc}).
 The backslash notations shown above are also used for displaying
-strings. Characters 128 and above are not translated by Calc; unless
-you have an Emacs modified for 8-bit fonts, these will show up in
-backslash-octal-digits notation. For characters below 32, and
-for character 127, Calc uses the backslash-letter combination if
-there is one, or otherwise uses a @samp{\^} sequence.
+strings. For ASCII control characters (below 32), and for the
+@code{DEL} character (127), Calc uses the backslash-letter combination
+if there is one, or otherwise uses a @samp{\^} sequence. Control
+characters above 127 are not translated by Calc, and will show up in
+backslash-octal-digits notation. The display of higher character codes
+will depend on your display settings and system font coverage.
 The only Calc feature that uses strings is @dfn{compositions};
 @pxref{Compositions}. Strings also provide a convenient
@@ -35684,6 +35687,33 @@ choose from, or the user can enter their own date.
 The default value of @code{calc-gregorian-switch} is @code{nil}.
 @end defvar
+@defvar calc-string-maximum-character
+@xref{Strings}.@*
+The variable @code{calc-string-maximum-character} is the maximum value
+of a vector's elements for @code{calc-display-strings}, @code{string},
+and @code{bstring} to display the vector as a string. This maximum
+@emph{must} represent a character, i.e. it's a non-negative integer less
+than or equal to @code{(max-char)} or @code{0x3FFFFF}. Any negative
+value effectively disables the display of strings, and for values larger
+than @code{0x3FFFFF} the display acts as if the maximum were
+@code{0x3FFFFF}. Some natural choices (and their resulting ranges) are:
+@itemize
+@item
+@code{0x7F} or 127 (ASCII),
+@item
+@code{0xFF} or 255 (Latin-1, the default),
+@item
+@code{0x10FFFF} (Unicode),
+@item
+@code{0x3FFFFF} (Emacs).
+@end itemize
+The default value of @code{calc-string-maximum-character} is @code{0xFF}
+or 255.
+@end defvar
 @node Reporting Bugs
 @appendix Reporting Bugs

etc/NEWS

@@ -2182,7 +2182,19 @@ modifier, it scrolls by year.
 The month and year navigation key bindings 'M-}', 'M-{', 'C-x ]' and
 'C-x [' now have the alternative keys '}', '{', ']' and '['.
+** Calc
+*** New user option 'calc-string-maximum-character'.
+Previously, the 'calc-display-strings', 'string', and 'bstring'
+functions only considered integer vectors whose elements are all in the
+Latin-1 range 0-255. This hard-coded maximum is replaced by
+'calc-string-maximum-character', and setting it to a higher value allows
+the display of matching vectors as Unicode strings. The default value
+is '0xFF' or '255' to preserve the existing behavior.
 * New Modes and Packages in Emacs 31.1
 ** New minor mode 'delete-trailing-whitespace-mode'.

lisp/calc/calc.el

@@ -628,6 +628,37 @@ Otherwise, 1 / 0 is changed to uinf (undirected infinity).")
 (defcalcmodevar calc-display-strings nil
   "If non-nil, display vectors of byte-sized integers as strings.")
+(defcustom calc-string-maximum-character #xFF
+  "Maximum value of vector contents to be displayed as a string.
+If a vector consists of characters up to this maximum value, the
+function `calc-display-strings' will toggle displaying the vector as a
+string. This maximum value must represent a character (see `characterp').
+Some natural choices (and their resulting ranges) are:
+- `0x7F' (`ASCII'),
+- `0xFF' (`Latin-1', the default),
+- `0x10FFFF' (`Unicode'),
+- `0x3FFFFF' (`Emacs').
+Characters for low control codes are either caret or backslash escaped,
+while others without a glyph are displayed in backslash-octal notation.
+The display of strings containing higher character codes will depend on
+your display settings and system font coverage.
+See the following for further information:
+- info node `(calc)Strings',
+- info node `(elisp)Text Representations',
+- info node `(emacs)Text Display'."
+  :version "31.1"
+  :type '(choice (restricted-sexp :tag "Character Code"
+                                  :match-alternatives (characterp))
+                 (const :tag "ASCII" #x7F)
+                 (const :tag "Latin-1" #xFF)
+                 (const :tag "Unicode" #x10FFFF)
+                 (const :tag "Emacs" #x3FFFFF)))
 (defcalcmodevar calc-matrix-just 'center
   "If nil, vector elements are left-justified.
 If `right', vector elements are right-justified.

lisp/calc/calccomp.el

@@ -907,13 +907,20 @@
 (concat " " math-comp-right-bracket)))))
 (defun math-vector-is-string (a)
+  "Return t if A can be displayed as a string, and nil otherwise.
+Elements of A must either be a character (see `characterp') or a complex
+number with only a real character part, each with a value less than or
+equal to the custom variable `calc-string-maximum-character'."
   (while (and (setq a (cdr a))
-              (or (and (natnump (car a))
-                       (<= (car a) 255))
+              (or (and (characterp (car a))
+                       (<= (car a)
+                           calc-string-maximum-character))
                   (and (eq (car-safe (car a)) 'cplx)
-                       (natnump (nth 1 (car a)))
+                       (characterp (nth 1 (car a)))
                        (eq (nth 2 (car a)) 0)
-                       (<= (nth 1 (car a)) 255)))))
+                       (<= (nth 1 (car a))
+                           calc-string-maximum-character)))))
   (null a))
 (defconst math-vector-to-string-chars '( ( ?\" . "\\\"" )

test/lisp/calc/calc-tests.el

@@ -879,5 +879,72 @@ An existing calc stack is reused, otherwise a new one is created."
   (should-error (math-read-preprocess-string nil))
   (should-error (math-read-preprocess-string 42)))
+(ert-deftest calc-math-vector-is-string ()
+  "Test `math-vector-is-string' with varying `calc-string-maximum-character'.
+All tests operate on both an integer vector and the corresponding
+complex vector. The sets covered are:
+1. `calc-string-maximum-character' is a valid character. The last case
+with `0x3FFFFF' is borderline, as integers above it will not make it
+past the `characterp' test.
+2. `calc-string-maximum-character' is negative, so the test always fails.
+3. `calc-string-maximum-character' is above `(max-char)', so only the
+first `characterp' test is active.
+4. `calc-string-maximum-character' has an invalid type, which triggers
+an error in the comparison."
+  (cl-flet* ((make-vec (lambda (contents) (append (list 'vec) contents)))
+             (make-cplx (lambda (x) (list 'cplx x 0)))
+             (make-cplx-vec (lambda (contents)
+                              (make-vec (mapcar #'make-cplx contents)))))
+    ;; 1: calc-string-maximum-character is a valid character
+    (dolist (maxchar '(#x7F #xFF #x10FFFF #x3FFFFD #x3FFFFF))
+      (let* ((calc-string-maximum-character maxchar)
+             (small-chars (number-sequence (- maxchar 2) maxchar))
+             (large-chars (number-sequence maxchar (+ maxchar 2)))
+             (small-real-vec (make-vec small-chars))
+             (large-real-vec (make-vec large-chars))
+             (small-cplx-vec (make-cplx-vec small-chars))
+             (large-cplx-vec (make-cplx-vec large-chars)))
+        (should (math-vector-is-string small-real-vec))
+        (should-not (math-vector-is-string large-real-vec))
+        (should (math-vector-is-string small-cplx-vec))
+        (should-not (math-vector-is-string large-cplx-vec))))
+    ;; 2: calc-string-maximum-character is negative
+    (let* ((maxchar -1)
+           (calc-string-maximum-character maxchar)
+           (valid-contents (number-sequence 0 2))
+           (invalid-contents (number-sequence (- maxchar 2) maxchar))
+           (valid-real-vec (make-vec valid-contents))
+           (invalid-real-vec (make-vec invalid-contents))
+           (valid-cplx-vec (make-cplx-vec valid-contents))
+           (invalid-cplx-vec (make-cplx-vec invalid-contents)))
+      (should-not (math-vector-is-string valid-real-vec))
+      (should-not (math-vector-is-string invalid-real-vec))
+      (should-not (math-vector-is-string valid-cplx-vec))
+      (should-not (math-vector-is-string invalid-cplx-vec)))
+    ;; 3: calc-string-maximum-character is larger than (max-char)
+    (let* ((maxchar (+ (max-char) 3))
+           (calc-string-maximum-character maxchar)
+           (valid-chars (number-sequence (- (max-char) 2) (max-char)))
+           (invalid-chars (number-sequence (1+ (max-char)) maxchar))
+           (valid-real-vec (make-vec valid-chars))
+           (invalid-real-vec (make-vec invalid-chars))
+           (valid-cplx-vec (make-cplx-vec valid-chars))
+           (invalid-cplx-vec (make-cplx-vec invalid-chars)))
+      (should (math-vector-is-string valid-real-vec))
+      (should-not (math-vector-is-string invalid-real-vec))
+      (should (math-vector-is-string valid-cplx-vec))
+      (should-not (math-vector-is-string invalid-cplx-vec)))
+    ;; 4: calc-string-maximum-character has the wrong type
+    (let* ((calc-string-maximum-character "wrong type")
+           (contents (number-sequence 0 2))
+           (real-vec (make-vec contents))
+           (cplx-vec (make-cplx-vec contents)))
+      (should-error (math-vector-is-string real-vec)
+                    :type 'wrong-type-argument)
+      (should-error (math-vector-is-string cplx-vec)
+                    :type 'wrong-type-argument))))
 (provide 'calc-tests)
 ;;; calc-tests.el ends here