mirror of
git://git.sv.gnu.org/emacs.git
synced 2026-01-30 04:10:54 -08:00
Improve documentation of bidi in ELisp manual.
doc/lispref/nonascii.texi (Character Properties): Document use of `bidi-class' and `mirroring' properties as part of reordering. Provide cross-references to "Bidirectional Display".
doc/lispref/display.texi (Bidirectional Display): Document the pitfalls of concatenating strings with bidirectional content, with possible solutions. Document string-mark-left-to-right. Mention paragraph direction in modes that inherit from prog-mode. Document use of `bidi-class' and `mirroring' properties as part of reordering.
etc/NEWS: Mark string-mark-left-to-right as documented.
parent
4dcb0d7a58
commit
c094bb0cf7
4 changed files with 107 additions and 17 deletions
@@ -1,3 +1,15 @@
+2011-08-18 Eli Zaretskii <eliz@gnu.org>
+
+* nonascii.texi (Character Properties): Document use of
+`bidi-class' and `mirroring' properties as part of reordering.
+Provide cross-references to "Bidirectional Display".
+
+* display.texi (Bidirectional Display): Document the pitfalls of
+concatenating strings with bidirectional content, with possible
+solutions. Document string-mark-left-to-right. Mention paragraph
+direction in modes that inherit from prog-mode. Document use of
+`bidi-class' and `mirroring' properties as part of reordering.
+
 2011-08-16 Eli Zaretskii <eliz@gnu.org>

 * modes.texi (Major Mode Conventions): Improve the documentation

@@ -5992,6 +5992,7 @@ left-to-right and right-to-left characters.
 for editing and displaying bidirectional text.

 @cindex logical order
+@cindex reading order
 @cindex visual order
 @cindex unicode bidirectional algorithm
 Emacs stores right-to-left and bidirectional text in the so-called

@@ -6006,17 +6007,16 @@ for display. Reordering of bidirectional text for display in Emacs is
 a ``Full bidirectionality'' class implementation of the @acronym{UBA}.

 @defvar bidi-display-reordering
-The buffer-local variable @code{bidi-display-reordering} controls
-whether text in the buffer is reordered for display. If its value is
-non-@code{nil}, Emacs reorders characters that have right-to-left
-directionality when they are displayed. The default value is
-@code{t}. Text in overlay strings (@pxref{Overlay
-Properties,,before-string}), display strings (@pxref{Overlay
-Properties,,display}), and @code{display} text properties
-(@pxref{Display Property}) is also reordered if the buffer whose text
-includes these strings is reordered for display. Turning off
-@code{bidi-display-reordering} for a buffer turns off reordering of
-all the overlay and display strings in that buffer.
+This buffer-local variable controls whether text in the buffer is
+reordered for display. If its value is non-@code{nil}, Emacs reorders
+characters that have right-to-left directionality when they are
+displayed. The default value is @code{t}. Text in overlay strings
+(@pxref{Overlay Properties,,before-string}), display strings
+(@pxref{Overlay Properties,,display}), and @code{display} text
+properties (@pxref{Display Property}) is also reordered for display if
+the buffer whose text includes these strings is reordered. Turning
+off @code{bidi-display-reordering} for a buffer turns off reordering
+of all the overlay and display strings in that buffer.

 Reordering of strings that are unrelated to any buffer, such as text
 displayed on the mode line (@pxref{Mode Line Format}) or header line

@@ -6056,7 +6056,7 @@ it is reordered for display. That is, the entire chunk of text
 covered by these properties is reordered together. Moreover, the
 bidirectional properties of the characters in this chunk of text are
 ignored, and Emacs reorders them as if they were replaced with a
-single character @code{u+FFFC}, known as the @dfn{Object Replacement
+single character @code{U+FFFC}, known as the @dfn{Object Replacement
 Character}. This means that placing a display property over a portion
 of text may change the way that the surrounding text is reordered for
 display. To prevent this unexpected effect, always place such

@@ -6073,9 +6073,9 @@ begins at the right margin and is continued or truncated at the left
 margin.

 @defvar bidi-paragraph-direction
-Emacs determines the base direction of each paragraph dynamically,
-based on the text at the beginning of the paragraph. The precise
-method of determining the base direction is specified by the
+By default, Emacs determines the base direction of each paragraph
+dynamically, based on the text at the beginning of the paragraph. The
+precise method of determining the base direction is specified by the
 @acronym{UBA}; in a nutshell, the first character in a paragraph that
 has an explicit directionality determines the base direction of the
 paragraph. However, sometimes a buffer may need to force a certain

@@ -6087,6 +6087,13 @@ dynamic determination of the base direction, and instead forces all
 paragraphs in the buffer to have the direction specified by its
 buffer-local value. The value can be either @code{right-to-left} or
 @code{left-to-right}. Any other value is interpreted as @code{nil}.
+The default is @code{nil}.
+
+@cindex @code{prog-mode}, and @code{bidi-paragraph-direction}
+Modes that are meant to display program source code should force a
+@code{left-to-right} paragraph direction. The easiest way of doing so
+is to derive the mode from Prog Mode, which already sets
+@code{bidi-paragraph-direction} to that value.
 @end defvar
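
As a rough illustration of the advice above, the following sketch defines two major modes. The names `my-source-view-mode` and `my-report-mode` are hypothetical and exist only for this example. The first mode inherits the left-to-right paragraph direction from Prog Mode; the second does not derive from Prog Mode, so it sets `bidi-paragraph-direction` itself.

```elisp
;; Hypothetical mode derived from Prog Mode: nothing else is needed,
;; because Prog Mode already sets `bidi-paragraph-direction' to
;; `left-to-right'.
(define-derived-mode my-source-view-mode prog-mode "My-Source"
  "Hypothetical major mode for viewing program source code.")

;; Hypothetical mode that does not inherit from Prog Mode: force the
;; paragraph direction explicitly in the mode body.
(define-derived-mode my-report-mode special-mode "My-Report"
  "Hypothetical major mode for a column-oriented report buffer."
  (setq-local bidi-paragraph-direction 'left-to-right))
```

In a buffer that uses either mode, `(current-bidi-paragraph-direction)` should then report `left-to-right` regardless of the buffer text, since the variable is non-`nil`.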

 @defun current-bidi-paragraph-direction &optional buffer

@@ -6099,3 +6106,70 @@ non-@code{nil}, the returned value will be identical to that value;
 otherwise, the returned value reflects the paragraph direction
 determined dynamically by Emacs.
 @end defun
+
+@cindex layout on display, and bidirectional text
+@cindex jumbled display of bidirectional text
+@cindex concatenating bidirectional strings
+Reordering of bidirectional text for display can have surprising and
+unpleasant effects when two strings with bidirectional content are
+juxtaposed in a buffer, or otherwise programmatically concatenated
+into a string of text. A typical example is a buffer whose lines are
+actually sequences of items, or fields, separated by whitespace or
+punctuation characters. This is used in specialized modes such as
+Buffer-menu Mode or various email summary modes, like Rmail Summary
+Mode. Because these separator characters are @dfn{weak}, i.e.@: have
+no strong directionality, they take on the directionality of
+surrounding text. As a result, a numeric field that follows a field
+with bidirectional content can be displayed @emph{to the left} of the
+preceding field, producing a jumbled display and messing up the
+expected layout.
+
+To countermand this, you can use one of the following techniques for
+forcing correct order of fields on display:
+
+@itemize @minus
+@item
+Append the special character @code{U+200E}, LEFT-TO-RIGHT MARK, or
+@acronym{LRM}, to the end of each field that may have bidirectional
+content, or prepend it to the beginning of the following field. The
+function @code{string-mark-left-to-right}, described below, comes in
+handy for this purpose. (In a right-to-left paragraph, use
+@code{U+200F}, RIGHT-TO-LEFT MARK, or @acronym{RLM}, instead.) This
+is one of the solutions recommended by
+@uref{http://www.unicode.org/reports/tr9/#Separators, the
+@acronym{UBA}}.
+
+@item
+Include the tab character in the field separator. The tab character
+plays the role of @dfn{segment separator} in the @acronym{UBA}
+reordering, whose effect is to make each field a separate segment, and
+thus reorder them separately.
+@end itemize
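
The two techniques listed above can be sketched as follows. The names `my-lrm`, `my-join-fields-lrm` and `my-join-fields-tab` are hypothetical and serve only this illustration; both helpers assume the result is displayed in a left-to-right paragraph.

```elisp
(defconst my-lrm (string #x200E)
  "U+200E LEFT-TO-RIGHT MARK as a one-character string (hypothetical name).")

(defun my-join-fields-lrm (fields)
  "Join FIELDS, a list of strings, with spaces, appending LRM to each field.
The LRM has strong left-to-right directionality, which is meant to keep
a following separator from taking on right-to-left directionality."
  (mapconcat (lambda (field) (concat field my-lrm)) fields " "))

(defun my-join-fields-tab (fields)
  "Join FIELDS, a list of strings, using a tab as the field separator.
The tab acts as a segment separator, so each field is reordered
separately."
  (mapconcat #'identity fields "\t"))
```

The first helper adds the mark unconditionally, which is the simplest choice; a caller that wants to add it only when needed would inspect the field contents first.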
+
+@defun string-mark-left-to-right string
+This subroutine returns its argument @var{string}, possibly modified,
+such that the result can be safely concatenated with another string,
+or juxtaposed with another string in a buffer, without disrupting the
+relative layout of this string and the next one on display. If the
+string returned by this function is displayed as part of a
+left-to-right paragraph, it will always appear on display to the left
+of the text that follows it. The function works by examining the
+characters of its argument, and if any of those characters could cause
+reordering on display, the function appends the @acronym{LRM}
+character to the string. The appended @acronym{LRM} character is made
+@emph{invisible} (@pxref{Invisible Text}), to hide it on display.
+@end defun
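
To make the described behavior concrete, here is a minimal sketch of a function that works along the same lines. It is not the actual definition of `string-mark-left-to-right`: the name `my-mark-left-to-right` is hypothetical, and treating only the bidi classes `R` and `AL` as able to cause reordering is an assumption made for this example.

```elisp
(defun my-mark-left-to-right (string)
  "Return STRING, with an invisible LRM appended if it has RTL characters.
Hypothetical sketch: only characters whose `bidi-class' is R or AL
are treated as capable of causing reordering."
  (let ((i 0)
        (len (length string))
        (found nil))
    ;; Scan until a strong right-to-left character is found.
    (while (and (not found) (< i len))
      (when (memq (get-char-code-property (aref string i) 'bidi-class)
                  '(R AL))
        (setq found t))
      (setq i (1+ i)))
    (if found
        ;; U+200E is LEFT-TO-RIGHT MARK; the `invisible' text property
        ;; hides it on display.
        (concat string (propertize (string #x200E) 'invisible t))
      string)))
```

A string without right-to-left characters is returned unchanged, so calling the function on every field of a line is harmless.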
+
+The reordering algorithm uses the bidirectional properties of the
+characters stored as their @code{bidi-class} property
+(@pxref{Character Properties}). Lisp programs can change these
+properties by calling the @code{put-char-code-property} function.
+However, doing this requires a thorough understanding of the
+@acronym{UBA}, and is therefore not recommended. Any changes to the
+bidirectional properties of a character have global effect: they
+affect all Emacs frames and windows.
+
+Similarly, the @code{mirroring} property is used to display the
+appropriate mirrored character in the reordered text. Lisp programs
+can affect the mirrored display by changing this property. Again, any
+such changes affect all of Emacs display.
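
For reference, both properties can be inspected with `get-char-code-property`. The snippet below only reads the properties and changes nothing. The values in the comments are the ones expected from the Unicode character data that Emacs ships, and are given for illustration.

```elisp
;; Bidi classes of a few characters.
(get-char-code-property ?a 'bidi-class)      ; L  (LATIN SMALL LETTER A)
(get-char-code-property #x05D0 'bidi-class)  ; R  (HEBREW LETTER ALEF)
(get-char-code-property ?\s 'bidi-class)     ; WS (space)
(get-char-code-property ?\t 'bidi-class)     ; S  (tab, a segment separator)

;; Mirroring: the character displayed in place of this one when the
;; text is mirrored, or nil if there is none.
(get-char-code-property ?\( 'mirroring)      ; ?\) (that is, 41)
(get-char-code-property ?a 'mirroring)       ; nil
```

The `S` class of the tab character is what makes it usable as a field separator that splits a line into independently reordered segments.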

@@ -392,7 +392,8 @@ The value is an integer number.
 @item bidi-class
 Corresponds to the Unicode @code{Bidi_Class} property. The value is a
 symbol whose name is the Unicode @dfn{directional type} of the
-character.
+character. Emacs uses this property when it reorders bidirectional
+text for display (@pxref{Bidirectional Display}).

 @item decomposition
 Corresponds to the Unicode @code{Decomposition_Type} and

@@ -440,7 +441,9 @@ defined mirroring glyph. All the characters whose @code{mirrored}
 property is @code{N} have @code{nil} as their @code{mirroring}
 property; however, some characters whose @code{mirrored} property is
 @code{Y} also have @code{nil} for @code{mirroring}, because no
-appropriate characters exist with mirrored glyphs.
+appropriate characters exist with mirrored glyphs. Emacs uses this
+property to display mirror images of characters when appropriate
+(@pxref{Bidirectional Display}).

 @item old-name
 Corresponds to the Unicode @code{Unicode_1_Name} property. The value

1
etc/NEWS
@@ -1043,6 +1043,7 @@ of function value which looks like (closure ENV ARGS &rest BODY).
 *** New function `special-variable-p' to check whether a variable is
 declared as dynamically bound.

++++
 ** New function `string-mark-left-to-right'.
 Given a string containing right-to-left (RTL) script, this function
 returns another string with a terminating LRM (left-to-right mark)