mirror of git://git.sv.gnu.org/emacs.git synced 2026-01-03 10:31:37 -08:00

(Coding System Basics): Describe roundtrip identity of coding systems.
Kenichi Handa 2005-04-01 00:29:51 +00:00
parent 9b06ffa3dc
commit 6fa886202f


@@ -628,6 +628,28 @@ characters; for example, there are three coding systems for the Cyrillic
conversion, but some of them leave the choice unspecified---to be chosen
heuristically for each file, based on the data.
In general, a coding system doesn't guarantee roundtrip identity:
decoding a byte sequence and then encoding the result in the same coding
system can produce a different byte sequence.  However, several coding
systems do guarantee that the result will be the same as what you
originally decoded.  They are:
@quotation
chinese-big5 chinese-iso-8bit cyrillic-iso-8bit emacs-mule
greek-iso-8bit hebrew-iso-8bit iso-latin-1 iso-latin-2 iso-latin-3
iso-latin-4 iso-latin-5 iso-latin-8 iso-latin-9 iso-safe
japanese-iso-8bit japanese-shift-jis korean-iso-8bit raw-text
@end quotation
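As a rough illustration, the following sketch checks byte-sequence
roundtrip identity for one particular input.  The helper name
@code{my-roundtrip-bytes-p} is hypothetical and not part of Emacs; the
sketch assumes only the standard functions @code{decode-coding-string}
and @code{encode-coding-string}.

@example
;; Hypothetical helper, for illustration only.
(defun my-roundtrip-bytes-p (bytes coding-system)
  "Return non-nil if BYTES survive decoding then encoding.
BYTES is a unibyte string; CODING-SYSTEM is a coding system symbol."
  (let ((text (decode-coding-string bytes coding-system)))
    (string= bytes (encode-coding-string text coding-system))))

;; Two bytes outside the ASCII range, decoded as ISO Latin-1.
(my-roundtrip-bytes-p "\351\361" 'iso-latin-1)
     @result{} t
@end example

A non-@code{nil} result for one input shows only that this input
survives; it does not by itself establish the guarantee for the whole
coding system.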
Likewise, a coding system doesn't guarantee roundtrip identity in the
other direction: encoding buffer text in a coding system and then
decoding the result with the same coding system can produce different
buffer text.  For instance, when you encode Latin-2 characters with
@code{utf-8} and decode the result with the same coding system, you'll get
Unicode characters (of charset @code{mule-unicode-0100-24ff}), and when
you encode Unicode characters with @code{iso-latin-2} and decode the result
with the same coding system, you'll get Latin-2 characters.
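The text direction can be checked in a similar way.  This is a minimal
sketch; the helper name @code{my-roundtrip-text-p} is hypothetical, and
the outcome for a given string depends on the Emacs version and on its
internal character representation, so no result is shown.

@example
;; Hypothetical helper, for illustration only.
(defun my-roundtrip-text-p (text coding-system)
  "Return non-nil if TEXT survives encoding then decoding.
TEXT is a string of characters; CODING-SYSTEM is a coding system symbol."
  (let ((bytes (encode-coding-string text coding-system)))
    (string= text (decode-coding-string bytes coding-system))))

;; Usage: test some text against a coding system of interest.
(my-roundtrip-text-p "abc" 'utf-8)
@end example

Note that @code{string=} compares characters only and ignores text
properties, so the check concerns just the characters of the text.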
@cindex end of line conversion
@dfn{End of line conversion} handles three different conventions used
on various systems for representing end of line in files. The Unix