Commit graph

9017 commits

Author SHA1 Message Date
Daniel Kochmański
47c17cbfa2 predlib: add accessors for *elementary-types* and *member-types*
Previously elementary types were considered to be (CONS SPECC TAG), but I want to
introduce additional slot information to them, so we define a structure for that
type. The representation is a list because MAYBE-SAVE-TYPES calls COPY-TREE. Also
DEFSTRUCT is not available yet.

Rename PUSH-TYPE to PUSH-NEW-TYPE and move it to a correct section in the file.
2025-09-08 09:16:25 +02:00
Daniel Kochmański
783289c629 predlib: assert an important property when adding a new type
The property in question is a strict total order within the kingdom.
2025-08-27 10:46:58 +02:00
Daniel Kochmański
6de56d977f predlib: cosmetic cleanup
Fix comment depth (;; -> ;;;) and simplify a few expressions.
2025-08-27 10:46:58 +02:00
Daniel Kochmański
ef5d534af2 subtypep: refactor canonical number types for consistency
Previously CANONICAL-COMPLEX-TYPE accepted the specializer and that was not
consistent with other functions handling canonical types.

Rename REGISTER-INTERVAL-TYPE to CANONICAL-INTERVAL-TYPE because this function
may register numerous elementary types and return their bit-wise composition,
and rename REGISTER-ELEMENTARY-INTERVAL to REGISTER-INTERVAL-TYPE.
2025-08-27 10:46:58 +02:00
Daniel Kochmański
19eb060d14 subtypep: refactor register-interval-type
We use destructuring to bind elements of the type, and both high and low tag
computation follows the same code shape to highlight similarities.
2025-08-27 10:46:58 +02:00
Daniel Kochmański
b7a22e904b subtypep: ensure that all registered types have total order
This allows us to remove the kludge from FIND-TYPE-BOUNDS - the parameter
MINIMIZE-SUPER was to allow registering ranges that are in a canonical
form (that is left-bound).

We don't register types that may be obtained by a composition of other
registered types to avoid fake aliasing.
2025-08-27 10:46:58 +02:00
Daniel Kochmański
25f825efff subtypep: don't add a new tag for equivalent types
The function MAKE-REGISTERED-TAG calls FIND-TYPE-BOUNDS to determine supertypes
that need to be updated with the new bit, and subtypes that need to be included
in the new tag.

This procedure was bogus because it did not recognize equivalent types. That
led to a situation where synonymous types could have been added twice with an
incorrect relation. Consider:

type A: 011
type B: 001

We add a type C that is equivalent to A, and B is a subtype of both. With the old
method the result would be:

type A: 111
type B: 001
type C: 101

So if we had later queried whether A is a subtype of C, then the answer would be
incorrectly NIL.

The bug was hidden by the fact that CANONICAL-TYPE expands type aliases when
they are symbols, so we had never encountered a situation where equivalent types
had different names in *ELEMENTARY-TYPES*. This changes when we introduce the
new kingdom for CONS type, because the key is (CONS X Y), and symbols in type
names X Y are not expanded, so

  (CONS (OR FIXNUM BIGNUM)) is not EQUAL to (CONS INTEGER)
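
The example above can be checked with plain bit arithmetic. The following is a minimal sketch, not ECL code: the helper `tag_subtypep` and the `OLD_*`/`NEW_*` constants are hypothetical names, and the post-fix tags assume that an equivalent type simply reuses the existing tag, as the commit subject suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: with bit-set tags, X is a subtype of Y
   exactly when every bit set in X is also set in Y. */
static bool tag_subtypep(uint32_t x, uint32_t y)
{
    return (x & ~y) == 0;
}

/* Tags produced by the old method in the example above. */
enum {
    OLD_A = 0x7, /* 111 */
    OLD_B = 0x1, /* 001 */
    OLD_C = 0x5  /* 101 */
};

/* Assumed result after the fix: C is equivalent to A, so it
   reuses the tag of A and no new bit is allocated. */
enum {
    NEW_A = 0x3, /* 011 */
    NEW_B = 0x1, /* 001 */
    NEW_C = 0x3  /* 011 */
};
```

With the old tags, `tag_subtypep(OLD_A, OLD_C)` is false because A carries the middle bit that C lacks, which is the incorrect NIL described above. With a shared tag the relation holds in both directions.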
2025-08-27 10:46:58 +02:00
Daniel Kochmański
fd101452b6 subtypep: introduce the function MAKE-REGISTERED-TAG
This function is used by REGISTER-ELEMENTARY-INTERVAL and REGISTER-TYPE.
Additionally we drop the call to LOGANDC2 in the invocation of UPDATE-TYPE,
because FIND-TYPE-BOUNDS always does that for us (so it was redundant).

Also remove redundant (and unused) function BOUNDS-<.
2025-08-27 10:46:58 +02:00
Daniel Kochmański
cfe1dec177 subtypep: rebind type variables with a macro WITH-TYPE-DATABASE
It seems that some variables were rebound also in cmptype-arith.lsp -- to avoid
potential inconsistency we abstract away bindings as WITH-TYPE-DATABASE.
2025-08-27 10:46:58 +02:00
Daniel Kochmański
e8f931c484 subtypep: fix the expansion of the type STRING
The type STRING was defined as an alias to (ARRAY CHARACTER (*)) and that was
inconsistent with the type definition for unicode builds, it should be:

    (OR (ARRAY CHARACTER (*))
        (ARRAY BASE-CHAR (*)))
2025-08-27 10:46:58 +02:00
Daniel Kochmański
135632bedf subtypep: small refactor of find-built-in-tag
Instead of relying on default value in gethash, we handle NIL separately and use
FOUNDP in the last case. That reduces code nesting and makes it more readable.
2025-08-27 10:46:58 +02:00
Daniel Kochmański
fb5969cdcc subtypep: cleanup; remove unnecessary call
FIND-BUILT-IN-TAG works only on type specifiers being symbols, but we've
already established that this type specifier is a cons (COMPLEX ,@args).
2025-08-27 10:46:58 +02:00
Daniel Kochmański
019579dd46 subtypep: use constants for hardcoded tags for T and NIL
+BUILT-IN-TYPE-NIL+ and +BUILT-IN-TYPE-T+ are the bottom and top types of the
Common Lisp type system. They were sometimes referred to in the code as naked
integers - we change that by defining constants to better convey the meaning.
2025-08-27 10:46:58 +02:00
Daniel Kochmański
f04f0ac160 bytevm: fix a possible segmentation fault in OP_PUSHKEYS
The issue was revealed by registering long (EQL LIST) elements as cons types --
essentially we've reached the frame size limit in the middle of the loop, the
frame was resized, but the pointer `first' was relative to the old frame base.

The solution is to reinitialize the pointer before each iteration.
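
The failure mode is the usual one for growable buffers: a pointer taken before a resize may dangle afterwards. The sketch below is an illustration only and does not reproduce the bytecode interpreter; `struct frame`, `frame_push` and `frame_duplicate_prefix` are hypothetical names. It shows the shape of the fix, which is to re-derive the pointer from the current base on every iteration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable frame; not ECL's actual data structure. */
struct frame {
    int *base;
    size_t size;
    size_t capacity;
};

/* Push a value, growing the storage when it is full.
   Returns 0 on success and -1 on allocation failure. */
static int frame_push(struct frame *f, int value)
{
    if (f->size == f->capacity) {
        size_t cap = f->capacity ? f->capacity * 2 : 4;
        int *p = realloc(f->base, cap * sizeof *p);
        if (p == NULL)
            return -1;
        f->base = p; /* pointers into the old storage may now dangle */
        f->capacity = cap;
    }
    f->base[f->size++] = value;
    return 0;
}

/* Append copies of the first n slots to the end of the frame. */
static int frame_duplicate_prefix(struct frame *f, size_t n)
{
    assert(n <= f->size);
    for (size_t i = 0; i < n; i++) {
        /* Reinitialized on each iteration: caching this pointer
           outside the loop would leave it relative to the old base
           once frame_push has resized the frame. */
        int *first = f->base;
        if (frame_push(f, first[i]) != 0)
            return -1;
    }
    return 0;
}
```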
2025-08-27 10:46:58 +02:00
Daniel Kochmański
b9e0a7f949 Clarify INSTALL instructions for Windows (replace ...) 2025-08-23 14:14:12 +02:00
Daniel Kochmański
e5aa9c1110 Fix a braino in CL:FINISH-OUTPUT which called ecl_force_output
The function CL:FINISH-OUTPUT called by accident ECL_FORCE_OUTPUT when used on
ANSI streams. That becomes an issue when we call it on a two-way stream where
the output buffer was a gray stream with STREAM-FINISH-OUTPUT differing from
STREAM-FORCE-OUTPUT.
2025-08-13 14:19:56 +02:00
Marius Gerbershagen
a2019ce31a Merge branch 'object-streams' into 'develop'
Bivalent streams improvements

See merge request embeddable-common-lisp/ecl!355
2025-08-11 16:41:03 +00:00
Daniel Kochmański
86876f1dc3 Update changelog 2025-08-11 10:01:41 +02:00
Daniel Kochmański
c33c8f2ef7 tests: add a test for failed conversion byte->char
Sometimes bytes are outside of the character range. In that case we should
signal an error when we try to read them.
2025-08-11 10:01:41 +02:00
Daniel Kochmański
41f52d8d0f streams: bivalent stream signals a condition for bytes out of range
Sometimes a byte may not be within the character code range. In that case, when
we read the char, the system will signal a condition.

Alternatively (and that's the behavior before this commit) we could return the
character #\Nul. That was done by virtue of ECL_CHAR_CODE skipping tag bytes, so
the returned NIL was treated as 0.
2025-08-11 10:01:40 +02:00
Daniel Kochmański
43fef5fad8 streams: address a possible segfault in sequence streams
Byte streams transcoding to :ucs-2 and :ucs-4 don't call ecl_set_stream_elt_type,
effectively not initializing .byte_buffer. Moreover, the functions seq_in_read_byte8
and seq_out_write_byte8 assume the vector type to be octet-based, and they
increment the stream position and test for its limit according to that.

That means that ecl_binary_read_byte and ecl_binary_write_byte calls would
segfault when seq_in_read_byte8 and seq_out_write_byte8 are called.

Both conditions could be easily mitigated by initializing .byte_buffer manually
and fixing seq_*_*_byte8 functions to account for the byte size, but there is no
need for that, because for these streams we are not using

ecl_binary_*_byte
ecl_eformat_*_byte

so byte8 functions are not called and .byte_buffer is not used.
2025-08-11 10:01:40 +02:00
Daniel Kochmański
c7d78a412e tests: add tests for bivalent streams based on sequences 2025-08-11 10:01:40 +02:00
Daniel Kochmański
26a22057e5 streams: introduce direct bivalent sequence streams
Previously sequence streams always needed to go through the eformat and binary
encoders and decoders -- if bytes were too big, then we couldn't create sequence
streams from them.

After this commit it is possible to pass a character stream or a byte stream and
use it as a bivalent stream without a roundtrip for encoding and decoding.
2025-08-11 10:01:40 +02:00
Daniel Kochmański
688eceb9ed streams: bring bivalent streams UNREAD-CHAR and UNREAD-BYTE together
This finishes the commit that adds unread-byte and peek-byte functions to the
mix, in that for a bivalent stream UNREAD-BYTE will work for the subsequent
READ-CHAR and vice versa. This also caters to transcoding etc.
2025-08-11 10:01:40 +02:00
Daniel Kochmański
b7eaf35502 streams: move byte_stack to strm_os and improve UNREAD-BYTE
The .byte_stack is used only by files to:
a) unread a single octet when we use fallback LISTEN implementation
b) unread bytes that make a character when UNREAD-CHAR is used

The latter is important to transcode characters from one external format to
another (e.g. see the test external-format.0003-transcode-read-char).

This commit improves the function unread-byte to do the same, bringing bivalent
streams almost to parity with regard to that implementation (see next commit).

That makes the implementation of eformat cleaner, .byte_stack more
self-contained, and saves us consing new byte stack for sequence streams (where
it was simply ignored, not to mention not entirely correct - because we've used
a .byte_stack length to decrement the pointer position while the byte could have
more bits than one octet).

Other optimizations that could be done here:
- make the byte stack an adjustable vector to avoid consing on each unread
2025-08-11 10:01:40 +02:00
Daniel Kochmański
10a1e8dddf streams: invert .last_char processing and UNREAD-CHAR
Previously we've stored in this field the last read char, while now we store
there the last unread char. This way we can't tell whether the last read char
was the same as the unread one, but on the other hand this way requires less
bookkeeping and the code shape is similar to UNREAD-BYTE.
2025-08-11 10:01:40 +02:00
Daniel Kochmański
a726cdf879 streams: get rid of last_code slot in the structure
It was used to store bytes for unread, but we are going to change how unread
works, and we still can simply test for newline and encode behavior directly in
unread-char for newlines.
2025-08-11 10:01:40 +02:00
Daniel Kochmański
c8f41912a0 streams: echo-stream treats last_byte as a flag
Instead of remembering the last unread object and its type, it simply jots down
the fact that something has been unread (and clears on read), and delegates the
question to the input stream.
2025-08-11 10:01:40 +02:00
Daniel Kochmański
31a3fc904e streams: move ecl_generic_unread_byte to ecl_binary_unread_byte 2025-08-11 10:01:40 +02:00
Daniel Kochmański
ca845457f8 streams: switch to the new binary reader/writer implementation
We drop varying generic-read/write variants in favor of using binary encoders
introduced in earlier commits.

This will allow for unified handling of unread bytes and characters and
transcoding both in bivalent streams.
2025-08-11 10:01:40 +02:00
Daniel Kochmański
1136d55122 streams: introduce a notion of a byte buffer in a stream
The byte buffer is used for encoding and decoding both characters and bytes.
Previously we've used a stack-allocated array, but this doesn't cut it when it
comes to binary streams, where the byte may be a "finite recognizable subtype of
integer" (cf. the specification of OPEN), because then the array may have more
elements.
2025-08-11 10:01:40 +02:00
Daniel Kochmański
ba422ec9dd streams: add binary encoders and decoders to the mix
This will allow us to transcode characters to bytes and vice versa. This is
necessary to implement conductive UNREAD-CHAR and UNREAD-BYTE, but will allow us
to also add low-level parsers for binary objects in the future.
2025-08-11 10:01:40 +02:00
Daniel Kochmański
5fe96b8339 tests: add sequence stream tests for new functionality
- ensure that all byte element types are handled by binary sequence streams
- ensure that the vector fill pointer is followed by the input sequence
2025-08-11 10:01:40 +02:00
Daniel Kochmański
c7f534771a streams: sequence input stream follows the vector length
This is to allow working with sequence streams where the vector may change after
the stream has been created.

When the user specifies :END to be some fixed value, then we keep that
promise, but when :END is NIL, then we always consult the vector fillp.
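
The rule can be sketched as a limit that is recomputed on every read. This is a simplified model, not the ECL implementation: `struct vec`, `struct seq_in` and the functions below are hypothetical, with `fixed_end` standing in for a non-NIL :END argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical vector with a fill pointer. */
struct vec {
    unsigned char *data;
    size_t fillp;
};

/* Hypothetical sequence input stream over such a vector. */
struct seq_in {
    const struct vec *v;
    size_t pos;
    size_t end;     /* meaningful only when fixed_end is true */
    bool fixed_end; /* true when the user supplied a fixed :END */
};

/* A fixed :END is honoured; otherwise the current fill pointer
   of the vector is consulted each time. */
static size_t seq_in_limit(const struct seq_in *s)
{
    return s->fixed_end ? s->end : s->v->fillp;
}

/* Returns the next byte, or -1 at end of input. */
static int seq_in_read(struct seq_in *s)
{
    if (s->pos >= seq_in_limit(s))
        return -1;
    return s->v->data[s->pos++];
}
```

In this model a stream created without a fixed end sees elements that become active when the fill pointer is advanced later, while a stream with a fixed end does not.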
2025-08-11 10:01:40 +02:00
Daniel Kochmański
086f0a4bef streams: allow for sequence streams to handle all byte arrays
Previously when we couldn't convert the vector element type to a character,
creating sequence streams failed even when we were expecting the binary stream.
From now on it is possible to use vectors with upgraded types being any integer.
2025-08-11 10:01:40 +02:00
Daniel Kochmański
ea11e2c433 streams: rename common sequence stream accessors
SEQ_{INPUT,OUTPUT}* -> SEQ_STREAM*

Don't use IO_STREAM_ELT_TYPE in sequences and define SEQ_STREAM_ELT_TYPE
instead to avoid ambiguity.

This is a cleanup that signifies similarities between both objects.
2025-08-11 10:01:40 +02:00
Daniel Kochmański
37c3955180 manual: add documentation for new binary stream interfaces 2025-08-11 10:01:38 +02:00
Daniel Kochmański
7ee0977a50 tests: add test for binary and bivalent streams with extensions
Tests reading, peeking and unreading both characters and bytes.
2025-08-11 10:01:37 +02:00
Daniel Kochmański
a8e57c60a5 streams: implement new interfaces for unreading and peeking bytes
ecl_file_ops has two new members:

  void (*unread_byte)(cl_object strm, cl_object byte);
  cl_object (*peek_byte)(cl_object strm);

C API additions:

  void ecl_unread_byte (cl_object byte, cl_object strm)
  cl_object ecl_peek_byte (cl_object strm)

  si_unread_byte(cl_object strm, cl_object byte)    [1]
  si_peek_byte(cl_object strm, cl_object byte)      [2]

Lisp API additions:

  (ext:unread-byte stream byte) :: integer          [1]
  (ext:peek-byte   stream byte) :: (or integer nil) [2]

  (gray:stream-unread-byte stream byte) :: null
  (gray:stream-peek-byte stream) :: (or integer :eof)

We implement a "generic" version of unread-byte by storing it in a new slot
last_byte.
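
A one-slot pushback of this kind can be sketched as follows. This is an assumption-laden illustration rather than the code from the commit: `struct byte_stream`, `NO_BYTE` and the `stream_*` functions are hypothetical and do not use the `cl_object` types or the dispatch table listed above. Only the idea of a `last_byte` slot is taken from the message.

```c
#include <assert.h>
#include <stddef.h>

#define NO_BYTE (-1) /* marks an empty slot and end of input */

/* Hypothetical octet stream with a single pushback slot. */
struct byte_stream {
    const unsigned char *data;
    size_t pos;
    size_t len;
    int last_byte; /* NO_BYTE, or the byte that was unread */
};

/* Read the next byte, preferring a previously unread one. */
static int stream_read_byte(struct byte_stream *s)
{
    if (s->last_byte != NO_BYTE) {
        int b = s->last_byte;
        s->last_byte = NO_BYTE;
        return b;
    }
    if (s->pos >= s->len)
        return NO_BYTE;
    return s->data[s->pos++];
}

/* Store one byte to be returned by the next read. */
static void stream_unread_byte(struct byte_stream *s, int byte)
{
    assert(s->last_byte == NO_BYTE); /* only one byte of pushback */
    s->last_byte = byte;
}

/* Peek is expressed as a read followed by an unread. */
static int stream_peek_byte(struct byte_stream *s)
{
    int b = stream_read_byte(s);
    if (b != NO_BYTE)
        stream_unread_byte(s, b);
    return b;
}
```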
2025-08-11 10:01:37 +02:00
Daniel Kochmański
a887d040a2 tests: add a new test suite "stream"
Currently it contains only a check for a recently fixed bug in:
streams: fix a braino in str_in_unread_char
2025-08-11 10:01:37 +02:00
Daniel Kochmański
a916a5ccff tests: make finishes return the values from the executed form 2025-08-11 10:01:37 +02:00
Daniel Kochmański
e98e36dfca streams: fix a braino in str_in_unread_char
We've tested the wrong variable, so this function allowed us to plunge into
negative indexes.
2025-08-11 10:01:37 +02:00
Daniel Kochmański
407fe456fe streams: make ecl_read_byte return OBJNULL on EOF
This is to allow for sequence streams to return arbitrary objects (when
appropriately constructed) without many changes.
2025-08-11 10:01:37 +02:00
Daniel Kochmański
431132e4d1 streams: ecl_file_ops cleanup and some minor fixes
1. ecl_peek_char had an outdated comment presumably from before we introduced
   stream dispatch tables - that comment has been removed.

2. fix erroneous specializations

   - of STREAM-UNREAD-CHAR

   By mistake we had two methods specialized to ANSI-STREAM, while one was
   clearly meant to specialize to T (in order to call BUG-OR-ERROR).

   - of winsock winsock_stream_output_ops

     stream peek char was set to ecl_generic_peek_char instead of
     ecl_not_input_read_char

3. change struct ecl_file_ops definition

a) the ecl_file_ops structure changes the order of some operations to always feature READ
   before WRITE (for consistency)

b) we are more precise in dispatch function declaration and specify the return
   type to be ecl_character where applicable
2025-08-11 10:01:37 +02:00
Daniel Kochmański
fad7073f10 ffi: convert-to-foreign-string: ensure a cstring
The function operates on base_string while if it was supplied with an extended
string then ecl_base_char array became ecl_character, and that led to bad
copies. To fix it we ensure that the passed string is first coerced to cstring.
2025-08-11 09:16:00 +02:00
Daniel Kochmański
77ceb401f9 doc: man: stack sizes are specified in bytes not kilobytes
Fixes #788.
2025-08-10 22:29:16 +02:00
Daniel Kochmański
bebb43d558 INSTALL: update the tested emsdk version to 4.0.12
That's the current latest and it works OK when we compile host and wasm target
according to the most up-to-date instructions.
2025-08-04 10:06:31 +02:00
Marius Gerbershagen
a7126313d9 Merge branch 'refactor-streams' into 'develop'
Split file.d into various stream implementations

See merge request embeddable-common-lisp/ecl!351
2025-07-29 14:35:55 +00:00
Daniel Kochmański
3f2ad992e6 Merge branch 'openbsd-file_cnt' into 'develop'
openbsd: implement FILE_CNT() on opaque FILE

See merge request embeddable-common-lisp/ecl!354
2025-07-29 11:23:21 +00:00
Daniel Kochmański
68a78e252f msvc: update makefile 2025-07-26 16:59:42 +02:00