Implementing Ha3sm IO Channels
Rick Hohensee
Ha3sm will model the world outside the CPU with less uniformity than unix
does, or attempts to. In Ha3sm everything is not a file. Everything is a
device, and devices come in two major classes, block
devices and stream devices. Devices may be virtualized by
processes. This is how filesystems may be implemented, but that is not
extensively discussed here. A process will speak to these devices via
channels. With two classes of device, and with the two directions of
transfer treated as separate entities, this gives four basic types of
channel: receive, transmit, read and write. So
actually, everything is a channel, since there needn't be any code common
to both sides of a device. How many channels a "device" might have can
vary.
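As a minimal sketch of that two-by-two classification, in Python rather than anything Ha3sm would actually run: the enum and function names below are assumptions made up for illustration, and only the four channel names come from the text.

```python
from enum import Enum

class DeviceClass(Enum):
    STREAM = "stream"
    BLOCK = "block"

class Direction(Enum):
    TO_PROCESS = "in"      # data flows from the device to the process
    FROM_PROCESS = "out"   # data flows from the process to the device

# The four channel types named in the text, keyed by (class, direction).
CHANNEL_KIND = {
    (DeviceClass.STREAM, Direction.TO_PROCESS): "receive",
    (DeviceClass.STREAM, Direction.FROM_PROCESS): "transmit",
    (DeviceClass.BLOCK, Direction.TO_PROCESS): "read",
    (DeviceClass.BLOCK, Direction.FROM_PROCESS): "write",
}

def channel_kind(device_class, direction):
    """Return the channel type for a device class and a direction."""
    return CHANNEL_KIND[(device_class, direction)]
```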
Streams are things with a data-rate over time, and streams may be
temporally continuous. Blocks are things with a finite size, and a
location on a device holding an expanse of blocks. This is similar to unix
character devices and block devices, but Ha3sm itself makes no attempt to
make them both look like files as first-order objects.
I didn't set out to design a microkernel, but Ha3sm is one. Access
controls will be at the channel level. This will be on a need-to-know
basis. A process that can't use a channel just finds out that as far as
it's concerned there is no such channel. This also means that the concept
of a file, which has never been a very clear concept, does not exist at
the kernel level.
Implementing stream channels will make heavy use of osimplay's standard
ring buffer data structures. A channel will involve one side in a
process and one side in the kernel. The kernel can use process
memory-space transparently. Virtual devices will probably need an
in-kernel arbiter of some kind.
Syscalls will be created thusly: one for initializations and maintenance
of channels in general, one for block reads, and one for block writes.
On-going stream transfers will not require syscalls by the processes
involved. The initialization syscall will be the general Ha3sm
"please" in-kernel interpreter. "please" may be slow, which
is why channels will pre-establish permissions for read and write
requests.
This is lower-level than unix, and in fact is relatively a micro-kernel,
since a filesystem is above the level considered by the Ha3sm kernel.
Access controls will be at the device level, not the file level. This
allows true multi-user, with some loss of flexibility. Also, processes
without access to a device will not be cognizant of its existence. Ha3sm
kernel block services (syscalls) will resemble Forth BLOCK-related
"words". In general, Ha3sm is unix-meets-Forth, with some AmigaDos and
other flavorings. And yes, "please" is from INTERCAL.
There are dozens of free interrupt vectors in a PC. Linux syscalls are
all on one intvec. There's no reason to limit a Ha3sm to one kind of
"please". 0x80 could be H3sm, 0x90 could be Lisp, 0xa0 could be ml, and
they would all use the same few pre-authorized read, write and so on.
In the following, read-write distinctions are usually viewed from the
process. The basic implementation of channels is explained here
independent of the permissions or class of process. Be it an
0wner or guest process, the workings of a particular
type of channel are basically the same. The descriptions tend to be for
guest processes, which must ask "please" for system resources,
although 0wner processes probably will also for implementation simplicity.
This page is/was also my design worksheet for channels. I'm tweaking as I
go. It was in the course of writing this that I realized that on-going
stream transfers don't need repeated syscalls. It also seems at this point
like 0wner processes will go through "please" syscalls in most
cases, just like guest processes. The implementation descriptions assume
that Ha3sm set up a channel list (device list) at init time, and channel
list entries know their affiliated initialization jump addresses. A
"please" is a syscall, and can jump right into reflex code with no
IRET stack stunts.
streams
Streams will use osimplay standard ring buffers. This is a data
structure with standard offsets to various metadata including a size, a
data throughput count, a constant size-derived mask for truncating to
bound accesses, and an index to the last byte inserted. Both sides of a
channel can then tell if they have been lapped and so on. Streams may fan
out from one source to several sinks, but do not blend into a single sink
from several sources.
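A minimal Python sketch of such a ring follows. It is not osimplay's actual layout. The field names, the power-of-two size requirement, and the `lapped` and `take` helpers are assumptions chosen to show how a throughput count plus a size-derived mask lets either side detect being lapped.

```python
class Ring:
    """Ring buffer carrying the metadata the text lists (layout assumed)."""

    def __init__(self, size=1024):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of two so the mask works")
        self.size = size
        self.mask = size - 1      # size-derived mask that bounds every index
        self.count = 0            # data throughput count: bytes ever inserted
        self.last = self.mask     # index of the last byte inserted
        self.data = bytearray(size)

    def ring(self, byte):
        """Insert one byte, overwriting the oldest data once the ring is full."""
        self.last = (self.last + 1) & self.mask
        self.data[self.last] = byte
        self.count += 1

    def lapped(self, consumed):
        """True if a reader that has consumed `consumed` bytes has lost data."""
        return self.count - consumed > self.size

    def take(self, consumed):
        """Return the next byte for a reader at `consumed`, or None if caught up."""
        if consumed >= self.count:
            return None
        return self.data[consumed & self.mask]
```

Each reader keeps its own consumed count, so the ring itself needs no per-reader state. That is what lets one source fan out to several sinks.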
send, transmitter
conceptually
A stream from a process to the system is a transmit channel. A
transmitter goes through an initialization involving a trap, and then its
continuous operation is handled by the peripheral's data-received trap
handler, its data-received reflex. I believe this is analogous
to a unix "bottom-half" of a trap handler. The transmitting process may
then simply ring data into its transmit buffer for that channel,
and the system will handle it asynchronously. The transmit channel
initialization is the only time the process must explicitly trap to the
system.
implementation
- A process wishes to transmit a stream to a device
- It asks "please"
- the device exists, i.e. is visible to the process, and nobody else is
transmitting to the device. Otherwise, game over: set up a
failure message and return-from-interrupt.
- process sets up a ring buffer for the channel. Device may specify
ring buffer size, but 1k is probably good. Transmitting requires one
ring in the process's memory-space.
- channel init code hooks process's ring location into device's
receive reflex.
- Assuming the device reflex runs on a data-received-ACK,
it looks like a reflex will be some init code that "please"
can jump into, and then fall through into the actual reflex code,
and then return-from-interrupt. This is why 0wner will probably use
"please". There also needs to be a place for a try-again (see below)
to jump into the reflex.
- The transmit reflex will now run asynchronously. The process
may or may not ring more transmit data at its leisure.
- The reflex checks its received pointer in the ring versus the
process's send pointer in the ring.
- If the transmit reflex occurs and the ring buffer is empty of
untransmitted data, the transmit reflex must implement some
periodic "try again". Try-agains can be a list on a clocked
reflex. If a try-again succeeds, the data-received-ACK will
again be active. I think. Unix uses the process-switch trap
to stack extra stuff that has to be on a trap, such as signals.
A try-again is not a signal; it is a periodic reminder to the
kernel.
Where to hook the try-agains may vary on a PC, depending on how many
useful trapping clocks there are.
- The reflex checks if it has been lapped by the process. That is,
it checks if the process has ringed more new data than the size of
the ring since the reflex received the latest data. If so, a
system log entry of some kind is in order, at least.
- The same stack technique can be used for the seed byte of the stream
and for retries when the receiver is ahead of the sender. An
osimplay word such as fakeint may arise for the purpose.
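The interplay of the transmit reflex, the try-again and the lap check can be modelled roughly as below. This is a Python simulation under stated assumptions, not the reflex itself: the class, its fields, the `wire` list standing in for the device, and the choice to skip ahead to the oldest surviving byte after a lap are all made up for illustration.

```python
class TransmitChannel:
    """Toy model of one transmit channel: a process ring plus a kernel reflex."""

    def __init__(self, size=1024):
        self.size = size
        self.mask = size - 1          # size assumed to be a power of two
        self.buf = bytearray(size)
        self.sent = 0                 # bytes the process has ringed in
        self.received = 0             # bytes the reflex has passed to the device
        self.wire = []                # stand-in for the device
        self.retry_pending = False    # entry on the clocked try-again list
        self.lapped_events = 0        # stand-in for system log entries

    def ring(self, data):
        """Process side: no syscall, just write into the ring."""
        for b in data:
            self.buf[self.sent & self.mask] = b
            self.sent += 1

    def reflex(self):
        """Kernel side: entered on the device's ACK or from a try-again."""
        if self.sent - self.received > self.size:
            self.lapped_events += 1                  # process overwrote unsent data
            self.received = self.sent - self.size    # resume at oldest surviving byte
        if self.received == self.sent:
            self.retry_pending = True                # nothing to send, ask the clock
            return False
        self.wire.append(self.buf[self.received & self.mask])
        self.received += 1
        self.retry_pending = False
        return True                                  # the device's ACK re-enters us

    def clock_tick(self):
        """Clocked reflex: re-enter the transmit reflex if a try-again is queued."""
        if self.retry_pending:
            self.reflex()
```

In this model the return value of `reflex` plays the part of the hardware ACK: `True` means a byte went out and the device will call back, `False` means the chain has stopped and only a try-again can restart it.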
receive, receiver
conceptually
Ha3sm receivers can be like AmigaDos listeners in that they can
multiplex. Several processes may wish to receive mouse data, for example.
Several processes may all see asynchronous updates to receive rings from
the same device. Initializing a channel does not require exclusive use.
Once up, receivers run entirely in the flow of control of the system, not
the receiving process. The lack of symmetry between the multiplexability of
transmitters and receivers is displeasing, but the kernel is more likely
to have info of general interest.
implementation
- A process wishes to receive a stream from a device
- the device exists, i.e. is visible to the process. Otherwise,
game over. Process does not need exclusive access to the channel.
- process sets up a ring buffer for the purpose. Device may specify
ring buffer size, but 1k is probably good. Receiving requires one
ring in the process's memory-space.
- process calls system via a trap to hook ring location into device's
receive reflex. Receive hooks, i.e. ring addresses, can be
a list.
- It may be necessary to put the device reflex to sleep while the hooking
is in progress, given that other processes could be harmed by
bad data.
- The receive reflex will now run asynchronously. The process
may use or not use received data at its leisure. The reflex
doesn't care if a particular process gets lapped. The overhead for each
receiver is small, particularly for IO.
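The receive steps above can be sketched as a device holding a list of hooked rings. This is a Python model only: `Device`, `ReceiveRing`, `hook` and `drain` are hypothetical names, and the policy of dropping the oldest data when a process has been lapped is an assumption.

```python
class ReceiveRing:
    """One process's receive ring; the reflex writes, the process drains."""

    def __init__(self, size=1024):
        self.size = size
        self.mask = size - 1      # size assumed to be a power of two
        self.buf = bytearray(size)
        self.count = 0            # bytes the reflex has inserted
        self.taken = 0            # bytes the owning process has consumed

    def drain(self):
        """Process side: take whatever is new, skipping anything lost to a lap."""
        if self.count - self.taken > self.size:
            self.taken = self.count - self.size
        out = bytes(self.buf[i & self.mask] for i in range(self.taken, self.count))
        self.taken = self.count
        return out


class Device:
    """A stream source whose reflex updates every hooked receive ring."""

    def __init__(self):
        self.hooks = []           # the list of receive hooks (ring addresses)

    def hook(self, ring):
        self.hooks.append(ring)

    def reflex(self, byte):
        # The reflex does not check whether any particular process has been lapped.
        for ring in self.hooks:
            ring.buf[ring.count & ring.mask] = byte
            ring.count += 1
```

For example, two processes hooked to the same mouse device with rings of different sizes each see the same bytes, and only the one with the smaller ring loses data if it falls behind.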
blocks
Blocks are device blocks. They can be sized by the device, or assumed to
be 1k. Blocks block. That is, block IO is a one-shot, and the process
should wait till it's complete. That is, the burden of waiting should be
on the process doing the IO. This is the sort of thing where Ha3sm will
not be the giant data slush-fund that unix is. Processes will bear their
own costs as much as possible. Block IO should run at a moderate to low
interrupt priority, so that the system doesn't feel the graininess of
large transfer delays. A 1k buffer should be the default without specific
info from the device. Locking and exclusivity at greater ply than
per-block and smaller ply than per-device is left outside the kernel.
Per-block locking is innate to the required trapping. When two processes
read the same block and then write different modifications of it, the
latter write will supersede the former, with no cognizance of the former's
modifications.
Block channels are more political than physical. The concept of an
established communications channel does not help transfer a block other
than to save or simplify permissions checks on each block. This is the
value of per-device permissions, rather than per-file or per-block.
write
conceptually
Write a block from process memory out to a block device. The block will
have an address of some kind on the device. The process will call the
system to perform the write, and wait for the call to fail or return.
Control of access by the kernel is per-device.
implementation
- A process wishes to send blocks of data to a device
- process makes a "please" syscall
request to establish a write channel.
- the device exists, i.e. is visible to the process.
Otherwise, game over; "please" fails.
- process sets up a block buffer for the purpose. Device may specify
buffer size, but 1k is probably good. Writing requires one
buffer in the process's memory-space.
- system provides process with a channel ID for subsequent writes
to the device in question.
- "please" syscall returns
- when process is ready, it makes a write syscall, with the channel ID
and the block location in process memory.
- process blocks, i.e. does not run, until write completes, or fails.
Writes are queued by the system, and go in order, so there may be
some snoozing.
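A rough Python model of the write sequence above: permission is checked once when the channel is established, and each later write only has to present a channel ID. Everything named here (`Kernel`, `please_write_channel`, `service`, the visibility table keyed by process and device name) is hypothetical, and the sleeping of the calling process is not modelled.

```python
from collections import deque

BLOCK_SIZE = 1024  # the default when the device does not specify a size


class BlockDevice:
    def __init__(self, nblocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(nblocks)]


class Kernel:
    def __init__(self):
        self.visible = {}       # (pid, device name) -> BlockDevice
        self.channels = {}      # channel ID -> (pid, BlockDevice)
        self.queue = deque()    # pending writes, served in arrival order
        self.next_id = 1

    def please_write_channel(self, pid, name):
        """Channel init: the only place device visibility is checked."""
        dev = self.visible.get((pid, name))
        if dev is None:
            return None         # as far as this process knows, no such channel
        chan = self.next_id
        self.next_id += 1
        self.channels[chan] = (pid, dev)
        return chan

    def write(self, pid, chan, block_no, buf):
        """Write syscall: queue one block; a real process would now sleep."""
        entry = self.channels.get(chan)
        if entry is None or entry[0] != pid:
            return False
        dev = entry[1]
        if len(buf) != BLOCK_SIZE or not 0 <= block_no < len(dev.blocks):
            return False
        self.queue.append((dev, block_no, bytes(buf)))
        return True

    def service(self):
        """Perform queued writes in order; a later write replaces an earlier one."""
        while self.queue:
            dev, block_no, buf = self.queue.popleft()
            dev.blocks[block_no] = buf
```

Because the queue is strictly ordered and nothing is merged, two writes to the same block leave only the second on the device, which matches the per-block behavior described earlier.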
read
conceptually
Read a block off of a block device into process memory. The block will
have an address of some kind on the device. The process will call the
system to perform the read, and wait for the call to fail or return.
Control of access by the kernel is per-device.
implementation
Basically the same as a write.
Examples
Some devices may benefit from a buffer in the kernel, the keyboard for
example. This will be the object of various receivers, and should be kept
running at all times, and so should have a default buffer. Guest processes
don't need to see kernel memory though.
The program logic of Ha3sm channels
parts
Ha3sm will have streams and blocks, roughly analogous to unix character
and block devices. Streams will have send and receive channels, and blocks
will have read and write channels. Which channels a particular real or
virtual device has may vary widely from one to numerous. Ha3sm user access
controls will be by channel. Access to a channel will be moderated via the
"please" system request. All channel initializations will involve a
please. A timeout mechanism will be available as an internal system
service running on a periodic event or events, possibly the PC calendar
clock interrupts (surprises). The timeout mechanism will be available as a
system word (subroutine). The lowest level block channel code will be two
distinct system requests. The lowest level stream code will be two
distinct reflexes (asynchronous interrupt handlers). Channels will also
have channel-specific init code. As system words, we'll have please
retries send receive read write askread askwrite, with askread
and askwrite internal only.
This pseudocode is aware of IRET issues. Streams are roughly
init-and-forget. Blocks have separate init, read and write syscalls,
although the init is the usual "please". A "surprise" is an asynchronous
interrupt. A "request" is a synchronous trap, a syscall.
(timeout monitor code)
Timeouts get set at init time. The timeout monitor for streams checks the
channel's buffer for changes in the data-sent pointer. Channel init code
will provide the ring address. Block channel timeouts are started at the
xfer request time, which happens for every block, but only once per block
xfered. The timeouts will probably be on PC wallclock INTs, which give
intuitive timing intervals.
A timeout expiry must inform the process. That's a signal. Yick. A block
timeout can return to the process instead of the transfer. A stream
timeout can bogusize the ring metadata. Either one has to do something
about the pending surprise.
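A sketch of the stream side of the timeout monitor, in Python. It assumes the monitor is handed a way to read the ring's data-sent pointer and is called once per clock interrupt. The class name, the tick limit and the `expired` flag are illustrative assumptions, and what happens after expiry is left out.

```python
class StreamTimeout:
    """Watch a ring's data-sent pointer and expire after too many idle ticks."""

    def __init__(self, read_pointer, limit_ticks):
        self.read_pointer = read_pointer   # callable returning the data-sent pointer
        self.limit = limit_ticks
        self.last_seen = read_pointer()
        self.idle = 0
        self.expired = False

    def tick(self):
        """Call once per clock interrupt; returns True once the timeout has expired."""
        now = self.read_pointer()
        if now != self.last_seen:
            self.last_seen = now           # progress was made, restart the count
            self.idle = 0
        else:
            self.idle += 1
            if self.idle >= self.limit:
                self.expired = True
        return self.expired
```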
channel-related "please"
This is the "channel" word of a "please". "please" may eventually be a
full H3sm interpreter. "channel" is thus a word that ends with a jump or
an IRET. This is the portion common to opening channels of any type.
"please" may also eventually have distinct text interpreter and compiled
forms.
A process does please channel $TYPE $DEVICE ($RETRIES) request
The "please channel" is a Forth-like dictionary lookup
Then a list traversal of channels known to the process
IF not find channel
    please syscall IRETs with failure code
ELSE found channel
    obtain channel init hook address
    JUMP to channel INIT hook, IRET from channel code
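The lookup half of "please channel" might look like the following Python sketch. The per-process channel list as a list of dictionaries, the `FAIL` code and the callable init hook are assumptions for illustration. The call to the hook stands in for the JUMP, and the return stands in for the IRET.

```python
FAIL = -1   # hypothetical failure code


def please_channel(known, chan_type, device, retries=None):
    """Walk the channels known to the calling process and run the init hook."""
    for entry in known:
        if (entry["type"], entry["device"]) == (chan_type, device):
            return entry["init"](retries)   # stands in for JUMP to the init hook
    return FAIL                             # stands in for IRET with failure code
```

Since the traversal only covers channels known to the process, a channel the process may not use and a channel that does not exist give the same result.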
channel type
stream
Stream reflexes run continuously via hardware interrupts. Enabling the
pertinent interrupt can occur at system init time or stream init time.
SEND
The example send stream is based on a byte-received ACK such as is typical
of a Centronics device. Ongoing streaming and timeouting is asynchronous
to the process end of the channel.
init
IF channel in use
    channel init fails and IRETs
ELSE channel available
    channel type is presumed to match request via naming
    init code inserts process information in trap code
    IF $timeout non-null
        set up timeout
    FI
    JUMP (possibly fall through) to surprise code
FI
FI
reflex (callee-saves?)
(entered due to ACK on byte-received or on jump from retry)
IF byte pending
    send byte
    IRET
ELSE
    IRET
FI
RECEIVE
Receive streams multiplex. Many processes or threads can receive the same
data from a particular system transmitter. The device data-ready surprise
invokes a list of receivers to update. I don't foresee any getting
pathologically long or slow. Transmitters can have a list-length limit.
The surprise reflex code will probably be pretty snappy.
init
channel type is presumed to match request via naming
init code inserts process information in list in surprise code
(simple array)
IF $timeout non-null
    set up timeouter
FI
JUMP (possibly fall through) to surprise code
FI
reflex (callee-saves?)
(entered due to ACK on byte-received or on jump from retry)
IF byte pending
    FOR receiver list
        send byte
    NEXT
    IRET
ELSE
    IRET
FI
block
init
IF channel in use
    channel init fails and IRETs
ELSE channel available
    channel type is presumed to match request via naming
    Assign channel ID for request
    init code IRETs to process returning channel ID
FI
FI
READ
process does READ $CHANID ($timeout) request
IF timeout not null
    set up timeouter
FI
block read to device interface
IRET from ACK reflex with success code
WRITE
process does WRITE $CHANID ($timeout) request
IF timeout not null
    set up timeouter
FI
block write to device interface
IRET from ACK reflex with success code