Ha3sm
sketch of a new operating system

Hohensee's autonomous 3-stack machine

Rick Hohensee
Dec 23 2000 <-> Jan 3 2001
Maryland, USA

......................................
Overview

intro

This is a design exercise. It is for my own mental organization as the code for Ha3sm starts to coalesce, so it's sloppy and pedestrian, but the reader may see a variant of this she wishes to pursue, such as a 2-stack version. Comments on things that clearly won't work will be appreciated, if they are in terms of the parts of this design that apparently will work. H3sm, for example, my unix-hosted 3-stack Forth-like language, works. H3sm is without the "a" for autonomous. It's not an OS. Additional detail is also invited if it is based on and isn't entirely contradictory to what is established here.

I don't know if any of Ha3sm is new. The "everything is a device" thing may be, and the device IO calls look a bit fresh. Ha3sm preserves the distinction between (in unix terms) block and char devices at the syscall level. Reading the two types of device is two distinct syscalls. Writing is two other calls. By the way, some things in H3sm are separately novel, and the H3sm keyword abbreviations facility is even useful, easily adapted to regular Forth, and fairly orthogonal to the ANSI Forth standard.

LAAETTR stands for Left As An Exercise To The Reader.

Syscalls

We start from the syscall interface, which I find has value for defining an OS. Firstly, yes, Ha3sm does have a syscall interface. There are processes, and a kernel, and processes can request services from the kernel via syscalls. Syscalls will be implemented with traps, also known as exceptions. Syscalls are what I call intentional traps. A trap is like a subroutine call, but a bit more of the machine's state is changed than just the program counter and the return stack. In the case of Ha3sm syscalls the state change is fairly lightweight.
There is a change to the kernel memory space, but there is no wholesale pushing of registers. This means traps provide a strictly controllable means for guest processes to invoke appropriate code in the kernel. It appears at this point that Ha3sm will require only 5 distinct syscalls, although one, "please", is a catch-all for whatever else may arise, by virtue of supplying an interface to an in-kernel H3sm interpreter.

Syscalls are "caller-saves". The caller doesn't usually have to save anything though. Usually there is parameter passing between the process and the syscall in the usual stack-based manner of H3sm. H3sm thus defines the calling convention of Ha3sm. A Ha3sm process on a one-stack register machine will have three stack pointers, a Size register, and a couple of scratch registers.

Ha3sm can have a separate protected memory address space for each process. There are two types of process: owner and guest. Processes must login, and guest processes are in individual protected memory spaces. The kernel and owner processes are not so sequestered. The main purpose of a syscall is to give control of the machine to the kernel temporarily at the bidding of a process. The kernel can see all memory, and knows what process called it on this particular syscall, so the syscall handler code in the kernel can use guest process memory areas and/or kernel areas.

From this point forward I may start using the term call to mean a syscall. Calls in H3sm and Forth are called words.

So far we are talking about a multi-tasking kernel, owner processes and memory-sequestered processes. Ha3sm won't do memory virtualization to persistent store, AKA paged or segmented virtual memory. There is no swapping of memory to/from disk as a general kernel service. Ha3sm also provides no mechanism for a process to change its initial memory allocation. The most a process will be allocated is half the physical RAM remaining unallocated.

Processes are like virtual computers.
Each one has an affiliated user, an address space, and normally, a core H3sm dictionary, including call words, which provide the virtual machine with I/O. H3sm (no a) is a 3-stack virtual machine, programming language, and interpreter. Each Ha3sm process then is like a shell, and also somewhat analogous to libc or another dynamic linking library. This is typical of a Forth-like language. The H3sm programming language defines a 3-stack (virtual) CPU. This is the CPU Ha3sm is designed around. Ha3sm will allow non-H3sm-based processes, but H3sm-based is the norm.

When a Ha3sm process makes a call, the kernel handler for that call is invoked. The kernel gets the CPU, as far as program flow is concerned, but because it's a lightweight task-switch, the CPU still has the register state the process left it with for most registers. This is how things are communicated between the process and the call. This means that the kernel observes the register usage of H3sm. In platform-independent terms, calls model the 3 stacks of H3sm. On H3sm-style hardware this would not be a matter of virtualization. In other words, Ha3sm uses H3sm calling conventions. This is relatively flexible as pertains to multiple return values and so on. One-stack and two-stack *a3sm's are LAAETTR.

Devices

Calls are synchronous. The running program which currently is in control of the CPU causes them, perfectly in step with what it is otherwise doing. Calls then are just subroutine calls into the kernel. Asynchronous interrupts are another matter entirely. Hardware devices like keyboards, discs and so on may need to interrupt the processor at any time. Such interrupts are called asynchronous interrupts. The system has to deal with them immediately, which means it has to drop (and save) what it's doing and handle the interrupt. Hardware interrupts are signaled directly to the CPU. The CPU then invokes a handler for the particular type of interrupt detected. This is a crucial facility of an operating system.
The handling of various interrupts is usually considered in terms of discrete device drivers. A complete driver includes the asynchronous handlers to handle the activities of the device itself, and the interface to the calls that let the rest of the system use what the device does.

Asynchronous interrupts involve a heavier state change than calls. The current registers are all saved, and a full kernel state is entered, since the state of the process that was running at the time the interrupt occurred is irrelevant to the handler, and must be preserved for when the device handler is finished.

Ha3sm is based on a simple division of hardware devices into two types. There are block devices and stream devices. Each type may have a send and/or a recieve channel, and the entire driver namespace has a please or command channel. A driver may therefore be one, two or three channels. Calls will exist for talking to the various channel types. The asynchronous interrupt handlers, the "hardware interrupts", run entirely within the kernel, creating the channels' data for the calls to talk to.

Scheduler

Stock stuff basically. For a system to have multiple processes each getting some portion of the CPU time, there has to be something causing them to take turns. This is called the scheduler. Ha3sm has a system timer and affiliated interrupt and handler that will assure that no process keeps the CPU for more than some measured timeslice at a time. The scheduler then invokes the next process. Methods for determining whose turn it is with the CPU vary, but the main thing is that Ha3sm will preempt the current process periodically. This is called preemptive multi-tasking. The kernel routine that handles the scheduler interrupt is to be called "preempt".

Realtime

Hardware interrupts are typically prioritized. In a system with several hardware devices operating on their own individual timescales, there will be timing conflicts to resolve.
Various things will want attention at the same time, and there's only one CPU. A particular machine will have one interrupt signal that is not interruptible by others. In Ha3sm this interrupt is reserved for special tasks that must always be serviced as soon as the system is capable of doing so. This is called hard realtime. It is feasible to guarantee a short response time for one thing at a time, and that one thing may be periodic or otherwise recurring. Ha3sm should be implemented such that resources for one periodic hard realtime task are reserved, and so that that interrupt can be serviced as quickly as possible. Let's call this feature and its implementation the "reflex". The reflex is, to the rest of the Ha3sm scheduler, asynchronous. It is therefore necessarily callee-saves. The reflex would be a beneficial place to minimize what is saved and restored on occurrence of a reflex event.

Users

When Ha3sm starts, it starts a H3sm running in the kernel memory space. This is an owner process, and is distinct from other processes mentioned so far. It is also distinct from the H3sm interpreter in the kernel itself that handles "please" calls.

Processes in the system are invoked via the login message. The login call checks for a user name and password, and only starts the new process if the user info given is valid. If it is, the resulting process can access only the devices that user is registered for. Access permissions as enforced by the Ha3sm kernel itself are unix-like, but per device channel.

Owner processes are allocated a process-specific H3sm-space, but can see all of RAM if it interests them. This is very much unlike the unix superuser being just another process in terms of memory sequestering. In Ha3sm, if it's your box, it's all your box. Guest processes are like little sequestered AmigaDos's. Owner processes are un-sequestered AmigaDos-as-coroutines I guess.

Services

A process may offer some facility to the rest of the system by registering a service it provides as if it was a device. This is called a process service. Upon successful registration of a service, the service may be invoked by the call interface. A process may register several services. This allows a process to implement things like filesystems, network interfaces and so on. Ha3sm is therefore what is called a microkernel, where "everything is a device". The "everything is a device" aspect provides a form of interprocess communication.

Re-overview

Ha3sm, Hohensee's autonomous 3-stack machine, the hypothetical OS, is based on the Hohensee's 3-stack machine (H3sm) model of a CPU, the existing but very rustic programming environment. Ha3sm calls pass values between the kernel and user processes based on the H3sm model, on 3 stacks. Ha3sm is preemptive multitasking. It is true multi-user, in that non-owner users are in (hardware) sequestered memory-spaces. Ha3sm can provide hard realtime handling for only one periodic task. This is because one computer can only guarantee instant response to one event generator. Ha3sm is a microkernel providing device abstraction and CPU abstraction, but not virtual memory, network interfaces or filesystem abstractions. Interprocess communication and other facilities may be implemented as services of processes, which are registered as pseudo-devices. In Ha3sm, "Everything Is A Device", but there are 2 types of devices with 5 types of "channels". There is also no provision planned for SMP, journalling or snapshotting the system.

Ha3sm is hoped to be implemented on the 386 first, but note that this is the first mention of a specific hardware platform. The hardware requirements of Ha3sm are similar to, but a bit less than, what unix requires.

.................................................................
Drivers, handlers, Blocks, streams, channels and please

There are block devices and stream devices.
This is in acknowledgement of the deep difference between things that have a finite size, and things that can best be expressed as having a rate over time, but no static size bounds. Joining the two with an abstraction layer outside the kernel such as a filesystem is LAAETTR. This directly gives rise to four types of data channels, and I add a control channel. Each of these channel types has a call.

Block devices communicate via blocks in a manner similar to Forth. Stream devices communicate via ring buffers. Ring buffers keep a throughput count, which can be masked down to a current input cursor, and also can be used as a check value to see if a process has not kept up with device throughput, i.e. a process may check if it has been lapped by the device in the endless race around the ring. The size of a particular ring buffer shall be a power of two for easy masking down of the throughput value to the cursor value. Buffer sizes are static data in each device's data structure.

Forth standard practice involves stack-effect diagrams when presenting a Forth word. This is similar to C function prototypes, but is just commentary, and the distinction between functions and operators doesn't exist in Forth or H3sm. A Forth example...

    SWAP ( A B --- B A )

Forth stack diagrams are a "before --- after" picture of the stack, and --- is what before and after are relative to: the invocation of the word being diagrammed. The left of --- is before, right is after, and the top of the stack is to the right on both before and after sides. In the above, B is on the top of the stack, SWAP happens, and then A is on top. H3sm stack-effect diagrams are the same, with the addition of (P: ) for pointer stack effects, and Size! for things that may change the Size register. For some words that need stack items present that they don't modify, I use (required ||| consumed --- produced ) e.g.

    OVER ( a b ||| --- a)

Here are our first call words.
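
As an aside before the words themselves, the ring-buffer bookkeeping described above can be sketched in C. This is only an illustration under assumptions of mine: the struct layout, the 32-bit counter, and the names ring_init, ring_put, ring_lapped and ring_sync are all hypothetical, not part of H3sm or Ha3sm. It shows the throughput count being masked down to a cursor, and a reader detecting that the device has lapped it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring buffer; field names are illustrative only. */
struct ring {
    uint8_t *data;
    uint32_t size;        /* static, must be a power of two */
    uint32_t throughput;  /* total bytes the device has ever stored */
};

void ring_init(struct ring *r, uint8_t *storage, uint32_t size)
{
    /* A power of two has exactly one bit set. */
    assert(size != 0 && (size & (size - 1)) == 0);
    r->data = storage;
    r->size = size;
    r->throughput = 0;
}

/* Mask the throughput count down to the current input cursor. */
uint32_t ring_cursor(const struct ring *r)
{
    return r->throughput & (r->size - 1);
}

/* Device (handler) side: store one byte and advance the count. */
void ring_put(struct ring *r, uint8_t byte)
{
    r->data[ring_cursor(r)] = byte;
    r->throughput++;
}

/* Reader side: 'seen' is the reader's copy of the throughput count at
   its last sync. The reader has been lapped if the device has moved
   more than one full ring since then. Unsigned subtraction keeps this
   correct when the 32-bit counter wraps. */
int ring_lapped(const struct ring *r, uint32_t seen)
{
    return (uint32_t)(r->throughput - seen) > r->size;
}

/* Copy unread bytes to 'out' (at most 'max') and advance *seen.
   If lapped, skip ahead to the oldest byte still in the ring; that
   recovery policy is an assumption of this sketch. */
uint32_t ring_sync(const struct ring *r, uint32_t *seen,
                   uint8_t *out, uint32_t max)
{
    uint32_t n = 0;
    if (ring_lapped(r, *seen))
        *seen = r->throughput - r->size;
    while (*seen != r->throughput && n < max) {
        out[n++] = r->data[*seen & (r->size - 1)];
        (*seen)++;
    }
    return n;
}
```

A reader would call ring_lapped before ring_sync if it wants to know that data was lost rather than silently skipping ahead.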
These descriptions are what would be used from the process using these calls.

read ( --- success ) (P: device_ID block ||| )
    Read a block into process memory directly from a block device.

write ( timeout --- success ) (P: device_ID block ||| )
    Lock the device, or wait/timeout. If obtained, write block in process memory directly to device. Unlock. Else return success=false.

recieve ( --- recieved ) (P: device_ID ring ||| )
    Read a segment of a kernel ring buffer from the process's concept of current in the kernel ring buffer to the kernel's concept of current. That is, sync/update the process buffer with the device buffer. Check if the process has been lapped. This requires a buffer in the driver and the process.

send ( timeout --- sent ) (P: device_ID ring ||| )
    Lock the device, or wait/timeout. If obtained, write the process's pending ring segment to the driver's ring buffer from the driver's current. Unlock. Else return sent=false.

please ( --- success ) (P: please/reply_buffer ||| )
    Send an ASCII string to a device. The device may be addressed by the ASCII header syntax vis-a-vis the kernel device namespace. The reply, if any, may go to the same buffer the please came from. This is a half-duplex channel. The process is the master. There is no buffer on the kernel side. There is the device namespace in the kernel, and other device-specific dynamic info and literal strings. ASCII itself contains a telegraphy addressing syntax, which is reusable as is.

Note that write and send channels must be locked on each use. Locking is implicit to write and send, but it is the process's responsibility to partition send/write sizes appropriately. Read/recieve channels don't care how many concurrent readers there are; they multiplex implicitly. Writes, on the other hand, may only time-interlace. The maximum individual send is the size of the device's buffer.

These calls need a success/fail return value. The sense of this value will be success = true = <>0 .
This means success return values can be more informative than failure return values, which are always 0 (false). Specific calls may easily have other specific stack effects if more failure info is desired than a 0.

Some mechanism is probably also advised for preventing a process from monopolizing an output channel. Something simple, like arbitrarily refusing every 2^Nth lock request, where N is a device-specific constant.

.......................................
The kernel device namespace

There are only 5 calls. please calls are messages, and all other kernel services are accessed that way. The criterion for whether something should be a distinct call is, "does it need to be pre-authorized?". Only send, write, recieve and read need pre-existing state, for performance reasons.

There are several items that want to be associated with a device name. Each name minimally has to map to an address representing the device exactly. This will probably be the address of the device's data structure. There will be a unix-like permissions scheme for each device channel.

cLIeNUX, my Linux/GNU/unix "distribution", uses filesystem symlinks to implement internationalizable aliases for fundamental directories. This doesn't look too pricey for the Ha3sm device namespace. What is needed is a name type that doesn't appear in usual listings but that is visible to calls that use the concealed name specifically, and an alias mechanism. Fundamentally, an alias mechanism means one device can be pointed to by more than one name.

This scheme is simple, but confusing, and unusual, so I will expound a bit. Let's use the unix naming scheme. Think of names beginning with a period such as .floppy as being concealed. Now assume that a convention exists such that .floppy is the default name for the floppy device. The user of the Ha3sm system in question, let us say, is from Mongolia. .floppy is meaningless to her.
However, programs that she may want to run that access the floppy at her behest, but without specific direction from her to do so, need a name to find it by. .floppy is the utility name, which utilities know without having to uncover it. If our Mongol princess lists the devices though, she won't see .floppy. Hopefully she will see everything in her preferred language, character set and page-traversal layout. There will therefore be at least one other name for .floppy, in her language, pointing at exactly the same thing.

It looks like our device namelist wants to be two related structures: a list of actual device node data, which is pointers and permissions lists per device, and a list of names that points to the data nodes in a more_than_1:1 fashion. Only the names list would ever need to be sorted. Both lists will be subjected to the forces of expansion and contraction, however. Both lists will have items of irregular length.

A process needs to open and close devices it uses, via the open and close calls. open sets up references to the device, gets its buffer size if any, determines if it exists, and in some cases, tells the driver where the in-process buffer is. open will also synchronize the driver's current throughput value and the process's version thereof. In other words, communication requires some commonality of state between parties, and for the four device-specific channel types that is established by open, and de-allocated by close. De-allocation on the kernel side will be minimal. Locks for example are already self-de-allocating. The kernel device namespace as a whole is always "open" vis-a-vis please.

    please #device open ( timeout --- Device_ID=success ) (P:device-specific)
    please close ( Device_ID --- )

.............................
Timing

A simple round-robin scheduler is contemplated. It is hoped that once around the process list, hereinafter referred to as a Cycle, can be used as a simple, nearly implicit means of timing wait/timeouts.
It is also hoped that waits will be rare, but alas, that doesn't look likely. There is an obvious tradeoff between slice duration and time spent doing task-switch bookkeeping. A substantial portion of total CPU time can be consumed by state-change activities if the slice duration is too small. Less obviously, a significant hit here, on the order of 30%, may be worthwhile for overall simplicity. The implementation of a Cycle may change to something more controlled or regular, but the idea of a several-slice wait period will probably remain. Relatively heavyweight processes also mean relatively few processes, which also means a Cycle should be relatively short.

The "wait" kernel subroutine will be used by IO calls and the "rest" option of please, "please rest". wait will wait, and will involve a timeout. This is a down-counter for the number of times some condition is to be tested. A timeout spec of 0 means "now or never". wait must perform a conditional branch on timeout, which is handled by the call code in the kernel. wait doesn't branch if the test condition is satisfied.

please rest ( no stack effects)
    Relinquish the CPU for one Cycle. This is wait with a timeout of 1.

.....................................
Owner and guests

There are two very different kinds of process: owner and guest. All guest processes must login with user ID and password. There is no separate fork() or exec() or similar. The usual user utilities and "shell commands" are expected to be threads, i.e. not full-blown processes, i.e. they will normally be H3sm words. In stock unix terms this would be most analogous to shell built-ins.

Owner processes are also only spawned by login, but may or may not require owner authentication. The initial process invoked by the kernel is an owner process. Owner processes have a dictionary and so on at an appropriate point in the physical address-space, and can see all RAM.
IFF a login is attempted from the system keyboard, which may be defined at boot as some other device such as a serial port, then an owner login is allowed. Owner logins are strictly dis-allowed via process services, which precludes network interfaces, which precludes logging in as owner over a packet-switched network. Workarounds to allow remote-admin are LAAETTR. The system may be configured at boot to find user authentication data on mass storage. Guest processes don't need access to this data.

Guest processes are controlled by strict-constructionist permission mechanisms, i.e. guests can only see things specifically allowed to them individually. Owner processes, on the other hand, can "fandango on core". Each guest process is then somewhat analogous to an instantiation of Tripos, CintPOS or AmigaDos, and an owner process sees the whole system like AmigaDos. This is unlike the unix superuser, which is still a regular user as far as memory sequestering goes. This means that Ha3sm is actually divided into guest-spaces, which are all a subset of kernel/owner space, which is physical RAM. If an owner process crashes it may well take the system with it. A guest process crash is in effect a logout.

..................
Events

I want to go over interrupts and traps again under the general heading of events. This is the guts of an OS, so a bit of redundancy and a bit of jargoneering is called for. Here's how interrupts and so on break down in Ha3sm, higher priority first...

Ha3sm-ese   usual term and comments
..          ...

reflex      realtime interrupt handler
    Unpreemptible. Caused by the OS, possibly via a timer. Manageable with please calls by an owner process. This is the realtime interrupt handler.

preempt     process scheduler interrupt handler
    Caused by a timed hardware interrupt. Switches processes. Saves soon-to-be-sleeping process's state in a kernel data structure, and "returns" to the next process in that structure.
    This is the only event that doesn't nest like a plain subroutine vis-a-vis other events. Does a full register state save/restore. Only preemptible by a reflex. Can also be caused by the please rest message by a process that doesn't need the rest of its slice.

handler     device interrupt handler
    The kernel-only side of a device driver. Generates the device data. Caused by the hardware interrupt of the particular device. May be manipulated by an owner process with please install and friends.

surprise    exception handler
    Unintentional exception, such as divide-by-zero, general protection violation et cetera. May have an OS-related function, like killing an errant process. Not really synchronous from the soon-to-be-former process's point of view. "Muddled synchronous", since the current process probably is a bit muddled.

call        syscall
    Synchronous exception. An intentional call to kernel-space and kernel code by a process, including a possibly otherwise sequestered (guest) process. Caused by the appropriate machine instruction. Ha3sm needs exactly 5 syscalls I think: read, write, send, recieve and please, which means we can save an instruction or two by giving each call its own trap ("int" on x86) number.

....................................
Allocator

Processes have to be allocated their memory requirements, and de-allocated them. Guest processes will have to have access protection enforced upon them for their allocations. Owner processes will have a core allocation per process, which will be performed in cooperation with guest process allocations, but owner processes may then access all of RAM. Processes need contiguous non-overlapping spans of memory.

H3sm has the $egment mechanism for creating a bounded segment of arbitrary data. A $egment is a pointer-prefixed or count-cell-prefixed range. $egments combine implicitly into singly up-address linked lists.
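
To make the idea of count-prefixed ranges chained up-address a little more concrete, here is a rough C sketch. The actual H3sm $egment layout is not given in this document, so everything here is an assumption for illustration: the two-cell header (count, then an owner tag with 0 meaning free), the first-fit search, and the names seg_format, seg_next, seg_free_cells and seg_alloc are hypothetical. The cap of half the remaining unallocated memory is the rule stated in the Syscalls section.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t cell;

/* Assumed layout per segment: [count][owner][payload: count cells].
   The next segment starts right after the payload, so the count
   doubles as an implicit up-address link. */
enum { HDR = 2, FREE_TAG = 0 };

/* Start with one free segment covering all of 'ram'. */
void seg_format(cell *ram, cell total)
{
    assert(total > HDR);
    ram[0] = total - HDR;
    ram[1] = FREE_TAG;
}

/* Follow the implicit link to the next header (== total at the end). */
cell seg_next(const cell *ram, cell at)
{
    return at + HDR + ram[at];
}

/* Sum the payload cells of all free segments. */
cell seg_free_cells(const cell *ram, cell total)
{
    cell sum = 0;
    for (cell at = 0; at < total; at = seg_next(ram, at))
        if (ram[at + 1] == FREE_TAG)
            sum += ram[at];
    return sum;
}

/* First-fit allocation for process 'owner' (nonzero).
   Returns the payload index, or 0 on failure; index 0 is always a
   header, so 0 is never a valid payload index. */
cell seg_alloc(cell *ram, cell total, cell want, cell owner)
{
    assert(owner != FREE_TAG);
    /* The document's rule: at most half of what remains unallocated. */
    if (want == 0 || want > seg_free_cells(ram, total) / 2)
        return 0;
    for (cell at = 0; at < total; at = seg_next(ram, at)) {
        if (ram[at + 1] != FREE_TAG || ram[at] < want)
            continue;
        cell spare = ram[at] - want;
        if (spare > HDR) {
            /* Split: the remainder becomes a new free segment. */
            cell rest = at + HDR + want;
            ram[rest] = spare - HDR;
            ram[rest + 1] = FREE_TAG;
            ram[at] = want;
        }
        ram[at + 1] = owner;
        return at + HDR;
    }
    return 0;
}
```

De-allocation at logout, and merging of adjacent free segments, are left out of the sketch.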
I'll probably be using $egments and simple variants of $egments to keep allocation data interlaced with the allocations throughout RAM. The smallest item that will require an allocation is the process. An allocation will be rather slow if a large number of processes are active, but allocation happens only on login. I'm hoping a fully usable and general OS is possible without malloc/brk. Again, I also don't want to do virtual memory or "paging".

.........................................................
Simplicity, Portability and Performance

Pick any two. I'll take 1 and 3. My suspicion is that overall system simplicity can allow the Ha3sm kernel to be written entirely in assembly for a particular platform with an effort a crazed individual can achieve in some few months. This may be a payback for some of the things that could be more elegant. So it is with simple round-robin and Cycles. The please call is unbounded in terms of what it might be called upon to provide, but it is also largely non-essential. Most of the device code should be re-usable for all devices of a particular channel type. Ha3sm therefore exchanges total code non-portability for smallness, which itself helps porting.

It's probably possible to in some sense write the kernel in Forth, but I'd just go for asm. Looking at other kernels, and my own code for H3sm, high-level languages are over-rated for low-level tasks. There are about 90 primitives in H3sm that I'm currently rewriting in 386 assembly. The "a" in Ha3sm is probably a similar amount of code. Although this is utterly platform-specific code, the H3sm primitives and the Ha3sm interrupt handlers provide a succinct point of portability. This at least provides a clear target or plateau above which asm is not necessary.

Dennis Ritchie said that C caused about a 40% performance hit in UNIX versus the straight assembly version. The rule of thumb is more like 3:1 in favor of assembly.
Marcel Hendrix in comp.lang.forth says 5:1 is typical for the Pentium. Forth with some assembly can surpass straight C, and combining Forth and assembly is easy. Performance benefits in Ha3sm will benefit not only total throughput, but also responsiveness. The reflex handler for example wants, clearly, to be pure assembly.

Once a Ha3sm is built, it seems to be capable of providing complete host abstraction to the H3sm model. This is nice, given that H3sm hardware is possible for an imaginable amount of effort. In any case, small overall system size is conducive to high-performance implementation.

................
Graphics

Big subject. I just want to make a guideline. As I suspect that accepting that there are two types of device in the world makes things simpler, so I suspect that visual display devices can be categorized beneficially into 3 types: glyph, pixel, and vector, or teletype, tube/screen and plotter/scope. Initial Ha3sm facilities will be implemented as the VGA text variant of the glyph display category, with the idea that the other two types will arise. That's about all I can see implementing any time soon. ASCII sent to a text device displays ASCII. ASCII itself may be sufficient to define escapes for the other two display types. LAAETTR.

..........................................
handlers, calls and CATCH

The microkernelness of Ha3sm is an accident of its goals of Forthishness, mutability and implementability. Note that Ha3sm provides a threader and interpreter in kernel-space.

Given a H3sm in the kernel itself, we can ponder dealing with interrupts via H3sm words. Forth CATCH is implemented, I assume, by installing a handler for an interrupt in an interrupt vector. That is, CATCH does the installing. On x86 for example, asynchronous and synchronous interrupt vectors are all in the same array. The handler code for an asynchronous interrupt is, it seems necessarily, much different than a handler for a trap, i.e.
a synchronous interrupt, typically a call. An asynchronous interrupt must do a major state-save, do its thing, and restore the interrupted process's saved state. It looks at this point that something like Forth CATCH is adequate for snagging the int vector, but some words are needed in addition to CATCH for the state-save stunts in the asynchronous case.

Switching processes, which is the job of the scheduler and also the "please rest" call, involves saving a process's state and invoking some other process. This means that the state of that other process is already saved, and that the kernel knows where. This necessitates a process-table. 386+ task gates provide a mechanism for tasks (processes) to switch between each other with little kernel intervention, but task gates used that way require paging, which I wish to avoid. I also may wish to insert some kernel actions into a switch that are not possible with the provided 386+ scheme. Possible actions are partial state switches and resetting the slice timer conditionally, for example.

..........................................
Messages

please is a call, a syscall. Arguments to please are messages. A lot of things that are syscalls elsewhere would be messages in the Ha3sm sense in Ha3sm. Messages are addressed to a device, with the "default device" being the kernel itself. The possibilities are endless. Ha3sm messages are everything the kernel provides to processes besides the dedicated IO channels, so messages cover all other "syscalls", /proc, ioctl, and a H3sm interpreter. Some handy things that come to mind initially are

    ( * owner process only)
    ( ** reflex controls)

        bye
        rest
        login
    *   bequeath ( jump out of Ha3sm to your groovy 5-stack OOP whatever)
    *   reboot
    *   install
    **  start ( start a reflex handler)
    **  stop ( stop a reflex handler)
    *(*) calibrate (timer commands, including the reflex timer)
        identify ( basic box info, e.g. Ha3sm version, CPU...)
        date
        time
        epoch ( microseconds since 2000, kept/returned as a pyte.
              (2048 bits?) )
        uptime
        loading ( CPU usage stats, unix top equivalent)
        processes ( unix ps equivalent)

In other words, all the usual low-level (no files) syscalls, /proc, and of course a tetris game.

Implementationally, a device name is a pointer by the time the kernel needs it, so all this can be implemented as an in-kernel H3sm interpreter with a syntactical addendum that interprets device names and leaves a pointer on the pstack for the H3sm in the rest of the message. For added cuteness, messages could observe Forth-style comment syntax, and could accept Plan9-like #'s in devicenames simply by eliding #'s and thus not allowing # in a device name. That will help the in-kernel H3sm figure out what part is the devicename, if any. Interpretive use of please will need an escape or something to insert control bytes into strings, perhaps.

.......................................
Birth of a Process

    (please) login (P: name password --- )
        parent ( --- 1 )
        child ( --- 0 )

Processes are not Ha3sm. They merely coexist with her and share the same argument passing convention. New processes are instigated by existing processes, but become the responsibility of Ha3sm. A certain amount of ceremony is therefore appropriate, as follows.

Some process decides that it wants to be a daddy. Our suitor process, John, contrives to obtain a user name and password from somewhere, possibly from a stream someone is talking to, or possibly it just uses its own. No matter. John offers this to Ha3sm by calling login ala

    please us0r secret login

The kernel parses this suspiciously and looks up us0r in the user data structure. Some user data is compiled into the kernel, and some may optionally be on a block device somewhere. IFF user us0r is found, a set of important data for us0r is found with it. Part of that user data is a password, which the kernel then checks versus the one it was just passed, "secret" in this case. IFF password checking succeeds us0r is authorized as a user. Login occurs.
Process us0r is not really conceived yet though. The allocation has to succeed also. Along with us0r's name and pass is some info about her initial process, including the size allocation desired. The allocator is presented with this number and attempts the allocation. IFF that succeeds, then John is given a true success return flag for login. John is now completely out of the picture, probably off somewhere handing out cigars, if not scrounging up another username and password.

Ha3sm now engages in a bit of nest-feathering. Process metadata is generated for this pending new process. The user data for us0r may also request either a default H3sm environment, or a block range to load in the sense of loading an ELF or .EXE file, i.e. a binary executable load. (Or also in the sense of Forth LOAD, possibly, which is a form of compilation. This would be combined with a default H3sm core dictionary load.)

Next is the timeslice allocation. This process is added to the process list in the scheduler. The start address of this process is hooked up so that when the scheduler gets to it, it's ready to go. That is, an initial stack state is built for this process to be inserted in the process list, so that it can "return-from-interrupt" to its initial state as if it had been interrupted before. This may have to precede timeslice allocation, actually.

We now have a useless autonomous process. Unlike in unix, it has no IO connections at all. This process must ask for connections on its own, and may be refused them. Let's say this process is the H3sm interpreter variety. It will perhaps attempt to open the keyboard via

    please #keyboard#qwerty#recieve open

or similar. I haven't worked out a permissions scheme in detail yet, but we'll assume us0r is allowed to read the qwerty ring buffer. A user at the console can now tell this process what to do. This process can now do something useful. Cheers.

All these actions are the same for guest and owner processes.
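The chain of IFFs in the login ceremony above can be sketched roughly as follows. This is an illustration only, in Python rather than H3sm, and not the kernel's actual code: the names UserRecord, Kernel, free_ram and process_list are invented for the example, and the plain string comparison of passwords is an assumption, since the text does not say how passwords are stored or checked. The half-of-remaining-RAM cap is the one stated earlier in this document.

```python
from dataclasses import dataclass


@dataclass
class UserRecord:
    # Hypothetical per-user data: the text only says it holds a password
    # and some info about the initial process, including its size.
    password: str
    size_wanted: int


class Kernel:
    def __init__(self, users, free_ram):
        self.users = users          # name -> UserRecord
        self.free_ram = free_ram    # unallocated physical RAM, in bytes
        self.process_list = []      # stand-in for the scheduler's list

    def login(self, name, password):
        """Return the success flag that the calling process gets back."""
        record = self.users.get(name)
        if record is None:                      # IFF the user is found
            return False
        if record.password != password:         # IFF the password checks out
            return False
        # IFF the allocation succeeds; a process gets at most half of
        # the RAM that is still unallocated.
        if record.size_wanted > self.free_ram // 2:
            return False
        self.free_ram -= record.size_wanted
        self.process_list.append(name)          # scheduler now owns it
        return True


kernel = Kernel({"us0r": UserRecord("secret", 1024)}, free_ram=4096)
print(kernel.login("us0r", "secret"))   # all three gates pass
print(kernel.login("us0r", "wrong"))    # fails at the password gate
```

Each gate returns as soon as it fails, so a caller like John learns only that login did not happen, not which step refused it.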
Just for symmetry, an owner process might not have permissions for a particular device, and thus may be refused access. It can, however, read that memory directly, and can modify its own permissions.

.................................
Registering a service

A process may offer data to the system to be interfaced to as a device. Such a process pseudo-device is called a service. A device, and thus a service, has to at least respond to "please identify" sensibly. This means it needs a unique name. This also means it needs permissions flags. The kernel will set up the permissions allowed. A process can't permit access, or do anything else, for things it itself is not authorized for. This is because Ha3sm will implement "if you can't use it you can't see it at all". As far as I can tell, that can be controlled by please. For now, assume most service-offering processes are owner processes.

Most devices will also want to implement an authorized channel or two. That is a matter of creating the buffer and permissions flags required, and then adjusting the device namespace to include the new service. Once everything is set up, the kernel doesn't care that a service's data is in a process segment. That's just more RAM to the in-kernel call handler.

The idea of services is easy to implement if devices and their handlers are clearly orthogonal to the calls that access the device's or service's data. Ideally, read, write, send and recieve will not have any device-specific code; they will just use pointers to the device-specific data. This is possible because of strict sequestering of things into nouns (device/service data) and verbs (calls). This sort of thing is why I'd rather not discuss object-oriented methods. Thanks.

The performance characteristics of services look like they will be efficient but laggy. The calls used on a service are exactly the same calls used for a device. No hit there at all.
However, services that translate things from devices will act like delay-lines, depending on how much they do with the data and so on. There may also be race-condition issues with services that use locking channels, the proper handling of which will probably also introduce lag.

....................................................
Arbitrary ASCII

Here are the first several ASCII control byte assignments. ASCII is a complete telegraphy protocol. Ha3sm please messages will adhere to this. In particular, byte values 1 and 2 will be used to delimit devicenames, which is in a sense one form of an address for a message. Arbitrary addressing is also possible.

=00 U+0000 NULL
=01 U+0001 START OF HEADING
=02 U+0002 START OF TEXT
=03 U+0003 END OF TEXT
=04 U+0004 END OF TRANSMISSION
=05 U+0005 ENQUIRY
=06 U+0006 ACKNOWLEDGE
=07 U+0007 BELL
=08 U+0008 BACKSPACE
=09 U+0009 HORIZONTAL TABULATION
=0A U+000A LINE FEED
=0B U+000B VERTICAL TABULATION
=0C U+000C FORM FEED

Byte value 3, END OF TEXT, could be interpreted conversely to mean BEGIN BINARY DATA. There's a simple little comms problem with arbitrary data, however, particularly with a simple byte-literal protocol like ASCII, that requires a sort of extension to ASCII. You can't terminate a segment of arbitrary data with a delimiter. For a segment to be able to contain completely arbitrary data you have to assume that it might contain any value you might otherwise have used as a delimiter. You need a bit more protocol logic than was practical for telegraphy in 1966. One thing that can be done is that a segment of arbitrary data can be prefixed with a count, i.e. a size.

A size prefix for arbitrary data sent to a stream is an area where H3sm's pytes start to look useful. Let's say we just sent something a byte with the value 3, an ASCII END OF TEXT. We are going to follow that with some sized (counted) arbitrary data. There has to be some agreement between the parties as to how to interpret a size value.
There are a variety of ways to express a number. If you use a binary integer, the count number itself has a size, an endianism, a signedness, and so on. A variant of H3sm pytes and some assumptions can make this very general, but simple. Consider a pyte as an integer represented by N bytes and a Size byte of value N. Consider that N, i.e. Size, can range from 1 to 255. Let's make big-endian the default endianism of an Aetm arbitrary data size prefix.

Let's say we want to insert 39355 bytes of data into an Aetm message. 39,355 is 99bb hexadecimal. That's a two-byte number if it's taken to be unsigned, so we can set Size to 2. Our 3 bytes after the 3-byte, END OF TEXT, are 0x02, 0x99 and 0xbb. Sender and receiver now know to resume ASCII interpretation of data after 39,355 bytes have been transmitted after the 0xbb. Size = 0 may also prove useful for things like an endianism escape.

With a little imagination vis-a-vis network services and so on, the same protocol we use to open a stream device can be used to send anything anywhere.

..................................
Tables and such

Memory Spaces

                 physical   segment     segment
                            allocated   protected
kernel              *
owner process       *          *
guest process                  *           *

That is, an owner process has an allocation like a guest process, but is not strictly constrained to that area. The kernel can allocate more space for itself. Processes get what they login with. That's it. There will also be an option in the kernel to only allow one guest process per guest ID.

Interrupts

               priority       returns to     triggered by
Real time      highest        preemptee      configurable
scheduler      next highest   next process   timer
devices        per device     preemptee      devices
other timers   low            preemptee      timer

Channels (traps, calls)

           buffer/where  lock  open'd  mux
read       block/process  *  *
write      block/process  *  *
recieve    ring/process&device  *  *
send       ring/process&device  *  *
please     string/process

Lock=block. write and send need to take a timeout argument.
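Returning to the size prefix from the Arbitrary ASCII section, here is a rough sketch of how a sender and receiver might agree on it. This is an illustration in Python rather than H3sm, not a definitive format: the names ETX, frame_binary and parse_binary are made up here, the count is assumed to be unsigned, big-endian, and packed into the fewest bytes that hold it, and Size = 0 is simply rejected because its meaning is still open.

```python
ETX = 0x03  # ASCII END OF TEXT, read here as BEGIN BINARY DATA


def frame_binary(payload: bytes) -> bytes:
    """Prefix payload with ETX, a Size byte, and a big-endian count."""
    count = len(payload)
    # Fewest bytes that hold the count; an empty payload still gets Size 1.
    size = max(1, (count.bit_length() + 7) // 8)
    if size > 255:
        raise ValueError("count does not fit in a 255-byte pyte")
    return bytes([ETX, size]) + count.to_bytes(size, "big") + payload


def parse_binary(stream: bytes) -> tuple:
    """Split a framed segment into (payload, bytes left over for ASCII)."""
    if len(stream) < 2 or stream[0] != ETX:
        raise ValueError("segment does not start with ETX")
    size = stream[1]
    if size == 0:
        raise ValueError("Size = 0 is reserved in this sketch")
    header_end = 2 + size
    if len(stream) < header_end:
        raise ValueError("segment is shorter than its size prefix")
    count = int.from_bytes(stream[2:header_end], "big")
    end = header_end + count
    if len(stream) < end:
        raise ValueError("segment is shorter than its count")
    return stream[header_end:end], stream[end:]


# The worked example from the text: 39355 bytes of arbitrary data.
framed = frame_binary(bytes(39355))
print(framed[:4].hex())  # 030299bb
```

Because the receiver skips exactly the counted bytes, the payload may contain any byte value, including 1, 2 and 3, without being mistaken for a delimiter.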
Anatomy of a (stream) device

|perms|lock| size |throughput|--------the ring----------------|
             \                          /\
              \                       current
               [sizemask]________________/

Block devices' in-kernel data are just perms, size and lock.

...............
Eclectics

I've been trying to combine Forth, AmigaDos and unix for a long time. Ancillary to that process, Forth grew another stack and a Size register, unix and AmigaDos lost files, and AmigaDos became a process. Meanwhile, Chuck Moore stuck address registers in his Forth engines, Dr. Chen Ting did some VHDL of a 3-stack machine, and Martin Richards made Tripos a process. About a week ago the Everything Is A Device scheme occurred to me, and the thing started to look like it might coalesce.

The pyte stack in H3sm reduces namespace explosion in Forth. Similarly, throwing out files reduces calls versus other systems. One can only assume that files will be back, but not in the kernel proper.

The duality of device types is perhaps unusual. After having been questioned disapprovingly about this, it occurs to me that there's nothing but common sense preventing a driver from offering all five channel/call types for the same data. More likely, filesystem processes will arise to synthesize that.

The Amiga had some clever hardware for handling multi-media realtime tasks. What can be done in this regard is limited on an IBM PC, and varies quite a bit within that range, but the Ha3sm realtime facility does what can be done.

The predecessor of AmigaDos was Tripos from Martin Richards at Cambridge. The latest incarnation of Tripos from Richards himself, who is still actively developing it, is CintPOS, which runs as a Tripos emulator on Linux. This is closely analogous to AmigaDos-as-process. CintPOS and Tripos are written in BCPL, which C was based on. BCPL is to some extent a Forther's C, or the third generation of what Forth is the fourth generation of. This is probably due to BCPL's primarily educational mission. BCPL does not have types or structs.
It is based on cells, as are recent Forths and H3sm. BCPL even has MULDIV, an algebraic equivalent of Forth's */ . The most promising sources of application code examples for Ha3sm are thus the Forth world and CintPOS.

I have heard that some work in operating systems involves protecting processes from the kernel. Ha3sm makes no such attempt. In Ha3sm generally, if you want info, you must surrender info. As a process talking to the kernel, you offer up your entire address-space. I also have no interest in object-oriented methods per se. Objectisms in Ha3sm, if any, are entirely accidental. Other things I've heard about that don't interest me personally are process protection entirely in software and innately distributed systems. Plan9, for counter-example, can "download a CPU" or something. LAAETTR. Ha3sm, I believe, could provide a succinct platform for implementing many other more complex systems.

......................
Recent influences

I solicited comment on interprocess communication from Frank da Cruz, prime mover of C-Kermit. He did take enough time out to say "Allow timeouts!", which probably influenced the stack effect diagrams for write and send. I've been browsing the sources or docs to Plan9, CintPOS, Minix, BCPL, ASCII, Linux and EROS lately. And COLDFORTH. "please" is of course in honor of INTERCAL.

H3sm and so on are in

ftp://linux01.gwdg.de/pub/cLIeNUX
ftp://linux01.gwdg.de/pub/cLIeNUX/interim

LAAETTR.