Hohensee's 3-ring Linux

WHEN

May 2001

SUMMARY

I have incorporated my 3-stack Forth-family language, H3sm, Hohensee's 3-stack machine, directly into the Linux kernel as a kernel thread of execution, also known as a "kernel daemon" in unixese. I'll describe the specifics of Linux 2.4.0 and H3sm 1.5, most of which will have direct analogies to other unices and Forth-family languages, and other interactive programs as well. I will attempt to cater to the Forth-side point of view.

WHAT

I am typing on my Linux console virtual terminal (vt) number two. The console is the keyboard, mouse and CRT directly connected to the Linux system in question, the one running on the PC 3 feet away. My setup has 12 vt's that I can switch between with various key combinations. Each vt has a complete sequestered user state affiliated with it. Most are initialized to run a command shell as the unix superuser, "root". In a day I usually actually use about 8 vt's for various things.

vt 1 has a very unusual state associated with it. Here is what is currently shown on the monitor when I switch to vt 1...


_________________________________________________________________________
<snip>

Verifying DMI Pool Data ...

Loading................................................................
...............................
Uncompressing Linux... Ok, booting the kernel.

7 emit
7 emit
<cursor here>


__________________________________________________________________________

 
That vt has a running H3sm attached to it, which does "emit" a la Forth, but doesn't do an OK prompt. The "7 emit"s did in fact cause audible beeps. This appears to be a normal H3sm interface. It's not a big deal to set the unix initialization process to spawn whatever interactive program you prefer on a particular terminal, virtual or otherwise. The unix ps command lists some info about the various processes on the system. This will show the above H3sm to be a bit unusual in terms of what it is in fact an interface to...
___________________________________________________________________________
:; cLIeNUX /dev/tty4  12:42:59   /
:;ps
PID     TTY     STAT    RSS     COMMAND
1       0       S       89      (init)
2       0       S       0       (kswapd)
3       0       S       0       (kreclaimd)
4       0       S       0       (kspamd)
5       0       S       0       (kflushd)
6       0       S       0       (kupdate)
10      1026    S       142     (bash)
17      1027    S       143     (bash)
18      1028    S       137     (bash)
19      1029    S       137     (bash)
20      1030    S       137     (bash)
21      1031    S       137     (bash)
22      1032    S       137     (bash)
23      1033    S       137     (bash)
24      1034    S       137     (bash)
25      1035    S       137     (bash)
26      1036    S       137     (bash)
67      0       S       101     (syslogd)
76      0       S       84      (gpm)
83      1027    S       394     (browse)
90      1026    S       333     (browse)
94      1026    S       111     (edit)
95      1028    R       2       (ps)
:; cLIeNUX /dev/tty4  18:21:15   /
:;
________________________________________________________________________

I'll actually have to fill in some blanks for you as to what the above shows. There is no H3sm process. H3sm is actually "kspamd", the 4th process. RSS is each process's userspace memory allocation. The first few processes after init don't have any because they are kernel threads of execution, and use the same address-space as the kernel itself. They are the kernel itself, very basically. This makes our beeping H3sm very unusual indeed. I don't understand the TTY field in the above ps listing. kspamd is actually hooked to /dev/tty1, or vt 1. This is shown in the /proc info for kspamd. /proc is a pseudo-filesystem of kernel information. File reads in the /proc namespace cause the kernel to produce some system info. /proc/4/fd is a directory for process 4 showing the file descriptors the kernel is affiliating with that process. The Lynx browser will show...
_______________________________________________________________________________

Files:

   lrwx------        1K May 11 18:35 0 -> /dev/tty1
   lrwx------        1K May 11 18:35 1 -> /dev/tty1
   lrwx------        1K May 11 18:35 2 -> /dev/tty1

_______________________________________________________________________________
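
On the TTY column in the ps listing further up: one plausible reading, offered here as an assumption rather than something the listing states, is that it is the raw device number, major * 256 + minor. Under that reading 1028 is major 4, minor 4, which would be /dev/tty4, and that matches the prompt of the shell where ps was run. kspamd's 0 would then mean "no controlling terminal", which is a separate matter from the three descriptors it has open on /dev/tty1. A minimal sketch of that decoding, with hypothetical function names:

```c
#include <stdio.h>

/* Assumption: the TTY column is major * 256 + minor, i.e. a device
   number with an 8-bit major and an 8-bit minor. */
struct tty_dev {
    unsigned major;
    unsigned minor;
};

struct tty_dev decode_tty(unsigned tty_nr)
{
    struct tty_dev d;
    d.major = tty_nr >> 8;
    d.minor = tty_nr & 0xff;
    return d;
}

/* Write a readable name for tty_nr into buf.  Major 4 with a minor
   below 64 is treated as a virtual console; 0 is taken to mean no
   controlling terminal; anything else is shown as major:minor. */
int tty_name(unsigned tty_nr, char *buf, size_t len)
{
    struct tty_dev d = decode_tty(tty_nr);

    if (tty_nr == 0)
        return snprintf(buf, len, "none");
    if (d.major == 4 && d.minor < 64)
        return snprintf(buf, len, "/dev/tty%u", d.minor);
    return snprintf(buf, len, "%u:%u", d.major, d.minor);
}
```

With this reading, the bash on 1026 is the shell on vt 2, the one being typed on.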

Those 3 file descriptors give our H3sm the full facilities of the Linux console driver. The Linux console driver is an embarrassment of riches, and is for our purposes accessed by the simple elegance of the "everything is a file" design of unix. We get very complete vt102 emulation in "cooked" mode, which gives us the beep, echoes our input, accepts our input in rather intelligent linewise fashion with backspace and so on, puts us on vt 1 of the 12 that my setup instantiates, and we get mouse text cut/paste too.
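
For readers coming from the C side, "7 emit" amounts to writing one byte to file descriptor 1. The following is only a user-space sketch of that idea, not H3sm's actual emit, which is assembly; the names emit and emit_fd are hypothetical.

```c
#include <unistd.h>

/* Send one character to a file descriptor.
   Returns 0 on success, -1 on error or short write. */
int emit_fd(int fd, int ch)
{
    unsigned char c = (unsigned char)ch;

    return write(fd, &c, 1) == 1 ? 0 : -1;
}

/* Stand-in for Forth's emit on standard output.  ASCII 7 is BEL,
   which a vt102-style terminal turns into a beep. */
int emit(int ch)
{
    return emit_fd(1, ch);
}
```

Everything else in the list above (echo, line editing, the beep itself) is done by the console driver on the other side of that descriptor.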

For a hint at what makes our kernel daemon status special, let's hexdump some kernel data. When a Linux kernel is built a file called System.map is generated containing an address/type/name listing of all the linker symbols in the kernel. The kernel I'm working on now has 8200 symbols. Let's dump one of these...

 
.............................................
(System.map)
c029fda0 d charset2uni
c029ffa0 d page00

...........................................
c029ffa0 ->p dump

c029ffa0   00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f   |                |
c029ffb0   10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f   |                |
c029ffc0   20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f   | !"#$%&'()*+,-./|
c029ffd0   30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f   |0123456789:;<=>?|
c029ffe0   40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f   |@ABCDEFGHIJKLMNO|
c029fff0   50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f   |PQRSTUVWXYZ[\]^_|
c02a0000   60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f   |`abcdefghijklmno|
c02a0010   70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f   |pqrstuvwxyz{|}~~|
c02a0020   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   |                |
c02a0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   |                |
c02a0040   a0 00 00 00 a4 00 00 a7 a8 00 00 00 00 ad 00 00   |~   ~  ~~    ~  |
c02a0050   b0 00 00 00 b4 00 00 00 b8 00 00 00 00 00 00 00   |~   ~   ~       |
c02a0060   00 c1 c2 00 c4 00 00 c7 00 c9 00 cb 00 cd ce 00   | ~~ ~  ~ ~ ~ ~~ |
c02a0070   00 00 00 d3 d4 00 d6 d7 00 00 da 00 dc dd 00 df   |   ~~ ~~  ~ ~~ ~|
c02a0080   00 e1 e2 00 e4 00 00 e7 00 e9 00 eb 00 ed ee 00   | ~~ ~  ~ ~ ~ ~~ |
c02a0090   00 00 00 f3 f4 00 f6 f7 00 00 fa 00 fc fd 00 00   |   ~~ ~~  ~ ~~  |
c02a00a0   00 00 c3 e3 a1 b1 c6 e6 00 00 00 00 c8 e8 cf ef   |  ~~~~~~    ~~~~|

................................................
Some kind of text mapping table apparently. Not the sexiest of the 8200 symbols available, but illustrative.
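
Each System.map line is an address, a one-letter type and a name, as in the excerpt above. A minimal C sketch of pulling those three fields apart follows; the struct and function names are hypothetical, and the 127-character name limit is an arbitrary choice for the sketch.

```c
#include <stdio.h>

/* One System.map entry: address, type letter, symbol name. */
struct ksym {
    unsigned long addr;
    char type;
    char name[128];
};

/* Parse one "address type name" line.
   Returns 1 if all three fields were read, 0 otherwise. */
int parse_system_map_line(const char *line, struct ksym *out)
{
    return sscanf(line, "%lx %c %127s",
                  &out->addr, &out->type, out->name) == 3;
}
```

Fed the line "c029ffa0 d page00", this yields the address that was handed to dump above.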

There are ways to do this from a normal sequestered Linux user process, even as a plain user if filesystem permissions allow. They are not nearly so direct, and don't allow for things like moving active kernel code around, such as dynamic metacompilation might give rise to. H3rL's user-interaction goes through the filesystem, and also through the tty device special file mechanism, but moving data between H3sm and Linux runs on the metal, since there is no "between" between them.
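
The text does not say which user-process route is meant. One candidate, assumed here, is reading a memory device file such as /dev/kmem at the offset of the symbol, permissions allowing. The sketch below is generic: it reads bytes at an offset from any file and formats them in the style of the dump above. The function names are hypothetical, and on a 32-bit build large-file offsets would be needed to reach an address like c029ffa0.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Read len bytes starting at offset in the file at path.
   Returns the number of bytes read, or -1 on error. */
long read_at(const char *path, unsigned long offset,
             unsigned char *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    long n;

    if (fd < 0)
        return -1;
    n = (long)pread(fd, buf, len, (off_t)offset);
    close(fd);
    return n;
}

/* Format 16 bytes as one row in the style of the dump above:
   address, hex bytes, then a text column in which bytes below 0x20
   show as a space and bytes from 0x7f up show as '~'. */
void format_row(char out[96], unsigned long addr, const unsigned char *p)
{
    char text[17];
    int n = sprintf(out, "%08lx  ", addr);
    int i;

    for (i = 0; i < 16; i++) {
        n += sprintf(out + n, " %02x", p[i]);
        text[i] = p[i] < 0x20 ? ' ' : p[i] >= 0x7f ? '~' : (char)p[i];
    }
    text[16] = '\0';
    sprintf(out + n, "   |%s|", text);
}
```

Compared with H3sm inside the kernel, this route costs an open, a seek and a copy for every look at kernel data.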

HOW

H3sm 1.5 is written in assembly, x86 GNU gas and m4 macros. This means there is a linking phase to building H3sm of the same type as the one for Linux. H3sm has assembler/linker labels for all words, which are call addresses, and all branch targets. Vis-a-vis Linux, that makes two big namespaces, and we want to eliminate any possibility of conflicts between them. Linux is bigger, is the host, and may be more prone to symbol name dependency surprises, so we eliminate this problem from the H3sm side. An ed script and some hand tweaking convert all the labels in H3sm, and the references to them, to carry an "H3sm" suffix. That obscures things enough that we don't conflict with Linux. There are a very few H3sm symbols that Linux will need to see. The call address of H3sm is mainH3sm, for one. We can now produce an object code file that can be linked into Linux. Where?

A couple of the kernel daemons normal to Linux 2.4.x are coded in the mm/vmscan.c file in the Linux kernel source directory. kswapd, kreclaimd and so on. The documentation in the 2.4 sources is very good, as are the separate documentation efforts for these things on the web, but documentation tends to assume the reader is interested in use of existing facilities, rather than severe changes to them. Interaction is called for. The name kspamd is a holdover from early all-C test code I inserted into mm/vmscan.c to get a feel for things. In particular, I wasn't sure how kernel thread scheduling worked. Kernel threads run in cooperative multitasking with the rest of the kernel. A kernel thread has to explicitly relinquish the CPU, or nothing else runs. The handoff to the rest of the kernel is accomplished with the "schedule" kernel call. This is an addendum to what H3sm has to do in userland, and is an unresolved symbol in the H3sm object file until it is linked with the kernel. This sort of thing gave rise to some tricky nesting issues. This is the kspamd/H3sm code in mm/vmscan.c...

_____________________________________________________________________________

<snip>

int kspamd(void *unused)
{
        struct task_struct *tsk = current;
        pg_data_t *pgdat;

        tsk->session = 1;
        tsk->pgrp = 1;
        strcpy(tsk->comm, "kspamd");

        mainH3sm();     /* This doesn't return usually. */

        return 0;       /* Only reached if mainH3sm() ever returns. */
}



static int __init kswapd_init(void)
{
        printk("Starting kswapd v1.8\n");
        swap_setup();
        kernel_thread(kswapd, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGNAL);
        kernel_thread(kreclaimd, NULL, CLONE_FS | CLONE_FILES 
			| CLONE_SIGNAL);
        kernel_thread(kspamd, NULL, CLONE_FS | CLONE_SIGNAL);
        printk("Starting H3smik v0.00.00.1\n");
        return 0;
}

<snip>

____________________________________________________________________________
The bulk of the kspamd routine just mimics what the real kernel daemons do to be kernel threads. The mainH3sm() call enters H3sm, and never returns to kspamd(). This means that scheduling code for H3sm must be in H3sm itself. There is a mechanism used by other kernel daemons to put themselves on a runqueue, but the call to it is in C, which is problematic for H3sm 1.5. To get a similar effect, so that there isn't a gratuitous level of CPU use, I insert a nanosleep syscall in H3sm's top loop.

kswapd_init is called during kernel initialization. Much of that is also sheer mimicry, except that I don't use the CLONE_FILES flag for kspamd. That means H3sm/kspamd gets a distinct set of file descriptors from the rest of the kernel and daemons. That is necessary so that the FD's kspamd gets are 0, 1 and 2, which is probably indicative of some deeper problem avoided. If I do CLONE_FILES the FD's assigned are 3, 5 and 6. Not good. What kswapd_init ultimately does is make calls to the "clone" syscall, whose flags are described in its manpage.
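
What CLONE_FILES changes can be seen from user space with the glibc clone() wrapper. This is only an analogy under stated assumptions: it is not the kernel's kernel_thread(), it leaves out CLONE_SIGNAL because the sketch does not need it, and the function names are hypothetical. A child closes a descriptor; with CLONE_FILES the parent loses it too, because the descriptor table is shared, and without the flag the parent's copy survives.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child body: close the descriptor number it was handed. */
static int close_in_child(void *arg)
{
    close(*(int *)arg);
    return 0;
}

/* Open a descriptor, let a clone()d child close it, and report
   whether the parent's descriptor is still open afterwards.
   Returns 1 if still open, 0 if closed, -1 on setup failure. */
int parent_fd_survives(int share_files)
{
    const size_t stack_size = 64 * 1024;
    int fd = open("/dev/null", O_RDONLY);
    char *stack = malloc(stack_size);
    int flags = CLONE_FS | SIGCHLD;
    int still_open;
    pid_t pid;

    if (fd < 0 || stack == NULL) {
        if (fd >= 0)
            close(fd);
        free(stack);
        return -1;
    }
    if (share_files)
        flags |= CLONE_FILES;

    /* The stack grows downward here, so pass the top of the block. */
    pid = clone(close_in_child, stack + stack_size, flags, &fd);
    if (pid < 0) {
        close(fd);
        free(stack);
        return -1;
    }
    waitpid(pid, NULL, 0);

    still_open = fcntl(fd, F_GETFD) != -1;
    if (still_open)
        close(fd);
    free(stack);
    return still_open;
}
```

A thread that does not share the table starts numbering its own opens independently of its siblings, which is consistent with kspamd getting 0, 1 and 2 only when CLONE_FILES is left off.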

There is a problem I don't yet understand with writes to the console driver via the /dev/console device special file. This kills kspamd. I have therefore, temporarily I hope, disabled the usual kernel boot message mechanism by making /dev/console a regular text file rather than an implicit link to the console driver. This allows H3rL to survive, but I need to backtrack and see what I can un-break now that H3rL is live. The top loop of H3sm is where the things are that one needs to do to insert it, or something similar, into Linux. Here's the code...
________________________________________________________________________


top_loop_of_H3sm:
                        # stick a sleep in here.
                        # Do not melt the CPU, do not slow down
                        #       the test cycle.
        call timespec
        call pdup
        call pplusc
        call pplusc
        call nanosleep
        call twopdrop
        call drop

        HANDOFF
                call token
                YES(            mozygote)
                call interpret
ELL(                                                    top_loop_of_H3sm)


________________________________________________________________________
We've already opened the three file descriptors, initialized dp and so on. This is within a H3sm word called zygote, which is, ha ha ha, the last word in the base dictionary. HANDOFF is a macro for ...
		pusha
		pushf
		call schedule
		popf
		popa
which is our coroutine link point to the Linux scheduler, with a stash-everything-on-the-stack wrapper. I made it a macro in case I have to sprinkle it around various places in H3sm, but I haven't looked into any long-duration words yet that would lock Linux out for too long.

As mentioned earlier, the runqueue thing provided by normal kernel code for daemons is in C, so the code above HANDOFF does a sleep. It sleeps for at least one Linux "jiffy", which on x86 is 1/100 second.
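
For comparison, here is the same loop shape as a user-space C sketch. It is an analogy, not a translation of the assembly: sched_yield() stands in for HANDOFF's call to schedule, a whitespace-delimited read stands in for H3sm's token, the loop ends at end of input (an assumption of the sketch), and interpret is a caller-supplied hook. All names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <stdio.h>
#include <time.h>

/* One jiffy at 1/100 second, per the text: 10,000,000 ns. */
static const struct timespec one_jiffy = { .tv_sec = 0, .tv_nsec = 10000000 };

/* Hypothetical hook standing in for H3sm's interpret. */
typedef void (*interpret_fn)(const char *token, void *ctx);

/* Example hook: just counts the tokens it is given. */
void count_token(const char *token, void *ctx)
{
    (void)token;
    ++*(int *)ctx;
}

/* Sleep, give up the CPU, fetch a token, interpret it, repeat.
   Returns the number of tokens handled before end of input. */
int top_loop(FILE *in, interpret_fn interpret, void *ctx)
{
    char token[64];
    int handled = 0;

    for (;;) {
        nanosleep(&one_jiffy, NULL);    /* do not melt the CPU */
        sched_yield();                  /* stand-in for HANDOFF */
        if (fscanf(in, "%63s", token) != 1)
            break;
        interpret(token, ctx);
        handled++;
    }
    return handled;
}
```

The order matters in the same way as in the assembly: the sleep and the yield come before each token, so even an idle interpreter hands the CPU back once per pass.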

nanosleep, read, write and so on are regular Linux syscalls using traps. It's not strictly necessary to trap from the kernel to be allowed to enter the kernel, as it is from a user process, but it would only be in extremely performance oriented situations that alternatives would be worth pursuing. schedule on the other hand is a plain subroutine call from our point of view, although the kernel performs much trickery for a schedule call. Beyond that, there are thousands of symbols in a normal kernel (about 8200 in the one described here), and you have direct access to all of them if you're interested.

WHO

Rick Hohensee
www.clienux.com

WHERE

ftp://ftp.gwdg.de/pub/linux/install/clienux/interim is where the latest H3sm/H3rL stuff is likely to appear on the net.