Output goes to the files named by the variables $output and $listing, which default to ~binary and ~listing.
Individual instructions can be assembled at the shell prompt, since each one is a shell function. This works after you source osimplay (. osimplay), and you may have to set pass=2, and H=0 or whatever. H is the assembly pointer, "." in Gas I believe.
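To make the idea concrete, here is a toy sketch of an "instruction" as a shell function. It is not osimplay's code: emit_byte, nop and demo.bin are names invented for this illustration, and the only things borrowed from the text above are the variable names H, pass and output.

```sh
# Toy sketch only; emit_byte and nop are hypothetical, not osimplay functions.
output=./demo.bin      # assumed file name for this demo
: > "$output"          # start with an empty output file
pass=2                 # pretend we are in the pass that writes bytes
H=0                    # assembly pointer

emit_byte() {          # append each decimal byte value given, advance H
    for b in "$@"; do
        if [ "$pass" -eq 2 ]; then
            # inner printf builds an octal escape such as \220,
            # outer printf emits that single byte
            printf "\\$(printf '%03o' "$b")" >> "$output"
        fi
        H=$((H + 1))
    done
}

nop() { emit_byte 144; }   # 144 is 0x90, the x86 NOP opcode

nop
nop
echo "H is now $H"
```

After the two calls, H can be inspected at the prompt like any other shell variable, and demo.bin holds the two bytes that were emitted.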
osimplay is the latest incarnation of shasm, shell assembler. shasm actually was asmacs first, m4 macros to rename all the INTeL 386 "mnemonics" for Gas. osimplay adds some features of fancier assemblers, such as rudimentary data structures, and composites for things that tend to always be used as more than one instruction in a particular arrangement, such as pattern fills, checksums, min, max and so on. Some of the names have changed too. MOV was "copy" in asmacs and shasm, and now it's =. I would have preferred Forth's !, but that conflicts with the shell, and = is growing on me. Yes, = is a valid Bash user-defined function name.
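Since that last claim surprises people, here is a quick check to try at a Bash prompt. The body of = below is a hypothetical stand-in that only prints its arguments; it says nothing about what osimplay's real = does internally. It assumes Bash in its default mode. Bash's POSIX mode requires function names to be valid identifiers, and other shells may reject the name too.

```bash
# Hypothetical stand-in for illustration; osimplay's real = assembles code.
= () {
    echo "= called with: $*"
}

= A B
```

The shell does not mistake the call for a variable assignment, because an assignment needs a name in front of the = sign.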
osimplay is pure GNU Bash. It does not need dd, cat or similar for anything, although I'm starting to use cat for some things in conjunction with osimplay in the actual build process of a program as a degenerate linker. I believe Linux does the same thing. It turns out cat is very difficult to do as a shell function. I may write one in osimplay and include it. The clip tool included with osimplay is written in osimplay to trim binaries to 0x7c00 for PC boot floppy images. It's a single-purpose dd.
osimplay has a simple ELF mechanism for static Linux executables, and a convenience for calling Linux syscalls, the osimplay Linux word. osimplay also has a lot more systems programming support than shasm, since I am now writing predecessors of pmode kernels with it. I keep one osimplay with the Linux stuff in it and another without, the latter being for target-system programs like kernels, and arbitrary data files like fonts. The only difference between the two though is fewer functions in the target system version.
osimplay has various features in the meta-assembler range, but the style of what is offered is somewhat unique. I do a kind of reentrant procedure that I call entrance procedures, but unlike, say, Randy Hyde's HLA, I don't do the usual flow-control abstractions like for-next or do-while, or even Forth's do-loop. Plain conditional branches (i.e. GOTO) get a bad rap because they don't sell compilers. Neither do I. I want to keep osimplay from developing a troublesome semantic seam between high-level and low-level, and flow-control abstractions are an area where those start to develop. In Gcc that seam is asm("") (which the Plan 9 C compiler does not have an equivalent for, by the way), and in a PD Forth it's CODE words and the runtime interpreter. osimplay has none of that.
A grep "()" osimplay of the target-system version of osimplay will produce something like this for the last part of it, which is where all the osimplay stuff (the most recent layer on shasm) is...
osimplay () { # main(). e.g. osimplay [your source file]
beam () { # Fill a range of an xray with a jump address
cell () { # name/allot a cell-size storage location
copies () { # plural, range-to-range copy, handles overlaps
clump () { # C struct() kinda, data associations namer
leave () { # return from the current reentrant procedure
fill () { # plural, copies A across range @ DI -x86 STOSD
flag () { # assert the zero/sign flags of a register's value
enter () { # how you invoke an osimplay reentrant procedure
maskbyte () { # mask arg down to a byte
maskdual () { # mask arg down to it's lowest-significance 16 bits
match () { # plural, range-range compare, zero flag=match
max () { # max arg1 arg2 leaves greater in arg2
min () { # min arg1 arg2 leaves lesser in arg2
numbering () { # C enum, but not strictly constants and no commas
quad () { # name/allot a 4-byte storage location
quadtohex () { # quad value to ASCII hexadecimal string
range () { # name/allot some count-cell prefixed memory
ring () { # ring-wise increment a strand's index
ringstore () { # ringstore ring reg/lit [arg3 means cell] !A
scale () { # Forth */ with a shift instead of a divide
scan () { # plural, compare A to memory range @ DI until hit/miss
strand () { # name/allot a general array and 8 cells of metadata
sum () { # plural, additive checksum
tag () { # tag ASCII bytes and zeros in a cell
text () { # name/allot some text
xjump () { # indexed jump into an xray execution array
xray () { # name an execution array for beam, yarx and xjump
xsum () { # plural, XOR-ing checksum
yarx () { # finish compembling an xray and it's beams
zero () { # simple convenience to set (a) register(s) to 0
locals () { # help-only word on osimplay familial frame variables

Of those, the ones that I use in boot demos and so on are zero, text, maskdual, cell, copies and clump. Whatever code needs to be reentrant so far, I've done with pushes and pulls (pops). My keyboard driver may use xray and its helpers. Even writing a text editor, I haven't gotten into too much use of the reentrant procedure facilities yet. My rule for parameter-passing by hand is, if you weren't passed it and aren't returning it, and you need to clobber it, push it first, clobber it, and pull it. That is, if a register is being used to pass a value, that register probably doesn't need to be reentrant, i.e. it has probably been changed in the current pass.
I do like osimplay better than C, even for apps. And I use it. Something like clumps is a tiny percentage of the functionality of C structs, for example, but it's the most useful, least problematic part. That's how osimplay is, and it gets that from Forth. Allow or facilitate what you can without creating any barriers. That's not far removed from the philosophy of, say, Perl, but I mean barriers between the shell prompt and the CPU silicon. And unlike most PD Forths, assembled osimplay runs on the metal. Forth is a few-level language. osimplay is a one-level language.
osimplay doesn't have any scoping mechanism yet, and may not get any. If it does get such things, it will probably use shell function nesting to implement it. Bash is very good about reentrance and names. There are local variables affiliated with entrance procedures, but otherwise variable names in osimplay are all global. In fact, they're global to your shell. You can assemble something and check $H and whatnot after the fact, interactively. Things like scoping mechanisms lose that. Also as a matter of simplicity, the locals in entrance procedures are pre-named, a la sa sb sc...
You can easily do conditional assembly within a file somehow with shell functions. You complicate matters massively if you do. osimplay actually uses an extra (third) pass to clean up forward branches, ELF pointers and so on, and that was a quick hack that just happened to work. When you start playing games with transient shell state, don't expect any sympathy. Bell Labs tries to eliminate #if and so on in the Plan 9 sources, for good reason. If you want to alias a name or something, you probably want to do it in your working copy of osimplay itself. "macros" are trivial in osimplay. They are shell functions, but they bring up the aforementioned transient state issues. osimplay is in your shell state, which can be messy. If you want to run osimplay on a major Linux distro, you probably want to set up a very constrained $PATH for it for similar reasons. If you have something in $PATH named, say, HL, your sanity will be sorely challenged.
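One way to look for that kind of trouble before it finds you is to ask the shell whether the short names you care about already resolve to something. The sketch below assumes nothing about osimplay's internals; clashes is a hypothetical helper, and the names passed to it are only examples.

```sh
# Hypothetical helper: report which of the given names already resolve
# to a command, function, alias or builtin in the current shell.
clashes() {
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            echo "$name -> $(command -v "$name")"
        fi
    done
}

# Example names only; substitute the ones your own source uses.
clashes HL A DI
```

Run it in the shell you intend to use, before sourcing osimplay, and trim $PATH until it prints nothing.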
There is also now a set of binary string variables for the values 0 thru 255. $____II__ is decimal 12, for example, binary 00001100. They are much much faster than the binary word, but limited to bytes.
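The naming scheme is simple enough to reproduce in a few lines of shell: an underscore for each 0 bit, a capital I for each 1 bit, most significant bit first. The sketch below generates all 256 variables in a loop. That is an assumption made for the demonstration, not a description of how osimplay defines them, and make_bitnames is a hypothetical name.

```sh
# Hypothetical generator: one variable per byte value, named by its
# bit pattern, with _ for a 0 bit and I for a 1 bit.
make_bitnames() {
    n=0
    while [ "$n" -lt 256 ]; do
        name=""
        bit=128
        v=$n
        while [ "$bit" -gt 0 ]; do
            if [ "$v" -ge "$bit" ]; then
                name="${name}I"
                v=$((v - bit))
            else
                name="${name}_"
            fi
            bit=$((bit / 2))
        done
        eval "$name=$n"      # e.g. ____II__=12
        n=$((n + 1))
    done
}

make_bitnames
echo "$____II__"             # prints 12
```

Once defined, using one costs only a variable expansion, with no arithmetic at assembly time.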
Instruction names in osimplay vary along the terse/verbose axis from =, + and so on to loadmachinestatusdual, INTeL LMSW. Rarely used thingies are verbose. = is somewhere on the order of 25% of most code, and doesn't need to look like MOVW.
I don't support some things I can't see any use for, including the ASCII conversion instructions and the save instructions for IDTR and so on. You loaded it, right? Then you know what it is. If there is a use for AAA, or the affiliated Auxiliary Carry flag, is it worth the obscurity? Maybe once in a MULTICS. So use ab and friends to hand-assemble the little gargoyle.
There are basically two forms of most of the instructions on a 386, the byte form and the cell form. Whether the cell form acts on duals or quads is a matter of system context. osimplay doesn't confront you with a global state variable in every mnemonic. The cell concept will be around awhile, and it smooths over a lot of the quirks of the x86 quite nicely, presumably through IA64.
If you're writing a kernel, try to get a monitor or a Forth going on the target machine as soon as possible, so you can code on the target.
Pre-assemble stuff you're not working on. Pre-assembled fonts and so on can just be cat'ted onto the end of a binary and pointered to with clump. Doing something similar with the front end of a binary might take a bit more finesse.
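The cat-as-linker step can be sketched with stand-in files. Every file name below is made up for the demo, and the "binaries" are just short strings. The one thing to carry over is that the appended piece starts at an offset equal to the size of whatever precedes it, which is the number a pointer to it would need.

```sh
# All file names here are made up for the demo.
printf 'KERNELCODE' > kernel.bin     # stand-in for an assembled program (10 bytes)
printf 'FONTDATA'   > font.bin       # stand-in for a pre-assembled font (8 bytes)

# The font will start at the offset equal to the size of what precedes it.
font_offset=$(wc -c < kernel.bin | tr -d ' ')

cat kernel.bin font.bin > image.bin
echo "font starts at offset $font_offset of image.bin"
```

How that offset is then handed to clump is not shown here, since it depends on clump's own syntax.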
There's a temptation to write osimplay in osimplay, or rework asmacs to be more like osimplay, so osimplay is faster at assembly time. That's a lot of work for a small win. It's all about the resulting binary. The better path is to get a self-extending target system up ASAP and lose osimplay altogether. osimplay was originally to portable-ize my 3-stack language, and it still is, but H3sm might as well become an OS at this point.
Werner Almesberger, I believe it was, said osimplay looks like the "bastard offspring of a drunken encounter between INTERCAL and APL." Thanks so much. He obviously put some effort into that one. What his left hand is saying while he slings bananas at me with his right foot is that osimplay is about two orders of magnitude less twisted than INTeL assembly, and LOOKS like a high-level language. Take it from me, when it comes to obfuscation, the INTERCAL guys are helpless naifs compared to INTeL. But, There's More. The APL/INTERCAL look, strange as it is even to a Forth guy, implies possible portability.
Inter-machine portability is a subjective thing even with something like C. Gcc would be a nightmare to port to a 2-stack machine (a silicon Forth engine), for example. In fact, it wouldn't be Gcc any more. It would structurally be starting over. All the internal AI stuff Gcc does assumes a single stack, semantically in memory, i.e. really an array. Within the realm of register machines, there are two levels. Gcc also, last I looked, and C itself, don't know much about systems instructions, registers or concepts like "process".
If you're talking about applications programming on register machines, making osimplay portable looks doable. That is, osimplay could be as portable as C. The 386 plurals, i.e. the "string" ops, which x86 does very differently than other machines, can be hidden behind stock operation names like fill and so on. The 386 core applications registers are a subset of those on most competing machines. RISCs don't have the 386 2-register memref, so don't use it. CISCs do have mem-to-mem addressing for most ops, and x86 doesn't, but that won't cause conflicts coming FROM x86. One of these days, somebody with a 68040 may do to 386 osimplay what the Plan 9 assembler does, which is provide a macro for op-mem-mem on 386. Et Cetera.
I heard anecdotally that hand-tweaked Pentium-Pro code runs about 60% faster on a PPro than plain 386 code. Is it worth it? No. Not as a stock feature in osimplay it's not. My osimplay will stay straight 386. I personally won't even do floats in osimplay. Chuck Moore says Forth */ (times-divide, a scaling primitive) makes floating-point unnecessary. Mostly, yeah. And osimplay scale uses a shift instead of a divide :o)
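The arithmetic behind that last remark is easy to try at the shell prompt. The sketch below is only the idea in shell arithmetic, not osimplay's scale, which is an assembly composite; the name scale_sketch and its argument order are assumptions. The divisor is restricted to a power of two, so the divide becomes a right shift.

```sh
# scale_sketch value multiplier shift  ->  (value * multiplier) >> shift
# Hypothetical name and argument order, for illustration only.
scale_sketch() {
    echo $(( ($1 * $2) >> $3 ))
}

scale_sketch 100 3 2    # 100 * 3/4, prints 75
scale_sketch 640 5 3    # 640 * 5/8, prints 400
```

Multiplying first and shifting afterwards keeps the intermediate product at full precision, which is the point of */ in the first place.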
Rick Hohensee