There are no positional restrictions, so comments and other statements can start anywhere. Of course, if portability to real Cobol compilers is desired, begin comments in column 7.
Comments allow all characters in them, so feel free to add text in any language as long as the characters fit in an 8-bit (256-character) extended ASCII set (a feature added at the request of German users). I don't know if lex can read characters outside that set.
Cobcy programs are made up of four major divisions: the identification division, containing informational statements; the environment division, defining the devices needed to run the program and stating some properties of the program for maintainers; the data division, holding all the variables; and the procedure division, where all the executable instructions are.
The minimal program for cobcy contains all four divisions and a program-id clause in identification division, which is required. Example:
    IDENTIFICATION DIVISION.
    PROGRAM-ID. EMPTY-PROGRAM.
    ENVIRONMENT DIVISION.
    DATA DIVISION.
    PROCEDURE DIVISION.
    IDENTIFICATION DIVISION.

The division header is written exactly as shown (I will assume you know about case-insensitivity). Only one clause is implemented so far in this division: the program-id clause.
    PROGRAM-ID. program-name. [(COMMON, INITIAL) [PROGRAM]]

where program-name is any legal Cobol identifier denoting the name of the program. Nothing else except comments can be present in the identification division.
    SOURCE-COMPUTER. name [[WITH] DEBUGGING].
    OBJECT-COMPUTER. name [MEMORY [SIZE] integer (WORDS, CHARACTERS, MODULES)].
    SPECIAL-NAMES. device IS device_name.

The source and object computer specifications have no effect on the program and are provided for documentation purposes only. The memory size clause for the object computer is obsolete and should not be used in new programs. The special names section declares aliases for devices. These are not necessary for Cobcy programs, but are provided for compatibility.
Input-output section so far supports only the file-control section. One or more SELECT statements must appear if this section is present.
    FILE-CONTROL.
    SELECT [OPTIONAL] file_desc ASSIGN [TO] device
        {([FILE] STATUS [IS] status_variable,
          ACCESS [MODE] [IS] (SEQUENTIAL, RANDOM, DYNAMIC),
          ORGANIZATION [IS] (SEQUENTIAL, LINE SEQUENTIAL, RELATIVE, INDEXED)
              [(RELATIVE [KEY] [IS] relative_key_name,
                RECORD [KEY] [IS] record_key_name)])}.

Sequential access mode forbids seeking, like reading a character device. Random and dynamic access modes are for block devices with seeking. Although cobcy does implement positioning the file pointer on the next record in random and dynamic access modes, the behaviour is really undefined, so don't rely on it. I may change it too.

Organization denotes the format of the file. Sequential files are simply ASCII listings of records separated by newlines. The default output to printer is formatted as sequential. Line sequential files are the same, except that no separators are present, so that the file is one big line. Tell me if this should be the other way around... Relative and indexed organizations use dBASE III format to store records. The indexes in the indexed files are in dBASE III NDX format.
    DATA DIVISION.
    FILE SECTION.
    FD file_desc [LABEL RECORD [IS] (STANDARD, OMITTED)]
        [VALUE [OF FILE-ID] [IS] (file_name_var, "file name")] .
    level record_name.
    {level variable_name PIC[TURE] [IS] picture VALUE [IS] default_value
        {[(USAGE [IS] (BINARY,COMP[UTATIONAL],DISPLAY,INDEX,PACKED-DECIMAL),
           [SIGN] [IS] (LEADING,TRAILING) [SEPARATE] [sep_char],
           SYNC [(LEFT, RIGHT)],
           JUST[IFIED] [RIGHT])]} . }

The label records clause is, again, an obsolete one. It makes no difference to the file format. The value clause for the FD is the way to specify the file name on disk. It took me a while to figure that out. The item associated with the FD must be a record. I have not yet implemented files with only one variable in them.
The record declaration format is the same as in working-storage section. Level is any positive integer. Unlike in the cobol standard the numbers have no special meaning yet, although I may change that later. Variable name is any legal identifier. Examples of pictures follow:
    XXXXXXXXXX    10 character text string
    9999999999    10 digit number
    X(10)         10 character text string
    99/99/99      6 digit number formatted as date on output
    99B9999       6 digit number with a blank (blanks are not allowed in the picture)
    +9999         Sign is always displayed
    -9999         Sign is drawn only if negative

Tell me to add more if you think this is not enough.
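As a rough illustration of how an edited picture such as 99/99/99 or 99B9999 could be applied to a number on output, here is a minimal C sketch. The function name format_picture and its behaviour are assumptions made for this example only. It is not Cobcy's actual runtime code, and it handles only the 9, B, and pass-through characters such as /.

```c
#include <string.h>

/* Hypothetical helper (not part of Cobcy): fills `out` with `value`
 * laid out according to `pic`. Each '9' takes one digit, right-aligned
 * and zero-padded; 'B' becomes a blank; any other character (such as
 * '/') is copied through. `out` must hold strlen(pic) + 1 bytes. */
void format_picture(const char *pic, unsigned long value, char *out)
{
    size_t len = strlen(pic);
    size_t i = len;

    out[len] = '\0';
    /* Walk the picture right to left so the least significant digit
     * lands in the rightmost '9' position. */
    while (i-- > 0) {
        if (pic[i] == '9') {
            out[i] = (char)('0' + value % 10);
            value /= 10;
        } else if (pic[i] == 'B') {
            out[i] = ' ';
        } else {
            out[i] = pic[i];
        }
    }
}
```

With a buffer of at least nine bytes, format_picture("99/99/99", 123199UL, buf) leaves 12/31/99 in buf, and format_picture("99B9999", 123456UL, buf) leaves 12 3456.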
All the options, like SYNC, JUSTIFIED, etc., are not implemented yet; they just generate a comment in the output C source.
The working-storage section contains variable and record blocks in the same format as described above.
Linkage, communication, and report sections are not implemented; they generate a comment to that effect in the output C code.
    PROCEDURE proc_name.

Starts a new procedure, finishing the last one. This is pretty much the equivalent of a paragraph, and in the output C code the only difference is that a procedure is never executed unless explicitly called.
    INITIALIZE id {, id}
        {REPLACING (ALPHABETIC,ALPHANUMERIC,NUMERIC,ALPHANUMERIC-EDITED,NUMERIC-EDITED) [DATA] value } .

This is another stub.
    REPLACE OFF.

A stub.
    READ (file_desc, variable) [RECORD] AT END clause_list.

Yes, I do support READ for file descriptors; it is actually much simpler than the process for record READ. However, for clarity one should always use the bound record as the argument. The clause list must contain only clauses, no statements!
    IF boolean clause_list {ELSIF boolean clause_list} [ELSE clause_list] .

Although COBOL has no ELSIF keyword, I have added one for efficiency: I cannot translate the ELSE IF construct to a C else if statement, because without block markers it is just too ambiguous. Your ELSE IFs will translate into this kind of thing:

    if (a > b)
        stmts
    else
        if (b > c)
            stmts
        else
            if (c > d)
                stmts
    etc...

which works just fine, but is much less readable in C code. The readability of the compiled output will be greatly enhanced if you take the time to use ELSIF instead.
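To make the readability difference concrete, the sketch below writes the same three-way test in C twice: once nested, in the shape an ELSE IF chain comes out, and once flat, in the shape an ELSIF chain could come out. The function names and the exact layout are assumptions for illustration, not actual Cobcy output. Both functions return the same result for every input.

```c
/* Shape of an ELSE IF chain: each IF nests inside the previous ELSE. */
int classify_nested(int a, int b, int c, int d)
{
    if (a > b)
        return 1;
    else
        if (b > c)
            return 2;
        else
            if (c > d)
                return 3;
            else
                return 0;
}

/* Shape of an ELSIF chain: one flat if / else if ladder. */
int classify_flat(int a, int b, int c, int d)
{
    if (a > b)
        return 1;
    else if (b > c)
        return 2;
    else if (c > d)
        return 3;
    else
        return 0;
}
```

The two forms mean the same thing to the C compiler; only the indentation depth grows with every extra ELSE IF.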
Another thing that deserves explanation is the boolean argument. The conditions are the same as in any other language, but you can also use word substitutes. For example, you can say GREATER THAN instead of >. The tricky part is operators like >=. Here you should use negation on the opposite operator, like NOT LESS for that one. Of course you can still use symbolic notation too. Other operators available are ALPHABETIC, ALPHABETIC-UPPER, and ALPHABETIC-LOWER. Use parentheses where you need them. This will make the code more readable. The conditions can be strung together using the usual connectors, like AND and OR.
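The class tests and the negated comparison can be pictured in C roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and accepting a space in all three class tests follows standard COBOL rather than anything confirmed about Cobcy's implementation.

```c
#include <ctype.h>

/* Hypothetical helpers, not Cobcy's runtime. Each returns 1 if every
 * character of `s` passes the class test, else 0. A space is accepted
 * in all three tests (assumed, following standard COBOL). */
int is_alphabetic(const char *s)
{
    for (; *s != '\0'; ++s)
        if (!isalpha((unsigned char)*s) && *s != ' ')
            return 0;
    return 1;
}

int is_alphabetic_upper(const char *s)
{
    for (; *s != '\0'; ++s)
        if (!isupper((unsigned char)*s) && *s != ' ')
            return 0;
    return 1;
}

int is_alphabetic_lower(const char *s)
{
    for (; *s != '\0'; ++s)
        if (!islower((unsigned char)*s) && *s != ' ')
            return 0;
    return 1;
}

/* "a NOT LESS b" is the same test as "a >= b". */
int not_less(int a, int b)
{
    return !(a < b);
}
```

For instance, is_alphabetic("Hello World") is true while is_alphabetic_upper("Hello") is false, and not_less(3, 3) is true just as 3 >= 3 is.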
This concludes the statement part. The following are the supported clauses. Remember, a clause cannot be used by itself. You have to convert it to a statement by appending a period. Otherwise you can string them together.
    ACCEPT var [{, var2}] [FROM (DATE, DAY, DAY-OF-WEEK, TIME, CONSOLE)]

Reads input from the console, which is considered to be a sequentially organized file (see above). In id lists the commas are optional, but let me stress this: use them! Cobcy may require them later to reduce ambiguity. Only console input has been implemented; the other sources only generate stubs in the C code.
    DISPLAY var [{, var2}] [UPON special_name]

Displays the given variables on the specified device. If no device is specified, output goes to stdout.
    MOVE expression TO var [{, var2}]

expression is any normal arithmetic expression. Records cannot be moved to or from.
    ADD expression TO var [{, var2}] [GIVING result_var]
    SUBTRACT expression FROM var [{, var2}] [GIVING result_var]
    MULTIPLY var [{, var2}] BY expression [GIVING result_var]

Pretty obvious.
    DIVIDE var [{, var2}] BY var [GIVING result_var] [ROUNDED]
        [ON SIZE ERROR clause_list] .

Rounding is turned off by default. Currently rounding is not implemented very well.
    COMPUTE result [ROUNDED] = expression

expression can be any arithmetic expression involving any valid variables.
    GO TO label

Pretty obvious.
    PERFORM paragraph_name [integer TIMES]
        [VARYING variable (FROM start TO end [BY increment],
                           FROM start [BY increment] (UNTIL, WHILE) boolean,
                           AFTER boolean)] .

The limits for iteration can be integers or integer variables.
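As an illustration of the iteration forms, the following C sketch shows loops that PERFORM ... TIMES and PERFORM ... VARYING ... FROM ... TO ... BY could correspond to. The paragraph function and the counter are hypothetical names, the inclusive upper bound and positive increment are assumptions, and the real Cobcy output may look different.

```c
/* Hypothetical stand-in for a translated paragraph; it only counts
 * how many times it has been performed. */
int perform_count = 0;

void paragraph(void)
{
    ++perform_count;
}

/* PERFORM paragraph n TIMES */
void perform_times(int n)
{
    int i;

    for (i = 0; i < n; ++i)
        paragraph();
}

/* PERFORM paragraph VARYING var FROM start TO end BY step,
 * assuming `end` is inclusive and `step` is positive. */
void perform_varying(int *var, int start, int end, int step)
{
    for (*var = start; *var <= end; *var += step)
        paragraph();
}
```

Under these assumptions, perform_varying(&i, 1, 10, 3) runs the paragraph for i = 1, 4, 7 and 10, and leaves i at 13 afterwards.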
    OPEN {(INPUT, OUTPUT, EXTEND, I-O) fd_list [REVERSED] [WITH NO REWIND] }

The modes are self-explanatory. REVERSED and WITH NO REWIND are not implemented.
    CLOSE fd_list [(UNIT, REEL) (FOR REMOVAL, WITH NO REWIND)] [WITH LOCK]

Closes the file. No options are implemented.
    WRITE identifier [FROM var]

identifier can be a variable or a file descriptor. See READ. The source variable is written directly into the file, and if the first identifier is a variable, it is not changed.
    CALL call_list [USING [BY] (CONTENT, REFERENCE) id {, id_list} ]

This clause is not yet implemented.
    STOP RUN
    EXIT PROGRAM

These are equivalent.
Well, that's it for now. I hope I'll add stuff to this when I add stuff to the compiler :)