Loops and Iterators
Until now we have only made use of the simple loop constructs, while! and until!. These are similar to the loop constructs found in other languages, aside from the terminal '!'. However, that terminal '!' hides something very special about these loop constructs in Sather; a programmer can define such looping constructs as easily as he or she could define a standard routine. Once defined, they may be used as conveniently as the while! and until! iterators.
To a first approximation, iterators are like streams that can "yield" different values on successive loop iterations. When an iterator has no more values to yield, it "quit"s. This, in turn, terminates the enclosing loop.
Iterators are defined as class features, just like routines, but iterator names must terminate with an '!'. When an iterator is called, it executes the statements in its body in order. If it executes a yield statement, control is returned to the caller. In this, the iterator is similar to a coroutine whose state remains persistent over multiple calls. Subsequent calls on the iterator resume execution with the statement following the yield statement. If an iterator executes quit or reaches the end of its body, control passes immediately to the end of the innermost enclosing loop statement in the caller and no value is returned.
3.1 Using iterators
3.1.1 loop statements
Iteration is done with loop statements, used in conjunction with iterator calls. In the absence of iterator calls, a loop statement simply executes an infinite loop. The difference between an iterator call and a routine call is that the iterator call "remembers" its state after it yields a value and, on subsequent calls, it simply resumes execution. The "lifetime" of an iterator usually includes several calls within a particular loop. Hence, an execution state is maintained for each iterator call textually enclosed within a loop - this execution state will be used to "remember" the state of the iterator between invocations. When a loop is entered, the execution state of all enclosed iterator calls is initialized. When an iterator is encountered, control is transferred to the iterator until the iterator "yields" control. Just as a routine may provide a value when it returns, so too an iterator may provide a value when it yields.
| sum: INT := 0;
loop sum := sum + 1.upto!(10); end;
#OUT + sum + '\n'; -- Prints sum of integers from 1 to 10 |
|
Instead of yielding control back to the enclosing loop, an iterator may also terminate or quit, which terminates the enclosing loop.
Note that each loop may contain more than one iterator call, thus providing much more flexibility than conventional languages. When any of the iterators terminates, the whole loop terminates, and execution continues at the next statement after the loop.
3.1.2 Built-in iterators
The until!, while! and break! iterators are built-in. They have the standard definitions of until, while and break in other languages and may occur anywhere in the loop body. while! expressions are iterator calls which take a single boolean argument that is re-evaluated on each iteration. They yield when the argument is true and quit when it is false. until! expressions are iterator calls which take a single boolean argument that is re-evaluated on each iteration. They yield when the argument is false and quit when it is true. break! expressions are iterator calls which immediately quit when they are called.
| sum:INT := 0; i: INT := 0;
loop while!(i < 5);
sum := sum + i;
i := i + 1;
end;
#OUT+ "Sum="+sum+'\n'; -- Prints out Sum=10 |
|
The break! iterator can be used to terminate a loop at any time. We illustrate this with the bubble sort routine show below, which terminates the first time a pass through the data occurs with no order change.
| bubble_sort(a:ARRAY{INT}) is
loop
done: BOOL := true;
i:INT := 0; -- Loop until the "break!" is encountered
loop until!(i = (a.size-2));
if a[i] > a[i+1] then
done := false
swap(inout a[i], inout a[i+1]);
end;
i := i + 1;
end;
if done then break!; end;
end;
end; |
|
The 'swap' routine is as we have described earlier.
| swap(inout x:INT, inout y:INT) is
tmp:INT := x; x := y; y := tmp;
end; |
|
Note that the above 'bubblesort' routine could easily be rewritten to only use until!.
In addition to the built-in iterators, there are many commonly used iterators in the INT class that provide for interation over a range of values. For instance, the iterator upto! yileds successive integer values.
| sum:INT:= 0; loop sum := sum+ 10.upto!(20); end; |
|
The upto! iterator returns successive integers from 10 upto 20, inclusive. Below, are examples of a few other common iterators
| i:INT :== 10; sum:INT := 0;
loop 11.times!; sum := sum + i; i := i + 1; end; |
|
The 'times' iterator yields a certain number of times. For iterating over a range with a certain stride, use the 'step' iterator
| sum: INT := 0;
loop sum := sum + 18.step!(11,2); end;
-- The first argument is the number of iterations, 11 in this case
-- the second argument is the stride |
|
. The following example counts 11 even numbers starting at 18
The 'step_upto!' iterator is similar, but instead of specifying a number of iterations, it specifies the maxium value to be reached
| sum: INT := 0;
loop sum := sum + 18.step_upto!(40,2); end; |
|
. The following loop is equivalent to the preceeding one.
3.2 Defining Iterators
3.2.1 yield statements
Iterator definitions are similar to routine definitions, except that we need to indicate when control should be transferred back to the calling point. In a routine, this transfer of control is indicated by a return statement, which terminates the routine. An iterator, however, can return control in two different ways. It can either
- Temporarily yield control to the callig point, ready to continue the next time it is encountered. This yield of control is done by a yield statement
- Permanently yield control to the calling loop, terminating the loop in the process. This termination of the enclosing loop is achieved by a quit statemen or by reaching the end of the iteratort.
The yield must return a value (of the appropriate type), if the iterator has a return value.
| range!(min, max:INT):INT is
i:INT := min;
loop until!(i > max);
yield i; --
i := i + 1;
end;
end;
|
|
This iterator can be used to add up all the numbers in a particular integer range
| sum: INT := 0; loop sum := sum+range!(1,10); end; |
|
3.2.2 Explicitly leaving an iterator using quit
When an iterator has yielded as many times as needed, it can either reach the end of it's statement list or explicitly call a quit statement. quit statements are used to terminate loops and may only appear in iterator definitions. No value is returned from an iterator when it quits. No statements may follow a quit statement in a statement list. The following definition of 'range!' is equivalent to the preceeding definition:
| range!(min, max:INT):INT is
x:INT := min;
loop if x > max then quit end;
yield x;
x := x + 1;
end;
end; |
|
3.2.3 Control flow within an iterator
The following figures illustrate the control flow between an interator and its calling loop.
When the iterator is first called, control goes into the iterator and then returns to the outer loop, when the iterator yields in step [7]
After the first yield, control continues in the outer loop until the iterator is encountered again in step [11] and control is again transferred to the iterator, right after the point of the yield, in step [12]
![](DescriptionX2Eiterator-1-image-3.gif)
The above sequence will continue until the if statement is true and the quit statement is encountered in the iterator. Control is then transferred to the end of the enclosing loop. The iterator calling context keeps track of the internal state of the iterator from the last yield.
3.2.4 The once argument mode
One problem with the above definition of 'range!' is that the arguments to the function will be evaluated each time through the loop. Consider the following loop
| sum:INT := 0;
max:INT := 5;
loop sum := sum + range!(3,max);
max := max+2;
end; |
|
This, somewhat silly, example will go into an infinite loop, since the argument 'max' increases each time through the loop.
Iterator argument are hot by default. This means that the arguments will be re-evaluated and passed to the iterator each time through the loop. When the arguments to the iterator are constant, it is not important whether they are re evaluated or not. However, in many cases it is important to ensure that the argument is only evaluated the first time through the loop.
This happens to once-arguments. Arguments which are marked with the mode 'once are only evaluated the first time they are encountered during a loop execution. Thus, the correct definition of the 'range' iterator is:
| range!(once min, once max:INT):INT is
i:INT := min;
loop until!(i > max);
yield i
i := i + 1;
end;
end; |
|
Note that 'once' arguments are only marked at the point of definition, not at the point of call. Thus, invoking the loop will look the same as before
| sum:INT := 0; loop sum := sum + range!(3,5); end; |
|
The 'self' parameter (i.e. the object on which the iterator is being called) is always a once parameter.
| i: INT := 5;
loop #OUT+i.upto!(11)!+' '; i:=1; end;
-- The above loop prints out 5 6 7 8 9 10 11 |
|
In the above example, though the value of 'i' changes the second time through the loop, the change is ignored - the first value of 'i' is used.
The following more complex example will sum up some of the elements of the first row although the variable row will contain different rows in consecutive loop iterations.
| loop -- Sum up some of the elements of the first row!
row := matrix.row!;
sum := sum + row.elt!;
-- row is only evaluated at the first iteration!
end; |
|
3.2.5 out and inout argument modes
Yield causes assignment to out and inout rguments in the caller i.e. these arguments are assigned each time when the iterator yields..
| range!(once min, once max:INT, out val:INT) is
i: INT := min;
loop until!(i > max);
val := i;
yield;
i := i + 1;
end;
end; |
|
Which may be used by:
| sum:INT := 0;
loop res:INT;
range2!(3,5, out res);
sum := sum + res;
end;
#OUT+sum+'\n'; -- Prints out 12 |
|
Note that no assignment to out and inout arguments takes place when an iterator quits.
3.2.6 pre and post conditions in iterators
The behavior of pre- and post- conditions in iterator definitions is a natural extension of their behavior in routine definitions. The pre clause must be true each time the iterator is called and the post clause must be true each time it yields. The post clause is not evaluated when an iterator quits.
3.2.7 Argument evaluation in iterators
At a more technical level, when an iterator is first called in a loop, the expressions for self and for each once argument are evaluated left to right. Then the expressions for arguments which are not once (in or inout before the call, out or inout after the call) are evaluated left to right. On subsequent calls, only the expressions for arguments which are not once are re-evaluated. self and any once arguments retain their earlier values.
3.2.8 Points to note
Iterator usage
- Iterators may only be called within loop statements.
- once mode arguments are only marked at the point of definition, not at the point of call, unlike out and inout arguments. out and inout arguments cannot be once arguments as well.
- Each textual instance of an iterator in a loop is distinct. The following loop prints out [2,2] [3,3] [4,4] (and not [2,3])
| loop
a: INT := range!(2,4);
b: INT := range!(2,4);
#OUT+ "["+a +","+ b+"] ";
end; |
|
- Not all iterators reach their end or quit - execution may be terminated because some other iterator in the same loop quit. See the next point.
- Iterator instances in a conditional statement are evaluated every time they are encountered. The following loop prints out [2,2] [3,2] [4,3] and then is terminated when the first iterator quits, even though the second iterator is not yet complete
| b: INT := 0;
loop a: INT := range!(2,4);
if a.is_even then b := range!(2,4); end;
#OUT+ "["+a +","+ b+"] ";
end; |
|
- The expressions for self and for once arguments may not themselves contain iterator calls. (Such iter calls would not be useful anyway, since they would only execute their first iteration.) Thus, the following code is illegal, even though the 'times!' iterator is a perfectly valid iterator on integers.
| loop a: INT := range!(3,4).times!; end; |
|
- Iterators may call themselves recursively as routines do. As iterators are normally supposed to yield more than once, one should not forget to define a loop within the iterator to catch all of these results
| class BINARY_TREE is
attr left,right: SAME; -- subtrees
attr data: INT;
elt!: INT
is
if void(self) then quit end;
yield data;
loop yield left.elt! end; -- yield data in the left subtree.
loop yield right.elt! end;
end; -- elt!
end; -- class BINARY_TREE |
|
.
- If an iterator in complex expression quits, the surrounding expression might not be fully evaluated.
| loop #OUT + "(" + c.elt! + ")\n" end; |
|
- When the iterator elt! terminates the surrounding loop, an opening bracket has already been printed. The expression producing the matching closing bracket will not be evaluated, hence the algorithm will always print a bogus closing bracket in the end. The standard solution looks as follows:
| loop #OUT + ( "(" + c.elt! + ")\n" ); end; |
|
- The extra paratheses force the whole line to be evaluated first. As this evaluation will be aborted by the quit of the iterator the printing evaluation will not happen for the last iterator call.
Iterator definitions
- Iterator names always end with an exclamation mark '!'.
- Yield is not permitted within a protect statement (see the Chapter on Exceptions)
- Iterators enjoy the same access options as routines. Just as with routine definitions, iterator definintions may be marked private.
- Iterator overloading and conformance rules are the same as those for routines.
- An iter argument may have only one mode. Thus, it is neither possible nor meaningful to have 'once inout' or 'once out' arguments.
3.3 Iterator Examples
Some of the following examples make use of arrays, which will be introduced in the chapter on Parametrized Classes
Because they are so useful, the 'while!', 'until!' and 'break!' iterators are built into the language. Here's how 'while!' could be written if it were not a primitive
| while!(pred:BOOL) is
-- Yields as long as 'pred' is true
loop
if pred then yield
else quit
end
end
end. |
|
The built-in class 'INT' defines some useful iterators. Here's the definition of 'upto!'. Unlike the argument 'pred' used above, 'i' here is declared to be 'once'; when 'upto!' is called, the argument is only evaluated once, the first time the iterator is called in the loop.
| upto!(once i:SAME):SAME is
-- Yield successive integers from self to `i' inclusive.
r::=self;
loop
until!(r>i);
yield r;
r:=r+1
end
end; |
|
To add up the integers 1 through 10, one might say
| sum: INT := 0; loop sum := sum + 1.upto!(10) end |
|
Or, using the library iterator 'sum!' like this. 'x' needs to be declared (but not initialized) outside the loop, so its value is available after the loop terminates.
| x: INT; loop x:=INT::sum!(1.upto!(10)) end |
|
Some of the most common uses of iters are with container objects. Arrays, lists, sets, trees, strings, and vectors all have iterators to yield all their elements. Here we print all the elements of some container 'a'
| a: ARRAY{INT} := |1,2,7|;
loop #OUT + a.elt!.str + '\n' end |
|
This doubles the elements of array 'a'
| loop a.set!(a.elt! * 2) end |
|
This computes the dot product of two vectors 'a' and 'b'. There is also a built-in method 'dot' to do this. 'x' needs to be declared (but not initialized) before the loop.
| loop x:=sum!(a.elt! * b.elt!) end |
|
Separating elements of a list
When printing out the elements of a container, or other kinds of lists, it is usually appropriate to insert a separator between all the elements (but, of course, not after the last element). There is a convenient iterator in the string class that does exactly this:.
| a: ARRAY{INT} := |1,2,3|;
loop #OUT + ",".separate!(a.elt!.str); end;
-- Prints out 1,2,3 |
|
The separate! iterator is called on the string which you wish to use to separate components of the list. In this case, the list elements will be separated by a comma. The definition of this iterator is as shown below
| class STR is
...
separate!(s: STR): STR is
-- On the first iteration just outputs `s', on later iterations
-- it outputs self followed by `s'.
yield s.str; loop yield self + s.str end
end; |
|
Note that the argument to the iterator is not a once argument, and will be evaluated each time the iterator is called.