Raku Programming/Lazy Lists and Feeds
Laziness
[edit | edit source]In most traditional computing systems, data objects are allocated to a set size and their values filled in to the spaces in memory. In C for instance, if we declare an array int a[10]
, the array a
will be a fixed size with enough space to store exactly 10 integers. If we want to store 100 integers, we need to allocate a space for 100. If we want to store a million, we need to allocate an array that size.
Let's consider the problem where we want to compute a multiplication table, a two-dimensional array where the value of a given cell in the array is the product of the two indices of it. Here's a simple loop that could generate this table with factors up to N:
int products[N][N];
int i, j;
for(i = 0; i < N; i++) {
for(j = 0; j < N; j++) {
products[i][j] = i * j;
}
}
Creating this table can take a while to perform all N2 operations. Of course, once we've initialized the table it's very fast to look a value up in it. Another thing to consider here is that we end up calculating more values then we are ever going to use, so that's wasted effort.
Now, let's look at a function to do the same thing:
int product(int i, int j) {
return i * j;
}
This function doesn't require any startup time to initialize its values, however it does require additional time with every call to compute the result. It's faster to start up than the array, but takes more time for each access than the array does.
Combining these two ideas gives us the lazy list.
Lazy Lists
[edit | edit source]Lazy lists are like arrays with a few major differences:
- They aren't necessarily declared with a predefined size. They can be any size, even infinitely long.
- They don't calculate their values until required, and only calculate what's needed when it's needed.
- Once their values have been calculated, they can be stored for fast lookup.
The opposite of lazy lists are eager lists. Eager lists calculate and store all their values immediately, like the arrays in C. Eager lists cannot be infinitely long because they need to store their values in memory, and computers don't have infinite memory.
Raku has both types of lists, and they are handled internally without intervention by the programmer. Lists which can be lazy are treated lazily. Lists which cannot be lazy are eagerly computed and stored. Lazy lists give a benefit in terms of storage space and calculation overhead, so Raku tries to use them by default. Raku also provides a number of constructs that can be used to support laziness and improve performance of list calculations.
Ranges
[edit | edit source]We've already seen ranges. Ranges are lazy by default, which means all the values in the range aren't necessarily calculated when you assign them to an array:
my @lazylist = 1..20000; # Doesn't calculate all 20,000 values
Because of their laziness, ranges can even be infinite:
my @lazylist = 1..Inf; # Infinite values!
Iterators
[edit | edit source]Iterators are special data items that move through a complex data object one element at a time. Think about the cursor in a text editor program; the cursor reads one keypress, inserts the character at its current position, and then moves to the next position to await the next key press. In this way, a long array of characters can be inserted one at a time without you, the editor, having to move the cursor manually.
In the same way, iterators in Raku traverse through arrays and hashes automatically, keeping track of your current location in the array automatically so you don't have to. We've already seen a use of iterators in our earlier discussion on loops, although we didn't call them "iterators" by name. Here are two loops that perform an identical function:
my @x = 1, 2, 3, 4, 5;
loop(my int $i = 0; $i < @x.elems; $i++) {
@x[$i].say;
}
for @x { # Same, but much shorter!
$_.say;
}
The first loop iterates through the @x
array manually using the $i
variable to keep track of the current location, and using the $i < @x.length
test to make sure we haven't reached the end. In the second loop, the for
keyword creates an iterator for us. The iterator automatically keeps track of our current position in the array, automatically detects when we've reached the end of the array, and automatically loads each subsequent value into the $_
default variable. It's worth mentioning that we can make this shorter still by using a few Raku idioms:
.say for @x;
What are Iterators?
[edit | edit source]Iterators are any object that implements the Iterator
role. We'll talk about roles a little bit later, but it will suffice for now to say that a role is a standard interface that other classes can participate in. Because they can be any class, so long as it has a standard interface, iterators can do anything we define them to do. Iterators can traverse arrays and hashes easily, but specially-defined types could also iterate over trees, graphs, heaps, files, and all sorts of other data structures and concepts.
If a data item has an associated iterator type, it can be accessed through the .Iterator()
method. This method is called internally most of the time by structures like the for
loop, but you can get access to it if you really need to.
Feeds
[edit | edit source]Feeds give a nice graphical way to show where data is moving in complex assignment statements. Feeds have two ends, a "blunt" end and a "sharp" end. The blunt end connects to a data source which is a list of values. The sharp end connects to a receiver take can take at least one element at a time. Feeds can be used to send data from right-to-left or left-to-right, depending on the direction that the feed is pointing.
my @x <== 1..5;
say @x # 1, 2, 3, 4, 5
@x ==> @y ==> print # 1, 2, 3, 4, 5
say @y # 1, 2, 3, 4, 5
Layered feeds move data from one to the other. However feeds with two points append onto the last item in the feed chain:
my @x = 1..5;
@x ==> map {$_ * 2} ==> @y;
say @x; # 1, 2, 3, 4, 5
say @y; # 2, 4, 6, 8, 10
@x ==>>
@y ==> @z;
say @z # 1, 2, 3, 4, 5, 2, 4, 6, 8, 10
Gather and Take
[edit | edit source]We can write our own kinds of iterators using the gather
and take
keywords. These two keywords act a lot like the pointy blocks that we've seen previously. However, unlike pointy blocks, gather/take can return values. Like pointy blocks, gather/take can be combined with loops to form custom iterators.
gather
is used to define a special block. The code of that block can perform an arbitrary calculation and return a value with take
. Here's an example:
my $x = gather {
take 5;
}
say $x; # 5
This isn't so useful by itself. However, we can now combine it with loops to return a long list of values:
my @x = gather for 1..5 {
take $_ * 2;
}
say @x # 2, 4, 6, 8, 10
The take
operator performs two actions: It takes a capture of the value it's passed and returns that as one of the results of the gather
block, and it returns the value that it's been passed for storage. We can easily combine this behavior with a state
variable to use values recursively.
my @x = gather for 1..5 {
state $a = $_;
$a = take $_ + $a;
}
say @x; # 2, 4, 7, 11, 16