Wednesday, September 3, 2008

Looping in distributed programming

for (init_expression; loop_condition; loop_expression)
        program statement;

while (loop_condition)
        program statement;

The simple ‘for’ and ‘while’ loops above have so permeated so many programming languages that it's hard to imagine a programmer that hasn't written them innumerable times. So much so that most programmers would not bother to question its function, it is after all intuitive and self explanatory.[1]

However, as we started to look at parallel distributed execution, we realised the dynamics of these constructs are pivotal. This essay will dare to delve deeper into the simple loop construct to question how to make it perform. In particular I will focus on loop parallelisation.[2]

Historically the vast majority of loop constructs have been written to execute in a single processor, and that is exactly how the creators of the original construct intended us to use it. However, over the last decade we have seen the emergence of some large data sets, not the least of these being the Internet. This provides many classes of problems that require us to loop over such large data sets that a single processor, no matter how fast, is simply not going to return a result within tolerable real-time constraints.
Simultaneously, processor vendors are moving to an increasingly “multicore” strategy. Intel recently told developers to prepared for “thousands of cores”.[3]

There has historically been a happy marriage between the way humans think about problems and the way computers execute them – both, even if multitasking, essentially boil down to linear execution. As such the programmer writing an algorithm typically thinks and writes one step after another. But when it comes to distributed computing, suddenly we are asking computers to work in ways many people cannot – at least not alone. Programming for distributed computing expects the single programmer to derive and (somehow) write an algorithm as if an undefined number of machines would be collaboratively executing it.

With this in mind, it becomes abundantly clear that our timeless loop construct is inadequate for distributed computing. We either need a construct that informs the computer network what can be distributed, or we need an interpreter that can determine at run time what can and can't be distributed. Of course simply because something can be distributed doesn't necessarily mean it should – for example if the data set is small the distribution could be all overhead. However, there is no doubt there are many problems for which this isn’t the case.

Let us first look at the different types of loops to help us understand which can and can't be distributed.

Independent Loops

The first I like to call the Independent Loop. Such loops will execute the same logic/function over an array of data. It might be for example 'scoring' the relevance of a set of data that has been collected. The key to this type of looping is that the only data being modified is the data being looped (or the corresponding element in and identically sized 'result' dataset). An example of this type of looping is:

int data[] = {5, 4, 1, 6, 3};
for (int i = 0; i++; i < data.length)
        data[i] *= data[i];

The advantage of Independent Looping is that it can be maximally distributed. We could comfortably split 'the data' into any number of arbitrary parts and execute the same multiplication 'for' loop across each part. Concatenating the resulting data sets would leave us with an identical result as to if we had executed the data set in the single algorithm.

This type of looping is essentially a form or superset of Map-Reduce as discussed by Jon in his article on "Map-Reduce and Parrallelism" posted in July, 2008.

Accumulative Loops

The next loop type I call the Accumulative Loop. An Accumulative Loop is similar to the Independent Loop in that the function is executed on a single dataset one element at a time. However, rather than (or as well as) modifying the dataset element, the algorithm is modifying at least one variable outside the loop. What is key in this type of loop is that the sequence of modification to the external element isn’t important. This is an important distinction as if the outcome is dependent on the order of execution, then the algorithm is not deterministic in a distributed environment.
An example of a use of this kind of loop is to sum the values of a dataset:

int data[] = {5, 4, 1, 6, 3};
long sum = 0;
for (int i = 0; i++; i < data.length)
        sum += data[i];

As with the Independent Loop this kind of loop can be distributed. Care obviously needs to be made to ensure the various versions of the externally updated variables are ultimately reunited.

Dependent Loops

The last type of loop I'd like to introduce is the Dependent Loop. This is one where the result of one iteration is dependent on the last, or the outcome of the loop is dependent on the order of execution. The dependency may be data, or control related. A data dependent example is:

int data[] = {5, 4, 1, 6, 3};
for (int i = 1; i++; i < data.length)
        data[i] += data[i-1];

One such control related dependency may even be the decision whether to execute the next iteration. This is common in while loops, for example the conventional algorithm for deriving the greatest common devisor of two numbers u and v:

while (v != 0)
        temp = u % v;
        u = v;
        v = temp;

When the result of the next iteration of the loop relies upon the former we cannot distribute the execution. Instead, if possible, the code should be rewritten to be Independent or Accumulative.
An interesting observation is that the boundary case of infinite looping is a class of Dependent Loop. This is because the Independent and Accumulative Loops iterate over a predefined dataset. The infinite loop is a particularly dangerous boundary case in distributed programming as if released undetected it could literally consume all of the resources. Most applications will have an infinite loop somewhere, generally as a message processing construct (equivalent to the while(true) { } in your Java Runnable) - but these are implictly dependent because generally these loops will be consuming some kind of resource (i.e. a message queue or similar).

Of interest to me was that all three types of algorithms can be written using 'for' loops (or even 'while' loops), and hence the construct is actually too flexible to be useful in distributed programming. If for example we have:

for (...) { blah.doSomething() }

It is very difficult to determine whether this loop is Dependent, Accumulative or Independent. This may be even more difficult in a dynamic functional language such as Ruby or Lisp as you might be using function pointers that are unknown at compile time. [4]

Instead in our development we have introduced new constructs in our distributed language that allow the programmer and the computer to cleanly separate these types. The programmer must specify their intention at design time and this significantly simplifies the compiler's task of determining which loops will be distributed and which will not.

[1] The Do loop was first introduced in FORTRAN 66 in 1966. See:
[2] For details on other loop optimisation techniques take a look at
[3] See
[4] This same issue is referenced in Jon’s Map-Reduce and Parallelism article.


GM said...

Most of your article seems pretty sane, but the introductory sentence is just nonsense:

‘for’ and ‘while’ loops above have so permeated the very fibre of functional programming

While there _are_ functional programming languages out there with these constructs, I'd hardly call them the "fibre" of FP - in fact I assert that the fibre of FP is referentially transparent and certainly doesn't have "program statement"s at all.

Maybe you meant "imperative" or "procedural" programming?

Back to your main thrust though, would you say that accumulative loops are dependent loops with an associative property? Should we have a mechanism for programmers to provide associativity hints to the compiler or RTS?

Doug said...

Hi Greg, thanks for your comment. I agree, ‘fibre’ and focus on functional languages weren’t my intention. Correction in place.

We actually debated whether Accumulative Loops are a sub class of Independent Loops. The Dependent Loop must be dependent on other iterations of the loop - this is the distinctive property that make such loop types difficult to distribute.

A mechanism for the association is definitely needed - in our case so far for Accumulative Loops we have been custom building the functions and providing the associative properties on the input. The definitions have mainly helped us understand generalised properties of a type of loop, and helped us understand what we can and can't distribute. Associativity hints may be a very good way of generalising this construct. We have already been using compiler hints to allow the programmer more input over what is and isn't distributed.


macgeejackyl said...

The developers of Spinomenal approached the graphics responsibly, so the sensation of being at a tennis match is wholly created. For example, in the background, find a way to|you probably can} hear the ball hitting the racket when hit. 카지노 사이트 We’ve got it all—everything from the high-stakes fashion of blackjack, the fast-paced excitement of craps, to the joys and strategy of Pai Gow poker.

jacentgaa said...

Any well-rounded on line casino boasts a vibrant video poker part, and that is the case with the best on-line casinos in South Korea. With titles like Joker Pro, Aces & Eights, Deuces Wild and Jacks or Better, have the ability to|you probably can} take pleasure in selection of|quite so much of|a wide 먹튀사이트 먹튀프렌즈 range of} top-quality video poker video games with partaking gameplay and some wonderful payout potential. All the best on-line on line casino websites in South Korea offer up a list of desk video games to strive your hand at, too. If you favor video games of ability, you'll be able to|you probably can} strive staples like Poker, Roulette and Blackjack. There are plenty of versions additionally, like La Partage Roulette, Sic Bo, Atlantic City Blackjack, Punto Banco, Single Deck Blackjack, Oasis Poker heaps of|and plenty of} more. The traditional welcome bonus is 100 percent, leading to a bonus of a lot as} €100; this value can multiply the player’s initial budget.

salterearnhardt said...

However if you want to|if you would like to} progressive jackpots, for example, it’s a characteristic which is just obtainable if you play for actual. That mentioned, free spins aren’t the identical as ready to|with the flexibility to|having the flexibility to} play free slots or the free model of the sport. When you play the free model, is not a|there is not any} chance to 토토사이트 win actual cash – even when you reach the bonus rounds, it’s just a simulation of actual gameplay. So, you can to|you probably can} redeem a cell casino bonus or play free casino video games directly in your cell browser.