Tuesday 2 February 2010

Distributed Computing - Parallel Computation

Why would you even want to use hundreds or even thousands of computers? What would you do with such a large number of networked computers?

The simplest answer is that many of the applications we use, such as search engines and file sharing, need good response times. By increasing the number of computers working in parallel we can reduce the execution time, giving users faster responses to their queries.

  • Parallel computing – utilizing multiple CPUs within a single computer

  • Distributed computing – utilizing several computers connected through a network




Although the two share many of the same problems, distributed computing is more concerned with communication (as computers may need to exchange data) and synchronisation issues. We have to ensure that communication between any two machines is quick and accurate; otherwise we lose the advantage of working in parallel.

Parallelized applications


Unfortunately, some applications cannot be parallelized no matter how many machines we use. Each application has its own parallelization requirements; the following example illustrates this.



Application A can be computed in parallel because variables X, Y and Z can be computed independently; that is, none of them depends on the result of another variable. Application B, however, cannot be parallelized, because each variable must wait for the value of the previous one. Even with multiple machines the order of execution would still be sequential, so there would be no gain in execution speed.
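The difference between the two applications can be sketched in Python. This is only an illustration: the function `step` and the input values are hypothetical stand-ins for whatever work each variable requires, and threads are used simply to show the structure, not to measure a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def step(value):
    # Hypothetical unit of work; stands in for any computation.
    return value * 2 + 1


def application_a(a, b, c):
    # X, Y and Z each depend only on their own input,
    # so the three computations can be submitted at the same time.
    with ThreadPoolExecutor(max_workers=3) as pool:
        x, y, z = pool.map(step, [a, b, c])
    return x, y, z


def application_b(a):
    # Each variable needs the previous result, so the order is forced:
    # Y cannot start before X is known, and Z cannot start before Y.
    x = step(a)
    y = step(x)
    z = step(y)
    return x, y, z
```

In `application_b` there is nothing to hand to a second machine: at every point only one computation is ready to run.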

Execution Time


An application's execution time can be represented as a graph showing which instructions can be executed in parallel. Although parallelization can speed up computation, it comes with a trade-off: the communication it requires adds delay. It is therefore not always a good idea to choose parallel over sequential execution.













Sequential Execution Time = 3 + 1 + 2 + 3 = 9

Parallel Execution Time = 3 + 5 + 2 + 1 + 3 = 14

The parallel execution time is determined by the longest path through the graph. In the above example that path takes 14, against 9 for sequential execution, so adding parallelism does not improve this application and sequential execution is the better choice.
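The longest-path rule can be sketched in Python. The graph below is hypothetical: it is not the post's original figure, and the task names, edges and communication delays are assumptions chosen only so that the totals match the 9 and 14 quoted above.

```python
from functools import lru_cache

# Assumed computation cost of each task (hypothetical graph).
COST = {"A": 3, "B": 1, "C": 2, "D": 3}

# Assumed dependencies; each edge carries a communication delay.
# A delay of 0 means both tasks are assumed to run on the same machine.
EDGES = {"A": {"B": 0, "C": 5}, "B": {"D": 0}, "C": {"D": 1}, "D": {}}


def sequential_time(cost):
    # One machine runs every task in turn, with no communication.
    return sum(cost.values())


def parallel_time(cost, edges):
    # Longest path through the graph, counting task costs and edge delays.
    @lru_cache(maxsize=None)
    def longest_from(task):
        tails = [delay + longest_from(nxt) for nxt, delay in edges[task].items()]
        return cost[task] + max(tails, default=0)

    return max(longest_from(task) for task in cost)
```

With these assumed values `sequential_time(COST)` gives 9 and `parallel_time(COST, EDGES)` gives 14: the path A, C, D costs 3 + 5 + 2 + 1 + 3, and the two communication delays outweigh the time saved by running B and C side by side.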
