Wednesday, 9 September 2009

Computation - Time & Space Complexity

Time is a precious thing


Imagine, if you will, a children’s puzzle game – the Tower of Hanoi. With this, you have to move – say 8 disks – from one post to another. Each disk has to be moved – one at a time – onto another post, provided that the disk – if it exists – that it lands on is not smaller. Continuing on with the 8 disks, to solve such as simple puzzle will take a child 2n-1moves. So for 8 disks, this will be 255 moves – that’s quite a lot.



Now, say we have a given puzzle that consists of 64 disks! This number is so ridiculously high you will have given up long before you even move the disks 3000 times. Can you work out how many moves it will take? The answer is at the end. :P

OK, so maybe a children’s game was a bad example to start off with :( . Let’s take a look at a more realistic problem – one that will really get your head spinning.

This problem is known as the Travelling Salesman. A salesman has to find the shortest route to visit a number of different towns, each one just once, and returning back to the starting point. He is supplied with the distance between every pair of towns. For example, suppose there are five cities to be visited, London, Manchester, Birmingham, Cambridge and Oxford, with pair wise distances (approximate) as in the table below. For simplicity, we assume that the distance between any two cities is the same in reverse, i.e. the given relation is symmetric.



















































London



Birmingham



Manchester



Cambridge



Oxford



London



0











Birmingham



118



0









Manchester



201



86



0







Cambridge



62



99



162



0





Oxford



62



66



158



101



0



OK, so now let’s have a look at a brute force approach. For 5 cities, the normal number of routes would be 24, however, since the distances coming back in this case are the same as going their, then for this example it is only 12 routes. These are:























































Route



Length


London – Manchester – Birmingham – Cambridge – Oxford - London

549


London – Manchester – Birmingham – Oxford – Cambridge – London

516


London – Manchester – Cambridge – Birmingham – Oxford – London

590


London – Manchester – Cambridge – Oxford – Birmingham – London

648


London – Manchester – Oxford – Birmingham – Cambridge – London

586


London – Manchester – Oxford – Cambridge – Birmingham – London

677


London – Birmingham – Manchester – Cambridge – Oxford – London

529


London – Birmingham – Manchester – Oxford – Cambridge – London

525


London – Birmingham – Cambridge – Manchester – Oxford – London

599


London – Birmingham – Oxford – Manchester – Cambridge – London

566


London – Cambridge – Manchester – Birmingham – Oxford – London

438


London – Cambridge – Birmingham – Manchester – Oxford – London

467



It’s pretty clear to see that the shortest route is London – Cambridge – Manchester – Birmingham – Oxford – London, with a length of 438 miles.

Fair enough, so maybe 5 cities wasn’t that many. But, what if we had 10 cities? The number of routes here would be a staggering 3,628,800! What about 15 cities?

As the number of cities increases, it becomes evident that we need some kind of formulae. This formula is of course: (n-1)! / 2. Please remember that the (!) is factorial. Here, the ‘n’ is the number of cities in question. So, going back to the 15 cities, we get the answer of 45,589,145,600 routes!!! That will take up an amazing amount of time.

These two examples both have an example of computational cost, mainly in terms of time. Theoretically speaking, computing the solutions for a large problem is an almost impossible task. This is because it would take up way to much time.




Counting Instructions


Let’s take a quick look at a simple Java program:





int sum = 0;

int i = 0;

while (i<n)

{

sum = sum + i;

i = i + 1;

}

So, lets run through this with a value of n=2. As we run through, it is clear that there will be some assignments. Here, we have the first 2 assignments – before the while loop – and then we get the 2 assignments inside of the body of thewhile loop. Convince yourself that the body of the loop will be executed twice – when i=0 and when i=1. SO we have assignments like this:

  • int sum = 0;           x1

  • int i = 0;                x1

  • sum = sum + i;       x2

  • i = i + 1;                 x2


This means we have 1+1+2+2 = 6 total assignments. Don’t think this is all though. Now we have to consider the number of comparisons that are made for the while loop. The comparisons are for (i<n) which will look like:

  • (0<2)    -           yes

  • (1<2)    -           yes

  • But we still need to compare for when i=2.

  • (2<2)    -           no


In total we have 3 comparisons. This gives us a total of 9 assignments and comparisons. As like with the tower of Hanoi, and the traveling salesmen, this fragment of code will increase exponentially in the number of executions of assignments and comparisons as the value of n increases. For example, if we make, say, n=20. Then we get:

  • assignments    =          42

  • comparisons    =          21

  • total executed =          63


What we now need to do is to have a look at how long it takes to execute this simple piece of code. So we implement some time commands in, and we have:





for (int n = 100; n <= 10000000; n*=10)

{

long start = System.nanoTime();

long sum = 0;

int i = 0;

while (i<n)

{

sum = sum + i;

i = i + 1;

}

long stop = System.nanoTime();

System.out.println(n + “ iterations took “ + (stop-start) + “ns”);

}

The output for this was:





100 iterations took 2000ns

1000 iterations took 16000ns

10000 iterations took 160000ns

100000 iterations took 890000ns

1000000 iterations took 2872000ns

10000000 iterations took 28946000ns

100000000 iterations took 293872000ns

We can, of course, use this figures to estimate how much real-time, say, an individual Assignment / comparison takes. The real-time for when ‘n’ was 108 was found to be 293872000ns. This therefore gives the cost of executing the loop body once is 2.9ns. We assumed a total of three assignments and comparisons per iteration. Hence the real-time cost is just under 1 nanosecond per assignment (or comparison).




Big-O Notation


Given two functions f(x) and g(x) over the real numbers we say that f(x)O(g(x)) as ‘x’ tends to infinity, if and only if there is some positive real number ‘c’ and real number ‘x0’ such that |f(x)| ≤ c|g(x)| for all xx0. For example, let f(x) = 3x2 + 2x and g(x)x2. Then:

|3x2 + 2x|        = 3x2 + 2x for x ≤ 1

≤ 3x2 + 2x2

= 5x2

= 5|x2|

Hence, by taking the constant ‘c’ as 5, and ‘x0’ as 1, we have established that f(x)=O(g(x)).

Now let h(x) = 3x2 - 2x + 4 and g(x) = x2. Then

|3x2 - 2x + 4|   ≤ 3x2 + 2x + 4 for x ≤ 1

≤ 3x2 + 2x2 + 4x2

= 9x2

= 9|x2|

Hence by choosing ‘c’ as 9 and ‘x0’ as 1, we have established that h(x) = O(g(x)). Again

Importantly, the above gives an indication of how one can justify that given any unary polynomial function f(x) of degree n , we have that f(x) = O(xn).




Space Utilization


So far, our examples have only been considering one aspect of performance, namely the time. Even though the computers of today have gigabytes of memory, the use of memory is still another crucial resource that we should be interested in measuring. Consider the following program fragment:





int max = -1;

for (int index = 0; index<n; index++)

{

if (a[index] > max)

{

max = a[index];

}

}

Again, rather than counting the actual memory used (in bytes, kb, Mb, Gb, or what ever), let us abstract a little. Let each scalar variable, such as ‘max’ or ‘i’ be counted as 1. Let each array variable be counted as the number of elements in the array times the size of an individual element. Hence for the above example program fragment, we have a space count of ‘2+n’, assuming that ‘n’ denotes the number of elements in the array a (indeed, ‘n’ is what we considered as the problem size). We have therefore built an expression that gives the pseudo space utilization of the program fragment in terms of the problem size. It is clear from this example that the program fragment has a space utilization that is O(n), i.e. the use of space grows linearly with the size of the problem.

Overall, programs, even as small as these, take up alot of time and memory!

2 comments:

  1. [...] examples you may have – hopefully? – seen in a previous post – if you haven’t here it is: http://computersciencesource.wordpress.com/2009/09/09/comp10042-computation-time-space-complexity/ [...]

    ReplyDelete
  2. [...] Sadly I am a little too lazy to type it all out again. Yep, that’s right, i have already wrote a post on Time and Space Complexity If your mind it a little rusty, and you need too kick those gears back into motion, then here you go: http://computersciencesource.wordpress.com/2009/09/09/year1-computation-time-space-complexity/ [...]

    ReplyDelete