An Introduction to Computer Science with AutoIt


This page is still a work in progress.

Before starting this series, it is assumed you have a firm grasp of programming in the AutoIt language and a firm grasp of general aspects of computers. During this series, I will attempt to augment your programming knowledge with an informed understanding of some elements of computer science. I will show you a practical and efficient approach to solving computational and implementational problems, as well as provide you with a range of common algorithms and methodologies that will enrich your programming. So let's get started …

What is computer science?

Wikipedia says:

Computer science is the scientific and practical approach to computation and its applications....
A computer scientist specializes in the design of computation systems.

As such, computer science has two elements:

  • Approach to problem solving
  • Theoretical understanding

During this series, I will focus on both of these elements, and present them in such a way that they can be used to inform your programming. I will try to keep the theory component to only what is required, to avoid creeping out the less mathy readers with all the abstraction. That said, the theory does matter: abstracting in this way will broaden your ability to think, reason, and solve computational problems.

Computer Science :: A Practical Approach

If you asked a computer scientist and an ordinary programmer to come up with a solution to the same problem, you would likely get two entirely different pieces of code. These variations stem from the considerations each individual has, and the process by which they arrive at their solution.

A typical programmer's considerations:

  1. A clean, working solution
  2. Good code readability

A computer scientist's considerations:

  1. A working solution, with known limitations
  2. Efficiency / scalability
  3. Code readability


In this way, the problem-solving process can be broken down into a few distinct phases:



1. Understand the problem. No, really.

A significant amount of time should be spent stepping through and investigating the properties of the problem. Only by doing this is it possible to realise the limitations of any specific solution, and to have the epiphanies that reveal optimizations which greatly improve the efficiency of your solution. For example, consider the task of writing a function that finds the n-th number of the Fibonacci sequence.

The Fibonacci sequence looks something like: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34 …

The n-th number in the sequence is the sum of the two numbers before it: F(n) = F(n-1) + F(n-2), with F(0) = 0 and F(1) = 1.

Knowing this, a typical programmer might write a function like this:

Func Fib($n)
	If $n = 0 Then Return 0
	If $n = 1 Then Return 1
	Return Fib($n - 1) + Fib($n - 2)
EndFunc
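
A quick call to check it works (my own example value, assuming the function above is defined):

ConsoleWrite(Fib(9) & @CRLF) ; prints 34, the tenth value if we count from Fib(0) = 0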


When the function is called, it repeatedly calls itself to find the previous two Fibonacci numbers, which in turn call the function again, and so on (a technique known as recursion). A computer scientist, however, will investigate the nature of the problem. For instance, he may find that a solution like the one above, given any value for n, needlessly repeats calculations it has already done:

fib(5)
1. fib(4) + fib(3)
2. (fib(3) + fib(2)) + (fib(2) + fib(1))
3. ((fib(2) + fib(1)) + (fib(1) + fib(0))) + ((fib(1) + fib(0)) + fib(1))
4. (((fib(1) + fib(0)) + fib(1)) + (fib(1) + fib(0))) + ((fib(1) + fib(0)) + fib(1))

And the computer scientist would use this understanding to his advantage when coming up with a solution.
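
To make that duplication concrete, here is a small sketch of my own (not part of the original example) that counts how many times the naive function calls itself for a given n; the count grows roughly exponentially as n increases:

Global $iCalls = 0

Func FibCounted($n)
	$iCalls += 1 ; bump the global call counter on every invocation
	If $n = 0 Then Return 0
	If $n = 1 Then Return 1
	Return FibCounted($n - 1) + FibCounted($n - 2)
EndFunc

$iCalls = 0
ConsoleWrite("Fib(20) = " & FibCounted(20) & " needed " & $iCalls & " calls" & @CRLF)
$iCalls = 0
ConsoleWrite("Fib(25) = " & FibCounted(25) & " needed " & $iCalls & " calls" & @CRLF)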



2. Evaluating solutions based on efficiency / scalability

Before solutions can be evaluated, one needs a benchmark to evaluate them against, which brings us to complexity theory. If you'll excuse the fancy name, complexity is really just a rough measurement of the speed and efficiency of some solution/algorithm.

Normally, one talks about time complexity, i.e. how fast your solution is, but memory complexity may also be a consideration.

The measurements are in reference to n, where n is some number that represents the input to our program. Depending on the problem and the algorithm, n can be the number of elements in an array, the size of the input, the index of the number we are trying to find, etc.

There are three ways of looking at complexity: best case, average case, and worst case. Let's look at an example:

Func Find($array, $val) ; wrapped in a function so the Return statements are valid
	For $x = 0 To UBound($array) - 1
		If $array[$x] = $val Then Return $x
	Next
	Return -1 ; not found
EndFunc

In the worst case, the element we are searching for is not in the array at all, so the loop must check every element before it can give up. Thus, we say it has a worst-case time complexity of n, where n is the number of items in the array.

The shorthand for writing the worst case is called Big-O notation, and it is written like this: O(n).

If we were to graph y = n, we would get a straight line. Thus, we can say the performance of this solution is directly proportional to the number of elements in the array. This is a good thing, as it means that performance does not degrade at an increasing rate for large values of n.

In the best case, the element we are looking for is the first element of the array, so only one iteration is ever needed. Best-case bounds are conventionally written with Big-Omega notation, so this is Ω(1).

I leave it as an exercise for the reader to determine the average-case time complexity. Θ(???)

In computer science, we are almost always concerned only with the worst-case scenario, and with the shape of the graph when we plot running time (y) against input size n (x). For instance, O(n^2) looks like this:

http://sqlinthewild.co.za/wp-content/uploads/2011/11/On2_thumb.png

From the shape of the graph, it is clear that each additional unit of n costs an increasing amount of time as n becomes large. Thus, it is clear that such an algorithm doesn't scale well for large values of n.
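
As a rough, machine-dependent illustration (my own sketch, with input sizes I picked arbitrarily), the snippet below times a single loop (linear work) against a nested loop (quadratic work) using AutoIt's TimerInit() and TimerDiff(). The exact millisecond figures will differ on every PC, but the quadratic timings should grow much faster each time n doubles (the largest quadratic run may take a few seconds):

Local $aSizes[3] = [500, 1000, 2000] ; example input sizes
Local $hTimer, $iLinearMs, $iQuadMs, $sum, $n

For $i = 0 To UBound($aSizes) - 1
	$n = $aSizes[$i]

	; Linear work: one pass over n items, roughly O(n)
	$hTimer = TimerInit()
	$sum = 0
	For $x = 1 To $n
		$sum += $x
	Next
	$iLinearMs = TimerDiff($hTimer)

	; Quadratic work: a loop inside a loop, roughly O(n^2)
	$hTimer = TimerInit()
	$sum = 0
	For $x = 1 To $n
		For $y = 1 To $n
			$sum += 1
		Next
	Next
	$iQuadMs = TimerDiff($hTimer)

	ConsoleWrite("n = " & $n & "  linear: " & Round($iLinearMs, 2) & " ms,  quadratic: " & Round($iQuadMs, 2) & " ms" & @CRLF)
Next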

Additionally, when writing the notation for complexity, we only care about the highest power of n. Say you had some function that took 44n + 4n^2 + 88 steps; you would write it as O(n^2), as all other terms become insignificant as n gets really large. Moreover, since we only care about the shape of the graph, we can also drop the constant factors in front of n.

In this way, you should evaluate the efficiency of your potential solutions, and be able to project how each solution will behave when heavily loaded (i.e. for large values of n). You should then pick the one that strikes the best balance between complexity and ease of implementation. A large part of this series will be spent detailing how to implement various algorithms, which have varying complexities. Often there will be many possible choices, and you will need to pick the one best for you.



3. Pick, implement, and test your solution

The final stage of development. Going back to our Fibonacci function example, our computer scientist may decide to store the results of calculations (like the answer to fib(2)) to avoid unnecessary recursion and repeated calculations. This is a technique known as dynamic programming; it greatly improves the complexity of this algorithm, and it remains fast even for large values of n.
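
As one possible sketch of that idea (my own memoised version, not necessarily the exact code the author had in mind), previously computed values can be cached in a static array so that each Fibonacci number is only ever calculated once:

Func FibFast($n)
	Static $aMemo[100] ; cache of results computed so far; assumes $n < 100
	If $n = 0 Then Return 0
	If $n = 1 Then Return 1
	If $aMemo[$n] Then Return $aMemo[$n] ; already calculated, no need to recurse again
	$aMemo[$n] = FibFast($n - 1) + FibFast($n - 2)
	Return $aMemo[$n]
EndFunc

ConsoleWrite(FibFast(40) & @CRLF) ; returns 102334155 almost instantly, whereas the naive version above would take a very long time

An equally valid approach is a simple loop that builds the sequence from the bottom up, which avoids recursion altogether.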

Conclusion

This concludes part one of this series. In the next part, An Introduction to Computer Science with AutoIt pt2, we start to go through conventional algorithms and solutions for sorting, and in future parts move on to data structures and searching algorithms.