Taking notes: Complexity of Algorithms (using Java)

These are just my own notes taken from http://introcs.cs.princeton.edu/java/41analysis/.

Just note that the algorithms from quadratic on down in the table below are not the kind you would enjoy when handling large data structures.

Complexity / Description / Examples

1 (constant): The algorithm does not depend on the input size; it executes one instruction a fixed number of times.
Examples: arithmetic operations (+, -, *, /, %), comparison operators (<, >, ==, !=), variable declaration, assignment statement, invoking a method or function.

log N (logarithmic): The algorithm gets slightly slower as N grows. Whenever N doubles, the running time increases by a constant.
Examples: bits in the binary representation of N, binary search, insert or delete into a heap or BST.

N (linear): The algorithm is optimal if you need to process N inputs. Whenever N doubles, then so does the running time.
Examples: iterate over N elements, allocate an array of size N, concatenate two strings of length N.

N log N (linearithmic): The algorithm scales to huge problems. Whenever N doubles, the running time more (but not much more) than doubles.
Examples: quicksort, mergesort, FFT.

N^2 (quadratic): The algorithm is practical for use only on relatively small problems. Whenever N doubles, the running time increases fourfold.
Examples: all pairs of N elements, allocate an N-by-N array.

N^3 (cubic): The algorithm is practical for use only on small problems. Whenever N doubles, the running time increases eightfold.
Examples: all triples of N elements, N-by-N matrix multiplication.

2^N (exponential): The algorithm is not usually appropriate for practical use. Whenever N doubles, the running time squares!
Examples: number of N-bit integers, all subsets of N elements, discs moved in the Towers of Hanoi.

N! (factorial): The algorithm is worse than exponential. Whenever N increases by 1, the running time increases by a factor of N.
Examples: all permutations of N elements.
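
To make the doubling behavior concrete, here is a minimal Java sketch (my own illustration, not from the course notes). The first loop is linear in N, while the nested loops that touch all pairs of elements are quadratic, so doubling N roughly doubles the first count and quadruples the second.

    public class GrowthDemo {
        public static void main(String[] args) {
            int n = Integer.parseInt(args[0]);   // problem size N

            // Linear: one pass over N elements.
            long linearOps = 0;
            for (int i = 0; i < n; i++) {
                linearOps++;
            }

            // Quadratic: all pairs of N elements.
            long pairOps = 0;
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    pairOps++;
                }
            }

            // Doubling N roughly doubles linearOps and quadruples pairOps.
            System.out.println("N = " + n
                + ", linear ops = " + linearOps
                + ", pair ops = " + pairOps);
        }
    }

Running it with N and then 2N (for example, java GrowthDemo 1000 and then java GrowthDemo 2000) shows the two counts growing at very different rates.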



Estimating memory usage.

 To pay attention to the cost, you need to be aware of memory usage. You probably are aware of limits on memory usage on your computer (even more so than for time) because you probably have paid extra money to get more memory. Memory usage is well-defined for Java on your computer (every value will require precisely the same amount of memory each time that you run your program), but Java is implemented on a very wide range of computational devices, and memory consumption is implementation-dependent. For primitive types, it is not difficult to estimate memory usage: We can count up the number of variables and weight them by the number of bytes according to their type.

type       bytes
boolean    1
byte       1
char       2
int        4
float      4
long       8
double     8

type       bytes
byte[]     16 + N
boolean[]  16 + N
char[]     16 + 2N
int[]      16 + 4N
double[]   16 + 8N
int[][]    4N^2 + 20N + 16
double[][] 8N^2 + 20N + 16

type              bytes
object reference  4
String            40 + 2N
Charge            32
Charge[]          36N + 16
Complex           24
Color             12

  • Primitive types. For example, since the Java int data type is the set of integer values between -2,147,483,648 and 2,147,483,647, a grand total of 2^32 different values, it is reasonable to expect implementations to use 32 bits to represent int values.
  • Objects. To determine the memory consumption of an object, we add the amount of memory used by each instance variable to the overhead associated with each object, typically 8 bytes. For example, a Charge.java object uses 32 bytes (8 bytes of overhead, plus 8 bytes for each of its three double instance variables). A reference to an object typically uses 4 bytes of memory. When a data type contains a reference to an object, we have to account separately for the 4 bytes for the reference and the 8 bytes overhead for each object, plus the memory needed for the object's instance variables.
  • Arrays. Arrays in Java are implemented as objects, typically with two instance variables (a pointer to the memory location of the first array element and the length). For primitive types, an array of N elements uses 16 bytes of header information, plus N times the number of bytes needed to store an element.
  • Two-dimensional arrays. A two-dimensional array in Java is an array of arrays. For example, the two-dimensional array in Markov.java uses 16 bytes (overhead for the array of arrays) plus 4N bytes (references to the row arrays) plus N times 16 bytes (overhead from the row arrays) plus N times N times 8 bytes (for the N double values in each of the N rows) for a grand total of 8N^2 + 20N + 16 ~ 8N^2 bytes.
  • Strings. A String uses a total of 40 + 2N bytes: object overhead (8 bytes), a reference to a character array (4 bytes), three int values (4 bytes each), plus a character array of size N (16 + 2N bytes). Note that when working with substrings, two strings may share the same underlying character array.
Typically the JVM allocates memory in 8-byte blocks, so a String of size 1, 2, 3, or 4 would consume the same amount of memory (48 bytes). Float and Double each use 16 bytes, as would a user-defined data type containing just a single double instance variable.
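
As a rough illustration of how to apply these estimates, here is a small Java sketch (my own, with hypothetical method names) that turns a few of the formulas from the tables above into code. It assumes the model used in these notes (4-byte references, 16 bytes of array overhead, 8 bytes of object overhead); an actual JVM may allocate differently.

    public class MemoryEstimate {

        // int[] of length n: 16 bytes of array overhead + 4 bytes per element.
        static long intArrayBytes(long n) {
            return 16 + 4 * n;
        }

        // double[][] of size n-by-n: 16 bytes for the array of arrays,
        // 4n bytes for the row references, 16n bytes of row overhead,
        // and 8 bytes for each of the n*n double values.
        static long doubleMatrixBytes(long n) {
            return 8 * n * n + 20 * n + 16;
        }

        // String of length n: 8 (object overhead) + 4 (reference to char array)
        // + 12 (three int fields) + 16 + 2n (char array) = 40 + 2n.
        static long stringBytes(long n) {
            return 40 + 2 * n;
        }

        public static void main(String[] args) {
            System.out.println("int[1000]          ~ " + intArrayBytes(1000) + " bytes");
            System.out.println("double[1000][1000] ~ " + doubleMatrixBytes(1000) + " bytes");
            System.out.println("String of 1000 chars ~ " + stringBytes(1000) + " bytes");
        }
    }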

Perspective.

 Good performance is important. An impossibly slow program is almost as useless as an incorrect one. In particular, it is always wise to have some idea of which code constitutes the inner loop of your programs. Perhaps the most common mistake made in programming is to pay too much attention to performance characteristics. Your first priority is to make your code clear and correct. Modifying a program for the sole purpose of speeding it up is best left for experts. Indeed, doing so is often counterproductive, as it tends to create code that is complicated and difficult to understand. C. A. R. Hoare (a leading proponent of writing clear and correct code) once summarized this idea by saying that "premature optimization is the root of all evil," to which D. Knuth added the qualifier "(or at least most of it) in programming."
Perhaps the second most common mistake made in developing an algorithm is to ignore performance characteristics. Users of a surprising number of computer systems lose substantial time waiting for simple quadratic algorithms to finish solving a problem, even though linear or linearithmic algorithms are available that are only slightly more complicated and could therefore solve the problem in a fraction of the time. When we are dealing with huge problem sizes, we often have no choice but to seek better algorithms.


Improving a program to make it clearer, more efficient, and elegant should be your goal every time that you work on it. If you pay attention to the cost all the way through the development of a program, you will reap the benefits every time you use it.
