Taking notes: Complexity of Algorithms (using Java)

These are just my own notes taken from http://introcs.cs.princeton.edu/java/41analysis/.

Just note that the algorithms from quadratic on down in the table below are not the kind you would enjoy when handling large data structures.

Complexity / Description / Examples

1 (constant): The algorithm does not depend on the input size; it executes one instruction a fixed number of times.
Examples: arithmetic operations (+, -, *, /, %), comparison operators (<, >, ==, !=), variable declaration, assignment statement, invoking a method or function.

log N (logarithmic): The algorithm gets slightly slower as N grows. Whenever N doubles, the running time increases by a constant.
Examples: bits in the binary representation of N, binary search, insert or delete into a heap or BST.

N (linear): The algorithm is optimal if you need to process N inputs. Whenever N doubles, then so does the running time.
Examples: iterate over N elements, allocate an array of size N, concatenate two strings of length N.

N log N (linearithmic): The algorithm scales to huge problems. Whenever N doubles, the running time more (but not much more) than doubles.
Examples: quicksort, mergesort, FFT.

N^2 (quadratic): The algorithm is practical for use only on relatively small problems. Whenever N doubles, the running time increases fourfold.
Examples: all pairs of N elements, allocate an N-by-N array.

N^3 (cubic): The algorithm is practical for use only on small problems. Whenever N doubles, the running time increases eightfold.
Examples: all triples of N elements, N-by-N matrix multiplication.

2^N (exponential): The algorithm is not usually appropriate for practical use. Whenever N doubles, the running time squares!
Examples: number of N-bit integers, all subsets of N elements, discs moved in the Towers of Hanoi.

N! (factorial): The algorithm is worse than exponential. Whenever N increases by 1, the running time increases by a factor of N.
Examples: all permutations of N elements.
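
To make the doubling behavior concrete, here is a minimal Java sketch (my own illustration, not from the course notes). The first loop is linear in N, while the nested loops that touch all pairs of elements are quadratic, so doubling N roughly doubles the first count and quadruples the second.

    public class GrowthDemo {
        public static void main(String[] args) {
            int n = Integer.parseInt(args[0]);   // problem size N

            // Linear: one pass over N elements.
            long linearOps = 0;
            for (int i = 0; i < n; i++) {
                linearOps++;
            }

            // Quadratic: all pairs of N elements.
            long pairOps = 0;
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    pairOps++;
                }
            }

            // Doubling N roughly doubles linearOps and quadruples pairOps.
            System.out.println("N = " + n
                + ", linear ops = " + linearOps
                + ", pair ops = " + pairOps);
        }
    }

Running it with N and then 2N (for example, java GrowthDemo 1000 and then java GrowthDemo 2000) shows the two counts growing at very different rates.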



Estimating memory usage.

 To pay attention to the cost, you need to be aware of memory usage. You probably are aware of limits on memory usage on your computer (even more so than for time) because you probably have paid extra money to get more memory. Memory usage is well-defined for Java on your computer (every value will require precisely the same amount of memory each time that you run your program), but Java is implemented on a very wide range of computational devices, and memory consumption is implementation-dependent. For primitive types, it is not difficult to estimate memory usage: We can count up the number of variables and weight them by the number of bytes according to their type.

type       bytes
boolean    1
byte       1
char       2
int        4
float      4
long       8
double     8

type       bytes
byte[]     16 + N
boolean[]  16 + N
char[]     16 + 2N
int[]      16 + 4N
double[]   16 + 8N
int[][]    4N^2 + 20N + 16
double[][] 8N^2 + 20N + 16

type              bytes
object reference  4
String            40 + 2N
Charge            32
Charge[]          36N + 16
Complex           24
Color             12

  • Primitive types. For example, since the Java int data type is the set of integer values between -2,147,483,648 and 2,147,483,647, a grand total of 2^32 different values, it is reasonable to expect implementations to use 32 bits to represent int values.
  • Objects. To determine the memory consumption of an object, we add the amount of memory used by each instance variable to the overhead associated with each object, typically 8 bytes. For example, a Charge.java object uses 32 bytes (8 bytes of overhead, plus 8 bytes for each of its three double instance variables). A reference to an object typically uses 4 bytes of memory. When a data type contains a reference to an object, we have to account separately for the 4 bytes for the reference and the 8 bytes overhead for each object, plus the memory needed for the object's instance variables.
  • Arrays. Arrays in Java are implemented as objects, typically with two instance variables (a pointer to the memory location of the first array element and the length). For primitive types, an array of N elements uses 16 bytes of header information, plus N times the number of bytes needed to store an element.
  • Two-dimensional arrays. A two-dimensional array in Java is an array of arrays. For example, the two-dimensional array in Markov.java uses 16 bytes (overhead for the array of arrays) plus 4N bytes (references to the row arrays) plus N times 16 bytes (overhead from the row arrays) plus N times N times 8 bytes (for the N double values in each of the N rows) for a grand total of 8N^2 + 20N + 16 ~ 8N^2 bytes.
  • Strings. A String uses a total of 40 + 2N bytes: object overhead (8 bytes), a reference to a character array (4 bytes), three int values (4 bytes each), plus a character array of size N (16 + 2N bytes). Note that when working with substrings, two strings may share the same underlying character array.
Typically the JVM allocates memory in 8-byte blocks, so a String of size 1, 2, 3, or 4 would consume the same amount of memory (48 bytes). Float and Double each use 16 bytes, as would a user-defined data type containing just a single double instance variable.
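
As a rough illustration of how to apply these estimates, here is a small Java sketch (my own, with hypothetical method names) that turns a few of the formulas from the tables above into code. It assumes the model used in these notes (4-byte references, 16 bytes of array overhead, 8 bytes of object overhead); an actual JVM may allocate differently.

    public class MemoryEstimate {

        // int[] of length n: 16 bytes of array overhead + 4 bytes per element.
        static long intArrayBytes(long n) {
            return 16 + 4 * n;
        }

        // double[][] of size n-by-n: 16 bytes for the array of arrays,
        // 4n bytes for the row references, 16n bytes of row overhead,
        // and 8 bytes for each of the n*n double values.
        static long doubleMatrixBytes(long n) {
            return 8 * n * n + 20 * n + 16;
        }

        // String of length n: 8 (object overhead) + 4 (reference to char array)
        // + 12 (three int fields) + 16 + 2n (char array) = 40 + 2n.
        static long stringBytes(long n) {
            return 40 + 2 * n;
        }

        public static void main(String[] args) {
            System.out.println("int[1000]          ~ " + intArrayBytes(1000) + " bytes");
            System.out.println("double[1000][1000] ~ " + doubleMatrixBytes(1000) + " bytes");
            System.out.println("String of 1000 chars ~ " + stringBytes(1000) + " bytes");
        }
    }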

Perspective.

 Good performance is important. An impossibly slow program is almost as useless as an incorrect one. In particular, it is always wise to have some idea of which code constitutes the inner loop of your programs. Perhaps the most common mistake made in programming is to pay too much attention to performance characteristics. Your first priority is to make your code clear and correct. Modifying a program for the sole purpose of speeding it up is best left for experts. Indeed, doing so is often counterproductive, as it tends to create code that is complicated and difficult to understand. C. A. R. Hoare (a leading proponent of writing clear and correct code) once summarized this idea by saying that "premature optimization is the root of all evil," to which D. Knuth added the qualifier "(or at least most of it) in programming."
Perhaps the second most common mistake made in developing an algorithm is to ignore performance characteristics. Users of a surprising number of computer systems lose substantial time waiting for simple quadratic algorithms to finish solving a problem, even though linear or linearithmic algorithms are available that are only slightly more complicated and could therefore solve the problem in a fraction of the time. When we are dealing with huge problem sizes, we often have no choice but to seek better algorithms.


Improving a program to make it clearer, more efficient, and elegant should be your goal every time that you work on it. If you pay attention to the cost all the way through the development of a program, you will reap the benefits every time you use it.
