=============================================================== Sum3 Simple Implementation (Serial) =============================================================== .. highlight:: c :linenothreshold: 0 For an input array of arbitrary integers (that is, we make no assumptions as to the range of the integers), *Sum3* can be solved in :math:`O(n^2)` time using relatively simple algorithms. The purpose of this chapter is to emphasize the point that parallelism is not always the best solution to a problem. Sometimes, making algorithmic improvements (resulting in better theoretical complexity) can provide better speedups than parallelism; especially if you don't have the hardware resources to take advantage of a parallel algorithm. Using a Hash Table =================== The easiest approach to code is to take advantage of hash tables. Hash tables are implemented in many languages, so they are cheap from a coding perspective, and they offer :math:`O(1)` time for inserting an element and determining if an element is in the hash table ... under ideal circumstances. The approach is to: #. Load all numbers in a hash table #. Generate all possible pairs of numbers #. Check if the negation of the sum of each pair of numbers is in the hash table. However, this approach suffers from some drawbacks. First, hash table operations can degrade to linear time under certain conditions (although this can be managed rather easily). Second, care must be taken to ensure that unique triples are identified. For example, if the input array was ``-2, 0, 4``, one must ensure that ``-2+-2+4 = 0`` does not get counted as a valid triple (since -2 only appears once in the input). Using a Sorted Input Array =========================== A slightly more complicated :math:`O(n^2)` algorithm requires that the input array be sorted. The algorithm contains a loop that will traverse the input array from first to last. Assume a sorted input array ``array`` containing ``n`` elements. We will first find all triples that add up to 0 and involve the first element in ``array``. We set ``a=array[0]``. A triple is then formed with ``b=array[0+1]`` and ``c=array[n-1]``. If ``a+b+c > 0``, then we need to reduce the sum to get it closer to ``0``, so we set ``c=array[n-2]`` (decrease the largest number in the triple to get a smaller sum). If ``a+b+c < 0``, then we need to increase the sum to get it closer to 0, so we set ``b=array[0+2]`` (increase the smaller number to get a larger sum). This process continues until ``b`` and ``c`` refer to the same array element. At the end of this process, all possible triples involving the first array element have been computed. To compute all triples involving the first array element, we took a pair of numbers from the array. After examining the pair, one number was excluded from further consideration in future triples involving the first array element. Therefore, for ``n`` array elements, we will construct ``n-2`` triples due to the following: the first array element is used in every triple, leaving us to create pairs of the remaining ``n-1`` elements; the first pair uses two elements and eliminates one from further consideration; all following pairs will reuse 1 element form the previous pair, and eliminate one element from consideration. Thus, computing all triples involving a single element requires :math:`O(n)` time. Repeating this for ``n`` elements in the input array requires :math:`O(n^2)` time. A C++ implementation of this algorithm follows: :: #include #include using namespace std; int getRandInt( int *dataVec, int dataSize ) { // load up a vector with random integers int num; for( int i = 0; i < dataSize; i++ ) { // integers will be 1-100 num = rand() % 100+1; if( rand( ) % 2 == 0 ) { // make some integers negative num *= -1; } dataVec[i] = num; } } int main(int argc, char * argv[] ) { int dataSize = 0; if( argc < 2 ) { std::cerr << "usage: exe [num of nums] [optional seed value]" << std::endl; exit( -1 ); } { std::stringstream ss1; ss1 << argv[1]; ss1 >> dataSize; } if( argc >= 3 ) { std::stringstream ss1; int seed; ss1 << argv[2]; ss1 >> seed; srand( seed ); } else { srand( 0 ); } // create a data vector int *data = new int[ dataSize ]; // load it up with random data getRandInt( data, dataSize ); // sort the data sort( data, data + dataSize ); // do the Sum3 computation. O(n^2) int count = 0; int a,b,c, sum; // array elements int j,k; // array indices for (int i = 0; i < dataSize-2; i++){ a = data[i]; j = i+1; k = dataSize-1; while( j < k ) { b = data[j]; c = data[k]; sum = a+b+c; if( sum == 0 ){ cerr << a << " " << b << " " << c << endl; count++; } if( sum < 0 ){ j++; } else { k--; } } } cout<< count < 2`` 0s, there will be ``n choose 3`` triples that sum to 0. 2. A sequence of repeating numbers exists such that a pair of those numbers along with a third forms a triple that sums to 0. For example: `` -4, 1, 2, 2, 2`` contains 3 triples that sum to 0. When such a sequence is detected with length ``n > 2``, there will be ``n choose 2`` triples that sum to 0. Note that a correct handling of this case can also handle the case where 3 or more 0s exist. 3. The situation in the above example when two sequences of repeating numbers exist such that the repeated element of each sequence along with a third number not equal to those elements sum to 0. For sequences of non-equal numbers ``x`` and ``y`` with respective lengths ``m >= 1`` and ``n >= 1``, such that there exists a number ``z | x+y+z = 0``, there will be ``x*y`` triples that sum to 0 for each individual copy of ``z``. The following program handles duplicates correctly. **Problem 1** and **Problem 2** are handled in a single conditional, the first ``if`` statement in the ``while`` loop. Note that when such a situation occurs, we have effectively found the point at which ``j`` and ``k`` converge, and so we break out of the while loop. **Problem 3** is handled in the second ``if`` statement in the ``while`` loop; the modification of ``j`` and ``k`` are again handled specially in this situation (since they may be increased or decreased by more than 1). The third ``if`` statement in the ``while`` loop handles the situation in which duplicates do not occur. The final ``else`` block in the ``while`` loop handles the normal increment of ``j`` or decrement of ``k`` when a triple with a non-zero sum is visited. The complete solution for handling duplicates is: :: #include #include using namespace std; int getRandInt( int *dataVec, int dataSize ) { // load up a vector with random integers int num; for( int i = 0; i < dataSize; i++ ) { // integers will be 1-100 num = rand() % 100 +1; if( rand( ) % 2 == 0 ) { // make some integers negative num *= -1; } dataVec[i] = num; } } int main(int argc, char * argv[] ) { int dataSize = 0; if( argc < 2 ) { std::cerr << "usage: exe [num of nums] [optional seed value]" << std::endl; exit( -1 ); } { std::stringstream ss1; ss1 << argv[1]; ss1 >> dataSize; } if( argc >= 3 ) { std::stringstream ss1; int seed; ss1 << argv[2]; ss1 >> seed; srand( seed ); } else { srand( 0 ); } // create a data vector int *data = new int[ dataSize ]; // load it up with random data getRandInt( data, dataSize ); // sort the data sort( data, data + dataSize ); // do the Sum3 computation. O(n^2) int count = 0; int a,b,c, sum; // array elements int j,k; // array indices for (int i = 0; i < dataSize-2; i++){ a = data[i]; j = i+1; k = dataSize-1; while( j < k ) { b = data[j]; c = data[k]; sum = a+b+c; if( sum == 0 && b == c ) { // case where b == c. ie, -10 + 5 + 5 // or where a == b == c == 0 int num = k-j+1; count += (num*(num-1))/2; break; } else if( sum == 0 && (data[j+1] == b || data[k-1] == c )){ // case where there are multiple copies of b or c // find out how many b's and c's there are int startj = j; while( data[j+1] == b ) j++; int startk = k; while( data[k-1] == c ) k--; count += (j-startj+1) * (startk-k+1); j++; k--; } else if( sum == 0 ){ // normal case count++; j++; } else { // if sum is not 0, increment j or k if( sum < 0 ) j++; else k--; } } } cout<< count <