========================================================== Review of Multithreaded Programming ========================================================== Sum3 Problem Overview ^^^^^^^^^^^^^^^^^^^^^^^ Basic Description ================== The *Sum3* problem is described by a rather simple question: Given a set of :math:`n` integers, how many triples of distinct elements in that list add up to 0. For example, given the following list:: -1, -2, 0, 2, 3 the answer is 2:: -1 + -2 + 3 = 0 -2 + 0 + 2 = 0 The simplest way to code a solution to the *Sum3* problem is to use a double nested looping structure to generate the indices of all possible triples in the input array. Clearly, this is simple to express in code but has an unfortunate time complexity of :math:`O(n^3)`. For example, the following C++ code uses 3 ``for`` loops to achieve a correct solution. .. highlight:: c :linenothreshold: 0 :: #include using namespace std; int main( ) { int dataSize = 5; int* data = new int[ dataSize ]; data[0] = -1; data[1] = -2; data[2] = 0; data[3] = 2; data[4] = 3; // do the naive Sum3 computation. O(n^3) int count = 0; for (int i = 0; i < dataSize-2; i++) for (int j = i+1; j < dataSize-1; j++) for (int k = j+1; k < dataSize; k++) if (data[i] + data[j] + data[k] == 0) count++; cout<< count < #include using namespace std; int getRandInt( int* dataVec, int dataSize ) { // load up a vector with random integers int num; for( int i = 0; i < dataSize; i++ ) { // integers will be 1-100 num = rand() % 100 +1; if( rand( ) % 2 == 0 ) { // make some integers negative num *= -1; } dataVec[i] = num; } } int main(int argc, char * argv[] ) { int dataSize = 0; if( argc < 2 ) { std::cerr << "usage: exe [num of nums] [optional seed value]" << std::endl; exit( -1 ); } { std::stringstream ss1; ss1 << argv[1]; ss1 >> dataSize; } if( argc >= 3 ) { std::stringstream ss1; int seed; ss1 << argv[2]; ss1 >> seed; srand( seed ); } else { srand( 0 ); } // create a data vector int* data = new int[ dataSize ]; // load it up with random data getRandInt( data, dataSize ); int *dataPtr = &data[0]; // do the naive Sum3 computation. O(n^3) int count = 0; for (int i = 0; i < dataSize-2; i++) for (int j = i+1; j < dataSize-1; j++) for (int k = j+1; k < dataSize; k++) if (dataPtr[i] + dataPtr[j] + dataPtr[k] == 0){ count++; } cout<< count < 0``, then we need to reduce the sum to get it closer to ``0``, so we set ``c=array[n-2]`` (decrease the largest number in the triple to get a smaller sum). If ``a+b+c < 0``, then we need to increase the sum to get it closer to 0, so we set ``b=array[0+2]`` (increase the smaller number to get a larger sum). if ``a+b+c = 0``, we have a triple that adds up to 0, and we either set ``c=array[n-2]`` or ``b=array[0+2]`` in order to test the next triple (in the code below, we set ``c=array[n-2]``). This process continues until ``b`` and ``c`` refer to the same array element. At the end of this process, all possible triples involving the first array element have been computed. To compute all triples involving the first array element, we took a pair of numbers from the array. After examining the pair, one number was excluded from further consideration in future triples involving the first array element. Therefore, for ``n`` array elements, we will construct ``n-2`` triples due to the following: the first array element is used in every triple, leaving us to create pairs of the remaining ``n-1`` elements; the first pair uses two elements and eliminates one from further consideration; all following pairs will reuse 1 element from the previous pair, and eliminate one element from consideration. Thus, computing all triples involving a single element requires :math:`O(n)` time. Repeating this for ``n`` elements in the input array requires :math:`O(n^2)` time. A C++ implementation of this algorithm follows: :: #include #include using namespace std; int getRandInt( int *dataVec, int dataSize ) { // load up a vector with random integers int num; for( int i = 0; i < dataSize; i++ ) { // integers will be 1-100 num = rand() % 100+1; if( rand( ) % 2 == 0 ) { // make some integers negative num *= -1; } dataVec[i] = num; } } int main(int argc, char * argv[] ) { int dataSize = 0; if( argc < 2 ) { std::cerr << "usage: exe [num of nums] [optional seed value]" << std::endl; exit( -1 ); } { std::stringstream ss1; ss1 << argv[1]; ss1 >> dataSize; } if( argc >= 3 ) { std::stringstream ss1; int seed; ss1 << argv[2]; ss1 >> seed; srand( seed ); } else { srand( 0 ); } // create a data vector int *data = new int[ dataSize ]; // load it up with random data getRandInt( data, dataSize ); // sort the data sort( data, data + dataSize ); // do the Sum3 computation. O(n^2) int count = 0; int a,b,c, sum; // array elements int j,k; // array indices for (int i = 0; i < dataSize-2; i++){ a = data[i]; j = i+1; k = dataSize-1; while( j < k ) { b = data[j]; c = data[k]; sum = a+b+c; if( sum == 0 ){ cerr << a << " " << b << " " << c << endl; count++; } if( sum < 0 ){ j++; } else { k--; } } } cout<< count < 2`` 0s, there will be ``n choose 3`` triples that sum to 0. 2. A sequence of repeating numbers exists such that a pair of those numbers along with a third forms a triple that sums to 0. For example: `` -4, 1, 2, 2, 2`` contains 3 triples that sum to 0. When such a sequence is detected with length ``n > 2``, there will be ``n choose 2`` triples that sum to 0. Note that a correct handling of this case can also handle the case where 3 or more 0s exist. 3. The situation in the above example when two sequences of repeating numbers exist such that the repeated element of each sequence along with a third number not equal to those elements sum to 0. For sequences of non-equal numbers ``x`` and ``y`` with respective lengths ``m >= 1`` and ``n >= 1``, such that there exists a number ``z | x+y+z = 0``, there will be ``x*y`` triples that sum to 0 for each individual copy of ``z``. The following program handles duplicates correctly. **Problem 1** and **Problem 2** are handled in a single conditional, the first ``if`` statement in the ``while`` loop. Note that when such a situation occurs, we have effectively found the point at which ``j`` and ``k`` converge, and so we break out of the while loop. **Problem 3** is handled in the second ``if`` statement in the ``while`` loop; the modification of ``j`` and ``k`` are again handled specially in this situation (since they may be increased or decreased by more than 1). The third ``if`` statement in the ``while`` loop handles the situation in which duplicates do not occur. The final ``else`` block in the ``while`` loop handles the normal increment of ``j`` or decrement of ``k`` when a triple with a non-zero sum is visited. The complete solution for handling duplicates is: :: #include #include using namespace std; int getRandInt( int *dataVec, int dataSize ) { // load up a vector with random integers int num; for( int i = 0; i < dataSize; i++ ) { // integers will be 1-100 num = rand() % 100 +1; if( rand( ) % 2 == 0 ) { // make some integers negative num *= -1; } dataVec[i] = num; } } int main(int argc, char * argv[] ) { int dataSize = 0; if( argc < 2 ) { std::cerr << "usage: exe [num of nums] [optional seed value]" << std::endl; exit( -1 ); } { std::stringstream ss1; ss1 << argv[1]; ss1 >> dataSize; } if( argc >= 3 ) { std::stringstream ss1; int seed; ss1 << argv[2]; ss1 >> seed; srand( seed ); } else { srand( 0 ); } // create a data vector int *data = new int[ dataSize ]; // load it up with random data getRandInt( data, dataSize ); // sort the data sort( data, data + dataSize ); // do the Sum3 computation. O(n^2) int count = 0; int a,b,c, sum; // array elements int j,k; // array indices for (int i = 0; i < dataSize-2; i++){ a = data[i]; j = i+1; k = dataSize-1; while( j < k ) { b = data[j]; c = data[k]; sum = a+b+c; if( sum == 0 && b == c ) { // case where b == c. ie, -10 + 5 + 5 // or where a == b == c == 0 int num = k-j+1; count += (num*(num-1))/2; break; } else if( sum == 0 && (data[j+1] == b || data[k-1] == c )){ // case where there are multiple copies of b or c // find out how many b's and c's there are int startj = j; while( data[j+1] == b ) j++; int startk = k; while( data[k-1] == c ) k--; count += (j-startj+1) * (startk-k+1); j++; k--; } else if( sum == 0 ){ // normal case count++; j++; } else { // if sum is not 0, increment j or k if( sum < 0 ) j++; else k--; } } } cout<< count < using namespace std; int main( ) { int dataSize = 5; int* data = new int[ dataSize ]; data[0] = -1; data[1] = -2; data[2] = 0; data[3] = 2; data[4] = 3; // do the naive Sum3 computation. O(n^3) int count = 0; for (int i = 0; i < dataSize-2; i++) for (int j = i+1; j < dataSize-1; j++) for (int k = j+1; k < dataSize; k++) if (data[i] + data[j] + data[k] == 0) count++; cout<< count < #include using namespace std; int getRandInt( int *dataVec, int dataSize ) { // load up a vector with random integers int num; for( int i = 0; i < dataSize; i++ ) { // integers will be 1-100 num = rand() % 100 +1; if( rand( ) % 2 == 0 ) { // make some integers negative num *= -1; } dataVec[i] = num; } } // define a struct to pass information to the threads struct threadInfo{ int myID; int *dataPtr; int dataSize; int count; }; void* partial3SumThread( void* arg ) { // type cast the argument back to a struct threadInfo * myInfo = static_cast( arg ); // each thread only works on half the array in the outer loop // compute the bounds based on the ID we assigned each thread. // remember, we only have 2 threads in this case, so we will hard code a 2 int start = ((myInfo->dataSize / 2) * myInfo->myID ); int stop = ((myInfo->dataSize / 2)* (myInfo->myID+1)); if( myInfo->myID == 1 ) stop =myInfo->dataSize-2; // do the naive Sum3 computation. O(n^3) for (int i = start; i < stop; i++) for (int j = i+1; j < myInfo->dataSize-1; j++) for (int k = j+1; k < myInfo->dataSize; k++) if (myInfo->dataPtr[i] + myInfo->dataPtr[j] + myInfo->dataPtr[k] == 0){ myInfo->count++; } } int main(int argc, char * argv[] ) { int dataSize = 0; if( argc < 2 ) { std::cerr << "usage: exe [num of nums] [optional seed value]" << std::endl; exit( -1 ); } { std::stringstream ss1; ss1 << argv[1]; ss1 >> dataSize; } if( argc >= 3 ) { std::stringstream ss1; int seed; ss1 << argv[2]; ss1 >> seed; srand( seed ); } else { srand( 0 ); } // create a data vector int *data = new int[ dataSize ]; // load it up with random data getRandInt( data, dataSize ); // allocate thread handles pthread_t worker1tid, worker2tid; // allocate and set up structs for 2 threads threadInfo info1, info2; info1.myID = 0; info1.dataPtr = data; info1.dataSize = dataSize; info1.count = 0; info2.myID = 1; info2.dataPtr = data; info2.dataSize = dataSize; info2.count = 0; // allocate space for a return value int returnVal; // call the worker threads if ( returnVal = pthread_create( &worker1tid, NULL, partial3SumThread, &info1 ) ) { cerr<< "pthread_create 1: "<< returnVal < #include #include using namespace std; int getRandInt( int *dataVec, int dataSize ) { // load up a vector with random integers int num; for( int i = 0; i < dataSize; i++ ) { // integers will be 1-100 num = rand() % 100 +1; if( rand( ) % 2 == 0 ) { // make some integers negative num *= -1; } dataVec[i] = num; } } int main(int argc, char * argv[] ) { int dataSize = 0; if( argc < 2 ) { std::cerr << "usage: exe [num of nums] [optional seed value]" << std::endl; exit( -1 ); } { std::stringstream ss1; ss1 << argv[1]; ss1 >> dataSize; } if( argc >= 3 ) { std::stringstream ss1; int seed; ss1 << argv[2]; ss1 >> seed; srand( seed ); } else { srand( 0 ); } // create a data vector int *data = new int[ dataSize ]; // load it up with random data getRandInt( data, dataSize ); // do the naive Sum3 computation. O(n^3) int count1 = 0; int count2 = 0; omp_set_num_threads( 2 ); #pragma omp parallel { int count = 0; int myID = omp_get_thread_num(); // each thread only works on half the array in the outer loop // compute the bounds based on the ID we assigned each thread. // remember, we only have 2 threads in this case, so we will hard code a 2 int start = (( dataSize / 2) * myID ); int stop = ((dataSize / 2)* (myID+1)); if( myID == 1 ) stop =dataSize-2; for (int i = start; i < stop; i++) for (int j = i+1; j < dataSize-1; j++) for (int k = j+1; k < dataSize; k++) if (data[i] + data[j] + data[k] == 0){ count++; } if( myID == 0 ) count1 = count; else count2 = count; } cout<< count1 + count2 < #include #include using namespace std; int getRandInt( int * dataVec, int dataSize ) { // load up a vector with random integers int num; for( int i = 0; i < dataSize; i++ ) { // integers will be 1-100 num = rand() % 100 +1; if( rand( ) % 2 == 0 ) { // make some integers negative num = num * -1; } dataVec[i] = num; } } int main(int argc, char * argv[] ) { int dataSize = 0; if( argc < 2 ) { std::cerr << "usage: exe [num of nums] [optional seed value]" << std::endl; exit( -1 ); } { std::stringstream ss1; ss1 << argv[1]; ss1 >> dataSize; } if( argc >= 3 ) { std::stringstream ss1; int seed; ss1 << argv[2]; ss1 >> seed; srand( seed ); } else { srand( 0 ); } // create a data vector int * data = new int[ dataSize ]; // load it up with random data getRandInt( data, dataSize ); int count = 0; // do the naive Sum3 computation. O(n^3) #pragma omp parallel for for (int i = 0; i < dataSize-2; i++) for (int j = i+1; j < dataSize-1; j++) for (int k = j+1; k < dataSize; k++) if (data[i] + data[j] + data[k] == 0){ #pragma omp critical { count++; } } cout<< count < #include #include using namespace std; int getRandInt( int *dataVec, int dataSize ) { // load up a vector with random integers int num; for( int i = 0; i < dataSize; i++ ) { // integers will be 1-100 num = rand() % 100 +1; if( rand( ) % 2 == 0 ) { // make some integers negative num *= -1; } dataVec[i] = num; } } int main(int argc, char * argv[] ) { int dataSize = 0; if( argc < 2 ) { std::cerr << "usage: exe [num of nums] [optional seed value]" << std::endl; exit( -1 ); } { std::stringstream ss1; ss1 << argv[1]; ss1 >> dataSize; } if( argc >= 3 ) { std::stringstream ss1; int seed; ss1 << argv[2]; ss1 >> seed; srand( seed ); } else { srand( 0 ); } // create a data vector int *data = new int[ dataSize ]; // load it up with random data getRandInt( data, dataSize ); int count = 0; // do the naive Sum3 computation. O(n^3) #pragma omp parallel for reduction(+:count) for (int i = 0; i < dataSize-2; i++) for (int j = i+1; j < dataSize-1; j++) for (int k = j+1; k < dataSize; k++) if (data[i] + data[j] + data[k] == 0){ count++; } cout<< count <