Department of Computer Science Intro to Parallel Computing
Programming Sets
Recall that a set is just a collection of objects. The order in which the objects are listed
doesn’t matter. So the set A = {3, 1, 2} is the same as the set B = {2, 3, 1}. Repetitions
also don’t matter. So the sets A and B are the same as the set C = {3, 1, 2, 1}. The union
of two sets D and E is the set whose elements are the elements that belong to D, to E, or
to both D and E. For example, if D = {1, 3, 5, 7} and E = {1, 2, 3, 4}, then their union is
D ∪ E = {1, 3, 5, 7} ∪ {1, 2, 3, 4} = {1, 2, 3, 4, 5, 7}.
The intersection of two sets D and E is the set of elements that they have in common. So
the intersection of the sets D and E is
D ∩ E = {1, 3, 5, 7} ∩ {1, 2, 3, 4} = {1, 3}.
The difference of two sets D and E is the set of elements belonging to the first, but not the
second. So the difference of the sets D and E is
D − E = {1, 3, 5, 7} − {1, 2, 3, 4} = {5, 7}.
Program 2
For programming assignment 2, you should write a C program that implements a set abstract
data type with operations union, intersection and set difference. Input to the program will
be a sequence of single characters separated by white space. These indicate which operations
should be carried out:
’u’ or ’U’: Find the union of two sets,
’i’ or ’I’: Find the intersection of two sets, and
’d’ or ’D’: Find the difference of two sets.
’q’ or ’Q’: Quit (freeing any allocated memory),
For the union, intersection, or difference, the user will enter two sets A and B, and your
program should output the result of the operation on the sets.
The elements of the sets will be nonnegative ints, listed in increasing order, without
repetitions. The last element in an input set will be followed by a negative value. For
example, the set D = {1, 3, 5, 7} would be entered as
1
1 3 5 7 -1
You can assume that input sets will be sorted lists of nonnegative ints in increasing order
with no duplicates, and each input set will be terminated by a negative value. So you don’t
need to check the input sets for correctness.
Implementation Details
Each set should be implemented as a sorted, singly linked list of ints, and there should be
no repeated values in the stored linked list. So D should be stored as
head_p -> 1 -> 3 -> 5 -> 7 \
(The backslash (\) denotes the NULL pointer.) The set D should not be stored as
head_p -> 3 -> 1 -> 7 -> 5 \
or
head_p -> 1 -> 3 -> 3 -> 5 -> 7 \
Note that the restriction that the lists are sorted and that they don’t store duplicates
doesn’t mean that we can’t accurately represent sets: any set of nonnegative ints can have its
elements listed in increasing order with no repeated values. The purpose of the restrictions
is to reduce the complexity of the program’s internal storage, and to make it possible to
efficiently implement the operations.
Since the sets are stored in sorted order, all three operations can be implemented as
variations on the merge operation. Recollect that we merge two sorted lists by comparing
the elements of the list pairwise and appending the smaller element to the new list. If the
input lists are A and B, and the merged list is C, we can define pointers a_p, b_p and c_p
that reference the current element in each list, respectively. So the merge operation proceeds
as follows:
while (a_p != NULL && b_p != NULL)
if (a_p->data <= b_p->data) {
c_p = Append(&C, c_p, a_p->data);
a_p = Advance(a_p);
} else { // b_p->data < a_p->data
c_p = Append(&C, c_p, b_p->data);
b_p = Advance(b_p);
}
while (a_p != NULL) {
c_p = Append(&C, c_p, a_p->data);
a_p = Advance(a_p);
2
}
while (b_p != NULL) {
c_p = Append(&C, c_p, b_p->data);
b_p = Advance(b_p);
}
You should use this algorithm as the basis of your implementations. Note that at no point
do you sort any of the lists!
Each of the set operations, A ∪ B, etc., should be implemented by a function. These
functions should not change the input arguments A and B. If the result of the operation
(e.g., C) is passed in to the function (instead of being created and returned by the function),
then, of course, the function can change it. Note that none of the functions implementing
the operations should do any I/O (except for debug I/0). The results should be printed by
the calling function (e.g., main).
Since each new operation in the program reads in new sets, your program won’t be reusing
the elements of the sets involved in the previous operation. So if you don’t free the
nodes in each of the lists after printing the results of an operation, your program will have
a memory leak: memory allocated from the heap will be lost.
Development
You should compile and test your program using Linux, since the parallel computers we have
access to all use Linux (rather than Windows or MacOS X).
Working with linked lists and pointers can be confusing