Lab 48: The Curse Of The Clumpino JUnit: Clumps Test.java All serious programming languages have the "A-list" data struc

Business, Finance, Economics, Accounting, Operations Management, Computer Science, Electrical Engineering, Mechanical Engineering, Civil Engineering, Chemical Engineering, Algebra, Precalculus, Statistics and Probabilty, Advanced Math, Physics, Chemistry, Biology, Nursing, Psychology, Certifications, Tests, Prep, and more.
Post Reply
answerhappygod
Site Admin
Posts: 899603
Joined: Mon Aug 02, 2021 8:13 am

Lab 48: The Curse Of The Clumpino JUnit: Clumps Test.java All serious programming languages have the "A-list" data struc

Post by answerhappygod »

Lab 48 The Curse Of The Clumpino Junit Clumps Test Java All Serious Programming Languages Have The A List Data Struc 1
Lab 48 The Curse Of The Clumpino Junit Clumps Test Java All Serious Programming Languages Have The A List Data Struc 1 (422.2 KiB) Viewed 26 times
Lab 48: The Curse Of The Clumpino JUnit: Clumps Test.java All serious programming languages have the "A-list" data structures such as sets and lists lounging in the language itself or at the very least in its standard library, since these data structures are so immensely useful in solving problems universally through all possible problem domains. However, not every data structure ever invented passes the muster of practical applicability to earn its entry past the velvet rope into this exclusive club. In this lab, we will implement the disjoint set data structure, one of the lesser known "C-list" data structures who never got his big break to the major leagues and eventually drifted back to working at his old man's used car dealership in his Rockwellesque home town, occasionally crying over beers how he could have been a contendah when he was given the chance to speed up this more famous algorithm who totally promised to take him to the B-list as his plus one! The disjoint set data structure is worth appreciating on its own, even if we never went on to actually implement any of the classic algorithms that gain from the help of this plucky little hometown champ. Create a new class Clumps into your labs project folder. Each object of this class represents some partition of numbers from 0 to n-1 into disjoint clumps so that at any moment, each number belongs to exactly one clump. Initially, every number belongs to its own singleton clump. Clumps are actually nothing but good old disjoint subsets, but during this lab, we shall use the more colourful word "clump" to emphasize how these subsets will be crudely clumped together to produce new bigger clumps, as if they were made of clay. Before looking at the implementation details, here are the public methods that define what this data structure will do for its user code. public clumps (int n) The constructor of this class to initialize the data structure for integers 0,..., n - 1. public boolean sameClump (int a, int b) Determines whether the integers a and b are currently part of the same clump. public boolean meld(int a, int b). If a and b are already in the same clump, nothing happens. Otherwise this method combines the two clumps that contain integers a and b into a single clump that contains all integers that were originally in the same clump as either one of the arguments a or b. This method should return true if the clumps of a and b were disjoint but became the same clump as a result of this operation, and return false if a and b were already part of the same clump so that nothing happened in this meld. public int clumpSize (int a) Page 171 of 271
Counts how many numbers are currently in the same clump as the integer a, and returns this count (Code inside this method will not actually he counting anything, but will simply look up the value from the size table that the other methods continuously update as they mutate the data structure.) These clumps can only ever grow larger when they meld, since no clump is ever broken apart into smaller clumps. Eventually all numbers belong to the same clump, after which point nothing can ever change. Such monotonicity often allows liberties in the data structure design to allow a solution that would not hear the presence of entropy-increasing operations such as "remove" or "split". The disjoint set data structure allows the methods saneClump, meld and clumpsize to work in time that is, for all purposes both practical and impractical, a small constant that is independent of the value n. Even though ane meld operation might theoretically combine two clumps of size n/2 into a clump of size n, the internal updating of the data structure still happens in O(1) constant time! It may seem strange that updating a data structure with billion elements needs no more time than updating a data structure with a hundred elements, so let us now pull away the cheap plywond and the black velvet cloth that has been to the laundry a couple of times too many to uncover the secret inside this magic trick. To allow the previous operations to be implemented efficiently for arbitrary clumps, the current partitioning of elements into clumps will be encoded in two arrays private int[] parent; private int size; The values of these fields are initialized to array objects created in the constructor when we know the value of that they need to be hig enough to contain. The actual elements are initialized with assignments parent[a]-a and size [a]-1 for each number a from zero to 7-1. Every clump has exactly one representative, kind of a primus inter pares that can be any one of the numbers currently in that clump, as long as one of these numbers shoulders the responsibility of acting as the representative for the entire clump. Which one of the numbers in that club assumes this role depends on the precise order of the meld operations that led the entire structure to its current state. The number a is the representative of its clump if and only if parent[a]--a. Otherwise, to find the representative of the clump that the integer a currently belongs to, continue looking for the representative of the integer parent[a] the same way, until you arrive at the representative of the clump, which you recognize from the fact that it equals its own parent. (And, Like in the song, is its own grandpa.) This operation of looking for the clump representative of the given number should implemented as a private helper method private int findRepresentative(int a) that follows the parents all the way up to find the representative of the clump that contains the integer a. After this representative has been found, this method should perform path compression as explained on the Wikipedia article about the disjoint set data structure, since this cleanup Page 172 of 271 operation costs us essentially nothing while it will massively speed up future queries for numbers whose parent chain ever leads to the representative through this same waypoint number. To perform this path compression, simply follow the parents from a to its representativer the exact same way as hefore, and reassign parent [x]=r for every integer x along this path. The method sameClump should at this point be a one-liner that merely checks that its parameters a and b have the same representative. The array size keeps track of how many numbers belong to each clump. However, this size information is maintained only for those numbers that are actually representatives of their own clumps. The value of size [x] for any other number x is never needed for anything. The method meld should first check whether a and b are already in the same clump, and return false to indicate having done nothing to nobody in that case. Otherwise, find the representatives of a and is in the structure, let us call these representative numbers ra and rix. In principle, either one of the two assignments parent [ra]=rb or parent[rb]=ra would combine these two clumps into one, but it is more efficient to reassign the parent of the smaller clump to rein in the growth of the implicit tree structure in these parent values. Remember to also update the size Information for the representative that you chose to make the parent of the other representative. Once the Clump object for partitioning n elements has been created, each operation sameClump, meld and clumpsize works in time that is for all practical purposes a constant independent of the total number of elements n. There are some technical details best left for the second course an algorithms, such as the fact that this seemingly "constant time" is not a constant for adult realsies, but a constant plus a certain weird little function of "inverse Ackermann" that is difficult to explain at this point of your computer science studies. This function depends on n and will grow without bound as n increases, but grows so excruciatingly slowly to make glaciers look like light beams in comparison. The author can guarantee without any hesitation that you will not find a slower-growing function than this one anywhere in the materials that comprise your undergraduate computer science education. Treating this function as an infinitesimally minuscule rounding error in the constant running time can never give you any hassle in any possible counterexample, not just in this real world of ours but in any imaginable one Page 173 of 271
Join a community of subject matter experts. Register for FREE to view solutions, replies, and use search function. Request answer by replying!
Post Reply