
This assignment is to write a C++ program for natural language processing. General notes 1. Your assignment should be carefully

Posted: Thu Jun 02, 2022 8:09 am
by answerhappygod
This assignment is to write a C++ program for natural language processing.

General notes

1. Your assignment should be carefully organised, with the same kind of expectations as the previous assignments.
2. Please provide your compilation notes in a README file, detailing your OS or compiler version.
3. Other than the initial command-line input, the program should run without any further user input.

User behaviour

As a data scientist, you are required to perform some basic text mining on paragraphs (strings). All paragraphs have the same length. We want to display individual paragraphs, find out how many characters differ between paragraphs, and compute statistics on those paragraphs. You are to write the main program to support and demonstrate the functionality required here. Your program should compile to main and run as:

    ./main num len

• num: a positive integer, the number of paragraphs.
• len: a positive integer, the number of characters within one paragraph (its length).

Containers

You are to implement two classes:

• A Paragraph class for storing context.
• A Document class containing a collection of paragraphs.
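The brief requires num and len to be positive integers taken from the command line. One way to validate them is sketched below. The helper name parsePositive and the convention of returning 0 on failure are assumptions made for illustration, not part of the assignment.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical helper (not named in the brief): returns the parsed value,
// or 0 when `text` is not a positive base-10 integer. 0 can serve as the
// failure marker because neither num nor len may be 0.
long parsePositive(const char* text) {
    assert(text != nullptr);
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(text, &end, 10);
    // Reject empty input, trailing junk, overflow, and non-positive values.
    if (end == text || *end != '\0' || errno == ERANGE || value <= 0) {
        return 0;
    }
    return value;
}
```

In main, one would check that argc is 3, call parsePositive on argv[1] and argv[2], and print a usage message and return a non-zero exit code if either call yields 0.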
Paragraph

This class should be used to store characters (you can simplify this to a single string). The following methods should also be provided.

• constructor: the Paragraph class should have a parametric constructor whose input argument is len. That is, the constructor produces a random string of length len. This string should be a mix of characters (both digits and single letters), and it is treated as the context of this Paragraph instance;
• destructor: the Paragraph class should have a destructor;
• display: this method should display the context of one Paragraph instance;
• difference: this method should take another Paragraph instance to compare against. The result is the element-wise difference between the two paragraphs. For instance, the difference between the paragraphs abcde and abgze is 2:

    abcde (vs) abgze
    result: 00110

  where 0 and 1 represent the same and different characters, respectively.

Document

This class should be used to store a collection of Paragraph instances. The following methods should be provided.

• display: this method should display the context of all Paragraph instances;
• minDifference: this method should determine the minimum difference across all pairs of Paragraph instances within this Document;
• characterStatistics: this method should determine the minimal and maximal character counts across all Paragraph instances within this Document.
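A minimal sketch of the two classes described above, covering the constructor, display, difference and minDifference. Several details are assumptions rather than requirements of the brief: the 62-character alphabet, the std::mt19937 generator, the context() accessor, the extra constructors taking fixed text (added only so results can be checked), and returning 0 from minDifference when there are fewer than two paragraphs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iostream>
#include <random>
#include <string>
#include <utility>
#include <vector>

class Paragraph {
public:
    // Parametric constructor from the brief: a random context of `len`
    // characters drawn from digits and letters.
    explicit Paragraph(std::size_t len) {
        static const std::string alphabet =
            "0123456789"
            "abcdefghijklmnopqrstuvwxyz"
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
        static std::mt19937 engine{std::random_device{}()};
        std::uniform_int_distribution<std::size_t> pick(0, alphabet.size() - 1);
        context_.reserve(len);
        for (std::size_t i = 0; i < len; ++i) {
            context_ += alphabet[pick(engine)];
        }
    }

    // Assumption: extra constructor (not in the brief) for fixed text.
    explicit Paragraph(std::string text) : context_(std::move(text)) {}

    ~Paragraph() = default;  // std::string releases its own storage

    void display() const { std::cout << context_ << '\n'; }

    const std::string& context() const { return context_; }

    // Number of positions at which the two contexts differ.
    std::size_t difference(const Paragraph& other) const {
        assert(context_.size() == other.context_.size());
        std::size_t count = 0;
        for (std::size_t i = 0; i < context_.size(); ++i) {
            if (context_[i] != other.context_[i]) ++count;
        }
        return count;
    }

private:
    std::string context_;
};

class Document {
public:
    Document(std::size_t num, std::size_t len) {
        paragraphs_.reserve(num);
        for (std::size_t i = 0; i < num; ++i) paragraphs_.emplace_back(len);
    }

    // Assumption: extra constructor (not in the brief) for fixed paragraphs.
    explicit Document(std::vector<Paragraph> paragraphs)
        : paragraphs_(std::move(paragraphs)) {}

    void display() const {
        for (const Paragraph& p : paragraphs_) p.display();
    }

    // Smallest difference over all pairs; 0 if fewer than two paragraphs.
    std::size_t minDifference() const {
        if (paragraphs_.size() < 2) return 0;
        std::size_t best = paragraphs_[0].context().size();  // upper bound
        for (std::size_t i = 0; i + 1 < paragraphs_.size(); ++i) {
            for (std::size_t j = i + 1; j < paragraphs_.size(); ++j) {
                best = std::min(best, paragraphs_[i].difference(paragraphs_[j]));
            }
        }
        return best;
    }

private:
    std::vector<Paragraph> paragraphs_;
};
```

With this sketch, Document doc(num, len); followed by doc.display(); and doc.minDifference(); would cover the first two Document methods. The pairwise loop performs num * (num - 1) / 2 comparisons of len characters each.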
Example

    ./main 3 10

As num = 3 and len = 10, you should generate context for 3 paragraphs, each with 10 characters. The display function is used to print the contexts of all paragraphs (again, the context should be generated randomly):

    PM696PpV39   => this is for the 1st paragraph
    ZsDzdeQ7mR   => this is for the 2nd paragraph
    10W95w79ZN   => this is for the 3rd paragraph

For the 1st and 2nd paragraphs, we have:

    PM696PpV39 (vs) ZsDzdeQ7mR => 1111111111 => 10

For the 1st and 3rd paragraphs, we have:

    PM696PpV39 (vs) 10W95w79ZN => 1110111111 => 9

For the 2nd and 3rd paragraphs, we have:

    ZsDzdeQ7mR (vs) 10W95w79ZN => 1111111111 => 10

As such, we have minDifference = min(10, 9, 10) = 9.

For characterStatistics, the output should look like:

    min count as characters vs counts:
    R 1
    z 1
    w 1
    s 1
    p 1
    m 1
    1 1
    e 1
    d 1
    W 1
    V 1
    0 1
    Q 1
    N 1
    M 1
    D 1
    5 1
    3 1
    max count as characters vs counts:
    9 4
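The character counts in the example can be reproduced with a frequency table. The sketch below is written as a free function over plain strings to keep it short. The CharacterStats struct and its field names are assumptions made for illustration, and std::map lists characters in sorted order, which differs from the ordering shown in the sample output.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical result type (not specified in the brief).
struct CharacterStats {
    std::size_t minCount = 0;
    std::size_t maxCount = 0;
    std::string minChars;  // characters occurring exactly minCount times
    std::string maxChars;  // characters occurring exactly maxCount times
};

CharacterStats characterStatistics(const std::vector<std::string>& paragraphs) {
    // Count every character across all paragraphs.
    std::map<char, std::size_t> counts;
    for (const std::string& text : paragraphs) {
        for (char c : text) ++counts[c];
    }

    CharacterStats stats;
    if (counts.empty()) return stats;  // no characters: all fields stay empty

    // First pass: find the smallest and largest count.
    stats.minCount = stats.maxCount = counts.begin()->second;
    for (const auto& entry : counts) {
        stats.minCount = std::min(stats.minCount, entry.second);
        stats.maxCount = std::max(stats.maxCount, entry.second);
    }
    assert(stats.minCount <= stats.maxCount);

    // Second pass: collect the characters that hit either extreme.
    for (const auto& entry : counts) {
        if (entry.second == stats.minCount) stats.minChars += entry.first;
        if (entry.second == stats.maxCount) stats.maxChars += entry.first;
    }
    return stats;
}
```

For the three example paragraphs this gives a minimum count of 1 (shared by 18 characters) and a maximum count of 4 for the character 9, in line with the sample output. Inside a Document class, the same logic would iterate over the stored Paragraph contexts instead of a vector of strings.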