can you please solve this problem?

Business, Finance, Economics, Accounting, Operations Management, Computer Science, Electrical Engineering, Mechanical Engineering, Civil Engineering, Chemical Engineering, Algebra, Precalculus, Statistics and Probabilty, Advanced Math, Physics, Chemistry, Biology, Nursing, Psychology, Certifications, Tests, Prep, and more.
Post Reply
answerhappygod
Site Admin
Posts: 899603
Joined: Mon Aug 02, 2021 8:13 am

can you please solve this problem?

Post by answerhappygod »

can you please solve this problem?
Can You Please Solve This Problem 1
Can You Please Solve This Problem 1 (112.53 KiB) Viewed 9 times
Can You Please Solve This Problem 2
Can You Please Solve This Problem 2 (87.16 KiB) Viewed 9 times
1 Introduction AlgoNLP is a company that offers a language processing software. They are continuously enhancing their product but for lack of resources they outsource. some of the features to third party providers. One such feature in their pipeline is the Sentence Reconstructor. This feature reads an incorrect text block in which spaces and punctuation are removed and all individual words are joined together (possibly such texts is produced by old faulty scanners) and returns the correctly tokenized text (after consulting a proper dictionary) with spaces inserted between individual words. They are calling on you to implement the algorithms for this feature. Needless to say, they expect an efficient implemen- tation. 2 Definition The algorithm takes as input the dictionary D and the corrupted texts, returns as output the corrected text s. The dictionary D is a collection of words, and every word wED contains only alphabetical characters, and specifically contains no spaces or punctuation symbols. 2.1 Assumptions For the purposes of this assignment you may assume that all inflections of a word are contained in the dictionary D, e.g. works, worked, working can all be looked from D. Additionally, you may assume that no grammaticality or sensicality check is required, that is if it can be derived from the given input, the output may well be non-grammatical or nonsensical. If no deconstruction is possible for given output, you may output an empty string. 2.2 Input size Let d ID and n = |s be respectively the size of input dictionary (in words) and corrupted text (in characters). Then you may assume d≤ 5-104 • n ≤ 10³ 2.3 Input structure In your implementation you will read your input as one single text file with the following structure. • first line contains an integer value d such that 0≤d≤ 5-10° • second line contains the corrupted text s, such that its length n= [s] ≤ 103 the dictionary content follows in the next d lines, each word w is repre- sented in a line and Vw ED. | ≤ 50.. 2 2.4 Example input 1 Consider the input file below. The algorithm should be able to construct the corrected sentence as "happy families are I alike every unhappy family is un- happy in its own way" (opening sentence of Tolstoy's Anna Karenina). 22 happyfamiliesareallalikeeveryunhappyfamilyisunhappyinitsownway sad happy family families all one any are is all none alike unlike like every 250 each any unhappy its his their way 2.5 Example input 2 Consider the same input file as in Section 2.4, but with last line content changed to road instead of way. Then the algorithm is not able to deconstruct the given text and outputs empty text instead. 3 Requirements & Implementation You have to design an efficient algorithm that solves this problem, and then efficiently implement this algorithm in your favourite programming language. You should find an implementation that allows O(1) lookup for the dictionary. Then your solution should not depend on the Id), the size of the dictionary, but only on n the size of the corrupted text. You do not need to perform input validation, you may assume the input is valid according the problem definition and the assumptions given above. But you should expect problem inputs on the
upper bounds as given in Section 2. You also have to prepare an assignment report in you explain and analyse your solution. The assignment report should include problem statement your approach to solving it outline your algorithm detail any data structures that you have used for this purpose • show the mapping of the actual problem into the data structure domain that you have chosen identify any sub-problems within the general problem, if any • time complexity analysis of your implementation • code snippets from your implementation that you think are relevant (but not copy paste all source code) • anything else that you think is important You can work in pairs for this assignment. You may use any programming language that you want. If it is anything other than Java, Python, Javascript, Kotlin, C++, Go, C#, Scala or Ruby please notify the instructor beforehand. You need to submit (via Turnitin) the project report and (via email) the full project build. Your report must include data structures elaboration and justi- fication, algorithms and their time complexity analysis. 4 Extra credit If more than one reconstruction is possible, then output them all.
Join a community of subject matter experts. Register for FREE to view solutions, replies, and use search function. Request answer by replying!
Post Reply