COMPSCI 119 LAB #4 – Character Pair Counts

Post by answerhappygod

COMPSCI 119 LAB #4 – Character Pair Counts
In this assignment you will be reading in a text file (that we
provide to you) and creating counts of all two-letter pairs that
occur in the file. There is an extra credit component to this
assignment. To illustrate the process, suppose that the file
contains a single line of text:
Computer Science Rocks!
This text contains upper case letters, lower case letters, and
characters that are neither (the blanks and the exclamation point).
The letter pairs are, in order:
CO OM MP PU UT TE ER SC CI IE EN NC CE RO OC CK KS
Note that lower case letters are converted to upper case, and
non-letters “break” the patterns (that is, the blank between
COMPUTER and SCIENCE prevents the letter pair RS from occurring).
Non-letters do not participate in the letter pairs otherwise.
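To make the pairing rule concrete, here is a minimal sketch of how the pairs on a single line might be collected. The helper name LetterPairs is hypothetical (the assignment allows only two functions, so in a submission this logic would sit inside Process), and the letter test assumes plain A to Z text.

```python
def LetterPairs(Line):
    # Hypothetical helper, for illustration only.
    Pairs = []
    Previous = ""  # last letter seen, or "" if the last character was not a letter
    for Ch in Line.upper():
        if "A" <= Ch <= "Z":
            if Previous != "":
                Pairs.append(Previous + Ch)
            Previous = Ch
        else:
            # A non-letter breaks the pattern, so the next letter starts fresh.
            Previous = ""
    return Pairs

print(" ".join(LetterPairs("Computer Science Rocks!")))
# prints: CO OM MP PU UT TE ER SC CI IE EN NC CE RO OC CK KS
```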
We want to keep a count of each of these letter pairs using a
dictionary. The key to the dictionary is the two-letter upper case
string, and the value is the number of times that string occurs.
For example, if the letter pair "XY" occurs 3 times and the letter
pair "YZ" occurs 5 times, the dictionary would contain the
following: {"XY":3, "YZ":5} (remember that there is no guarantee
about the order in which the keys appear in the dictionary).
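The counting step could look something like the sketch below. CountPairs is a hypothetical name used only for this illustration, and the input list is made up to reproduce the XY and YZ example. It relies on dict.get, which returns a default value when the key is not yet in the dictionary.

```python
def CountPairs(Pairs):
    # Hypothetical helper: tally how many times each two-letter string occurs.
    Counts = {}
    for Pair in Pairs:
        # get() returns 0 the first time a pair is seen.
        Counts[Pair] = Counts.get(Pair, 0) + 1
    return Counts

Example = ["XY", "YZ", "XY", "YZ", "YZ", "XY", "YZ", "YZ"]
Counts = CountPairs(Example)
print(Counts["XY"], Counts["YZ"])  # prints: 3 5
```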
Your program will consist of two functions only. The first
function is ReadFileAsListOfStrings from the bottom of page 366 of
the Companion (the version that strips off line breaks from the
ends of lines read in from file). Type this function in exactly the
way it is in the book and do not make any changes to this function.
The second function is called Process, with one parameter – that
parameter is called Filename and is the name of the file to read in
and scan. You will not have a Main function in this assignment.
Your code framework will therefore look as follows:
def ReadFileAsListOfStrings (Filename):
    Infile = open(Filename, "r")
    X = Infile.readlines()
    Infile.close()
    return [C.rstrip("\n") for C in X]

def Process (Filename):
    # All your new code goes here
    return
Your Task:
Fill in the Process function so that it reads in the file
specified in Filename, computes the two-letter counts, then prints
them out in ascending order by letter-pair (alphabetical order).
This must work for any file name without changing your code to do
so. That is, if we have two files A.txt and B.txt in our program
folder that we want to check, we would do this by typing
Process("A.txt") on one line and Process("B.txt") on the next.
The file that we will want you to process is called
Gettysburg.txt and is available for download from the Moodle page
(put it in the same folder as your Python code). It contains the
text from Abraham Lincoln’s Gettysburg Address. To process that
file, you would type Process("Gettysburg.txt") at the >>>
prompt in the command shell. The first and last parts of the
expected printout are:
AB 2
AC 2
AD 5
AG 2
AH 1
AI 2
AK 1
AL 8

WE 11
WH 8
WI 1
WO 2
YE 1
This tells us that the letter-pair AB occurs twice in the file,
the letter-pair AD appears five times, the letter pair WE appears
11 times, and so on. You will have to figure out how to extract the
keys from the dictionary, sort them, and then use those keys to
print out each key and its count.
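One possible way to handle the sorting and printing step is sketched here. FormatCounts is a hypothetical helper name, and the sample dictionary is invented for the illustration. Calling sorted() on a dictionary returns a list of its keys in ascending order, which for upper case two-letter strings is alphabetical order.

```python
def FormatCounts(Counts):
    # Hypothetical helper: build one "PAIR count" string per key, in sorted key order.
    Lines = []
    for Key in sorted(Counts):
        Lines.append(Key + " " + str(Counts[Key]))
    return Lines

for Line in FormatCounts({"YZ": 5, "XY": 3}):
    print(Line)
# prints:
# XY 3
# YZ 5
```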
I need help writing the code that completes the function
def Process(Filename):
The text of the Gettysburg Address needed for this
assignment is pasted below:
Four score and seven years ago our fathers
brought forth on this continent, a new nation,
conceived in Liberty, and dedicated to the
proposition that all men are created equal.
Now we are engaged in a great civil war,
testing whether that nation, or any nation
so conceived and so dedicated, can long endure.
We are met on a great battle-field of that war.
We have come to dedicate a portion of that field,
as a final resting place for those who here gave
their lives that that nation might live. It is
altogether fitting and proper that we should do
this.
But, in a larger sense, we can not dedicate --
we can not consecrate -- we can not hallow --
this ground. The brave men, living and dead, who
struggled here, have consecrated it, far above
our poor power to add or detract. The world will
little note, nor long remember what we say here,
but it can never forget what they did here. It
is for us the living, rather, to be dedicated
here to the unfinished work which they who fought
here have thus far so nobly advanced. It is rather
for us to be here dedicated to the great task
remaining before us -- that from these honored
dead we take increased devotion to that cause for
which they gave the last full measure of devotion --
that we here highly resolve that these dead shall
not have died in vain -- that this nation, under
God, shall have a new birth of freedom -- and that
government of the people, by the people, for the
people, shall not perish from the earth.
Abraham Lincoln
November 19, 1863
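A possible way to fill in Process, offered as a sketch rather than the official solution, is shown below. It keeps to the two-function framework, copies ReadFileAsListOfStrings as given in the assignment, and assumes that a line break acts like any other non-letter, so pairs never span two lines. It also assumes the file contains only plain A to Z letters.

```python
def ReadFileAsListOfStrings (Filename):
    Infile = open(Filename, "r")
    X = Infile.readlines()
    Infile.close()
    return [C.rstrip("\n") for C in X]

def Process (Filename):
    Counts = {}
    for Line in ReadFileAsListOfStrings(Filename):
        # Assumption: each line starts fresh, so no pair crosses a line break.
        Previous = ""
        for Ch in Line.upper():
            if "A" <= Ch <= "Z":
                if Previous != "":
                    Pair = Previous + Ch
                    Counts[Pair] = Counts.get(Pair, 0) + 1
                Previous = Ch
            else:
                # Non-letters break the pattern.
                Previous = ""
    # Print each pair and its count in alphabetical order of the pair.
    for Pair in sorted(Counts):
        print(Pair, Counts[Pair])
    return
```

With Gettysburg.txt in the same folder as the code, typing Process("Gettysburg.txt") at the >>> prompt would then print one pair and its count per line.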