Library of Congress Python 3 Project
Posted: Sun Jul 03, 2022 11:24 am
PROJECT DESCRIPTION:
You are working in the Library of Congress as a data entry intern. The library is looking to overhaul some of their older works and wants to make sure that they are correct and get some basic data from each file. As you open your assigned files, you discover that they have all been scrambled together. The lines from each of your three texts have been placed in random order and the only clue that you have is the identifier at the end of each line.
Each line in the scrambled file contains the line in the text file, a line number, and a three-letter code that identifies the work. Each of these items is separated by the | character. For example,
Your task is to write a program that reads each line in a text file (specified in sys.argv), separates and unscrambles the texts, and collects the basic data you’d first set out to collect. For each text, you must determine
Its longest line (and the corresponding line number),
its shortest line (and corresponding line number), and
the average length of the lines in the entire text.
If there are multiple lines with the shortest or longest length, use the line number as a tiebreaker: earlier lines are “shorter” and later lines are “longer”. The average should be rounded to the closest integer.
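One way to read the tiebreaker rule above is as a compound sort key of (length, line number). The following is only a sketch under that reading: the function name summarize and the sample data are made up for illustration, and each line is assumed to be stored as a (text, line_number) pair with an integer line number.

```python
def summarize(lines):
    """Return (longest, shortest, average) for a list of (text, line_number) pairs."""
    # Longest: greatest length; among ties, the later line number wins.
    longest = max(lines, key=lambda item: (len(item[0]), item[1]))
    # Shortest: smallest length; among ties, the earlier line number wins.
    shortest = min(lines, key=lambda item: (len(item[0]), item[1]))
    # Average line length, rounded to an integer.
    average = round(sum(len(text) for text, _ in lines) / len(lines))
    return longest, shortest, average


# Hypothetical sample data, not taken from the assignment files.
sample = [("abc", 3), ("hello", 1), ("xyz", 2), ("world", 4)]
longest, shortest, average = summarize(sample)
print(longest)   # ('world', 4)
print(shortest)  # ('xyz', 2)
print(average)   # 4
```

Note that Python's built-in round() rounds exact halves to the nearest even integer (round(2.5) is 2). The assignment does not say how halves should be handled, so that may or may not match the expected output.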
The summary of data should be stored in a file named novel_summary.txt. The summaries should be sorted by three-letter code and should be formatted as follows:
The texts themselves should be stored in a file named novel_text.txt. This file should contain the three-letter code for a work followed by its text. The lines must all be included and should be ordered and should not include line numbers or three-letter codes. The texts should be separated by a single line with five - characters. The result should look like the following:
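The expected layout is not reproduced in this post, so the following is only a sketch of the unscrambling step under some assumptions: the three-letter code sits on its own line above its text, the works appear sorted by code, and the codes ABC and XYZ plus the function names parse_line and build_text_file are invented for illustration.

```python
from collections import defaultdict


def parse_line(raw):
    """Split 'text|line_number|CODE' into (text, line_number, code)."""
    # Split from the right so a '|' inside the text itself is left alone.
    text, number, code = raw.rstrip("\n").rsplit("|", 2)
    return text, int(number), code


def build_text_file(raw_lines):
    """Group lines by code, order them by line number, and join the works."""
    works = defaultdict(list)
    for raw in raw_lines:
        text, number, code = parse_line(raw)
        works[code].append((number, text))
    sections = []
    for code in sorted(works):  # assumption: works are listed in code order
        ordered = [text for _, text in sorted(works[code], key=lambda p: p[0])]
        sections.append("\n".join([code] + ordered))
    # A single line of five '-' characters separates consecutive works.
    return "\n-----\n".join(sections) + "\n"


# Hypothetical scrambled input.
scrambled = ["second line|2|ABC\n", "only line|1|XYZ\n", "first line|1|ABC\n"]
print(build_text_file(scrambled), end="")
```

The returned string could then be written out with open("novel_text.txt", "w"), and the input lines could come from the file named in sys.argv[1].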
MY CODE FOR THE FIRST TWO BOOKS (I've already sorted the scrambled text into lists of lists that I've named my_woo, my_ttl, and my_alg):
I am now using these for loops to calculate the longest, shortest, and average lengths of the lines. My IDE tells me that there is an error on line 74: "total_line_lenths_ttl += len(line[0])", with the message "NameError: name 'total_line_lenths_ttl' is not defined". I'm not sure what is going wrong.
27  longest_woo = 0
28  shortest_woo = 9999
29  average_woo = 0
30  longest_w = ""
31  shortest_w = ""
32  total_lines_woo = 0
33  total_line_lengths_woo = 0
34  num_lines_woo = 0
35  for line in my_woo:
36      # when we create the for loop, we need to establish the how we are going to store
37      # so we make variable names and assign placeholder values
38      if len(line[0]) < shortest_woo:
39          shortest_woo = len(line[0])
40          num_lines_woo += 1
41          shortest_w = line[0]
42          shortest_w_line = line[1]
43          total_line_lengths_woo += len(line[0])
44      elif len(line[0]) > longest_woo:
45          longest_woo = len(line[0])
46          num_lines_woo += 1
47          longest_w = line[0]
48          longest_w_line = line[1]
49          total_line_lengths_woo += len(line[0])
50      else:
51          total_line_lengths_woo += len(line)
52          num_lines_woo += 1
53  average_line_woo = round(total_line_lengths_woo/num_lines_woo)
54  sorted_woo = sorted(my_woo, key=lambda x: int(x[1]))
60  longest_ttl = 0
61  shortest_ttl = 9999
62  average_ttl = 0
63  longest_t = ""
64  shortest_t = ""
65  total_lines_ttl = 0
66  total_line_lengths_ttl = 0
67  num_lines_ttl = 0
68  for line in my_ttl:
69      if len(line[0]) < shortest_ttl:
70          shortest_ttl = len(line[0])
71          num_lines_ttl += 1
72          shortest_t = line[0]
73          shortest_t_line = line[1]
74          total_line_lenths_ttl += len(line[0])
75      elif len(line[0]) > longest_ttl:
76          longest_ttl = len(line[0])
77          num_lines_ttl += 1
78          longest_t = line[0]
79          longest_t_line = line[1]
80          total_line_lengths_ttl += len(line[0])
81      else:
82          total_line_lengths_ttl += len(line)
83          num_lines_ttl += 1
84  average_line_ttl = round(total_line_lengths_ttl/num_lines_ttl)
85
86  sorted_ttl = sorted(my_ttl, key=lambda x: int(x[1]))
87  print(average_line_ttl)
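Going by the error message, the likely cause is a spelling mismatch: line 66 defines total_line_lengths_ttl, while line 74 uses total_line_lenths_ttl (no "g"). Because += has to read the old value before adding to it, a name that was never assigned raises NameError. Two other things in the loop look suspicious: the else branch adds len(line), which is the length of the inner list rather than of the text, and the if/elif structure means a line that sets a new shortest is never compared against the longest. Below is a sketch of one possible rework, assuming each entry is [text, line_number] with the number stored as a string; the sample data is made up.

```python
# Hypothetical stand-in for my_ttl: [text, line_number] pairs.
my_ttl = [["of two cities", "1"], ["a tale", "2"], ["it was", "3"]]

longest_ttl = 0
shortest_ttl = 9999
longest_t = shortest_t = ""
longest_t_line = shortest_t_line = None
total_line_lengths_ttl = 0
num_lines_ttl = 0

# Visit lines in line-number order so the tie-break rule stays simple.
for line in sorted(my_ttl, key=lambda x: int(x[1])):
    length = len(line[0])
    # Count every line exactly once, outside the if branches.
    total_line_lengths_ttl += length
    num_lines_ttl += 1
    # Strict < keeps the earliest line among equally short ones.
    if length < shortest_ttl:
        shortest_ttl, shortest_t, shortest_t_line = length, line[0], line[1]
    # >= lets a later line replace an equally long earlier one.
    if length >= longest_ttl:
        longest_ttl, longest_t, longest_t_line = length, line[0], line[1]

average_line_ttl = round(total_line_lengths_ttl / num_lines_ttl)
print(shortest_t, shortest_t_line)
print(longest_t, longest_t_line)
print(average_line_ttl)
```

Using two independent if statements instead of if/elif/else lets the same line be both the shortest and the longest so far, which matters for the very first line of each text.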