- Divide a text into structurally meaningful segments using regular expressions
- Compare changes in word frequencies
- Use pre-assembled word sentiment scores to assign emotional valences to segments of text
import re

# Read the full text of each novel into a string
with open('Austen_pride.txt', 'r') as f:
    pride_raw = f.read()

with open('Austen_emma.txt', 'r') as f:
    emma_raw = f.read()

print(pride_raw[:99])
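# '.' does not match a newline, so 'VOL.*' consumes the rest of each heading line
volumes = re.split('VOL.*', pride_raw)

for volume in volumes:
    print('======')
    print(volume[0:99])

# The first element is the text before the first volume heading, so drop it
volumes = volumes[1:]

# verify
for volume in volumes:
    print('======')
    print(volume[:99])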
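Why the first element gets dropped can be seen on a toy string. This is only a sketch: `sample` is an invented stand-in for the novel, not the real file contents.

```python
import re

# Invented stand-in for the start of the novel (assumption, not the real file)
sample = (
    "PRIDE AND PREJUDICE\n\n"
    "VOL. I\nfirst volume text\n"
    "VOL. II\nsecond volume text\n"
)

parts = re.split('VOL.*', sample)

# parts[0] holds everything before the first heading (title page, front
# matter); the remaining elements are the bodies of the volumes.
front_matter, toy_volumes = parts[0], parts[1:]

print(repr(front_matter))
print(len(toy_volumes))
```

The heading text itself is discarded by `re.split` because the pattern has no capturing group.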
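pride_chapters = []
for volume in volumes:
    chapters = re.split('CHAPTER.*', volume)
    for chapter in chapters:
        # Replace line breaks with spaces so each chapter is one continuous string
        pride_chapters.append(chapter.replace('\n', ' '))

for chapter in pride_chapters:
    print('***')
    print(chapter[:99])

# Drops only the first pre-chapter segment; the segments between each later
# volume heading and its first chapter heading are still in the list
pride_chapters = pride_chapters[1:]

for chapter in pride_chapters:
    print('***')
    print(chapter[:99])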
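With chapters in a list like pride_chapters, the second objective (comparing changes in word frequencies) could be approached roughly as below. This is a sketch under assumptions: the helper names `word_frequencies` and `track_word` are made up for illustration, the tokenizer is a deliberately simple regex, and `toy_chapters` are invented sentences rather than Austen's text.

```python
import re
from collections import Counter

def word_frequencies(segment):
    """Return the relative frequency of each lowercased word in one segment."""
    words = re.findall(r"[a-z']+", segment.lower())
    counts = Counter(words)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}

def track_word(word, segments):
    """Relative frequency of one word in each segment, in order."""
    return [word_frequencies(s).get(word, 0.0) for s in segments]

# Invented stand-ins for entries of pride_chapters
toy_chapters = [
    "The ball was happy and the family was happy",
    "The letter was cruel and the family was silent",
]

happy_by_chapter = track_word("happy", toy_chapters)
print(happy_by_chapter)
```

Relative frequencies are used rather than raw counts so that long and short chapters can be compared.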
print(emma_raw[:25])

re.findall('VOL.*', emma_raw)

re.findall('CHAPTER.*', emma_raw)
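The patterns 'VOL.*' and 'CHAPTER.*' match anywhere in a line, so a heading word appearing inside a sentence would also be treated as a split point. One possible tightening, shown here on an invented sample, anchors the pattern to whole lines with re.MULTILINE. The stricter patterns assume headings of the form "VOLUME I" and "CHAPTER I" with Roman numerals and no trailing whitespace, which may not hold for every edition of the text.

```python
import re

# Invented sample; the third line mentions a volume inside ordinary prose
sample = (
    "VOLUME I\n"
    "CHAPTER I\n"
    "She opened VOLUME II of the novel.\n"
    "CHAPTER II\n"
    "More text.\n"
)

# Loose pattern: also picks up the mention inside the sentence
loose_volumes = re.findall('VOL.*', sample)

# Anchored patterns: the heading must fill the whole line
strict_volumes = re.findall(r'^VOLUME [IVXLC]+$', sample, flags=re.MULTILINE)
strict_chapters = re.findall(r'^CHAPTER [IVXLC]+$', sample, flags=re.MULTILINE)

print(loose_volumes)
print(strict_volumes)
print(strict_chapters)
```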
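def chapter_segments(text, min_length=0):
    # min_length is a placeholder: the threshold in the original condition
    # (len(chapter) > ...) is cut off in the source
    volumes = re.split('VOL.*', text)
    chapter_list = []
    for volume in volumes:
        chapters = re.split('CHAPTER.*', volume)
        for chapter in chapters:
            # Skip segments too short to be real chapters
            if len(chapter) > min_length:
                chapter_list.append(chapter.replace('\n', ' '))
    for chapter in chapter_list:
        print('***')
        print(chapter[:50])
    return chapter_list

emma_chapters = chapter_segments(emma_raw)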
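For the third objective, each chapter could be given a valence by averaging the scores of its words that appear in a sentiment word list. The sketch below is only an illustration: `toy_lexicon` and its scores are invented, `segment_valence` is a hypothetical helper name, and a real analysis would load a published, pre-assembled lexicon with its own score range.

```python
import re

# Invented miniature lexicon (assumption); real word lists are far larger
toy_lexicon = {"happy": 2.0, "comfortable": 1.0, "distress": -2.0, "vex": -1.0}

def segment_valence(segment, lexicon):
    """Mean score of the words in a segment that appear in the lexicon."""
    words = re.findall(r"[a-z']+", segment.lower())
    scores = [lexicon[w] for w in words if w in lexicon]
    # A segment with no scored words is treated as neutral
    return sum(scores) / len(scores) if scores else 0.0

# Invented stand-ins for entries of emma_chapters
toy_segments = [
    "a comfortable home and happy disposition",
    "very little to distress or vex her",
]

valences = [segment_valence(s, toy_lexicon) for s in toy_segments]
print(valences)
```

Averaging over scored words only is one choice among several; dividing by the total word count instead would penalize chapters with few emotionally loaded words.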
After these analyses, write a program to clean this text.
The data looks like this:
VOLUME I
CHAPTER I
Emma Woodhouse, handsome, clever, and rich, with a comfortable home
and happy disposition, seemed to unite some of the best blessings of
existence; and had lived nearly twenty-one years in the world with very
little to distress or vex her.
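The question does not say what "clean" should mean, so the following is a minimal sketch under one interpretation: remove the VOLUME and CHAPTER heading lines and join the hard-wrapped lines into continuous text. The function name `clean_text` is made up for illustration, and `sample` reproduces the excerpt above.

```python
import re

sample = (
    "VOLUME I\n"
    "CHAPTER I\n"
    "Emma Woodhouse, handsome, clever, and rich, with a comfortable home\n"
    "and happy disposition, seemed to unite some of the best blessings of\n"
    "existence; and had lived nearly twenty-one years in the world with very\n"
    "little to distress or vex her.\n"
)

def clean_text(raw):
    """Strip heading lines and collapse all whitespace to single spaces."""
    # Remove any line that begins with VOLUME or CHAPTER
    no_headings = re.sub(r'^(VOLUME|CHAPTER)\b.*$', '', raw, flags=re.MULTILINE)
    # Line breaks and runs of spaces become one space
    return re.sub(r'\s+', ' ', no_headings).strip()

cleaned = clean_text(sample)
print(cleaned)
```

For the full novel, the same function could be applied to emma_raw, or to each chapter separately if the chapter boundaries need to be kept.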