We also might want to look at only valid words in our data set. A word will be a valid word if all three of the followin
Posted: Mon May 02, 2022 11:41 am
We also might want to look at only valid words in our data set. A word will be a valid word if all three of the following conditions are true:

• The word contains only letters, hyphens, and/or punctuation* (no digits).
• There is at most one hyphen '-'. If present, it must be surrounded by characters ("a-b" is valid, but "-ab" and "ab-" are not valid).
• There is at most one punctuation mark. If present, it must be at the end of the word ("ab,", "cd!", and "." are valid, but "a!b" and "C.," are not valid).

NB: for this question, the 3rd condition will also apply to apostrophes despite "valid" words containing them.

Write a function valid_words_mask(sentence) that takes an input parameter sentence (type string) and returns the tuple: (int, list[]), where:

• int is the number of valid words found.
• list[] contains the booleans True or False for each word in sentence (in sequence) depending on whether that word is valid.

*Assume that a punctuation mark is any character that is not an alphanumeric (except for hyphens, which are handled separately as per the instructions).

For example:

Test:
    sentence = "these are valid words"
    print(valid_words_mask(sentence))
Result:
    (4, [True, True, True, True])

Test:
    sentence = "!this 1-5 b8d!"
    print(valid_words_mask(sentence))
Result:
    (0, [False, False, False])

Test:
    sentence = "mciheal mnefiodonvass? W-O-W"
    print(valid_words_mask(sentence))
Result:
    (2, [True, True, False])

Test:
    sentence = "it's Minecraft, not Mine-Craft!!"
    print(valid_words_mask(sentence))
Result:
    (2, [False, True, True, False])
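
One possible way to approach this, offered as a rough sketch rather than the expected solution: check each word against the three conditions in a helper, then build the mask from the results. The helper name is_valid_word is made up for illustration. Two assumptions are not stated in the task: words are separated by whitespace, and a hyphen "surrounded by characters" means a letter on each side. Adjust those if your course intends something else.

```python
def is_valid_word(word):
    """Hypothetical helper: True if `word` meets all three conditions."""
    # Condition 1: no digits (or other numeric characters) anywhere.
    if any(ch.isalnum() and not ch.isalpha() for ch in word):
        return False

    # Condition 2: at most one hyphen, and it must sit between two
    # letters (assumption: "characters" in the task means letters).
    if word.count("-") > 1:
        return False
    pos = word.find("-")
    if pos != -1:
        if pos == 0 or pos == len(word) - 1:
            return False
        if not (word[pos - 1].isalpha() and word[pos + 1].isalpha()):
            return False

    # Condition 3: punctuation is anything that is neither alphanumeric
    # nor a hyphen (this includes apostrophes). At most one is allowed,
    # and only as the last character.
    punct = [i for i, ch in enumerate(word)
             if not ch.isalnum() and ch != "-"]
    if len(punct) > 1:
        return False
    if punct and punct[0] != len(word) - 1:
        return False

    return True


def valid_words_mask(sentence):
    # Assumption: words are separated by whitespace.
    mask = [is_valid_word(word) for word in sentence.split()]
    # True counts as 1, so summing the mask gives the number of valid words.
    return sum(mask), mask
```

Calling print(valid_words_mask("it's Minecraft, not Mine-Craft!!")) with this sketch should reproduce the result given in the question, since "it's" fails on the mid-word apostrophe and "Mine-Craft!!" fails on the second exclamation mark.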