(20%) Consider the following toy example: Training data: I am Sam Sam I am Sam I like Sam



Post by answerhappygod »

(20%) Consider the following toy example:

Training data:
<s> I am Sam </s>
<s> Sam I am </s>
<s> Sam I like </s>
<s> Sam I do like </s>
<s> do I like Sam </s>

Assume that we use a bigram language model with Laplace smoothing based on the above training data.

a) Give the following bigram probabilities estimated by this model:
P(do | <s>), P(do | Sam), P(Sam | <s>), P(Sam | do), P(I | Sam), P(I | do), P(like | I)

Note that for each word w_{n-1}, we count an additional bigram for each possible continuation w_n. Consequently, we have to take the words into consideration and also the symbol <s>.

b) Calculate the probabilities of the following sequences according to this model:
(1) <s> do Sam I like
(2) <s> Sam do I like

Which of the two sequences is more probable according to our LM?
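One way to check the arithmetic is a short script. This is only a sketch under stated assumptions, not an official solution. It uses the usual add-one estimate P(w_n | w_{n-1}) = (C(w_{n-1} w_n) + 1) / (C(w_{n-1}) + V), where C(w_{n-1}) counts how often w_{n-1} starts a bigram. It also assumes V is the number of distinct tokens that can follow another token: the five words plus one boundary symbol, so V = 6. In the code that symbol is </s>, because it is the only boundary marker that actually follows another token in the training data. The names laplace_bigram and sequence_probability are made up for this example.

```python
from collections import Counter
from fractions import Fraction

# Training data exactly as given in the question.
TRAINING = [
    "<s> I am Sam </s>",
    "<s> Sam I am </s>",
    "<s> Sam I like </s>",
    "<s> Sam I do like </s>",
    "<s> do I like Sam </s>",
]

bigram_counts = Counter()   # C(prev, word)
context_counts = Counter()  # C(prev): how often prev starts a bigram
continuations = set()       # every token seen in second position

for sentence in TRAINING:
    tokens = sentence.split()
    for prev, word in zip(tokens, tokens[1:]):
        bigram_counts[(prev, word)] += 1
        context_counts[prev] += 1
        continuations.add(word)

# Assumption: V = number of possible continuations (5 words + </s>).
V = len(continuations)


def laplace_bigram(word, prev):
    """Add-one smoothed estimate of P(word | prev) as an exact fraction."""
    return Fraction(bigram_counts[(prev, word)] + 1, context_counts[prev] + V)


def sequence_probability(text):
    """Product of smoothed bigram probabilities over a whitespace-split sequence."""
    tokens = text.split()
    prob = Fraction(1)
    for prev, word in zip(tokens, tokens[1:]):
        prob *= laplace_bigram(word, prev)
    return prob


if __name__ == "__main__":
    for word, prev in [("do", "<s>"), ("do", "Sam"), ("Sam", "<s>"),
                       ("Sam", "do"), ("I", "Sam"), ("I", "do"), ("like", "I")]:
        print(f"P({word} | {prev}) = {laplace_bigram(word, prev)}")
    for seq in ["<s> do Sam I like", "<s> Sam do I like"]:
        print(f"P({seq}) = {sequence_probability(seq)}")
```

Fraction keeps every value exact, so the two sequence probabilities in part b) can be compared without floating-point rounding. If your course defines V differently (for example, counting <s> as well), change the V line and the numbers will shift accordingly.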