were given n experts. Then over any sequence of T rounds, and any expert i, we had Inn E number of mistakes by RWM) < (1
Posted: Sat Nov 27, 2021 10:30 am
were given n experts. Then over any sequence of T rounds, and any expert i, we had Inn E number of mistakes by RWM) < (1 + 5)mi + E Here mi is the number of mistakes made by expert i until time T. In Section 4 of the notes, we observed that mi <T, so dividing the above by T and choosing ε := we get that for any i Inn T mi Eſrate of mistakes by RWM) < + 2 T In n T I.e., for any expert i (which includes the best expert at time T) the average regret (versus that expert), which is our mistake rate minus that of the expert's mistake rate, goes to zero as T +0. Inn T (a) Above, we assumed we knew the time horizon T and hence could set ε = What if we don't know T? Here's one algorithm: for s = 1,2, ..., play 28 rounds of RWM (starting from scratch) with ε = Inn 28 (VA Inn T Show that the average regret of this algorithm after time T is O (b) Here's a different extension. Now you don't just want to compare yourself to the best you could have done by choosing a single expert and sticking with them. Call an deterministic algorithm K-fickle if over the time horizon T, it follows the advice of some expert iſ for the first tį steps, then i2 for the next t2 steps, etc, and then ik for the last tk steps, where each i; € [n], t; 20 and -1 t; = T. (Assume you know T, else you can use the "guess-and-double” idea from part (a).) Give an algorithm such that for any K-fickle (deterministic) algorithm A, E[# mistakes by your algo] < (# mistakes by A)(1+8) + O(K log(nT)) E Your algorithm is allowed to run in time (nT)º(K),