Figure 6.1: A comparison of bandit algorithms on the 5-socket power problem.

Optimistic Greedy, UCB and Thompson Sampling all reach the maximum required charge in approximately the same number of time steps (which is why the lines for Optimistic Greedy and Thompson Sampling are obscured by that for UCB). In each case, the regret for these algorithms is nearly zero. Each has taken only about 300 time steps to reach the maximum required charge of 3600 seconds, when the most that any socket can supply is 12 seconds of charge per time step (3600 / 12 = 300). So the optimal socket has been located quickly and then exploited to the full.

Epsilon Greedy, on the other hand, takes slightly longer to reach maximum charge. Since it continues to explore the set of sockets throughout the run, it fails to fully exploit the best socket.

Worst of all is the Greedy algorithm, which actually fails to reach the maximum charge in the time available.

So, from the point of view of charging Baby Robot, any of the Optimistic Greedy, UCB or Thompson Sampling algorithms would do the job. However, it should be noted that both Optimistic Greedy and UCB require a parameter to be set (the initial values for Optimistic Greedy and the confidence value for UCB). A bad choice for these parameters could degrade the performance of the algorithm. Thompson Sampling doesn't require a parameter to be set, so this isn't an issue for it, and this may be the deciding factor when choosing which algorithm to use.

The socket problem we've used up to now was deliberately made very simple, to allow the exploration and exploitation mechanisms of each algorithm to be illustrated. However, in terms of finding which algorithm is best, it has a couple of major drawbacks.

Firstly, there are only five sockets. Therefore, even with the random search of the Greedy algorithm, there's a 20% chance of finding the best socket. The more advanced search mechanisms can quickly locate and lock onto the best socket, since there are so few to test.

Secondly, each socket has a very distinct reward.
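To make the comparison concrete, here is a minimal sketch of how the five strategies could be run side by side on a 5-socket problem. It is not the code behind Figure 6.1. The socket means, the unit-variance Gaussian noise, the 500-step budget and every parameter value (`epsilon`, `c`, the optimistic initial value of 20, the prior precision `tau0`) are assumptions chosen for illustration; only the 3600-second target and the best socket's 12 seconds per step come from the text above.

```python
import math
import random

SOCKET_MEANS = [6.0, 12.0, 4.0, 10.0, 8.0]  # assumed mean charge per time step
TARGET_CHARGE = 3600.0                      # required charge, from the text
MAX_STEPS = 500                             # assumed time budget


def argmax(values, rng):
    """Index of the largest value, breaking ties at random."""
    best = max(values)
    return rng.choice([i for i, v in enumerate(values) if v == best])


def greedy(est, counts, t, rng):
    return argmax(est, rng)


def epsilon_greedy(est, counts, t, rng, epsilon=0.2):
    if rng.random() < epsilon:       # explore a random socket
        return rng.randrange(len(est))
    return argmax(est, rng)          # otherwise exploit the current best


def ucb(est, counts, t, rng, c=2.0):
    # Untried sockets get an infinite bonus so that each is sampled once.
    scores = [math.inf if n == 0 else q + c * math.sqrt(math.log(t) / n)
              for q, n in zip(est, counts)]
    return argmax(scores, rng)


def thompson(est, counts, t, rng, tau0=1e-4):
    # Gaussian posterior for unit-variance rewards under a broad
    # N(0, 1/tau0) prior; sample once per socket and pick the largest.
    samples = [rng.gauss(n * q / (tau0 + n), 1.0 / math.sqrt(tau0 + n))
               for q, n in zip(est, counts)]
    return argmax(samples, rng)


def run(choose, initial_estimate=0.0, prior_count=0, seed=0):
    """Charge until the target or the step budget; return (steps, total, regret)."""
    rng = random.Random(seed)
    counts = [prior_count] * len(SOCKET_MEANS)
    estimates = [initial_estimate] * len(SOCKET_MEANS)
    total, steps = 0.0, 0
    while steps < MAX_STEPS and total < TARGET_CHARGE:
        steps += 1
        k = choose(estimates, counts, steps, rng)
        reward = rng.gauss(SOCKET_MEANS[k], 1.0)  # assumed unit-variance noise
        counts[k] += 1
        estimates[k] += (reward - estimates[k]) / counts[k]  # running mean
        total += reward
    regret = steps * max(SOCKET_MEANS) - total  # shortfall vs. always-best socket
    return steps, total, regret


def compare(seed=0):
    return {
        "Greedy": run(greedy, seed=seed),
        "Epsilon Greedy": run(epsilon_greedy, seed=seed),
        # Optimistic Greedy is plain greedy with a high initial estimate,
        # counted here as one pseudo-sample per socket.
        "Optimistic Greedy": run(greedy, initial_estimate=20.0,
                                 prior_count=1, seed=seed),
        "UCB": run(ucb, seed=seed),
        "Thompson Sampling": run(thompson, seed=seed),
    }


results = compare()
for name, (steps, total, regret) in results.items():
    print(f"{name:18s} steps={steps:3d} charge={total:7.1f} regret={regret:6.1f}")
```

With these assumed settings, plain Greedy locks onto whichever socket it happens to try first, so its result depends heavily on the seed. Changing `seed`, `c` or the optimistic initial value is a quick way to see how sensitive Optimistic Greedy and UCB are to their parameters.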