Tuesday, December 10, 2019

Show HN: Beating the Optimal Stopping Rule with Game Specific Knowledge

Ten Thousand Rounds of the Victor Game – Beating the Optimal Stopping Rule with Game Specific Number Distribution Knowledge

Author
codetrotter
Published
2019-12-08
Last modified
2019-12-09
Topics
simulation
statistics
mathematics
secretary-problem
applied-probability
decision-theory
Technologies
html5
vanilla-js
css
Page sources
https://github.com/ctsrc/simulated-victor-game
Round # Algo: AD Algo: CT Algo: OSR Algo: GN
Win%
Table 1: Simulation Results Summary for n rounds played.

The Victor Game, as I call it because the guy that created the game we are talking about here is named Victor, is a game that has characteristics similar to The Secretary Problem. (Page sources for his game are available on GitHub). The Victor Game differs from The Secretary Problem in that the numbers in The Victor Game are generated in a specific, known way. With this knowledge, it is possible to make better decisions in the game than one would do if one were applying the Optimal Stopping Rule from The Secretary Problem when playing The Victor Game.

User GistNoesis on Hacker News explains:

The distribution chosen here is ~ uniform for the number of digits. [github.com/victorqribeiro/googol/blob/7cb61ff/js/main.js#L10]

This is not the same hypothesis as the problem stated in the video [youtu.be/OeJobV4jJG0]. The video talks about the "secretary problem". In [the secretary problem] you know nothing about the distribution. You just know that the number are ordered.

Here because you know the distribution you can do better by using a strategy which choose to stop depending on the value of the number you have so far. For example if at the 15th pick you get a number with 99 digit, you can safely assume that it is better to stop now (every time you pick you have only a ~1% chance of finding a 99-digit number needed to do better so keeping it means you win (99/100)^(50-16) = 70%, whereas the non-exploitable strategy (i.e. optimal strategy for the secretary problem) 50/e=18 will tell you to keep going and you would lose 70% of the times).

For the secretary problem the theoretical way to sample the number is to set it up as a game where the first player plays the game and the second pick the numbers adversely so that the first doesn't win. A good strategy for the second player is to pick a distribution from a big set of varied distributions, and sample the number from this distribution.

Simulation

We simulate 10,000 rounds of The Victor Game, where we pit the original Optimal Stopping Rule (OSR) up against the algorithm described by GistNoesis (GN) for the specific way that numbers are being generated in The Victor Game. Additionally, we have two baselines for comparison:

  • Baseline 1: Randomly decide upon a tile to select prior to having gotten any information about what numbers are found in any of the tiles (apriori decision; AD).
  • Baseline 2: Simulate a coin toss (CT) which we look at to decide whether to stop or continue, giving us a 50% probability of saying stop each time we are asked whether to stop or to continue.

Results from the simulation are summarized in table 1.

Please note that the percentages at the bottom of table 1 reflect the percentage of rounds in which the "correct" choice was made for each of the four algorithms. The algorithms are given the same inputs for each round, but each of them do not "know" what the other algorithms are choosing and are not affected by whether or not the other algorithms choose the biggest number or not. In any round, they could all make the "correct" choice, or three of them could, or two of them could, or one of the could, or none of the could. In other words, it's not necessarily a case of one winner and three losers for each individual round.



from Hacker News https://ift.tt/2YuqJZz

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.