Distribution of tournament wins
August 19th, 2008
Someone started a thread on rec.gambling.poker where he is seeking backers for a series of online tournaments he wants to play. He links to a page on officialpokerrankings.com that gives his recent online tournament results to prove to potential backers that he’s a winning player.
OfficialPokerRankings.com is one of a handful of sites that purport to give summary statistics of the historical performance of online players. Others include sharkscope.com and thepokerdb.com.
I’m not particularly impressed by any of them. They don’t sample from the same online games, so each has a slightly different bias. The one thing they all have in common is that they report simple descriptive statistics as if the samples were independent, identically distributed random variables, even though the samples are clearly no such thing.
Our player looking for backing gives us some data that shows a win average of $28 per tournament (the last 120 days) – which is pretty good since most are low buyin events. I have put the data in a spreadsheet.
But the distribution of wins is a little iffy. One win is over 7 standard errors above the mean. That’s fairly huge. A clear outlier.
The median win is a loss of $11 per event.
An outlier is more than just an extreme value. It’s a value so extreme that it violates any distributional expectations. There are statistical and graphical tests for outliers based on standard deviations, on medians, on boxplots, etc.
Omitting that 7 standard deviation outlier gives a mean win of about $2 per event, a pretty big difference.
Another slight outlier is almost 4 standard errors above the mean. I don’t think that’s enough of an outlier to call for exclusion (I think 7 standard errors is) but omitting it gives us an average loss of over $12 per event.
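To make the effect of a single big score concrete, here is a minimal sketch. The numbers are made up for illustration (they are not the player’s actual results), and I’m measuring the outlier in sample standard deviations, which is an assumption about how the figures above were computed.

```python
from statistics import mean, median, stdev


def summarize(results):
    """Return the mean, the median, and how far the largest result
    sits above the mean, in sample standard deviations."""
    m = mean(results)
    s = stdev(results)
    return {"mean": m, "median": median(results), "max_z": (max(results) - m) / s}


def drop_largest(results):
    """Return a copy of results with the single largest value removed."""
    trimmed = list(results)
    trimmed.remove(max(trimmed))
    return trimmed


# Hypothetical data: 99 losses of $11 and one $4,000 score.
results = [-11] * 99 + [4000]

full = summarize(results)
print("mean with the big score:   ", full["mean"])
print("median:                    ", full["median"])
print("big score, in std. devs:   ", round(full["max_z"], 1))
print("mean without the big score:", mean(drop_largest(results)))
```

With these made-up numbers the mean is positive, the median is a loss, and dropping one observation flips the sign of the mean. That’s the same pattern as in the data above, just exaggerated.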
All any of this really tells me is that the statistical model behind the stats published by that website is pretty much bullshit and you can’t really draw any conclusions from those numbers as presented.
It’s not about sample size. It’s about a bad statistical model.
It’s really not about sample size, Patti.
There’s a lot of problems with the statistical model used by those who try to interpret a mean as meaningful. You can start with a lack of identically distributed random variables in the sample.
The sample is some kind of mixture. You have to answer “Mixture of what?” before you start fixing on what statistics you should be calculating.
Means are almost certainly the wrong statistic. No matter what the sample size.
> No kidding… my big wins are outliers?
One of them is. Over 7 standard errors above the mean is a clear outlier.
I think you don’t understand what outlier means. It means there is something wrong. Sometimes it’s a data collection error; in this case it’s a statistical model error.
The error is in assuming that the sampling is done from a common population. It’s likely not the case at all. Some other statistical model is needed.
> Wow. Blew my mind. Here I am
> thinking that I should average a final table in every 1000+ entrant event
> I play. Thanks for correcting my thinking.
When you have clear evidence that the data violates assumptions about independent, identically distributed sampling you need to rethink the meaning of such an average.
Properties of Distributions, creating and using models, validation of models, skew, tournaments, outliers | Comments (0)
George Dantzig
June 23rd, 2008
George Dantzig is the guy who developed the simplex algorithm for linear programming and is often considered one of the fathers (if not the father) of operations research.
Michael Trick has a blog post about an upcoming international meeting of operations research academics and he includes a photograph of everyone who attended the first such meeting in 1957. He asks if we can identify George Dantzig in the picture.
I don’t recognize Dantzig in the picture but was able to pick him out because of his height.
I met Dantzig once, at a conference in the early 70’s.
I was a graduate student at LSU and my advisor at the time, John Pisa, had been a student of Dantzig. Pisa didn’t attend the conference, but I had co-authored a presentation with Dick O’Neil of LSU’s computer science department on column generating techniques, and I went (I was a grad student in the business school). Dick introduced me to John Tomlin, who I got to know a little bit at the conference, and Tomlin introduced me to Dantzig.
When I got back to Baton Rouge I told Pisa that I’d met Dantzig. All he said was, “Short, isn’t he?”
Uncategorized, academics | Comments (0)
Decision making and sugar levels
May 31st, 2008
I’m diabetic, so my body doesn’t process sugar the same way most people’s bodies do. So the topic of this post might not apply to me, but even so I think it’s really interesting.
I usually think of Decision Theory as a blend of mathematics and psychology. As this article points out though it’s also a blend of mathematics, psychology, and biology.
ABSTRACT:
This experiment used the attraction effect to test the hypothesis that ingestion of sugar can reduce reliance on intuitive, heuristic-based decision making. In the attraction effect, a difficult choice between two options is swayed by the presence of a seemingly irrelevant “decoy” option. We replicated this effect and the finding that the effect increases when people have depleted their mental resources performing a previous self-control task. Our hypothesis was based on the assumption that effortful processes require and consume relatively large amounts of glucose (brain fuel), and that this use of glucose is why people use heuristic strategies after exerting self-control. Before performing any tasks, some participants drank lemonade sweetened with sugar, which restores blood glucose, whereas others drank lemonade containing a sugar substitute. Only lemonade with sugar reduced the attraction effect. These results show one way in which the body (blood glucose) interacts with the mind (self-control and reliance on heuristics).
REFERENCE:
Masicampo, E. J., & Baumeister, R. F. (2008). Toward a physiology of dual-process reasoning and judgment: Lemonade, willpower, and effortful rule-based analysis. Psychological Science, 19, 255-260.
H/T. Decision Science News.
Decision theory | Comments (0)
Poker Analysis
May 23rd, 2008
Machine Learning (Theory) is a computer science blog that often touches on topics related to mathematical modeling or analysis of poker, even when he doesn’t explicitly mention poker.
In a recent post he talked about the Netflix recommender competition.
Machine Learning talks about three levels of model characteristics that he thinks are important, calling them the What, the Which, and the How.
What is about what characteristics of the data you’re going to focus on.
Which is about the mathematical or statistical model we will fit to the data.
How is the way we’re going to use the resulting model.
It’s an interesting post, just read it.
A recent post on rec.gambling.poker asks a question about poker analysis that has a focus on the Which part of the analysis and ignores the What part.
what percentage do you actually need to have be winners
and losers to do well in a session.
Such a question misses the point because it assumes the wrong What part of the analysis. Even commenters that realize that it’s the wrong question focus on the How part of the analysis and ignore the What part.
Let’s say that you managed to come up with a solid “X% of
pots necessary” number. How would you use that information?
You could use the information to go back to the What level and refocus on the data that will help you do the analysis.
All it really takes to have a winning session is to win 1 pot, so long as that pot is large and your contribution to all other pots is small. So the minimum percentage of pots won required approaches zero. It’s not zero, but how close to zero it gets depends on how big the pot you win is and how little you contribute to the pots you lose.
So we’ve identified the What part of the analysis. What matters isn’t pots won, it’s how much money other people put in the pots you win and how little money you put in the pots you lose.
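The arithmetic behind that is simple enough to sketch. The function names and dollar amounts below are made up for illustration, and the sketch assumes you lose the same small whole-dollar amount in every pot you don’t win.

```python
def session_net(pots_played, loss_per_lost_pot, net_from_won_pot):
    """Net result of a session where exactly one pot is won
    and every other pot costs the same small amount."""
    return net_from_won_pot - (pots_played - 1) * loss_per_lost_pot


def max_losing_pots(net_from_won_pot, loss_per_lost_pot):
    """Largest number of lost pots that still leaves the session ahead
    (whole-dollar amounts assumed)."""
    return (net_from_won_pot - 1) // loss_per_lost_pot


# Hypothetical session: win $300 in one pot, lose $2 in each of the others.
pots = 100
print("net result:", session_net(pots, 2, 300))
print("fraction of pots won:", 1 / pots)

# How low can the winning percentage go before the session turns into a loss?
k = max_losing_pots(300, 2)
print("lowest winning fraction that still shows a profit:", 1 / (k + 1))
```

Make the one pot bigger or the losses smaller and that lowest fraction keeps shrinking, which is why a percentage of pots won doesn’t tell you anything useful.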
That’s not a dramatic discovery. It’s a general truth that’s been known by winning poker players for a very long time. It’s something I talked about in my book and it’s something most poker books address. Certainly the ones worth reading address it.
Uncategorized, recommender system | Comments (0)
Forecasting WSOP entries
May 22nd, 2008
Last year I tried to use some crude mathematical forecasting models to forecast the number of entries to the WSOP and I failed miserably.
It’s that time of year again. Various blogs are speculating on the number of entries.
I’m going to be making a forecast again this year, but it’s going to be more ad hoc than last year. First I’ll be looking at how fuel prices are affecting air travel. I’ll be doing that later.
Uncategorized, creating and using models, wsop predictions | Comments (0)
Some operations research educational software
May 16th, 2008
Transient and steady-state analysis of discrete-time and continuous-time Markov chains up to 100 states. Calculates performance measures including queue-length probabilities and waiting-time probabilities for basic queueing models.
Mostly for demonstration purposes, this software has modules for coin-tossing, roulette, Buffon’s needle problem, central limit theorem, simulation of queues, traveling salesman problem, dynamic programming, linear programming, and integer programming.
Uncategorized | Comments (0)
Pig, a bar-room dice game
May 16th, 2008
In Pig, two (or more) players compete. The winner is the first player to reach a score of 100. The players alternate rolling a die. There’s a clear advantage to going first, so you flip a coin to see who goes first.
At each turn you roll the die one or more times. If a roll results in a 1 then your turn is over and you get no points for that turn. If the roll results in any other number then you accumulate those points for the turn and get an option to roll again. You can continue to roll until you either roll a 1 (in which case all points accumulated for that turn are lost) or decide to stop.
Dice Games and Stochastic Dynamic Programming analyzes this game (and some related games).
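As a rough sketch of the kind of calculation involved (not the method of the paper above), the rules of a single turn can be turned into a short recursion. It assumes a simple policy I’m introducing for illustration: keep rolling until the turn total reaches some threshold `hold_at`, then stop. It computes the expected score of one turn only, which is not the same thing as maximizing your chance of winning the game.

```python
from functools import lru_cache


def expected_turn_score(hold_at):
    """Expected points from one turn of Pig when you keep rolling
    until the turn total reaches hold_at (an assumed, simple policy)."""

    @lru_cache(maxsize=None)
    def ev(turn_total):
        if turn_total >= hold_at:
            return float(turn_total)  # stop and bank the points
        # A roll of 1 ends the turn with nothing (contributes 0);
        # rolls of 2 through 6 are added to the turn total.
        return sum(ev(turn_total + face) for face in range(2, 7)) / 6

    return ev(0)


for threshold in (10, 15, 20, 25, 30):
    print(threshold, round(expected_turn_score(threshold), 3))
```

The break-even point shows up in the rules themselves: each extra roll gains 20/6 points on average and risks the whole turn total one time in six, so rolling has positive expectation only while the turn total is below 20.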
dynamic programming | Comments (0)
Gambler’s Ruin and Natural Selection
May 15th, 2008
Panda’s Thumb has a post on the similarity between the mathematics of the gambler’s ruin problem and the mathematics of natural selection.
In fact, the Gambler’s Ruin shows a similar behavior—its mathematics is similar to (but not identical to) the population-genetic case. If you toss coins with a stake of $1 against a house which has $1,999,999 to wager, and you both keep playing until one holds the whole $2,000,000, if the game is fair you will be the ultimate victor one time out of 2,000,000, and the rest of the times the house will win. But if you have a 1% advantage, so that on each toss you have a 50.5% chance of winning, you will be the ultimate victor nearly 1% of the time. Mostly you will be ruined, but you will bankrupt the house 20,000 times as often as you would if the toss were fair.
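The numbers in that passage can be checked with the textbook gambler’s ruin formula. This is a minimal sketch assuming one-unit bets and a fixed win probability p on every toss. The function name is mine, and the sketch is only meant for p of at least 0.5 (with a large total and p below 0.5 the power overflows).

```python
def win_probability(p, stake, total):
    """Chance of ending up with all `total` units when you start with
    `stake` units and win each one-unit toss with probability p."""
    if p == 0.5:
        return stake / total  # fair game: proportional to your share
    r = (1 - p) / p  # ratio of losing to winning probability
    return (1 - r ** stake) / (1 - r ** total)


fair = win_probability(0.5, 1, 2_000_000)
edge = win_probability(0.505, 1, 2_000_000)
print("fair game:     ", fair)
print("50.5% per toss:", round(edge, 4))
print("ratio:         ", round(edge / fair))
```

Under this particular formulation the 50.5% case comes out at about 0.0198, so the same order of magnitude as the quoted figure but closer to 2% than 1%. The quote may be defining the advantage differently. Either way, the jump from one in two million is enormous.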
Uncategorized, Gambler's Ruin | Comments (0)
Sample size revisited
May 10th, 2008
Two plus two has another thread on that perennial question: How many hands do I need to play to know whether or not I’m a winning player?
How many hands to know…
if you’re a winning or losing player? 5,000? 10,000? ie. how long does it take for the bad beats/luck to even out enough so that you have an accurate idea of how you’re doing at poker?
The responses are the standard ones. Confidence intervals. Poker has large standard deviations. You need large samples. Blah, blah, blah.
None of the responses address the fundamental problem with trying to use classical confidence intervals to determine whether your win rate is “significantly” bigger than zero: non-stationarity. Your results stream doesn’t come from a constant distribution; the underlying mean and variance of the process change.
The bigger the sample you use the more likely you’re dealing with a mixture of distributions that is extreme.
Bigger samples don’t give you more reliable estimates of your win rate. Classical confidence interval analysis requires an assumption of a sample stream of independent, identically distributed observations.
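Here is a small simulation of what goes wrong. Everything in it is hypothetical: I’m assuming a player whose true win rate per hand is +5 for the first half of the sample and -5 for the second half, with normally distributed results, which is far cleaner than any real results stream.

```python
import random
from statistics import mean, stdev


def confidence_interval(sample, z=1.96):
    """Classical 95% confidence interval for the mean, which assumes
    independent, identically distributed observations."""
    m = mean(sample)
    half_width = z * stdev(sample) / len(sample) ** 0.5
    return m - half_width, m + half_width


def simulate(n_per_regime=20000, early_mean=5.0, late_mean=-5.0, sd=50.0, seed=1):
    """Made-up results stream whose true mean shifts halfway through."""
    rng = random.Random(seed)
    early = [rng.gauss(early_mean, sd) for _ in range(n_per_regime)]
    late = [rng.gauss(late_mean, sd) for _ in range(n_per_regime)]
    return early + late


results = simulate()
low, high = confidence_interval(results)
print("pooled 95% interval:", round(low, 2), "to", round(high, 2))
print("contains the early win rate (+5)?", low <= 5.0 <= high)
print("contains the current win rate (-5)?", low <= -5.0 <= high)
```

The pooled interval is narrow and sits near zero, so it describes neither the player you were nor the player you are now. Adding more hands only makes the interval tighter around a number that doesn’t correspond to anything.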
The answer is to look at smaller samples, not larger samples.
Pick one hand a day. Not randomly, but pick the one that involved the most money, the biggest one hand swing you had that day. Won or lost doesn’t matter. Then look at that hand in great detail. Did you make any mistakes? Should you have made that pot bigger? Should you have put less money in that pot? What prior information did you have about the opponent hand distribution? Was the result consistent with that? etc, etc.
That kind of analysis, a hand a day for 30 days, will give you tons more information about whether or not you’re a winning player than confidence intervals derived from the results of a million hands.
And it’s a statistically sound approach while confidence intervals are not statistically sound when you are not sampling from a constant distribution.
creating and using models, sample size | Comments (0)
Bodog and Blogging
May 5th, 2008
I try not to do much pimping for poker sites with this blog, but I’ll make an exception for Bodog. They’re a supporter of this blog and generally support poker blogging.
One of the ways they support blogging is by offering an added money, small buy-in invitational tournament every week to bloggers. If you don’t have a poker blog but want to play in the WSOP blogger tournament then just email me and I’ll sign you up as a contributor to one of my blogs so that you’ll be eligible to enter.
Don’t Miss Tuesday’s May 6th Bodog Poker Blogger Tournament!
Start Time 9:05 pm ET. Bodog is proud to host the Bodog Poker Blogger Tournament Series where poker bloggers worldwide are gathering each Tuesday to compete for cash and to be a part of Team Bodog 2008.
Details at http://www.bodogbloggertournament.com
This tournament is open to poker bloggers
For assistance with registration, call Bodog’s Poker Customer Service
at 1-866-909-2237 or email http://www.bodogbloggertournament.com/contact
Please include your blogger screen name, Bodog Member Account ID number
and your poker blog URL address. (If you are new to Bodog, please sign up at http://poker.bodoglife.com.) You then need to find the Online Poker Blogger Tournament in the software and register as you normally would.
Good luck
