Well, what the odds makers want to do is get the same amount of money bet on each team in a game. And if that happens, because of the 10 to 11 odds, the losers pay off the winners and there's a little bit of money left over for the casino. So if there are two bettors on a game and each bets $11, that's $22 bet, and one wins, the other loses.
So the $10 payoff comes out of the loser's $11, and that leaves $1 for the casino. 1 out of 22 is 4.5%. So if the casino gets an even split of the money on a game, then they make a 4.5% profit on that game. It doesn't matter which team wins. So that's what they want to do.

Now, if one team has an 80% chance of winning and the other team has a 20% chance of winning, whatever that means, but everybody likes the 20% team, then the point spread is going to be shifted to get more money bet on the other team, the one that everybody really should be betting on. So if the betting public is foolish, the point spread may give a statistical advantage to the smart bettor who knows that the money's coming in on the wrong team. So that's different from roulette. It's different from roulette in the sense that it's sort of like the stock market. The odds are determined by the public, and if the public doesn't know what they're doing, the bettor who does know what he or she's doing can take advantage of that situation.

On the other hand, if the betting public is smart, equalizing the betting action will also equalize each team's chance of winning, and the game will be equivalent to a coin toss. And you don't want to get 10:11 odds for betting on a coin toss. So in order to win in the long run, you have to find situations in which the betting public isn't smart. In other words, in which your chance of winning is greater than 50%.

So it turns out that if you just look at some basic situations, like how often does the home team win, well, if you look at the years 1993 to the present, the home team actually wins the game almost 60% of the time. But when you factor in the point spread, it's almost a 50/50 split. So if you just think, I'm just going to bet on the home team because I know they win 60% of the time, you're going to lose, because the point spread completely equalizes that. So what do people do?
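The casino arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `house_hold` and its parameters are made up for illustration, and it assumes exactly the situation described, an even split of money at 10 to 11 odds.

```python
from fractions import Fraction


def house_hold(stake=11, win_payout=10):
    """Fraction of all money wagered that the book keeps on an evenly split game.

    Assumes one bettor on each side, each risking `stake` to win `win_payout`.
    """
    total_wagered = 2 * stake
    # The winner gets the original stake back plus the payout.
    paid_to_winner = stake + win_payout
    return Fraction(total_wagered - paid_to_winner, total_wagered)


hold = house_hold()
print(hold, f"{float(hold):.1%}")  # 1/22 4.5%
```

The result does not depend on which side wins, which is why an even split is what the odds makers are after.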
Well, in one form or another, a lot of sports bettors do data mining. Of course, some of them don't actually use software, they just pore through data by hand. But nowadays there are a lot of sophisticated sports bettors who actually use software. So in general, data mining software is used to find predictive patterns in large data sets. People use data mining for all kinds of things-- investment strategies, web surfing patterns-- just to try to find patterns.

I wrote some software to search football data that sort of points out some of the issues of data mining. I was actually motivated by badgering from some attorney friends who were avid sports bettors and didn't like doing all the stuff by hand. They thought it would be great to automate the process of trying to find good betting situations. So I did that a couple years ago. It basically uses SQL to query a football database. It has a user-friendly interface, necessary for lawyers. There's a procedure in the software that automatically generates queries, so it does random queries. And it finds good results and saves them in a separate file, and just forgets about bad results, where good means better than coin tossing. So it uncovers trends that the user wouldn't think of.

There's no best strategy when you look for patterns like this. Instead you try to collect a bunch of good strategies. For example, it turns out by data mining that from 2000 to 2006, the Baltimore Ravens were 17 and 3 versus the point spread when they lost their previous game and their opponents played their previous game on the road. That's an 85% success record. So the question, of course, for any thoughtful data miner, is: is this really meaningful, or is it just a random coincidence that you found by data mining? So first of all, you have to define what a good strategy is. Win percentage doesn't work, because 1 and 0 is 100%, and that's, of course, meaningless. 17 and 3 is only 85%.
So as I mentioned a minute ago, one way to define good, or weird, or out of the ordinary is to compare the strategy to an appropriate probability distribution. In this case, coin tossing. So you see if the strategy does a lot better than coin tossing. It turns out that for that 17 and 3 record of the Ravens, if that's really coin tossing, in other words, if it's just a random fluke, the chance of getting something like that is about 1 in 1,000. You just use the binomial distribution. For those of you who know basic probability distributions, the chance of getting 17 and 3 or better out of 20 is about 1 in 1,000. So a statistician would reject the null hypothesis, which means you would say that's not just chance. So in other words, that would be considered, in some contexts, a good strategy. So the question is, should you bet on Baltimore in this situation?

Well, interestingly enough, technology changes the meaning of unlikely in contexts like this. There's something called the law of very large numbers, which says given enough opportunity, weird things happen just due to chance. So when you're mining through data doing lots of queries, saving the good ones-- by good, I mean good compared to coin tossing-- discarding the bad ones, the meaning of 1 in 1,000 sort of changes a little bit. In fact, using a modest PC, this particular data mining software does about 1,000 queries a minute when it's doing the automated query thing. It's generating queries at random, analyzing them to see if they're better than coin tossing, saving them in a file if they are. Anyway, that's lots of opportunity for weird things to happen just due to chance. So even if you generate this data by tossing coins, you'll find a 17 and 3 result about once a minute. And you can turn this on and let it run all night, so you can find all kinds of great stuff. So statisticians consider something statistically significant if the chance of it occurring is less than 1 in 20 under a suitable hypothesis.
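The "1 in 1,000" and "about once a minute" figures can be reproduced with a short Python sketch. It rests on simplifying assumptions that are not from the talk: every random query is treated as an independent set of exactly 20 fair coin tosses, whereas real queries return different numbers of games and overlap with each other. The function names are hypothetical.

```python
import random
from math import comb


def tail_prob(wins, games, p=0.5):
    """Binomial probability of `wins` or more successes in `games` trials."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


def count_flukes(n_queries=1000, games=20, threshold=17, seed=0):
    """Simulate coin-toss 'queries' and count how many look as good as 17 and 3."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_queries):
        wins = sum(rng.random() < 0.5 for _ in range(games))
        if wins >= threshold:
            hits += 1
    return hits


p_fluke = tail_prob(17, 20)                   # exactly 1351 / 2**20, about 0.0013
expected_per_minute = 1000 * p_fluke          # about 1.3 at 1,000 queries a minute
p_at_least_one = 1 - (1 - p_fluke) ** 1000    # chance of at least one such fluke

print(f"P(17 or more of 20) = {p_fluke:.5f}")
print(f"expected flukes per 1,000 queries = {expected_per_minute:.2f}")
print(f"P(at least one fluke in 1,000 queries) = {p_at_least_one:.2f}")
print("simulated flukes in 1,000 coin-toss queries:", count_flukes())
```

Under these assumptions a record that is a roughly 1 in 1,000 event for a single test becomes something to expect about once per thousand queries, which is the point about multiple comparisons.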
In the data mining environment, even 1 in 1,000 doesn't necessarily rule out chance. So more generally, these kinds of techniques, in which you're doing repeated queries and analyzing them, sort of come under the general heading of multiple comparisons. You're testing lots of statistical hypotheses at once. What happens is you might find some useful things. You might find some strategies in which you'll win money, but a lot of the stuff you're going to find is just random stuff. So that's one of the big issues you have to deal with when you're doing data mining of this type.

And of course, there are some standard procedures for dealing with that. The main remedy is you apply the strategy that you've developed by looking at some data to a fresh set of data and see how it works there. So, as statisticians or decision theorists would put it, you'd have a training sample and then a new data set that you'd apply your thing to. If the result is just due to chance from mining the training set, it won't work on the new data. Or you can do a simulation: do a lot of random queries and compare your results to a probability model. So for example, if you're doing a lot of random queries on this football data, and you draw a graph of the results, you should see a record of 17 and 3 occurring with a frequency of about 1 in 1,000 if it's just due to chance. If you get some weird pattern in your histogram or whatever graph you're looking at, then maybe there's something other than coin tossing going on. Any questions so far?

AUDIENCE: Well, it seems like you could also just bet on all of these clues. And the ones which are still-- which are useless are going to be 50/50 for you. It would be a small amount, as we've seen today. And the ones which are actually useful will win the money.

DR. MICHAEL ORKIN: That's exactly what lots of people do, except for the part about winning money.
It turns out that the 10:11 odds are a little bit deceptive, and you have to win more than 52.4% of the time in order to make a profit because of that. But that is a standard strategy. People, either by hand or using software, find lots of these types of situations that look like they give you an edge, and they'll just bet on them. And they'll even rate them, like, this one's really good. So I'll bet more on the one that's really good than the one that's just sort of good, and I think some people are modestly successful doing things like that. Any other questions?

So let's see, an earlier version of the software was featured in Wired magazine in 2002 in the Street Cred section. But let's actually look at the software for a sec, if I can get it up here. So this is called the optimizer. Let's say you want to see how the home team does from-- so it has this sort of user-friendly interface-- 2001 to 2006 in the month of September. Can you see that? I guess you can sort of see it. You get the statistics, and you can see how the home team did from 2001 to the present, not including last week's games. So SU means straight up, they're 144 and 110. They won close to 57% of their games. But versus the point spread they're almost 50/50 at home-- this is just the home team. And this Z-value, that's the measure of how far away from coin tossing it is. A large Z-value means very unlikely under coin tossing.

There's another kind of bet in football called the over/under, where you bet on whether the total score will be over or under a particular number. So this software keeps track of that, too. So if I wanted to look at a particular team, let's look at Atlanta. How did they do in September? So Atlanta's 5 and 3 on the road, 4 and 4 at home versus the spread in September, and so on. Suppose I wanted to see which team had the best September record, looking at all records of this type.
I just click optimize and I see that Jacksonville is actually 12 and 4 versus the spread in September, which gives them a-- so that's another one of these good records. So you can also look at what happened in the last game, won or lost. So let's look at games where the home team-- just playing around here-- won their last game and played at home in their last game, how did they do? So pretty much 50/50 against the spread. So this is basically just a statement of the query up here. Then you can also look at a list of the games for that query.
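Two numbers from the talk, the 52.4% break-even rate and the Z-value shown in the demo, can be sketched in Python as follows. The break-even figure follows directly from risking $11 to win $10. The Z-value formula is an assumption: the talk does not say how the software computes it, so this uses the standard normal approximation to coin tossing and ignores ties against the spread. All function names are hypothetical.

```python
from fractions import Fraction
from math import sqrt


def break_even(risk=11, win=10):
    """Win rate at which risking `risk` to win `win` has zero expected profit."""
    return Fraction(risk, risk + win)


def expected_profit(p, risk=11, win=10):
    """Expected profit per bet, in dollars, for a bettor who wins with probability p."""
    return p * win - (1 - p) * risk


def z_value(wins, losses):
    """Distance of a record from 50/50 coin tossing, in standard deviations.

    Assumed formula (normal approximation); not confirmed as the software's own.
    """
    n = wins + losses
    return (wins - n / 2) / sqrt(n / 4)


print(f"break-even win rate: {float(break_even()):.1%}")              # 52.4%
print(f"expected profit at 50%: {expected_profit(0.5):+.2f} per bet")  # -0.50 per bet
print(f"Z for a 12 and 4 record: {z_value(12, 4):.1f}")                # 2.0
```

Under this approximation a 12 and 4 record sits two standard deviations from coin tossing, which looks notable for a single test but, as discussed earlier, is easy to find when many records are scanned at once.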