We're down to the final two weeks in our Blood Bowl league, where the top 3 in each division will advance to the playoffs. Finishing in the first seed is worth a bye which will heal your guys and earn some SPP so it's pretty clutch to get all the way to first, but it's still important to hit the top 3. My division currently has 4 teams mathematically eliminated and 4 teams clumped up near the top. 4-0-1, 4-1, 4-1, 3-1-1. The 4-0-1 team still has to play both of the 4-1 teams in the last two weeks. The draw came between the 4-0-1 and 3-1-1 teams. And the 4-1, 4-1, 3-1-1 form a cycle where each beat one of the others. I was planning on looking at all the different ways the last 2 weeks could play out when I remembered the Sports Club Stats website. It simulates the games in the regular season of the NHL (and many other leagues) to paint a picture of who's likely to be making the playoffs. It also has a feature allowing you to upload your own rec league so I decided to look into creating a league on there to see if it could handle our league. It turned out to be fairly easy to do (I had to type out all the games for every week and include the scores for finished games) but I actually like doing repetitive data entry and it didn't take very long.
The results are here.
I believe the simulator places a fairly large emphasis on prior goal differential being a predictor of future results. The guy behind the site has been running the simulator for many years so I imagine this prediction method works well for real sports. There's been some debate about how reasonable it is for Blood Bowl. In hockey if you get ahead by one goal early you can't reasonably expect to just hold onto the puck for the rest of the game. In Blood Bowl it can happen, and can be the right play. In my game against Randy I went up 1-0 in the first half, got the ball in the second half, and refused to move. My dwarves just stood in a big pile punching any elf that dared try to get at the ball. If I'd tried to score then there would have been a reasonable chance at winning 2-0 or 2-1, but it would have opened up an opportunity to tie 1-1 or maybe even lose 2-1 since elves can do all sorts of crazy things. I took my guaranteed win (I think I took a shot to score at the end of the half and failed) and only added 1 to my goal differential. Is that actually worse than when Robb ran up the score and won 4-0? I don't know. My team doesn't have the explosive ability to score 4 times and his does, so maybe it is reasonable.
At any rate, my group (group 2) has the 4 teams sitting between 72 and 80% chance of making the playoffs. The simulator pegs the odds of the 3-1-1 team losing another game at 2.3%. That seems really low, and is likely a result of the goal differential thing. His opponents are a combined 3-7 with a goal differential of -10 so maybe that makes sense? I did enter the odds of a tie at 25% which seems to be playing out in his numbers. Maybe that number is too high? Group 2 has only had 1 tie in 20 games. On the other hand group 1 has 5 ties in 20 games. Oh, small numbers of dice.
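To get a feel for how unusual those tie counts are, here's a rough sketch that treats every game as an independent event with the same 25% tie chance I entered into the simulator. That independence is an assumption on my part (real matchups aren't identical), and the helper names are made up for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k ties in n games, each tied with chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_most(k, n, p):
    """Probability of k or fewer ties in n games."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def prob_at_least(k, n, p):
    """Probability of k or more ties in n games."""
    return 1 - prob_at_most(k - 1, n, p)


GAMES = 20        # games played so far in each group
TIE_RATE = 0.25   # the tie chance entered into the simulator (assumed equal for every game)

p_group2 = prob_at_most(1, GAMES, TIE_RATE)   # group 2: only 1 tie in 20
p_group1 = prob_at_least(5, GAMES, TIE_RATE)  # group 1: 5 ties in 20

print(f"1 or fewer ties in {GAMES} games: {p_group2:.1%}")
print(f"5 or more ties in {GAMES} games: {p_group1:.1%}")
```

Under those assumptions 5 ties in 20 games is exactly the expected count, while 1 or fewer comes up only a couple percent of the time. So if 25% is too high, group 2 is the evidence for it, though 20 games is still a tiny sample.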
Despite currently having the second most points in group 2 I am currently the slight favourite to make the playoffs and to finish first. I have the advantage of still playing Sceadeau who is the only person ahead of me. If I win both of my games I'm guaranteed to finish first. Sceadeau is in the same boat. Win out and come first. On the other hand Sceadeau has two 'hard' games on the schedule since he has to play both the 4-1 teams. Lose them both and he's in a world of trouble. The other 4-1 team is really hoping we draw. I think that's his best chance at the first seed. Win out, and have us draw. Otherwise he needs to win out while Sceadeau beats me, or win out while I beat Sceadeau and then lose to a currently 1-4 team. All quite plausible, and he does have a 26% chance of finishing first.
The other division is more open. One team is eliminated. One team is guaranteed to make the playoffs and is 93% to finish first. Even if he loses both of his games he still has a good chance of finishing first. The other 6 teams all have paths to the playoffs and there's lots of interplay between them still. This week alone has current #2 in points against #3, #4 against #7, and #5 against #6. I think the large number of draws to date is what's causing the clumping and the craziness. Most everyone there wants to win while seeing everyone else tie again.
As far as my team is concerned winning this week would be awesome, but a tie is still really good. I'm guaranteed in the playoffs with a tie this week and a win next week. A draw will damage my chances of coming first and getting that precious bye... But I like playing games so maybe I don't even want a bye? We're playing cross groups for the first round of the playoffs so I'd get to play someone I haven't played yet. Sceadeau is playing Khemri and only has one guy with an agility above 2. I need to aggressively target that guy and hope he doesn't regenerate! Alternatively it's tempting to spend some money on petty cash in order to drug test his 'passers'. I don't want him to get a second guy who can handle the ball, or a first guy who can really handle the ball!
Showing posts with label statistics. Show all posts
Monday, March 04, 2013
Thursday, July 26, 2012
ELOBUFF
I found out earlier today that there's a League of Legends related website which is really serious about tracking stats for the game. It's called ELOBUFF and it sounds like it does everything I'd want in a stat site. Avid readers with a good memory may remember my post about positional statistics from a few months ago where I looked at the lolstatistics site and lamented a lot of glitches in their data. It was counting my Akali games in my jungler stats. It wasn't counting my Graves games anywhere. I play Ezreal both AD and AP. In both 5s and 3s. I had no way to possibly glean relevant data from what it was giving me.
ELOBUFF sounds like it solves those problems. It doesn't just look at the end result of a game. It looks at the items you built and the summoner skills you used in an attempt to more accurately determine your role. It claims to have a bajillion filters. It tracks most games, not just your games, and can give actual statistical 'counter-picks' based on the results of hundreds of thousands of games. It stays on top of current trends to let you know how the metagame may be shifting in terms of roles, rules, and items. It sounds fantastic.
Unfortunately there is no free lunch. They're running a subscription model with a $6 per month fee. It doesn't seem to have any sort of free trial which seems a little odd. I'd think letting everyone check out their data once would make sense. It did let me look up the info for the games played at the last MLG. There were 88 games played and Janna was played in 53 of them!
I found out about ELOBUFF because they're running a promotion with MLG for the Summer Arena. Anyone who buys an HD pass to the arena gets a free month on ELOBUFF. That pass is $10 ($8 for gold members) so it's really not much more than the ELOBUFF fee itself. I was planning on buying an HD pass anyway... Except the summer arena for LoL is during WBC. The internet in the hotel was really terrible last year and I can't imagine watching an HD stream would be feasible.
On the other hand I want to support MLG and I really want them to think League of Legends is a game worth supporting. So maybe I'll buy a pass anyway and just watch the VODs on the civic holiday Monday after WBC?
Tuesday, January 11, 2011
Testing Profession Skill-Ups
When I was preparing to work on the Realm First: Illustrious Cooking achievement I found a formula to work out the odds of getting a skill-up depending on when a recipe turned yellow, when it turned grey, and current skill level. I didn't find any proof of this formula anywhere or anyone really giving any justification for it. I found it in a few spots and ran with it, but I needed far less than I thought I would to max cooking. Sky needed far fewer metagems than he thought he would to max jewelcrafting, too. Did we just get a little lucky or is the formula wrong?
I want to find out by testing the formula to see if it stands up for at least one recipe. The issue then becomes that I need a lot of repetition at the same skill level to have any confidence in estimating the true odds of a skill-up at that skill level. My first thought was to just unlearn cooking over and over since it has a recipe at skill 1 that uses only vendor materials. Unfortunately it turns out you can't unlearn secondary professions, so that won't work. I could delete the character every time but you need to be level 5 to learn cooking so I'd need to level the new character every time in addition to having to transfer funds to buy the cheap vendor mats. Alternatively I could run tailoring off linen cloth or engineering off rough stone if I could gather up enough stuff. Tailoring has the advantage that I could make actual greens to disenchant to mitigate some of the cost.
How much stuff is enough stuff? The answer to that question depends on how confident I want to be. If I was testing a single event (like a die or a coin) I'd probably want to be sure within half a percent 99% of the time. Now, n = Z^2/(4E^2) where Z is pulled from a table of values for normal distributions and E is the maximum error I'd accept so I'd need to flip that coin n = (2.5759)^2/(4*(.005)^2) = 66353. That's a very big number. There's an added wrinkle for me as well in that I'm not testing a single event, I'm testing like 60 of them. There's a different probability at every skill level, so I'm estimating the odds of 60 numbers, not 1 number. To maintain that level of confidence would require 133k linen cloth per skill level. Even worse, the first few yellow points will give me a skill-up right away so getting 66k trials on them requires powering the first 24 orange ticks almost that many times as well.
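For anyone who wants to plug in their own numbers, here's a minimal sketch of that sample size calculation. The function name is mine, and the 2 linen per bolt is inferred from the 66k trials to 133k linen figures above rather than checked in game.

```python
import math


def sample_size(z, max_error):
    """Worst-case sample size for estimating a proportion: n = Z^2 / (4 E^2).

    z is the normal-table value for the chosen confidence level and
    max_error is the largest estimation error you're willing to accept.
    """
    n = z**2 / (4 * max_error**2)
    # Round first so floating point noise can't push a whole number up by one.
    return math.ceil(round(n, 6))


LINEN_PER_BOLT = 2  # assumption inferred from the figures in the post

trials = sample_size(z=2.5759, max_error=0.005)  # 99% confidence, within half a percent
linen = trials * LINEN_PER_BOLT

print(f"trials needed at one skill level: {trials}")
print(f"linen cloth for those trials: {linen}")
```

The formula uses the worst case of a 50/50 event, so it overestimates a little for skill levels where the chance is close to 0% or 100%.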
What I can do, though, is relax my confidence requirements. Bolt of Linen Cloth has 25 skill levels between yellow and grey, so I'm expecting a 4% drop between each skill level. So if I set my maximum error to 2% I should still see the gradation between different skill levels. With 25 numbers here I should see if the formula is way wrong or not as long as most of my estimates are good, so I don't need the 99% confidence either. 95.45% (two standard deviations) seems like it should be more than good enough. With those new numbers n = 2500. Ugh. Never mind the fact I can't force the same number of trials for each skill level, plowing 5k linen into each of the 25 skill points is absurd. We're talking a third of the linen needed to open the gates of Ahn'Qiraj just to run this test. That's not going to happen.
What if I assume I want the last skill point to have the 2500 iterations? Then the first one only gets 100 iterations. That really hurts my confidence in the first few goes but does bring my needed linen down to 70k. Is this something I can do or should I just give up? I do also get to make something from the bolts, so I get to run half again as many trials on Brown Linen Pants. 70k seems semi-reasonable but not something I'm going to farm myself. I think I'll just start buying cheap linen on the AH for a while and see how many I end up with.
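Here's a rough sketch of why the trials pile up so unevenly. It assumes the very thing I'm trying to test: that the skill-up chance falls linearly from 100% at yellow by 4% per skill level. It also counts only the trials on the yellow levels, and the function name is just for illustration.

```python
def expected_trials(runs, levels=25):
    """Expected number of crafts spent at each skill level over `runs` passes.

    Assumes (unverified) that the skill-up chance at level k is
    (levels - k) / levels, so on average it takes levels / (levels - k)
    crafts to get past level k once.
    """
    return [runs * levels / (levels - k) for k in range(levels)]


# 100 passes through the yellow range puts about 2500 trials on the last level.
trials = expected_trials(runs=100)

print(f"first yellow level: {trials[0]:.0f} trials")
print(f"middle level:       {trials[12]:.0f} trials")
print(f"last yellow level:  {trials[-1]:.0f} trials")
```

Each level only ever gets one skill-up per pass, so the easy early levels are starved for data no matter how much linen goes in.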
I can't start now anyway, as I need a way to collect the data. The easiest way to do this has to be writing a Lua mod to output information each time something is crafted. Does the language have access to the right information? Only one way to find out, and that's to actually learn to write a WoW mod.