Wildfire2099
Senior Member

Posts: 654
Registered: June 2005
Design a ratings system! Tue, 24 April 2007 18:37
Hi guys,

Erps and others have said on several occasions that they think the current system doesn't work, so let's see what we can come up with. I'll start with my opinions.

First off, let me say that ANY system, no matter how perfect, will be manipulated/abused/tricked; there's no way around it. With open histories, I feel this is not that big a deal, since we can see the manipulation easily.

I like the current system as a whole, but I think it could definitely use some tweaks. As I see it, there are a few major failings of the current system. One is the ability to play non-rated games. The ELO system is designed to give you a score RELATIVE to everyone else, and by not counting some of their games, anyone who does this is effectively abusing the system. (Let's not get into the various reasons y'all play unrated, unless y'all want to discuss that in another thread.)

Another is the fact that most players are more interested in their Ranking than their rating... while the system is generating their rating, the Ranking is sorta just layered on top because we like it. This would change with a ladder system, but I think that would be worse, and I'll get into that in a minute ;)

Most importantly to many, the ELO system allows you (with A LOT of time) to beat low rated players and still earn (some) points. I think this is fine... I don't care who you play, winning, what, 24032 games in a row, like kid did, is damn impressive, and I'd challenge anyone to come close. I think he was rewarded appropriately in the ratings for such a streak. On the other hand, there are players out there who refuse to play people they don't think they can easily beat, and those people do benefit in the long run.

Now, a ladder system would help with this, but I think the HUGE barrier most such systems put in front of new players is not worth it. With most ladder systems, playing someone below you on the ladder is basically just a kindness, and you get nothing for it. There are already (IMO) too many people who set up games as 'top 300' and whatnot; imagine how bad that would get with a ladder. Not to mention the fact that guest opponents would either have to not count, or be included despite not paying (both of which DoW would hate).

What I would propose is modifying the current system in the following way.


1) Make games where your opponent is within, say, 50 points of you (or, if you prefer, some number of ranking places) worth twice as much (or perhaps 1.5 times; we'd have to test things out a bit). This would weight games against opponents of (theoretically) similar skill more than others. Since you lose as many points as you win, I don't think this would increase snobbery too much, while at the same time making it hard for top players to avoid each other and still maintain a top rating.

2) Set a max on unrated games (either some amount per week or, my preference, a %). This would still allow y'all to do tag team tournaments, play with your kids, and even play unrated on an unfamiliar map, but NOT to use unrated games to protect your rating. Clearly, the % involved and the penalty for going over it (I would suggest a 1 point deduction for each game over) would need to be researched, but I think the theory is good.

3) The value of a game should decrease the more times you beat a particular opponent, regardless of relative ratings. For example, say I play 5 games against erps, and he wins all 5. The 6th game would be valued at, say, 75%. He wins again, it goes down to 70%... I win, the next one goes back up to 80%. (Again, the details would have to be researched.) The goal here would be to make multi-account cheating harder, and to get rid of those people who get a very high rating simply because they are the best of the 3 or 4 players they play with regularly.
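
One way to read proposals 1 and 3 is as multipliers on the normal ELO point swing. The sketch below is only an illustration. It assumes the standard logistic ELO formula on a 400-point scale with a base K of 8, and every name in it (`closeness_multiplier`, `repeat_multiplier`, the `lead` counter) is made up for the example. The 50-point window, the 5% step and the 25% floor are placeholders that would need the testing mentioned in the proposals.

```python
def expected_score(rating, opponent):
    """Standard ELO win expectancy (assumed 400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def closeness_multiplier(rating, opponent, window=50, bonus=2.0):
    """Proposal 1: games between closely rated players count for more."""
    return bonus if abs(rating - opponent) <= window else 1.0


def repeat_multiplier(lead, step=0.05, floor=0.25):
    """Proposal 3: the more lopsided a pairing's history, the less a game counts.

    `lead` is a hypothetical per-pairing counter of how far ahead the dominant
    player is; how it moves after each result is left open here.
    """
    return max(floor, 1.0 - step * max(lead, 0))


def rating_change(rating, opponent, score, lead=0, k=8.0):
    """Points gained (or lost) by `rating`; score is 1 for a win, 0 for a loss."""
    weight = closeness_multiplier(rating, opponent) * repeat_multiplier(lead)
    return k * weight * (score - expected_score(rating, opponent))


# Two 1700 players, first meeting: the winner takes 8 points instead of 4.
print(rating_change(1700, 1700, 1))
# Same pairing with the counter at 5: the game is worth 75% of that.
print(rating_change(1700, 1700, 1, lead=5))
```

With these placeholder numbers a counter of 5, 6 and 4 gives 75%, 70% and 80%, which matches the example in proposal 3, but that is just one possible reading of it.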


Ok, Have at it!



      
SYN Stephan1972
Senior Member

Posts: 306
Registered: December 2006
Re:Design a ratings system! Wed, 25 April 2007 02:31
Just a few thoughts:

any ranking system will always be manipulated, even in the real world (just look at school or hospital league tables in the UK).

re: suggestion 1: greater weight to playing players of equal rank

Surely all you achieve is that players who play equally good players will fluctuate twice as fast (if they really are equally good, each will win 50% and lose 50%, and both will end up at the same score). Those who avoid the top players will still steadily creep up. Perhaps a better solution would be that your score is only affected by games played against higher ranked players or those no more than 200 or 300 points below you?

re: suggestion 2: Not sure how playing unrated games is a problem. Surely this is the same as me getting the TTR board out at home and not telling anybody about it. Maybe I have missed something here?

re: suggestion 3: I think this is potentially a good suggestion. How difficult would it be to implement?
      
Zeno
Senior Member
Cadet

Posts: 582
Registered: February 2006
Re:Design a ratings system! Wed, 25 April 2007 02:55
Great idea,

I have been working on putting together a 'power ranking' for various combinations of skill and was planning on opening a similar thread to get opinions on how the various measures should be rated.

Here is the skeleton of the system I was considering

1. Change all ELO scores to z-scores (z = (score-mean)/standard deviation). This will normalize the scores and recognize that 1700 in multi-player is better than 1700 in 2-player. It will also provide a partial answer as to which games are most strategic, as the greater standard deviations would be explained by differential in skill. It will also tell us which scores (across all 9 rankings) are the most impressive. I am currently working on getting a baseline for mean and standard deviation for the 9 rankings, but it is tedious work. This would be used to identify a top 30, whose scores would then be modified as below.

2. Subtract points for low level of play. Staying on the Established list requires 1 rated game every 15 days. My standard would be more like 20 rated games in one month for US, Europe and 2-player, 10 rated games in a month for the variants and multi. This would lessen the effect of coasting that annoys many.

3. Add points for level of opposition. To make it fairest I planned on measuring this many ways, and averaging the results. Measures would include:
number of games v top 200
number of games v top 100
number of games v top 20
proportion of games v each category above
number of different opponents in each category above
ELO-swing in each category above (doubling ELO pluses and subtracting ELO minuses)
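
As a rough illustration of step 1, the z-score conversion could look like the sketch below. The ratings are made up, and using the population standard deviation (`pstdev`) is an assumption; the real baseline would come from the mean and standard deviation of each of the 9 rankings.

```python
from statistics import mean, pstdev


def z_scores(ratings):
    """Map {player: rating} to {player: z}, where z = (rating - mean) / std dev."""
    mu = mean(ratings.values())
    sigma = pstdev(ratings.values())
    return {name: (r - mu) / sigma for name, r in ratings.items()}


# Made-up ratings, purely for illustration.
two_player = {"A": 1700, "B": 1900, "C": 1500, "D": 1300}
multi = {"A": 1700, "B": 1600, "C": 1500, "D": 1400}

print(z_scores(two_player)["A"])  # 1700 on a widely spread list
print(z_scores(multi)["A"])       # 1700 on a tightly bunched list
```

In this toy data the same 1700 comes out at a higher z-score on the bunched list than on the spread-out one, which is the kind of difference the normalization is meant to expose.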

The additions and subtractions are meant to counter two of the most contentious issues in ELO, playing few games and playing mostly weak opponents. Like you, I don't think the ladder system would work. I would add tournament results, except for the time problems. There is no way to run a QT when everyone is awake. The latest version was meant to accommodate as many as possible, but consider the strange cases such as spudamon. Getting together an Australian team for NC seems to be a remote possibility, and there is no way to get Europe, N America and Oceania in a QT at the same time. One possibility might be to change the format, so that there are 4 sets of 16, with each set of 16 playing a mini QT in a set period. Then the final four meet by arrangement within the next two days. If that happened I might say that adding tourneys would be a good thing.
      
DrakeStorm
Senior Member
AdR World Champion 2014

Posts: 1053
Registered: March 2006
Re:Design a ratings system! Wed, 25 April 2007 04:41
A few points:

1.) I really don't think people understand the ELO system. There is this belief that if you just had enough time, could just play enough games, against low ranked players, your rating/ranking would slowly increase till you are the #1 (ranked) player in the world.

This is completely false (under a perfect ELO system). TTR's system isn't perfect, but it's good enough. At some point you level out (give or take some variation because of lucky win streaks or bad losing ones). The exact point depends on your skill. If you truly are a 1700 player, no matter how many low rated people you play, you will end up at 1700 again as long as you are not using some other manipulation to pick particular opponents, etc. And because the TTR ELO system isn't perfect you might end up at 1705 instead of 1700, but I contend that a 1700 player who only plays other good players will also end up at 1705, so who you play doesn't matter. [this would be for the individual maps because with the overall TTR rating you can mess with your rating by playing different maps, etc.].

2.) The biggest problem however is everyone has a different view of what is best. You might say, we are looking for who is the best at TTR. But what does that mean? Best at USA map? Best at USA map against other Tops? or Best at USA map, 2 Player, during CEST time, against other Tops that don't cheat, in a best of 7 series....

I think almost all Top players could come up with some stat where they are the best. thekid has the most #1s at one time, and highest rating (?), I have the highest rating on all 9 Rankings at 1 time, someone else has/had the best 2-player/Multiplayer ranking. Or forget ratings, and you have someone who has won the most QTs, or the most games in the NC, etc.

There really is no use trying to come up with a different ranking/rating system till there is some kind of consensus on what exactly we are trying to measure.

Also are we trying to come up with a system that requires DoW approval/implementation, or just some way to manipulate the available data and process it and post it on some third party site?

3.) If there was a way to stop cheating with open games, etc., one way to make the ratings more meaningful would be to change the k-value of the ELO system. Currently it is 8 (i.e. the amount you and your opponent lose/win adds up to 8). If you dropped it down to say 4 or less for 'regular' games and increased it to 16 or more for 'tournament' games, then those players who play in tournaments against other (good) players would rise to the top, and it would be hard to catch up to them by just playing 'regular' games.

People who can't play in tournaments get penalized, but in any high level event the same would be true. You can't really be the best chess player in the world unless you go to tournaments and play against other good players.

Cheating however needs to be removed otherwise there is an even greater incentive for people to cheat during tournaments since that is where you would get the biggest gain in rating points.

Also there would need to be some guidelines on who can organize and run tournaments, and how they would be structured, and how disputes would be handled, etc. Also what their k-value would be. For example, maybe a QT would be k-value 16 and the NC k-value 32, etc.
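
A minimal sketch of the tiered k-value idea in point 3, assuming the standard logistic ELO formula on a 400-point scale. The table `K_BY_EVENT` and the function names are hypothetical, and the values are just the example figures from the post, not anything DoW uses.

```python
# Hypothetical K-values per game type, taken from the figures floated above.
K_BY_EVENT = {"regular": 4, "QT": 16, "NC": 32}


def expected_score(rating, opponent):
    """Standard ELO win expectancy (assumed 400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def update(winner, loser, event="regular"):
    """Return the new (winner, loser) ratings after one game."""
    k = K_BY_EVENT[event]
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain


print(update(1700, 1700, "regular"))  # (1702.0, 1698.0)
print(update(1700, 1700, "NC"))       # (1716.0, 1684.0)
```

The same result between the same two players moves the ratings eight times as far in an NC game as in a regular one, which is why tournament players would pull ahead under this scheme.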

[Updated on: Wed, 25 April 2007 07:59]

      
erps
Senior Member

Posts: 1633
Registered: July 2005
Re:Design a ratings system! Wed, 25 April 2007 11:06
Hi

Don't get me wrong. The ELO system works. It has some problems, as Drake explained in a very good post (more than one map -> everything feeds into overall). It's a measure of relative playing strength, but mostly for daily games. So you don't capture the stress effect (of tournaments) or the fun effect of players who are not so ranking-driven (and I am still pretty sure that most 1600+ players play a lot better in important games).

And in my opinion DoW will never change this system, because every other system needs more computing power and storage capacity and may be manipulated anyway.

So we (I) have to live with it. It's okay. But it's my right (and the right of others) to value some players (or better: playing styles) more than others. And that's the problem; Drake described it very well. We will never find a value system that all or even most tops (not to mention ALL players, that's even harder) will support.

And this is the reason why I am always organizing events, and I think that we need a league (but only as an additional event, not as a replacement for ELO). This is the only possible way, I think. The players who value tournaments or similar events (and personally I believe that is the majority [a small one, maybe]) will have a different number one than the players who value the ELO system. That's all. I am one of the tournament-style supporters. If a tennis or golf player wins all the grand slams (majors) in a year, for me this is the best player in the world, even if he only plays these 4 or 5 tournaments and others win more minor tournaments or "daily" games and are ranked one or two.

Ah, and one more point, leaving the measurement discussion aside: for me it's important that a good player "lives" (better: plays and posts and talks) up to some standards, and there are some players who set standards and then spurn them after a while. That is a matter of respect for each other. I don't think we can get THIS into a ranking system ;)

bye, erps




      
SYN Stephan1972
Senior Member

Posts: 306
Registered: December 2006
Re:Design a ratings system! Wed, 25 April 2007 11:34
DrakeStorm wrote on Wed, 25 April 2007 03:41

A few points:

1.) I really don't think people understand the ELO system. There is this belief that if you just had enough time, could just play enough games, against low ranked players, your rating/ranking would slowly increase till you are the #1 (ranked) player in the world.

This is completely false (under a perfect ELO system). TTR's system isn't perfect, but it's good enough. At some point you level out (give or take some variation because of lucky win streaks or bad losing ones). The exact point depends on your skill. If you truly are a 1700 player, no matter how many low rated people you play, you will end up at 1700 again




I am no expert on this, but surely your argument assumes that there is a linear relationship between the chance of winning and the number of points you gain for winning. This is true for the vast majority of games, but I imagine it is not true at the extremes. If you pitch a very good player against a very poor player this will not be true. Let me explain using an analogy. Kasparov would always beat me at chess, but under the TTR ELO system would pick up some (very very very) small number of points. So with infinite time, playing only opponents like me, his ELO score would eventually reach infinity. This is because I would never, not once, beat him in return. I accept that TTR is not chess, and I accept that (most of us) don't have unlimited time to devote to the game, but I do think my argument shows that in principle the scores in the ELO system can be "manipulated" by playing only weak opponents many times.
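
Both sides of this exchange can be checked with a little arithmetic. The sketch below assumes the standard logistic ELO formula on a 400-point scale with K = 8 (the actual TTR formula may differ), and the function names are invented for the example. It computes the average rating change per game given a player's real chance of winning, and the rating gap at which that average change drops to zero.

```python
import math


def expected_score(rating, opponent):
    """Standard ELO win expectancy (assumed 400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def expected_drift(rating, opponent, true_win_prob, k=8.0):
    """Average rating change per game if the player really wins `true_win_prob` of the time."""
    return k * (true_win_prob - expected_score(rating, opponent))


def equilibrium_gap(true_win_prob):
    """Rating gap at which the average change is zero (for 0 < p <= 1).

    Infinite if the player never loses.
    """
    if true_win_prob >= 1.0:
        return math.inf
    return 400.0 * math.log10(true_win_prob / (1.0 - true_win_prob))


print(expected_drift(2800, 1200, 1.0) > 0)  # never loses: still creeping upward
print(equilibrium_gap(0.99))                # loses 1 game in 100: levels off
```

Under these assumptions a player who literally never loses keeps gaining (the Kasparov case), while a player who drops even 1 game in 100 stops gaining once the gap reaches roughly 800 points.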
      
Goscha
Senior Member
Winner of the 4th AdR World League

Posts: 316
Registered: January 2006
Re:Design a ratings system! Wed, 25 April 2007 14:16
Hmm
For example, a 1750 ELO player plays every game against players who are rated at 1200 ELO. Every loss costs 7.7 pts and every win earns 0.3. So he has to win at a rate of 23:1 just to hold on to his points.
Not so easy. Just to earn 0.3 points in 25 games, you have to put together one 24-game winning streak after another.
Difficult enough against the lower ranked ticket freaks.
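
The 7.7 and 0.3 figures can be reproduced with the standard logistic ELO formula, assuming a 400-point scale and the K-value of 8 quoted earlier in the thread. This is only a sketch of the arithmetic, not necessarily the exact formula the site uses.

```python
def expected_score(rating, opponent):
    """Standard ELO win expectancy (assumed 400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


K = 8.0  # assumed: the K-value quoted earlier in the thread

win_gain = K * (1.0 - expected_score(1750, 1200))
loss_cost = K * expected_score(1750, 1200)

print(round(win_gain, 1), round(loss_cost, 1))  # 0.3 7.7
print(loss_cost / win_gain)  # wins needed to pay for one loss
```

The last line comes out between 23 and 24, in line with the 23:1 ratio above.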

So I think the ELO system is nearly perfect.
Perhaps it could be tuned a little more cleverly.
Right now the 1 point win/loss difference comes at a gap of 90 ELO pts.
Perhaps the formula could be changed so that it comes at 70 ELO pts.

Another point:
Perhaps I stand alone with Drake.
I read that many players are against the new ratings and that the overall ranking is still the one and only.
I say: different games, different high scores; the new rankings are worth more than the overall ranking. And the overall ranking is clearly flawed.
You can climb to the top by playing only one map.
It's like a decathlete winning the competition only by jumping very high.
I think the overall ranking has to be a ranking which counts all scores on all maps.
If we want to find out who is the best "Overall TTR Player", we need such a high score list. The current overall ranking is only a fake. Perhaps another ranking could be added, called "Highest Scores", in the following style.
1. 1950 Lutz ........... USA
2. 1943 Willi............USA
3. 1930 Klaus............1910
4. 1925 Peter............Europe
...
...
...
...

[Updated on: Wed, 25 April 2007 14:40]

      
Wildfire2099
Senior Member

Posts: 654
Registered: June 2005
Re:Design a ratings system! Wed, 25 April 2007 17:04
stephan1972 wrote on Tue, 24 April 2007 20:31



re: suggestion 1: greater weight to playing players of equal rank

surely all you achieve is that players who play equally good players will fluctuate twice as fast (if they really are equally good you will win 50% and lose 50% and both will end up at the same score). Those who avoid the top players will still steadily creep up. Perhaps a better solution would be that your score is only affected by games played against higher ranked players or those no more than 200 or 300 points below you?



The idea is that if you're a good player, you'll gain more points playing higher quality opponents, while not making it pointless to play lower rated players (which I think is important). If you NEVER play random games, you're right, this would have no effect... it would, however, make it harder to 'duck' quality opponents to attempt to preserve your rating, which some people feel is done.

stephan1972 wrote on Tue, 24 April 2007 20:31



re: suggestion 2: not sure how playing unrated games is a problem. surely this is the same as me getting the TTR board out at home and not telling anybody about it. maybe I have missed something here?



Essentially, by playing a non-rated game, you're going 'outside the system', and since the ELO system is totally based on relative strength, removing any game from that system essentially warps your score. For example, say you're at 1700, and you play a friend who doesn't play online much and is at 1300, and you lose two out of three unrated games... you SHOULD have lost about 14 points. Therefore your rating is now showing as 1700, but it SHOULD BE 1686. Over enough games it evens out, but if you play a lot of unrated games, it will warp your rating.
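
The "about 14 points" figure can be checked with a short sketch. It assumes the standard logistic ELO formula on a 400-point scale with K = 8, and for simplicity scores every game from the starting rating rather than updating between games. The function name `hidden_change` is made up for the example.

```python
def expected_score(rating, opponent):
    """Standard ELO win expectancy (assumed 400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def hidden_change(rating, unrated_games, k=8.0):
    """Points the unrated games would have moved `rating`, had they counted.

    Each game is (opponent_rating, score) with score 1 for a win and 0 for a
    loss. For simplicity every game is scored from the starting rating.
    """
    return sum(k * (score - expected_score(rating, opp)) for opp, score in unrated_games)


# 1700 player, three unrated games against a 1300 friend: one win, two losses.
print(round(hidden_change(1700, [(1300, 1), (1300, 0), (1300, 0)]), 1))  # -13.8
```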

I admit that this is more of a pet peeve than anything, but I do feel it adversely affects the system.
      
Wildfire2099
Senior Member

Posts: 654
Registered: June 2005
Re:Design a ratings system! Wed, 25 April 2007 17:23
Some other comments:

@ Zeno: I think a z-score is a FANTASTIC idea, with one problem... we don't have the data. Since we don't know where guests are at, the info we obtain would be severely skewed. Now, since each chart is skewed the same way, it might work out. The best part is that this could be done with existing data, with some crazy Erps program. I'd say get on it! I think your other stuff might be too complex, even just to look at, for people who aren't statheads.

@ Drake and Stephan: I agree 100% that ELO is way better than people think. Stephan mentioned chess with Kasparov, and for chess, I agree. With TTR, however, there's enough luck involved that I bet even thekid playing dumbot might lose, say, 1 out of 100... and the system makes that loss hurt enough that the rating doesn't change much. As far as weighting tourney games more, this is what I was talking about when I suggested making games against closely rated players worth more (you'd change the value of the game). The problem with your suggestion is implementation; obviously, a check box would be abused, and I don't think DoW would be willing to spend the time doing any sort of manual thing. It's a great idea though.

It's a good point about people having different ideas of what is good, but that's the case with anything... take baseball... some people think Home Runs rock, others like Batting Average... who's a better hitter, Ichiro or Barry Bonds? There's no system that would be definitive for all people.
For example

@ Erps: A league would be great fun, when are you gonna start that? ;)

It's clear that you value performance under pressure (such as a tournament) over consistent everyday play (such as thekid's streak). I personally disagree, but that's OK :)
I think for your purposes, any rating system is fine; you're just looking for a way to rank people for tourneys, am I right? ;)


      
Wildfire2099
Senior Member

Posts: 654
Registered: June 2005
Re:Design a ratings system! Wed, 25 April 2007 17:25
One last thing, I really like the idea of a 'high score list'... perhaps even 'highest ever' could be included in the profile?

That would be really nice, and could give you an idea if you're playing someone new whose rating isn't solid yet, someone who used to be good but doesn't play much anymore, etc.
      
SKMorefield
Senior Member

Posts: 619
Registered: January 2005
Re:Design a ratings system! Wed, 25 April 2007 20:20
Goscha wrote on Wed, 25 April 2007 08:16

Hmm
For example a 1750 ELO player battles everytime against players who are rated at 1200 ELO. Every lost costs 7,7 pts and every win gets 0,3. So he has to play 23:1 only to save his points.
Not so easy. Only to earn 0,3 points in 25 games you have to make a 24 game winning streak after the other.
Difficult enough against the lower ranked ticket freaks.




This is yet another reason why some players (like me) only play tops. I don't have the TIME to spend to play 23 games to make up for a hard-luck loss. There really is no option for me to move up the ranks this way, even if I enjoyed it and wanted to pick on noobs all the time.


SKM
      
Goscha
Senior Member
Winner of the 4th AdR World League

Posts: 316
Registered: January 2006
Re:Design a ratings system! Wed, 25 April 2007 21:52
I think the problem doesn't exist in reality.
No top player likes to play noobs all the time.
Most of the players play only the tops.
Some players like TheKid, TPrail, An-Team and me play all comers.
Of my more than 7000 games, I opened 90%. And I have never seen a top open a game like "only guests" or "below 1400". When I play, I often play even at night, when no tops are available except Phil.
Some others say: "It's not fair that the tops only play among themselves." For me the discussion is not worth it. I have no strategy for choosing my opponents. I have never joined a game opened by a low ranked player. But if some want to play me, I let them.
Why should that be punished?
IMHO, the ELO system accounts for the differences between opponents very well, but I often hear between the lines that some tops disapprove of the "playing all" style.
And if some say it's easier to climb that way, I can just as well say "It's easier to stay in the tops by playing only other tops".
And nobody can say who is right.


[Updated on: Wed, 25 April 2007 21:53]

      
psteinx
Senior Member

Posts: 324
Registered: November 2005
Re:Design a ratings system! Wed, 25 April 2007 22:09
ELO is a reasonable enough system, provided you're comparing apples to apples.

If everyone only played 2 player US, ELO would be pretty solid.

The problem is that all of the ratings, to various extents, are comparing apples to oranges.

'Overall' is the worst. You've got thekid, who for weeks (months?) was at the top of the list, despite playing a different map and largely different opponents than almost everyone else. Yes, he achieved a high score in Swiss and Overall and 2 player (the 3 things his game points went towards), but I'd argue that only the Swiss measure was a truly valid indicator of his strength, relative to other Swiss players.

There are 2 main reasons for this:

1) Each map/player count combo has a different luck factor. Thekid himself has posted about this - I think his conclusion was that Swiss had very low luck, US more, Europe more still, and Mega the most (or something like that). If there are 2 players of equal skill, and one is playing a game variant where luck will allow more mid-level players to beat him/her, then that player will have a lower overall and 2 player rating.

2) I'd venture that most players are most familiar with the US map, and spend most of their time there, venturing out only occasionally. But when they play different maps, their skill level is less than it would be on the US map because they're less familiar with that map. A 1500 player (rating earned mainly on the U.S. map), will likely play like a 1400 or even 1300 player on the Swiss map, but they'll still give up points as a 1500 player. Now, if both players are equally unfamiliar with the map in question, then it's a wash. But obviously, some players are specializing in the more obscure maps.

Now none of this goes against kid or anyone else playing stuff other than 2 player US. But conversely, to the extent that kid's overall and 2 player ratings were earned in that Swiss environment, then you can't use his rating to meaningfully make comparisons against other players. Is he better/worse than player X? We don't know, because the ratings are comparing apples to oranges.

The only full solution to this would be an even more detailed rating system, tracking 2 player US, 3 player US, and so on. But that's far more complexity than DoW is likely to support, or most players would care about.

In the absence of that, I agree with Wildfire that it will basically have to be like baseball - we can argue who the best hitter is - the home run slugger, the OBP machine, the speedster, the all-around guy, etc. It makes it fun, and gives more people a little slice of glory that they can shoot for.

Of course, we all know that the best player is the player who achieved simultaneously #1 US, #1 Multi, #2 2 player, and #4 Overall.

:) :) :)

[Updated on: Thu, 26 April 2007 00:14]

      
Wildfire2099
Senior Member

Posts: 654
Registered: June 2005
Re:Design a ratings system! Wed, 25 April 2007 23:34
@ Goscha : I agree with you 100%, and play the same way... not sure I count as a 'top' though... I sorta flit on the edge of being at the top. I think if everyone did the same, we wouldn't be having this argument; my ideas are sorta a compromise to make everyone happy.

@ phil : I agree here too, but I think if DoW had reset the ratings completely when they did the split, we might have been better off. Overall would still be a mishmash, but at least it would be more comparable to the specialized rankings. As it is, the overall is sorta on a different scale (which is why Zeno's z-score idea is great).

      
DrakeStorm
Senior Member
AdR World Champion 2014

Posts: 1053
Registered: March 2006
Re:Design a ratings system! Thu, 26 April 2007 00:11
I'm thinking that the most meaningful ratings are the individual map ones and the multiplayer one. Overall and 2-player can both be manipulated by playing a map like swiss or big cities. You could also manipulate the multiplayer rating if one map was easier to play multiplayer, but I think the USA map is the easiest and that is an acceptable map to most people so no need for any change.

So for the 'purists' the USA map rating would probably be the most important, and for the 'daring' it would be multi-player!

If we just got rid of Overall and 2-player, I think it would be a start in the right direction in determining who the best player(s) are. After that, we would then have to really only deal with the playing low ranked players vs. playing high ranked players issue. And I think tournaments or leagues or something could help in that regard. Also there is the playing one game every 15 days problem, but I'm not as concerned about that - I think it should be 5 games every month instead of 2, but something like 10 is too many (if people really started playing all the maps, that would be 6*10 = 60 and that's a bit much).

I really think the overall rating and 2 player rating hold back the community A LOT. If those didn't exist, I'm sure a lot of Tops would venture out and play Swiss or Europe or Mega... And if the Tops are playing those maps, other people would follow.

      
psteinx
Senior Member

Posts: 324
Registered: November 2005
Re:Design a ratings system! Thu, 26 April 2007 00:18
Drakestorm - you're forgetting a reason many players avoid maps other than US and Europe.

A high percentage likely don't have the CD-ROM version.

And of the remainder who do, many dislike playing through that UI. I count myself among this group.

So with few players able and willing to start Big Cities or Swiss games, there are naturally far fewer players. Look at the rankings boards, and there's a big falloff from US and Europe to any of the others.

      
DrakeStorm
Senior Member
AdR World Champion 2014

Posts: 1053
Registered: March 2006
Re:Design a ratings system! Thu, 26 April 2007 00:46
psteinx wrote on Wed, 25 April 2007 15:18


A high percentage likely don't have the CD-ROM version.

And of the remainder who do, many dislike playing through that UI. I count myself among this group.




The CD-Rom UI is better! It's just slowwww!

But you are correct.

So then forget those and just focus on USA/Europe/Multiplayer.

On a side note: DoW should change how you open games. If you own the CD-Rom game you should be able to open the new maps through the Java Applet. Also, they should give people access to the Maps if they buy a web card! I suggest 3 tiers: 1.) guests, 2.) people who get 6-month trial access by buying a DoW game and can open USA/Europe games, 3.) people who get full access (all maps) by buying a web card (buying the CD-Rom game would be like buying a 1 year web card). Also, I think people who buy a web card should have some access to 'guest' and 'inactive' stats; that way we could compute ratings, etc. better.
      
Zeno
Senior Member
Cadet

Posts: 582
Registered: February 2006
Re:Design a ratings system! Thu, 26 April 2007 02:35
It seems that I have offended some people with my suggestion that the power rankings should include a bonus for level of opposition. This would benefit the 'top 200 pw 200' players and hurt the open table players. I just want to say that I have a lot of respect for players such as Drake, Goscha, kid et al who have the skill to play open table to such a high level. I made the suggestion for 2 reasons.

First, the rankings currently favor the open table players, since the lower level players are still overranked. This is your moment to shine; the new tables are skewed in favor of those who play open table.

Second, I want a measure of how well people would do head to head. In this case those who play only other tops are honing skills that give them an advantage in such meetings.

BTW, the suggestion is not meant to take anything away from the current ELO-leaders. Players such as kid are doing remarkable things with the new ELO tables. If a power table was put together, then as soon as kid reached the top in all of the other rankings, he would probably go after that one as well, and likely succeed. The power ranking I envisioned is meant to be another thing to focus on, which would be of more interest to the 'top 200 pw 200' players.
      
thekid
Senior Member
Winner, AdR European Map Championship 2010

Posts: 1054
Registered: December 2004
Re:Design a ratings system! Thu, 26 April 2007 04:07
I didn't see anyone being offended, Zeno. All of the suggestions that provide a temporary bump, like a 32 scale for tourneys, will be just that: a temporary bump. If a 1700 player meets an equal player in each of 5 QT rounds and wins all 5, they will be at 1780. Are they a 1780 player? No, they'll give it all back in the coming weeks, unless they up their skill level.
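
The 1780 figure follows if each of the five opponents is rated level with the player's current rating, so that every win is worth 16 points at K = 32. The sketch below assumes the standard logistic ELO formula on a 400-point scale; `win_streak` is a made-up helper, and the second call shows the slightly smaller bump when all five opponents sit at a fixed 1700.

```python
def expected_score(rating, opponent):
    """Standard ELO win expectancy (assumed 400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def win_streak(rating, wins, k, opponent=None):
    """Rating after `wins` straight wins.

    With opponent=None each opponent is assumed to be rated exactly level with
    the player's current rating; otherwise every opponent has the fixed rating
    given.
    """
    for _ in range(wins):
        opp = rating if opponent is None else opponent
        rating += k * (1.0 - expected_score(rating, opp))
    return rating


print(win_streak(1700, 5, k=32))                 # 1780.0
print(win_streak(1700, 5, k=32, opponent=1700))  # a bit less than 1780
```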

Even worse are the ladder ideas. I used to play chess on Yahoo, and the ladder leaders didn't have really high ratings; they must have been playing friends and cheating. It will also lead to an awful elitist society, which I think is close to existing right now as is. You top players started here in a time when the top players would give you games. That didn't happen for my first 6 months here. Doing that to others will cause potentially good players to lose interest and leave, which I think is bad for the game as a whole.

Here are some numbers for you to look at, which I think are interesting. I started playing the U.S. map exclusively 98 games ago. My overall was at 1850 and 2 player at 2009. Right now my overall is at 1850 and 2 player 2018. I have always believed that people who play everyone had their overalls suppressed by the fact that most people play to a lower multi rating than they do a 2 player rating. And this seems to support it.
      
Wildfire2099
Senior Member

Posts: 654
Registered:
June 2005
Re:Design a ratings system! Thu, 26 April 2007 04:09
DrakeStorm wrote on Wed, 25 April 2007 18:46

psteinx wrote on Wed, 25 April 2007 15:18


A high percentage likely don't have the CD-ROM version.

And of the remainder who do, many dislike playing through that UI. I count myself among this group.




The CD-ROM UI is better! It's just slowwww!

But you are correct.

So then forget those and just focus on USA/Europe/Multiplayer.

On a side note: DoW should change how you open games. If you own the CD-ROM game, you should be able to open the new maps through the Java applet. Also, they should give people access to the maps if they buy a web card! I suggest 3 tiers: 1) guests, 2) people who get 6-month trial access by buying a DoW game and can open USA/Europe games, 3) people who get full access (all maps) by buying a web card (buying the CD-ROM game would be like buying a 1-year web card). Also, I think people who buy a web card should have some access to 'guest' and 'inactive' stats; that way we could compute ratings, etc. better.


I agree completely... I've never tried the CD version, but I can tell you that the fact that you have to use it (and are therefore limited to the 'puter you install it on) to open the alternate maps convinced me not to buy the CD-ROM.

If they changed it so that by having the web card input on your account, you could access the other maps through the java applet, I'd buy it tomorrow.

I'd love access to the guest stats too, but then any guest could get the info from someone not a guest, so DoW would never go for that.

      
SKMorefield
Senior Member

Personal pages
Posts: 619
Registered:
January 2005
Re:Design a ratings system! Sat, 28 April 2007 00:56
DrakeStorm wrote on Wed, 25 April 2007 18:11

I really think the overall rating and 2 player rating hold back the community A LOT. If those didn't exist, I'm sure a lot of Tops would venture out and play Swiss or Europe or Mega... And if the Tops are playing those maps, other people would follow.



This is probably a true statement, although I think at present the overall is still the most important rating. As someone who plays only 2 player USA, my overall, as expected, mirrors my ranking on those other 2 charts. Whichever map you play, the overall is comparing each player playing his/her map of choice with each other, with the most dominant at whatever map coming out on top (kid is more dominant at Swiss than anyone else is at USA).

I'm not sure we should get rid of the overall yet, but I do agree that I, for one, don't want to tank my overall rating to learn a new map, especially when there isn't anyone good playing those maps (for the most part), and the other maps (besides Swiss) have so much luck. Maybe most players won't admit it, but I have a feeling lots of others feel this way also.

I do a good enough job tanking my rating playing what I'm decent at without adding more obstacles! :(

SKM
      
DrakeStorm
Senior Member
AdR World Champion 2014

Posts: 1053
Registered:
March 2006
Re:Design a ratings system! Sat, 28 April 2007 01:40
SKMorefield wrote on Fri, 27 April 2007 15:56

and the other maps (besides Swiss) have so much luck.
SKM


This is probably the view of a lot of other top players, but it is pretty much a false statement.

What does 'luck' mean in TTR? It might be different for other people, but for me it means: if you were a good player and just played a bunch of random guests, what would your win percentage be on a particular map? If the map is 100% skill, you should win all of them, and if the map is all luck, you should win 50% of them (assuming a large enough sample size).
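As a rough illustration of the "large enough sample size" caveat, here is a minimal sketch that puts an error bar on an observed win percentage. The linear `skill_share` mapping between the two endpoints named in the post (50% = all luck, 100% = all skill) is an assumption made only for illustration, and the interval uses the ordinary normal approximation for a proportion.

```python
import math


def skill_share(win_rate):
    """Hypothetical linear scale: 0.5 -> 0.0 (all luck), 1.0 -> 1.0 (all skill)."""
    return 2.0 * win_rate - 1.0


def win_rate_interval(wins, games, z=1.96):
    """Approximate 95% interval for the true win rate (normal approximation)."""
    p = wins / games
    half_width = z * math.sqrt(p * (1.0 - p) / games)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Same 80% win rate, very different certainty:
print(win_rate_interval(16, 20))    # wide interval after 20 games
print(win_rate_interval(160, 200))  # much tighter after 200 games
```

With only a few dozen games per map, the intervals for two maps can easily overlap, so small differences in win percentage say little about which map is more luck-driven.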

Well, from my experience, I don't notice much difference in my win percentage between US, Europe, Swiss, or Big Cities. I notice it's a little bit worse for Mega (I'm trying to play this map more to figure out if I can increase my percentage or if it really is just more luck-ridden), and the worst is 1910 (don't think there is much hope for this variant).

People agree there isn't a huge amount of luck in USA, but there must not be that much luck in Swiss if thekid can win 100 games in a row, or Big Cities if I can win 40-50 in a row.

Just because other tops don't play a map doesn't mean it's because of the luck factor. They might claim it's that, but I think it's because they are afraid to lose their precious overall points. Not having the CD-ROM game, or not liking the UI, is fine as an excuse, but very few tops have ever joined one of my open Big Cities games (or probably thekid's Swiss games), where those 2 factors wouldn't come into play.

Again, I think the reason is the protecting of one's overall rating. What top player really wants to play against thekid on the Swiss map, when he is obviously way better than everyone else at it (i.e. the map takes skill and thekid has it and the other tops don't)? Basically, other tops don't want to learn any other maps and just come up with an excuse not to play them. Some are valid excuses, others are not.

It takes time to master a map, and I know there are people who have families and jobs and other stuff and can't play that many games, and really don't have the time to learn a map (though I think I learned Europe and Swiss within 30-40 games). But if you are logging 100s of games, then that shouldn't be an issue.

      
Wildfire2099
Senior Member

Posts: 654
Registered:
June 2005
Re:Design a ratings system! Sat, 28 April 2007 02:13
@ Drake -

Not having the CD-ROM isn't an 'excuse'. I'd love to play more Swiss, and try 1910 and Big Cities more, but I'm not really willing to pay for it as it is. I play more during downtime at work (on a Mac) than at home, so only being able to use it through the program is the big problem for me.

I've never seen you open a game on an alternate map, but if I did, I'd definitely jump in if I could... same with thekid and Switzerland.

You gotta remember, guests are VERY fast, since they can't play any other way.

Have you ever asked someone for a game?
      
Baron Von Schmidt
Senior Member

Posts: 235
Registered:
January 2007
Re:Design a ratings system! Sat, 28 April 2007 09:44
Goscha wrote on Wed, 25 April 2007 15:52

I think the problem doesn't exist in reality.
No top player likes to play noobs every time.
Most of the players play only the tops.
Some players like TheKid, TPrail, An-Team and me play all comers.



Nope, I'm an all comer but tprail is scared to play me. :twisted:

Baron
      
Baron Von Schmidt
Senior Member

Posts: 235
Registered:
January 2007
Re:Design a ratings system! Sat, 28 April 2007 10:10
Wildfire2099 wrote on Wed, 25 April 2007 11:04

Essentially, by playing a non-rated game, you're going 'outside the system', and since the ELO system is totally based on relative strength, removing any game from that system essentially warps your score. For example, say you're at 1700, and you play a friend that doesn't play online much, and is at 1300, and lose two out of three unrated games... you SHOULD have lost about 14 points. Therefore your rating is now showing at 1700, but it SHOULD BE 1686. Over enough games, it evens out, but if you play a lot of unrated games, it will warp your rating.

I admit that this is more of a pet peeve than anything, but I do feel it does adversely affect the system.
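The quoted 1700-versus-1300 example can be worked through with a minimal sketch, assuming the textbook ELO formula and holding both ratings fixed across the three games (a simplification). The K-factor the site uses is not stated in the thread, so `k` is left as a parameter, and the function names are illustrative.

```python
def expected_score(rating, opponent):
    """Textbook ELO expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def net_change(rating, opponent, scores, k):
    """Total rating change over several games (1 = win, 0 = loss),
    with both ratings held fixed for simplicity."""
    expected = expected_score(rating, opponent)
    return sum(k * (score - expected) for score in scores)


# One win and two losses for the 1700 player against the 1300 player.
for k in (8, 16, 32):
    print(k, round(net_change(1700, 1300, [1, 0, 0], k), 1))
```

Under these assumptions, the "about 14 points" figure corresponds to a K-factor of roughly 8; a larger K scales the loss proportionally.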

May I suggest you go back and read the many many posts BEFORE there were unrated games about this exact subject? Then you will understand better. Unrated games are a MUST. I don't do them much (incredibly rare for me) but if others didn't have them as an option it would just be another problem.

I saw Pilke in the lobby tonight and she came to do one unrated game with Anu. That is not because Pilke is trying to protect her rating but because if she played rated it would have taken up a spot in the top 5 for 15 days. Pilke is just nice that way.

There are lots of other reasons, go look at the early threads.

Baron
      
SKMorefield
Senior Member

Personal pages
Posts: 619
Registered:
January 2005
Re:Design a ratings system! Sat, 28 April 2007 13:44
This is true. Think of it this way: if you played a game at home with friends it's 'unrated' right? Same thing. Kim and I play this way (unrated) when we play, just so we don't have to shuffle the cards. It's a good option and doesn't take away from the rankings. There's another mentality altogether when you are playing a rated game. I do think ppl should have to play more rated games to stay ranked though, more than one every 15 days.



SKM
      
Wildfire2099
Senior Member

Posts: 654
Registered:
June 2005
Re:Design a ratings system! Sun, 29 April 2007 00:30
We'll just have to agree to disagree on this one.

      
Baron Von Schmidt
Senior Member

Posts: 235
Registered:
January 2007
Re:Design a ratings system! Sun, 29 April 2007 09:58
Wildfire2099 wrote on Sat, 28 April 2007 18:30

We'll just have to agree to disagree on this one.




You only disagree because you do not understand. I take it you have NOT read the arguments for unrated games. I'm not going to go over every single one for you, go read them. If there were not unrated games, trust me you would be complaining about all the people who are friends or family that play each other where one of the players wins most of the time. You might even say "Hey that player is bumping up their rating by playing games against their 5 year old daughter!"

In fact, no one would care much about Nando/Mikkel if he played unrated vs. his friends who let him win all the time. That is what he SHOULD be doing, since we have unrated games.

I am sure glad we do have an unrated option. I add my thanks to the MANY players who already thanked you, DoW, when unrated games were made available.

As I said there are lots of other reasons.

Baron
      
Baron Von Schmidt
Senior Member

Posts: 235
Registered:
January 2007
Re:Design a ratings system! Sun, 29 April 2007 10:02
This message has no content

[Updated on: Sun, 29 April 2007 10:03]

      
kolmo
Senior Member

Personal pages
Posts: 556
Registered:
November 2006
Re:Design a ratings system! Sun, 29 April 2007 20:54
Hello,

Notwithstanding all that was said (did not read all but will soon), here are two simple propositions:

P1. Never rate games with guests.

P2. If we want to keep in the spirit of ELO, we could use the Glicko rating system. See http://math.bu.edu/people/mg/glicko/ for more details.

The first one is a no-brainer. Not enforcing this policy creates deflation. As an old chessplayer, let's just say I know what I am talking about.

The second one could be debated, but is quite simple to implement. The idea is to estimate the faithfulness of a rating according to the frequency of play. The more you play, the more your rating becomes stable.

The fact that this system uses a Bayesian approach to probability, though philosophically daring, is not that important. But the fact that this system is happily implemented at chess sites like http://www.freechess.org seems an interesting asset.
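For readers who want to see what "simple to implement" means here, below is a minimal sketch of a Glicko-1 update for one rating period, following the formulas in Glickman's published description of the system (linked above). The function and variable names are illustrative, and it covers two-player results only; how multi-player T2R games would be fed in is left open.

```python
import math

Q = math.log(10) / 400.0  # Glicko-1 scaling constant


def g(rd):
    """Discount applied to a result according to the opponent's rating deviation (RD)."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q * Q * rd * rd / math.pi ** 2)


def expected(r, r_opp, rd_opp):
    """Expected score against an opponent, softened by the opponent's RD."""
    return 1.0 / (1.0 + 10 ** (-g(rd_opp) * (r - r_opp) / 400.0))


def glicko_update(r, rd, games):
    """Return (new_rating, new_rd) after one rating period.

    games: list of (opponent_rating, opponent_rd, score), score in {1, 0.5, 0}.
    """
    inv_d2 = 0.0   # accumulates 1 / d^2
    total = 0.0    # accumulates g * (score - expected)
    for r_opp, rd_opp, score in games:
        g_j = g(rd_opp)
        e_j = expected(r, r_opp, rd_opp)
        inv_d2 += Q * Q * g_j * g_j * e_j * (1.0 - e_j)
        total += g_j * (score - e_j)
    denom = 1.0 / (rd * rd) + inv_d2
    return r + (Q / denom) * total, math.sqrt(1.0 / denom)


# Worked example from Glickman's description: one win, two losses.
new_r, new_rd = glicko_update(
    1500, 200, [(1400, 30, 1), (1550, 100, 0), (1700, 300, 0)]
)
print(round(new_r), round(new_rd, 1))
```

The RD shrinks with every rated game, which is the "the more you play, the more your rating becomes stable" behaviour described above.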



[Updated on: Mon, 30 April 2007 15:06]

      
dewey
Junior Member

Posts: 1
Registered:
March 2006
Re:Design a ratings system! Sun, 29 April 2007 23:54
oops, wrong account. :oops: BTW, I think I mentioned in another thread, but while in New Zealand playing on my despised laptop I have been using a second account called 'SecondAccount', but I have switched that to Zeno_alt.

[Updated on: Mon, 30 April 2007 03:31]

      
Zeno
Senior Member
Cadet

Personal pages
Posts: 582
Registered:
February 2006
Re:Design a ratings system! Sun, 29 April 2007 23:59
@ Kolmo

Interesting link. First impression is that it does not match our needs, however. Chess players may take a year off tournament play and come back rusty, or they might have been playing actively and studying, but not entering any rated tourneys. As a result, time off leads to less certainty as to where they belong, and so the Glicko method is an interesting way of getting them to the rating they deserve, while not unduly helping/harming their opponents. While I suppose the same thing could happen here, the tendency is that time off only hurts. The best way to learn the game is to play the game, and the only reasonable place to practice the game against top competition is here. This does lead to two other pet concerns that are worth mentioning.

The erps-concern: erps has expressed the concern that if the giants of the past all came back and played a single game against a clueless beginner, then he could be knocked out of the top twenty. The people from the past probably do not deserve their ratings, because they have become rusty and the quality of play has improved, and erps has said that he would like to see them treated as provisional until they have played at least twenty games. A variant of this is best expressed by SKMorefield, and is that sometimes people panic after being on a long winning streak, and become low-frequency players. Ratings fluctuate, and sometimes a 1720 player goes through a bad spell and is down to 1620. Then again, that player might hit a hot streak and go up to 1790. Now they believe that they are over-rated, and so they coast, playing a rated game every fortnight, and tons of unrated games in between, so that they can keep that top 5 position. This blends into ...

The wildfire concern: There is a method of studying without taking a hit to the ratings, and that is to open unrated games. Sometimes it is a matter of working through a rough patch. We all have times when our timing is off, when we decide to take that six next turn only to have our opp. snatch it. To get our timing back we could use a second account or play unrated. Sometimes it is a matter of trying a new style of play. Sometimes it is a matter of trying a new map. Often it is a matter of coasting. Wildfire plays all games rated, including his experimental games and times when he feels he might be playing poorly or over-rated, trusting that ELO will in the long run return him to his proper level and that continuing to play is the key to improvement. He believes that learning outside of the system penalizes others. I understand his point, but I still think that there are substantial reasons for unrated games, and that the pros outweigh the cons.
      
kolmo
Senior Member

Personal pages
Posts: 556
Registered:
November 2006
Re:Design a ratings system! Mon, 30 April 2007 02:05
Zeno,

The idea of the Glicko system is exactly to take care of the problem you raise. As soon as you play, your rating tends to show your objective strength, with or without study.

If it is overvalued, it will fall quite fast, which incidentally solves the problem of the player who avoids playing to keep their rating up.

If it is undervalued, it will go up faster than with ELO, solving the problem of the player who does not play well in their first 20 games and then needs to play a lot to gain a decent rating.

This last problem is, according to Glickman, a big problem for the ELO rating.
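The "falls fast / rises fast" behaviour can be sketched as follows, assuming the Glicko-1 formulas from Glickman's description. The constant `c = 34.6`, the 350 cap, and the idea of counting inactivity in rating periods are illustrative choices, not anything the site has specified.

```python
import math

Q = math.log(10) / 400.0  # Glicko-1 scaling constant


def inflate_rd(rd, idle_periods, c=34.6, rd_max=350.0):
    """Rating deviation after `idle_periods` rating periods without play."""
    return min(math.sqrt(rd * rd + c * c * idle_periods), rd_max)


def single_game_change(rd, opp_rd, rating_diff, score):
    """Rating change from one game; `rating_diff` is own rating minus opponent's."""
    g = 1.0 / math.sqrt(1.0 + 3.0 * Q * Q * opp_rd * opp_rd / math.pi ** 2)
    e = 1.0 / (1.0 + 10 ** (-g * rating_diff / 400.0))
    inv_d2 = Q * Q * g * g * e * (1.0 - e)
    return Q / (1.0 / (rd * rd) + inv_d2) * g * (score - e)


active_rd = 50.0
rusty_rd = inflate_rd(active_rd, idle_periods=30)

# Same loss against an equally rated, well-established opponent:
print(single_game_change(active_rd, 50.0, 0.0, 0))  # small drop
print(single_game_change(rusty_rd, 50.0, 0.0, 0))   # much larger drop
```

A returning or low-frequency player carries a larger RD, so each result moves their rating further in either direction until regular play shrinks the RD again.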

Another big problem with applying ELO ratings to T2R is that ELO ratings are better suited to series of games, as in a tournament environment, with bonus points to adjust undervalued ratings.

But the main problem with our pseudo-ELO system is that it rates games against unrated opponents, which is not a well-thought-out decision, to say the least.

PS: Neither chess nor T2R relies on proprioception, so both should be affected by lack of practice. But we're still largely in the dark on this matter, as usual in cognitive science.
      
Wildfire2099
Senior Member

Posts: 654
Registered:
June 2005
Re:Design a ratings system! Mon, 30 April 2007 05:11
Baron_ wrote on Sun, 29 April 2007 03:58

Wildfire2099 wrote on Sat, 28 April 2007 18:30

We'll just have to agree to disagree on this one.




You only disagree because you do not understand. I take it you have NOT read the arguments for unrated games. I'm not going to go over every single one for you, go read them. If there were not unrated games, trust me you would be complaining about all the people who are friends or family that play each other where one of the players wins most of the time. You might even say "Hey that player is bumping up their rating by playing games against their 5 year old daughter!"

In fact, no one would care much about Nando/Mikkel if he played unrated vs. his friends who let him win all the time. That is what he SHOULD be doing, since we have unrated games.

I am sure glad we do have an unrated option. I add my thanks to the MANY players who already thanked you, DoW, when unrated games were made available.

As I said there are lots of other reasons.

Baron


I've read them... playing the same person over and over again doesn't bother me, which you would see by my responses in those threads you're referring to. In fact, IIRC, I defended Nando some.

If you beat your 5 year old daughter over and over again, she's gonna have a pretty low rating, and not impact things much.

IMO, the ONLY valid reason for a non-rated game is a tag team match, since that is basically a game variant, and not a standard TTR game.

Playing unrated because you want to try a new map, or are playing a friend, etc., MOST DEFINITELY adversely affects the entire ratings system... I'll grant one single game has a small impact, but there is still an impact, as I've stated.

Your description (or perhaps it was Erps') about how great it was for Pilke to play an unrated game so as not to take up a spot is a perfect example of people not understanding ELO.

ELO is not about a RANKING, it's about a RATING. Erps' 1702 or whatever is still a 1702 whether that is the 18th highest score or the 25th; the ELO system doesn't care. It's not about RANKING. It's about determining how far above or below the average you are.


[Updated on: Mon, 30 April 2007 05:23]

      
Wildfire2099
Senior Member

Posts: 654
Registered:
June 2005
Re:Design a ratings system! Mon, 30 April 2007 05:17
That Glicko system is interesting, but I'm not sure it would fit for TTR.. I don't think people not playing enough is really much of a problem. If it was, I'd agree that seems a good fix.

      
Baron Von Schmidt
Senior Member

Posts: 235
Registered:
January 2007
Re:Design a ratings system! Mon, 30 April 2007 09:45
Wildfire2099 wrote on Sun, 29 April 2007 23:11



Your description (or perhaps it was Erps') about how great it was for Pilke to play an unrated game so as not to take up a spot is a perfect example of people not understanding ELO.



Yep, I don't understand ELO. You caught me. Good thing I gave you a perfect example. :twisted:

And what you don't seem to understand is how reality works. :D :) :D You say you read all the posts BEFORE the unrated games existed? Hmmm. Then you would clearly see that simply by not having unrated games, not much in REALITY would change. People would just play with 2nd and 3rd guest accounts (a more horrible "evil" to be sure) or even 2nd and 3rd paid accounts. They would use these for testing out games, playing friends and whatever.

In other words YOUR point for getting rid of unrated games is pointless. It would change nearly nothing.

ELO works best for games where luck has a VERY small (or no) factor. It works when there is ONE account used, or perhaps when there are multiple accounts but they all play the same variety and number of games.

Chess uses the ELO system, yes? So tell me, Wildfire, is every game that Kasparov plays "rated"? If he plays a friend in his home, does he need to report the game to the World Chess Federation (or however it works)? I think not. Therefore there are "unrated" games all the time in any sport or competition. Now I realize I am a perfect example of someone who does not understand the ELO system :lol: :lol: (and to be honest I don't even care to look it all up and try to become an expert), but I understand reality (most of the time! :o) and simple logic.

Unrated games are wanted by the vast majority of the players and since the game exists for the community and not the other way around, they are therefore necessary.

Baron
      
kolmo
Senior Member

Personal pages
Posts: 556
Registered:
November 2006
Re:Design a ratings system! Mon, 30 April 2007 15:03
Baron,

Your argument is valid: a system does not depend on rating every game played. In fact, most chess games are only rated when playing in tournaments.

But Kasparov stopped playing a little while ago. He is now into politics. Maybe you've seen him arrested while protesting against Putin, a few weeks ago.

***

As for Chrissmmmmmmm's concern: yes, the ELO system tends to stop players from playing. Chess players get invited because of their ratings. Keeping a sound ELO can sometimes mean that you can eat something other than bananas and baloney. That is certainly not a metaphor.

As soon as the T2R community creates tournaments, the same problem will probably show up. Part of this problem belongs to morals, which isn't our business to deal with. But part of the problem belongs to mathematics, which is our business to deal with.
      
psteinx
Senior Member

Personal pages
Posts: 324
Registered:
November 2005
Re:Design a ratings system! Mon, 30 April 2007 20:14
Re: Wildfire's criticism of unrated games.

I see the following reasons to play unrated:

1) Any kind of variant rule system - not just the TAG team that you mention, but any pre-agreed variant between players. Before the new US maps came out last year, tp and I (and I think a few others) tried a number of variant rules (enforced by mutual agreement) to make the game more interesting.

2) Learning a new map. You think a player should be rated even while learning a new map. I tend to disagree. The problem is that if you go into a new map blind and learn it as you go (rated), you will emerge from your provisional status with a fairly low rating that will take a LONG time to claw your way out of. This is especially true of the variant maps that get less play, because you're also more likely to play provisional players AFTER your provisional period is over.

For instance, I recently started playing Europe again (after a small # of games about a year ago). I played a couple of unrated games, but more or less just dove in. The result was that I lost a bunch during provisional, but was actually quite good by the time my provisional period was up. But it took me a looong time after the end of provisional for my ELO rating to match my estimation of my 'actual' rating. Basically, the period of instability and rapid learning lasts longer than 20 games, but post-provisional, it takes a while to get your score up to a reasonably accurate figure.

3) Playing in 'mellow' mode. For those of us who care about our ratings/rankings (likely everybody in this thread), there may be times when you want to play, but want to take it easy. (Perhaps a multi-game when you've had a beer or two). Why not have unrated games for these purposes, if everyone knows what they're getting into?

That said, I think some tops perhaps abuse the system and play too high a percentage of unrated games. I do it very rarely myself. But I still think it's valid to have unrated games, and agree with Baron that if they didn't exist, many players would just use guest accounts for much the same purpose.

[Updated on: Mon, 30 April 2007 20:17]

      
kolmo
Senior Member

Personal pages
Posts: 556
Registered:
November 2006
Re:Design a ratings system! Mon, 30 April 2007 21:31
Phil,

What you say reminds me of an important feature of chess servers: they distinguish ratings according to the time of the game (lightning, blitz, etc.) and the variant (bullet, double, Fischer, etc.).

These distinctions suggest (in a remote way, I admit) that we could separate casual from tournament ratings. The idea would be simply to measure competitive strength with competitive play. We could then use something like the ATP system for tennis, or stick with an ELO-related system. If we stick to ELO, we should at least stop rating games with guests, a practice which is beyond my understanding.

This way, if you want to be on top, you'll have to win tournaments, not just lots and lots of games against any well-chosen player around.
      
psteinx
Senior Member

Personal pages
Posts: 324
Registered:
November 2005
Re:Design a ratings system! Mon, 30 April 2007 23:58
Well, I think it's all sort of moot anyways, because I really doubt anyone outside of the DoW realm will be able to implement any 'revised' rating system - just too cumbersome, and likely to be ignored by players anyways.

And frankly, I'd rather see the DoW folks work on other issues relating to TTR instead of further tinkering with the ratings system.
      