Elo System

How the Elo Ranking System Works

The Fantasy Debate team presents a comprehensive national ranking system for Lincoln-Douglas, Public Forum, and Policy debate. The system is based heavily on the Elo rating system as implemented by the US Chess Federation, which is widely considered among the most statistically accurate ranking systems. Versions of it are also used by official and unofficial ranking bodies to determine rankings for the World Chess Federation (which has used the system for over 40 years), Major League Baseball, college football, the National Scrabble Association, and FIFA men's and women's soccer (let's not forget Starcraft 2's hidden ranking system as well).

The Elo rating system calculates a numerical 'rating' for each debater based on his or her competitive history at varsity tournaments. A rating usually falls between 0 and 3000 and changes after every round. This is what sets our system apart from earlier ones. Unlike previous debate ranking systems, which estimate a debater's strength from the strength of a tournament's field, our system evaluates each and every round independently. In practice, we find this round-by-round approach far more accurate than previous systems.

Every new debater starts with a rating of 1500, which is the average rating for the nation. When a round occurs, the system looks at the ratings of the two debaters and calculates the percent chance that each of them will win. Depending on the actual outcome of the round, each rating changes according to the Elo formula. If you would like to see the actual formula we use, here it is:

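RE = 1 / (10^(-(A - B) / 400) + 1)

Rating change for the first debater = K × (RA - RE)

A and B = the first and second debaters' ratings

RE = expected outcome for the first debater (a value between 0 and 1)

RA = actual outcome for the first debater (1 for a win, 0 for a loss)

K = a constant of 32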

RE is the expected outcome, A and B are the ratings of the two debaters, and K is the constant 32, which is the statistically suggested value for an initial rating of 1500. For example, if a debater rated 2200 faces a debater rated 1200, the 2200 debater is heavily favored (2200 would represent a top debater and 1200 a relatively inexperienced one). The outcome is logical: if the favored debater wins, her rating increases only marginally, and her opponent's rating decreases by the same amount; however, if the 1200 debater wins, her rating increases greatly because she was the underdog, and the higher-rated debater's rating decreases by that same large amount.
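To make the 2200 versus 1200 example concrete, here is a minimal Python sketch of the formula above. The function names expected_outcome and rating_change are illustrative choices of ours, not taken from the site's actual code, and the sketch assumes an actual outcome of 1 for a win and 0 for a loss.

```python
K = 32  # the constant given in the text


def expected_outcome(a: float, b: float) -> float:
    """RE: the chance that the debater rated `a` beats the debater rated `b`."""
    return 1 / (10 ** (-(a - b) / 400) + 1)


def rating_change(a: float, b: float, actual: float) -> float:
    """K * (RA - RE) for the debater rated `a`; `actual` is 1 for a win, 0 for a loss."""
    return K * (actual - expected_outcome(a, b))


favorite, underdog = 2200, 1200

# Case 1: the favorite wins, so both ratings barely move.
favorite_gain = rating_change(favorite, underdog, actual=1)
underdog_loss = rating_change(underdog, favorite, actual=0)

# Case 2: the underdog wins, so both ratings move a lot.
underdog_gain = rating_change(underdog, favorite, actual=1)
favorite_loss = rating_change(favorite, underdog, actual=0)

print(f"Favorite wins: {favorite_gain:+.2f} / {underdog_loss:+.2f}")
print(f"Underdog wins: {underdog_gain:+.2f} / {favorite_loss:+.2f}")
```

With these ratings the favorite gains only about 0.1 points for winning, while an upset moves both ratings by roughly 31.9 points. In either case, what one debater gains the other loses.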

Finally, the national rankings are determined by placing all debaters' ratings in descending order. Naturally, the debater with the highest rating becomes the top-ranked debater in the nation.
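The ranking step described above amounts to a sort. The following Python sketch shows one way to do it; the debater names and ratings are made up for illustration, and national_rankings is a hypothetical helper name rather than anything from our actual system.

```python
# Made-up ratings for illustration only.
ratings = {
    "Debater A": 1712.4,
    "Debater B": 1500.0,
    "Debater C": 1655.9,
}


def national_rankings(ratings: dict[str, float]) -> list[tuple[int, str, float]]:
    """Return (rank, name, rating) tuples, highest rating first."""
    ordered = sorted(ratings.items(), key=lambda item: item[1], reverse=True)
    return [(rank, name, rating) for rank, (name, rating) in enumerate(ordered, start=1)]


for rank, name, rating in national_rankings(ratings):
    print(f"{rank}. {name} ({rating:.0f})")
```

This sketch does not handle ties in any special way; debaters with equal ratings simply keep their original relative order.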

This system removes subjective bias from the rankings (it can even account for side bias if we deem it statistically significant) and only becomes more accurate as we use it over time. We have decided to use a full year's worth of data from 2009-2010 (over 15,000 rounds) to make this year's ratings even more accurate. As time goes on, we will continue to add as many backlogged tournaments as we can find or you send us.

A note on All-Time rankings (why they seem inaccurate):

The All-Time rankings represent a long-term goal for Fantasy Debate. Because the Elo system is universal and can compare debaters between seasons, we intend to house a historical record comparing debaters across seasons; however, until a few years of data have been entered into our system, these rankings will remain skewed. Having two years of data for some debaters versus one year for others favors those debaters with more rounds. This is why you see inflated ratings for debaters from later seasons relative to those who have graduated.

A note on our LD data (why it may be inconsistent across pages):

All of our data is collected from public sources. This means that the data will have errors, typos, and other abnormalities. We take several steps to fix these errors, but not all of them can be fixed. When we display our data, we try to give you the most accurate picture possible in the context of the page you are looking at. This means that some of our data may not seem consistent, because we are calculating the same statistics from different datasets that have different error tolerance levels. Don't worry, though: our data is actually more accurate than the public data because of the error checking we do.

A final note on our data:
Our statistics exclude rounds that do not actually happen at tournaments, such as coach-overs, byes, and forfeits. This sometimes causes a discrepancy between what people think their statistics are and what their true statistics actually are. Also, please remember that certain statistics can only be reliably gathered from pre-elimination rounds and are therefore not included in our database (these statistics are AFF and NEG rounds, PRO and CON rounds, and speaker points). Due to the way tournament directors sometimes record debaters who dropped out of a tournament, we might not be able to record your rounds against dropped debaters; however, this is an extremely rare circumstance. We have worked for months to ensure that our algorithms are correct and have added over a dozen automatic safeguards to ensure that our information is as accurate as possible. Nevertheless, human error is always possible when entering data. If you believe there to be an error in your statistics, please contact Contact at fantasydebate.com and make note of it in the subject. We take the integrity of our data very seriously and will address your concerns as soon as possible.
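As a rough illustration of the exclusion rule described in this note, the Python sketch below drops coach-overs, byes, and forfeits before counting a win-loss record. The Round record, its field names, and the result labels are all hypothetical; they are not the format of our database.

```python
from dataclasses import dataclass

# Hypothetical labels for rounds that were never actually debated.
EXCLUDED_RESULTS = {"coach-over", "bye", "forfeit"}


@dataclass
class Round:
    opponent: str
    result: str  # "win", "loss", or one of the excluded labels above


def counted_rounds(rounds: list[Round]) -> list[Round]:
    """Keep only rounds that were actually debated."""
    return [r for r in rounds if r.result not in EXCLUDED_RESULTS]


def win_loss_record(rounds: list[Round]) -> tuple[int, int]:
    """Return (wins, losses) over the counted rounds only."""
    counted = counted_rounds(rounds)
    wins = sum(1 for r in counted if r.result == "win")
    return wins, len(counted) - wins


# Made-up season: four entries on the tabulation sheet, two of them debated.
season = [
    Round("Debater B", "win"),
    Round("Debater C", "bye"),
    Round("Debater D", "loss"),
    Round("Debater E", "forfeit"),
]
print(win_loss_record(season))
```

A debater reading the tournament results might count this as a 3-1 record, while the filtered record is 1-1, which is the kind of discrepancy this note describes.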