What’s Up Watson: Observations From IBM Watson’s Player Rankings

Keggers
4 min read · May 19, 2021

In the past few years we’ve seen big leaps forward in how we understand professional Overwatch. We’ve had a huge surge in the integration of stats through the Stats Lab. We’ve had the introduction of the Replay Viewer, which has allowed people to analyze every small move from every single player in the game. These features have greatly advanced our understanding of the game and introduced much more nuance into discussions about how good players and teams are. This season the Overwatch League announced its partnership with IBM to bring IBM’s Watson into the professional Overwatch scene. Specifically, Watson would be tasked with understanding and distilling players’ and teams’ performances.

There are a lot of unknowns about how Watson actually evaluates players. We don’t have access to that information, but after one cycle of tournament play maybe we can get inside the machine and understand a bit more of its thought process and what stats or factors it values.

So what roles did Watson rate as the most impactful?

Firstly I wanted to look at how Watson evaluated specific roles for this cycle and what roles were seen as the highest scoring for each team. For this I looked at the highest scoring player on each team and simply marked down what position they played. The split looked like this:

14 Tanks, 2 Damage, 5 Supports. Watson finally showing us the truth we all knew in our hearts. Damage players really don’t matter.

Jokes aside, if you were a tank in this cycle then chances are Watson saw you as the best player on your team. Conversely, the split for the players Watson saw as the lowest scoring on their team shaped out like this: 3 Tanks, 6 Damage, 11 Supports. At this point it’s probably best to highlight the difference between the two sub-roles at Tank and Support. As you can tell from the chart above, the Tank players that scored the highest on their team according to Watson were mostly Main Tank players. Specifically, the split between the 14 Tank players that scored the highest was 8 Main Tanks to 6 Off Tanks. All 5 of the Supports that scored highest on their team were Flex Supports, while 9 of the 11 Supports who scored the lowest on their team were Main Supports.

In summation Watson scored Main Tanks the highest and Main Supports the lowest.
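If you want to redo this tally yourself, the counting is simple enough to sketch in a few lines of Python. Everything in the snippet is an assumption on my part: the row layout, the team and player names, and the scores are made-up placeholders, not real Watson data, and `role_split` is just a hypothetical helper name.

```python
from collections import Counter

# Hypothetical placeholder rows: (team, player, role, watson_score).
# These are NOT real Watson scores, only made-up values to show the tally.
SAMPLE_ROWS = [
    ("Team A", "player_a1", "Main Tank", 82.0),
    ("Team A", "player_a2", "Damage", 79.5),
    ("Team A", "player_a3", "Main Support", 71.0),
    ("Team B", "player_b1", "Flex Support", 77.0),
    ("Team B", "player_b2", "Off Tank", 74.0),
    ("Team B", "player_b3", "Main Support", 66.5),
]


def role_split(rows, pick=max):
    """Count the role of each team's highest (pick=max) or lowest (pick=min) scorer."""
    by_team = {}
    for team, _player, role, score in rows:
        by_team.setdefault(team, []).append((score, role))
    # Compare on score only, then keep the role of the selected player.
    return Counter(
        pick(entries, key=lambda entry: entry[0])[1]
        for entries in by_team.values()
    )


top_roles = role_split(SAMPLE_ROWS, pick=max)
bottom_roles = role_split(SAMPLE_ROWS, pick=min)
print("Highest scorer by role:", dict(top_roles))
print("Lowest scorer by role:", dict(bottom_roles))
```

Swapping the placeholder rows for the scores on the Watson power ranking page should reproduce the role splits above, assuming ties are broken the same way.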

Has Watson identified any players it thinks are hard carrying their team?

Watson’s player rankings so far tend to cluster players near their teammates. If you go to the Watson power ranking page right now and take a cursory glance, you’ll see that Watson rates team performance as a big part of a player’s performance. This essentially means that if we are going to find a standout player, we have to look at their score relative to their own team. So first let’s look at the gap between the Watson scores of a team’s highest scoring and second highest scoring players.
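As a rough sketch of that comparison, here is how the gap could be worked out in Python. The team names, player names, and scores are placeholders I made up for illustration, not Watson’s actual numbers, and `top_two_gap` is a hypothetical helper rather than anything Watson or the Overwatch League provides.

```python
# Hypothetical placeholder scores per team; NOT real Watson data.
SAMPLE_SCORES = {
    "Team A": {"player_a1": 82.0, "player_a2": 79.5, "player_a3": 71.0},
    "Team B": {"player_b1": 90.0, "player_b2": 74.0, "player_b3": 66.5},
}


def top_two_gap(scores):
    """Return (top player, gap to the second-highest teammate) for one team."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (top_player, top_score), (_, second_score) = ranked[0], ranked[1]
    return top_player, top_score - second_score


gaps = {team: top_two_gap(scores) for team, scores in SAMPLE_SCORES.items()}

# List teams from the largest gap to the smallest to surface standout players.
for team, (player, gap) in sorted(
    gaps.items(), key=lambda pair: pair[1][1], reverse=True
):
    print(f"{team}: {player} leads the next teammate by {gap:.1f}")
```

With real scores plugged in, a team whose gap sits far above the rest is where a standout player would show up.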

Looking at the scores team by team, the difference between the top player on a team and the next player is for the most part not that big. Apart from Han “Silver3” Haibo on the L.A. Valiant, the point difference between a team’s top player and their closest teammate is usually between 1 and 6. This would lean into a theory that Watson, as of right now, doesn’t really divorce player performances from their team’s overall performance. What about Silver3 though? What makes him so much better than the rest of his team? If I’m honest, I have no earthly clue. Considering Watson’s evaluation is performed purely using the statistics available from the Overwatch engine, it might be theoretically possible to reverse engineer some idea of what Watson values in a player’s performance. Naturally we can’t know for certain, but just looking at the raw numbers that the rest of the Valiant players have put up for this cycle, it’s difficult to understand WHAT exactly Watson is valuing.

This is an outlier though, and while it may seem odd it should by no means be taken as a negative. These are only some observations, and potentially deductions, from the first tournament cycle of an endeavour that we’re going to learn more and more about over the course of this season. It’s important to note that we still don’t really know what these player ranks will tell us. What Watson ends up telling us about a player’s performance might not be what we want out of Watson. I know that I would love for Watson to be able to help evaluate players’ performances on bad teams, but judging by the opening series of results from this cycle, that might not be something Watson does. What Watson will give us is still to be seen, and I for one will be interested to see how it develops.
