The gaming website Polygon took some minor heat this week when it retroactively updated its review of the newly released SimCity. Polygon takes a unique route toward grading games: the site makes it clear in its policy that a review can be altered when something about the game changes. It exercised that policy by lowering SimCity’s score from a 9.5 to an 8, then again to a 4, when server issues prevented the reviewer (and a lot of other players) from playing the game.
Here’s a snippet of the policy:
Polygon’s reviews and database have been built based on the idea of updates, or “bumps,” as I’ve called them. If a game changes in a substantive way, we can add an update to our reviews that informs you how and why, and we can modify our scores accordingly. This will appear on the reviews in question as a timeline of that game’s evolution and our corresponding recommendation (or lack thereof). The original review score will never vanish or go away, but our readers will be able to better understand where our opinions as a site reside over time for games we review.
That’s a great policy! Games are, for better or worse, more changeable in a world where you can download new content, or experience new issues, as a game ages. That puts them in a completely different category than movies or music, which, for the most part, are static throughout their lives. (Maybe Highway 61 Revisited gets an outtake or remastering, but it’s not fundamentally changing.) Technology does not affect your enjoyment of a movie or an album in the same way; nobody has ever given a movie a bad review because the projector screwed up.
On the surface, there’s reason to think scaled game reviews are helpful to consumers. An abstract opinion gets turned into digits, and those digits can be objectively compared. Can’t decide what to drop your $60 on? This system makes it easy. Art designed for consumption is difficult to quantify, and inherently encourages rating.
Trouble is, the system in the game review world, most popularly some variation of a 10-point scale with increments, is especially distorted. Take a look at this graph, via Joystiq:
It shows a sampling of reviews from the gaming websites IGN and GameSpot back in 2006, and you can see something similar today by taking a gander at GameRankings.com. Reviews tend to lean toward the higher end of their scales, averaging closer to a seven or eight than a five. (That could mean something devious, but I’m more inclined to believe it’s a natural function of people who like a lot of games reviewing those games.) That doesn’t matter if someone is trying to compare two games–they’re both on that distorted scale, after all–but if someone is deciding whether to buy a game at all, and isn’t familiar with the distortion in the ratings system, they might have some trouble. You’re reading a review and decide that eight, after all, is pretty good. Certainly above average. But what you might end up with is a middling game.
There are ways to solve this problem. You can turn a review, as Kotaku has, into a binary answer to the question: Should you play this game? (Maybe not totally binary: SimCity got a “Not Yet” from them.) You can get rid of reviews entirely. You could move the scale down to five and hope that mitigates the distortion. You can audit your reviews and see if they fall into a bell curve.
But the best and most honorable way to fix the issue is to do away with scales entirely. Give the reader some credit: not everyone is going to skim down to the bottom, see what score a game got, and move on, and why should reviewers cater to that type of reader, anyway? Each person decides what to play based on personal tastes, and they should have to read a review to get the necessary context for their decision. In fact, I’d wager, gamers are more likely than other review-readers to weigh their options by reading the text of a review. A game, after all, can be a big investment in time and cash–at $60 for new home console games, it’s six times as much as a movie ticket or an album on iTunes–and that necessitates more information. You can still add an update to a review as time goes on and the games change, and because text (hopefully) has nuance that numbers can only guess at, it means any updates will seem less dramatic but more informative.
So what does a digit put down on a scale offer? At best, it’s a reflection of what’s in the review, which makes it repetitive. At worst, a distorted scale changes the reader’s opinion of the text, which nobody writing a review wants. Reviewers don’t spend all that time testing games, composing their thoughts, and creating a critical essay to have their work replaced with a number.