Or
Avoiding the 7-9 Scale: An Exercise in Different Review Scales

One Critic, Two Scores

This scale was first kicked around by Jon over at the Taipei Gamer Blog, and it caught my attention via the Penny Arcade news post "On Perspective." The idea is to evaluate the game in two separate but equal ways: in the ways that it uses the video game medium to its advantage, and the ways it carries itself as art. Jon explains further:
 

The L-Score [L for Ludology***] is the score which is most closely related to the uniqueness of the medium. I have argued before that games are different from art because they aren't simply admired, they are also played. It is the interactive nature of games which ludologists emphasize, and so one can think of the L-Score as a metric for gameplay and game design. Mechanics, systems, and level design are the key components measured by the L-Score.

If the L-Score is a measure of a game's design, then the N-Score [N for Narratology] is a measure of its artistic achievement. The narrative, in this case, is defined rather broadly. It consists of the game's music, writing, visual style, sound design, overall setting, etc. All of these factors influence the player's involvement in the game and are therefore important even if they don't have much of a direct impact on the actual gameplay.

Essentially, Jon attempts to appease both the people who value games because they're not like other forms of art and the people who value their artistic merits above most else. This scale acknowledges that although games have the capacity to be high art, the things that make games what they are are equally valuable. Something like a shmup may not receive a very high N-Score, but it would receive a high L-Score -- assuming it's good, of course. The advantages of a scale that can give Gears of War both a "perfect" score and an extremely low score at the same time cannot be overlooked.

Though I'm sure it's not the scale's express purpose, it does a very good job of sticking it to Metacritic. If a game gets a 10/0 on this scale, the reasonable thing for the aggregator to do would be to take the average of both scores and give the game a 5, but that would be missing the point entirely. It could also take the higher of the two scores, but that would also be missing the point. Simply put, it's impossible to turn two numbers that mean completely different things into one number that represents both.

But while it's very good at evaluating something by two distinct standards, it's very confusing from a consumer perspective. Assuming that a person is simply looking at all the reviews for a particular game and stumbles upon a review set up this way, it will take them a while to understand what the scores are trying to say. It's a great tool to encourage in-depth discussion of games, but not necessarily to answer the ultimate question a review should be answering.

Percentage Chance

Percentage scales are often interchangeable with the 100-point scale, but I think there's potential to do something different here. Rather than thinking of a score in terms of "How good is this game?" we can instead change our thinking to "What are the chances someone will like this game?"

This mindset changes the way one should approach a score. Though the best game ever made might merit a 10 out of 10, will everyone who reads your review like the game as much as you did? There are many niche titles in gaming, and though many of them are good, one can easily see why someone would or wouldn't like a title entirely dedicated to flying around and being a flower.

I like this system because it accounts for both taste and quality, but it's not without faults. By this metric, it's impossible to achieve a 0% or 100%, since no one game will appeal to everyone's taste, and not everyone will hate a game. Besides, regardless of your intent, people and aggregators alike will just see your score as a number anyway. Additionally, get too bogged down with demographics and tastes, and too many games are going to end up near 50% if you're using the scale correctly. While that assertion may certainly ring true, it won't help anyone make a decision. This scale belongs more to the field of marketing and statistics.

Yes or No

Speaking of statistics, my stats teacher once told me something that stuck with me: No matter what the chances are of a given event happening, for any individual trial, the probability of any outcome will always be either 1 or 0; either that outcome will occur or it won't. And for someone who's on trial for murder, that's all that matters.

With that in mind, the Yes or No system (it's not a scale, really) comes into play. You're either going to buy a game or you're not, so why not just get rid of all this 3/5, B+, $40 business and just tell a consumer whether to buy a game or not? This system is best at answering our question because it does nothing else. You could always buy a game later -- at a lower price -- but the decision has already been made, just postponed.

Though it seems effective from a consumer standpoint, it lacks a certain amount of granularity. A decision of yes or no will always be up to the consumer, regardless of what the reviewer thinks; a reviewer will never have the exact same taste as the reader, so the authoritativeness of this system rings false. Besides, use this system long enough and you're going to start attaching so many caveats to every review -- "yes if you're a fan of the genre, no if you're not" -- that deciding will become more and more tricky and the system more and more pointless.

Buy/Rent/Don't Buy

A slight variation on the Yes or No system, but no less worth going over. If Yes or No attempts to answer the question of whether you should buy the game, Buy/Rent/Don't Buy answers the question of whether you should play a game. It's this important difference that adds the "Rent" option. This system primarily takes into account length; if a game is only about four hours long but really good, then a prospective buyer should probably rent the game in order to get the experience of the game without paying for something that they might end up regretting.

Though this system is more versatile than Yes or No, it still suffers from some of the same flaws. You'll often have to attach caveats to such reviews, which will make it difficult to decide whether one should play the game or not. Still, both of these systems are great for at-a-glance evaluations and could possibly save a Christmas or two.

Conclusion

So, which system shall I use to evaluate games from now on? Well, I'm gonna sleep on it for a bit****. But in writing this I've contemplated my options much more clearly than had I just paced about my room for an hour. I've also hopefully given some of you something to talk about, and maybe I'll bounce some ideas off of some of you. Who knows, maybe someone important will pick this up and start using some of these ideas. And hopefully give me a job*****.

And if there are any sort of other review systems you'd like to mention, please feel free to do so.


*For now, previous reviews won't be scored, since adding a score would reflect my thoughts on a game now rather than when I wrote the review. Whether or not one should go back and change or add scores is another story. If you absolutely must see scores attached to my previous reviews, you can check my Giant Bomb or 1UP pages.

**I've never understood why letter-grade scales skip the letter E; perhaps to drive home the point that "F" stands for failing?

***You can find an article further explaining Ludology here.

****By which I mean I'm going to take a short break from writing reviews because I'm not made of money, kids.

*****I can dream, can't I?

Comments (18)
I would personally use the buy/don't buy/rent. Unless you are going to make a chart justifying a number score then I will just assume you pulled a number out of your ass*. *You being the proverbial you, not the actual you.
I'm personally done with reviews as solely a 'should you buy this or not' purpose. So many people are writing those reviews that even pursuing it is redundant and, in my opinion, boring and uncreative. I think you would have much more success by abandoning the audience that scores are trying to reach. If you want to write about video games it's better to write to an audience that wants to read about video games, rather than an audience that just mindlessly searches for fanboy validation or is incapable of buying a game without someone telling them to. To that effect I don't write reviews unless I'm writing them as a diatribe against game mechanics or design decisions that I feel are harming the industry. For those I sum up my feelings at the end of the day with a single word that sums up my emotional response to the game, since my reviews are all about that first impression the game gives you. Still ultimately we are all here to find our own way of doing things. If you want to try and salvage consumer advocacy as a game writing formula then more power to you. I just worry that such things may be a dead end left over from the good old days before the magazine crash.
What a great, well thought out post! I think the best answer is to use a different scoring system depending on which game you are reviewing. For instance, Halo:CE would have been a buy, but when reviewing sequels perhaps it is better to base the review off the first game? And thank you for teaching me what Ludology is!

Just read this through your featured writer article. Great post! If I ever write a review again (though I sincerely hope I never do) I'm definitely going to use the L/N separate scores system. Seems like it would provide a ton of good information if someone was too busy to read the entire review.

I haven't paid attention to scores since the layoff at Ziff -- I care about what someone has to say about a game, not whether or not it's worth my money. I can make that decision; I want to know how a game makes you feel, how it affects you, and such. I think far too many people in the gaming community focus on scores. Think about the really interesting material you read on Bitmob or elsewhere -- how much of it deals with scores?

@Jason: Right on, brother!  I write reviews at another site where I have to give scores.  I usually just go with my gut.  I have one friend who constantly gives me grief about my scores, to the point where I often wonder if he even reads the reviews.  I gave Darksiders a 9 and he yelled at me to, "stop peddling my illegitimate candy."

Whatever that means...

EDIT: I forgot to mention that he has never played Darksiders.  He called out my score based on the average score it got from large outlets (which, according to him, is an 8, but I honestly wouldn't know).

The cash money scale is incredibly hard to use. I've been wrestling with a Red Dead Redemption review using that, and honestly, I can't remotely even consider how to gauge it. Does it have enough content for $60? Yes it does, it has a lot of content. However, what's the baseline? $6 an hour of play? More or less? I don't know.

What I wanted to do is take what was in the game, describe it as objectively as possible, then my subjective analysis of it, and then judge if it's worth the asking price. However, it may be because my English isn't that good, but I found it increasingly hard to properly write.

I'm no fan of the Buy/Rent/Skip scale because I cannot rent games where I live. Personally, I'd like reviewers to move away from an evaluation of a product's monetary worth and towards discussing its overall quality. So while I'm no fan of scores, I'd rather they didn't directly connect to talk of sales.

That L/N score intrigues me even though it makes my head spin.

For the record, the industry review average is 6.8. Perhaps the two best known sites are GameSpot and IGN, which come in at 6.7 and 6.9 average, respectively. These statistics are courtesy of GameStats.

On to the subjective part: while the current system could certainly use improvement, I do not have a particular problem with the current scoring scale. I would expect the average game to have a score of about 7, and that is borne out by the statistics above. I find that there are some games near the 7.0 line that any given person will like, but the chance of more people liking a game is proportional to a rising score.

I still think every scale is far too flawed to warrant use. No one will ever take a 5 (on a 10-point scale), a 3 (on a 5-point scale), a C (on a grade scale), a 50%, or anything else like that to mean a game is average, as the majority of games are typically scored a bit higher than that.

Score descriptors also make no sense. Most explain that their highest score is reserved for games that everyone should enjoy, but I can't think of a single game that EVERYone will enjoy. They also explain their lower scores by saying only fans of the genre will enjoy such a game, but shouldn't those players be even more picky than most when it comes to their genre?

Ultimately, the biggest problem with review scores is the way they undermine the actual review itself. Why bother reading WHY a game is good or bad when you can just check the score in a fraction of the time it takes to read the review and "know" the same thing?

Not too sure how much of the scoring aspect of things is really an issue of people not being willing to use more of their scale or a lot of readers imposing their own perceptions on other people's scales or reviews. I'm willing to bet that, like any industry that hasn't tanked and crashed, there's some awful product, some rare exceptional product, and a majority of average to slightly above average, as the above-cited GameStats figure would indicate.

I tend to trust the people that are doing this professionally (especially having seen some of the stuff I had sent to me covering games for a newspaper), who have probably played more than anyone's fair share of broken and technically awful games, over most who haven't done it and have the warped mindset of the sports fan who thinks an NBA benchwarmer is a talentless bum, losing perspective that that person is more talented at his craft than 99% of the people on this planet. People tend to lose track of the fact that a lot of the crap that falls into the lower end of that scale either never sees a release or usually doesn't even get covered in most enthusiast press.

I'm personally not a fan of review scores in any capacity any more than I really care for a star system for a movie, but they serve their purpose for the people that use them. And while there's intellectual merit in having a discussion about said scales, there is no "best" scale because such a thing assumes some standard audience all seeking the same thing. I had the conversation with a former editor before and caught his ire for making the observation that much of this ongoing discussion is a bit in vain because of the vain nature of the people that participate. Mainly, more of the people I observe tend to be more interested in being "right" and having everything tailored to them than in acknowledging that the different styles of reviews and review scales are the basic result of different users seeking different things from them.

Some people want a deeper discussion of themes. Some people want the rundown of features. Some people are tech obsessed and want to know how something will look and sound on their $4000 set-up. Some people are looking to validate their own opinions. And know what, some people are busy, are going to skim the text of a review but are going to look at that score as a general indicator of whether to take a chance on a buy or rental.

Gamers love to take the press and game companies to task for things that are the result of their own decisions. They often gripe most about things that aren't necessarily problems, just things that aren't the way they want them. There's enough variety out there for people to find the review type that suits them without us constantly bellyaching about people catering to and serving an audience that isn't them.

 

Used correctly, the “monetary value” rating system should be the most informative, and thus the one most likely to make Metacritic’s system irrelevant.

To begin with, I disagree with Mr. Vazquez that “[i]deally, a game's evaluated price and actual price will match.”  The “ideal” game -- from the gamer’s point of view -- should be one with an evaluated price that exceeds the actual price.  For example, Portal is currently available for $20.  In retrospect, I would have been willing to pay $30 for the game if I had known ahead of time how much I was going to enjoy the first play-through.  That’s equivalent to a score of a 15 on a 1 to 10 scale.  

The problem with using a 1 to 10 scale (or other fixed-range metrics) is that it imposes an arbitrary limitation on the reviewer’s ability to communicate his/her level of satisfaction or dissatisfaction with a game.  So, a 15 for Portal would be a technically invalid score, even though you might think it earns that score.  That’s why the monetary value system is superior:  there are no constraints.

It’s much more rhetorically effective for a reviewer to conclude that “the developer should have paid me to play this game” than it is to simply award the lowest possible score.  If you allow for a negative dollar value, a $0 score means “play it only if you get it free,” and a negative value means “even if you get it for free, you’re never going to get those hours of your life back.”

The other problem with fixed-range scoring systems is that once you deduct points, the game can never earn them back.  A game may, for example, have a host of technical issues, but could still be worth the asking price.  The monetary value system can account for that.  There’s a devoted community of Vampire - the Masquerade - Bloodlines fans out there that would argue that the game is still worth buying on Steam for 20 bucks, even though it’s buggy as hell.

The other strength of the monetary value system is that it provides a more practical way to score games with monthly subscription fees.  If you have to drop $15 a month to play a given MMO, a reviewer should be able to offer a more nuanced judgment about the value of the subscription, not just the initial purchase price of the base game.

At the end of the day, gamers want two pieces of information from professional reviewers:  (1) whether the game’s systems work reasonably well, and (2) whether the game is fun to play.  The reviewer can break things down any way he/she likes:  +$20 for a great story, -$5 for technical issues, etc.  As long as the monetary value system is logical, consistent, and doesn’t have any arbitrary constraints, it can be a much more effective means for the reviewer to communicate with the reader, compared to a fixed-range scoring system that uses undefined units.

Is a 72/100 the same as a 7/10, and are either of those equivalent to a C-?  I have no idea. But if you tell me the Orange Box was worth $75 at release, I’ll pay attention.

BTW:  this is an example of the (hopelessly amateur) way I write a review:  http://itoeunited.blogspot.com/2009/12/new-zombies-just-like-old-zombies.html

@Maxx: The fact that the average score for a website is above the natural average of 5 means that people should not take the fact that a game works to mean that it merits a certain score. As Gerren mentioned, the truly awful or broken games aren't actually released, so yes, if a game doesn't function, it should receive a very low score, or no score at all. However, that a game works on a fundamental level isn't an achievement worth praise. It just means the team is competent enough to release a product.

@S.F. Sure, a game can go over its retail price amount, but after a certain point, going above the actual price is somewhat pointless. If someone were to say that Fallout 3 had so much content it was worth $200, does it really matter when the game is only $60 (or currently, $30)?

I also feel that if a reviewer really loved a game, they could abuse the ability to go over the retail price, like giving a game like Portal the same $200 price point if they so pleased.

Like I said, I was actually considering using the retail price scale as much as my current 1-5 scale, but I liked the more universally approachable 5-star scale. You're not wrong if you choose the monetary value. I think it has to do with who your intended audience is. I don't think IGN would use the L/N scale even if every editor liked it just because most of its readers might find it too confusing. A site that was directed at an audience more interested in evaluating games instead of making purchasing decisions might be more able to use it, however.

Suriel:  

I take your point, but I have two counters:  (1) most of your readers would recognize the hyperbole in the "abusive" ratings for the games you used as examples, and (2) very many WoW players have spent a hell of a lot more than $200 to keep playing that game.

I do appreciate that you want to be accountable to your readers.  That's why I think the system I described would work well for you:  you wouldn't risk your credibility by awarding absurd, out-of-market values. 

Either way, I plan on reading your next review.  The debate has been interesting.

Hey SF!

Don't mean to be an ass but we try and use our real names on Bitmob. Even if you don't want to use yours, think of something other than just S.F. And that was a good debate!

Hi Suriel,

Great article. I've often wondered what is ideal in regard to ratings. I use Metacritic for a quick glimpse but I then rely on GameSpot’s quick Pros and Cons to determine if what I value highly is done well in a game. Lastly, I use videos of gameplay to make my final decision, rather than opinions of others.

Anyway, if I were you, I'd use all four main methods for every game for the next 6 months as a test. I'd do this partially because it would be very intellectually interesting from a review standpoint, and moreover, it would be most helpful to thoughtful readers.

For instance, give a 0-100 score for quick and easy Metacritic users. Also, give a dollar value assessment in relation to other games because this idea is so compelling. Then provide an L/N score that separates the technical gaming aspects from the artistic/thematic elements (I might suggest Gameplay/Narrative as the terms). Finally, give a score like Spill.com does: full price/sale/rental/skip.

(I love the L/N idea because so many games are at once technical disasters and thematic greats (Alpha Protocol, Arcana, VBtM, The Last Remnant Xbox, etc.), and this division allows players to decide what they value more, hardcore gameplay or unique and deep experiences.)

As I said, I do think this process tried for 6 months will greatly enhance our understanding of what is the best rating system. I know if I had a game review site, that is what I would do. ...Maybe I should start one.

I like a scale that goes up to 10, in increments of .1. For instance, 5.5, 5.6, 5.7 are all possibilities. It's only 1 percentage point off, and it's almost negligible, but in this industry, as with any, if you're a developer and are competing with the rest of the industry, and your shooter gets an average accumulated score of 9.6 over the other top shooters at 9.4, you're going to want to brag about your score. Likewise, it just might help out the indecisive gamer who wants "what most people think is the best game".

I'm a reviewer as well, and personally, I just try to stick to writing a good review...but, I do know that a score is necessary in the end. It should be people just reading the review, but some just skip to the score or overview; that's the truth.

So...at least on a 1% scale, we can give as accurate a score as possible, even though some would say such minute percentages are overly meticulous.
