Testing Game Performance
On the gaming side, I’m taking a closer look at X-Plane 9.3 as a benchmark, as well as including the usual suspects.
Most of the games I use for benchmarking are now available on Steam, which makes installing them much easier. Then there’s X-Plane. That ships on 6 DVDs. Yes, most of the DVDs are scenery, so I may be able to get away with a fairly minimal install, once I figure out which scenery files are needed by the performance test.
Of course, we’ll be seeing the next generation of DirectX 11 graphics cards. The problem with any GPU that supports a new API is finding benchmarks – or even visual tests – that actually make use of the new API. We’ll no doubt see a bunch of demos indicating the wondrous beauty that can be created with the new API. But real performance tests can really only be run on apps using the old API.
That’s the Catch-22 of using hardware that expands graphics capabilities.
Of course, games can be used to benchmark processors and platform technologies. Benchmarking graphics performance for games is quite different from testing CPUs, chipsets and memory.
With graphics hardware, you really want to push pixels — high resolution, lots of AA and AF, lots of shaders running, both pixel shaders and shaders that push geometry.
For CPU testing, I tend to run two different sets of tests: very low resolution benchmarks, with features like AA, postprocessing and AF dialed way down. This enables the CPU’s contribution to be more visible. Then I run them at more typical resolutions that a gamer might use — usually 1680×1050, though I’m thinking about moving to 1920×1080. I’d still keep AA and AF down, but turn up other graphics features. This gives us an idea of how much the CPU might realistically add during real game play. Games, of course, will differ depending on the game type.
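The two CPU-test passes described above can be sketched as a small settings matrix. This is only an illustration, not the actual test harness: the class and function names are hypothetical, the 800×600 low resolution is an assumed stand-in for "very low resolution", and AA/AF values of 0 are an assumed reading of "dialed way down".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameTestConfig:
    """One benchmark pass. All field names are hypothetical."""
    label: str
    width: int
    height: int
    aa_samples: int       # anti-aliasing samples; 0 means off (assumed)
    af_level: int         # anisotropic filtering level; 0 means off (assumed)
    postprocessing: bool  # post-processing effects enabled?
    detail: str           # "low" or "high" for the remaining graphics features


def cpu_test_passes(typical=(1680, 1050), low=(800, 600)):
    """Return the two passes used for CPU testing.

    Pass 1 minimizes GPU load so the CPU's contribution is more visible.
    Pass 2 uses a typical gaming resolution with AA and AF still down,
    but with other graphics features turned up.
    The `low` default of 800x600 is an assumption, not from the article.
    """
    return [
        GameTestConfig("cpu-isolated", low[0], low[1], 0, 0, False, "low"),
        GameTestConfig("cpu-realistic", typical[0], typical[1], 0, 0, True, "high"),
    ]


if __name__ == "__main__":
    for cfg in cpu_test_passes():
        print(cfg)
```

Moving the typical resolution to 1920×1080 would then be a one-argument change: cpu_test_passes(typical=(1920, 1080)).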
10 comments
Derek says:
August 24, 2009 at 3:53 pm (UTC -7 )
While single-threaded testing might be important for all the reasons you mentioned, I have a hard time imagining a true single-threaded usage in everyday life. No matter what else I might be doing, there is always something else. Ventrilo during the gaming session, or Pidgin while watching a movie. When giving the computer an hours-long task like transcribing video (and walking away) you are virtually guaranteed to have the system come along and do something under the covers. Defragging the disk, a home server backup, the iPod deciding that NOW is a good time to sync.
Right at this moment I have 88 processes running on my Windows 7 box. I have complete control over maybe 20 of them. Is a single-thread test even meaningful?
Loyd Case says:
August 24, 2009 at 4:28 pm (UTC -7 )
I’ll bet if you looked at your performance meter in Task Manager, you’d see that all the cores of your CPU are about 98% unused. Those tiny OS tasks take up very little CPU time, and are often idle.
When you launch something CPU intensive, that also happens to be single-threaded, Nehalem will run that task in one of the cores at a higher frequency than the other cores. That’s when you’ll see at least one of the CPU meters peg.
Finally, Windows 7 is smarter than Vista and much smarter than Windows XP at assigning and maintaining a single, intensive task on one of the cores, rather than doing a lot of core swapping, which slows things down and wastes resources.
Derek says:
August 24, 2009 at 5:35 pm (UTC -7 )
I get all that. Just making the point that single-threaded performance is getting less meaningful. Since it is hardly possible to buy a single-threaded processor, nor to configure a machine to not have multiple uncontrollable processes, it seems an academic discussion.
We’ll care that a task takes 60 minutes on the new box and 73 minutes on the old box. But until the software makers re-write a bunch of their core code, upgrading from a dual-core 9300 (for example) to an effective 8-core i920 possibly just means that I will have a lot more idle time. Yes, identical tasks will be somewhat faster, but not 4 times as fast.
Ah, the good old days when upgrading from 25MHz to 90MHz actually meant almost 3x in performance boost.
Brandon Champion says:
August 24, 2009 at 9:10 pm (UTC -7 )
“I care about digital photography and video encoding performance. I care about games performance.”
Finally! No more irrelevant stuff. I wanna see things like… encoding time in Vegas and FPS in ArmA2. Something I can do on my OWN machine to compare.
Markeyse says:
August 24, 2009 at 10:53 pm (UTC -7 )
I can’t wait to see SSD Performance. This is definitely the way of the future and will be very valuable.
I do audio engineering and I am starting to dabble in video, and my hardware is everything to me.
Carlo says:
August 25, 2009 at 1:51 am (UTC -7 )
I like your approach. It’s an approach shared by a UK magazine called Custompc, who have designed their own (I think) similar battery of tests, which you can download and use. Take a look at http://bit.ly/oolIs for the explanation and download; you may find it useful.
trip1ex says:
August 25, 2009 at 7:58 am (UTC -7 )
I used to pay attention to benchmarks.
Nowadays all I need is a rough number. 10% faster or 15% faster. And that’s it.
Your article could really just be a number. Take after videogame reviews and just assign the hardware a speedscore. A 10 means it’s 10% faster. 15 means 15% faster. Etc.
This would lead to lots of debate because assigning one overall number is a judgement call. And imagine if all the various hardware sites started doing this. I can see the flames and discussions about why this or that site gave this or that piece of hardware such a high or low (speed)score.
Anyway benchmarks could use a little bit of fun and a judgement call like that. They’ve become stale.
Faster hardware is overkill for many tasks.
PC gaming isn’t what it once was. It is not like new PC gaming hardware is even needed. DX11? It’s as if a million people yawned simultaneously.
Sure, i7 gave video encoding a nice boost. It still takes too long, and you still can’t do video encoding very well while you’re doing other tasks, at least not on my 2.66GHz Intel C2D CPU.
Software really has to come full front and take advantage of this hardware. Intel should just bury themselves in the lab for 5 years before announcing another cpu. By that time they’ll have a truly new and useful cpu for everyone.
Anyway, I applaud the effort to try and make benchmarks more relevant. Hopefully you’re not swimming upstream.
Loyd Case says:
August 25, 2009 at 9:44 am (UTC -7 )
The other thing I’ll be doing will be percentile comparisons to a reference system. Right now, it’s a very high end system (Core i7 975). But I plan on keeping that around, and we’ll see systems that exceed that over time.
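As a rough illustration of comparing against a reference system, a result can be expressed as a percentage of the reference score. This is a sketch under assumptions: it reads "percentile comparisons" as percentage-of-reference scores, and the function name and the higher_is_better flag are hypothetical, not part of any published test suite.

```python
def percent_of_reference(score, reference_score, higher_is_better=True):
    """Express a result as a percentage of the reference system's result.

    100.0 means the system matches the reference; above 100.0 means it
    beats the reference. For time-based results (lower is better), pass
    higher_is_better=False so the ratio is inverted.
    """
    if score <= 0 or reference_score <= 0:
        raise ValueError("scores must be positive")
    if higher_is_better:
        ratio = score / reference_score
    else:
        ratio = reference_score / score
    return 100.0 * ratio


if __name__ == "__main__":
    # Frames per second: higher is better.
    print(percent_of_reference(45.0, 60.0))
    # Encoding time in minutes: lower is better.
    print(percent_of_reference(73.0, 60.0, higher_is_better=False))
```

Keeping the same reference machine over time means that later systems simply score above 100.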
Mark says:
August 25, 2009 at 9:58 am (UTC -7 )
Sorry to post this here — delete after reading:
Loyd, I tried to send you an email this morning to the address listed on your contact page (loydcase@improbableinsights.com). A few minutes ago I got a ‘failed permanently’ message kicked back to me:
Hope that helps.
Loyd Case says:
August 25, 2009 at 10:15 am (UTC -7 )
Thanks, Mark. I’ve fixed the contact page. The email is actually loyd (at) improbableinsights (DOT) com.