Single versus Multi-threaded Performance
On the CPU front, we’re faced with an interesting change in Intel’s overall posture with regard to CPU benchmarking: they’d like to see more single-threaded performance tests.
This causes a certain amount of head scratching and baffled looks. After all, Intel’s implemented Hyper-Threading (simultaneous multithreading) in its Core i7 CPU line, which means it supports eight separate threads on four cores. Hasn’t Intel been pushing the use of threaded apps in benchmarking?
While this seems like an about-face on Intel’s part, it’s not. No one at Intel has suggested that we stop testing the performance of threaded apps. There’s a realization, though, that many apps are still single-threaded or, at best, dual-threaded. On top of that, many games, even those ostensibly multi-threaded, are at best weakly multi-threaded.
Now let’s think about a feature that Intel built into Core i7 called Turbo Boost. Turbo Boost allows the clock frequency of a single core to increase if the processor is running a CPU-intensive, single-threaded app. That boost can be significant, which means the performance of that single-threaded app can be substantially higher than it would be at the native clock speed.
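To put a rough number on why a weakly threaded app gains little from extra cores, here is a minimal sketch using Amdahl's law. The parallel fractions are hypothetical placeholders, not measurements of any real game or application, and the model treats every hardware thread as a full core, which overstates what Hyper-Threading delivers.

```python
def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Ideal speedup when only `parallel_fraction` of the work scales across threads."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at full length; the parallel part is split evenly.
    return 1.0 / (serial_fraction + parallel_fraction / threads)


if __name__ == "__main__":
    # Hypothetical parallel fractions, chosen only for illustration.
    for fraction in (0.0, 0.3, 0.9):
        speedup = amdahl_speedup(fraction, 8)
        print(f"{fraction:.0%} parallel on 8 threads: {speedup:.2f}x")
```

Under this idealized model, an app that is only partly parallel stays well short of an 8x gain no matter how many threads the CPU offers, which is why single-threaded speed still matters.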
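As a back-of-the-envelope illustration of what a clock boost could mean for a single-threaded task, the sketch below assumes the task is entirely CPU-bound and that runtime scales inversely with clock frequency. Both are simplifying assumptions, and the clock speeds used are hypothetical, not Intel specifications for any Core i7 model.

```python
def boosted_runtime(base_seconds: float, base_ghz: float, boost_ghz: float) -> float:
    """Estimate runtime at a boosted clock, assuming time scales as 1/frequency."""
    if base_seconds < 0 or base_ghz <= 0 or boost_ghz <= 0:
        raise ValueError("runtime must be non-negative and clocks must be positive")
    return base_seconds * base_ghz / boost_ghz


if __name__ == "__main__":
    # Hypothetical clocks and runtime, chosen only to make the arithmetic easy.
    base, boost = 2.0, 2.5
    estimate = boosted_runtime(100.0, base, boost)
    print(f"100 s at {base} GHz is roughly {estimate:.0f} s at {boost} GHz")
```

Real tasks also wait on memory and disk, so measured gains would likely be smaller than this estimate suggests.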
Intel, of course, would like to show this feature off. It’s really a pretty useful feature, and may even see some improvement in the upcoming release of Lynnfield. But there’s no one at Intel discouraging the use of multi-threaded tests, either.
In the end, I plan on testing using real applications, with one or two synthetic tests that are more reality checks than any serious indicator of performance. Synthetic benchmarks, like 3DMark, have their place, but no one should ever rely on them as a serious indicator of how real apps might perform. It’s the combination of synthetic tests plus some real-world benchmarks that gives us a full understanding of how a product performs.
The exception to this rule, at least for me, will be hard drive benchmarks. It’s tough to run applications to get a quantitative test of hard drive or SSD performance. So I’ll continue to use synthetic tests, like HDTach, PCMark Vantage and HD Tune Pro for evaluating hard drive performance.
So the end of summer is here, and it’s the season for benchmarking. Time to fire up the rigs, pop in the new gear and see how it all hangs together.
10 comments
Derek says:
August 24, 2009 at 3:53 pm (UTC -7 )
While single-threaded testing might be important for all the reasons you mentioned, I have a hard time imagining a true single-threaded usage in everyday life. No matter what else I might be doing, there is always something else. Ventrilo during the gaming session, or Pidgin while watching a movie. When giving the computer an hours-long task like transcribing video (and walking away) you are virtually guaranteed to have the system come along and do something under the covers. Defragging the disk, a home server backup, the iPod deciding that NOW is a good time to sync.
Right at this moment I have 88 processes running on my Windows 7 box. I have complete control over maybe 20 of them. Is a single-thread test even meaningful?
Loyd Case says:
August 24, 2009 at 4:28 pm (UTC -7 )
I’ll bet if you looked at your performance meter in Task Manager, you’d see that all the cores of your CPU are about 98% unused. Those tiny OS tasks take up very little CPU time, and are often idle.
When you launch something CPU intensive, that also happens to be single-threaded, Nehalem will run that task in one of the cores at a higher frequency than the other cores. That’s when you’ll see at least one of the CPU meters peg.
Finally, Windows 7 is smarter than Vista and much smarter than Windows XP at assigning and maintaining a single, intensive task on one of the cores, rather than doing a lot of core swapping, which slows things down and wastes resources.
Derek says:
August 24, 2009 at 5:35 pm (UTC -7 )
I get all that. Just making the point that single-threaded performance is getting less meaningful. Since it is hardly possible to buy a single-threaded processor, nor to configure a machine to not have multiple uncontrollable processes, it seems an academic discussion.
We’ll care that a task takes 60 minutes on the new box and 73 minutes on the old box. But until the software makers re-write a bunch of their core code, upgrading from a dual-core 9300 (for example) to an effective 8-core i920 possibly just means that I will have a lot more idle time. Yes, identical tasks will be somewhat faster, but not 4 times as fast.
Ah, the good old days when upgrading from 25MHz to 90MHz actually meant almost 3x in performance boost.
Brandon Champion says:
August 24, 2009 at 9:10 pm (UTC -7 )
“I care about digital photography and video encoding performance. I care about games performance.”
Finally! No more irrelevant stuff. I wanna see things like… encoding time in Vegas and FPS in ArmA2. Something I can do on my OWN machine to compare.
Markeyse says:
August 24, 2009 at 10:53 pm (UTC -7 )
I can’t wait to see SSD Performance. This is definitely the way of the future and will be very valuable.
I do audio engineering and I am starting to dabble in video, and my hardware is everything to me.
Carlo says:
August 25, 2009 at 1:51 am (UTC -7 )
I like your approach. It is shared by a UK magazine called Custom PC, which has designed its own (I think) similar battery of tests that you can download and use. Take a look at http://bit.ly/oolIs for the explanation and download; you may find it useful.
trip1ex says:
August 25, 2009 at 7:58 am (UTC -7 )
I used to pay attention to benchmarks.
Nowadays all I need is a rough number. 10% faster or 15% faster. And that’s it.
Your article could really just be a number. Take after videogame reviews and just assign the hardware a speed score. A 10 means it’s 10% faster, 15 means 15% faster, etc.
This would lead to lots of debate because assigning one overall number is a judgement call. And imagine if all the various hardware sites started doing this. I can see the flames and discussions about why this or that site gave this or that piece of hardware such a high or low (speed) score.
Anyway benchmarks could use a little bit of fun and a judgement call like that. They’ve become stale.
Faster hardware is overkill for many tasks.
Pcgaming isn’t what it once was. It is not like new pcgaming hardware is even needed. DX11? It’s as if a million people yawned simultaneously.
Sure, i7 gave video encoding a nice boost. It still takes too long, and I still can’t do video encoding well while doing other tasks, at least not on my 2.66GHz Intel C2D CPU.
Software really has to come full front and take advantage of this hardware. Intel should just bury themselves in the lab for 5 years before announcing another cpu. By that time they’ll have a truly new and useful cpu for everyone.
Anyway, I applaud the effort to try and make benchmarks more relevant. Hopefully you’re not swimming upstream.
Loyd Case says:
August 25, 2009 at 9:44 am (UTC -7 )
The other thing I’ll be doing will be percentile comparisons to a reference system. Right now, it’s a very high end system (Core i7 975). But I plan on keeping that around, and we’ll see systems that exceed that over time.
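One way such a comparison might be computed is sketched below, assuming "percentile comparison" here means expressing each result as a percentage of the reference system's result. The function name and the `lower_is_better` flag are hypothetical, introduced only for illustration; the flag covers time-based tests, where a smaller number is the better score.

```python
def percent_of_reference(result: float, reference: float,
                         lower_is_better: bool = False) -> float:
    """Express a benchmark result as a percentage of the reference system's result.

    100 means parity with the reference; above 100 means faster than it.
    """
    if result <= 0 or reference <= 0:
        raise ValueError("benchmark results must be positive")
    # For timed tests, invert the ratio so that a shorter time scores higher.
    ratio = reference / result if lower_is_better else result / reference
    return 100.0 * ratio


if __name__ == "__main__":
    # Hypothetical numbers, not measurements from any real system.
    print(percent_of_reference(45.0, 60.0))                        # frames per second
    print(percent_of_reference(90.0, 60.0, lower_is_better=True))  # encode time, seconds
```

Keeping the reference system fixed means later hardware simply scores above 100 instead of forcing older results to be rescaled.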
Mark says:
August 25, 2009 at 9:58 am (UTC -7 )
Sorry to post this here — delete after reading:
Loyd, I tried to send you an email this morning to the address listed on your contact page (loydcase@improbableinsights.com). A few minutes ago I got a ‘failed permanently’ message kicked back to me:
Hope that helps.
Loyd Case says:
August 25, 2009 at 10:15 am (UTC -7 )
Thanks, Mark. I’ve fixed the contact page. The email is actually loyd (at) improbableinsights (DOT) com.