For much of his 34 years at Intel, Justin R. Rattner has been a pioneer in parallel and distributed processing. His early ideas didn't catch on in the market, but the time has come for them now, he recently told Computerworld's Gary Anthes.
Q: "Are we at the end of the line for microprocessor clock speeds?"
A: "We'll see modest growth, 5 per cent to 10 per cent per generation. Power issues are so severe that there won't be any radical jumps. If you get a 2 per cent improvement in clock speed but at a 5 per cent increase in power consumption, that's not a favourable trade-off.
"I keep reassuring Bill Gates that there is no magic transistor that is suddenly going to solve his problem, despite his strong desire for such a development."
Q: "What exactly is Gates worried about?"
A: "First, a steadily rising single-thread performance would benefit the entire existing base of software. Second, multicore and, later, many-core processors require a new generation of programming tools. Given the rudimentary state of parallel software, the investment across the entire computing industry will be very large. Third, the tools have to be applied by people with the skills needed to use them effectively. Retraining existing programmers and educating a new generation of developers coming out of school is another formidable challenge. It will take years, if not decades, to reach the point where virtually all programmers assume the default programming model is parallel rather than serial."
Q: "So the only way to keep Moore's Law going is to add more computing cores to a microprocessor chip?"
A: "The only way forward in terms of performance - but we also think in terms of power - is to go to multicores. Multicores put us back on the historical trajectory of Moore's Law. We can directly apply the increase in transistors to core count - if you are willing to suspend disbelief for a moment that you can actually get all those cores working together."
Q: "How many cores might we see on a chip in five years?"
A: "We have been talking about terascale for the past couple of years, and we are demonstrating an 80-core [processor chip]. Our [future] product is Larrabee. It's not 80 cores; we can do things like that in research because we don't care how much it costs. Our hope is that that will stimulate software developers to bring terascale applications to market. We are talking about early production [of Larrabee] in 2009."
Q: "How many cores will Larrabee have?"
A: "I can't comment on details about the first product. It's sufficient to say 'more than ten,' which is what we define as the boundary between multicore and many-core. It's better to think of it as a scalable architecture family, with varying numbers of cores depending on the application."
Q: "In five years, will virtually all new software be written for multiple processors?"
A: "Yes, but people will not go back and rewrite a lot of existing software. I don't think word processors need 16 cores grinding away on them."
Q: "So, will we need a new kind of programmer then?"
A: "Yes, it's a whole new ballgame. We have been trying to get the attention of the academic community on this problem. They got all fired up about parallel programming about 20 years ago. Everywhere you went, people were working on parallel programming, but it never came down. It remained in the high-performance computing space."
Q: "Why are things different now?"
A: "Now, every computer you buy has two or four or eight or maybe a lot more cores. Twenty years ago, the market for [multiprocessor] machines wasn't big enough to support the kinds of R&D to really move the ball down the field. It's a disappointment, but not a surprise, that not much happened over 20 years. Now, the financial incentive is there, and the R&D budgets are there. We know we can't sell hundreds of millions of processors that people can't program. But we are on the flat part of the learning curve right now."
Q: "What are the academic researchers doing now?"
A: "Until recently, they weren't even teaching parallel programming. You could get a Ph.D. in computer science and never write a parallel program. But now hundreds of universities worldwide are reintroducing parallel programming into their curricula. Intel and other companies are working on funding programs to reignite academic research in parallel programming and architectures.
"We went out to the universities and talked about these plans. They said, "This is great, because we weren't talking about this." It's sort of like the elephant in the room. We are all buying dual-core this and quad-core that, but no one was saying, "We really don't have much technology to do all this stuff."
Q: "What sort of research are you doing internally?"
A: "Take a look at our CT - a dialect of C for 'throughput' computing. It raises the level of abstraction so programmers are not dealing with parallelism in an explicit fashion. You can think very naturally about data structures, and it just has a huge impact on productivity. It lets you express data parallelism very easily. What's different about it is that it deals with all kinds of data structures in an almost seamless way."
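The data-parallel style Rattner describes can be illustrated loosely as follows. This is not CT syntax, which the interview does not show; it is a minimal Python sketch, and the helper name `parallel_map` and the sample data are hypothetical. The caller states one operation over a whole collection and leaves the distribution of work to the runtime. In standard CPython, threads do not speed up CPU-bound work, so the sketch shows the programming style rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(fn, items, workers=4):
    """Apply fn to every element of items (hypothetical helper).

    The caller never creates or schedules threads; the pool decides
    which worker handles which element. Results keep the input order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, items))


# Irregular data: rows of uneven length, one row per task.
rows = [[1, 2, 3], [4], [5, 6]]
row_sums = parallel_map(sum, rows)
print(row_sums)
```

The same call works on flat or nested collections, which is the kind of uniform treatment of data structures the answer alludes to.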
Q: "Can scheduling those parallel threads be done just in software?"
A: "If you rely on the operating system to schedule all those threads, you are probably dead in the water. We've developed an architecture for hardware thread scheduling and done extensive simulation of it to understand the trade-offs and refine the mechanism. It's too early to say which product will have it and when it will reach the market."
Q: "But aren't there many applications that are inherently not suited to parallel processing?"
A: "We have faced this problem internally. There is this unfounded belief in Amdahl's Law [which says the speedup from adding processors is capped by the fraction of a program that must run serially]. It's: 'I've got this program, and it doesn't get faster after four cores. You put eight cores on it, and it still runs at the same speed.' I hear that all the time. But then we take a look at it and we go, 'You know, you didn't really think about it.' But there may be another way where you don't get as much performance in the two-to-four-core range, but it keeps scaling, so you take a slower initial position, but you can scale to 16 or 32 processors. It's a matter of clever algorithms, different decomposition of the problem and using better tools that make decisions on the fly for the programmer."
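The trade-off in this answer can be made concrete with the standard Amdahl's Law formula, speedup = 1 / (s + (1 - s) / n), where s is the serial fraction and n the number of cores. The sketch below is illustrative only: the serial fractions (25 per cent and 2 per cent) and the assumption that the restructured algorithm does twice the total work are made-up numbers, not figures from the interview, and `relative_speed` is a hypothetical helper.

```python
def amdahl_speedup(serial_fraction, cores):
    """Amdahl's Law: speedup over one core when serial_fraction of the
    work cannot be parallelised."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


def relative_speed(serial_fraction, cores, overhead=1.0):
    """Speed relative to the original program on one core.

    overhead > 1 models a restructured algorithm that does more total
    work (an assumption made for illustration).
    """
    return amdahl_speedup(serial_fraction, cores) / overhead


for cores in (1, 2, 4, 8, 16, 32):
    original = relative_speed(0.25, cores)            # 25% serial
    restructured = relative_speed(0.02, cores, 2.0)   # 2% serial, 2x work
    print(f"{cores:>2} cores: original {original:5.2f}x, "
          f"restructured {restructured:5.2f}x")
```

Under these assumed numbers the original decomposition can never exceed 4x however many cores are added, while the restructured version is slower at two and four cores but pulls ahead by eight and keeps scaling.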
Q: "What's the future of spintronics, in which information is based on the spin of an electron rather than on its charge?"
A: "Charge-based electronics is going to run out of steam. The memory guys have already hit that point, basically. They can't make those memory cells any smaller. So researchers are looking at quantum effects like spin, and some early results aren't bad. Spin has some nice things about it, both in terms of performance and power."
Q: "Could we have working computers based on spin in ten years?"
A: "Yes, but I wouldn't be more aggressive than that. We'd want to make the transition seamlessly. We wouldn't want to say, 'OK, there will be no new microprocessors for five years while we figure out spintronics.'"