AMD is working on K12, brand-new x86 and ARM cores (techreport.com)
96 points by AnthonyMouse on May 5, 2014 | 54 comments



> AMD has given the ARM core the code-name K12. I'm not sure whether that name also applies to the x86 core.

I hope not. They would be doing themselves a disservice by confusing their customers, calling two very different chips by the same name (which they're already doing with their upcoming ARM server chips). Don't try to "hide" the ARM cores behind the x86 chip brand, like you're ashamed of them. OWN them. Be proud of them and what they can deliver for the power/performance/cost (and if there's nothing to be proud of, then why bother making them?)


Similar, in a sense, to the mistake Microsoft made (IMO) by naming two very different products "Microsoft Surface".


Three, if you include the table PC.


Right; I specifically mean the anemic ARM-based "Surface" versus the similar-looking, but much beefier x86-based "Surface Pro".


Or the third Xbox being called the Xbox One...


Read the original source (http://www.amd.com/en-us/press-releases/Pages/ambidextrous-c... ); it directly specifies that it's the name for the ARM core.


Interesting to see the common socket proposal - this harks back to the days when AMD's x86 chips were sharing a slot with the DEC Alpha (although I'm not aware of anyone ever making a motherboard that could host either processor).


I had one which, with a BIOS change, was supposed to be able to run an Alpha.

Fun fact: the board - the Asus K7M - came out at a time when Intel was putting pressure on vendors not to support the (brand spanking new) AMD Athlon. The vendors were so scared that the first few shipments of K7M boards came in plain white boxes, without a reference to the manufacturer anywhere in sight. Not even in the fine print.


This must still go on. Someone brought me a Samsung laptop and its processor made no mention of AMD anywhere, even in the PC management tools. It just reports as a quad-core CPU.


My sister had that one. I had to use CPU-Z to find out its "true" identity (Jaguar core) :)



Thanks. We changed the URL.


This might well be the beginning of the end for x86 as we know it. The power advantages of ARM in the server space may not be as significant as in the mobile space but that doesn't mean they're minuscule.

There's a lot of code out there which already compiles to multiple architectures so that's one less objection. The rise of the open-source movement (and source-based distribution) have enabled this to happen on a wide scale.

Since the barriers to entry for a new architecture have fallen (and AMD intends to lower them further), it wouldn't surprise me if in a decade x86 is as much of a novelty as ARM was a decade ago.


I don't think ARM really has any inherent "power advantage" over x86; this paper might be interesting to read:

http://research.cs.wisc.edu/vertical/papers/2013/hpca13-isa-...

Yes, ARM architecture is simpler so AMD might stand a better chance, but I'm not so sure how they'll manage to make ARM cores reach x86-level performance and per-cycle efficiency. Currently in cross-architecture benchmarks like SPEC (which doesn't seem to have any ARMs in it at the moment -- hopefully this will change), x86 is at the top for per-thread, per-clock performance and at least a factor of 2-3 away from the next-best - which was POWER7, if I remember correctly.

And given that x86 has tons of intra-instruction optimisation opportunities that Intel are taking advantage of because it's such a rich CISC, and ARM doesn't ("enhanced REP MOVSB" is the first example that comes to mind)... personally, I'm not too optimistic. Either way it will be really interesting to see what happens, particularly the benchmarks.


Quite true; the one ISA advantage that ARM has is that it's much easier to decode. That does make a difference, but it's not a huge factor. On the other hand, x86 tends to be a bit more compact, but not as much as you might think.
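The decode point can be sketched with a toy model (hypothetical encodings, purely illustrative): with fixed-width instructions every boundary is known up front, so many decoders can work in parallel, while with variable-length encodings each boundary depends on having decoded the previous instruction.

```python
# Toy model of instruction-boundary finding; not a real ISA.

def boundaries_fixed(code, width=4):
    # Fixed-width (ARM-like): every instruction start is known immediately,
    # so decode can be fully parallel.
    return list(range(0, len(code), width))

def boundaries_variable(code):
    # Variable-length (x86-like): instruction N+1 can't be located until
    # the length of instruction N has been determined.
    offsets, i = [], 0
    while i < len(code):
        offsets.append(i)
        i += (code[i] % 3) + 1  # pretend the first byte encodes a length of 1-3
    return offsets

blob = bytes(range(16))
print(boundaries_fixed(blob))     # [0, 4, 8, 12]
print(boundaries_variable(blob))  # each offset depends on the one before it
```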

ISA doesn't make a huge difference here, but Intel currently has a pretty big lead in both process and architecture. Either because of superior research, the ability to impose more constraints than a merchant silicon house can, or both, Intel has been reaching new process nodes sooner and getting better performance on them than their competitors.

And Intel currently has a big architecture advantage too. Owning the whole PC market gives you the money to hire a lot of good engineers. It might be that there's a second disadvantage to x86 in that it's complicated enough that you need more engineering to get an equivalent architecture. I say this because IBM has managed to keep parity despite the fact that, IIRC, they're able to invest less engineering. Power 7 had higher single threaded performance than anything Intel had when it came out, and it looks like Power 8 is doing a similar leapfrog. If you look at SPEC int_rate and divide by the number of threads you'll find that Intel comes out a factor of 3 better today and 2 when Power 7 came out, but that's due to IBM having 4 threads per core to Intel's 2.

In theory there isn't anything preventing an ARM-64 chip from having performance as good as an x86 or POWER chip, but in practice Intel and IBM have a lot of experience in designing high performance chips but ARM doesn't. AMD sort of does, but they haven't been executing well at the high end recently and these will be their first ARM cores.

*http://www.spec.org/cpu2006/results/cpu2006.html


> Power 7 had higher single threaded performance than anything Intel had when it came out

IBM Power 795 (4.25 GHz, 128 core, SLES) 5350 base, 512 threads -> 2.46 result/thread/GHz

The Power 7 came out in 2010, so we can look at the x86 that were available around that time - the Nehalem era; e.g. this one

IBM BladeCenter HX5 (Intel Xeon E7540 - 2GHz) 490 base, 48 threads, 5.10 result/thread/GHz

For AMD, this one I picked turns out better than Intel:

IBM System x3755 M3, AMD Opteron 6134 (2.3GHz) 638 base, 48 threads, 5.78 result/thread/GHz

The Power 7's single-threaded efficiency is less than half that of competitive x86 CPUs at the time. The TDP is 200W+ as well - around double that of the x86s which are ~100W - so power efficiency isn't that great. The high clock frequencies probably have something to do with it.
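For reference, the per-thread/per-GHz figures quoted above can be reproduced directly from the SPEC numbers in this comment:

```python
# Recompute the result/thread/GHz figures quoted above from each
# system's base result, thread count, and clock frequency.

def per_thread_per_ghz(base, threads, ghz):
    return base / threads / ghz

print(round(per_thread_per_ghz(5350, 512, 4.25), 2))  # Power 795:    2.46
print(round(per_thread_per_ghz(490, 48, 2.0), 2))     # Xeon E7540:   5.1
print(round(per_thread_per_ghz(638, 48, 2.3), 2))     # Opteron 6134: 5.78
```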

> In theory there isn't anything preventing an ARM-64 chip from having performance as good as an x86 or POWER chip

True, but as the paper I linked suggests, power efficiency is going to suffer if they're optimising for raw performance. There hasn't really been aggressively high-performance ARM chips before unlike the other traditional RISCs (SPARC, POWER), so that's why I'm really interested to see what AMD does with it.


You're not looking at single-threaded performance with those numbers; you're looking at, um, multi-threaded performance per thread, which is a metric people don't use, for very good reason.

In computer architecture, it's very rare for anything to scale linearly. If you take a chip and double the frequency it runs at you won't get double the performance, because there are all sorts of latencies you haven't improved. If you double the number of cores in your chip you won't get double the performance, because they're contending for the same limited pool of off-socket memory bandwidth. If you add more sockets then some memory accesses will be to other sockets, increasing latency. And if you double the number of threads per core, you're lucky to get even a 20% increase in performance because now your threads are in contention for both the same execution and memory resources.
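A toy model (made-up numbers, not measurements) shows the shape of the problem: once the cores saturate a shared resource like memory bandwidth, per-thread numbers collapse even though total throughput is fine.

```python
# Toy model: each core contributes 1.0 unit of throughput alone, but the
# total is capped by a shared resource (e.g. off-socket memory bandwidth).

def throughput(cores, per_core=1.0, shared_cap=8.0):
    return min(cores * per_core, shared_cap)

for n in (1, 4, 8, 16, 32):
    total = throughput(n)
    print(n, total, round(total / n, 2))  # per-core "efficiency" falls off
```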

So when you compare a 32 socket, 4 thread per core system to a 4 socket, 2 thread per core system on the basis of thread performance you're being ludicrously unfair. Would you claim that a non-hyperthreaded Intel i5 has much better single threaded performance than a hyperthreaded Intel i7?

If you follow the link I gave, you can find the actual single-threaded SPECint results at the top: two base results for the Power 780 (29 and 44), and many results for E7540s, which seem to be around 24 for the first ten results I checked.

Sure, the SPECint rate results show a different story if you divide by the number of threads, but that isn't what people mean when they talk about single threaded performance.


> multi-threaded performance per thread which is a metric people don't use for very good reason.

Maybe I used the wrong term but I'm referring to the idea of how much work can be done by a single instruction stream (thread) in a fixed number of clock cycles.

> but that isn't what people mean when they talk about single threaded performance.

Then what do they mean?

I understand what you mean about scaling not being linear with the number of threads, but even with the same (very large) number of threads:

POWER7@3.44GHz, 384 threads (16 chips, 6 cores/chip, 4 threads/core) result 3560

Xeon X7542@2.67GHz, 384 threads (64 chips, 6 cores/chip, 1 thread/core) result 8190
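Dividing by the common thread count (numbers from the two SPEC results above; note this is still a rate metric, not a single-thread measurement):

```python
# Per-thread rate at equal thread counts, from the two results above.
power7 = 3560 / 384   # POWER7 @ 3.44 GHz, 4 threads/core
xeon   = 8190 / 384   # Xeon X7542 @ 2.67 GHz, 1 thread/core
print(round(power7, 2), round(xeon, 2), round(xeon / power7, 2))
# 9.27 21.33 2.3
```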


How much work an instruction stream can do in a fixed number of clock cycles is going to be hugely dependent on what other instruction streams executing at the same time might be doing. That's why the convention is, when measuring single-threaded performance, to only use a single thread.

Nothing says that you have to run the same number of threads in your workload as you have hardware threads. Operating systems are there to multiplex software threads over hardware threads, and part of SPEC is a test of the operating system and compiler as well as the chips, motherboards, and memory. There's nothing to prevent someone from taking the Xeon system in your post, running it with 30,000 threads, and producing a per-thread performance result much, much lower than when running it with 384 threads.

The interesting results are which systems can achieve the absolutely highest throughput and single thread performance, and which can achieve more throughput or single thread performance per unit price or unit power consumption.


'I don't think ARM really has any inherent "power advantage" over x86'

The paper states (roughly) that "ISA being CISC or RISC doesn't matter... performance differences are generated by ISA-independent microarchitecture differences".

So being ARM or x86 matters for "power advantage", but being "CISC or RISC" doesn't.


The established terminology is "architecture" for the instruction set architecture and "microarchitecture" for the user-invisible implementation details. So, microarchitecture means, for example, AMD K7 vs Intel P6.

So the quote is saying that ARM vs x86 doesn't matter, but e.g. Cortex-A8 vs Cortex-A9 does.


I was using the paper as a basis, since that's what was linked (and I found it a poor one).

"We find that ARM and x86 processors are simply engineering design points optimized for different levels of performance, and there is nothing fundamentally more energy efficient in one ISA class or the other. The ISA being RISC or CISC seems irrelevant."

So if you think of them as abstract X and Y, of course they don't matter. However, we are talking about ARM vs x86, which have measurable properties in the context and timeframe of this whole discussion, and that matters as "a phone powered by ARM" vs "a phone powered by x86", or "a server powered by ARM/x86". That's the level of terminology we are using.


Intel is really good. Really, really, really good; I have a hard time dismissing them. But the ability to buy from many vendors and build custom hardware is compelling. I felt this way before about PowerPC, but there seems to be more compelling community support for ARM this time around; the installed base of mobile devices makes a huge difference. It seems to me that Intel has been lazy in exploiting their performance edge. If AMD and others start selling inexpensive, lowish-power 8- and 12-core ARM systems that can legitimately hang with Intel, then I think Intel will have some serious competition, especially as the niche vendors start making 32- and 64-core chips, assuming they can keep the prices decent.

It could be very interesting if a Dell, for example, starts to get in on the game; they're already attempting to be solution vendors. Perhaps targeted ARM cores will start being built, like storage-centric designs for SAN-type systems, and network processors will fall out pretty quickly, all with fundamentally the same ISA and just different accelerators for different domains. That's something Intel has always been against.


>"This might well be the beginning of the end for x86 as we know it."

That would require Intel to stand still with its Atom line. I'm quite sure that Intel, with its well-known corporate paranoia about the marketplace, will continue to compete in low-power x86 technology. Even if AMD delivers a good product with this hybrid, I expect they'll still have a helluva fight from Intel on their hands.


Just thinking about it superficially (and in general, not just the server market), it seems ARM is eating Atom's lunch. Atom has to try very hard to cut into the mobile market. I can see maybe netbooks and kiosks, but how big is that compared to smartphones?

Now in the lower-power, micro-server/blade market, I can see how Atom might be a good competitor to ARM for software that can't be compiled for ARM easily. But I'm also not sure how big that market is.

With virtualization and containerization a large powerful Xeon machine carved into hundreds of containers might be just as good or even better.


x86 is unlikely to go away any time soon. I could see a world where ARM was more prevalent for consumer devices, but that's only a fraction of computing power. Servers are going to be x86 for a while, there's just too much of a power differential there. And high end workstations and gaming systems, including gaming consoles, are going to be x86 for years and years.

It'll be interesting to see what happens with Intel's x86 SoC stuff in the near future. They have the capability to be competitive with ARM, and some mobile devices (like smartphones) are already x86 based.


Are you assuming that an x86 core is more powerful than an ARM core? That may be true historically, but only because ARM implementations were optimized for watts and x86 for compute. I think AMD is talking about optimizing ARM for compute as aggressively as x86, and that should (theoretically) make it slightly faster than x86, because the ARM design is more modern, having learned the lessons of CISC and RISC and all that.


I wouldn't bet on anyone beating Intel. They are a freaking behemoth. They have pretty much all the best engineers, and if they don't, they have the funds to hire them. If anyone gets close to actually threatening them, they have the means to ramp up their game.


CISC is dead; it's mostly irrelevant. All modern computers are RISC-based internally; x86 CPUs merely have a micro-op translation system in front. That imposes some overhead, but in modern CPUs with gigatransistors that overhead isn't that great, and it's usually not a bottleneck.

The lesson of the last 20-30 years is that the ISA is almost entirely tangential to performance. There are some cases where that's not entirely true (like VLIW), but for the most part it still holds. What matters is the core design at a hardware level. And there it would be the height of foolishness to bet against Intel. They've been challenged multiple times by multiple world-class competitors, and they've demolished them.

Why is it so hard to imagine a world where multiple competitors have their niches? Look at iPhone vs Android, for example, which are in the exact same market; there's room for diversity and competition. The only way Intel could "die" is if they stop paying attention and stop caring about competing, and I just don't see that happening. At the end of the day they still have a ton of talent, the best fab capability and capacity in the world, and the ability to execute on projects successfully.


"They've been challenged multiple times by multiple world-class competitors, and they've demolished them."

Yes, but not always by technical might. They were once the scrappy underdog, and it's been shown in courts they aren't that worried about being unsportsmanlike.


"That imposes some overhead, but in modern CPUs with gigatransistors that overhead isn't that great, and usually not a bottleneck."

Citation needed.


I think x86 processors since the Pentium Pro (mid-to-late '90s) have been converting instructions into internal micro-ops. If it was feasible/efficient then, it will now be quite a small block of logic. While there will be some time used to decode, there is also time saved in having denser code (more code cached, less memory bandwidth used). It is far from clear that the overall effect is slower.

I would turn it around and ask you to cite something showing that the translation does cause significant performance loss.


> There's a lot of code out there which already compiles to multiple architectures so that's one less objection

Yeah and there's a lot of code out there which has already been compiled to x86, the vendor has gone out of business, and isn't getting compiled again.


True, but that code isn't exactly a driver of new hardware sales.


Never underestimate one powerful and common optimization: buying new hardware.


And that's what emulation and binary translation are for! :-P


And there goes the power efficiency...


Yes, but if you are still using such binaries they themselves are a serious risk to your business in other ways.

Backwards compatibility is good for transitions, but when you are using it to mitigate risks due to dead companies you are using a technical solution to fix something that is inherently a business/people problem.


Like what? Games? Contrary to enthusiast opinion, games have a pretty short life for most people. Doubt that will be a problem. What kind of enterprise is going to run software without a maintainer? I can't come up with anything else than games for which this is a relevant scenario.


You are kidding, right? There are loads of old systems that have not been updated, running on MS-DOS or Windows 3.1, stuck in some cupboard in a factory, necessary for continued operation. Stuff that maybe 100 people have ever heard of.


Yeah, I know about those. There are industrial control computers that still run on Intel 386 (the original). But once those guys run out of spare parts they won't just upgrade the processor.


Intersil still makes an 8088, and it's quite expensive too: http://www.intersil.com/content/intersil/en/products/space-a...

As the URL suggests, there are applications where an existing, well-characterised CPU is highly preferred over a relatively new one where some previously unknown errata could have catastrophic results.


It doesn't even have to be errata. That timing loop someone wrote decades ago might run too fast now. Even if it determines its loop counter at startup, it may fail because it assumes 16 bits is enough for the counter.

Conversely, some instructions may have gotten slower over time. For example, I doubt Intel worries much about the performance of the original 8087 floating-point operations anymore, so they could move them into microcode or even make them illegal instructions, to be emulated in software (I don't think they do at the moment, because the overhead of decoding the original 8086 instructions is low, but if x86 stays around for a few more decades, at some point I think they will consider doing the latter)
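The 16-bit counter failure mode mentioned above can be sketched like this (hypothetical firmware logic, for illustration only):

```python
# A delay loop calibrated at startup: loops-per-millisecond is measured
# and then stored in a 16-bit variable, silently wrapping on fast CPUs.

def calibrate_loops_per_ms(loops_per_second):
    count = loops_per_second // 1000
    return count & 0xFFFF  # 16-bit storage: values >= 65536 wrap around

print(calibrate_loops_per_ms(10_000_000))     # 10000 -- fits, delays work
print(calibrate_loops_per_ms(1_000_000_000))  # 16960 -- wrapped, delays break
```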


"Completely Static CMOS Design" - you can run this down to 0 Hz.


8080, or 8086. They use older processors because the wider process node has a lower chance of a soft error flipping a bit, which you really care about in, say, PLCs.


They can't replace Bulldozer fast enough.


The second generation had pretty good value for money. It's a bit long in the tooth now, but you make it sound like it was bad when it came out.


Sure, AMD had to price them so that they were competitive. From a consumer standpoint there were good potential reasons to buy them. From AMD's perspective, those were big chunks of silicon being sold for far less money than what Intel was able to charge for something that cost as much to manufacture. I doubt they made back their development expenses on them, though thankfully AMD had a lot of other products that worked out better.


I don't disagree, but my lab is full of st00pid cheap 8-core 8250 and 8350 chips running VMWare. I paid a fraction of what other folks paid for 4 core i7, and for my use case, I get lots more out of it. YMMV.


They are definitely cheap. I just built a 32 core cluster for £1200: http://rwmj.wordpress.com/2014/04/28/caseless-virtualization...


That's fucking awesome. I'm an amateur.


Let me shove 60 of these into 4U, 10Gbit to each blade, redundant switches in the chassis itself with multiple 40Gbit uplinks, and give each blade M.2 slots instead of 2.5" drive slots, and let me shove 64GB of memory into each blade.

If AMD can't get a vendor to build these this year, I don't see a future for AMD.

Yes, and I'm aware this is probably 6000W peak into 4U.

Edit: You can downvote me all you want, but this is what Intel is already planning on rolling out through Dell and HP in the 2015 time frame. This much has already been reported on links that hit HN's front page.


Maybe they would put them in something like this?

http://www.seamicro.com/sm15000

They get 256 low-power Atom cores in there, that's about 100 chips per 4U?


I find AMD's ownership of Seamicro confusing, especially since they still sell Intel CPUs. Also, those are not server Atoms and very old.

Most likely, future systems will look like clones of HP's Moonshot chassis (which is itself a clone of Sun's Thumper, doing blades instead of HDDs).



