Intel Kills “Tick-Tock” (fool.com)
255 points by Deinos on March 23, 2016 | 128 comments



Main takeaway: Intel is moving to a three-phase cycle instead of the two-phase tick-tock.

Secondary takeaway: due to the increased time it takes to move to a better manufacturing process, Intel will likely not have as much of a competitive advantage anymore. Though the number of competitors will continue to decrease (and already has) due to the high investments required.

Would be interesting to see how long Intel estimates the process cycles are going to be, i.e. how Moore's law will progress.

Weird statement in the article: "it (TSMC's 7nm tech) should be very similar in terms of transistor density to Intel's 10-nanometer technology". This makes no sense as it would be comparing apples and pears. Surely they are referring to TSMC's 10nm tech?!


> Weird statement in the article: "it (TSMC's 7nm tech) should be very similar in terms of transistor density to Intel's 10-nanometer technology". This makes no sense as it would be comparing apples and pears. Surely they are referring to TSMC's 10nm tech?!

Process node names are somewhat of a marketing exercise nowadays (and not as directly related to the physical properties of the fabrication process as they may seem--at least not without adjusting for the differences between the vendor-specific definitions in use).

Compare TSMC's 10nm to Intel's 14nm:

"In the case of TSMC they follow the “Foundry” node progress whereas Intel follows more of an “IDM” node transition 40nm versus 45nm, 28nm versus 32nm and 20nm versus 22nm. At the 14nm node TSMC has also chosen to call their node 16nm where everyone else is calling it 14nm."

Source: https://www.semiwiki.com/forum/content/3884-who-will-lead-10...

"Although the nominal gap in process nodes between Intel and TSMC appears to be narrowing, TSMC is not likely to catch up in terms of actual Moore’s Law scaling any time soon. TSMC’s 16FF+ process delivers only 20nm scaling, so they are still a generation behind Intel’s 14nm in terms of actual die area. TSMC said that 10nm shrinks by 0.52x from 16nm, nearly identical to the 0.53x scaling that Intel achieved from 22nm to 14nm. So if they stay on schedule, in 2017 TSMC will be in production on a 10nm process that is equivalent to the 14nm technology that Intel began producing in 2Q15. At that rate, even though Intel has slipped 10nm to 2H17, they will remain at least a year ahead of TSMC."

Source: http://www.eetimes.com/document.asp?doc_id=1327725
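
To make the quoted figures concrete, here's a back-of-envelope comparison of what the node names would imply versus the actual area scaling quoted above (a rough sketch; the 0.52x/0.53x numbers come from the EETimes quote, and the "implied" ones just pretend the names were literal linear dimensions):

    #include <cstdio>

    int main() {
        // If node names were literal linear dimensions, area would scale as (new/old)^2
        double tsmc_implied  = (10.0 / 16.0) * (10.0 / 16.0);  // ~0.39
        double intel_implied = (14.0 / 22.0) * (14.0 / 22.0);  // ~0.40
        // Actual area scaling factors quoted above
        double tsmc_actual   = 0.52;
        double intel_actual  = 0.53;
        printf("TSMC 16nm -> 10nm : name implies %.2fx, quoted %.2fx\n", tsmc_implied,  tsmc_actual);
        printf("Intel 22nm -> 14nm: name implies %.2fx, quoted %.2fx\n", intel_implied, intel_actual);
        return 0;
    }

In both cases the real shrink is noticeably less aggressive than the name arithmetic suggests, which is why density rather than the marketing number is the thing to compare.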


No, there's a very simple definition for these XXnm numbers and a very good reason why they only roughly correlate with transistor density.

These numbers refer to the length of the channel in the MOSFET or FinFET between the Source and Drain dopant regions. The channel is always the smallest feature the fabrication process can create, but how it's created is actually with lots of solid-state physics tricks (such as annealing to cause the dopants to spread out from where they were originally implanted, looking a lot like a physical Gaussian blur), as the original masks used to define the transistor are nowhere near that small.

Depending on how small they can get the mask details (mostly an issue of optics and mask creation, now bounded primarily by the frequency of the light used for the exposure, though interesting holographic tricks can help once they've left the laboratory phase), you can define the minimum size of the transistors themselves. But there are other considerations, like heat dissipation and the melting points of the various metals used, that may force the transistors to be larger than what you can theoretically build.

The channel length is still very important because it determines how many Coulombs of charge you need to switch the transistor on, and how long it takes to propagate a signal across the transistor.
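
To put rough numbers behind that, here are the textbook first-order approximations (nothing vendor-specific):

    Q_gate  ~ C_gate * V_dd,   with  C_gate ~ (eps_ox / t_ox) * W * L
    t_delay ~ C_load * V_dd / I_on

so shrinking the channel length L (and the width W along with it) reduces both the charge you have to move to switch the gate and the time it takes to move it.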

It also correlates with how much leakage current the transistor has (smaller = higher), and therefore the standby power consumption, so there is still an engineering trade-off in going smaller. This is also why server hardware tends to get the smallest sizes first: it cares least about idle power consumption when it's being utilized correctly.
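
The leakage side has a textbook first-order form too (again, not process-specific):

    I_off ~ 10^(-V_th / S),   S >= (kT/q) * ln(10), about 60 mV/decade at room temperature

Shorter channels lower the effective threshold voltage (drain-induced barrier lowering), so off-state current, and with it standby power, rises roughly exponentially as you shrink.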

Everything I've said is about 4 years out of date, from when I switched from EE back to software engineering for financial reasons (and also my wife is an EE, so we didn't want to put all of our financial eggs in one basket), but the semiconductor industry moves so slowly, and proof-of-concept fabrication techniques take so long to productionize, that I doubt this is very far off.

The only thing marketing has done here is fight over whether or not channel size is the number to focus on, depending on whether it benefits their corporation.


Thanks for the reply!

> These numbers refer to the length of the channel in the MOSFET or FinFET between the Source and Drain dopant regions. [...] The only thing marketing has done here is fight over whether or not channel size is the number to focus on, depending on whether it benefits their corporation.

Right -- I believe this is where the variety (or marketing) creeps in.

Here's what I mean: The various "gate length" definitions (printed, physical, effective, etc.) have a direct (and significant) impact on the ambiguity of the meaning of "process nodes" (that's in addition to the half-pitch interpretation in the DRAM industry); cf. Table 1: ITRS 2013 Data for CMOS Technology “Nodes” in http://semiengineering.com/a-node-by-any-other-name/

For reference (for anyone else reading, I know you know this), "channel length" vs. "gate length" is another difference to take into account: http://vlsi-soc.blogspot.com/2015/12/channel-length-vs-gate-... (More formally -- "Sidebar: Gate Length (Lg) versus Channel Length (L) and Experimental Data versus Equations" in http://www-inst.eecs.berkeley.edu/~ee130/sp06/chp7full.pdf)

Given that, I think it's fair to say the result is that this is more of a marketing name than a directly (in terms of physical feature sizes) interpretable number; as in:

- "At the December meeting, for example, Chenming Hu, the coinventor of the FinFET, began by mapping out the near future. Soon, he said, we’ll start to see 14-nm and 16-nm chips emerge (the first, which are expected to come from Intel, are slated to go into production early next year). Then he added a caveat whose casual tone belied its startling implications: “Nobody knows anymore what 16 nm means or what 14 nm means.”"

- "The switch to FinFETs has made the situation even more complex. Bohr points out, for example, that Intel’s 22-nm chips, the current state of the art, have FinFET transistors with gates that are 35 nm long but fins that are just 8 nm wide."

http://spectrum.ieee.org/semiconductors/devices/the-status-o...

Related: http://spectrum.ieee.org/semiconductors/design/shrinking-pos...


> This is also why server hardware tends to get the smallest sizes first: it cares least about idle power consumption when it's being utilized correctly.

That's certainly true for Google or Amazon but not for most real corporate IT departments. It's very common for small offices to need a local server for one reason or another even though the thing will be idle most of the time.

Moving the server into a larger datacenter would require a faster network link to that office which would be both more expensive and slower for local employees than the local server. So every little office gets a server which is mostly idle.

And then you need two of them for redundancy. And then you need four of them because one app touches credit cards and needs to be in the PCI CDE and the other doesn't etc. etc.


An office with two machines in it has no concern for power costs, it's nothing compared to their office lighting budget.

Compute cost matters when compute is a substantial part of your budget.


>That's certainly true for Google or Amazon but not for most real corporate IT departments.

I can't speak for all corporate IT departments, but I'm at a very large corporation that certainly cares about idle power consumption. Over the years our servers have gotten more powerful, and more power hungry. It's not that hard to add more power, but adding new cooling isn't so easy. The less power we use overall, the less cooling we have to add, and the happier we are.


You have the comparison backwards. The claim is that Google and Amazon don't care about idle power consumption, presumably because their homogeneous computing environments lead to relatively little idle time.


>Weird statement in the article

At this point, process nodes don't map to a particularly well-defined set of feature sizes across vendors. So what I assume they're saying is that what Intel calls 10nm is actually pretty similar in terms of transistor size/density to what TSMC is calling 7nm. (No idea if that's actually true but it's what I believe is being said.)


Yes. Node sizes and actual physical sizes have no correlation anymore. Smaller nodes are smaller, but nowhere near the sizes suggested by the names. The 45nm process, already nearly a decade old, highlights this nicely: https://en.wikipedia.org/wiki/45_nanometer#Technology_demos Effective sizes are all over the place, and Intel's process in particular is nowhere near 45nm in any relevant dimension. It's similar with more recent technologies (and with 3D transistors, the distance between two arbitrary points becomes even less comparable).


I've been wondering if EUV will bring things back in line. They say the light sources are not bright enough yet, but isn't a 2-4x longer exposure better than double or triple patterning? If the total time were the same, you'd still have the benefit of the shorter wavelength allowing dramatically smaller features. What am I missing?


It isn't 25% as bright as it needs to be - it's more like 1% or 0.1%. The total time isn't the same.
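
Back-of-envelope, under the simplified assumption that exposure time scales inversely with source brightness:

    relative scanner time ~ (number of exposures) / (brightness fraction)

    EUV at 1% brightness, single exposure :  1 / 0.01  = ~100x
    193i double or triple patterning      :  2-3 / 1   = 2-3x

So even multiple 193nm immersion passes come out far ahead on raw exposure time (and, as the reply below points out, self-aligned double patterning only needs one litho pass anyway).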


That is not true at all; wafer throughput is a factor of 4-5 less... The mistake in the first comment is that with self-aligned double patterning you only need one litho step, not two. The same applies to SAQP (self-aligned quadruple patterning). And since litho is one of the most expensive steps, double patterning is more cost effective.

From a technology point of view, the problem with EUV is more the line edge roughness, line width roughness, defects, and reliability... Granted, more light will make the job easier.


> how Moore's law will progress.

I think it's all but confirmed at this point that it's not progressing. What we're witnessing here is its death.


It certainly isn't dead. It is slower than it was before. Moore's original law stated that every 12 months the number of transistors would double. That was revised to 18 months in the 90s, and it is now due for another update.


I guess we can agree to disagree on that one. Every trend we look into, we see signs of its death, from memory prices to Intel's struggles with recent chips.


Furthermore it's not a useful trend anymore if it's due for an update. What would that new constant be that people feel confident will hold over the next 10 years? Seems like quite the Frankenstein "law" to be hacking piecewise functions together.

If it's not dead, it's certainly dying. Perpetual exponential growth is irrational given finite resources.


Well, unless the factor is smaller than 1.


Another reason might be that they have no AMD pushing them. They are hitting physical barriers, but the fact that they have no real competitors gives them less incentive to push the envelope.


It's possible, but the usual pattern is that Intel is slightly price-competitive with AMD when the CPUs are in the same range of performance, and then Intel's prices just rocket upward after that. There's no reason to not come out with an $8,000 desktop CPU if they can get more performance.


TSMC?


I'm also willing to give them the benefit of the doubt: given how sophisticated processors have become, they may have realized that for important use cases they can improve performance pretty significantly with mostly microcode tweaks and slight architecture changes. If that's the case, they may as well do that and take more time (and less cost) refining the processes and architectures for the next node.

I'm mostly speculating, though.


Fun fact: The distance between silicon atoms in a crystal is on the order of 0.2 to 0.5 nm. The 10 nm process is therefore operating on the order of tens of atoms. That's just nuts. Quantum effects must really complicate things at this scale!
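
Rough arithmetic, using textbook values for crystalline silicon:

    Si lattice constant   ~ 0.543 nm
    Si-Si bond length     ~ 0.235 nm
    10 nm / 0.543 nm      ~ 18 unit cells
    10 nm / 0.235 nm      ~ 43 bond lengths

So a nominally 10 nm feature is only a few dozen atoms across.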


Yeah, I've been looking forward to how they'll go past 10nm for years! Crazy to think how much control they have in the manufacturing process.


Aren't they planning to go to 7? I thought I read that somewhere.


It was in the article; TSMC expects it to go into production by 2018.


Yes, but is Intel?



So now it's "Tick-Tock-Tweak".


Some PR person at Intel is kicking themselves now.


Not necessarily; maybe they know it's time to stop setting "tick" expectations. The ticks may soon be gone entirely, and from their perspective it's time to be identified with other aspects, not with "ticks."


Not even that; the question is whether there will ever be more than one or two more "ticks" in the traditional sense. Apparently the sizes of the elements on the chips are roughly the same on Intel's 10nm and some others' 7nm. As an example of what people in the field say: if nothing changes, at 5nm, 80% of the transistors would have to be "dark" (unpowered) all the time. EUV, often presented as the new solution for these processes, may also be too expensive below 7nm, even though it still hasn't been introduced.
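
The hand-wavy arithmetic behind figures like that: if transistor density keeps roughly doubling per node but the energy per switching event stops scaling with it (the end of Dennard scaling), then at a fixed power budget the fraction of the chip you can keep active roughly halves each node:

    after 1 node : ~50% dark
    after 2 nodes: ~75% dark
    after 3 nodes: ~87% dark

A couple of generations of that mismatch is all it takes to reach an "80% dark" figure; the exact numbers are illustrative, not a prediction.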

So "tweak-tweak" could soon become the major cycle, on more than one level.


The question is what "nothing changes" means, and whether that's a desktop context or not. If he means a chip with 10x as many transistors can only keep 2x active at the same power budget, that's not a big deal. Give me a physically smaller chip in my phone, and a bigger cooler on my desktop.


As far as I understand the issues with the current tech, it's not good enough that a chip only generates, say, 45 watts. You need to know that the dissipation is spread out enough in area+time to not overheat any individual components, which gets harder as the components get smaller.


"Tick-Tock-Tweak" is a great summary, so much easier than the "Process-Architecture-Optimization" label that Intel gave to explain their new plan.


Tick-Tock-Twerk


Tick. Tock. Tweet.


I like Tick-Tock-Punch personally, but it doesn't really matter. I wonder how long this next pattern will last? Probably not nearly as long as tick-tock.


translation: this transition has been inevitable since we discovered that immersion lithography is a bear, the deposition processes to support atomic layer deposition are a bitch, copper has issues being anorexic, our fabs are so sensitive that they are affected by nearby farms, and EUV litho is a pipe dream (in a vacuum).

Not to judge, they make things that are mere atoms in size...but the corollary to Moore's law should always have been EY's law: The cost of each major change in semiconductor production methods doubles


Can you elaborate on how nearby farms have affected microchip fabs? Something to do with airborne organics from their animals or chemicals?


When I worked in a fab, we saw process variation driven by weather patterns, so fabs are quite sensitive to their environments. (Barometric pressure can affect both diffusion and wets.) But I have a hard time seeing how a nearby farm would affect a fab. The air handling systems are pretty intense, so I imagine you'd see more filter changes if anything, and every change point has the potential to harm the process. But that's a pretty indirect mechanism. A sibling comment mentioned heavy machinery, which also seems unlikely. The lithography equipment had its own pedestals grounded in bedrock, separate from the building foundation, and each stepper/scanner has its own vibration damping. The rest of the equipment is less sensitive to vibration.


basically this, but see my other answer :)

Thunderstorms used to just shut down litho for hours/shifts


My guess would be more vibrations?

Heavy equipment causes vibrations in the ground, which most people probably can't notice, but when you're literally printing at the atomic scale, microscopic vibrations probably make a difference.


I work in a lab. We can't use our most accurate scales when there's construction going on with heavy machinery within a couple of blocks. This isn't cutting-edge stuff either, it's measuring to 0.001 gram, with a $10k scale and a 4" thick granite table on rubber pads (unlike other equipment that's on air floating supports with active counter balancing).


That's so interesting, thanks for sharing!


Wouldn't they make less difference than the vibrations caused by regular factories? Or should we be adding homeopathic fracking to the list of environmental concerns?


Sorry, was offline for a day. Farms generate more than food. They generate, among other things:

- Dust
- Chemical overspray
- Cow poop

There is no such thing as a perfect filter. What you learn over time (not me, the factory as a whole) is that wind direction matters, everything matters, and you start to pick apart causes and figure out why.


Or they figure there is no point in pouring exponentially more cash into R&D when there is no viable competition left.

It's hard to tell with Intel.


They squashed AMD long ago, I don't think it's this.


And fewer and fewer people buy new PCs with their products

Some phones have Intel processors but they're not popular


You forgot about the server market, which is Intel-dominated and growing well.


The way I have begun to term it to people not really familiar with technology is this:

Every time you buy a cellphone with an ARM chip, Amazon buys an Intel chip to process all the data, services, apps, and websites you access from your phone and all the data those services, apps, and websites collect from you.


I found this hard to believe but from a quick Google search it appears you're right - Intel desktop CPU sales have been on a downward trend since 2011.


That's partially related to Intel running into a wall with CPU speeds around that time, and partially to the mobile boom.

A 2011 i7-2600K desktop CPU is still competitive with current-generation i7 CPUs for general-purpose computing – most improvements were in specialized instruction sets like AVX or TSX (which was infamously broken in the first generation that shipped it). Lower-end i3 and i5 parts are even worse off, because they don't get most of the acceleration instruction sets in the first place.

For consumers, it makes absolutely no sense to upgrade their desktop PCs unless they break.


Well, don't forget the GPU. Intel Iris is much better than Intel HD.

The way I see it, this is our fault as software developers. A 2016 PC would be much faster than a 2011 PC if we as software developers made good use of SIMD and GPUs. But we don't.


I think it's more that developers are getting lazy. Why bother with SIMD and GPUs when you can write in a high-level language like JavaScript, design with HTML/CSS, deploy your app with an embedded libchromium, and have a faster time to market? SSDs and fast CPUs have made efficient software somewhat of a rarity these days.


It's not about being lazy; it's more about how to get cross-platform GUI support that looks good without having to use C++. The answer seems to be HTML these days, unfortunately.


And even Qt is shifting to using Chromium web views to supplant their native widgets.


On most workloads typical desktop users run (there are many exceptions, of course, but in terms of numbers of people, those are in the minority), the computational speed of the CPU is not a limiting factor any more. I/O, amount of RAM and probably memory bandwidth are far more important; on a typical mid-range desktop machine running Windows, Office and some line-of-business application, I/O completely dominates, at least from what I have observed working as a sysadmin / helpdesk monkey.


MS Office: yes. Content creation: no.

3D rendering, video editing, and music production all need as many cycles as you can afford, and then some.

VR and all those AI/ML technologies waiting around the corner are going to be even more greedy.


That is true.

At work, our CAD people use Autodesk Inventor heavily, and that thing will happily gobble up all the CPU cycles one can throw at it. (It is the one example I have first-hand experience with.)

What I meant was that for most users of desktop PCs in an office environment, a faster CPU is not going to make much of a difference in overall system performance. (I might be a little sore because at work, users will sometimes complain their computer is too slow and then demand a new one with an i7, and then I have to explain to them why that is not going to help, while a RAM upgrade and an SSD are going to make a big difference.)

But you are right, there are plenty of examples where there is no such thing as "fast enough". ;-)


I blame the latter on language expressiveness more than anything else. Here are two pieces of C++ code; one "clean", one fast, taken from [0]:

    void blur(const Image &in, Image &blurred) {
        Image tmp(in.width(), in.height());
        for (int y = 0; y < in.height(); y++){
            for (int x = 0; x < in.width(); x++){
                tmp(x, y) = (in(x-1, y) + in(x, y) + in(x+1, y))/3;
            }
        }
        for (int y = 0; y < in.height(); y++){
            for (int x = 0; x < in.width(); x++){
                blurred(x, y) = (tmp(x, y-1) + tmp(x, y) + tmp(x, y+1))/3;
            }
        }
    }
The optimised-for-speed version (order of magnitude difference):

    void fast_blur(const Image &in, Image &blurred) {
        // 21846 ~= 2^16 / 3, so _mm_mulhi_epi16(sum, one_third) approximates sum / 3
        __m128i one_third = _mm_set1_epi16(21846);
        #pragma omp parallel for
        for (int yTile = 0; yTile < in.height(); yTile += 32) {
            __m128i a, b, c, sum, avg;
            __m128i tmp[(256/8)*(32+2)];
            for (int xTile = 0; xTile < in.width(); xTile += 256) {
                // Horizontal pass over a 256x(32+2) tile, 8 pixels per SSE register
                __m128i *tmpPtr = tmp;
                for (int y = -1; y < 32+1; y++) {
                    const uint16_t *inPtr = &(in(xTile, yTile+y));
                    for (int x = 0; x < 256; x += 8) {
                        a = _mm_loadu_si128((__m128i*)(inPtr-1));
                        b = _mm_loadu_si128((__m128i*)(inPtr+1));
                        c = _mm_load_si128((__m128i*)(inPtr));
                        sum = _mm_add_epi16(_mm_add_epi16(a, b), c);
                        avg = _mm_mulhi_epi16(sum, one_third);
                        _mm_store_si128(tmpPtr++, avg);
                        inPtr += 8;
                    }
                }
                // Vertical pass over the tile's temporary buffer
                tmpPtr = tmp;
                for (int y = 0; y < 32; y++) {
                    __m128i *outPtr = (__m128i *)(&(blurred(xTile, yTile+y)));
                    for (int x = 0; x < 256; x += 8) {
                        a = _mm_load_si128(tmpPtr+(2*256)/8);
                        b = _mm_load_si128(tmpPtr+256/8);
                        c = _mm_load_si128(tmpPtr++);
                        sum = _mm_add_epi16(_mm_add_epi16(a, b), c);
                        avg = _mm_mulhi_epi16(sum, one_third);
                        _mm_store_si128(outPtr++, avg);
                    }
                }
            }
        }
    }
I don't know about you, but that looks like an error prone maintenance disaster waiting to happen.

And, just for comparison, Halide code that produces results as fast as the second code:

    Func halide_blur(Func in) {
        Func tmp, blurred;
        Var x, y, xi, yi;

        // The algorithm
        tmp(x, y) = (in(x-1, y) + in(x, y) + in(x+1, y))/3;
        blurred(x, y) = (tmp(x, y-1) + tmp(x, y) + tmp(x, y+1))/3;

        // The schedule
        blurred.tile(x, y, xi, yi, 256, 32).vectorize(xi, 8).parallel(y);
        tmp.chunk(x).vectorize(x, 8);

        return blurred;
    }

(this is kind of a weird coincidence; last time I replied to you I mentioned Halide[1] as well)

[0] http://people.csail.mit.edu/jrk/halide12/halide12.pdf

[1] http://halide-lang.org/


Definitely, we've failed in programming language design as well. The biggest problem is that we keep sticking with C++ :)


  (defun language-choice (developer) 
    (if (> (developer-hipness developer) (developer-experience developer)) (lang-du-jour)
      (if (developer-scared-of developer 'parenthesis)
        (c-family-language)
        (lisp-family-language))))


You work in Rust, right? How would you express the above in that language?


SIMD is a work in progress, but we have the foundations laid for a much more ergonomic approach: http://huonw.github.io/blog/2015/08/simd-in-rust/


Switch to DLang


When graphics are a bottleneck it's usually easier and cheaper to pop an entry-level graphics card than to throw out or replace the whole computer (unless it's a laptop). 3-4 years old low-end graphics cards still beat Iris Pro.


Yeah, even though Iris is "good enough for light gaming", integrated still really lags behind dedicated GPUs in how smooth even a desktop experience is.


I am actually part of that camp. I have an i5-3570K and recently did an entire system upgrade, including changing from a full tower to mini-ITX. The only real problem was that the selection of mini-ITX motherboards for my older socket type is pretty poor. But I'm now Oculus-ready with only a minor CPU overclock, despite my CPU being pretty old.


Yea I still keep waiting to upgrade my 2600k, but at this point I think all the other parts will physically break before I need to. It's like 10-20% slower than a chip 5 years newer, big woop.


Same here. It's hard to want to upgrade a 2500K that's stable at 4.6GHz on air with no overvolting. Got it right when it came out. OTOH, starting with a Radeon 6870 and later adding a second for CrossFire and Bitcoin mining (when that was profitable) still barely gives me the power to play the newest Rainbow Six on low everything @1920x1200 while keeping >30FPS minimum on Win10. I'm personally waiting for the new GPUs coming later this year before upgrading, then seeing whether Oculus or Vive has better game support at that time.


The 2600k cost about $350 new in 2011. For the same amount of money you could get a 5820K which is about 1.5x faster than the 2600k (or to put it in your terms, the 2600k is 52% slower). Still probably not worth it, but I looked up the numbers and might as well post them.


If the 5820k is 1.5x faster, doesn't that mean the 2600k is 33% slower, not 52%?


Oh, yes good catch. The 2600k is (12991 - 8520) / 12991 = 34% slower. That's what I get for not rereading before posting...
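
The general relationship between the two phrasings, for anyone else doing this arithmetic:

    if B is k times faster than A, then A is 1 - 1/k = (k - 1)/k slower
    k = 1.5          ->  33% slower
    k = 12991/8520   ->  ~34% slower

(Both sets of numbers are the ones already quoted above.)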


Overclock by 20% aaaand done.


If they drag their feet long enough eventually ARM will catch up, right?


Many of the other things Intel has done have seemed somewhat impossible too.


I would argue that it isn't just Intel but all of the industry. They do things that seem impossible on a regular basis.


Sensitive to nearby farms? That sounds interesting. Could you elaborate?


I've been really disappointed in Intel's desktop offerings. At the $300 high-end (quad-core i7) desktop price point, we have been stuck at 4 cores for years. I would really like an option for more, slower cores at that price point.

I guess I kind of get why. It might be a socket compatibility and cost issue, or the allocation of die space to a GPU, but it would be nice to see some movement. There's also probably zero demand outside of software developers, but I have to wonder if it is a kind of chicken-and-egg problem.

It's actually kind of funny: if I recall correctly, on desktops more of the die is GPU than CPU.


Hello Xeon D

8 cores / 16 threads, 2.1 GHz base clock, 2.8 GHz turbo, latest tick, latest tock, supports up to 128 GB of RAM (double the i7), mini-ITX boards at $800 including the processor. No GPU, though. Nice thing is, it does ECC RAM. Oh, and 10GbE. Awesome product. Not for gamers, but superb for devs IMO. Throw in a GTX 980 for display/compute, a 1TB SSD, plus said 128 GB of ECC DDR4 at $1200, and you can get top-of-the-range Mac Pro class power for a third of the price, i.e. around $3.5k before the monitor for a very serious little rig. And if you're creative, there are some stunning gamer cases for the mini-ITX form factor.


Where do I buy a Xeon D? The ARK lists them all as sold in "TRAY" form which in my experience means "not for you".


Check out Newegg. They come from Supermicro, Gigabyte, and I think Asus does a line too. They're pretty new, so I think only Supermicro is currently listed, but they come from at least 4 manufacturers now in mini-ITX form - so fitting hundreds of consumer-class (and often awesome - this is the sweet spot for gamers) small form factor cases. Google "Xeon D Mini ITX". Make sure you get the 8C/16T version, not the 4C/8T version. You want the 1540 or 1541 versions, not the 152x. Supermicro is proper server-class stuff - no messin' around - the others will be slightly cheaper but they're more sporadically available at the moment. Also check whether you need 10GbE - that costs about $100 more. Still, all of them will work perfectly as a desktop in a mini-ITX case, and that is a nice small form factor. We're not talking about a huge ugly hunk of box here.

This is going to be "build your own", btw, but anybody who's used a Phillips screwdriver can do this. Here in the UK I have the total mentioned above at just over 2200 GBP including PSU and a cute Corsair case (http://is.gd/0pfKhg), so we're talking 3300ish dollars, leaving 200 left for a pro mechanical keyboard and mouse. For the RAM, Crucial supplied me with 128 GB of 2400MHz DDR4 ECC DIMMs for 780 British pounds (1200 USD ish) (4 x 32 - there are only 4 slots, so don't do 8 x 16), which works with these boards. As I said, the total including RAM is about 2200 pounds, so 3300 dollars. Or if you don't need 128GB, start off with 64, say, but stick with the 32GB DIMMs so you can upgrade later.

http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&DE...


Thanks, this seems like a good upgrade route from my current Avoton-based FreeNAS (8 cores, fanless, 2.4ghz, 64GB of ECC DDR3, 16 PCI-E v2 lanes, VT-x) http://ark.intel.com/m/products/77987/Intel-Atom-Processor-C...

Asrock Rack made some nice boards with onboard IPMI, dual Intel gigabit NICs, and 8x SATA. Kind of pricey but still under $500 for the board back in 2013. Been a great board: http://www.asrockrack.com/general/productdetail.asp?Model=C2...

Here's their Xeon D stuff: http://www.asrockrack.com/general/products.asp#Server

Denverton is the successor to Avoton; 16 cores, 16 lanes of PCI-E v3, 10GbE, DDR4: http://www.fudzilla.com/news/processors/39156-intel-s-denver...

Haven't heard anything about it for months, though.


Ahh, so they're soldered on to the motherboard. I was just looking for that sort of form factor over the weekend (as a way of getting the low-power 6th-gen i7s... but didn't think to look for Xeon).

I will have to consider this. I run a Mini-ITX board in a Micro-ATX case anyway. Perfect form factor.


Oh that is pretty reasonable.


> but I have to wonder if it is a kind of chicken-and-egg problem.

I don't know. Low-end i3s have 4 virtual cores, i5s 4 physical cores – and most consumer software still makes little use of even two. I just downgraded a bunch of users from quadcore 55W i7-HQ to dualcore 15W i5-U CPUs (laptops) – they don't notice a performance difference, because Windows and Office and all the web app bullshit is still single threaded.


That seems like a workflow issue more than a threaded application issue. I can see that the majority of consumer and business workflows only involve running one or two applications at a time, but the extra cores and hardware threads can make a difference if you need to run multiple applications simultaneously.

Office applications may be single-threaded, but if you need to open Word, Excel, and Powerpoint simultaneously they can each use a separate core. Similarly, you may need to run a web application at the same time as one or more Office applications. This sort of multiprocess parallelism still benefits from multiple cores.

As for "Windows ... is still single threaded", I'm not sure what you are talking about. The concept of a thread is part of the OS implementation.


> I can see that the majority of consumer and business workflows only involve running one or two applications at a time, but the extra cores and hardware threads can make a difference if you need to run multiple applications simultaneously.

If you need to run multiple applications simultaneously and they do sufficient amount of computations to generate noticeable load.

> Office applications may be single-threaded, but if you need to open Word, Excel, and Powerpoint simultaneously they can each use a separate core.

And idle. Unless you're actively editing a document, the applications are doing exactly nothing with their separate cores. And so far, our attempts to create four-armed employees able to write two concepts at the same time have hit some unexpected difficulties, so generally only one document is doing any CPU intensive work (and promptly hits 100% load for the single core it can use).

Excel would be an exception, if we did any heavy processing in it – which we don't. We have real databases for that. (And Postgres does scale nicely on its Xeon servers.)

> As for "Windows ... is still single threaded", I'm not sure what you are talking about.

Windows Defender / Security Essentials is singlethreaded, throttling all I/O to 50 MB/s effectively on an i5-6200U (and inducing crippling latency). Windows Explorer is singlethreaded (or has very unfortunate locking), so a single stalling I/O request (spinning up disk/CD drive; SMB over WAN) hangs up the device's entire UI. Windows Update is singlethreaded, resulting in hour-long churn while computing applicable updates. And so on. While you get some noticeable improvements from using a dual core, and a few more from using a dual core with hyperthreading, a full quad-core (with or without HT) is just a waste of silicon for our Windows clients.


> As for "Windows ... is still single threaded", I'm not sure what you are talking about. The concept of a thread is part of the OS implementation.

On a typical desktop workload, at least the ones users at work put on their machines (i.e. running Windows, Office, a browser, maybe a PDF viewer and sometimes something like AutoCAD for P&ID) it is rare to fully utilize more than one or two CPU cores.

In fact, for performance on a typical office PC, the CPU hardly matters any more compared to I/O and (to some degree) RAM (and insufficient RAM again becomes I/O load when the swap-fest starts). When one of our users complains about performance, unless the machine still has a Core2, rather than replacing the CPU, we tend to upgrade RAM and/or replace the hard drive with an SSD.


I hear what you are saying. I have a very large number of cores in my workstation and it's disappointing/frustrating how many applications—most poignantly, Mozilla Thunderbird—are in practice single-threaded. Even using Firefox with Electrolysis and a bumped dom.ipc.processCount setting yields an experience that seems routinely stuck waiting for a single user interface thread to do something.

And multithreaded JavaScript for webapps seems like a pipedream. Virtually no one uses Web Workers because they were designed to avoid sharing data between threads, making them inordinately cumbersome to use in any routine/real-world cases.

There is a lot of room for improvement across many fronts. But I am hopeful it will occur sooner or later, because I suspect adding more cores will be easier than increasing the speed of each core for the foreseeable future.


dom.ipc.processCount does not do any load balancing. It just does one tab per process until the number of tabs is greater than dom.ipc.processCount; when that happens, every new tab ends up in the last process.

For right now, any setting other than one is not tested or supported. After Electrolysis ships, tuning the dom.ipc.processCount value based on the number of cores and such is planned.


Get a dual socket motherboard, then, and run two CPUs.

I remember back in the day when SMP was the mark of a Real Programmer with a Real Computer. You kids today and your multicore processors! Fie!!


>I would really like an option for more slower cores at that price point.

AMD tried exactly that with the Bulldozer architecture, which was a dismal failure. Desktop workloads are frequently bottlenecked on one or two cores.


Well, my old FX-4100 is still competitive.


Large core counts are the bread and butter of enterprise/cloud. I suspect Intel is binning those chips for their real customers (not desktop users). Probably has to do with wafer yields or something if I can hazard a guess.


By now they probably have the yields. I'd guess it's just market segmentation. Harder to convince your server customers to fork out $4k for a 16-core chip when you're selling 16-core chips into desktops for $300.


Actually, I suspect you're right. I was being a bit generous with my characterization. :-) Maintaining the status quo allows Intel to add significant margin to their enterprise solutions.


You can get a laptop with a Xeon these days, and dual socket workstations are plentiful. They're getting the 8+ core Xeons in many cases now: http://shop.lenovo.com/us/en/laptops/thinkpad/p-series/p70/

36-core workstation based on Xeons: http://www.mediaworkstations.net/i-x2.html

Makes your MBP look pretty cheesy when you could have something like that under your desk. Then again, for developing rails and angular apps, you don't actually need it, and the PCI-E SSD makes a bigger difference in practice (you can RAID 0 NVME SSDs in workstations too though).


I don't think so, because it's not the same layout. Servers don't have GPUs, and when they do have them for CUDA work, they're not the same GPUs. They also don't use the same sockets or the same number of memory channels.


I meant high core count CPUs (intel), not GPUs.


The desktop chips and server chips are not the same chips. You can't bin a different chip as something else. Binning is the practice of taking identical silicon and separating it based on grade and functionality (disabling cores, turning off functionality etc.).

Server CPUs don't even use the same sockets; they are fundamentally different.


I was talking about wafers, not dies, but it's all speculation regardless.


I recently built a new machine with the i7-5820K Haswell-E 6-Core CPU which sells for $350. It requires an LGA2011 socket motherboard which costs a little more than an LGA1151, but it can easily be overclocked to ~4 GHz and compares well against the i7-6700K Skylake 4-Core CPU in benchmarks.

https://www.cpubenchmark.net/common_cpus.html


It's interesting and sad that this is the way it is. I'd get way more benefit from another core than another GHz. I'm sure it's not just developers in this boat, but I too am pessimistic about the size of that market.

Would software developers change their approach to coding if 32-core chips started to show up in consumer machines? I'd like to think so, but then again, I figured everything would be multithreaded by now.


What apps honestly scale past 4 cores? Dev stuff does. Movie encoding stuff does. For surfing and facebooking and photoshopping??? Seems not that useful.

(Which is sad because I want 16 core CPUs, but may be a long time because that isn't so useful for normal people)


More cores = more silicon (expensive)

You can turn hyperthreading on if possible

There's probably a narrow set of workloads that would benefit from more slower cores than 4 faster ones


They are already shipping the silicon. There is as much real estate in the GPU as in the CPU. They could drop the GPU and have 8 cores in the same die area. There are plenty of other issues with that many cores, but die space isn't one of them.


make -j


Hopefully with an SSD but I'd guess it's still IO bound


Depends on the language I'm sure. Compiling C++ is definitely not I/O bound. I have a 14-core build system that achieves approx linear scaling speedup over my dual core laptop.


Not at all; I've seen parallel builds remain CPU-bound on 72-way and 144-way systems.

The biggest bottleneck tends to be linking steps and other serialized items. (And those are getting better too; gcc's current linker supports parallelization using -j, integrated with make.)
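
The usual Amdahl's-law sketch makes the point: if a fraction p of the build parallelizes and the rest (linking and other serialized steps) does not, the best-case speedup on N cores is

    speedup = 1 / ((1 - p) + p / N)

With, say, 5% serial work, 72 cores top out around 16x, which is why shrinking the serial link step matters far more than adding cores.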


People still haven't completely worked through the fact that Moore's Law has died. I think the anger phase was the late 2000's, when quantum computing, etc. was trotted out to denounce anyone noticing the slowdown.

This looks like bargaining to me.


Nah, you're not taking into account Moore's Meta Law:

"Whenever Moore's law is no longer applicable, its definition is changed to accommodate some new version of 'computers get faster'."

Moore's Law was dead a long long time ago.


Completely agree on both counts. There are seven stages of grief, so these things take time.


Classically there are 5 (denial, anger, bargaining, depression, acceptance)


That's the old version. Intel added two phases.


It seems to me that there is still some progress to be made in Moore's Law. Maybe not directly by shrinking the manufacturing process, but:

AlphaGo requires a huge amount of energy and CPU power to accomplish what the human brain does in 20W, using just a smallish portion of its capabilities. There must be plenty of undiscovered architectural improvements to computers that can still keep Moore's Law going for a couple of years.


Moore's law is about transistor density, and thus is oblivious to architecture.


And that's exactly what the parent commenter is saying. Moore's law actually wasn't about transistor density in the beginning; it was changed to this definition when they realized transistor density is the only thing increasing exponentially.


Well, Moore's law is stupid however it is defined, and I do also hope for big benefits from architecture (interface and implementation) work. So I guess I shouldn't be picking an argument :).

[As an aside, I had thought that Moore's law was originally transistor density, and the hype machine spun it into performance. But like I said, Moore's law is doomed regardless, so whatever.]


GPUs are keeping Moore alive


They should call this tick-tock-tack, since the last stage is when they tack and introduce new architecture. You heard it here first.


This turn of events was evident as of 2 months ago. I wrote: "Intel's Tick-Tock is no more. Say hello to Tick-Tock-Tock."

https://plus.google.com/+MarcBevand/posts/ZpuSkXqaBfK


So Intel's "Ticks" weren't quite keeping up with the "Tocks".


Would it be fair to say ...

... they're switching to Tic-Tac-Toe?

(Thank you, don't forget to tip your waitress) ;-)


This is actually a nice joke.



