I'm still waiting for this nasty hardware bug to be addressed: https://community.amd.com/message/2796982

Or actually, there are two bugs: random freezes, and segfaults under heavy multithreaded load.




It would be nice if someone actually managed to pin this down. It appears to primarily affect people that compile with haswell optimizations.

The bug was first reported in April, yet to date neither AMD nor the community has narrowed it down.

This is the best hint we have so far: http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/b48d...


It happens to me very consistently when I compile with -march=znver1.


This should get more recognition from the media. Especially as AMD are launching server and high-end CPUs with high core-counts (where I assume a large part of the market will be programmers), getting 120fps instead of 140fps in ${SOME_GAME} is irrelevant compared to unpredictable crashes during `make -j 16`.

Personally I would like to build a Ryzen (possibly ThreadRipper, depending on pricing) computer this year, but that is definitely on hold until this issue is fixed.


Are there any performance benchmarks associated with building software? Similar to your point, gaming benchmarks are totally meaningless to me, but I would love to learn about the difference in time it takes to build some large software (e.g. Chrome) on various cpus.


AnandTech runs a Chrome Compile benchmark in every review under the office benchmarks section.

http://www.anandtech.com/show/11550/the-intel-skylakex-revie...

http://www.anandtech.com/bench/CPU/1857


Chrome compilation should be heavily parallelized. (It is my core use case; I only care about compilation speed for big projects.)

The report seems to show Intel 4C/8T is doing better than AMD 8C/16T with much bigger L2/L3 cache config.

Is 7700K really that good? Can anyone from AMD explain this?

Intel (Kaby Lake) Core i7 7700K (91W, $339) 4C/8T, 4.2 GHz, 1MB L2, 8MB L3 17.81

AMD (Zen) Ryzen 7 1800X (95W, $499) 8C/16T, 3.6 GHz, 4MB L2, 16MB L3 16.32


Wow that was fast and exactly what I meant. Thanks again


Phoronix included Linux compilation benchmarks when it reviewed the Ryzen 1800x: http://www.phoronix.com/scan.php?page=article&item=ryzen-180...


More recognized, just like the Intel bugs on their BayTrail and apparently also the new Apollo Lake (same CPU family). https://bugzilla.kernel.org/show_bug.cgi?id=109051 - check how old it is...

I have a sad HomeServer (J1900 BayTrail).


Reminds me of when I first fiddled with Linux and, while checking out dmesg, noticed a line near the top about having detected a bug in the CPU and deploying a workaround.

I am not sure, but I think it may have been the F00F bug.


It would be super awesome if any of those bug reports included whether or not they are using ECC RAM. Occasional segfaults from compilers (which touch lots of RAM) could be explained by bit errors occurring in memory.


That thread is long, and there are a lot of users reporting the issue, but at least one of them claims that it still happens even with ECC memory.

The one I'm talking about is comment #338 in the AMD community thread: https://community.amd.com/message/2813391#thread-message-281...

That in turn links to this LKML message: https://lkml.org/lkml/2017/7/25/1295

which says: "this problem happens with ECC memory and memtest86 clean memory".


Awesome! Thanks for finding that. I gave up a couple of hundred comments in...


It seems unlikely that issues caused by bit errors in memory would go away when the uop cache is disabled, like so many people are reporting.



Has there been any confirmation that it's actually a hardware bug and not just a codegen or other issue in GCC?


If it were in GCC, it would be possible to get a setup that makes it consistently reproducible, not probabilistic as it is now.
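To make that distinction concrete, here is a minimal Python sketch of how one might check it: run the exact same command many times and tally the exit codes. The names `tally_exit_codes` and `classify` are made up for illustration and do not come from any of the linked reports, and the three labels are just my own shorthand. A pure software bug with identical inputs would be expected to land in "always fails" or "never fails"; the reports in this thread describe something closer to "intermittent".

```python
import subprocess
import sys
from collections import Counter


def tally_exit_codes(cmd, runs):
    """Run the same command `runs` times and count each exit code seen."""
    counts = Counter()
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        counts[result.returncode] += 1
    return counts


def classify(counts):
    """Label the failure pattern of identical repeated runs (hypothetical labels)."""
    total = sum(counts.values())
    failures = sum(n for code, n in counts.items() if code != 0)
    if failures == 0:
        return "never fails"
    if failures == total:
        return "always fails"
    return "intermittent"


if __name__ == "__main__":
    # Usage (assumed): python tally.py 20 make -j16
    runs = int(sys.argv[1])
    cmd = sys.argv[2:]
    counts = tally_exit_codes(cmd, runs)
    print(dict(counts), classify(counts))
```

On POSIX systems Python reports a child killed by a signal as a negative return code, so a process that died from SIGSEGV shows up as -11 in the tally. Note that a build tool like make reports its own exit status rather than the compiler's, so the build log is still needed to tell a segfault from an ordinary compile error.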


It doesn't happen just with gcc. But so far AMD hasn't commented on the issue, except to say that they are looking into it.


Looks like a good use-case for something like Docker.

Create a Docker image which reliably crashes all the time on a Ryzen system but not on others.


Why docker over anything else that uses the CPU?


My guess is state preservation, for the purpose of making the bug happen more reliably.


A static binary serves the same purpose. Docker is actually worse because you can't be sure if the bug is caused by your particular combination of Docker version, kernel version, networking setup and moon phase.



