CUDA 11.0 (nvidia.com)
182 points by ksec on July 8, 2020 | 65 comments



I noticed CUDA 11.0 was almost ready for release last week when I went to install CUDA and the default download page linked to the 11.0 Release Candidate. The 10.1 and 10.2 links were buried behind a link off to the side labeled "legacy". The thing is, no library you use is going to support the CUDA 11.0 RC, which is ridiculous. For example, PyTorch stable is on 10.2 and TensorFlow only goes up to 10.1.

This is generally indicative of how poorly organized the CUDA documentation and installation instructions are. The Conda dependency manager has made this a lot easier recently, especially by providing PyTorch binaries. But if you want to use packages like NVIDIA Apex for mixed-precision DL[0], you're in for a huge headache trying to compile torch from source while also managing your CUDA and nvcc versions, which sometimes must match and sometimes cannot![1] (A quick way to check both versions is shown after the footnotes.)

[0] Yes, I'm aware that Apex was very recently brought into torch but it seems that the performance issues haven't been ironed out yet.

[1] https://stackoverflow.com/questions/53422407/different-cuda-...
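
As a rough sanity check for the mismatch in [1] (assuming a standard Linux install with the toolkit on your PATH), compare what the driver reports against what nvcc reports:

  # CUDA version the installed driver supports (shown in the header of the output)
  nvidia-smi
  # CUDA version of the toolkit/nvcc actually on the PATH
  nvcc --version

The two can legitimately differ, since the driver only needs to be new enough for the toolkit you're compiling with, and that gap is exactly what trips people up.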


Yeah, and the CUDA 10.0 official Visual Studio demo project build was broken for... looks like a year, at least, because they didn't want to populate the toolkit path. NVidia, you're better than this.

https://forums.developer.nvidia.com/t/the-cuda-toolkit-v10-0...

> The Conda dependency manager has made this a lot easier

Yeah but conda is "Let's do dependency management with a SAT solver, it'll be great!" On a good day, it's just slow. On a bad day, the SAT solver spins for hours before failing to converge. On a really bad day, the SAT solver does something "clever."

I've had a couple of really bad days this year. I'm really starting to not like conda very much.


My biggest gripe with the conda dependency manager is that it doesn't keep track of which packages own which files, and if multiple packages own the same file, the last one to be installed will happily scribble over whatever was there before. With hilarious results, of course.

This means that keeping a conda installation up to date is often very tricky: when upgrading, you frequently have to uninstall and reinstall some packages.

It works better if you start from scratch with an environment.yml file.
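
For example, a minimal environment.yml along these lines (the pins are purely illustrative and assume the pytorch channel's cudatoolkit packaging):

  # environment.yml
  name: dl
  channels:
    - pytorch
    - defaults
  dependencies:
    - python=3.8
    - pytorch
    - cudatoolkit=10.2

Then "conda env create -f environment.yml" rebuilds the whole environment in one shot, which sidesteps the in-place-upgrade mess entirely.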


You might find https://github.com/TheSnakePit/mamba useful, especially if you are slowed down by package resolution.
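
For anyone curious, it installs into the base environment from conda-forge and is meant as a drop-in replacement for the conda CLI, roughly:

  conda install -n base -c conda-forge mamba
  # then use mamba where you'd have used conda, e.g.
  mamba install -c pytorch pytorch cudatoolkit=10.2

(The pytorch line is just an example; the point is the much faster solver.)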


That looks worth a look for sure!


Conda's SAT solver for dependency management is the bane of my existence. For pip-installable packages, I'll almost always turn to pip rather than conda even when in a conda environment.


> "Let's do dependency management with a SAT solver, it'll be great!"

Debian managed something like this over 20 years ago in dpkg. But somehow people must keep reinventing the wheel.


I thought the debian SAT solver was a maintainer tool rather than something that ran every time? In any case, conda's implementation is really quite awful by comparison and they would have been well served by copying something that works instead of building something that doesn't.


You can't use open source in an enterprise. All you can do is rewrite from scratch or hire external service. Everyone knows that.


I have to use containers with nvidia-docker because NVIDIA so consistently and relentlessly breaks things without so much as a glance at backward compatibility.


I moved our Deep Learning servers over to Docker images + JupyterHub DockerSpawners recently because maintaining all the various version dependencies between frameworks was an absolute PITA.

Images are publicly available here in case anyone else needs something similar: https://hub.docker.com/u/uodcvip


I'm never sure of the relationship between the driver, nvidia-docker, and the container with a specific CUDA version.

Last time I tried it, the CUDA inside the container thought it was using some old driver version while a much newer version was installed on the host. So I had to manually install the older version. Not sure where the issue was, but maybe it was because I was using the deprecated nvidia-docker version 2, which is still needed to pass GPU resources to containers run inside Kubernetes.


The annoying thing is that nvidia-docker is still not great. You still have to deal with the driver installed outside the container, and it makes a big difference.

Furthermore, it seems like even the CUDA runtime is typically not installed in the container, but rather injected by the nvidia-docker container runtime.

It is not fun to deal with.


You don't have to use nvidia-docker to use cuda with docker. I made my own cuda containers based on Debian and pass the devices to the docker run command. I mount the libcuda and libnvidia libraries as volumes. I think that's what you mean by injecting the runtime.

Here's an example Dockerfile: https://github.com/dmm/docker-debian-cuda/blob/master/Docker...

And here's an example docker run command:

  docker run -it --rm $(ls /dev/nvidia* | xargs -I{} echo '--device={}') $(ls /usr/lib/x86_64-linux-gnu/{libcuda,libnvidia}* | xargs -I{} echo '-v {}:{}:ro') dmattli/debian-cuda:10.0-buster-debug /bin/bash

Verbose but it works fine. You still have to have the nvidia driver installed on the host system.


We use Singularity as our container provider for the exact same reason. For now it has worked great and driver/CUDA updates haven't broken any containers yet over the past 2 years.

Something good about Singularity (which I bet you could also do with Docker) is that it automatically binds the right NVIDIA stuff into the container. It also works fine unprivileged :)
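
For anyone who hasn't used it, the NVIDIA binding is literally one flag (rough example; the image URI is just for illustration):

  singularity exec --nv docker://nvidia/cuda:10.2-base nvidia-smi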


Yeah. To make matters worse, they updated libnccl-dev in their apt repo to be CUDA 11 based a few weeks ago. That breaks my CUDA app (it is still on 10.2 and not compatible) in interesting ways. I've put libnccl-dev on apt hold for a while and am waiting for this release.
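
In case it helps anyone else doing the same, the hold is a one-liner (package names as they appear in NVIDIA's repo on my machine; adjust to whatever you actually have installed):

  sudo apt-mark hold libnccl2 libnccl-dev
  # and once everything is on CUDA 11:
  sudo apt-mark unhold libnccl2 libnccl-dev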


speaking of poorly organized: did they fix their embedded dependencies yet for glibc 2.30 in actual tagged releases of tensorflow?


Every time I have to deal with multiple versions of CUDA on Linux I feel like poking my eyes out. I get that supporting developer libraries that have to interact with hardware is hard, but come on...


For something this popular it shouldn't be so hard. I don't think being related to hardware is an excuse. CUDA is not a driver and exists entirely in userspace.

This is the kind of thing that happens when you're dealing with a monopoly.


The economic incentive is simple: open-sourcing the driver will allow an open-source API to interact with the hardware, allowing AMD/other competitors to support the same API. So instead of competing at the silicon level, Nvidia chooses to set up unnecessary barriers to entry at massive cost to developers/users.

Like Torvalds says [1]: Fuck You, Nvidia.

[1]: https://www.youtube.com/watch?v=iYWzMvlj2RQ


Unnecessary? Not in the eyes of the shareholders. Just compare NVIDIA's stock surge with the performance of AMD. They protect their market and they do it pretty well.

the Linus video is awesome though :-) And I totally understand his sentiment


Yes - I would think drivers are even harder from an engineering point of view but as far as I know they have fairly good backwards compatibility for games. I think this is likely because people would be much more reluctant to buy new graphics cards if they broke their older games.


I run multiple versions all the time. They install in completely independent locations. What's the issue you're having?
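
For reference, the toolkits live under versioned prefixes and you just point each shell at the one you want, something like:

  ls -d /usr/local/cuda-*   # e.g. /usr/local/cuda-10.2 and /usr/local/cuda-11.0 side by side
  export PATH=/usr/local/cuda-10.2/bin:$PATH
  export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64:$LD_LIBRARY_PATH
  # /usr/local/cuda itself is just a symlink to whichever version the installer set as the default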


Once I discovered that buster-backports non-free has up-to-date CUDA, it's been smooth sailing.


The most frustrating part is the gcc version dependency.
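
If the problem is that your system gcc is too new, one partial workaround is pointing nvcc at an older host compiler explicitly (assuming you have one installed alongside, e.g. gcc-8; the .cu file is just a placeholder):

  nvcc -ccbin /usr/bin/gcc-8 -o my_kernel my_kernel.cu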


I don't even install them, just use containers.

docker and nvidia-docker work fine for me
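
For reference, with Docker 19.03+ and the NVIDIA container toolkit installed, the basic smoke test is roughly (the image tag is just an example):

  docker run --rm --gpus all nvidia/cuda:10.2-base nvidia-smi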


>cuFFT now accepts __nv_bfloat16 input and output data type for power-of-two sizes with single precision computations within the kernels.

This exact sentence is listed both under "New Feature" and "Known Issues". I'm not super familiar with CUDA stuff, but it can't be both, right?


Thanks, I have reported it internally and it is now fixed.


Looks like a mistake, should only be in New Features.


So a known issue in the known issues?


A known known


Does anyone understand why such minor upgrades resulted in a major version bump? Is this some sort of stability check point? Or some other versioning convention?


It supports a whole new architecture (Ampere) and all the good stuff that comes with it: Multi-Instance GPU partitioning, new number formats (Tfloat32, sparse INT8), 3rd gen of Tensor Cores, and asynchronous copy/asynchronous barriers. These are huge features.


Well, I think a new microarchitecture means a major bump. So between that and version bumps due to actual major software features, you get to 11 within 13 years or so.

Also, GCC 9.x compatibility may seem minor to some, but is significant for others. I also think there's some C++17 support in kernels - that's something too.
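
If I'm reading the release notes right, the C++17 support is just the usual standard flag on nvcc, e.g. (the file name is a placeholder):

  nvcc -std=c++17 -o app kernel.cu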


Ooh, I missed those. Support for C++17 is pretty major. Thanks. Perhaps my memory is fuzzy, I just remember the CUDA 9->10 switch having some significant (but not major) performance and feature changes.


Usually for an API it indicates a breaking change. In this case, the removal of some functions, which might require refactoring on the consumer's end.


And to keep users on a hardware upgrade treadmill.


Do you mean removing Pascal support and deprecating Maxwell? How long would you need it to be supported to be satisfied?

Even with OSS projects discussions about ending support are not easy.


I got confused for a sec by "removing Pascal support", as some of the 10XX GPUs are only 2 years old. Looks like Pascal stays, and Maxwell is indeed removed (it was deprecated in 10.2).


s/Pascal/Kepler/ sorry.


I think the page lists the changes since 11.0 RC, not the previous major version.


Interesting that Fedora support seems to have been dropped. Anyone know why that might be?

Edit: oh wait I think I see. Latest supported gcc for CUDA 11 is gcc 9.x, but I think latest Fedora is on gcc 10.


I'm guessing it was an oversight in the table given they still have all the fedora installation instructions on the install page:

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/i...


You just have to compile that specific supported version of gcc. On an earlier CUDA, I had to compile gcc-4.9. Unlike Debian, Fedora just seems to remove all traces of old packages.


They've come to the conclusion that you should also come to, that Fedora is basically a waste of time to support because whatever you've gotten working will be terribly broken in the next release for no good reason anyone can point to?


> Added support for Ubuntu 20.04 LTS on x86_64 platforms.

Huh? I've been using CUDA for a while now on my Ubuntu 20.04 machine


That probably just means that they are testing that platform in CI now and are officially supporting it (taking bug reports into account, etc.).


This has been problematic for me after upgrading from 18.04 to 20.04. Every time I apt update cuda or some nvidia package, my X server fails to start for several different reasons.

Just today I've already spent 30 minutes trying to start X with this latest CUDA update. Too bad I can't switch back to the open source nouveau driver.


Coincidentally I was just looking at this after upgrading from 19.10 to 20.04 today -- basically, the version installed from Nvidia's old 18.04 repos works fine on 20.04.


I'd like to play with CUDA, but I just got a new laptop without an Nvidia GPU, coming from one that had a built-in Nvidia GPU. It's got a Thunderbolt port, but unfortunately most of the GPUs are quite expensive, at around $400. Does anyone know any cheaper options?


If you just want to try for a few hours, you can add GPU(s) onto a GCP CE instance. Along with the trial credits it should get you a few hours poking around with cuda.

Otherwise, get a pre-owned GTX 950 (one that doesn't require an external power supply) and a TB3 to PCI-E x16 adapter. Not an enclosure, an adapter. Should cost you around $200 all in, IIRC. And it allows you to upgrade the card further down the line, since most of the cost is the adapter.


Do you know what exactly I should search for when looking for a "TB3 to PCI-E x16 adapter"? Will this utilize all thunderbolt lanes available? I've got a newer laptop that, I believe, has all lanes available.


You could get an Nvidia Jetson Nano for $100 USD

https://developer.nvidia.com/embedded/jetson-nano-developer-...


For tinkering around, just use Google Colab[1]. They offer free hosted Jupyter notebooks and have both Nvidia GPU[2] and Google TPU[3] runtime options available.

Here[4] is a notebook that shows how to install CUDA into an environment using the GPU accelerated runtime.

Only major downside is that resources aren't guaranteed (see first section under "Resource Limits" here[5]), so you sporadically may not be able to start a GPU-accelerated runtime session. But that shouldn't be much of a blocker for tinkering purposes.

[1] https://colab.research.google.com/notebooks/intro.ipynb

[2] https://colab.research.google.com/notebooks/gpu.ipynb

[3] https://colab.research.google.com/notebooks/tpu.ipynb

[4] https://colab.research.google.com/github/ShimaaElabd/CUDA-GP...

[5] https://research.google.com/colaboratory/faq.html
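
Once the runtime is switched to GPU, a quick sanity check from any notebook cell is just (the leading "!" runs it as a shell command):

  !nvidia-smi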


Google's Compute Cloud. You can play with a powerful GPU for $0.20/hour, and you get $300/year free, I think.

It's quite easy to set up as well: basically a workstation that you can just connect to with remote desktop, but you can migrate the hardware it runs on.


Just ssh to older laptop?


oh great. More chasing to do. Anyone interested in working on the CUDA integration for Go (https://gorgonia.org/cu)? PRs welcome, as I am quite short on time.


Is there Mac OS support?


Nope. That's mostly on Apple though, as they discourage all APIs that aren't Metal.


Apple needs to build a competitive standard to CUDA.


Apple doesn't directly compete with CUDA, they just want total control of their platforms. Metal does have great performance and tooling. In practice nobody does HPC on macs so there's no demand for linear algebra or graph libraries, which are a big selling point for Nvidia over AMD.

The fact that Apple is trying to kill OpenGL and OpenCL and block Vulkan definitely sucks though for anyone trying to do indie games, or open source ML/HPC.


Have you looked at the OpenGL APIs? They are a strange legacy beast. I think Apple does what's best for its users, even if it has to create a new standard. How about other companies adapting to the modern Metal APIs instead?


No, you need to use Metal, or Vulkan + MoltenVK, both of which suck for compute, but you don't have that much compute available on Apple hardware anyway, so it shouldn't matter that much.


I don't think there is a macOS machine which runs Nvidia hardware anyway, is there?


This release dropped support for macOS due to Apple being unpleasant with Nvidia.



