Jonas wrote a guide to taming your dual-GPU setup in Ubuntu and we wanted to share it. Enjoy! 

I struggled with this installation for some time. This is a summary of the things to keep in mind during the installation and, more generally, a guide that will hopefully let you redo what I did in much less time!

In general, Nvidia has many good guides that show how to prepare GPU support for TensorFlow and how to install TensorFlow itself. I very much recommend their forum and guides (I found them even better than the guides on the TensorFlow homepage).

At the time of writing I installed CUDA 7.5 and cuDNN 5.1. The installation was done on a laptop with a GeForce GTX 950M graphics card. The laptop also has an integrated GPU. Having multiple GPUs and using a laptop with Optimus probably adds a bit to the hassle.

I will list the guides that I followed, but I will also note where I found it important to do something DIFFERENTLY from the guide. So look closely for those notes.

OK, here we go:

First of all, I hope that you have all of your documents in a cloud service of some sort and that all of your current work is pushed to a repo. When things break during this process, the easiest solution is sometimes to just reinstall Ubuntu.

Make a bootable USB stick with your preferred Linux distro so you can easily reinstall it. There is a nice tool in Ubuntu called “Startup Disk Creator” that handles this really nicely.

It is really nice to have another computer beside the one that you are doing the installation on, because you will probably need to download, google or check something. This can be hard if you are working in the console.

Make sure that your computer doesn’t have secure boot on. This can be found in BIOS -> Security -> Secure boot. Set secure boot to disabled. If you already have Linux installed, you may no longer be able to log in after changing the secure boot option. In that case, just reinstall Linux. If possible, don’t have a dual boot system. It sometimes makes things a bit more unstable, and one tends to make the Linux partition too small (I did). Go all in Linux! :)
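If Linux is already running, the current secure boot state can be checked before rebooting into the BIOS. This is only a minimal sketch: it assumes the `mokutil` tool is available, which this guide does not otherwise use and which may need to be installed first. The function name `secure_boot_state` is made up for this example.

```shell
# Report the secure boot state, or say why it cannot be determined.
# Assumption: mokutil is installed; it is not part of the steps in this guide.
secure_boot_state() {
    if command -v mokutil >/dev/null 2>&1; then
        # On UEFI systems this prints "SecureBoot enabled" or "SecureBoot disabled".
        # On other systems mokutil prints an error message, which is shown as-is.
        mokutil --sb-state 2>&1 || true
    else
        echo "mokutil not found; check the setting in the BIOS instead"
    fi
}

secure_boot_state
```

If the output says that secure boot is enabled, change the BIOS setting as described above before continuing.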

We are going to do the different installations in the following order:

  1. Install GPU drivers with NO opengl support.

  2. Install cuda + cudnn with NO drivers and NO opengl support.

  3. Build and install Tensorflow.  

  4. Add a couple of paths.

  5. Test it!

1. Driver installation

  1. Go to Nvidia’s driver homepage and download the appropriate package for your GPU. http://www.nvidia.com/Download/index.aspx?lang=en-us

  2. Open a terminal: ctrl + alt + t

  3. Run: $sudo apt-get install build-essential

  4. Create the /etc/modprobe.d/blacklist-nouveau.conf file with the following content:
    blacklist nouveau
    options nouveau modeset=0

Save and close the file.

Then run: $sudo update-initramfs -u
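The blacklist file can also be written from the command line instead of an editor. The sketch below is one possible way to do it, not the method from the forum thread; the function name `write_nouveau_blacklist` is hypothetical. Note that the modprobe keyword is `options` (plural). The target path is a parameter so the function can first be tried on a scratch file.

```shell
# Write the two nouveau blacklist lines to the given file.
# The default target is the real config path, which needs root permissions.
write_nouveau_blacklist() {
    target="${1:-/etc/modprobe.d/blacklist-nouveau.conf}"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$target"
}
```

For example, run `write_nouveau_blacklist /tmp/blacklist-nouveau.conf`, inspect the result, copy it into place with `sudo cp /tmp/blacklist-nouveau.conf /etc/modprobe.d/`, and then run `sudo update-initramfs -u` as described above.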

  5. Reboot the computer.

  6. Press ctrl + alt + F1 to enter the console.

  7. Go to the directory where your driver is located and run: $ chmod a+x NVIDIA-Linux-x86_64-367.44.run

  8. Now, run: $ sudo service lightdm stop

  9. Install the drivers by: $ sudo bash NVIDIA-Linux-x86_64-367.44.run --no-opengl-files

  10. You might get an error during the installation, but continue if possible.

  11. The installation should now be complete. Check if the /dev/nvidia* device node files exist. If they don't, run:
    $ sudo modprobe nvidia

  12. Reboot the computer.
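After the final reboot, the checks from the steps above can be collected into one small report. This is a sketch under a few assumptions: the function names are invented for this example, and `nvidia-smi` is assumed to have been installed together with the driver (it is skipped if it is not found).

```shell
# Succeed if at least one nvidia* device node exists in the given directory.
# The directory is a parameter only so the check can be tried on a scratch dir.
nvidia_nodes_present() {
    dev_dir="${1:-/dev}"
    # ls exits non-zero when the glob matches nothing.
    ls "$dev_dir"/nvidia* >/dev/null 2>&1
}

# Print a short summary of the driver state.
driver_report() {
    if lsmod 2>/dev/null | grep -q '^nouveau '; then
        echo "nouveau: loaded (the blacklist did not take effect)"
    else
        echo "nouveau: not loaded"
    fi
    if nvidia_nodes_present; then
        echo "device nodes: present"
    else
        echo "device nodes: missing (try: sudo modprobe nvidia)"
    fi
    # Assumption: nvidia-smi came with the driver package.
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi || true
    fi
}
```

Call `driver_report` in a terminal once you are logged in again. Ideally it reports that nouveau is not loaded and that the device nodes are present.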

2. Cuda installation

Follow this forum thread to do the installation of cuda:

https://devtalk.nvidia.com/default/topic/878117/cuda-setup-and-installation/-solved-titan-x-for-cuda-7-5-login-loop-error-ubuntu-14-04-/

Check out the reply at the end of the thread. The beginning of that reply is what has been copied and written in “1. Driver installation”. So make sure to answer NO when the installer asks whether you want to install the driver as part of the cuda installation.

Cuda can be downloaded here:
https://developer.nvidia.com/cuda-toolkit

Make sure to use the appropriate version of cuda. Check out the TensorFlow homepage to see what is supported:
https://www.tensorflow.org/versions/r0.10/get_started/os_setup.html

Cudnn can be downloaded here:
https://developer.nvidia.com/cudnn

Unfortunately you need a login to download cudnn. Also check the TensorFlow homepage for the version to use.
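To compare what is installed with what TensorFlow supports, the cudnn version can be read from its header file. The sketch below assumes that the cudnn files were copied into /usr/local/cuda and that cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as the 5.x headers do. The function name `cudnn_version` is made up for this example.

```shell
# Print the cudnn version found in a cudnn.h header as MAJOR.MINOR.PATCHLEVEL.
# Assumption: the header lives in /usr/local/cuda/include unless a path is given.
cudnn_version() {
    header="${1:-/usr/local/cuda/include/cudnn.h}"
    if [ ! -r "$header" ]; then
        echo "cudnn.h not found at $header" >&2
        return 1
    fi
    # Pick the value from each "#define NAME VALUE" line of interest.
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$header"
}
```

Run `cudnn_version` for the default location or pass another header path as the first argument. For cuda itself, `nvcc --version` prints the toolkit release once /usr/local/cuda/bin is on your PATH.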

3. Tensorflow installation

Make sure that you have Python up and running the way you want. Consider using Anaconda or a virtual environment. Otherwise, make sure to install at least numpy and scipy. Some other packages might also be needed; anything missing in Python will show up as an error when building the TensorFlow package.

I used Python 2.7.

Follow this guide:
http://www.nvidia.com/object/gpu-accelerated-applications-tensorflow-installation.html

4. Add paths

Add this to .bash_profile or .bashrc, depending on which one you have and use:
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

Simply run $ nano ~/.bashrc
OR $ nano ~/.bash_profile

Add the content, save and quit.

Restart the terminal.
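If you would rather not edit the file by hand, the two lines can be appended from the command line. This is a minimal sketch with a made-up function name, `add_cuda_paths`; it only appends a line when that exact line is not in the file yet, so running it twice does no harm.

```shell
# Append the cuda path lines to a shell startup file unless already present.
add_cuda_paths() {
    rc_file="${1:-$HOME/.bashrc}"
    touch "$rc_file"
    for line in \
        'export PATH=/usr/local/cuda/bin:$PATH' \
        'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH'
    do
        # -x matches whole lines, -F treats the text as a fixed string,
        # so "$PATH" is compared literally and not as a pattern.
        grep -qxF "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
    done
}
```

Call `add_cuda_paths` for ~/.bashrc, or `add_cuda_paths ~/.bash_profile` if that is the file you use, and then restart the terminal.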

5. Test

Do the test according to the guide in section 3. You might already have tried this and gotten an error saying that it couldn’t find some cuda or cudnn file (due to missing paths). Try again!
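In addition to the test in the Nvidia guide, a quick smoke test can show which device TensorFlow places an operation on. This is a sketch that assumes a TensorFlow release from the same era as this guide (r0.10), where `tf.Session` and `tf.ConfigProto(log_device_placement=True)` are available; the function name `tf_gpu_smoke_test` is made up for this example.

```shell
# Run a tiny TensorFlow computation with device placement logging turned on.
# The Python interpreter is a parameter in case you use a virtual environment.
tf_gpu_smoke_test() {
    py="${1:-python}"
    if ! "$py" -c 'import tensorflow' >/dev/null 2>&1; then
        echo "tensorflow is not importable with $py"
        return 1
    fi
    # Assumption: the TensorFlow 0.x session API.
    "$py" -c '
import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(tf.constant(2.0) * tf.constant(3.0)))
'
}
```

Run `tf_gpu_smoke_test` in a fresh terminal. If the placement log mentions a gpu device, the GPU is being used; if it only mentions cpu, check the paths from section 4 again.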

Happy deep learning! Send me an e-mail if you want help!

/Jonas Karlsson, jonas.karlsson@berge.io