Ubuntu 18.04 TensorFlow/Keras installation problem: can anyone help?

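I am very interested in neural networks and am currently trying to install and run TensorFlow and Keras. I want to run all training on my GPU, but something is off with my installation. I can get everything to run, except that the libcublas.so.10 library cannot be found. I completely uninstalled everything once or twice and then reinstalled following the TensorFlow installation guide. When that did not work, I tried this guide, https://towardsdatascience.com/installing-nvidia-drivers-cuda-10-cudnn-for-tensorflow-2-1-on-ubuntu-18-04-lts-f1db8bff9ea, with only moderate success. As far as I know, I have only installed CUDA 10.1. I checked the nvidia-smi output, and it tells me CUDA 11 is installed. If I look at the /usr/local folder, I get the following output: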

/usr/local$ ls -la
total 48
drwxr-xr-x 12 root root 4096 Aug 29 20:44 .
drwxr-xr-x 13 root root 4096 Mai 19  2019 ..
drwxr-xr-x  2 root root 4096 Aug 29 20:44 bin
lrwxrwxrwx  1 root root    9 Aug 29 20:44 cuda -> cuda-10.1
drwxr-xr-x 15 root root 4096 Aug 29 20:44 cuda-10.1
drwxr-xr-x  3 root root 4096 Aug 29 20:42 cuda-10.2
drwxr-xr-x  2 root root 4096 Apr 17  2014 etc
drwxr-xr-x  2 root root 4096 Apr 17  2014 games
drwxr-xr-x  3 root root 4096 Mär 17 09:37 include
drwxr-xr-x  6 root root 4096 Mär 26 20:56 lib
lrwxrwxrwx  1 root root    9 Dez 24  2014 man -> share/man
drwxr-xr-x  2 root root 4096 Apr 17  2014 sbin
drwxr-xr-x 11 root root 4096 Mär 26 20:33 share
drwxr-xr-x  2 root root 4096 Apr 17  2014 src

This makes no sense to me at all. Can anyone help me solve this?
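
Since the listing above shows both a cuda-10.1 and a cuda-10.2 directory, one thing worth checking is which of them actually contains libcublas. The following is only a minimal sketch for that check: the helper name `find_cuda_libs` and its parameters are hypothetical (made up for illustration, not part of CUDA or TensorFlow), and it assumes the toolkits live in `cuda*` directories under /usr/local as in the listing.

```python
import fnmatch
import os


def find_cuda_libs(root="/usr/local", pattern="libcublas.so*"):
    """Map each real cuda* directory under root to the matching files inside it.

    Hypothetical helper for illustration only.
    """
    found = {}
    if not os.path.isdir(root):
        return found
    for name in sorted(os.listdir(root)):
        cuda_dir = os.path.join(root, name)
        # Skip the "cuda" symlink itself so each file is reported only once.
        if not name.startswith("cuda") or os.path.islink(cuda_dir):
            continue
        if not os.path.isdir(cuda_dir):
            continue
        hits = []
        # os.walk does not follow directory symlinks by default.
        for dirpath, _dirnames, filenames in os.walk(cuda_dir):
            for filename in fnmatch.filter(filenames, pattern):
                hits.append(os.path.join(dirpath, filename))
        found[cuda_dir] = sorted(hits)
    return found


if __name__ == "__main__":
    for directory, libs in find_cuda_libs().items():
        print(directory)
        for lib in libs or ["(no match)"]:
            print("   ", lib)
```

If the library turns up in a directory that is not on LD_LIBRARY_PATH, that would at least explain why the loader cannot find it.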

Here are the nvidia-smi output and the pip output:

Mon Aug 31 13:26:32 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.66       Driver Version: 450.66       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 980     Off  | 00000000:01:00.0  On |                  N/A |
| 26%   32C    P0    51W / 195W |    353MiB /  4042MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1414      G   /usr/lib/xorg/Xorg                158MiB |
|    0   N/A  N/A      2234      G   /usr/bin/gnome-shell              190MiB |
+-----------------------------------------------------------------------------+
2020-08-31 11:28:29.764303: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-08-31 11:28:30.627193: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-08-31 11:28:30.658355: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-08-31 11:28:30.658633: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 980 computeCapability: 5.2
coreClock: 1.2785GHz coreCount: 16 deviceMemorySize: 3.95GiB deviceMemoryBandwidth: 208.91GiB/s
2020-08-31 11:28:30.658659: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-08-31 11:28:30.658786: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64
2020-08-31 11:28:30.659849: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-08-31 11:28:30.660053: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-08-31 11:28:30.661125: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-08-31 11:28:30.661716: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-08-31 11:28:30.663991: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-08-31 11:28:30.664007: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
2020-08-31 11:28:30.895203: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-08-31 11:28:30.899538: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 4000070000 Hz
2020-08-31 11:28:30.899808: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5b63340 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-31 11:28:30.899817: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-08-31 11:28:30.900635: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-31 11:28:30.900643: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 
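
The "Could not load dynamic library" warning in the log above comes from a failed attempt to open libcublas.so.10 by name, so the same lookup can be tried outside TensorFlow. This is a minimal sketch using the standard-library `ctypes` module; the helper name `can_dlopen` is hypothetical, and the library names in the loop are copied from the log. It assumes LD_LIBRARY_PATH is exported before Python starts, since the dynamic loader reads it at process startup.

```python
import ctypes


def can_dlopen(library_name):
    """Return (True, None) if the dynamic loader can open library_name,
    otherwise (False, error message). Hypothetical helper for illustration."""
    try:
        ctypes.CDLL(library_name)
    except OSError as exc:
        return False, str(exc)
    return True, None


if __name__ == "__main__":
    # Library names taken from the TensorFlow log above.
    for name in ["libcudart.so.10.1", "libcublas.so.10", "libcufft.so.10", "libcudnn.so.7"]:
        ok, error = can_dlopen(name)
        print(f"{name}: {'OK' if ok else 'FAILED: ' + error}")
```

Running it in the same shell that starts the training script should show whether the failure is specific to TensorFlow or already happens at the loader level.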

Edit: At the moment I have TensorFlow installed according to the official installation guide.
