Tensorflow-gpu:CUDA_ERROR_OUT_OF_MEMORY

I'm fairly new to TensorFlow and am trying to install tensorflow-gpu on a ThinkPad P1 (Nvidia Quadro P2000) running Pop!_OS 18.10. I installed tensorflow-gpu into a fresh conda environment using the conda install command.

Now, after running the simple Python script shown below two or three times, I keep hitting a CUDA_ERROR_OUT_OF_MEMORY error. Does anyone know what the problem might be? I should mention that I already had a lot of trouble getting tensorflow-gpu installed on this laptop; I first tried it on Ubuntu, without success.

The code and the warnings are attached below.

Code:

import tensorflow as tf

a = tf.constant(1)
b = tf.constant(2)
sum_tensor = a + b

with tf.Session() as session:
    answer = session.run(sum_tensor)
    print('a + b = %d' % answer)

Error:

runfile('/home/andrschl/PycharmProjects/Test/Test.py', wdir='/home/andrschl/PycharmProjects/Test')
2019-05-03 11:30:59.587115: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-05-03 11:30:59.610905: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2712000000 Hz
2019-05-03 11:30:59.611871: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x5625de849590 executing computations on platform Host. Devices:
2019-05-03 11:30:59.611905: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-05-03 11:30:59.633882: W tensorflow/compiler/xla/service/platform_util.cc:240] unable to create StreamExecutor for CUDA:0: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OUT_OF_MEMORY: out of memory; total memory reported: 4236312576
2019-05-03 11:30:59.633965: F tensorflow/stream_executor/lib/statusor.cc:34] Attempting to fetch value instead of handling error Internal: no supported devices found for platform CUDA
Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

Thanks in advance for your answers!

Answer 1

Have you tried configuring GPU memory growth?

import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
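A related option (not in the original answer, but part of the same TF 1.x `GPUOptions` config) is to cap TensorFlow at a fixed fraction of GPU memory instead of letting it pre-allocate nearly all of it; the 0.5 below is an arbitrary example value:

```python
import tensorflow as tf

# Limit this process to a fixed share of the GPU's memory.
# 0.5 is an arbitrary example fraction; tune it for your workload.
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.5
sess = tf.Session(config=config)
```

This can help when another process (or a stale Python session) is still holding part of the card's memory.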

Or:

# Creates a graph.
with tf.device('/device:GPU:2'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
# Creates a session with allow_soft_placement and log_device_placement set
# to True.
sess = tf.Session(config=tf.ConfigProto(
      allow_soft_placement=True, log_device_placement=True))
# Runs the op.
print(sess.run(c))

TensorFlow provides more options and details here.

I have also seen some cruder approaches, such as:

import gc

del model      # drop the Python reference to the model
gc.collect()   # force garbage collection so its memory can be reclaimed

Or, from within Keras:

def clear_cuda_memory():
    from keras import backend as K

    for i in range(5):
        K.clear_session()
    return True

cuda = clear_cuda_memory()

The clearing above is run several times to cope with processes that are slow to release memory.

Another completely brute-force option is to kill the Python process and/or the IPython kernel.
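If you go that route, `nvidia-smi` can show which processes are still holding GPU memory (a sketch; the PID shown is a placeholder you must replace with one reported on your machine):

```shell
# List the processes currently holding GPU memory
nvidia-smi --query-compute-apps=pid,used_memory --format=csv

# Kill a stale process by PID (replace 12345 with a PID from the list above)
kill -9 12345
```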

I haven't found a perfect solution yet. TensorFlow 2.0 may resolve some of these issues; my current version is 1.14.0.
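For reference, in TensorFlow 2.x the equivalent of `allow_growth` is set per physical device through `tf.config` (a sketch based on the 2.x API, not tested on the asker's setup):

```python
import tensorflow as tf

# Enable on-demand memory growth for every visible GPU.
# This must run before any GPU has been initialized by the program.
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
```

On a machine with no visible GPU the loop simply does nothing, so the snippet is safe to leave in CPU-only runs.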
