Tensorflow-gpu: CUDA_ERROR_OUT_OF_MEMORY

Answer 1

Have you tried enabling GPU memory growth?

import tensorflow as tf

config = tf.ConfigProto()
# Allocate GPU memory on demand instead of reserving nearly all of it up front.
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
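
The same behaviour can also be requested from outside the session config. This is a minimal sketch, assuming a TensorFlow build that honours the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable (check the GPU guide for your version). The variable has to be in place before TensorFlow initializes the GPU, so it is set before the import here:

```python
import os

# Ask TensorFlow to allocate GPU memory on demand instead of reserving
# nearly all of it up front. Set this before TensorFlow is imported.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

# import tensorflow as tf  # import TensorFlow only after the variable is set
print(os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])
```

Exporting the variable in the shell before launching Python has the same effect and avoids touching the code.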

Or place the ops on a specific device and enable soft placement:

# Creates a graph.
with tf.device('/device:GPU:2'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
# Creates a session with allow_soft_placement and log_device_placement set
# to True.
sess = tf.Session(config=tf.ConfigProto(
      allow_soft_placement=True, log_device_placement=True))
# Runs the op.
print(sess.run(c))
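
By default TensorFlow maps memory on every GPU that is visible to the process, so another common way to reduce pressure is to hide the GPUs the job does not need. This is a sketch, assuming the job should run only on the physical GPU with index 2 (the index used above). Inside the process that device then shows up as GPU 0:

```python
import os

# Expose only physical GPU 2 to this process. This must be set before CUDA
# is initialized, i.e. before TensorFlow is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = "2"

# Inside this process the single visible GPU is renumbered to index 0.
device_name = "/device:GPU:0"
print(device_name)
```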

The TensorFlow documentation covers more options and details.

I have also seen some cruder approaches, for example:

import gc

del model     # drop the Python reference to the model
gc.collect()  # force a garbage-collection pass
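
It helps to be clear about what these two lines guarantee. They only reclaim the Python object. As far as I know, TensorFlow's allocator generally keeps GPU memory it has already claimed for the lifetime of the process. Here is a minimal sketch of the Python side, using a hypothetical `Model` stand-in class (not a real Keras model) that contains a reference cycle:

```python
import gc
import weakref


class Model:
    """Hypothetical stand-in for a real model object."""

    def __init__(self):
        # A reference cycle, so reference counting alone cannot free the object.
        self.owner = self


model = Model()
probe = weakref.ref(model)  # lets us check whether the object is still alive

del model     # removes the name; the cycle can still keep the object alive
gc.collect()  # the cycle collector reclaims it

print(probe() is None)
```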

Or, from within Keras:

from keras import backend as K

def clear_cuda_memory():
    # Call clear_session() several times in case memory is released slowly.
    for _ in range(5):
        K.clear_session()
    return True

cuda = clear_cuda_memory()

The call is repeated several times to cope with processes that release memory slowly.

Another, completely brute-force approach is to kill the Python process and/or the IPython kernel.
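
A tidier version of the same idea is to run the GPU work in a short-lived child process, so the driver reclaims the memory when that process exits. This is a minimal sketch using only the standard library. `run_isolated` is a hypothetical helper name, and the snippet passed to it is a placeholder for the real training code:

```python
import subprocess
import sys


def run_isolated(code, timeout=None):
    """Run a Python snippet in a fresh interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output to str
        timeout=timeout,
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Placeholder workload; in practice this would build and train the model.
output = run_isolated("print(sum(range(10)))")
print(output.strip())
```

Everything the child allocated, including GPU memory, goes away with the child, so the parent process or notebook kernel can stay alive between runs.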

I have not found a perfect solution yet. TensorFlow 2.0 may fix some of these issues; my current version is 1.14.0.
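
For reference, the memory-growth setting from the first snippet looks different in TensorFlow 2.x. This is a sketch, assuming TensorFlow 2.0 or later, where `tf.config.experimental.set_memory_growth` is available. `enable_memory_growth` is a hypothetical helper, and it returns 0 if TensorFlow is not installed or no GPU is visible:

```python
def enable_memory_growth():
    """Turn on memory growth for every visible GPU; return how many were set."""
    try:
        import tensorflow as tf
    except ImportError:
        return 0  # TensorFlow is not installed, so there is nothing to configure

    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        # This must run before the GPUs are initialized; otherwise
        # TensorFlow raises a RuntimeError.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


print(enable_memory_growth())
```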
