Error message shows up in neither the STDERR nor the STDOUT redirection, but it does appear when the output goes to the screen?

The following commands are run on Ubuntu 16.04 with Python 3.5. When I run the Python routine without any redirection:

python3 opt_CNN2_dense.py

the ResourceExhausted error is printed to the screen as follows:

/usr/local/lib/python3.5/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
Train on 123200 samples, validate on 30800 samples
Epoch 1/10
2018-04-07 11:14:44.279768: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-04-07 11:14:44.978444: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-07 11:14:44.979036: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:04.0
totalMemory: 11.17GiB freeMemory: 11.09GiB
2018-04-07 11:14:44.979273: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-04-07 11:14:59.113240: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10750 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
2018-04-07 11:15:21.405519: W tensorflow/core/common_runtime/bfc_allocator.cc:275] Allocator (GPU_0_bfc) ran out of memory trying to allocate 4.88GiB.  Current allocation summary follows.
2018-04-07 11:15:21.405695: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (256):   Total Chunks: 55, Chunks in use: 55. 13.8KiB allocated for chunks. 13.8KiB in use in bin. 1.2KiB client-requested in use in bin.
2018-04-07 11:15:21.405785: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (512):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-04-07 11:15:21.405804: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (1024):  Total Chunks: 13, Chunks in use: 13. 20.8KiB allocated for chunks. 20.8KiB in use in bin. 18.4KiB client-requested in use in bin.
2018-04-07 11:15:21.405850: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (2048):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-04-07 11:15:21.405866: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (4096):  Total Chunks: 1, Chunks in use: 1. 4.0KiB allocated for chunks. 4.0KiB in use in bin. 4.0KiB client-requested in use in bin.
2018-04-07 11:15:21.405926: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (8192):  Total Chunks: 1, Chunks in use: 1. 8.0KiB allocated for chunks. 8.0KiB in use in bin. 8.0KiB client-requested in use in bin.
2018-04-07 11:15:21.405971: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (16384):         Total Chunks: 6, Chunks in use: 6. 118.5KiB allocated for chunks. 118.5KiB in use in bin. 117.2KiB client-requested in use in bin.
2018-04-07 11:15:21.406013: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (32768):         Total Chunks: 1, Chunks in use: 1. 44.0KiB allocated for chunks. 44.0KiB in use in bin. 44.0KiB client-requested in use in bin.
2018-04-07 11:15:21.406055: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (65536):         Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-04-07 11:15:21.406096: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (131072):        Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-04-07 11:15:21.406135: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (262144):        Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-04-07 11:15:21.406175: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (524288):        Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-04-07 11:15:21.406209: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (1048576):       Total Chunks: 7, Chunks in use: 7. 11.92MiB allocated for chunks. 11.92MiB in use in bin. 11.30MiB client-requested in use in bin.
2018-04-07 11:15:21.406261: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (2097152):       Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-04-07 11:15:21.406292: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (4194304):       Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-04-07 11:15:21.406323: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (8388608):       Total Chunks: 6, Chunks in use: 6. 72.66MiB allocated for chunks. 72.66MiB in use in bin. 72.66MiB client-requested in use in bin.
2018-04-07 11:15:21.406354: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (16777216):      Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-04-07 11:15:21.406398: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (33554432):      Total Chunks: 7, Chunks in use: 6. 399.58MiB allocated for chunks. 366.21MiB in use in bin. 366.21MiB client-requested in use in bin.
2018-04-07 11:15:21.406436: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (67108864):      Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-04-07 11:15:21.406467: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (134217728):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-04-07 11:15:21.406497: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (268435456):     Total Chunks: 3, Chunks in use: 2. 10.03GiB allocated for chunks. 9.77GiB in use in bin. 9.77GiB client-requested in use in bin.
2018-04-07 11:15:21.406529: I tensorflow/core/common_runtime/bfc_allocator.cc:646] Bin for 4.88GiB was 256.00MiB, Chunk State: 
2018-04-07 11:15:21.406563: I tensorflow/core/common_runtime/bfc_allocator.cc:652]   Size: 266.07MiB | Requested Size: 1.72MiB | in_use: 0, prev:   Size: 4.88GiB | Requested Size: 4.88GiB | in_use: 1
2018-04-07 11:15:21.406672: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x405140000 of size 1280
2018-04-07 11:15:21.406751: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x405140500 of size 256
2018-04-07 11:15:21.406803: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x405140600 of size 256
2018-04-07 11:15:21.406848: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x405140700 of size 20224
2018-04-07 11:15:21.406875: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x405145600 of size 256
2018-04-07 11:15:21.406928: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x405145700 of size 256
2018-04-07 11:15:21.406950: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x405145800 of size 1792
2018-04-07 11:15:21.407015: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x405145f00 of size 256
2018-04-07 11:15:21.407027: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x405146000 of size 256
2018-04-07 11:15:21.407043: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x405146100 of size 256
2018-04-07 11:15:21.407051: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x405146200 of size 256
2018-04-07 11:15:21.407114: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x405146300 of size 256
...
2018-04-07 11:15:21.410385: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 6 Chunks of size 20224 totalling 118.5KiB
2018-04-07 11:15:21.410435: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 1 Chunks of size 45056 totalling 44.0KiB
2018-04-07 11:15:21.410475: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 1 Chunks of size 1698304 totalling 1.62MiB
2018-04-07 11:15:21.410487: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 6 Chunks of size 1800192 totalling 10.30MiB
2018-04-07 11:15:21.410531: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 6 Chunks of size 12697600 totalling 72.66MiB
2018-04-07 11:15:21.410544: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 6 Chunks of size 64000000 totalling 366.21MiB
2018-04-07 11:15:21.410586: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 2 Chunks of size 5242880000 totalling 9.77GiB
2018-04-07 11:15:21.410599: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Sum Total of in-use chunks: 10.21GiB
2018-04-07 11:15:21.410644: I tensorflow/core/common_runtime/bfc_allocator.cc:680] Stats: 
Limit:                 11272650752
InUse:                 10958659072
MaxInUse:              11055887872
NumAllocs:                     108
MaxAllocSize:           5242880000
...
2018-04-07 11:15:31.415584: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 1 Chunks of size 1698304 totalling 1.62MiB
2018-04-07 11:15:31.415597: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 6 Chunks of size 1800192 totalling 10.30MiB
2018-04-07 11:15:31.415639: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 6 Chunks of size 12697600 totalling 72.66MiB
2018-04-07 11:15:31.415744: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 6 Chunks of size 64000000 totalling 366.21MiB
2018-04-07 11:15:31.415763: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 2 Chunks of size 5242880000 totalling 9.77GiB
2018-04-07 11:15:31.415771: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Sum Total of in-use chunks: 10.21GiB
2018-04-07 11:15:31.415797: I tensorflow/core/common_runtime/bfc_allocator.cc:680] Stats: 
Limit:                 11272650752
InUse:                 10958659072
MaxInUse:              11055887872
NumAllocs:                     108
MaxAllocSize:           5242880000

2018-04-07 11:15:31.415859: W tensorflow/core/common_runtime/bfc_allocator.cc:279] **************************************************************************************************__
2018-04-07 11:15:31.415928: W tensorflow/core/framework/op_kernel.cc:1202] OP_REQUIRES failed at cwise_ops_common.cc:70 : Resource exhausted: OOM when allocating tensor with shape[1024,2,128,5000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "/home/iamshg8/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1361, in _do_call
    return fn(*args)
  File "/home/iamshg8/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1340, in _run_fn
    target_list, status, run_metadata)
  File "/home/iamshg8/.local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1024,2,128,5000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[Node: training/Adam/gradients/zeros_4 = Fill[T=DT_FLOAT, _class=["loc:@conv1/Relu"], index_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](training/Adam/gradients/Shape_5, training/Adam/gradients/zeros_4/Const)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

         [[Node: loss/mul/_129 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_845_loss/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "a.py", line 97, in <module>
    return_argmin=True
  File "/usr/local/lib/python3.5/dist-packages/hyperopt/fmin.py", line 307, in fmin
    return_argmin=return_argmin,
  File "/usr/local/lib/python3.5/dist-packages/hyperopt/base.py", line 635, in fmin
    return_argmin=return_argmin)
  File "/usr/local/lib/python3.5/dist-packages/hyperopt/fmin.py", line 320, in fmin
    rval.exhaust()
  File "/usr/local/lib/python3.5/dist-packages/hyperopt/fmin.py", line 199, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.async)
  File "/usr/local/lib/python3.5/dist-packages/hyperopt/fmin.py", line 173, in run
    self.serial_evaluate()
  File "/usr/local/lib/python3.5/dist-packages/hyperopt/fmin.py", line 92, in serial_evaluate
    result = self.domain.evaluate(spec, ctrl)
  File "/usr/local/lib/python3.5/dist-packages/hyperopt/base.py", line 840, in evaluate
    rval = self.fn(pyll_rval)
  File "a.py", line 48, in f_nn
    callbacks=callbacks)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1705, in fit
    validation_steps=validation_steps)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1235, in _fit_loop
    outs = f(ins_batch)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2478, in __call__
    **self.session_kwargs)
  File "/home/iamshg8/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 905, in run
    run_metadata_ptr)
  File "/home/iamshg8/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1137, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/iamshg8/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1355, in _do_run
    options, run_metadata)
  File "/home/iamshg8/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1374, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1024,2,128,5000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[Node: training/Adam/gradients/zeros_4 = Fill[T=DT_FLOAT, _class=["loc:@conv1/Relu"], index_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](training/Adam/gradients/Shape_5, training/Adam/gradients/zeros_4/Const)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

         [[Node: loss/mul/_129 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_845_loss/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


Caused by op 'training/Adam/gradients/zeros_4', defined at:
  File "a.py", line 97, in <module>
    return_argmin=True
  File "/usr/local/lib/python3.5/dist-packages/hyperopt/fmin.py", line 307, in fmin
    return_argmin=return_argmin,
  File "/usr/local/lib/python3.5/dist-packages/hyperopt/base.py", line 635, in fmin
    return_argmin=return_argmin)
  File "/usr/local/lib/python3.5/dist-packages/hyperopt/fmin.py", line 320, in fmin
    rval.exhaust()
  File "/usr/local/lib/python3.5/dist-packages/hyperopt/fmin.py", line 199, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.async)
  File "/usr/local/lib/python3.5/dist-packages/hyperopt/fmin.py", line 173, in run
    self.serial_evaluate()
  File "/usr/local/lib/python3.5/dist-packages/hyperopt/fmin.py", line 92, in serial_evaluate
    result = self.domain.evaluate(spec, ctrl)
  File "/usr/local/lib/python3.5/dist-packages/hyperopt/base.py", line 840, in evaluate
    rval = self.fn(pyll_rval)
  File "a.py", line 48, in f_nn
    callbacks=callbacks)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1682, in fit
    self._make_train_function()
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 990, in _make_train_function
    loss=self.total_loss)
  File "/usr/local/lib/python3.5/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/optimizers.py", line 445, in get_updates
    grads = self.get_gradients(loss, params)
  File "/usr/local/lib/python3.5/dist-packages/keras/optimizers.py", line 78, in get_gradients
    grads = K.gradients(loss, params)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2515, in gradients
    return tf.gradients(loss, variables, colocate_gradients_with_ops=True)
  File "/home/iamshg8/.local/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 602, in gradients
    out_grads[i] = control_flow_ops.ZerosLikeOutsideLoop(op, i)
  File "/home/iamshg8/.local/lib/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1477, in ZerosLikeOutsideLoop
    return array_ops.zeros(zeros_shape, dtype=val.dtype)
  File "/home/iamshg8/.local/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 1570, in zeros
    output = fill(shape, constant(zero, dtype=dtype), name=name)
  File "/home/iamshg8/.local/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1713, in fill
    "Fill", dims=dims, value=value, name=name)
  File "/home/iamshg8/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/iamshg8/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3271, in create_op
    op_def=op_def)
  File "/home/iamshg8/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1650, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1024,2,128,5000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[Node: training/Adam/gradients/zeros_4 = Fill[T=DT_FLOAT, _class=["loc:@conv1/Relu"], index_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](training/Adam/gradients/Shape_5, training/Adam/gradients/zeros_4/Const)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

         [[Node: loss/mul/_129 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_845_loss/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Later I tried to write the error messages to a file with the following command, but the file did not contain the resource-exhausted error message.

python3 opt_CNN2_dense.py > opt_CNN2.dense.inf.txt 2> opt_CNN2.dense.error

The contents of the file opt_CNN2.dense.error are:

2018-04-07 11:21:47.313671: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-04-07 11:21:47.410104: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-07 11:21:47.410530: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:04.0
totalMemory: 11.17GiB freeMemory: 11.09GiB
2018-04-07 11:21:47.410551: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-04-07 11:21:47.704597: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10750 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)

What I cannot understand is: where did the resource-exhausted error message go?
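
For reference, one check I have not run yet would be to disable Python's output buffering and merge both streams into a single file, so that nothing can be split between two destinations (the -u flag and the all.txt file name below are only an illustration, not part of my original commands):

python3 -u opt_CNN2_dense.py > opt_CNN2.dense.all.txt 2>&1

If the OOM message shows up in that merged file, the issue is with how the two streams are being split; if it still does not, then the message is apparently reaching the terminal through something other than plain stdout/stderr.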

Update:
I can also confirm that the program failed on the second (redirected) run as well, because inf.txt is empty (see the last lines of the terminal output below).

iamshg8@instance-1:~$ !cat   
cat rml/py/inf.error    
2018-04-07 11:21:47.313671: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-04-07 11:21:47.410104: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-07 11:21:47.410530: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:04.0
totalMemory: 11.17GiB freeMemory: 11.09GiB
2018-04-07 11:21:47.410551: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-04-07 11:21:47.704597: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10750 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
iamshg8@instance-1:~$ cat rml/py/inf.txt
iamshg8@instance-1:~$ 
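
Another check that might help (again, something I have not run yet) would be to record the whole terminal session with the util-linux script tool, which captures everything the program prints to the terminal regardless of which stream or file descriptor it writes to (the log file name here is hypothetical):

script -c 'python3 opt_CNN2_dense.py' opt_CNN2.session.log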
