RuntimeError: cuda runtime error (59) : device-side assert triggered


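I looked at other suggestions for this error, but those posts involved RNNs and data labels. In my case, everything worked fine yesterday, and then my code suddenly stopped working.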

I am trying to run this Python code on my local machine: https://colab.research.google.com/github/Curt-Park/rainbow-is-all-you-need/blob/master/08.rainbow.ipynb

C:/w/1/s/tmp_conda_3.6_045031/conda/conda-bld/pytorch_1565412750030/work/aten/src/THC/THCTensorIndex.cu:189: block: [25,0,0], thread: [63,0,0] Assertion `dstIndex < dstAddDimSize` failed.
THCudaCheck FAIL file=C:/w/1/s/tmp_conda_3.6_045031/conda/conda-bld/pytorch_1565412750030/work/aten/src\THC/THCTensorMathCompareT.cuh line=69 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "rainbow.py", line 763, in <module>
    agent.train(epochs, horizon)
  File "rainbow.py", line 629, in train
    loss = self.update_model()
  File "rainbow.py", line 578, in update_model
    elementwise_loss_n_loss = self._compute_dqn_loss(samples, gamma)
  File "rainbow.py", line 710, in _compute_dqn_loss
    dist = self.dqn.dist(state)
  File "rainbow.py", line 386, in dist
    print(x)
  File "C:\Users\un_po\Anaconda3\envs\rainbowPy\lib\site-packages\torch\tensor.py", line 82, in __repr__
    return torch._tensor_str._str(self)
  File "C:\Users\un_po\Anaconda3\envs\rainbowPy\lib\site-packages\torch\_tensor_str.py", line 300, in _str
    tensor_str = _tensor_str(self, indent)
  File "C:\Users\un_po\Anaconda3\envs\rainbowPy\lib\site-packages\torch\_tensor_str.py", line 201, in _tensor_str
    formatter = _Formatter(get_summarized_data(self) if summarize else self)
  File "C:\Users\un_po\Anaconda3\envs\rainbowPy\lib\site-packages\torch\_tensor_str.py", line 87, in __init__
    nonzero_finite_vals = torch.masked_select(tensor_view, torch.isfinite(tensor_view) & tensor_view.ne(0))
  File "C:\Users\un_po\Anaconda3\envs\rainbowPy\lib\site-packages\torch\functional.py", line 228, in isfinite
    return (tensor == tensor) & (tensor.abs() != inf)
RuntimeError: cuda runtime error (59) : device-side assert triggered at C:/w/1/s/tmp_conda_3.6_045031/conda/conda-bld/pytorch_1565412750030/work/aten/src\THC/THCTensorMathCompareT.cuh:69

The same code ran fine yesterday.

The code also still runs on the CPU.
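
Since CUDA reports errors asynchronously, the print(x) frame in the traceback may not be where the failure actually happens; the assertion dstIndex < dstAddDimSize points at an out-of-range index in some earlier indexing operation. The following is only a rough sketch of how this might be narrowed down, not a confirmed diagnosis. It assumes the CUDA_LAUNCH_BLOCKING environment variable is set before CUDA is initialised, and find_bad_indices is a hypothetical helper (not part of rainbow.py or PyTorch) that takes a flattened list of candidate index values, for example from tensor.view(-1).tolist().

```python
import math
import os

# Make CUDA kernel launches synchronous so the Python traceback points at
# the call that actually failed. Assumption: this runs before CUDA is
# initialised (simplest is before `import torch`).
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")


def find_bad_indices(values, dim_size):
    """Return (position, value) pairs that are not valid indices.

    `values` is any flat iterable of numbers (hypothetical usage:
    tensor.view(-1).tolist()); `dim_size` is the length of the
    dimension being indexed. NaN, inf, negative values, and values
    >= dim_size are all reported.
    """
    bad = []
    for pos, v in enumerate(values):
        if not math.isfinite(v) or not 0 <= v < dim_size:
            bad.append((pos, v))
    return bad


if __name__ == "__main__":
    # Illustrative values only, not taken from the failing run.
    print(find_bad_indices([0, 3, 5, -1], 5))
```

Running the check on the values before they are cast to integer indices would also reveal NaN or inf entries, which could not be seen after the cast.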
