
TensorFlow 2.0: is the CUDA version displayed incorrectly? - Cupoy


2020/09/20 10:55 AM
Machine Learning Co-learning Discussion Board
Yang Wang
Views: 15
Answers: 2
Bookmarks: 0

After updating tensorflow to 2.0, I get a pile of error messages. Removing tensorflow and reinstalling it with `conda install tensorflow-gpu=2.0.0` produces the same errors. I also tried uninstalling CUDA and reinstalling CUDA 10.0 and CuDNN 7.6.0, but nvidia-smi still shows CUDA 11.0:

```
Sun Sep 20 10:49:51 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 451.82       Driver Version: 451.82       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208... WDDM  | 00000000:01:00.0  On |                  N/A |
|  0%   59C    P3    79W / 300W |   1468MiB / 11264MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
```

I have also set the environment variables, but rerunning the code gives the same messages. Here is what the log shows:

```
2020-09-20 10:51:32.182818: E tensorflow/stream_executor/cuda/cuda_dnn.cc:319] Loaded runtime CuDNN library: 7.4.1 but source was compiled with: 7.6.0. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2020-09-20 10:51:32.196765: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at cudnn_rnn_ops.cc:1491 : Unknown: Fail to find the dnn implementation.
2020-09-20 10:51:32.201205: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Unknown: Fail to find the dnn implementation.
         [[{{node CudnnRNN}}]]
2020-09-20 10:51:32.206991: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Cancelled: [_Derived_]RecvAsync is cancelled.
         [[{{node Reshape_16/_62}}]]
         [[Adam/Adam/update/AssignSubVariableOp/_65]]
2020-09-20 10:51:32.213809: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Cancelled: [_Derived_]RecvAsync is cancelled.
         [[{{node Reshape_16/_62}}]]
```
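The first log line states the compatibility rule that is failing: for CuDNN 7.0 or later, the runtime library must have the same major version as the one TensorFlow was compiled against, and an equal or higher minor version. The sketch below only illustrates that rule as worded in the log; the helper names `parse_cudnn_mismatch` and `runtime_is_compatible` are hypothetical and are not part of TensorFlow.

```python
import re

# Matches the mismatch message exactly as it is worded in the log above.
_MISMATCH = re.compile(
    r"Loaded runtime CuDNN library: (\d+)\.(\d+)\.(\d+) "
    r"but source was compiled with: (\d+)\.(\d+)\.(\d+)"
)


def parse_cudnn_mismatch(log_line):
    """Return (loaded, compiled) version tuples, or None if the line does not match."""
    match = _MISMATCH.search(log_line)
    if match is None:
        return None
    numbers = tuple(int(group) for group in match.groups())
    return numbers[:3], numbers[3:]


def runtime_is_compatible(loaded, compiled):
    """Rule quoted in the log for CuDNN 7.0+: same major, runtime minor >= compiled minor."""
    return loaded[0] == compiled[0] and loaded[1] >= compiled[1]


line = "Loaded runtime CuDNN library: 7.4.1 but source was compiled with: 7.6.0."
loaded, compiled = parse_cudnn_mismatch(line)
print(loaded, compiled, runtime_is_compatible(loaded, compiled))
# (7, 4, 1) (7, 6, 0) False
```

Under this rule a 7.6.x runtime would satisfy a build compiled against 7.6.0, while the 7.4.1 library that was actually loaded does not. Separately, the `CUDA Version` field printed by nvidia-smi is the highest CUDA version the installed driver supports, not the version of the installed toolkit, so seeing 11.0 there does not by itself indicate a problem.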

Answers

  • 2020/09/20 12:23 PM
    Yang Wang
    Upvotes: 0
    Downvotes: 0
    Comments: 0

    Checking with `conda list` does not show anything wrong either. The output is:

    ```
    cudatoolkit               10.0.130                      0
    cudnn                     7.6.5                cuda10.0_0
    ```

  • 2020/10/18 1:31 AM
    張維元 (WeiYuan)
    Upvotes: 0
    Downvotes: 0
    Comments: 2

    Hi, I looked into this, and it appears to be an environment variable problem on Windows. You can take a look at the official discussion: https://github.com/tensorflow/tensorflow/issues/23715


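    I'm glad to discuss this in the Q&A. If anything is still unclear or fuzzy, feel free to keep asking. I look forward to your interaction and encouragement leading to deeper discussions. You are also welcome to join the Line group community I run, where sharing events are held from time to time. Come and join in!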
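One way to follow up on the environment-variable explanation is to list every directory on `PATH` that contains a CuDNN DLL, in search order. The log reports CuDNN 7.4.1 at runtime while `conda list` shows 7.6.5, which would be consistent with an older copy sitting in a directory that is searched earlier. The sketch below is only a diagnostic aid under that assumption: `find_on_path` is a hypothetical helper, `cudnn64_7.dll` is the file name CuDNN 7.x uses on Windows, and Windows also consults locations other than `PATH` when loading DLLs, so the result is a hint rather than proof.

```python
import os


def find_on_path(filename, path_value=None):
    """Return the directories in path_value (default: the PATH variable)
    that contain filename, in the order they are searched."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for entry in path_value.split(os.pathsep):
        entry = entry.strip().strip('"')  # tolerate quoted or padded entries
        if entry and os.path.isfile(os.path.join(entry, filename)):
            hits.append(entry)
    return hits


# Print every PATH directory that holds a CuDNN 7.x DLL; the first one listed
# is the first candidate found through PATH.
for directory in find_on_path("cudnn64_7.dll"):
    print(directory)
```

If more than one directory is printed, comparing the copies and removing or reordering the outdated one is a reasonable next step before reinstalling anything.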