Help needed: cannot pull the nvidia/cuda:11.0-base image when testing an nvidia-docker2 installation
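Hi everyone, I'm installing nvidia-docker2 by following NVIDIA's official steps. The earlier steps all appeared to complete without problems, but the final test command fails with an image pull error. I'd appreciate help tracking down the cause.
Here is what I did, along with the relevant output:
1. Attempted to install Docker (aborted because the script warned that Docker is already installed)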
    (base) mona@ada:~/clean-pvnet/docker$ curl https://get.docker.com | sh \
      && sudo systemctl --now enable docker
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100 21927  100 21927    0     0   186k      0 --:--:-- --:--:-- --:--:--  187k
    # Executing docker install script, commit: e5543d473431b782227f8908005543bb4389b8de
    Warning: the "docker" command appears to already exist on this system.
    If you already have Docker installed, this script can cause trouble, which is
    why we're displaying this warning and provide the opportunity to cancel the
    installation.
    If you installed the current Docker package using this script and are using it
    again to update Docker, you can safely ignore this message.
    You may press Ctrl+C now to abort this script.
    + sleep 20
    ^C
2. Added the NVIDIA Docker repository
    (base) mona@ada:~/clean-pvnet/docker$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
      && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).
    OK
    deb https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/$(ARCH) /
    #deb https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/$(ARCH) /
    deb https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/$(ARCH) /
    #deb https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/$(ARCH) /
    deb https://nvidia.github.io/nvidia-docker/ubuntu18.04/$(ARCH) /
Adding the experimental repository:
    (base) mona@ada:~/clean-pvnet/docker$ curl -s -L https://nvidia.github.io/nvidia-container-runtime/experimental/$distribution/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
    deb https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/$(ARCH) /
    deb https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/$(ARCH) /
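For reference, the `distribution` value used in the repository URLs above is just `ID` and `VERSION_ID` from `/etc/os-release` concatenated. A minimal sketch of that step, reading a sample file instead of the real `/etc/os-release` so it can be run anywhere (the helper name `distribution_from` and the sample contents are assumptions made up for illustration):

```shell
# Hypothetical helper: compute the "distribution" string the same way the
# repository setup command does, but from any os-release-style file.
distribution_from() {
  # Source the file in a subshell so its variables do not leak into the caller.
  ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Sample file mimicking an Ubuntu 22.04 system (illustrative values only).
sample=$(mktemp)
printf 'ID=ubuntu\nVERSION_ID="22.04"\n' > "$sample"

distribution_from "$sample"   # prints ubuntu22.04
rm -f "$sample"
```

On the Ubuntu 22.04 machine described in this post the value would presumably be `ubuntu22.04`, even though the list files that came back reference `ubuntu18.04` paths, as the `tee` output shows.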
3. Updated apt and installed nvidia-docker2
    (base) mona@ada:~/clean-pvnet/docker$ sudo apt-get update
    Get:1 file:/var/cudnn-local-repo-ubuntu2204-8.9.5.29 InRelease [1,572 B]
    Ign:2 http://10.82.164.106/debs ./ InRelease
    Get:1 file:/var/cudnn-local-repo-ubuntu2204-8.9.5.29 InRelease [1,572 B]
    Ign:3 http://10.82.164.106/debs ./ Release
    Ign:4 http://10.82.164.106/debs ./ Packages
    Ign:5 http://10.82.164.106/debs ./ Translation-en_US
    Ign:6 http://10.82.164.106/debs ./ Translation-en
    Ign:4 http://10.82.164.106/debs ./ Packages
    Ign:5 http://10.82.164.106/debs ./ Translation-en_US
    Ign:6 http://10.82.164.106/debs ./ Translation-en
    Ign:4 http://10.82.164.106/debs ./ Packages
    Ign:5 http://10.82.164.106/debs ./ Translation-en_US
    Ign:6 http://10.82.164.106/debs ./ Translation-en
    Hit:4 http://10.82.164.106/debs ./ Packages
    Ign:5 http://10.82.164.106/debs ./ Translation-en_US
    Get:7 https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/amd64 InRelease [1,503 B]
    Ign:6 http://10.82.164.106/debs ./ Translation-en
    Get:8 https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/amd64 InRelease [1,494 B]
    Ign:5 http://10.82.164.106/debs ./ Translation-en_US
    Hit:9 http://us.archive.ubuntu.com/ubuntu jammy InRelease
    Ign:6 http://10.82.164.106/debs ./ Translation-en
    Ign:5 http://10.82.164.106/debs ./ Translation-en_US
    Get:10 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 InRelease [1,484 B]
    Hit:11 http://us.archive.ubuntu.com/ubuntu jammy-updates InRelease
    Ign:6 http://10.82.164.106/debs ./ Translation-en
    Ign:5 http://10.82.164.106/debs ./ Translation-en_US
    Ign:6 http://10.82.164.106/debs ./ Translation-en
    Hit:12 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/amd64 InRelease
    Get:13 https://nvidia.github.io/nvidia-docker/ubuntu18.04/amd64 InRelease [1,474 B]
    Hit:14 http://us.archive.ubuntu.com/ubuntu jammy-backports InRelease
    Hit:15 http://security.ubuntu.com/ubuntu jammy-security InRelease
    Hit:16 https://dl.google.com/linux/chrome/deb stable InRelease
    Hit:17 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 InRelease
    Hit:18 http://packages.ros.org/ros2/ubuntu jammy InRelease
    Hit:19 https://packages.microsoft.com/repos/azure-cli jammy InRelease
    Hit:20 https://packages.microsoft.com/repos/ms-teams stable InRelease
    Hit:21 https://packages.microsoft.com/repos/code stable InRelease
    Get:22 https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/amd64 Packages [12.4 kB]
    Get:23 https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/amd64 Packages [976 B]
    Hit:24 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease
    Hit:25 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy InRelease
    Get:26 https://nvidia.github.io/nvidia-docker/ubuntu18.04/amd64 Packages [4,488 B]
    Hit:27 https://download.docker.com/linux/ubuntu jammy InRelease
    Hit:28 https://librealsense.intel.com/Debian/apt-repo jammy InRelease
    Fetched 23.8 kB in 6s (4,201 B/s)
    Reading package lists... Done
    W: https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/amd64/InRelease: Key is stored in legacy trusted.gpg keyring (/etc/apt/trusted.gpg), see the DEPRECATION section in apt-key(8) for details.
    W: https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/amd64/InRelease: Key is stored in legacy trusted.gpg keyring (/etc/apt/trusted.gpg), see the DEPRECATION section in apt-key(8) for details.
    W: https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64/InRelease: Key is stored in legacy trusted.gpg keyring (/etc/apt/trusted.gpg), see the DEPRECATION section in apt-key(8) for details.
    W: https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/amd64/InRelease: Key is stored in legacy trusted.gpg keyring (/etc/apt/trusted.gpg), see the DEPRECATION section in apt-key(8) for details.
    W: https://nvidia.github.io/nvidia-docker/ubuntu18.04/amd64/InRelease: Key is stored in legacy trusted.gpg keyring (/etc/apt/trusted.gpg), see the DEPRECATION section in apt-key(8) for details.
    W: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/InRelease: Key is stored in legacy trusted.gpg keyring (/etc/apt/trusted.gpg), see the DEPRECATION section in apt-key(8) for details.
Installing nvidia-docker2:
    (base) mona@ada:~/clean-pvnet/docker$ sudo apt-get install -y nvidia-docker2
    Reading package lists... Done
    Building dependency tree... Done
    Reading state information... Done
    nvidia-docker2 is already the newest version (2.13.0-1).
    The following packages were automatically installed and are no longer required:
      cuda-cccl-11-7 cuda-command-line-tools-11-7 cuda-compiler-11-7 cuda-cudart-11-7
      cuda-cudart-dev-11-7 cuda-cuobjdump-11-7 cuda-cupti-11-7 cuda-cupti-dev-11-7
      cuda-cuxxfilt-11-7 cuda-demo-suite-11-7 cuda-documentation-11-7 cuda-driver-dev-11-7
      cuda-gdb-11-7 cuda-libraries-11-7 cuda-libraries-dev-11-7 cuda-memcheck-11-7
      cuda-nsight-11-7 cuda-nsight-compute-11-7 cuda-nsight-systems-11-7 cuda-nvcc-11-7
      cuda-nvdisasm-11-7 cuda-nvml-dev-11-7 cuda-nvprof-11-7 cuda-nvprune-11-7
      cuda-nvrtc-11-7 cuda-nvrtc-dev-11-7 cuda-nvtx-11-7 cuda-nvvp-11-7 cuda-runtime-11-7
      cuda-sanitizer-11-7 cuda-toolkit-11-7 cuda-toolkit-11-7-config-common cuda-tools-11-7
      cuda-visual-tools-11-7 gds-tools-11-7 libcublas-11-7 libcublas-dev-11-7 libcufft-11-7
      libcufft-dev-11-7 libcufile-11-7 libcufile-dev-11-7 libcurand-11-7 libcurand-dev-11-7
      libcusolver-11-7 libcusolver-dev-11-7 libcusparse-11-7 libcusparse-dev-11-7
      libnpp-11-7 libnpp-dev-11-7 libnvidia-egl-wayland1 libnvjpeg-11-7 libnvjpeg-dev-11-7
      nsight-compute-2022.2.1 nsight-systems-2022.1.3
    Use 'sudo apt autoremove' to remove them.
    0 upgraded, 0 newly installed, 0 to remove and 58 not upgraded.
4. Restarted Docker and ran the test
    (base) mona@ada:~/clean-pvnet/docker$ sudo systemctl restart docker
    (base) mona@ada:~/clean-pvnet/docker$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
    Unable to find image 'nvidia/cuda:11.0-base' locally
    docker: Error response from daemon: manifest for nvidia/cuda:11.0-base not found: manifest unknown: manifest unknown.
    See 'docker run --help'.
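For context on reading the error above: `manifest unknown` is what a registry returns when it has no manifest for the requested tag, which is a different failure from Docker being unable to set up GPU access. A minimal sketch of telling the two apart, and of checking a tag before running it, might look like this. The function names `classify_docker_error` and `tag_exists` are hypothetical helpers made up for illustration; `docker manifest inspect` is an existing Docker CLI subcommand, though older CLI versions may require experimental features to be enabled for it.

```shell
# Hypothetical helper: classify a Docker error message by its text.
classify_docker_error() {
  case "$1" in
    *"manifest unknown"*)
      # The registry has no manifest for the requested image tag.
      echo "tag-not-found" ;;
    *"could not select device driver"*)
      # Typically seen when no GPU-capable runtime is available for --gpus.
      echo "gpu-runtime-missing" ;;
    *)
      echo "other" ;;
  esac
}

# Hypothetical helper: succeed only if the registry serves a manifest for
# the given image reference. Needs network access and a working docker CLI.
tag_exists() {
  docker manifest inspect "$1" >/dev/null 2>&1
}

# Example with the message from the failing command above.
classify_docker_error "manifest for nvidia/cuda:11.0-base not found: manifest unknown: manifest unknown."
```

With Docker available, `tag_exists nvidia/cuda:11.0-base && echo found || echo missing` would show whether the tag used in the tutorial is still published. The sketch makes no assumption about which tags currently exist.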
My system information:
    (base) mona@ada:~$ nvidia-docker --version
    Docker version 24.0.6, build ed223bc
    (base) mona@ada:~$ docker --version
    Docker version 24.0.6, build ed223bc
    (base) mona@ada:~$ uname -a
    Linux ada 6.2.0-36-generic #37~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Oct 9 15:34:04 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
    (base) mona@ada:~$ lsb_release -a
    LSB Version:    core-11.1.0ubuntu4-noarch:security-11.1.0ubuntu4-noarch
    Distributor ID: Ubuntu
    Description:    Ubuntu 22.04.3 LTS
    Release:        22.04
    Codename:       jammy
nvidia-docker2 package information:
    (base) mona@ada:~$ sudo apt show nvidia-docker2
    Package: nvidia-docker2
    Version: 2.13.0-1
    Priority: optional
    Section: utils
    Maintainer: NVIDIA CORPORATION <cudatools@nvidia.com>
    Installed-Size: 27.6 kB
    Depends: nvidia-container-toolkit (>= 1.13.0-1), docker-ce (>= 18.06.0~ce~3-0~ubuntu) | docker-ee (>= 18.06.0~ce~3-0~ubuntu) | docker.io (>= 18.06.0) | moby-engine
    Breaks: nvidia-docker (<< 2.0.0)
    Replaces: nvidia-docker (<< 2.0.0)
    Homepage: https://github.com/NVIDIA/nvidia-docker/wiki
    Download-Size: 6,876 B
    APT-Manual-Installed: yes
    APT-Sources: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 Packages
    Description: nvidia-docker CLI wrapper
     Replaces nvidia-docker with a new implementation based on the NVIDIA Container Toolkit
    N: There are 33 additional records. Please use the '-a' switch to see them.
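I need nvidia-docker2 to run some commands from a tutorial, and I'm stuck on this image pull failure. Can anyone help me figure out what is going wrong?
Note: this content comes from Stack Exchange; the question was asked by Mona Jalal.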




