
Installing nvidia-container-toolkit

Prerequisites

  1. GNU/Linux x86_64 with kernel version > 3.10

  2. Docker >= 19.03 (recommended, but some distributions may include older versions of Docker. The minimum supported version is 1.12)

  3. NVIDIA GPU with Architecture >= Kepler (or compute capability 3.0)

  4. NVIDIA Linux drivers >= 418.81.07 (Note that older driver releases or branches are unsupported.)
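
The prerequisites above can be checked before installing anything. The following is a minimal sketch and not part of the original instructions: the helper names `version_ge` and `check_prereqs` are made up for illustration, and it assumes GNU `sort -V` is available and that `docker` and `nvidia-smi` are on `PATH` when installed. The GPU architecture requirement is not checked here.

```shell
#!/bin/sh
# version_ge A B: succeed when version A >= version B (relies on sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print a one-line verdict for the kernel, Docker and driver versions.
check_prereqs() {
    # Strip the distribution suffix, e.g. 5.4.0-150-generic -> 5.4.0
    kernel=$(uname -r | cut -d- -f1)
    if version_ge "$kernel" 3.10; then
        echo "kernel $kernel: ok"
    else
        echo "kernel $kernel: needs to be newer than 3.10"
    fi

    if command -v docker >/dev/null 2>&1; then
        docker_ver=$(docker version --format '{{.Client.Version}}' 2>/dev/null)
        if [ -n "$docker_ver" ] && version_ge "$docker_ver" 19.03; then
            echo "docker $docker_ver: ok"
        else
            echo "docker ${docker_ver:-unknown}: 19.03 or newer is recommended"
        fi
    else
        echo "docker: not found"
    fi

    if command -v nvidia-smi >/dev/null 2>&1; then
        driver_ver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1)
        if [ -n "$driver_ver" ] && version_ge "$driver_ver" 418.81.07; then
            echo "driver $driver_ver: ok"
        else
            echo "driver ${driver_ver:-unknown}: 418.81.07 or newer is required"
        fi
    else
        echo "nvidia-smi: not found (is the NVIDIA driver installed?)"
    fi
    return 0
}

check_prereqs
```

The script only reports; it does not change anything on the system.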

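Installing nvidia-container-toolkit online on Ubuntu

1. Log in to the server as root using Xshell

# Check the Docker version
docker -v
# Add the repository key; prints OK on success
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -

# Set up the package repository and GPG key: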
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Update the package index
sudo apt-get update
# Install nvidia-container-toolkit
sudo apt-get install -y nvidia-container-toolkit
# Restart Docker
sudo systemctl restart docker
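
Once Docker has restarted, a quick way to confirm that containers can see the GPU is to run `nvidia-smi` inside a throwaway container. This is a minimal sketch, assuming Docker 19.03 or newer (for the `--gpus` flag) and that the `ubuntu` image can be pulled; `gpu_smoke_test` is a hypothetical helper name.

```shell
#!/bin/sh
# Run nvidia-smi in a temporary container to confirm GPU access.
# Returns non-zero when docker is missing or the container fails.
gpu_smoke_test() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker not found on PATH" >&2
        return 1
    fi
    # --gpus all asks Docker to expose every GPU to the container.
    docker run --rm --gpus all ubuntu nvidia-smi
}
```

Calling `gpu_smoke_test` on the host should print the usual `nvidia-smi` table if the toolkit is wired into Docker.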
  • Note: run these commands on the host, outside any container
  • Error 1:
    Running:
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

Output:

Unsupported distribution!

Check https://nvidia.github.io/nvidia-docker

Fix: use the libnvidia-container repository instead:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
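
The `Unsupported distribution!` text appears to be what the server sends back when no list exists for the computed `$distribution` value, so it helps to look at that value and to test the URL before writing anything under `/etc/apt/sources.list.d/`. A small sketch under those assumptions; `distribution_id` and `list_available` are illustrative helper names, not part of any NVIDIA tooling.

```shell
#!/bin/sh
# Print ID followed by VERSION_ID from an os-release file, e.g. "ubuntu20.04".
# The file path is a parameter so that other files can be inspected too.
distribution_id() {
    ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Succeed only if the repository list for a distribution can be fetched.
# curl -f turns HTTP errors into a failure instead of printing the error page.
list_available() {
    curl -fsSL "https://nvidia.github.io/libnvidia-container/$1/libnvidia-container.list" >/dev/null
}

if [ -r /etc/os-release ]; then
    echo "distribution=$(distribution_id)"
fi
```

A guard such as `list_available "$(distribution_id)" || echo "no list for this distribution"` keeps an error page from ending up in the sources file.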

Solution for Docker versions older than 19.03

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -

curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update
# Install nvidia-docker
sudo apt-get install nvidia-docker2

# Restart Docker
sudo systemctl restart docker
  • Note: run these commands on the host, outside any container
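
The two install paths also differ in how a GPU container is started: `--gpus` only exists from Docker 19.03 onward, while older versions rely on the `nvidia` runtime (`--runtime=nvidia`) instead. A hedged sketch of picking the flags from a Docker version string; `gpu_run_flags` is a hypothetical helper, and `NVIDIA_VISIBLE_DEVICES=all` is assumed to be the desired device selection.

```shell
#!/bin/sh
# Print the docker run flags suited to a given Docker version string.
gpu_run_flags() {
    # The smaller of the two versions sorts first with sort -V.
    oldest=$(printf '%s\n%s\n' "$1" 19.03 | sort -V | head -n 1)
    if [ "$oldest" = 19.03 ]; then
        echo "--gpus all"
    else
        echo "--runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all"
    fi
}

gpu_run_flags 18.09.9
gpu_run_flags 20.10.21
```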

Installing nvidia-container-toolkit offline on Ubuntu

Package dependencies:

├─ nvidia-container-toolkit (version)
│ ├─ libnvidia-container-tools (>= version)
│ └─ nvidia-container-toolkit-base (version)

├─ libnvidia-container-tools (version)
│ └─ libnvidia-container1 (>= version)
└─ libnvidia-container1 (version)

  • Installation order:
  1. libnvidia-container1
  2. libnvidia-container-tools
  3. nvidia-container-toolkit

Getting the packages

  1. Download the following packages
libnvidia-container1_1.9.0-1_amd64.deb
libnvidia-container-tools_1.9.0-1_amd64.deb
nvidia-container-toolkit_1.9.0-1_amd64.deb
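
One way to obtain these files is to fetch them on a machine that has internet access and the NVIDIA repository configured, then copy them to the offline server. The sketch below only prints the `apt-get download` commands for a pinned version so they can be reviewed first; `print_download_cmds` is an illustrative name, and whether version `1.9.0-1` is still offered by the repository is an assumption.

```shell
#!/bin/sh
# Print one "apt-get download" command per package, pinned to a version.
# apt-get download saves the .deb file into the current directory.
print_download_cmds() {
    for pkg in libnvidia-container1 libnvidia-container-tools nvidia-container-toolkit; do
        echo "apt-get download ${pkg}=$1"
    done
}

print_download_cmds 1.9.0-1
```

Piping the output to `sh` on the connected machine would run the downloads.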

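Installing the packages

  1. Upload the packages to the server
  2. cd into the directory containing the packages
  3. Install them with dpkg (mind the installation order)
# Install libnvidia-container1 first
sudo dpkg -i ./libnvidia-container1_1.9.0-1_amd64.deb
# Then install libnvidia-container-tools
sudo dpkg -i ./libnvidia-container-tools_1.9.0-1_amd64.deb
# Finally install nvidia-container-toolkit
sudo dpkg -i ./nvidia-container-toolkit_1.9.0-1_amd64.deb
  4. Restart the Docker service
# Restart Docker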
sudo systemctl restart docker
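
The manual steps above can be wrapped in a small script so the order is never mixed up. This is a sketch rather than a tested installer: `debs_in_order` is a made-up helper, it assumes one version of each package sits in the directory, and it only prints the `dpkg` commands instead of running them.

```shell
#!/bin/sh
# List the .deb files found in a directory in dependency order:
# libnvidia-container1, libnvidia-container-tools, nvidia-container-toolkit.
debs_in_order() {
    for name in libnvidia-container1 libnvidia-container-tools nvidia-container-toolkit; do
        for deb in "$1"/"$name"_*.deb; do
            # Skip the unexpanded pattern when nothing matches.
            [ -e "$deb" ] && echo "$deb"
        done
    done
    return 0
}

# Dry run: show the install commands for the current directory.
debs_in_order . | while IFS= read -r deb; do
    echo "sudo dpkg -i $deb"
done
```

The trailing underscore in the pattern keeps `libnvidia-container1_...` from also matching the `-tools` package.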

Uninstalling nvidia-container-toolkit

  1. Uninstall commands
sudo apt remove nvidia-container-toolkit
sudo apt remove libnvidia-container-tools
sudo apt remove libnvidia-container1
  2. Error reported by Docker when nvidia-container-toolkit is not installed
docker: Error response from daemon: exec: "nvidia-container-runtime-hook": executable file not found in $PATH.
ERRO[0000] error waiting for container: context canceled 
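
The message says that Docker tried to execute `nvidia-container-runtime-hook` and could not find it on `PATH`. A quick check along those lines, as a sketch; `hook_installed` is an illustrative name and the check looks at `PATH` only.

```shell
#!/bin/sh
# Succeed when nvidia-container-runtime-hook can be found on PATH.
hook_installed() {
    command -v nvidia-container-runtime-hook >/dev/null 2>&1
}

if hook_installed; then
    echo "hook found: $(command -v nvidia-container-runtime-hook)"
else
    echo "hook missing: install nvidia-container-toolkit, then restart docker"
fi
```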

Other errors

E: Conflicting values set for option Signed-By regarding source https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64/ /: /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg !=
E: The list of sources could not be read.
# Fix: back up the sources first
sudo cp /etc/apt/sources.list /etc/apt/sources.list.backup
sudo cp -r /etc/apt/sources.list.d/ /etc/apt/sources.list.d.backup
# Remove the NVIDIA-related source list files and signing key
sudo rm /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
sudo rm /etc/apt/sources.list.d/*nvidia*
# Recreate the NVIDIA source list file
echo "deb https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64/ /" | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Update the package index
sudo apt update
# Reinstall
sudo apt-get install nvidia-docker2
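
apt raises this conflict when the same repository is declared more than once with different `signed-by` settings, for instance once with the keyring and once without. Before deleting anything it can help to list every NVIDIA entry together with the file it lives in. A minimal sketch; `find_nvidia_sources` is a hypothetical helper, and the search pattern `nvidia.github.io` is taken from the repository URLs used in this article.

```shell
#!/bin/sh
# Print every apt source line that mentions nvidia.github.io,
# prefixed with the name of the file it comes from.
find_nvidia_sources() {
    dir=${1:-/etc/apt}
    grep -rH 'nvidia\.github\.io' "$dir/sources.list" "$dir/sources.list.d" 2>/dev/null
    return 0
}

find_nvidia_sources
```

If two files list the same URL, keeping a single entry is usually enough to clear the error.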


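Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Article link: https://blog.csdn.net/weixin_55674987/article/details/139867794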
