Installing nvidia- on Ubuntu
Installing nvidia-container-toolkit
Prerequisites
- GNU/Linux x86_64 with kernel version > 3.10
- Docker >= 19.03 (recommended, but some distributions may include older versions of Docker. The minimum supported version is 1.12)
- NVIDIA GPU with Architecture >= Kepler (or compute capability 3.0)
- NVIDIA Linux drivers >= 418.81.07 (Note that older driver releases or branches are unsupported.)
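The version bounds above can be checked from a shell before going any further. This is only a sketch: the helper names version_ge, version_gt and check_prereqs are made up for this example, version ordering relies on GNU `sort -V`, and the Docker and driver checks assume `docker` and `nvidia-smi` are on PATH when they are installed.

```shell
# version_ge A B: succeeds when version A >= version B (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# version_gt A B: succeeds when version A is strictly newer than version B.
version_gt() {
    [ "$1" != "$2" ] && version_ge "$1" "$2"
}

# check_prereqs: print one line per prerequisite (hypothetical helper).
check_prereqs() {
    # Kernel: strip the distribution suffix, e.g. 5.4.0-150-generic -> 5.4.0
    kernel=$(uname -r | cut -d- -f1)
    if version_gt "$kernel" 3.10; then
        echo "kernel $kernel: ok"
    else
        echo "kernel $kernel: too old (need > 3.10)"
    fi

    # Docker: 19.03 is recommended, 1.12 is the stated minimum.
    if command -v docker >/dev/null 2>&1; then
        docker_ver=$(docker version --format '{{.Client.Version}}' 2>/dev/null)
        if version_ge "${docker_ver:-0}" 19.03; then
            echo "docker $docker_ver: ok"
        elif version_ge "${docker_ver:-0}" 1.12; then
            echo "docker $docker_ver: supported, but older than the recommended 19.03"
        else
            echo "docker ${docker_ver:-unknown}: too old or not readable"
        fi
    else
        echo "docker: not found"
    fi

    # Driver: ask nvidia-smi for the driver version of the first GPU.
    if command -v nvidia-smi >/dev/null 2>&1; then
        driver_ver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1)
        if version_ge "${driver_ver:-0}" 418.81.07; then
            echo "driver $driver_ver: ok"
        else
            echo "driver ${driver_ver:-unknown}: older than 418.81.07 or not readable"
        fi
    else
        echo "nvidia-smi: not found (is the NVIDIA driver installed?)"
    fi
}
```

After sourcing the snippet, run `check_prereqs` on the host. The GPU architecture requirement is not covered here.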
Online installation of nvidia-container-toolkit on Ubuntu
1. Log in to the server as root using Xshell
# Check the Docker version
docker -v
# Add the repository GPG key (prints OK on success)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
# Set up the package repository and GPG key:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Update the package index
sudo apt-get update
# Install nvidia-container-toolkit
sudo apt-get install -y nvidia-container-toolkit
# Restart Docker
sudo systemctl restart docker
- Note: run these commands on the host, outside any container
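Once Docker has restarted, a quick way to confirm that GPU passthrough works is to run nvidia-smi in a throwaway container. A minimal sketch: gpu_smoke_test is a hypothetical helper name, the default image `ubuntu` is an assumption (any image available on the host should do), and the `--gpus` flag requires Docker 19.03 or newer.

```shell
# gpu_smoke_test [IMAGE]: run nvidia-smi inside a temporary container.
# IMAGE defaults to "ubuntu" (an assumption, not something this guide prescribes).
gpu_smoke_test() {
    docker run --rm --gpus all "${1:-ubuntu}" nvidia-smi
}
```

Call it as `gpu_smoke_test` on the host. If the toolkit is working, the container should print the usual nvidia-smi table. An error mentioning nvidia-container-runtime-hook suggests the toolkit is not installed.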
- Error 1:
Running:
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
produces:
Unsupported distribution!
Check https://nvidia.github.io/nvidia-docker
Fix:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
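Both the failing command and the fix build their URL from `$distribution`, so it helps to see what that variable expands to on the machine in question. The sketch below wraps the same `$ID$VERSION_ID` expression in a function; distribution_id is a hypothetical name, and the optional file argument exists only so the function can be tried against a sample file.

```shell
# distribution_id [FILE]: print $ID$VERSION_ID from an os-release style file.
# FILE defaults to /etc/os-release. The subshell keeps ID and VERSION_ID
# from leaking into the calling shell.
distribution_id() {
    ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}
```

On Ubuntu 18.04, for example, `distribution_id` prints `ubuntu18.04`. If the value is empty or unexpected, the repository URL built from it will not resolve to a valid list file.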
Workaround for Docker versions older than 19
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
# Install nvidia-docker
sudo apt-get install nvidia-docker
# Restart Docker
sudo systemctl restart docker
- Note: run these commands on the host, outside any container
Offline installation of nvidia-container-toolkit on Ubuntu
Package dependencies:
├─ nvidia-container-toolkit (version)
│ ├─ libnvidia-container-tools (>= version)
│ └─ nvidia-container-toolkit-base (version)
│
├─ libnvidia-container-tools (version)
│ └─ libnvidia-container1 (>= version)
└─ libnvidia-container1 (version)
- Installation order:
- libnvidia-container1
- libnvidia-container-tools
- nvidia-container-toolkit
Getting the packages
- Download the following packages
libnvidia-container1_1.9.0-1_amd64.deb
libnvidia-container-tools_1.9.0-1_amd64.deb
nvidia-container-toolkit_1.9.0-1_amd64.deb
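Installing the packages
- Upload the packages to the server
- cd into the directory that contains the packages
- Install them with the commands below (the order matters)
# Install libnvidia-container1 first
sudo dpkg -i ./libnvidia-container1_1.9.0-1_amd64.deb
# Then install libnvidia-container-tools
sudo dpkg -i ./libnvidia-container-tools_1.9.0-1_amd64.deb
# Finally install nvidia-container-toolkit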
sudo dpkg -i ./nvidia-container-toolkit_1.9.0-1_amd64.deb
- Restart the Docker service
# Restart Docker
sudo systemctl restart docker
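Because a missing file or a wrong order makes dpkg fail partway through, the three installs can be wrapped in a small script that checks the files first. This is a sketch under assumptions: install_offline and the DRY_RUN switch are hypothetical names introduced here, and the version string 1.9.0-1 matches the file names listed above.

```shell
# install_offline [DIR]: install the three .deb files from DIR in dependency order.
# Set DRY_RUN=1 to print the dpkg commands instead of running them.
install_offline() {
    dir=${1:-.}
    version=1.9.0-1
    packages="libnvidia-container1 libnvidia-container-tools nvidia-container-toolkit"

    # First pass: refuse to start unless every file is present.
    for pkg in $packages; do
        deb="$dir/${pkg}_${version}_amd64.deb"
        if [ ! -f "$deb" ]; then
            echo "missing: $deb" >&2
            return 1
        fi
    done

    # Second pass: install in the order listed, stopping at the first failure.
    for pkg in $packages; do
        deb="$dir/${pkg}_${version}_amd64.deb"
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "sudo dpkg -i $deb"
        else
            sudo dpkg -i "$deb" || return 1
        fi
    done
}
```

Running `DRY_RUN=1 install_offline .` in the package directory shows the commands without touching the system. Drop DRY_RUN to perform the install, then restart Docker as above.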
Uninstalling nvidia-container-toolkit
- Uninstall commands
sudo apt remove nvidia-container-toolkit
sudo apt remove libnvidia-container-tools
sudo apt remove libnvidia-container1
- Error shown when nvidia-container-toolkit is not installed
docker: Error response from daemon: exec: "nvidia-container-runtime-hook": executable file not found in $PATH.
ERRO[0000] error waiting for container: context canceled
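The message says Docker could not find the nvidia-container-runtime-hook executable, so a direct check is to look for that name on PATH. A small sketch; hook_installed is a hypothetical helper name:

```shell
# hook_installed: succeeds when nvidia-container-runtime-hook is found on PATH.
hook_installed() {
    command -v nvidia-container-runtime-hook >/dev/null 2>&1
}

if hook_installed; then
    echo "nvidia-container-runtime-hook found"
else
    echo "nvidia-container-runtime-hook not found: install nvidia-container-toolkit"
fi
```

Run it on the host, not inside a container. Docker looks the hook up on the host.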
Other errors
E: Conflicting values set for option Signed-By regarding source https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64/ /: /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg !=
E: The list of sources could not be read.
# Fix: back up the sources first
sudo cp /etc/apt/sources.list /etc/apt/sources.list.backup
sudo cp -r /etc/apt/sources.list.d/ /etc/apt/sources.list.d.backup
# Remove the NVIDIA-related source list files and the signing key
sudo rm /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
sudo rm /etc/apt/sources.list.d/*nvidia*
# Recreate the NVIDIA source list file
echo "deb https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64/ /" | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Update the package index
sudo apt update
# Reinstall
sudo apt-get install nvidia-docker
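This error typically means the same repository URL is listed more than once with different signed-by settings, for example once in an older nvidia-docker.list and once in nvidia-container-toolkit.list. Before deleting anything, the entries involved can be listed. A sketch; list_nvidia_sources is a hypothetical helper name and /etc/apt is the assumed default search location:

```shell
# list_nvidia_sources [DIR]: print every line in *.list files under DIR that
# mentions nvidia.github.io, prefixed with its file name and line number.
list_nvidia_sources() {
    find "${1:-/etc/apt}" -name '*.list' -exec grep -Hn 'nvidia\.github\.io' {} + 2>/dev/null
}
```

If `list_nvidia_sources` shows the same URL in two files, one with `[signed-by=...]` and one without, removing the stale file may be enough to let `apt update` read the sources again.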