
Deploying a paddlepaddle-gpu Environment on a GPU Instance

Last updated: 2024.01.02 15:47:08

First published: 2024.01.02 15:47:08

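This article describes how to deploy a Paddle deep learning environment on a GPU instance.

Preface

Deploy a Paddle deep learning environment on an ECS GPU instance.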

About this lab

Estimated time: 20 minutes
Level: Beginner
Related product: ECS
Audience: General

Environment

The configuration tested in this article is as follows:
Instance type: ecs.pni2.3xlarge
GPU type: Tesla A100 80G
GPU memory: 81920MiB
Instance image: velinux - 1.0 with GPU Driver
NVIDIA-SMI: 470.57.02
NVIDIA Driver version: 470.57.02
CUDA version (reported by nvidia-smi): 11.4
CUDA Toolkit version: 11.2
Python version: Python 3.7.3
paddlepaddle-gpu version: 2.3.0.post112

Install dependencies

apt update
apt-get install libjpeg-dev zlib1g-dev python3-pip lrzsz zip -y
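
After the packages are installed, it can help to confirm that the command-line tools used later in this article are actually on the PATH. The following is a minimal sketch, not part of the original procedure; the list of command names is an assumption based on the commands that appear in the later steps.

```python
import shutil
from typing import Iterable, List


def missing_commands(commands: Iterable[str]) -> List[str]:
    """Return the commands that cannot be found on the current PATH."""
    return [name for name in commands if shutil.which(name) is None]


# Assumed list: tools that the later steps of this article call.
print(missing_commands(["pip3", "wget", "tar", "unzip"]))
```

An empty list means every tool was found; any name that is printed still needs to be installed.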

Install the driver

Install the CUDA driver

Reference: official documentation, "Install the GPU driver"

Install the CUDA Toolkit

Download the CUDA Toolkit

mkdir /home/cuda-toolkit
cd /home/cuda-toolkit
wget https://developer.download.nvidia.com/compute/cuda/11.2.0/local_installers/cuda_11.2.0_460.27.04_linux.run

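Run the CUDA Toolkit installer

sudo sh cuda_11.2.0_460.27.04_linux.run

In the installer menu, use the arrow keys and the space bar to deselect every component except CUDA Toolkit 11.2.

The installation is complete when output like the following appears: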

===========
= Summary =
===========
Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-11.2/
Samples:  Not Selected
Please make sure that
 -   PATH includes /usr/local/cuda-11.2/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-11.2/lib64, or, add /usr/local/cuda-11.2/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.2/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 460.00 is required for CUDA 11.2 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver
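
The warning in the summary states that CUDA 11.2 needs a driver of version 460.00 or later, while the environment in this article runs driver 470.57.02. As a rough sketch of that comparison (the helper names are hypothetical, and plain dotted numeric versions are assumed):

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '470.57.02' into (470, 57, 2)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_satisfies(installed: str, minimum: str) -> bool:
    """Return True when the installed driver is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


# Values from this article: installed driver 470.57.02, required minimum 460.00.
print(driver_satisfies("470.57.02", "460.00"))  # prints True
```

Comparing integer tuples avoids the pitfalls of comparing version strings character by character.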

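Configure environment variables

vim ~/.bashrc
# Append the following two lines to ~/.bashrc, then save and exit:
export PATH=$PATH:/usr/local/cuda-11.2/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.2/lib64
# Reload the shell configuration:
source ~/.bashrc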
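
To confirm that the two variables took effect in the current shell, a small check such as the one below can be run with python3. This is a sketch only; the function name is hypothetical and the install prefix /usr/local/cuda-11.2 is taken from the installer summary in this article.

```python
import os
from typing import List, Mapping

CUDA_HOME = "/usr/local/cuda-11.2"  # install prefix reported by the installer


def missing_cuda_paths(env: Mapping[str, str]) -> List[str]:
    """List the variables that do not yet contain the expected CUDA directory."""
    expected = {
        "PATH": CUDA_HOME + "/bin",
        "LD_LIBRARY_PATH": CUDA_HOME + "/lib64",
    }
    missing = []
    for name, directory in expected.items():
        # Split on ':' and ignore empty entries and trailing slashes.
        entries = [e.rstrip("/") for e in env.get(name, "").split(":") if e]
        if directory not in entries:
            missing.append(name)
    return missing


print(missing_cuda_paths(os.environ))
```

An empty list means both directories are present; otherwise the printed names are the variables that still need the CUDA entry.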

Verify the installation

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
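
If this check needs to be scripted, the release number can be pulled out of the nvcc -V output with a regular expression. The sketch below uses the output shown above as sample input; the function name is hypothetical.

```python
import re
from typing import Optional

# Sample text copied from the nvcc -V output in this article.
SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
"""


def nvcc_release(output: str) -> Optional[str]:
    """Extract the 'release X.Y' value from nvcc -V output, or None if absent."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


print(nvcc_release(SAMPLE))  # prints 11.2
```

On a live system, the same function could be given the captured output of nvcc -V instead of the sample text.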

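Install cuDNN

Reference documentation

NVIDIA cuDNN documentation

NVIDIA cuDNN official download page

Download the cuDNN package that matches the installed CUDA Toolkit version.

Downloading cuDNN from the NVIDIA website requires registering, logging in, and agreeing to join the developer community.

Download the cuDNN package to the local machine, then copy it to the target GPU server with scp.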

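# On the GPU server: create the target directory
mkdir /home/cudnn
# On the local machine: copy the archive to the GPU server
scp ~/Download/NVIDIA/cudnn-11.2-linux-x64-v8.1.1.33.tgz root@180.184.86.27:/home/cudnn/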
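
Because the cuDNN package has to match the CUDA Toolkit version, a quick look at the archive name before copying can catch a wrong download. The sketch below assumes that archive names follow the pattern of the single file name used in this article (cudnn-<cuda>-linux-x64-v<cudnn>.tgz); other naming schemes are not handled, and the function name is hypothetical.

```python
import re
from typing import Optional, Tuple


def parse_cudnn_archive(filename: str) -> Optional[Tuple[str, str]]:
    """Return (cuda_version, cudnn_version) parsed from the archive name,
    or None if the name does not follow the assumed pattern."""
    match = re.match(r"cudnn-(\d+\.\d+)-linux-x64-v([\d.]+)\.tgz$", filename)
    if match is None:
        return None
    return match.group(1), match.group(2)


cuda, cudnn = parse_cudnn_archive("cudnn-11.2-linux-x64-v8.1.1.33.tgz")
print(cuda == "11.2", cudnn)  # CUDA part should equal the toolkit version
```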

Install and configure

cd /home/cudnn
tar zxf cudnn-11.2-linux-x64-v8.1.1.33.tgz
# The archive unpacks into a cuda/ directory; copy all cuDNN headers and libraries
sudo cp cuda/include/cudnn*.h /usr/local/cuda-11.2/include/
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda-11.2/lib64/
sudo chmod a+r /usr/local/cuda-11.2/include/cudnn*.h /usr/local/cuda-11.2/lib64/libcudnn*
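
One way to double-check which cuDNN version was unpacked is to read the version macros from the cudnn_version.h header that cuDNN 8 ships in its include directory. The sketch below parses header text; the sample is a hand-written excerpt with only the three macros (values chosen to match the v8.1.1 archive in this article), and the function name is hypothetical.

```python
import re
from typing import Optional

# Hand-written excerpt for illustration; a real header contains more lines.
SAMPLE_HEADER = """
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 1
"""


def cudnn_header_version(header_text: str) -> Optional[str]:
    """Build 'major.minor.patch' from the CUDNN_* macros, or None if any is missing."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+" + macro + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


print(cudnn_header_version(SAMPLE_HEADER))  # prints 8.1.1
```

In practice the text would be read from whichever copy of cudnn_version.h is available, for example the one in the include directory of the extracted archive.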

Install Paddle

mkdir /home/paddle
cd /home/paddle

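Install dependencies (optional)

Network conditions can make some Python modules slow to download while Paddle is being installed. To avoid this, install them separately in advance.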

# Cython
# Version: 0.29.28
wget https://files.pythonhosted.org/packages/cb/da/54a5d7a7d9afc90036d21f4b58229058270cc14b4c81a86d9b2c77fd072e/Cython-0.29.28.tar.gz
tar zxf Cython-0.29.28.tar.gz
cd Cython-0.29.28/
python3 setup.py install
cd ..

# numpy
# Version: 1.21.6
wget https://files.pythonhosted.org/packages/45/b7/de7b8e67f2232c26af57c205aaad29fe17754f793404f59c8a730c7a191a/numpy-1.21.6.zip
unzip numpy-1.21.6.zip
cd numpy-1.21.6/
python3 setup.py install
cd ..

# Pillow
# Version: 9.2.0
wget https://files.pythonhosted.org/packages/8c/92/2975b464d9926dc667020ed1abfa6276e68c3571dcb77e43347e15ee9eed/Pillow-9.2.0.tar.gz
tar zxf Pillow-9.2.0.tar.gz
cd Pillow-9.2.0/
python3 setup.py install
cd ..

# protobuf
# Version: 3.10.0
wget https://files.pythonhosted.org/packages/12/b9/e7c6a58613c9fe724d1ff9f2353fa48901e6b1b99d0ba64c36a8de2cfa45/protobuf-3.10.0.tar.gz
tar zxf protobuf-3.10.0.tar.gz
cd protobuf-3.10.0/
python3 setup.py install
cd ..
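
Once the optional modules are installed, their versions can be compared with the ones pinned above. The following is a sketch under two assumptions: each package exposes a __version__ attribute, and the import names are Cython, numpy, PIL (for Pillow), and google.protobuf. The helper names are hypothetical.

```python
import importlib
from typing import Dict, Optional

# Versions pinned in this article, keyed by import name (not PyPI name).
PINNED = {
    "Cython": "0.29.28",
    "numpy": "1.21.6",
    "PIL": "9.2.0",
    "google.protobuf": "3.10.0",
}


def installed_version(module_name: str) -> Optional[str]:
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def version_report(pinned: Dict[str, str]) -> Dict[str, bool]:
    """Map each module name to True when the installed version equals the pin."""
    return {name: installed_version(name) == wanted for name, wanted in pinned.items()}


print(version_report(PINNED))
```

A False entry means the module is either missing or installed at a different version than the one listed in this article.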

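Install the Paddle package

Based on the version support listed in the official Paddle documentation, choose the Paddle release that matches the installed CUDA Toolkit version.
Then choose the whl package that matches the current Python version:
https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html

# This environment uses Python 3.7.3 and CUDA Toolkit 11.2, with Paddle 2.3.0 as the example. Because the Paddle package is large, download the whl file and install it from the local copy: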

wget https://paddle-wheel.bj.bcebos.com/2.3.0/linux/linux-gpu-cuda11.2-cudnn8-mkl-gcc8.2-avx/paddlepaddle_gpu-2.3.0.post112-cp37-cp37m-linux_x86_64.whl
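
The wheel file name encodes both the CUDA version and the Python version it targets, so it can be checked before installation. The sketch below rests on assumptions drawn only from the file name in this article: the post tag is the CUDA version without the dot (post112 for CUDA 11.2) and cp37 stands for CPython 3.7. The function names are hypothetical.

```python
import re
import sys
from typing import Dict, Optional, Sequence

WHEEL = "paddlepaddle_gpu-2.3.0.post112-cp37-cp37m-linux_x86_64.whl"


def parse_paddle_wheel(filename: str) -> Optional[Dict[str, str]]:
    """Split a paddlepaddle_gpu wheel name into its version-related parts."""
    match = re.match(
        r"paddlepaddle_gpu-(?P<version>[\d.]+)\.post(?P<post>\d+)"
        r"-cp(?P<py>\d+)-(?P<abi>\w+)-(?P<platform>\w+)\.whl$",
        filename,
    )
    if match is None:
        return None
    info = match.groupdict()
    # Assumption: last digit of the post tag is the CUDA minor version.
    info["cuda"] = info["post"][:-1] + "." + info["post"][-1]
    # cp37 -> "3.7": first digit is the major version, the rest is the minor.
    info["python"] = info["py"][0] + "." + info["py"][1:]
    return info


def fits_python(info: Dict[str, str], version_info: Optional[Sequence[int]] = None) -> bool:
    """Return True when the wheel targets the given (or running) Python version."""
    if version_info is None:
        version_info = sys.version_info
    return info["python"] == "{}.{}".format(version_info[0], version_info[1])


info = parse_paddle_wheel(WHEEL)
print(info["cuda"], info["python"], fits_python(info))
```

The last printed value is True only when the script runs under the Python version the wheel was built for.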

Run the installation

pip3 install paddlepaddle_gpu-2.3.0.post112-cp37-cp37m-linux_x86_64.whl
Successfully installed astor-0.8.1 decorator-5.1.1 paddlepaddle-gpu-2.3.0.post112

Verify the installation

python3
>>> import paddle
>>> paddle.utils.run_check()

Running verify PaddlePaddle program ... 
W0921 17:54:07.119729  4254 gpu_context.cc:278] Please NOTE: device: 0, GPU Compute Capability: 8.0, Driver API Version: 11.4, Runtime API Version: 11.2
W0921 17:54:07.122442  4254 gpu_context.cc:306] device: 0, cuDNN Version: 8.1.
PaddlePaddle works well on 1 GPU.
PaddlePaddle works well on 1 GPUs.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
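
Beyond run_check, the CUDA and cuDNN versions that the installed Paddle build reports can be printed and compared with the environment table at the top of this article. This is a sketch that assumes the Paddle 2.x calls paddle.is_compiled_with_cuda(), paddle.version.cuda(), paddle.version.cudnn(), and paddle.device.get_device() are available in the installed release; the function name is hypothetical, and it returns None when Paddle cannot be imported.

```python
from typing import Dict, Optional


def paddle_build_info() -> Optional[Dict[str, str]]:
    """Collect version details from the installed Paddle build, or None if absent."""
    try:
        import paddle
    except ImportError:
        return None
    return {
        "paddle": str(paddle.__version__),
        "compiled_with_cuda": str(paddle.is_compiled_with_cuda()),
        "cuda": str(paddle.version.cuda()),
        "cudnn": str(paddle.version.cudnn()),
        "device": str(paddle.device.get_device()),
    }


print(paddle_build_info())
```

On the environment described here, the cuda entry would be expected to agree with the CUDA Toolkit version 11.2 and the device entry to name a GPU rather than the CPU.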