PyTorch AMD GPU Windows: your trusty helper for Windows

When you start learning or using machine learning, you look for ways to put the hardware you already own to work and keep the cost of entry down. In particular, owners of fairly powerful older AMD cards (such as the AMD Fury), which comfortably run demanding games like Cyberpunk 2077 or Atomic Heart, find that these GPUs are useless for PyTorch and other machine learning frameworks. Even the latest AMD 7900-series cards work with PyTorch only under Linux. There are also less common cards from other vendors, such as Intel Arc or Chinese brands, that one would like to use for machine learning.

In this article I describe an approach that can help in some cases. It is rough around the edges, but I have not found any other working option for PyTorch. The subject is pytorch_dlprim, a project by the Israeli developer Artyom Beilis (Tonkikh).

Background of the pytorch_dlprim project

Artyom Beilis is well known in narrow circles for having written a website CMS and a template engine in pure C++. The most frequent question in that project's FAQ is: "Are you crazy or a masochist?"

Like many others, he was frustrated that good (but old, pre-ROCm) hardware from AMD and other vendors cannot be used with modern machine learning frameworks, even though it supports general-purpose computing through OpenCL. To some extent this is a universal problem: it also affects Nvidia, which does not release the newest CUDA versions for cards of previous generations.

AMD did not solve the problem with ROCm and ROCm HIP (a CUDA analogue) either, because each new ROCm release dropped support for some earlier generation of cards. At the time of writing, ROCm 6.2 supports the Radeon 7900, the Radeon VII, the MI100, MI210, MI250 and MI300 accelerators, and some PRO cards. That is very sparse coverage.

Why bring up ROCm at all? Because PyTorch recently started working with ROCm.

But ROCm supports very few AMD cards, and only under Linux.

What pytorch_dlprim does

Back to Artyom Beilis. This titan of thought (no sarcasm intended) set out to make every OpenCL-capable device work with machine learning frameworks.

The idea is this: OpenCL is very similar to CUDA. It has the same C-like syntax and similar goals. So Artyom reasoned that if a device supports OpenCL, it can be used for machine learning. It does not matter what the device is: an FPGA, a graphics card, a specialized board or a CPU. What matters is that an OpenCL driver exists for it.

Project history

Armed with remarkable fanaticism, he has been pushing this Sisyphean boulder for several years since 2021. Alone. And he did produce the DLPrimitives library, short for "Deep Learning Primitives". The library is written in C++ and implements standard DL functions in C++ and OpenCL.

Artyom then wrote a PyTorch backend called pytorch_dlprim, plus several other backends: one for Caffe and even initial support for TensorFlow.

Naturally, a period of such fanaticism cannot last forever; I assume Artyom needs to earn a living and feed his family. And, as far as I can tell, AMD's management did not realize that hiring one guy would solve a pile of problems with support for old cards. So these days Artyom returns to the project roughly once every six months.

The author of this article has also contributed a large patch to Artyom's project, and had to wait 4 months for it to be committed. The code lives in Artyom's repository. At the end of the article I also give my own repository in case Artyom abandons development. For now, use only his repository; mine is currently out of date.

Still, for many situations what Artyom has already built does the job. In other words, as long as you avoid rarely used PyTorch functions, old AMD cards can be used to train and test neural networks in PyTorch.

Installing pytorch_dlprim on Windows

Naturally, the OpenCL drivers for your graphics card must already be installed. This often happens automatically, so as a rule you do not need to install them separately.

To spare Windows users the pain of compiling, Artyom recently started publishing binary releases in whl format. These releases work starting with PyTorch 2.4.0.

Download the whl file for your Python version from the releases page, then install it with a command like this:

pip install pytorch_ocl-0.1.0+torch2.4-cp312-none-win_amd64.whl
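The wheel file name encodes which interpreter it targets, so a mismatch with the local Python is a likely reason for pip to reject the file. The sketch below assumes the standard wheel naming scheme (name, version, Python tag, ABI tag, platform tag); the helper names are hypothetical and not part of pytorch_dlprim.

```python
import sys


def wheel_python_tag(filename: str) -> str:
    """Return the Python tag (for example 'cp312') from a wheel file name."""
    # Wheel names end with: -<python tag>-<abi tag>-<platform tag>.whl
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    return stem.split("-")[-3]


def matches_interpreter(filename: str, version_info=sys.version_info) -> bool:
    """True if the wheel's CPython tag equals the given interpreter version."""
    expected = f"cp{version_info[0]}{version_info[1]}"
    return wheel_python_tag(filename) == expected


wheel = "pytorch_ocl-0.1.0+torch2.4-cp312-none-win_amd64.whl"
print(wheel_python_tag(wheel))
print(matches_interpreter(wheel, (3, 12)))
```

Calling `matches_interpreter(wheel)` without the second argument compares against the interpreter that runs the script.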

In your code it is then enough to write:

import torch
import pytorch_ocl
# The number after ocl can vary, but is most likely 0 or 1. It depends on the number of supported OpenCL platforms
device = torch.device("ocl:1")

That is all you need to start experimenting. However, if you work with PyTorch 1.13, you will still have to compile.
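If the same script has to run on machines with and without the backend, one option is to fall back to the CPU. This is a minimal sketch, not part of pytorch_dlprim: the helper name `choose_device` is made up, and it only checks that the `pytorch_ocl` module can be found, not that an OpenCL device actually works.

```python
import importlib.util


def choose_device(platform_index: int = 0, backend_module: str = "pytorch_ocl") -> str:
    """Return 'ocl:<index>' if the backend module is importable, else 'cpu'."""
    if importlib.util.find_spec(backend_module) is None:
        return "cpu"
    return f"ocl:{platform_index}"


print(choose_device(1))
```

The returned string can be passed to `torch.device()` once `torch` and, in the `ocl` case, `pytorch_ocl` have been imported.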

How I ran PyTorch under Linux on an AMD Radeon Fury and a Radeon R9 280X

In general, OpenCL support for old AMD cards under Linux is in poor shape. There are several OpenCL implementations, and most of them are no longer maintained or have been pulled from public access. There is even a separate project, I-love-compute, that helps make sense of this mess. This applies both to old AMD cards and to Nvidia cards.

Sometimes I-love-compute manages to enable OpenCL on old cards under new distributions, and sometimes it does not.

What helped me was Mesa, which provides open-source OpenCL drivers. Mesa uses LLVM to generate GPU code, and at some point LLVM gained the amdgpu backend, which generates code for GPUs from the r600 and GCN 1.0 architectures onward.

Better late than never, as they say.

▍ Enabling rusticl support in Mesa 3D

OpenCL 3.0 support for old cards is implemented in Mesa by a module written in Rust called rusticl. Sometimes it has to be enabled explicitly. If it is already active, nothing needs to be done. Check like this:

clinfo

If the output contains something like this:

Platform Name                                   rusticl
Number of devices                                 1
  Device Name                                     AMD Radeon R9 Fury Series (radeonsi, fiji, LLVM 17.0.6, DRM 3.57, 6.8.9-calculate)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 3.0

then rusticl is already active, and you only need to note the platform number. On my machine rusticl is the second platform in the clinfo output; numbering starts at zero, so the rusticl platform number is 1.
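Counting platforms by eye is error-prone when several OpenCL implementations are installed. The sketch below parses the compact listing that `clinfo -l` prints, assuming each platform appears on a line of the form `Platform #N: name`. The sample text and the helper name are illustrative only; in particular, the article does not say which platform occupies slot 0 on the author's machine.

```python
import re


def find_platform(listing: str, name: str = "rusticl"):
    """Return the index of the first platform whose name contains `name`, else None."""
    pattern = r"^Platform #(\d+):\s*(.+)$"
    for match in re.finditer(pattern, listing, flags=re.MULTILINE):
        if name.lower() in match.group(2).lower():
            return int(match.group(1))
    return None


# Made-up listing used only to exercise the parser.
sample = """Platform #0: Clover
Platform #1: rusticl
"""
print(find_platform(sample))
```

The number found this way is the one to put after `ocl:` or `privateuseone:` in the device string.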

If rusticl is missing from the list of supported platforms, export the variable

RUSTICL_ENABLE=radeonsi

Here the value is radeonsi, the Mesa driver for AMD cards starting with the Southern Islands family; for other hardware the driver name will be different.

Then run the check like this:

RUSTICL_ENABLE=radeonsi clinfo

If the rusticl platform has appeared, all is well. You can then put this environment variable in a file that sets it at startup.

For me that is the file radeonsi.sh in /etc/profile.d:

export RUSTICL_ENABLE=radeonsi

▍ Compiling pytorch_dlprim

Strictly speaking this is now optional, since the project ships whl files, although those will always lag behind the current code. But if you use PyTorch 1.13, there is no way around compiling.

There is nothing difficult about it, though. Clone from GitHub and compile. You need to be inside a Python virtual environment.

Activate the environment:

. your_path_ve/bin/activate

Compile:

# The dlprimitives project is nested as a submodule, hence --recurse-submodules
git clone --recurse-submodules https://github.com/artyom-beilis/pytorch_dlprim.git
cd pytorch_dlprim
mkdir build
cd build
cmake -DCMAKE_PREFIX_PATH=$VIRTUAL_ENV/lib/python3.10/site-packages/torch/share/cmake/Torch ..
make

Warnings may appear during compilation; ignore them. Compilation takes several minutes, which is normal: C++ compiles more slowly than C.
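The CMAKE_PREFIX_PATH in the build command hard-codes python3.10, which breaks as soon as the virtual environment uses another Python version. A minimal sketch of computing the path instead, assuming PyTorch was installed with pip into the active environment and keeps its CMake files under torch/share/cmake/Torch as in the command above (the helper name is hypothetical):

```python
import sysconfig
from pathlib import Path


def torch_cmake_dir(site_packages=None) -> Path:
    """Return the expected location of PyTorch's CMake package files."""
    # Default to the site-packages directory of the running interpreter.
    base = Path(site_packages) if site_packages else Path(sysconfig.get_paths()["purelib"])
    return base / "torch" / "share" / "cmake" / "Torch"


print(torch_cmake_dir())
```

Run inside the activated environment, the printed path can be passed to cmake as `-DCMAKE_PREFIX_PATH=...`.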

To speed things up, you can replace make with:

# If you have 12 logical or physical cores
make -j12

With PyTorch 1.13.1 you get the library libpt_ocl.so. Mine came out at 32 MB. You can put it, for example, in /usr/local/lib64.

With PyTorch 2.4.* you instead get a pytorch_ocl folder containing a library of about 50 MB and its Python wrapper. When using a virtual environment, copy pytorch_ocl into the site-packages folder afterwards.
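The copy step for PyTorch 2.4 can be scripted. The sketch below is an assumption-laden illustration: it supposes the pytorch_ocl folder ends up directly inside the build directory (the article does not state its exact location), and `install_backend` is a made-up helper. The demonstration runs on temporary directories so that it does not touch a real environment.

```python
import shutil
import sysconfig
import tempfile
from pathlib import Path


def install_backend(build_dir, site_packages=None) -> Path:
    """Copy build_dir/pytorch_ocl into site-packages and return the new path."""
    source = Path(build_dir) / "pytorch_ocl"
    target_root = Path(site_packages or sysconfig.get_paths()["purelib"])
    target = target_root / "pytorch_ocl"
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target


# Demonstration on throwaway directories instead of a real build tree.
with tempfile.TemporaryDirectory() as tmp:
    build = Path(tmp) / "build"
    (build / "pytorch_ocl").mkdir(parents=True)
    (build / "pytorch_ocl" / "__init__.py").write_text("# placeholder\n")
    site = Path(tmp) / "site-packages"
    site.mkdir()
    installed = install_backend(build, site)
    copied_ok = (installed / "__init__.py").is_file()
print(copied_ok)
```

Called with only the build directory, the helper targets the site-packages folder of the interpreter that runs it, which is the virtual environment when one is active.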

Next, go up one level and run the tests mnist.py and test.py. For them to work, the correct platform number must be supplied in the test code or on the command line. I had to change 0 to 1.

python mnist.py --device ocl:1
# For PyTorch 1.13.1 I changed privateuseone:0 to privateuseone:1 in the code
python test.py

The mnist.py script may crash with an error; if it does, report it to Artyom. test.py, on the other hand, runs fine for me.

▍ Connecting pytorch_dlprim to PyTorch 1.13

Then, in your *.py files, load the new backend like this:

import torch
torch.ops.load_library("/usr/local/lib64/libpt_ocl.so")
# the platform number may differ
device = torch.device("privateuseone:1")

▍ Connecting pytorch_dlprim to PyTorch 2.4

import torch
import pytorch_ocl
# The number after ocl can vary, but is most likely 0 or 1. Check with clinfo; details are given earlier in the article.
device = torch.device("ocl:1")
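The two variants differ only in the device prefix, so a script that must support both PyTorch generations can derive the device string from the version. This is a sketch with a hypothetical helper name; the article confirms `privateuseone` for 1.13 and `ocl` for 2.4, while the treatment of versions in between is an assumption.

```python
def ocl_device_name(torch_version: str, platform_index: int) -> str:
    """Build the pytorch_dlprim device string from a PyTorch version string."""
    # Drop local build suffixes such as '+cpu' before parsing major.minor.
    release = torch_version.split("+")[0]
    major, minor = (int(part) for part in release.split(".")[:2])
    # Per the article: 1.13 uses 'privateuseone', 2.4 uses 'ocl'.
    # Mapping everything below 2.4 to 'privateuseone' is an assumption.
    prefix = "ocl" if (major, minor) >= (2, 4) else "privateuseone"
    return f"{prefix}:{platform_index}"


print(ocl_device_name("1.13.1", 1))
print(ocl_device_name("2.4.0+cpu", 1))
```

In a real script the first argument would typically be `torch.__version__`.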

Watching GPU load while working with neural networks

There are two utilities, an old one and a new one: radeontop is the older, amdgpu_top the newer. Install them through your distribution's package manager or build them from source.

[Screenshot: amdgpu_top output with no load]

[Screenshot: amdgpu_top output while training a neural network]

What pytorch_dlprim can and cannot do

On the project page Artyom writes that he has tested many neural networks from PyTorch and that most of them ran successfully.

I did not need the standard test networks, and I found that everything worked with basic functions but not when building elaborate pipelines with PyTorch.

For example, this did not work because torch.index_select and torch.logical_xor were not supported:

# Find the index with the maximum value; if it matches, count it as a correct guess.
correct_curr = torch.index_select(y, 1, torch.tensor(0).to(device)).squeeze(1).logical_xor(pred.argmax(1)).type(torch.int).sum().item()

But once I came up with a simpler equivalent, pytorch_dlprim handled it:

correct_curr = pred.argmax(1).eq(y.argmax(1)).int().sum()

In other words, if you stick to basic functions, old GPUs can be genuinely useful for DL in PyTorch.

Different OpenCL implementations

Over the years AMD has produced several OpenCL implementations. They are not easy to use, but in some cases they can run faster than rusticl. They are also far more temperamental: library conflicts are possible, and something that used to work can stop working after an update. Ideally they would be packaged into Docker images; for now there is only the I-love-compute project.

Mesa also has an older OpenCL project, Clover. For some old hardware it may still be relevant.

The beauty of Artyom Beilis's project is that a device only has to support OpenCL to be integrated with PyTorch; what kind of device it is does not matter. He has written that people have used his project with Intel graphics cards and even FPGAs.

Compatible PyTorch versions

I used pytorch_dlprim with PyTorch 1.13, because 2.2.1 produced far too many warnings. Artyom says that version 2.4 already works properly. Versions 2.0-2.3 may have problems.

Project status

At the moment the project is under active development. Major changes that simplify installation have landed, and the latest PyTorch version, 2.4.0, is supported. Artyom is reworking the code so that the YOLO network, which performs real-time object detection and image segmentation, runs on pytorch_dlprim. He has also bought an Intel Arc card to improve support for that platform. There is a development blog (rarely updated).

However, history shows that Artyom can disappear for several months, during which your ticket will go unanswered. Right now he is in an active phase. So I ask everyone interested to test his project and send him bug reports while he has the drive.

I personally had to write several patches to use pytorch_dlprim in my own project. In case Artyom drops off the radar again, I leave a link to my repository, since its code may then be fresher than the original. That was the situation during the 4 months it took him to accept my large pull request.

The pytorch_dlprim project needs help from the community, because many simple functions are still unwritten. If you know at least a little OpenCL and have intermediate C++, you can make a real contribution.

Please test it and join the development!

To install PyTorch on AMD GPUs for Windows, you need to follow a series of steps that ensure you have the right environment and dependencies set up. First, ensure that you have the latest version of the ROCm (Radeon Open Compute) platform installed, as it provides the necessary support for running PyTorch on AMD hardware.

Step 1: Install ROCm

You can download the ROCm installation package from the official AMD website. Follow the instructions provided there to install ROCm on your Windows system. Make sure to check the compatibility of your AMD GPU with the ROCm version you are installing.

Step 2: Download LibTorch

Once ROCm is installed, you can proceed to download the LibTorch distribution, which is the C++ distribution of PyTorch. The following commands download and unpack the latest nightly CPU build:

wget https://download.pytorch.org/libtorch/nightly/cpu/libtorch-shared-with-deps-latest.zip
unzip libtorch-shared-with-deps-latest.zip

Make sure to select the GPU-enabled version of LibTorch if you want to leverage the capabilities of your AMD GPU. You can find the appropriate link on the PyTorch website.

Step 3: Set Up Your Development Environment

For Windows developers who prefer not to use CMake, you can utilize Visual Studio. However, CMake is recommended for its ease of use and future support. If you choose to use CMake, create a basic CMakeLists.txt file to configure your project. Here’s a simple example:

cmake_minimum_required(VERSION 3.18 FATAL_ERROR)
project(example_app)
find_package(Torch REQUIRED)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${TORCH_CXX_FLAGS}")
add_executable(example_app example.cpp)
target_link_libraries(example_app "${TORCH_LIBRARIES}")
# Recent LibTorch releases require C++17
set_property(TARGET example_app PROPERTY CXX_STANDARD 17)

Step 4: Compile and Run Your Application

After setting up your project, you can compile your application using CMake. Navigate to your project directory and run the following commands:

mkdir build
cd build
cmake -DCMAKE_PREFIX_PATH=/absolute/path/to/libtorch ..
cmake --build . --config Release

Once the build process is complete, you can run your application. Ensure that your environment variables are set correctly to include the paths to the ROCm libraries and the LibTorch binaries.

By following these steps, you will have a working installation of PyTorch on your AMD GPU under Windows, allowing you to leverage the power of your hardware for deep learning tasks.

Installing on macOS

PyTorch can be installed and used on macOS. Depending on your system and GPU capabilities, your experience with PyTorch on a Mac may vary in terms of processing time.

Prerequisites

macOS Version

PyTorch is supported on macOS 10.15 (Catalina) or above.

Python

It is recommended that you use Python 3.9-3.12. You can install Python through the Anaconda package manager (see below), Homebrew, or the Python website.

Package Manager

To install the PyTorch binaries, you will need to use one of two supported package managers: pip or Anaconda.

Anaconda

To install Anaconda, you can download the graphical installer or use the command-line installer. If you use the command-line installer, you can right-click on the installer link and select Copy Link Address, or use the following commands on a Mac computer with Apple silicon:

# The version of Anaconda may be different depending on when you are installing
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
sh Miniconda3-latest-MacOSX-arm64.sh
# and follow the prompts. The defaults are generally good.

pip

Python 3

If you installed Python via Homebrew or the Python website, pip was installed with it. If you installed Python 3.x, then you will be using the command pip3.

Tip: If you want to use just the command pip, instead of pip3, you can symlink pip to the pip3 binary.

Installation

Anaconda

To install PyTorch via Anaconda, use the following conda command:

conda install pytorch torchvision -c pytorch

pip

To install PyTorch via pip, use the following command:

# Python 3.x
pip3 install torch torchvision

Verification

To ensure that PyTorch was installed correctly, we can verify the installation by running sample PyTorch code. Here we will construct a randomly initialized tensor.

import torch
x = torch.rand(5, 3)
print(x)

The output should be something similar to:

tensor([[0.3380, 0.3845, 0.3217],
        [0.8337, 0.9050, 0.2650],
        [0.2979, 0.7141, 0.9069],
        [0.1449, 0.1132, 0.1375],
        [0.4675, 0.3947, 0.1426]])

Building from source

For the majority of PyTorch users, installing from a pre-built binary via a package manager will provide the best experience. However, there are times when you may want to install the bleeding edge PyTorch code, whether for testing or actual development on the PyTorch core. To install the latest PyTorch code, you will need to build PyTorch from source.

Prerequisites

[Optional] Install Anaconda
Follow the steps described here: https://github.com/pytorch/pytorch#from-source

You can verify the installation as described above.

Installing on Linux

PyTorch can be installed and used on various Linux distributions. Depending on your system and compute requirements, your experience with PyTorch on Linux may vary in terms of processing time. It is recommended, but not required, that your Linux system has an NVIDIA or AMD GPU in order to harness the full power of PyTorch’s CUDA support or ROCm support.

Prerequisites

Supported Linux Distributions

PyTorch is supported on Linux distributions that use glibc >= v2.17, which include the following:

Arch Linux, minimum version 2012-07-15
CentOS, minimum version 7.3-1611
Debian, minimum version 8.0
Fedora, minimum version 24
Mint, minimum version 14
OpenSUSE, minimum version 42.1
PCLinuxOS, minimum version 2014.7
Slackware, minimum version 14.2
Ubuntu, minimum version 13.04

The install instructions here will generally apply to all supported Linux distributions. An example difference is that your distribution may support yum instead of apt. The specific examples shown were run on an Ubuntu 18.04 machine.

Python

Python 3.9-3.12 is generally installed by default on any of our supported Linux distributions, which meets our recommendation.

Tip: By default, you will have to use the command python3 to run Python. If you want to use just the command python, instead of python3, you can symlink python to the python3 binary.

However, if you want to install another version, there are multiple ways:

APT
Python website

If you decide to use APT, install Python with it before continuing.

If you use Anaconda to install PyTorch, it will install a sandboxed version of Python that will be used for running PyTorch applications.

Package Manager

To install the PyTorch binaries, you will need to use one of two supported package managers: Anaconda or pip. Anaconda is the recommended package manager as it will provide you all of the PyTorch dependencies in one, sandboxed install, including Python.

Anaconda

To install Anaconda, you will use the command-line installer. Right-click on the 64-bit installer link, select Copy Link Location, and then use the following commands:

# The version of Anaconda may be different depending on when you are installing
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh
# and follow the prompts. The defaults are generally good.

You may have to open a new terminal or re-source your ~/.bashrc to get access to the conda command.

pip

Python 3

While Python 3.x is installed by default on Linux, pip is not installed by default.

sudo apt install python3-pip

Tip: If you want to use just the command pip, instead of pip3, you can symlink pip to the pip3 binary.

Installation

Anaconda

No CUDA/ROCm

To install PyTorch via Anaconda when you do not have a CUDA-capable or ROCm-capable system, or do not require CUDA/ROCm (i.e. GPU support), go to the selector on the PyTorch Get Started page and choose OS: Linux, Package: Conda, Language: Python and Compute Platform: CPU.
Then, run the command that is presented to you.

With CUDA

To install PyTorch via Anaconda when you do have a CUDA-capable system, go to the selector on the PyTorch Get Started page and choose OS: Linux, Package: Conda and the CUDA version suited to your machine. Often, the latest CUDA version is better.
Then, run the command that is presented to you.

With ROCm

PyTorch via Anaconda is not supported on ROCm currently. Please use pip instead.

pip

No CUDA

To install PyTorch via pip when you do not have a CUDA-capable or ROCm-capable system, or do not require CUDA/ROCm (i.e. GPU support), go to the selector on the PyTorch Get Started page and choose OS: Linux, Package: Pip, Language: Python and Compute Platform: CPU.
Then, run the command that is presented to you.

With CUDA

To install PyTorch via pip when you do have a CUDA-capable system, go to the selector on the PyTorch Get Started page and choose OS: Linux, Package: Pip, Language: Python and the CUDA version suited to your machine. Often, the latest CUDA version is better.
Then, run the command that is presented to you.

With ROCm

To install PyTorch via pip when you do have a ROCm-capable system, go to the selector on the PyTorch Get Started page and choose OS: Linux, Package: Pip, Language: Python and the supported ROCm version.
Then, run the command that is presented to you.

Verification

To ensure that PyTorch was installed correctly, we can verify the installation by running sample PyTorch code. Here we will construct a randomly initialized tensor.

import torch
x = torch.rand(5, 3)
print(x)

The output should be something similar to:

tensor([[0.3380, 0.3845, 0.3217],
        [0.8337, 0.9050, 0.2650],
        [0.2979, 0.7141, 0.9069],
        [0.1449, 0.1132, 0.1375],
        [0.4675, 0.3947, 0.1426]])

Additionally, to check whether your GPU driver and CUDA/ROCm are enabled and accessible by PyTorch, run the following commands (the ROCm build of PyTorch uses the same semantics at the Python API level, so these commands also work for ROCm):

import torch
torch.cuda.is_available()

Building from source

Prerequisites

Install Anaconda or Pip
If you need to build PyTorch with GPU support
a. for NVIDIA GPUs, install CUDA, if your machine has a CUDA-enabled GPU.
b. for AMD GPUs, install ROCm, if your machine has a ROCm-enabled GPU
Follow the steps described here: https://github.com/pytorch/pytorch#from-source

You can verify the installation as described above.

Installing on Windows

PyTorch can be installed and used on various Windows distributions. Depending on your system and compute requirements, your experience with PyTorch on Windows may vary in terms of processing time. It is recommended, but not required, that your Windows system has an NVIDIA GPU in order to harness the full power of PyTorch’s CUDA support.

Prerequisites

Supported Windows Distributions

PyTorch is supported on the following Windows distributions:

Windows 7 and greater; Windows 10 or greater recommended.
Windows Server 2008 r2 and greater

The install instructions here will generally apply to all supported Windows distributions. The specific examples shown will be run on a Windows 10 Enterprise machine

Python

Currently, PyTorch on Windows only supports Python 3.9-3.12; Python 2.x is not supported.

As it is not installed by default on Windows, there are multiple ways to install Python:

Chocolatey
Python website
Anaconda

If you use Anaconda to install PyTorch, it will install a sandboxed version of Python that will be used for running PyTorch applications.

If you decide to use Chocolatey, and haven’t installed Chocolatey yet, ensure that you are running your command prompt as an administrator.

For a Chocolatey-based install, run the following command in an administrative command prompt:

choco install python

Package Manager

To install the PyTorch binaries, you will need to use at least one of two supported package managers: Anaconda and pip. Anaconda is the recommended package manager as it will provide you all of the PyTorch dependencies in one, sandboxed install, including Python and pip.

Anaconda

To install Anaconda, you will use the 64-bit graphical installer for Python 3.x. Click on the installer link and select Run. Anaconda will download and the installer prompt will be presented to you. The default options are generally sane.

pip

If you installed Python by any of the recommended ways above, pip will have already been installed for you.

Installation

Anaconda

To install PyTorch with Anaconda, you will need to open an Anaconda prompt via Start | Anaconda3 | Anaconda Prompt.

No CUDA

To install PyTorch via Anaconda when you do not have a CUDA-capable system or do not require CUDA, go to the selector on the PyTorch Get Started page and choose OS: Windows, Package: Conda and CUDA: None.
Then, run the command that is presented to you.

With CUDA

To install PyTorch via Anaconda when you do have a CUDA-capable system, go to the selector on the PyTorch Get Started page and choose OS: Windows, Package: Conda and the CUDA version suited to your machine. Often, the latest CUDA version is better.
Then, run the command that is presented to you.

pip

No CUDA

To install PyTorch via pip when you do not have a CUDA-capable system or do not require CUDA, go to the selector on the PyTorch Get Started page and choose OS: Windows, Package: Pip and CUDA: None.
Then, run the command that is presented to you.

With CUDA

To install PyTorch via pip when you do have a CUDA-capable system, go to the selector on the PyTorch Get Started page and choose OS: Windows, Package: Pip and the CUDA version suited to your machine. Often, the latest CUDA version is better.
Then, run the command that is presented to you.

Verification

To ensure that PyTorch was installed correctly, we can verify the installation by running sample PyTorch code. Here we will construct a randomly initialized tensor.

From the command line, type:

python

then enter the following code:

import torch
x = torch.rand(5, 3)
print(x)

The output should be something similar to:

tensor([[0.3380, 0.3845, 0.3217],
        [0.8337, 0.9050, 0.2650],
        [0.2979, 0.7141, 0.9069],
        [0.1449, 0.1132, 0.1375],
        [0.4675, 0.3947, 0.1426]])

Additionally, to check if your GPU driver and CUDA is enabled and accessible by PyTorch, run the following commands to return whether or not the CUDA driver is enabled:

import torch
torch.cuda.is_available()

Building from source

Prerequisites

Install Anaconda
Install CUDA, if your machine has a CUDA-enabled GPU.
If you want to build on Windows, Visual Studio with the MSVC toolset and NVTX are also needed. The exact requirements for those dependencies are given in the source-build instructions linked below.
Follow the steps described here: https://github.com/pytorch/pytorch#from-source

You can verify the installation as described above.

First written: 2022/09/26

Last updated: 2023/07/15

In the past, using an AMD GPU for DL required Linux with ROCm installed.

Now Microsoft has published DirectML, which allows any GPU that supports DirectX 12 to be used for DL on Windows.

How to configure

Taking PyTorch-DirectML as an example:

(Optional) Install a Miniconda environment and set the environment variables:

Path\To\Miniconda3
Path\To\Miniconda3\Scripts
Path\To\Miniconda3\Library\bin

Create a Python environment (without Conda, you still need Python 3.8-3.10):
```
conda create -n directML python=3.8
```

Activate env and install dependencies

conda activate directML
pip install torch-directml

Usage

import torch_directml
dml = torch_directml.device() # device=dml

Run an Open-Source Project

Take GFPGAN from Tencent as an example. It aims to restore faces and has Real-ESRGAN built in, which can enhance the background.

Follow its guide.

Note: if you want to enhance the background, Real-ESRGAN must also be installed.

After following the guide, download the pre-trained model:
```
wget https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth -P experiments/pretrained_models
```
You can also use other tools to download it to that directory.

How to use:

Usage: python inference_gfpgan.py -i inputs/whole_imgs -o results -v 1.3 -s 2 [options]...

  -h                   show this help
  -i input             Input image or folder. Default: inputs/whole_imgs
  -o output            Output folder. Default: results
  -v version           GFPGAN model version. Option: 1 | 1.2 | 1.3. Default: 1.3
  -s upscale           The final upsampling scale of the image. Default: 2
  -bg_upsampler        background upsampler. Default: realesrgan
  -bg_tile             Tile size for background sampler, 0 for no tile during testing. Default: 400
  -suffix              Suffix of the restored faces
  -only_center_face    Only restore the center face
  -aligned             Input are aligned faces
  -ext                 Image extension. Options: auto | jpg | png, auto means using the same extension as inputs. Default: auto

Because basicsr and facexlib both depend on torch, you need to uninstall torch and force a reinstall of pytorch-directml (the dependency requirements will not be satisfied, but that is fine):
```
pip uninstall torch
pip install --force-reinstall pytorch-directml
```

About Real-ESRGAN

The unoptimized RealESRGAN is slow on CPU. If you really want to use it on CPU, please modify the corresponding codes: (inference_gfpgan.py Line59)

# ------------------------ set up background upsampler ------------------------
if args.bg_upsampler == 'realesrgan':
    from basicsr.archs.rrdbnet_arch import RRDBNet
    from realesrgan import RealESRGANer
    model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=2)
    bg_upsampler = RealESRGANer(
        scale=2,
        model_path='https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth',
        model=model,
        tile=args.bg_tile,
        tile_pad=10,
        pre_pad=0,
        half=False)  # use fp16, need to set False in CPU mode
else:
    bg_upsampler = None

Everything is OK here.

DirectML Support

In inference_gfpgan.py: add device parameter to constructors of RealESRGANer and GFPGANer.
```
import torch_directml
dml = torch_directml.device()

bg_upsampler = RealESRGANer(..., device=dml)
restorer = GFPGANer(..., device=dml)
```
But there are currently two problems:
- upsampler triggers an assertion in basicsr\archs\arch_util.py after processing several tiles, which is not solved:
```
Assert that hh % scale == 0 and hw % scale == 0
```
- If realesrgan is disabled, the face is processed without errors, but the result looks very wrong.
  
  To avoid discomfort, no results are shown here.
  
  The CPU output is normal, but the DirectML output is abnormal.
  
  After analysis, the problem may lie in the bilinear interpolation calculation. I have raised an issue in the DirectML project.

Install PyTorch for ROCm

Refer to this section for the recommended PyTorch via PIP installation method, as well as Docker-based installation.

PCIe atomics

ROCm is an extension of the HSA platform architecture and shares its queuing model, memory model, and signaling and synchronization protocols.

Platform atomics are integral to performing queuing and signaling memory operations where there may be multiple writers across CPU and GPU agents.

For more details, see How ROCm uses PCIe atomics.

Install methods

AMD recommends the PIP install method to create a PyTorch environment when working with ROCm™ for machine learning development.

Using Docker provides portability, and access to a prebuilt Docker container that has been rigorously tested within AMD. Docker also cuts down compilation time, and should perform as expected without installation issues.

Option A: PyTorch via PIP installation

Note
The latest version of the Python module numpy (v2.0) is incompatible with the torch wheels for this version. Downgrading to an older version is required.
Example: pip3 install numpy==1.26.4
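A quick way to see whether the installed numpy falls under this restriction is to look at its major version. The sketch below is illustrative only; the helper name `numpy_is_compatible` is hypothetical and it treats any version below 2 as acceptable, which is an assumption based on the note above.

```python
from importlib import metadata


def numpy_is_compatible(version):
    """Return True if the numpy version string has a major version below 2."""
    major = int(version.split(".")[0])
    return major < 2


if __name__ == "__main__":
    try:
        current = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        print("numpy is not installed")
    else:
        status = "ok" if numpy_is_compatible(current) else "downgrade required"
        print(f"numpy {current}: {status}")
```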

Install PyTorch via PIP

Enter the following command to install pip and begin setup.
```
sudo apt install python3-pip -y
```
Enter this command to upgrade pip and wheel.
```
pip3 install --upgrade pip wheel
```

Select the applicable Ubuntu version and enter the commands to install Torch and Torchvision for ROCm AMD GPU support.

This may take several minutes.

Important! AMD recommends proceeding with ROCm WHLs available at repo.radeon.com.
The ROCm WHLs available at PyTorch.org are not tested extensively by AMD as the WHLs change regularly when the nightly builds are updated.

Ubuntu 24.04

```
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3.4/torch-2.4.0%2Brocm6.3.4.git7cecbf6d-cp312-cp312-linux_x86_64.whl
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3.4/torchvision-0.19.0%2Brocm6.3.4.gitfab84886-cp312-cp312-linux_x86_64.whl
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3.4/pytorch_triton_rocm-3.0.0%2Brocm6.3.4.git75cc27c2-cp312-cp312-linux_x86_64.whl
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3.4/torchaudio-2.4.0%2Brocm6.3.4.git69d40773-cp312-cp312-linux_x86_64.whl
pip3 uninstall torch torchvision pytorch-triton-rocm
pip3 install torch-2.4.0+rocm6.3.4.git7cecbf6d-cp312-cp312-linux_x86_64.whl torchvision-0.19.0+rocm6.3.4.gitfab84886-cp312-cp312-linux_x86_64.whl torchaudio-2.4.0+rocm6.3.4.git69d40773-cp312-cp312-linux_x86_64.whl pytorch_triton_rocm-3.0.0+rocm6.3.4.git75cc27c2-cp312-cp312-linux_x86_64.whl
```

Note

The --break-system-packages flag must be added when installing wheels for Python 3.12 in a non-virtual environment.
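Whether the flag applies can be checked from Python itself. The following is a minimal sketch under stated assumptions: it detects only venv/virtualenv environments (by comparing `sys.prefix` with `sys.base_prefix`), it does not detect Conda environments, and both helper names are hypothetical.

```python
import sys


def in_virtual_environment():
    """True inside a venv/virtualenv, where sys.prefix differs from sys.base_prefix."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def pip_extra_flags(version_info=sys.version_info, in_venv=None):
    """Return the extra pip flag for Python 3.12 outside a virtual environment."""
    if in_venv is None:
        in_venv = in_virtual_environment()
    if not in_venv and tuple(version_info[:2]) == (3, 12):
        return ["--break-system-packages"]
    return []


if __name__ == "__main__":
    print("Extra pip flags:", pip_extra_flags())
```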

Ubuntu 22.04

```
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3.4/torch-2.4.0%2Brocm6.3.4.git7cecbf6d-cp310-cp310-linux_x86_64.whl
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3.4/torchvision-0.19.0%2Brocm6.3.4.gitfab84886-cp310-cp310-linux_x86_64.whl
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3.4/pytorch_triton_rocm-3.0.0%2Brocm6.3.4.git75cc27c2-cp310-cp310-linux_x86_64.whl
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3.4/torchaudio-2.4.0%2Brocm6.3.4.git69d40773-cp310-cp310-linux_x86_64.whl
pip3 uninstall torch torchvision pytorch-triton-rocm
pip3 install torch-2.4.0+rocm6.3.4.git7cecbf6d-cp310-cp310-linux_x86_64.whl torchvision-0.19.0+rocm6.3.4.gitfab84886-cp310-cp310-linux_x86_64.whl torchaudio-2.4.0+rocm6.3.4.git69d40773-cp310-cp310-linux_x86_64.whl pytorch_triton_rocm-3.0.0+rocm6.3.4.git75cc27c2-cp310-cp310-linux_x86_64.whl
```
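The two Ubuntu variants differ only in the CPython tag embedded in the wheel names (cp312 versus cp310), and the download URLs encode the `+` in the version as `%2B` while the `pip3 install` line uses the decoded file name. A small sketch, with hypothetical helper names, shows how both can be derived:

```python
import sys
from urllib.parse import unquote


def cpython_tag(version_info=sys.version_info):
    """Build the CPython wheel tag, e.g. (3, 12) -> 'cp312'."""
    return f"cp{version_info[0]}{version_info[1]}"


def local_wheel_name(url):
    """Decode the last URL path segment into the file name used by pip3 install."""
    return unquote(url.rsplit("/", 1)[-1])


if __name__ == "__main__":
    url = (
        "https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3.4/"
        "torch-2.4.0%2Brocm6.3.4.git7cecbf6d-cp310-cp310-linux_x86_64.whl"
    )
    print("Interpreter tag:", cpython_tag())
    print("Wheel file name:", local_wheel_name(url))
```

If the interpreter tag does not appear in the wheel name, that wheel was built for a different Python version.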

Update to the WSL-compatible runtime library.

```
location=$(pip show torch | grep Location | awk -F ": " '{print $2}')
cd ${location}/torch/lib/
rm libhsa-runtime64.so*
```
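Before deleting anything, it can be useful to see which files the `rm` pattern would match. The sketch below only lists them and removes nothing; the helper names are hypothetical, and it assumes the bundled libraries live in a `lib` directory inside the installed torch package, as the commands above imply.

```python
import glob
import importlib.util
import os


def torch_lib_dir():
    """Locate <site-packages>/torch/lib without importing torch; None if absent."""
    spec = importlib.util.find_spec("torch")
    if spec is None or not spec.submodule_search_locations:
        return None
    return os.path.join(list(spec.submodule_search_locations)[0], "lib")


def bundled_hsa_runtime_files(lib_dir):
    """List the files that the pattern libhsa-runtime64.so* matches in lib_dir."""
    return sorted(glob.glob(os.path.join(lib_dir, "libhsa-runtime64.so*")))


if __name__ == "__main__":
    lib_dir = torch_lib_dir()
    if lib_dir is None:
        print("torch is not installed")
    else:
        for path in bundled_hsa_runtime_files(lib_dir):
            print(path)
```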

Optional step: Using a Conda environment.

Note
This is an optional step for users who wish to proceed with a Conda environment.
AMD does not officially support or validate Conda use cases.

The libhsa-runtime64.so requires installation of GCC 12.1 at minimum.
When using a Conda environment, ImportError: version 'GLIBCXX_3.4.30' not found is likely to occur.
Upgrade GCC for Conda using the following command.
```
conda install -c conda-forge gcc=12.1.0
```
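To check whether a given libstdc++ already provides the required symbol version, the library file can be scanned for the version string, roughly what `strings <file> | grep GLIBCXX_3.4.30` would do. This is a rough sketch: the helper name is hypothetical, and the library path in the example (`lib/libstdc++.so.6` under `CONDA_PREFIX`) is an assumption that may not match every environment.

```python
import os


def has_symbol_version(path, tag=b"GLIBCXX_3.4.30"):
    """Return True if the raw bytes of the file contain the version tag."""
    with open(path, "rb") as handle:
        return tag in handle.read()


if __name__ == "__main__":
    # Assumed location of the Conda environment's C++ runtime library.
    prefix = os.environ.get("CONDA_PREFIX", "")
    candidate = os.path.join(prefix, "lib", "libstdc++.so.6")
    if os.path.exists(candidate):
        print(candidate, has_symbol_version(candidate))
    else:
        print("No libstdc++.so.6 found at", candidate)
```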

Next, verify your PyTorch installation.

Option B: Docker installation

Note
The latest version of the Python module numpy (v2.0) is incompatible with the torch wheels for this version. Downgrading to an older version is required.
Example: pip3 install numpy==1.26.4

Prerequisites to install PyTorch using Docker

Docker for Ubuntu® must be installed.

To install Docker for Ubuntu, enter the following command:

```
sudo apt install docker.io
```

Use Docker image with pre-installed PyTorch

Select the applicable Ubuntu version and enter the following command to pull the public PyTorch Docker image.

Ubuntu 22.04

```
sudo docker pull rocm/pytorch:rocm6.3.4_ubuntu22.04_py3.10_pytorch_release_2.4.0
```

Ubuntu 24.04

```
sudo docker pull rocm/pytorch:rocm6.3.4_ubuntu24.04_py3.12_pytorch_release_2.4.0
```

Select the applicable Ubuntu version and start a Docker container using the downloaded image.

Ubuntu 22.04

```
sudo docker run -it \
--cap-add=SYS_PTRACE  \
--security-opt seccomp=unconfined \
--ipc=host \
--shm-size 8G \
--device=/dev/dxg -v /usr/lib/wsl/lib/libdxcore.so:/usr/lib/libdxcore.so -v /opt/rocm/lib/libhsa-runtime64.so.1:/opt/rocm/lib/libhsa-runtime64.so.1  \
  rocm/pytorch:rocm6.3.4_ubuntu22.04_py3.10_pytorch_release_2.4.0
```

Ubuntu 24.04

```
sudo docker run -it \
--cap-add=SYS_PTRACE  \
--security-opt seccomp=unconfined \
--ipc=host \
--shm-size 8G \
--device=/dev/dxg -v /usr/lib/wsl/lib/libdxcore.so:/usr/lib/libdxcore.so -v /opt/rocm/lib/libhsa-runtime64.so.1:/opt/rocm/lib/libhsa-runtime64.so.1  \
  rocm/pytorch:rocm6.3.4_ubuntu24.04_py3.12_pytorch_release_2.4.0
```

This will automatically download the image if it does not exist on the host. You can also pass the -v argument to mount any data directories from the host onto the container.
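As an illustration of the extra `-v` mounts mentioned above, the following sketch assembles the same `docker run` arguments in Python and appends one mount per data directory. The helper name and the example paths (`/home/user/datasets`, `/workspace/datasets`) are hypothetical; the sketch only prints the command and does not run Docker.

```python
import shlex


def build_docker_run_args(image, data_mounts=()):
    """Assemble the docker run arguments shown above plus optional data mounts."""
    args = [
        "docker", "run", "-it",
        "--cap-add=SYS_PTRACE",
        "--security-opt", "seccomp=unconfined",
        "--ipc=host",
        "--shm-size", "8G",
        "--device=/dev/dxg",
        "-v", "/usr/lib/wsl/lib/libdxcore.so:/usr/lib/libdxcore.so",
        "-v", "/opt/rocm/lib/libhsa-runtime64.so.1:/opt/rocm/lib/libhsa-runtime64.so.1",
    ]
    for host_dir, container_dir in data_mounts:
        # Each pair becomes one "-v host:container" bind mount.
        args += ["-v", f"{host_dir}:{container_dir}"]
    args.append(image)
    return args


if __name__ == "__main__":
    image = "rocm/pytorch:rocm6.3.4_ubuntu22.04_py3.10_pytorch_release_2.4.0"
    mounts = [("/home/user/datasets", "/workspace/datasets")]  # example paths only
    print(shlex.join(build_docker_run_args(image, mounts)))
```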

Next, verify the PyTorch installation.

See PyTorch Installation for ROCm for more information.

Verify PyTorch installation

Confirm that PyTorch is correctly installed.

Verify that PyTorch is installed and detects the GPU compute device.

```
python3 -c 'import torch' 2> /dev/null && echo 'Success' || echo 'Failure'
```

Expected result: Success

Enter the following command to test whether the GPU is available.

```
python3 -c 'import torch; print(torch.cuda.is_available())'
```

Expected result: True

Enter the following command to display the installed GPU device name.

```
python3 -c "import torch; print(f'device name [0]:', torch.cuda.get_device_name(0))"
```

Expected result:

device name [0]: <Supported AMD GPU>

Example: device name [0]: Radeon RX 7900 XTX
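The three checks above can also be combined into one small script. This is a minimal sketch with a hypothetical helper name; it uses only the `torch.cuda` calls already shown in the commands and reports a missing torch installation instead of raising.

```python
def check_torch():
    """Collect the import, GPU availability, and device name checks in one report."""
    report = {"installed": False, "gpu_available": False, "device_name": None}
    try:
        import torch
    except ImportError:
        return report
    report["installed"] = True
    report["gpu_available"] = bool(torch.cuda.is_available())
    if report["gpu_available"]:
        # Device index 0 matches the command-line example above.
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in check_torch().items():
        print(f"{key}: {value}")
```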

Enter the following command to display component information for the current PyTorch environment.

```
python3 -m torch.utils.collect_env
```

Expected result:

- PyTorch version
- ROCM used to build PyTorch
- OS
- Is CUDA available
- GPU model and configuration
- HIP runtime version
- MIOpen runtime version

Environment setup is complete, and the system is ready to use PyTorch with machine learning models and algorithms.

Source