Building PyTorch from Source on Jetson AGX Orin with CUDA and uv

In this post, I’ll walk through how I built PyTorch from source on a Jetson AGX Orin 32GB with CUDA enabled, using uv for Python environment and package management.

Prerequisites

This write-up assumes:

  • Jetson AGX Orin 32GB
  • JetPack 6.x
  • a working CUDA toolchain already installed through JetPack
  • uv already installed

If you are not sure what JetPack / L4T version your board is running, check it first:

cat /etc/nv_tegra_release
nvcc --version | tail -n1
python3 --version

If you see R35.x, you are on the JetPack 5 generation. If you see R36.x, you are on the JetPack 6 generation.

For a PyTorch 2.10 build, I strongly recommend being on JetPack 6.x. It is the path of least resistance on AGX Orin.
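If you prefer the check to fail loudly instead of eyeballing the output, a small guard like this works; it only reads /etc/nv_tegra_release, which is present on any Jetson, and the warning text is my own phrasing:

```shell
# Warn early if the board is not on the JetPack 6 (R36.x) generation
if grep -q "R36" /etc/nv_tegra_release 2>/dev/null; then
  echo "JetPack 6 generation detected"
else
  echo "warning: not R36 - expect friction with a PyTorch 2.10 build"
fi
```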

System packages

I started by installing the native packages I wanted available on the system side:

sudo apt update
sudo apt install -y \
  curl git build-essential pkg-config \
  libopenblas-dev libjpeg-dev zlib1g-dev libpng-dev

Workspace and virtual environment

Next, I created a dedicated workspace and a virtual environment with uv:

mkdir -p ~/work/torch_build
cd ~/work/torch_build

uv venv
source ~/work/torch_build/.venv/bin/activate

uv pip install -U pip setuptools wheel cmake ninja
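With the venv active, it is worth confirming that cmake and ninja now resolve from inside the virtual environment, since the PyTorch build will pick up whichever copies are first on PATH:

```shell
# Show which cmake and ninja the build will actually use
for t in cmake ninja; do
  command -v "$t" || echo "$t not found on PATH"
done
```

Both paths should point into ~/work/torch_build/.venv/bin; if they point at system locations, the activation step did not take effect in this shell.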

Clone PyTorch

For PyTorch 2.10, I would use:

git clone --recursive --branch v2.10.0 https://github.com/pytorch/pytorch
cd pytorch

Then I install PyTorch’s development dependencies from the repository itself:

uv pip install --project . --group dev

Build configuration

Before building, I export the PyTorch build options I want.

export CMAKE_PREFIX_PATH="${VIRTUAL_ENV}${CMAKE_PREFIX_PATH:+:$CMAKE_PREFIX_PATH}"

export USE_CUDA=1
export USE_CUDNN=1
export TORCH_CUDA_ARCH_LIST="8.7"
export BLAS=OpenBLAS

export USE_CUSPARSELT=1
export USE_DISTRIBUTED=0
export USE_GLOO=0
export USE_MPI=0
export BUILD_TEST=0
export USE_FBGEMM=0
export USE_KINETO=0
export USE_MKLDNN=0

# To avoid OOM on Jetson during compilation
export MAX_JOBS=4

Here is what matters most:

  • USE_CUDA=1 enables the CUDA build.
  • USE_CUDNN=1 enables cuDNN.
  • TORCH_CUDA_ARCH_LIST="8.7" targets Jetson AGX Orin’s GPU architecture.
  • BLAS=OpenBLAS tells the build to use the system OpenBLAS installed earlier via libopenblas-dev.
  • MAX_JOBS=4 is a practical limit to reduce the chance of the build getting killed by memory pressure.

Build PyTorch

With the environment ready, the build itself is just:

uv pip install --no-build-isolation -v -e .
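The build takes a while on Orin, so in a second terminal I keep an eye on memory: exhausting RAM and swap is the usual reason a compiler job gets mysteriously killed mid-build.

```shell
# One-shot memory snapshot; re-run during the build,
# or use `watch -n 5 free -h` for a continuous view
free -h
```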

A few details are worth calling out:

  • --no-build-isolation makes the build use the cmake, ninja, and setuptools already installed in the virtual environment, and keeps the exported USE_* variables visible to the build, instead of running in a throwaway isolated build environment.
  • -e installs PyTorch in editable (develop) mode, so the build artifacts stay in the source tree.
  • -v keeps the compiler output visible, which helps when a long build fails partway through.

Verify that PyTorch works

Once the build finishes, I run a quick verification script:

python - <<'PY'
import torch
print("torch:", torch.__version__)
print("cuda build:", torch.version.cuda)
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x
    print("matmul ok, norm =", y.norm().item())
PY

What I want to see here is:

  • a valid PyTorch version,
  • a non-empty CUDA build version,
  • torch.cuda.is_available() returning True,
  • the device name resolving to the Jetson GPU,
  • a simple CUDA matmul succeeding.