vLLM 0.23.0 (ROCm 7.2.4)
Generated on 19 Jun 2026 from the vLLM 0.23.0 (ROCm 7.2.4) catalog page
vLLM 0.23.0 on AMD ROCm 7.2.4
This 1-Click image ships vLLM 0.23.0, a high-throughput and memory-efficient inference and serving engine for large language models, running on an AMD ROCm 7.2.4 host stack on Ubuntu 24.04. It is preconfigured for AMD Instinct GPUs and packaged as a ready-to-run Docker container with the OpenAI-compatible vLLM server, alongside a JupyterLab environment with example notebooks.
vLLM delivers state-of-the-art serving throughput with PagedAttention, continuous batching, and an OpenAI-compatible API.
Software Included
| Package | Version | License |
|---|---|---|
| vLLM | 0.23.0 | Apache-2.0 |
| ROCm | 7.2.4 | MIT/Apache-2.0 |
| JupyterLab | 4.4.2 | |
| Docker | latest |
Deploying this Offering using the Control Panel
Click the Deploy to DigitalOcean button to deploy this offering. If you aren’t logged in, this link will prompt you to log in with your DigitalOcean account.
[](https://cloud.digitalocean.com/gpus/new?appId=36510f8e9ccc84d0a64948b9&image=vLLM 0.23.0 (ROCm 7.2.4) 0.23.0 on Ubuntu 24.04&type=applications)
Getting Started After Deploying vLLM 0.23.0 (ROCm 7.2.4)
After the droplet boots, SSH in as root. The Jupyter Lab URL and token, plus the commands to start and inspect the vLLM container, are printed in the login MOTD. vLLM runs inside a Docker container; use the printed docker exec command to open an interactive shell. Verify the GPU stack with amd-smi and rocminfo.