Llama 3.2 11B Vision Instruct - Multi GPU
Generated on 13 Mar 2025 from the Llama 3.2 11B Vision Instruct - Multi GPU catalog page
The Llama-3.2-11B-Vision-Instruct is a multimodal large language model optimized for visual recognition, image reasoning, captioning, and answering questions about images. It was trained on 6 billion image-text pairs, has 11 billion parameters, and is supported for commercial and research use in English.
Model Information
- Model ID:
meta-llama/Llama-3.2-11B-Vision-Instruct
- Supported Language(s): en, de, fr, it, pt, hi, es, th
- License: Llama 3.2
- Modality: text+image
Hardware Support
NVIDIA GPUs
GPU Model | Number of Accelerators | Max Input Tokens | Max New Tokens |
---|---|---|---|
NVIDIA H100 | 1 | 99,658 | 99,690 |
NVIDIA H100 | 2 | 74,840 | 74,872 |
NVIDIA H100 | 4 | 90,582 | 90,614 |
NVIDIA H100 | 8 | 90,582 | 90,614 |
Software Included
Package | Version | License |
---|---|---|
Meta Llama 3.2 | 3.2-11B-Vision-Instruct |
Creating an App using the Control Panel
Click the Deploy to DigitalOcean button to create a Droplet based on this 1-Click App. If you aren’t logged in, this link will prompt you to log in with your DigitalOcean account.
Creating an App using the API
In addition to creating a Droplet from the Llama 3.2 11B Vision Instruct - Multi GPU 1-Click App using the control panel, you can also use the DigitalOcean API. As an example, to create a 4GB Llama 3.2 11B Vision Instruct - Multi GPU Droplet in the SFO2 region, you can use the following curl command. You need to either save your API access token to an environment variable named TOKEN or substitute it in the command below.
curl -X POST -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"name":"choose_a_name","region":"sfo2","size":"s-2vcpu-4gb","image":"digitaloceanai-llama3211bvisioninstruct8x"}' \
  "https://api.digitalocean.com/v2/droplets"
Getting Started After Deploying Llama 3.2 11B Vision Instruct - Multi GPU
Quickly Get Started With Your 1-Click Models
- Access the Droplet Console:
  - Navigate to the GPU Droplets page.
  - Locate your newly created 1-Click Model Droplet and click on its name.
  - Under the “Access” tab, select Console. This will open an in-browser terminal session connected to your droplet.
  - Log in as the root user using the password you set during droplet creation.
- Login via SSH:
  - If you selected an SSH key during droplet creation, follow these steps:
    - Open your preferred SSH client (e.g., PuTTY, Terminal).
    - Use the droplet’s public IP address to log in as root:
      ssh root@your_droplet_public_IP
    - Ensure your SSH key is added to the SSH agent, or specify the key file directly:
      ssh -i /path/to/your/private_key root@your_droplet_public_IP
    - Once connected, you will be logged in as the root user without needing a password.
- Check the Message of the Day (MOTD) for Access Token:
  - Upon successful login via console or SSH, the Message of the Day (MOTD) will be displayed.
  - This message includes important information such as the bearer token. Take note of this token as you’ll need it to use the inference API for your model.
Troubleshooting
- The model takes a couple of minutes to load while its Docker container starts. During this time, any API call to the model will time out.
- To ensure that Caddy is working, run:
sudo systemctl status caddy
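Because requests time out while the model is loading, it can help to poll the endpoint until it answers before sending real work. The sketch below is one way to do that in shell, not part of the 1-Click image. `wait_until` and `model_ready` are hypothetical helper names, `DROPLET_IP` and `MODEL_TOKEN` are placeholder variables for your Droplet's address and the bearer token from the MOTD, and the probe assumes the server accepts a short text-only request on the same `/v1/chat/completions` endpoint used for image requests.

```shell
# wait_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds; gives up after MAX_ATTEMPTS tries.
wait_until() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Hypothetical readiness probe: succeeds once the endpoint returns a
# non-error HTTP status. Assumes DROPLET_IP and MODEL_TOKEN are set and
# that a one-token, text-only request is accepted by the server.
model_ready() {
  curl -sf --max-time 30 -o /dev/null \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer ${MODEL_TOKEN}" \
    -d '{"messages":[{"role":"user","content":[{"type":"text","text":"ping"}]}],"max_tokens":1}' \
    "http://${DROPLET_IP}/v1/chat/completions"
}
```

With both variables exported, `wait_until 30 10 model_ready && echo "model is ready"` tries up to 30 times, 10 seconds apart. Keeping the retry loop separate from the probe means the same loop can wrap any other check, such as `systemctl is-active --quiet caddy`.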
Usage Example
You can make an API call to the droplet using the following cURL command:
curl --location 'http://<your_droplet_ip>/v1/chat/completions' \
  --header 'accept: application/json' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer <your_token_here>' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "test-image.jpg"
            }
          },
          {
            "type": "text",
            "text": "Describe this image in detail"
          }
        ]
      }
    ],
    "max_tokens": 600,
    "stream": false
  }'
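The endpoint follows the OpenAI chat completions request format, so OpenAI-compatible clients, including the JavaScript client, can be used by setting the client's base URL to http://<your_droplet_ip>/v1 and passing the bearer token as the API key.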
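The `url` value `test-image.jpg` in the request above appears to be a placeholder. One common way to send a local file through an OpenAI-style `image_url` field is to inline it as a base64 `data:` URL. The sketch below assumes this deployment accepts such URLs, which this page does not confirm. The function names `image_data_url`, `chat_payload`, and `describe_image` are hypothetical, and `DROPLET_IP` and `MODEL_TOKEN` are placeholder variables for your Droplet's address and bearer token.

```shell
# image_data_url FILE [MIME_TYPE]
# Prints a data: URL containing the base64-encoded file (default type image/jpeg).
image_data_url() {
  printf 'data:%s;base64,%s' "${2:-image/jpeg}" "$(base64 < "$1" | tr -d '\n')"
}

# chat_payload IMAGE_URL PROMPT
# Prints a chat completions body with one image and one text prompt.
# Both values are inserted verbatim, so they must not contain
# double quotes or backslashes.
chat_payload() {
  printf '{"messages":[{"role":"user","content":[{"type":"image_url","image_url":{"url":"%s"}},{"type":"text","text":"%s"}]}],"max_tokens":600,"stream":false}' "$1" "$2"
}

# describe_image FILE
# Sends the image to the model. The body is piped through stdin because a
# base64-encoded image can exceed the length limit for a single argument.
describe_image() {
  chat_payload "$(image_data_url "$1")" "Describe this image in detail" |
    curl -sS "http://${DROPLET_IP}/v1/chat/completions" \
      -H 'Content-Type: application/json' \
      -H "Authorization: Bearer ${MODEL_TOKEN}" \
      --data-binary @-
}
```

After exporting `DROPLET_IP` and `MODEL_TOKEN`, `describe_image ./test-image.jpg` sends the file. For a PNG, build the URL with `image_data_url ./photo.png image/png` and pass it to `chat_payload` yourself.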