Available Foundation and Embedding Models

Validated on 7 Jun 2025 • Last edited on 8 Jul 2025

GradientAI Platform lets you build fully managed AI agents with knowledge bases for retrieval-augmented generation, multi-agent routing, guardrails, and more, or use serverless inference to make direct requests to popular foundation models.

The following foundation and embedding models are available for GradientAI Platform. For pricing, see GradientAI Platform’s pricing page.

Foundation Models

A foundation model is a large-scale model pre-trained on a broad corpus of data and adaptable to a wide range of tasks. We offer the following foundation models:

| Provider  | Model and Version            | Source Availability | Parameters    | Max Tokens |
|-----------|------------------------------|---------------------|---------------|------------|
| Anthropic | Claude 3.7 Sonnet            | Proprietary         | Not published | 1,024      |
| Anthropic | Claude 3.5 Sonnet            | Proprietary         | Not published | 1,024      |
| Anthropic | Claude 3.5 Haiku             | Proprietary         | Not published | 1,024      |
| Anthropic | Claude 3 Opus                | Proprietary         | Not published | 1,024      |
| DeepSeek  | DeepSeek-R1 distill-llama-70B | Open source        | 70 billion    | 8,000      |
| Meta      | Llama 3.3 Instruct-70B       | Open source         | 70 billion    | 2,048      |
| Meta      | Llama 3.1 Instruct-70B       | Open source         | 70 billion    | 2,048      |
| Meta      | Llama 3.1 Instruct-8B        | Open source         | 8 billion     | 512        |
| Mistral   | NeMo                         | Open source         | 12 billion    | 512        |
| OpenAI    | GPT-4o                       | Proprietary         | Not published | 16,384     |
| OpenAI    | GPT-4o mini                  | Proprietary         | Not published | 16,384     |
| OpenAI    | o1                           | Proprietary         | Not published | 100,000    |
| OpenAI    | o3-mini                      | Proprietary         | Not published | 100,000    |

You can access all available models (except OpenAI o1) for inference using endpoints. You can also experiment with models in the Model Playground.
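As a sketch of what an endpoint request might look like, the snippet below builds a chat-completions-style HTTP request with Python's standard library. The base URL, the `/chat/completions` path, the model slug, and the bearer-token auth scheme are all assumptions modeled on common OpenAI-compatible APIs; check the GradientAI Platform API reference for the actual values.

```python
# Minimal sketch of a serverless-inference chat request.
# The endpoint URL, request path, model slug, and auth header
# are assumptions -- substitute the values from the platform docs.
import json
import urllib.request


def build_chat_request(base_url, api_key, model, prompt, max_tokens=512):
    """Build an OpenAI-style chat-completions request (assumed format)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,  # stay within the model's Max Tokens limit
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


# To send the request (not executed here):
# response = urllib.request.urlopen(build_chat_request(...))
```

Keeping the payload construction separate from the network call makes it easy to inspect or log the exact request before sending it.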

Warning
When using DeepSeek models in a user-facing agent, we strongly recommend adding all available guardrails for a safer conversational experience.

Access to proprietary models is determined by your API keys with those providers, such as your Anthropic API key or OpenAI API key.

Embedding Models

An embedding model converts data into vector embeddings. GradientAI Platform stores vector embeddings in an OpenSearch database cluster for use with agent knowledge bases. We offer the following embedding models:

| Provider | Type | Model and Version | Parameters |
|----------|------|-------------------|------------|
| Tongyi Lab, Alibaba | General text embeddings (GTE) | Alibaba-NLP/gte-large-en-v1.5 | 434 million |
| UKP Lab, Technical University of Darmstadt | Sentence Transformer (SBERT) | sentence-transformers/all-MiniLM-L6-v2 | 22.7 million |
| UKP Lab, Technical University of Darmstadt | Sentence Transformer (SBERT) | sentence-transformers/multi-qa-mpnet-base-dot-v1 | 109 million |
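To illustrate how a knowledge base uses these embeddings, the sketch below compares two vectors by cosine similarity, the usual relevance score in retrieval-augmented generation. The four-dimensional vectors are made-up stand-ins; a real model such as all-MiniLM-L6-v2 produces 384-dimensional vectors.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (1.0 = identical direction)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy vectors standing in for a stored document embedding and a query embedding.
doc_embedding = [0.1, 0.8, 0.3, 0.0]
query_embedding = [0.2, 0.7, 0.4, 0.1]
print(round(cosine_similarity(doc_embedding, query_embedding), 3))  # prints 0.973
```

At query time, the knowledge base embeds the user's question with the same model used at indexing time and returns the stored chunks whose vectors score highest.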
