Explore the specifications, capabilities, and API model strings of Cloxo AI's powerful models.
A fine-tuned version of Llama 3.2 90B optimized for text and vision tasks, designed for smooth chat and content generation.
Provider: Cloxo AI
API Slug: cloxoai/cloxogpt
Context Length: 8k
Links: Data Policy / Status
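Each entry's API slug is the model string you pass in a request. As a minimal sketch (the endpoint URL and header conventions below are assumptions for illustration, not taken from this page; only the `cloxoai/cloxogpt` slug comes from the entry above), an OpenAI-style chat-completion request body might be built like this:

```python
import json

# Hypothetical endpoint; check the provider's own docs for the real URL
# and authentication header before sending anything.
API_URL = "https://api.cloxo.ai/v1/chat/completions"

def build_chat_request(model_slug: str, user_message: str) -> str:
    """Build an OpenAI-style chat-completion payload as a JSON string."""
    payload = {
        "model": model_slug,  # e.g. "cloxoai/cloxogpt" from the catalog
        "messages": [
            {"role": "user", "content": user_message},
        ],
    }
    return json.dumps(payload)

body = build_chat_request("cloxoai/cloxogpt", "Hello!")
# This JSON body would then be POSTed to API_URL with an Authorization header.
```

The same payload shape works for every slug in this catalog; only the `model` field changes.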
DeepSeek-V3, pre-trained on nearly 15T tokens, improves on previous versions' instruction following and coding, outperforming open-source models and rivaling leading closed-source models.
Provider: DeepSeek
API Slug: deepseek/deepseek-v3-free
Context Length: 64k
Links: Data Policy / Status
DeepSeek R1 Distill Llama 70B is distilled from Llama-3.3-70B-Instruct using DeepSeek R1 outputs. Its advanced distillation yields strong benchmark results, including AIME 2024 pass@1 of 70.0, MATH-500 pass@1 of 94.5, and a CodeForces rating of 1633, making it competitive with larger frontier models.
Provider: DeepSeek
API Slug: deepseek/deepseek-r1-distill-llama-70b-free
Context Length: 16k
Links: Data Policy / Status
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters (text in/text out). Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
Provider: Lambda
API Slug: meta/llama-3-3-70b-instruct-free
Context Length: 131k
Links: Data Policy / Status
DeepSeek R1 is here: performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It has 671B total parameters, with 37B active per inference pass.
Provider: DeepSeek
API Slug: deepseek/deepseek-r1-free
Context Length: 16k
Links: Data Policy / Status
QwQ-32B-Preview is an experimental research model from the Qwen Team showcasing promising AI reasoning capabilities, particularly in math and coding, while exhibiting limitations in language mixing, recursive reasoning, safety, and overall performance.
Provider: DeepInfra
API Slug: qwen/qwq-32b-preview-free
Context Length: 33k
Links: Data Policy / Status
OpenAI's cost-efficient multimodal model, offering state-of-the-art intelligence for text and vision inputs.
Provider: OpenAI
API Slug: openai/gpt-4o-mini-free
Context Length: 128k
Links: Data Policy / Status
Gemini 2.0 Flash Thinking Mode is an experimental model that's trained to generate the "thinking process" the model goes through as part of its response. As a result, Thinking Mode is capable of stronger reasoning capabilities in its responses than the base Gemini 2.0 Flash model.
Provider: Google Vertex
API Slug: google/gemini-20-flash-thinking-exp-free
Context Length: 40k
Links: Data Policy / Status
The first image-to-text model from Mistral AI, optimized for converting visual data into text with precision.
Provider: Hyperbolic
API Slug: mistral/pixtral-12b-free
Context Length: 4k
Links: Data Policy / Status
Gemini Flash 2.0 significantly improves time to first token (TTFT) over Gemini Flash 1.5 while maintaining Gemini Pro 1.5-level quality, enhancing multimodal understanding, coding, complex instruction following, and function calling for more seamless and robust agentic experiences.
Provider: Google Vertex
API Slug: google/gemini-2.0-free
Context Length: 1000k
Links: Data Policy / Status
The highly anticipated 405B instruct-tuned version is optimized for high-quality dialogue use cases. It has demonstrated strong performance in evaluations compared to leading closed-source models, including GPT-4o and Claude 3.5 Sonnet.
Provider: DeepInfra
API Slug: meta/llama-3-1-405b-instruct-free
Context Length: 32k
Links: Data Policy / Status
The o1 models are optimized for math, science, programming, and other STEM-related tasks. They consistently exhibit PhD-level accuracy on benchmarks in physics, chemistry, and biology.
Provider: OpenAI
API Slug: openai/o1-free
Context Length: 128k
Links: Data Policy / Status
A 12B parameter model with a 128k token context length built by Mistral in collaboration with NVIDIA. The model is multilingual, supporting English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi.
Provider: DeepInfra
API Slug: mistral/mistral-nemo-free
Context Length: 128k
Links: Data Policy / Status
Qwen2.5-Coder improves upon CodeQwen1.5 in code generation, reasoning, and fixing, providing a stronger foundation for applications like Code Agents. It also maintains strengths in math and general knowledge.
Provider: Lambda
API Slug: qwen/qwen-2-5-coder-32b-instruct-free
Context Length: 33k
Links: Data Policy / Status
Qwen2.5 72B is an advanced LLM with improved knowledge, coding, and multilingual capabilities, supporting long context and generating high-quality outputs.
Provider: DeepInfra
API Slug: qwen/qwen-2-5-72b-free
Context Length: 32k
Links: Data Policy / Status
This 7.3B Mamba model offers linear time inference, a 256k context window, and fast responses for code and reasoning tasks, performing comparably to transformers and available under Apache 2.0.
Provider: Mistral
API Slug: mistral/codestral-mamba-free
Context Length: 256k
Links: Data Policy / Status
GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks.
Provider: OpenAI
API Slug: openai/gpt-3-5-turbo-free
Context Length: 16k
Links: Data Policy / Status
Mistral Large 2 is a powerful AI model from Mistral AI excelling at reasoning, coding, and various languages, boasting a long context window for effective information retrieval.
Provider: Mistral
API Slug: mistral/mistral-large-free
Context Length: 128k
Links: Data Policy / Status
An experimental multimodal model based on Gemini 1.5 Pro, capable of handling both text and vision tasks with high efficiency.
Provider: Google AI Studio
API Slug: google/learnlm-1-5-pro-experimental-free
Context Length: 8k
Links: Data Policy / Status
Grok 2 Vision improves image AI with enhanced visual understanding, instruction following, and multilingual support, enabling intuitive, visually aware apps and paving the way for future image solutions.
Provider: xAI
API Slug: x-ai/grok-2-vision-free
Context Length: 33k
Links: Data Policy / Status
Liquid's 40.3B MoE model, a powerful LFM built on dynamic systems, excels at modeling diverse sequential data, including video, audio, and text.
Provider: Lambda
API Slug: liquid/lfm-40b-free
Context Length: 66k
Links: Data Policy / Status
A 90-billion-parameter multimodal model excelling in advanced visual reasoning and language tasks like image captioning and analysis.
Provider: SambaNova
API Slug: meta/llama-3-2-90b-vision-instruct-free
Context Length: 131k
Links: Data Policy / Status
An 11B parameter multimodal model designed for tasks that integrate visual and textual reasoning with high accuracy.
Provider: Together
API Slug: meta/llama-3-2-11b-vision-instruct-free
Context Length: 131k
Links: Data Policy / Status
An experimental model optimized for STEM tasks, providing PhD-level accuracy in physics, chemistry, and biology.
Provider: OpenAI
API Slug: gpt-o1-mini-free
Context Length: 128k
Links: Data Policy / Status
An updated model with faster performance and lower latencies, designed for multilingual tasks and reasoning.
Provider: Cohere
API Slug: cohere/command-r-plus-free
Context Length: 128k
Links: Data Policy / Status
Command-R is a 35B parameter model that performs conversational language tasks at a higher quality, more reliably, and with a longer context than previous models. It can be used for complex workflows like code generation, retrieval augmented generation (RAG), tool use, and agents.
Provider: Cohere
API Slug: cohere/command-r-free
Context Length: 128k
Links: Data Policy / Status
OpenAI's advanced GPT-4o model, offering faster processing, enhanced multilingual capabilities, and better file analysis.
Provider: OpenAI
API Slug: openai/gpt-4o-free
Context Length: 128k
Links: Data Policy / Status
A lightweight, experimental 8B parameter model offering efficient multimodal capabilities for text and vision tasks.
Provider: Google AI Studio
API Slug: google/gemini-flash-1-5-8b-exp
Context Length: 1000k
Links: Data Policy / Status
An experimental multimodal model by Google, heavily rate-limited but designed for high-end vision and text tasks.
Provider: Google
API Slug: google/gemini-pro-1-5-free
Context Length: 1000k
Links: Data Policy / Status
NVIDIA's Llama 3.1 Nemotron 70B is a language model optimized for generating precise and useful responses across various domains, leveraging RLHF and excelling in automatic alignment benchmarks.
Provider: Lambda
API Slug: nvidia/llama-3-1-nemotron-70b-instruct-free
Context Length: 131k
Links: Data Policy / Status
A transformer-based model with strengths in multilingual tasks, reasoning, and coding capabilities.
Provider: Novita
API Slug: qwen-2-7b-instruct-free
Context Length: 8k
Links: Data Policy / Status
Gemma 2 27B is an open-source model from Google, based on Gemini research, that excels in text generation tasks like question answering, summarization, and reasoning.
Provider: DeepInfra
API Slug: google/gemma-2-27b-it-free
Context Length: 8k
Links: Data Policy / Status
A versatile 9B parameter model by Google, designed for efficient and cost-effective language tasks across domains.
Provider: DeepInfra
API Slug: google/gemma-2-9b-it-free
Context Length: 4k
Links: Data Policy / Status
A fine-tuned Llama 2 13B model excelling in descriptive roleplay and creative narratives.
Provider: Together 2
API Slug: gryphe/mythomax-l2-13b-free
Context Length: 4k
Links: Data Policy / Status
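Context lengths in this catalog range from 4k to 1000k tokens, so a client may want to pick the smallest model window that still fits its prompt. A minimal sketch, assuming a rough token count is already known (the dictionary below copies a few slugs and context lengths from the entries above; the selection rule itself is an illustration, not part of any provider's API):

```python
# A few (slug -> context length in tokens) pairs copied from the catalog above.
CATALOG = {
    "cloxoai/cloxogpt": 8_000,
    "deepseek/deepseek-v3-free": 64_000,
    "mistral/codestral-mamba-free": 256_000,
    "google/gemini-2.0-free": 1_000_000,
}

def pick_model(prompt_tokens: int):
    """Return the slug with the smallest context window that fits the
    prompt, or None if no catalog model is large enough."""
    fitting = [(length, slug) for slug, length in CATALOG.items()
               if length >= prompt_tokens]
    return min(fitting)[1] if fitting else None

# e.g. a 50k-token prompt needs at least a 64k window:
# pick_model(50_000) -> "deepseek/deepseek-v3-free"
```

Preferring the smallest sufficient window is just one possible policy; a real client might also weigh cost, latency, or modality support.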