Mistral Medium 3.5 is a 128B dense instruction-following model from Mistral AI, supporting text and image inputs with text output. It is designed for agentic workflows, coding, and complex multi-step reasoning, with strong reliability in multi-tool orchestration and long-horizon tasks. The model features a 256K token context window, configurable reasoning effort per request, and a custom vision encoder that handles variable image sizes and aspect ratios. It can be self-hosted on as few as four GPUs and is available as open weights, which makes it well suited for scalable, production-grade deployments.
Use the Apertis AI SDK or the OpenAI SDK, or make direct HTTP requests to our API.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apertis.ai/v1"
)

response = client.chat.completions.create(
    model="mistral-medium-3-5",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
    max_tokens=1024,
    temperature=0.7
)

print(response.choices[0].message.content)

# Optional: Enable context compression to reduce token usage
# response = client.chat.completions.create(
#     model="mistral-medium-3-5",
#     messages=[{"role": "user", "content": "Hello!"}],
#     extra_body={"compression": {"enabled": True, "model": "gpt-4.1-mini"}}
# )

Common parameters: model, messages, max_tokens, temperature, top_p, stream, tools

Extended parameters: reasoning_effort, stream_options, thinking, extra_body
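For the direct HTTP route, a minimal sketch using only the Python standard library might look like the following. It rests on several assumptions not confirmed on this page: that the endpoint is the OpenAI-style `/chat/completions` path under the base URL from the SDK example, that authentication uses a bearer token, and that `"low"` is an accepted `reasoning_effort` value. The `APERTIS_API_KEY` environment variable and the helper names `build_payload` and `build_request` are hypothetical, chosen for illustration.

```python
import json
import os
import urllib.request

# Assumption: the REST endpoint follows the OpenAI convention under the
# base URL used in the SDK example.
API_URL = "https://api.apertis.ai/v1/chat/completions"


def build_payload(prompt, reasoning_effort=None, **params):
    """Build a chat-completions request body as a plain dict."""
    payload = {
        "model": "mistral-medium-3-5",
        "messages": [{"role": "user", "content": prompt}],
    }
    # Common parameters (max_tokens, temperature, top_p, ...) pass through.
    payload.update(params)
    # Extended parameter; the accepted values are an assumption here.
    if reasoning_effort is not None:
        payload["reasoning_effort"] = reasoning_effort
    return payload


def build_request(payload, api_key):
    """Wrap the payload in a POST request with bearer-token auth (assumed)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    api_key = os.environ.get("APERTIS_API_KEY")  # hypothetical variable name
    payload = build_payload(
        "Hello!", reasoning_effort="low", max_tokens=1024, temperature=0.7
    )
    if api_key:
        # Only contact the API when a key is configured.
        with urllib.request.urlopen(build_request(payload, api_key)) as resp:
            body = json.load(resp)
        print(body["choices"][0]["message"]["content"])
    else:
        # Without a key, just show the request body that would be sent.
        print(json.dumps(payload, indent=2))
```

Keeping payload construction separate from the network call makes it easy to inspect or log the request body before sending it, and to swap in another HTTP client if preferred.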
Use these namespaced identifiers in Cursor IDE to avoid conflicts with built-in models.