Changelog

Type

March 2026

Feature

Feature Added

✨ Apertis Python SDK v0.2.1 — Context Compression Now Available

Hi there,

We're excited to announce Apertis Python SDK v0.2.1, now with Context Compression support.

What's New

Context Compression automatically compresses prompt tokens in long conversations and large context scenarios — reducing API costs while maintaining response quality.

Quick Start

  pip install apertis==0.2.1

Resources

PyPI: https://pypi.org/project/apertis/0.2.1/
GitHub Release: https://github.com/apertis-ai/python-sdk/releases/tag/v0.2.1

*See the Release Notes for detailed usage and configuration options.*

Questions? Feel free to reach out anytime.

Happy Building.

Feature

Models Added

Add Gemini 3.1 Flash Lite Preview & GPT-5.3 Chat

GPT-5.3 ChatGemini 3.1 Flash Lite Preview

Add Gemini 3.1 Flash Lite Preview & GPT-5.3 Chat

Feature

Context Compression automatically summarizes conversation history using a smaller, cost-efficient model before sending requests to your primary model. This significantly reduces input token costs while preserving conversation context.

Highlights

Up to 78% token savings on long multi-turn conversations
Three compression strategies to balance quality vs. savings:
conservative — compresses after 8+ turns (minimal context loss)
on — compresses after 6+ turns (balanced)
aggressive — compresses after 3+ turns (maximum savings)
All endpoints supported:
POST /v1/chat/completions
POST /v1/messages
POST /v1/responses

How to Enable

Option 1: API Key Dashboard (Zero Code Changes)

Go to API Key Management → Edit your API key → Enable Context Compression and select your preferred strategy. All requests using that key will automatically apply compression.

Option 2: Per-Request via Request Body

  {
    "model": "gpt-4.1",
    "messages": [...],
    "compression": {
      "enabled": true,
      "strategy": "on",
      "model": "gpt-4.1-mini"
    }
  }

Option 3: Per-Request via HTTP Headers

X-Context-Compression: on X-Compression-Model: gpt-4.1-mini

SDK Support

Compression examples are now available for all supported SDKs:

Python SDK (OpenAI, Anthropic, Responses API)
TypeScript / Vercel AI SDK (@apertis/ai-sdk-provider)
LangChain (via default_headers)
LlamaIndex (via additional_kwargs)
LiteLLM (via extra_headers)

Priority

Request body params > HTTP headers > API key defaults. Per-request settings always override key-level defaults.

See more on **Documentation**

February 2026

Feature

Models Added

Add Grok 4.2

Grok 4.2

Grok 4.2 is the next major iteration of xAI's Grok series, advancing the model's reasoning, coding, and multimodal capabilities with architectural improvements over Grok 4 and 4.1. It is positioned as a more powerful and general-purpose frontier AI model in the Grok family with stronger deep reasoning and real-world task performance.

Feature

Models Added

Add Qwen 3.5 Full Series & Seed-2.0-Mini

Seed-2.0-MiniQwen3.5 Plus 2026-02-15Qwen3.5 397B A17BQwen3.5-FlashQwen3.5-122B-A10BQwen3.5-27B

The full Qwen 3.5 series is provided at **Apertis Coding Plan** as well, Enjoy it.