Changelog

Type

May 2026

System

System Update

Apertis New Look

UI/UX refreshed Looks

Apertis has received a broad UI/UX refresh across public pages, model browsing, API key selection, and admin workflows. This update focuses on a cleaner visual system, clearer product navigation, and more practical operational screens for daily use.

Highlights

  • Refreshed the public experience with a redesigned 404 page, improved layout chrome, and a more polished route diagnostic view.
  • Introduced the new Verbatim landing experience, including updated positioning, product sections, code preview, and lead capture flow.
  • Added a signed-in Dashboard shortcut in the header for faster access to the main workspace.
  • Redesigned the chat API key selection modal with a cleaner Apertis-style layout, clearer selection states, and improved preview coverage.
  • Improved model detail pages with task-driven endpoint selection, better TTS examples, and cleaner handling of voice and web-search pricing states.
  • Refined the model filter sidebar so empty task categories are hidden, making model discovery easier to scan.

overview card.

  • Shared FAQ row styling across pages for a more consistent Apertis UI system.
  • Cleaned up footer navigation and naming, including the updated Verbatim label.

User Experience

The new look reduces visual clutter, improves spacing consistency, and makes important actions easier to find.

Public pages now feel more aligned with the Apertis brand, while authenticated workflows are more direct and data-focused.

Quality

This refresh includes focused test coverage for updated components and route behavior, including header navigation, modal behavior, layout chrome, pricing helpers, filter behavior, and cost analytics logic.

Enjoy them.

Read more

Feature

System Update

Audio APIs Now Live

Full Audio API Support

Apertis now supports the OpenAI-compatible Audio API. Use a single API key to access leading TTS (text-to-speech) and STT (speech-to-text) models across providers.

Supported Models

Text-to-Speech (TTS)

  • gemini-3.1-flash-tts-preview — Google's latest Flash TTS preview
  • gpt-4o-mini-tts — OpenAI's lightweight real-time speech synthesis

Speech-to-Text (STT)

  • gpt-4o-transcribe — Flagship high-accuracy transcription
  • gpt-4o-mini-transcribe — Cost-efficient real-time transcription
  • whisper-large-v3-turbo — Accelerated Whisper v3
  • whisper-large-v3 — Full-precision Whisper
  • whisper-1 — The classic, battle-tested baseline

Endpoints

Drop-in compatible with the OpenAI SDK — no code changes required:

  • POST /v1/audio/speech — text → audio
  • POST /v1/audio/transcriptions — audio → text
  • POST /v1/audio/translations — audio → translated text

Billing

  • PAYG (pay-as-you-go): shares the same quota balance as chat/completions
  • Per-dimension billing: priced separately on input tokens / output

tokens / audio seconds, with admin-tunable AudioRatio

  • File limit: 25 MB per multipart upload
  • Subscriptions: audio models are PAYG-only for now (not included

in subscription plans)

Example

  from openai import OpenAI

  client = OpenAI(
      api_key="sk-your-apertis-key",
      base_url="https://api.apertis.ai/v1"
  )

  # TTS
  speech = client.audio.speech.create(
      model="gpt-4o-mini-tts",
      voice="alloy",
      input="Hello from Apertis."
  )
  speech.stream_to_file("hello.mp3")

  # STT
  with open("audio.mp3", "rb") as f:
      transcript = client.audio.transcriptions.create(
          model="whisper-large-v3-turbo",
          file=f
      )
  print(transcript.text)

Model Detail Page Updates

  • Endpoint and code samples auto-switch based on the model's task
  • TTS models now emit ready-to-run OpenAI SDK Python snippets
  • Web Search pricing column hidden for voice models (:web is unsupported)
Read more

Feature

Models Added

Add Grok 4.3

Grok 4.3

Grok 4.3

Grok 4.3 is a reasoning-focused model from xAI designed for agentic workflows, instruction following, and high factual accuracy tasks. It supports text and image inputs with text output, with reasoning always active and not configurable by effort level.

The model features a 1M-token context window with effectively no output token limit, making it well suited for long-document analysis, deep research, and multi-step agentic workflows. It uses tiered pricing, with higher rates applied to requests exceeding 200K total tokens.

Enjoy it.

Read more

April 2026

Feature

Models Added

Add Nemotron 3 Nano Omni (Free)

Nemotron 3 Nano Omni (Free)

Nemotron 3 Nano Omni (Free)

NVIDIA Nemotron 3 Nano Omni is an open 30B-A3B multimodal model designed as a perception and context sub-agent for enterprise agent systems. It supports text, image, video, and audio inputs with text output, enabling unified multimodal reasoning within a single inference loop. Built on a hybrid MoE Transformer–Mamba architecture with Conv3D video layers and Efficient Video Sampling (EVS), it delivers significantly improved efficiency for video reasoning—achieving ~2× higher throughput and 2.5× lower compute compared to separate pipelines.

With up to 300K context length and extended thinking support, it is well suited for scalable, multimodal agent workflows.

Enjoy it.

Read more

Feature

Models Added

Add latest Qwen Models

Qwen3.5 Plus 2026-04-20Qwen3.6 FlashQwen3.6 Max Preview

Qwen3.5 Plus 2026-04-20

Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba, supporting text, image, and video inputs with text output. It features a 1M-token context window, enabling large-scale reasoning and multimodal workflows within a single interaction.

This updated version of Qwen3.5 Plus introduces tiered pricing beyond 256K tokens, making it suitable for high-context applications while maintaining flexibility for cost optimization in long-input scenarios.

Qwen3.6 Flash

Qwen3.6 Flash is a fast and efficient model from Alibaba's Qwen 3.6 series, supporting text, image, and video inputs with a 1M-token context window for high-context multimodal workflows.

Optimized for performance and cost efficiency, it features tiered pricing beyond 256K tokens and supports prompt caching with both cache creation and read pricing, making it well suited for large-scale, high-throughput applications.

Qwen3.6 Max Preview

Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse Mixture-of-Experts (MoE) architecture with approximately 1 trillion parameters. It is optimized for agentic coding, tool use, and long-context reasoning, supporting a 262K token context window.

The model includes an integrated thinking mode that preserves reasoning across multi-turn interactions, along with support for structured outputs and function calling.

Enjoy them.

Read more