Gemini 3.5 Flash

gemini-3.5-flash

Gemini 3.5 Flash is Google's high-efficiency multimodal model, delivering near-Pro level performance in coding and reasoning at Flash-tier speed and cost. It supports text, image, video, audio, and PDF inputs, making it well suited for diverse multimodal workflows. Optimized for coding proficiency and parallel agentic execution, the model defaults to medium thinking effort for faster, cost-efficient responses while supporting configurable thinking levels (minimal, low, medium, high) for fine-grained cost–performance control.
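As a rough sketch of the multimodal input described above, the snippet below builds a chat message that pairs a text prompt with an inline image. It assumes the endpoint accepts OpenAI-style `image_url` content parts carrying a base64 data URL, which this page does not confirm. `build_image_message` is a hypothetical helper name, not part of any SDK.

```python
import base64


def build_image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build a user message combining text and one inline image.

    Assumption: the endpoint accepts OpenAI-style content parts
    ("text" and "image_url" with a base64 data URL).
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded}"}},
        ],
    }


# Placeholder bytes stand in for the contents of a real image file.
message = build_image_message("Describe this image.", b"\x89PNG placeholder")
print(message["content"][1]["image_url"]["url"][:22])
```

If the assumption holds, the returned dictionary can be placed in the `messages` list of a chat completion request in place of a plain text message.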

Context: 1M tokens

Pricing

Input: $1.50 / 1M

Output: $9.00 / 1M

Cache Write: $0 / 1M

Cache Read: $0 / 1M

Web Search: $0 / 1M

Prompt cache writes and reads are included at no additional cost.
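To make the listed rates concrete, here is a minimal cost estimate in Python. It uses only the input and output prices shown above. `estimate_cost_usd` is a hypothetical helper, and the sketch does not model how thinking tokens are billed, since this page does not say.

```python
# Rates taken from the pricing list above (USD per 1M tokens).
INPUT_USD_PER_MILLION = 1.50
OUTPUT_USD_PER_MILLION = 9.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + output_cost


# One million tokens in each direction: 1.50 + 9.00
print(estimate_cost_usd(1_000_000, 1_000_000))  # 10.5
```

Because output tokens cost six times as much as input tokens at these rates, capping `max_tokens` is likely the more effective lever for controlling spend.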

Quick Start

Copy a working example for this model.

python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apertis.ai/v1",
)

response = client.chat.completions.create(
    model="gemini-3.5-flash",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
    max_tokens=1024,
    temperature=0.7,
)

print(response.choices[0].message.content)

# Optional: Enable context compression to reduce token usage
# response = client.chat.completions.create(
#     model="gemini-3.5-flash",
#     messages=[{"role": "user", "content": "Hello!"}],
#     extra_body={"compression": {"enabled": True, "model": "gpt-4.1-mini"}}
# )

Supported Parameters

Common (7 params)

model, messages, max_tokens, temperature, top_p, stream, tools

Extended (4 params)

reasoning_effort, stream_options, thinking, extra_body
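The `reasoning_effort` parameter listed above presumably maps to the thinking levels named in the model description (minimal, low, medium, high). The sketch below builds request arguments under that assumption and rejects unknown levels before any call is made. `build_request` and `ALLOWED_EFFORTS` are hypothetical names introduced for illustration.

```python
ALLOWED_EFFORTS = ("minimal", "low", "medium", "high")


def build_request(prompt: str, effort: str = "medium", stream: bool = False) -> dict:
    """Build keyword arguments for a chat completion request.

    Assumption: `reasoning_effort` takes the thinking level names
    from the model description as plain strings.
    """
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"effort must be one of {ALLOWED_EFFORTS}, got {effort!r}")
    return {
        "model": "gemini-3.5-flash",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 1024,
        "reasoning_effort": effort,
        "stream": stream,
    }


request = build_request("Refactor this function for readability.", effort="high")
print(request["reasoning_effort"])  # high
```

The resulting dictionary could be passed as `client.chat.completions.create(**request)` with an OpenAI-compatible client. If the installed SDK version does not accept `reasoning_effort` directly, sending it through `extra_body` is one possible alternative.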

Cursor IDE Model IDs

Use these namespaced identifiers in Cursor IDE to avoid conflicts with built-in models.

gemini-3.5-flash

Compare with Other Models

See how this model compares to others from the same provider.

Gemini 2.5 Flash Lite Preview 09-2025

Gemini 2.5 Flash-Lite is a lightweight, low-latency model focused on speed and cost efficiency. It generates tokens quickly and outperforms earlier Flash models on common benchmarks. “Thinking” (multi-pass reasoning) is off by default for maximum speed, but can be turned on through the Reasoning API when deeper reasoning is needed.

Context: 1.0M
Input: $0.05/M
Output: $0.20/M

Gemini 2.5 Flash DeepSearch

Gemini 2.5 Flash is Google's main high-performance model for complex reasoning, coding, math, and scientific tasks. It has built-in “thinking” features that help it produce more accurate, context-aware answers.

Context: 1.0M
Input: $4.20/M
Output: $33.60/M

Gemini 2.5 Flash Preview

Gemini 2.5 Flash Preview (May 2025) is Google's high-performance general model built for advanced reasoning, coding, math, and science. It includes built-in “thinking” features to deliver more accurate, context-aware answers.

Context: 1.0M
Input: $0.075/M
Output: $0.30/M

Gemini 2.5 Flash Image Preview

Gemini 2.5 Flash Image Preview (“Nano Banana”) is a cutting-edge image generation model with strong contextual understanding. It can create and edit images and supports multi-turn conversational workflows around visuals.

Context: N/A
Input: $0/M
Output: $0/M

Nano Banana 2 Lite (Gemini 3.1 Flash Lite Image)

Nano Banana 2 Lite (Gemini 3.1 Flash Lite Image) is Google's fastest and most cost-efficient multimodal image generation model, designed for high-throughput visual workflows and real-time applications. It supports text-to-image generation, image editing, and multi-image composition through a unified API, while also producing text outputs alongside images. Delivering image generation in approximately 4 seconds, it combines fast inference with strong character consistency, precise editing, and real-world knowledge. The model generates 1K-resolution images across 14 aspect ratios and embeds an invisible SynthID watermark in all outputs. Optimized for the best balance of quality, speed, and cost, Nano Banana 2 Lite is ideal for prototyping, developer pipelines, and large-scale visual content generation.

Context: 66K
Input: $0.7754/M
Output: Pay per request

Gemini 3.1 Flash TTS Preview

Gemini 3.1 Flash TTS Preview is Google's next-generation text-to-speech model, delivering a major upgrade over Gemini 2.5 Flash TTS. It converts text into natural audio across 70+ languages, with significantly expanded language coverage and improved quality. The model introduces 200+ inline audio control tags (e.g., [whispers], [laughs], [excited]) for fine-grained control over emotion, tone, and pacing, along with support for two speakers with independent voice and style settings. It outputs 24 kHz / 16-bit PCM audio, includes SynthID watermarking, and supports a 32K token context window. Designed for expressive and controllable voice generation, it is well suited for dialogue systems, storytelling, character-driven content, and advanced audio production workflows.

Context: 8.2K
Input: $27.50/M
Output: $0/M

Gemma 4 26B A4B (Free)

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind, featuring 25.2B total parameters with only 3.8B activated per token, delivering near 31B-class quality at a fraction of the compute cost. It supports multimodal inputs including text, images, and video (up to 60s at 1fps). The model includes a 256K token context window, native function calling, configurable thinking/reasoning modes, and structured output support. Released under the Apache 2.0 license, it is well suited for efficient, production-ready multimodal and agentic applications.
