gemini-3.1-flash-lite-image
Nano Banana 2 Lite (Gemini 3.1 Flash Lite Image) is Google's fastest and most cost-efficient multimodal image generation model, designed for high-throughput visual workflows and real-time applications. It supports text-to-image generation, image editing, and multi-image composition through a unified API, and can return text alongside images. It generates an image in approximately 4 seconds, combining fast inference with strong character consistency, precise editing, and real-world knowledge. The model generates 1K-resolution images across 14 aspect ratios and embeds an invisible SynthID watermark in all outputs. Optimized for the balance of quality, speed, and cost, Nano Banana 2 Lite is well suited for prototyping, developer pipelines, and large-scale visual content generation.
Select an endpoint and copy a working example for this model.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apertis.ai/v1"
)

response = client.chat.completions.create(
    model="gemini-3.1-flash-lite-image",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
    max_tokens=1024,
    temperature=0.7
)

print(response.choices[0].message.content)

# Optional: Enable context compression to reduce token usage
# response = client.chat.completions.create(
#     model="gemini-3.1-flash-lite-image",
#     messages=[{"role": "user", "content": "Hello!"}],
#     extra_body={"compression": {"enabled": True, "model": "gpt-4.1-mini"}}
# )

Supported parameters: model, messages, max_tokens, temperature, top_p, stream, tools, reasoning_effort, stream_options, thinking, extra_body

Use these namespaced identifiers in Cursor IDE to avoid conflicts with built-in models.
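For readers who want to see what the SDK call above puts on the wire, the following is a minimal sketch that builds (but does not send) the equivalent HTTP request using only the Python standard library. The `/chat/completions` path and the `Bearer` authorization header are what the OpenAI client derives from `base_url` and `api_key`; the helper name `build_chat_request` and the `top_p` value are illustrative assumptions, not part of the original example.

```python
import json
import urllib.request

API_BASE = "https://api.apertis.ai/v1"    # base_url from the example above
MODEL_ID = "gemini-3.1-flash-lite-image"  # model ID from the example above


def build_chat_request(prompt, api_key, **params):
    """Build, without sending, a POST request for the chat completions endpoint.

    Hypothetical helper for illustration. Extra keyword arguments
    (max_tokens, temperature, top_p, stream, ...) are copied into the
    JSON body unchanged; arguments set to None are omitted.
    """
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    payload.update({k: v for k, v in params.items() if v is not None})
    return urllib.request.Request(
        url=f"{API_BASE}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "Hello!",
    "YOUR_API_KEY",
    max_tokens=1024,
    temperature=0.7,
    top_p=0.9,  # illustrative value, not taken from the original example
)
body = json.loads(request.data)
print(request.full_url)
print(sorted(body))
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a network connection or a valid key.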
See how this model compares to others from the same provider.
Gemini 3.5 Flash is Google's high-efficiency multimodal model, delivering near-Pro level performance in coding and reasoning at Flash-tier speed and cost. It supports text, image, video, audio, and PDF inputs, making it well suited for diverse multimodal workflows. Optimized for coding proficiency and parallel agentic execution, the model defaults to medium thinking effort for faster, cost-efficient responses while supporting configurable thinking levels (minimal, low, medium, high) for fine-grained cost–performance control.
Gemini 3.1 Flash TTS Preview is Google's next-generation text-to-speech model, delivering a major upgrade over Gemini 2.5 Flash TTS. It converts text into natural audio across 70+ languages, with significantly expanded language coverage and improved quality. The model introduces 200+ inline audio control tags (e.g., [whispers], [laughs], [excited]) for fine-grained control over emotion, tone, and pacing, along with support for two speakers with independent voice and style settings. It outputs 24 kHz / 16-bit PCM audio, includes SynthID watermarking, and supports a 32K token context window. Designed for expressive and controllable voice generation, it is well suited for dialogue systems, storytelling, character-driven content, and advanced audio production workflows.
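Raw PCM audio has no header, so most players need it wrapped in a container before it can be played. The sketch below wraps 24 kHz / 16-bit PCM bytes in a WAV container using Python's standard `wave` module. It assumes mono, little-endian samples (the description above does not state the channel count), and the helper name `pcm_to_wav_bytes` is hypothetical; how the PCM bytes are obtained from the API is not shown.

```python
import io
import wave

SAMPLE_RATE_HZ = 24_000   # 24 kHz, as stated in the model description
SAMPLE_WIDTH_BYTES = 2    # 16-bit samples
CHANNELS = 1              # assumption: mono output


def pcm_to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw PCM samples in a WAV container and return the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


# Stand-in for model output: one second of 16-bit silence.
silence = b"\x00\x00" * SAMPLE_RATE_HZ
wav_bytes = pcm_to_wav_bytes(silence)

# Read the container back to confirm the header matches the stated format.
with wave.open(io.BytesIO(wav_bytes), "rb") as check:
    print(check.getframerate(), check.getsampwidth(), check.getnframes())
```

Writing `wav_bytes` to a file with a `.wav` extension would make it playable in standard audio tools.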
Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind, featuring 25.2B total parameters with only 3.8B activated per token—delivering near 31B-class quality at a fraction of the compute cost. It supports multimodal inputs including text, images, and video (up to 60s at 1fps). The model includes a 256K token context window, native function calling, configurable thinking/reasoning modes, and structured output support. Released under the Apache 2.0 license, it is well suited for efficient, production-ready multimodal and agentic applications.
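As a quick check on the figures above, the following sketch computes the share of parameters active per token and the frame count implied by the video limit. It uses only the numbers quoted in the description; the variable names are illustrative.

```python
# Figures quoted in the description above.
total_params_b = 25.2    # total parameters, in billions
active_params_b = 3.8    # parameters activated per token, in billions
max_video_seconds = 60   # maximum video length
frames_per_second = 1    # sampling rate for video input

# Fraction of the model's weights used for any single token.
active_fraction = active_params_b / total_params_b

# Upper bound on frames the model sees for a maximum-length video.
max_frames = max_video_seconds * frames_per_second

print(f"Active per token: {active_fraction:.1%}")
print(f"Max video frames: {max_frames}")
```

Roughly 15% of the weights take part in each token's forward pass, which is where the compute savings of the MoE design come from; all 25.2B parameters still need to be held in memory.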
Gemini 3.1 Pro Preview is Google's frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Built on the multimodal foundation of the Gemini 3 series, it combines high-precision reasoning across text, image, video, audio, and code with a 1M-token context window for large-scale tasks. The 3.1 update introduces measurable gains on SWE benchmarks and real-world coding environments, along with stronger autonomous execution in structured domains such as finance and spreadsheet-based workflows. Designed for advanced development and agentic systems, it improves long-horizon stability and tool orchestration while adding a new medium thinking level to better balance cost, speed, and performance. Gemini 3.1 Pro Preview is well suited for agentic coding, structured planning, multimodal analysis, financial modeling, spreadsheet automation, and high-context enterprise applications.
Gemini 2.5 Pro is Google's top reasoning model for coding, math, and scientific work. It uses built-in “thinking” to deliver more accurate, context-aware answers and ranks at the top of major benchmarks like LMArena, showing strong alignment and problem-solving ability.
Veo 3.1 is a state-of-the-art generative AI video model developed by Google DeepMind (part of the broader Gemini/Flow ecosystem). It builds on the earlier Veo models to make AI-generated video creation more realistic, expressive, and controllable.
Gemini 2.5 Flash Image Preview (“Nano Banana”) is a cutting-edge image generation model with strong contextual understanding. It can create and edit images and supports multi-turn conversational workflows around visuals.