
Build with Nano Banana: A Complete Developer Tutorial for Gemini 2.5 Flash Image

This tutorial provides a comprehensive guide for developers on integrating Gemini 2.5 Flash Image, codenamed Nano Banana, into applications using the Gemini Developer API. It covers project setup, image creation, editing, restoration, handling multiple inputs, conversational editing, aspect ratios, image-only outputs, and best practices. The article includes Python code examples and links to further resources.
  • main points
    1. Comprehensive coverage of Nano Banana's image generation and editing capabilities.
    2. Practical, step-by-step code examples for developers using Python.
    3. Clear guidance on project setup, API key generation, and billing.
  • unique insights
    1. Demonstrates advanced use cases like conversational image editing and photo restoration.
    2. Provides specific prompting tips and best practices for achieving optimal results.
  • practical applications
    1. Enables developers to quickly integrate Gemini 2.5 Flash Image into their applications for image generation and manipulation tasks.
  • key topics
    1. Gemini 2.5 Flash Image (Nano Banana)
    2. Gemini Developer API
    3. Image Generation and Editing
  • key insights
    1. Detailed developer-focused guide to integrating a cutting-edge image generation model.
    2. Practical code examples for common and advanced image manipulation tasks.
    3. Guidance on setup, billing, and best practices for effective API usage.
  • learning outcomes
    1. Understand how to set up a development environment for Gemini 2.5 Flash Image.
    2. Implement image generation and editing functionalities using the Gemini Developer API.
    3. Apply best practices for effective prompting and advanced image manipulation techniques.

Introduction to Nano Banana (Gemini 2.5 Flash Image)

While end-users can access Nano Banana through the Gemini app, Google AI Studio serves as the premier environment for developers to prototype, test prompts, and experiment with AI models before writing code. It acts as the gateway to building with the Gemini API. Developers can utilize Nano Banana within AI Studio free of charge. To begin, navigate to aistudio.google.com, sign in with your Google account, and select Nano Banana from the model picker. For direct access to a new session with the model, use the link ai.studio/banana. AI Studio also offers the 'vibe code' feature at ai.studio/apps, allowing you to build and remix web apps directly.

Project Setup: Installing the SDK

To interact with Nano Banana programmatically, you'll need to install the Google Gen AI SDK. Choose the SDK that aligns with your preferred programming language.

**For Python:** Install the SDK using pip:

```bash
pip install -U google-genai
```

Additionally, install the Pillow library for image manipulation:

```bash
pip install Pillow
```

**For JavaScript / TypeScript:** Install the SDK using npm:

```bash
npm install @google/genai
```

This tutorial will primarily use Python SDK examples. Equivalent code snippets for JavaScript can be found in the provided GitHub Gist.
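The Python examples in this tutorial pass the API key as a string literal. As a minimal sketch, the key could instead be read from an environment variable so that it stays out of source control. The variable name `GEMINI_API_KEY` and the helpers `load_api_key` and `make_client` are assumptions for illustration and are not part of the original tutorial.

```python
import os


def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


def make_client(env_var: str = "GEMINI_API_KEY"):
    """Build a Gen AI client without hard-coding the key (hypothetical helper)."""
    # Imported lazily so load_api_key() stays usable without the SDK installed.
    from google import genai

    return genai.Client(api_key=load_api_key(env_var))
```

With this in place, `client = make_client()` could stand in for the `genai.Client(api_key="YOUR_API_KEY")` line used in the examples.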

Editing Images with Text Prompts

Nano Banana's capabilities extend to editing existing images based on text prompts. By providing both an input image and a descriptive prompt, you can modify the image while the model strives to maintain character and content consistency. This is particularly useful for creative transformations.

Consider this Python example for editing an image:

```python
from google import genai
from PIL import Image
from io import BytesIO

client = genai.Client(api_key="YOUR_API_KEY")

prompt = """Using the image of the cat, create a photorealistic,
street-level view of the cat walking along a sidewalk in a
New York City neighborhood, with the blurred legs of pedestrians
and yellow cabs passing by in the background."""

image = Image.open("cat.png")

# Pass both the text prompt and the image in the 'contents' list
response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents=[prompt, image],
)

for part in response.candidates[0].content.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        image = Image.open(BytesIO(part.inline_data.data))
        image.save("cat2.png")
```

This code takes the previously generated `cat.png` and applies the new prompt to create `cat2.png`, depicting the cat in a new urban environment.
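The loop that walks over `response.candidates[0].content.parts` recurs in every example, so it may be worth factoring out. The sketch below is one way to do that, assuming the response has the shape used above (`candidates`, `content.parts`, `text`, `inline_data.data`). The helper name `save_response_parts` and the fallback to PNG when no MIME type is reported are assumptions. It writes the raw bytes directly, so it does not need Pillow.

```python
from pathlib import Path


def save_response_parts(response, stem: str, out_dir: str = ".") -> list:
    """Print text parts and write image parts to disk; return the saved paths."""
    saved = []
    candidates = getattr(response, "candidates", None) or []
    if not candidates or candidates[0].content is None:
        return saved  # nothing usable came back, for example a blocked request

    for part in candidates[0].content.parts or []:
        if getattr(part, "text", None) is not None:
            print(part.text)
        elif getattr(part, "inline_data", None) is not None:
            # Assumption: fall back to PNG when the part reports no MIME type.
            mime = getattr(part.inline_data, "mime_type", None) or "image/png"
            extension = mime.split("/")[-1]
            path = Path(out_dir) / f"{stem}-{len(saved)}.{extension}"
            path.write_bytes(part.inline_data.data)
            saved.append(path)
    return saved
```

Calling `save_response_parts(response, "cat2")` after `generate_content` would print any text the model returned and write each image part to a numbered file such as `cat2-0.png`.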

Multiple Input Images and Conversational Editing

Nano Banana supports more complex image manipulation tasks by allowing multiple images as input. This is useful for tasks like compositing or applying elements from one image to another.

**Working with Multiple Input Images:**

```python
from google import genai
from PIL import Image
from io import BytesIO

client = genai.Client(api_key="YOUR_API_KEY")

prompt = "Make the girl wear this t-shirt. Leave the background unchanged."

image1 = Image.open("girl.png")
image2 = Image.open("tshirt.png")

response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents=[prompt, image1, image2],
)

for part in response.candidates[0].content.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        image = Image.open(BytesIO(part.inline_data.data))
        image.save("girl-with-tshirt.png")
```

**Conversational Image Editing:** For iterative refinement and maintaining context across multiple edits, you can leverage chat sessions. This allows for a conversational approach to image editing.

```python
from google import genai
from PIL import Image
from io import BytesIO

client = genai.Client(api_key="YOUR_API_KEY")

# Create a chat session
chat = client.chats.create(
    model="gemini-2.5-flash-image"
)

# First image edit
response1 = chat.send_message(
    [
        "Change the cat to a bengal cat, leave everything else the same",
        Image.open("cat.png"),
    ]
)
# display / save image...

# Continue chatting and editing
response2 = chat.send_message("The cat should wear a funny party hat")
# display / save image...
```

**Tip:** If image features begin to degrade over many conversational edits, it's advisable to start a new session with the latest image and a more consolidated prompt to maintain high fidelity.
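The tip about restarting a session can be sketched as a small loop that opens a fresh chat after a fixed number of edits and seeds it with the most recent image. Everything here is an illustrative assumption rather than part of the SDK: the function name `edit_with_restarts`, the `max_turns` threshold, and the two callables the caller supplies. `create_chat` is expected to return an object with a `send_message` method, and `extract_image` pulls the edited image out of a response. The sketch covers only the restart, not the rewriting of earlier instructions into a consolidated prompt.

```python
def edit_with_restarts(create_chat, extract_image, image, instructions, max_turns=3):
    """Apply each instruction in turn, starting a new chat every `max_turns` edits."""
    chat = create_chat()
    turns = 0
    for instruction in instructions:
        if turns >= max_turns:
            # Start over so accumulated drift is not carried into later edits.
            chat = create_chat()
            turns = 0
        # The first message of a session carries the latest image;
        # follow-up messages rely on the chat context instead.
        message = [instruction, image] if turns == 0 else instruction
        response = chat.send_message(message)
        edited = extract_image(response)
        if edited is not None:
            image = edited
        turns += 1
    return image
```

With the SDK, `create_chat` could be `lambda: client.chats.create(model="gemini-2.5-flash-image")`, and `extract_image` could open the first `inline_data` part with Pillow, as the loops in the examples above do.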

Best Practices for Effective Prompting

Achieving optimal results with Nano Banana requires thoughtful prompting. Adhering to these guidelines will significantly enhance your control over the generated images:

* **Be Hyper-Specific:** Provide detailed descriptions of subjects, colors, lighting, and composition. The more precise your prompt, the more predictable the output.
* **Provide Context and Intent:** Clearly articulate the purpose or desired mood of the image. The model's understanding of context influences its creative decisions.
* **Iterate and Refine:** Expect to refine your prompts. Utilize the model's conversational capabilities for incremental adjustments to achieve your desired outcome.
* **Use Step-by-Step Instructions:** For complex scenes, break down your prompt into a series of clear, sequential instructions.
* **Use Positive Framing:** Instead of specifying what *not* to include (e.g., "no cars"), describe the desired scene positively (e.g., "an empty, deserted street with no signs of traffic").
* **Control the Camera:** Employ photographic and cinematic terms (e.g., "wide-angle shot", "macro shot", "low-angle perspective") to direct the composition.

For a deeper understanding, refer to the official blog post on prompting best practices and the prompting guide in the Gemini API documentation.
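Several of the guidelines above (a specific subject, stated intent, camera terms, step-by-step instructions) can be combined in a small prompt-assembly helper. This is only a sketch: the function `build_prompt`, its parameters, and the `Purpose:` / `Camera:` / `Steps:` labels are hypothetical conventions chosen for illustration, not a format the model requires.

```python
def build_prompt(subject, intent=None, camera=None, steps=()):
    """Assemble a prompt from a subject plus optional intent, camera, and steps."""
    lines = [subject.strip()]
    if intent:
        lines.append(f"Purpose: {intent.strip()}")
    if camera:
        lines.append(f"Camera: {camera.strip()}")
    if steps:
        # Number the steps so a complex scene reads as sequential instructions.
        numbered = "\n".join(f"{i}. {step.strip()}" for i, step in enumerate(steps, 1))
        lines.append("Steps:\n" + numbered)
    return "\n".join(lines)


prompt = build_prompt(
    "A bengal cat walking along an empty, deserted New York City sidewalk",
    intent="cover image for a travel blog",
    camera="low-angle perspective, wide-angle shot",
    steps=["Keep the cat's markings unchanged", "Light the scene with warm evening sun"],
)
```

The resulting string can be passed as the text element of `contents` in `generate_content`, alongside any input images.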

 Original link: https://dev.to/googleai/how-to-build-with-nano-banana-complete-developer-tutorial-646
