Lessons learned from building FotoBareng with Gemini Nano Banana
In December 2025, I had the opportunity to experiment with Google’s Gemini Nano Banana model to build a simple application called FotoBareng. The goal of this project was to create a web app that allows users to blend several pictures together. In this post, I’ll share some lessons learned from building FotoBareng using Gemini Nano Banana.
Lesson 1: Security-Driven Development
When building web applications, especially those that interact with AI models and handle user data, security should be at the forefront of your development process. Here are three key principles I followed:
Use the Simplest Form of Auth
For the initial version of FotoBareng, I kept authentication simple: Firebase Authentication with cookie-based sessions for client-to-server communication was sufficient to get started.
The beauty of this approach is that it’s native to the browser and can be utilized automatically by fetch API calls. There’s no need to overcomplicate things in the beginning - start simple and iterate as your security requirements grow.
Protect All Endpoints by Default
One of the most important security practices is to protect all endpoints by default. Opening up an endpoint later is far easier than retrofitting protection onto one that is already public.
For public endpoints that need to be accessible, I protected them with anonymous cookies. This approach ensures that even publicly accessible endpoints have some level of protection and monitoring capability.
Validation, Usage Quotas, and Limits
Always validate user requests. This is crucial not just for security but also for maintaining a good user experience. I implemented fair quotas and limits to protect users from “abusive” users who might try to monopolize resources.
Input validation, rate limiting, and usage quotas help ensure that the application remains stable and accessible for all users, while preventing potential abuse or attacks.
Lesson 2: Image Load Performance
Performance optimization is critical when working with AI-generated images, especially when dealing with large file sizes. Here’s what I learned about optimizing image load performance in FotoBareng:
The Challenge: Large PNG Files from Gemini
One of the first challenges I encountered was that Gemini generates PNG files with sizes greater than 2MB per image. While PNG offers lossless compression, the file sizes can significantly impact load times and bandwidth consumption, especially for users on slower connections or mobile devices.
The Solution: Asynchronous Format Conversion
To address this, I implemented an asynchronous image conversion pipeline. After receiving the PNG from Gemini, the system converts it to a more efficient format like WEBP in the background.
WEBP offers impressive compression - the converted images are typically around 10% of the original PNG size. This means a 2MB PNG can be reduced to approximately 200KB, resulting in:
- Faster page load times - Images load 10x faster
- Reduced bandwidth costs - Both for the server and users
- Better mobile experience - Critical for users on limited data plans
The key here is to do this conversion asynchronously, so users aren’t blocked waiting for the conversion to complete. They can see the original image immediately while the optimized version is being prepared.
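The pipeline shape can be sketched as a job queue plus a background worker. The `convert` callable is an assumption here; in practice it might wrap Pillow's `Image.save(..., format="WEBP")`:

```python
import queue
import threading

# Sketch of the background conversion pipeline: the request handler serves
# the original PNG immediately and enqueues a conversion job; a worker
# thread performs the PNG-to-WEBP conversion later and stores the result.
class ConversionPipeline:
    def __init__(self, convert, store):
        self.convert = convert      # bytes -> bytes (e.g. PNG -> WEBP)
        self.store = store          # (image_id, converted_bytes) -> None
        self.jobs: queue.Queue = queue.Queue()
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def submit(self, image_id: str, png_bytes: bytes) -> None:
        self.jobs.put((image_id, png_bytes))

    def _run(self) -> None:
        while True:
            image_id, png_bytes = self.jobs.get()
            self.store(image_id, self.convert(png_bytes))
            self.jobs.task_done()
```

Because `submit` returns immediately, the user-facing response is never blocked on the conversion.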
CDN Configuration Matters
Another important lesson was ensuring proper CDN configuration. Even with optimized images, poor CDN settings can bottleneck your performance gains. Make sure your CDN is configured to:
- Cache images effectively
- Serve content from geographically distributed edge locations
- Use proper cache headers and TTL values
- Support modern image formats like WEBP
A properly configured CDN can make the difference between a sluggish and a snappy user experience, especially for a global audience.
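For the cache-header point, here is an illustrative set of response headers for a converted, content-addressed image. The specific values (a one-year TTL, `immutable`) assume the image URL changes whenever the content changes:

```python
# Hypothetical response headers for a converted image whose URL is
# content-addressed, so it can be cached aggressively at the CDN edge
# and in the browser.
def image_cache_headers(max_age_days: int = 365) -> dict[str, str]:
    max_age = max_age_days * 24 * 60 * 60
    return {
        "Content-Type": "image/webp",
        "Cache-Control": f"public, max-age={max_age}, immutable",
        "Vary": "Accept",  # lets the CDN serve a PNG fallback to clients without WEBP support
    }
```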
Lesson 3: Pricing is Tricky - Plan from Day One
One of the most critical lessons I learned while building FotoBareng is that pricing strategy needs to be considered from the very beginning, not as an afterthought. When you’re working with AI models like Gemini Nano Banana, costs can add up quickly, and without proper planning, you might find yourself in financial trouble.
Free Trial & User Acquisition: Who Pays for Free?
The first question you need to answer is: how will you cover the cost of free generations? Who pays for users during the free trial period?
In the early days of building a product, offering free trials is essential for user acquisition. However, with AI-powered applications, each “free” generation actually costs you money in API calls, compute resources, and storage. You need a clear strategy:
- Time-boxed trials: Limit free usage to a specific time period
- Generation limits: Cap the number of free generations per user
- Quality tiers: Offer lower quality or smaller images for free users
- Investor funding or self-funding: Be prepared to subsidize early users as a customer acquisition cost
The key is to find the right balance between being generous enough to attract users while not bankrupting yourself in the process.
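The first two strategies above can be combined into a single check. The numbers here are illustrative, not FotoBareng's actual limits:

```python
from datetime import datetime, timedelta

FREE_GENERATIONS = 10   # generation cap per free user (illustrative)
TRIAL_DAYS = 14         # time-boxed trial window (illustrative)

def can_generate_free(signup: datetime, used: int, now: datetime) -> bool:
    """Allow a free generation only while the trial is active AND the
    per-user cap has not been reached."""
    in_trial = now - signup <= timedelta(days=TRIAL_DAYS)
    return in_trial and used < FREE_GENERATIONS
```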
Input & Output Pricing Rules: Track Everything
When working with AI models, input and output have separate pricing rules. For image generation:
- Input costs: Text prompts, reference images, and parameters
- Output costs: Generated image size, quality, and format
What makes this tricky is that different image sizes incur different costs. A 512x512 image might cost pennies, but a 2048x2048 high-quality image could cost significantly more. You need to:
- Always record and aggregate user token requests for audit: Keep detailed logs of every API call, including input tokens, output size, and associated costs
- Implement proper tracking: Build a system that can attribute costs to specific users and features
- Monitor usage patterns: Identify which features are most expensive and optimize accordingly
This data becomes invaluable not just for billing, but also for understanding user behavior and optimizing your infrastructure.
Hidden & Infrastructure Costs: Factor Everything In
Beyond the obvious AI model costs, there are numerous hidden and infrastructure costs that can catch you off guard:
- Storage costs: Every generated image needs to be stored, and with 2MB+ PNGs, costs can escalate quickly
- CDN bandwidth: Serving images to users worldwide isn’t free
- Database operations: Storing metadata, user information, and audit logs
- Computing resources: Image conversion, resizing, and optimization processes
- Backup and disaster recovery: Essential but often overlooked
When calculating your pricing, factor in all these operational expenses. A common mistake is to only account for the AI model API costs while ignoring that infrastructure might represent 30-50% of your total operational costs.
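As a back-of-the-envelope check, if infrastructure is some share of *total* cost, the total is the API cost divided by the remaining share. All inputs below are illustrative assumptions:

```python
# Rough monthly cost estimate including the 30-50% infrastructure share
# described above. If infra is 40% of the total, then
#   total = api_cost / (1 - 0.4)
def monthly_cost(generations: int, api_cost_per_generation: float,
                 infra_overhead: float = 0.4) -> float:
    return generations * api_cost_per_generation / (1 - infra_overhead)
```

For example, 1,000 generations at $0.04 each is $40 in API spend, but if infrastructure is half the total, the real monthly cost is $80.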
The Bottom Line
Pricing isn’t just about making money - it’s about sustainability. By thinking about pricing from day one, you:
- Make informed decisions about feature scope
- Build proper tracking and monitoring systems from the start
- Understand your unit economics before scaling
- Can pivot your pricing model based on real data rather than guesswork
Don’t wait until you have thousands of users to figure out your costs. By then, it might be too late to build a sustainable business. Start tracking, start calculating, and start planning your pricing strategy from day one.
Lesson 4: Monitor LLM Usage and Performance
Building on the previous lesson about pricing, monitoring your LLM usage and performance is not just important - it’s essential for running a sustainable AI-powered application. Without proper monitoring, you’re flying blind, unable to optimize costs, improve user experience, or troubleshoot issues effectively.
Input/Output Token Usage: Know Your Consumption
Every interaction with Gemini Nano Banana consumes tokens - both for input (your prompts and parameters) and output (the generated images and metadata). Tracking token usage is fundamental for several reasons:
- Cost attribution: Understanding which features and users consume the most tokens helps you optimize your cost structure
- Budget forecasting: Historical token usage data allows you to predict future costs as you scale
- Anomaly detection: Sudden spikes in token usage can indicate bugs, abuse, or inefficient prompt engineering
- Optimization opportunities: Identifying high-token operations helps you prioritize optimization efforts
Implement real-time dashboards that show token consumption by user, feature, and time period. This visibility is crucial for making data-driven decisions about product development and resource allocation.
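The raw data behind such a dashboard can be as simple as counters keyed by user, feature, and day. A sketch, with all names illustrative:

```python
from collections import Counter
from datetime import datetime

# Token accounting keyed by (user, feature, day) -- the raw data a
# dashboard would aggregate and chart.
class TokenMeter:
    def __init__(self):
        self.input_tokens: Counter = Counter()
        self.output_tokens: Counter = Counter()

    def record(self, user: str, feature: str, when: datetime,
               tokens_in: int, tokens_out: int) -> None:
        key = (user, feature, when.date().isoformat())
        self.input_tokens[key] += tokens_in
        self.output_tokens[key] += tokens_out

    def total_for_user(self, user: str) -> int:
        return (sum(v for (u, _, _), v in self.input_tokens.items() if u == user)
                + sum(v for (u, _, _), v in self.output_tokens.items() if u == user))
```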
LLM RED Metrics: Rate, Errors, and Duration
Similar to traditional web services, LLMs should be monitored using RED metrics:
- Rate: How many requests per second/minute are you making to the LLM? This helps you understand load patterns and capacity planning needs.
- Errors: What’s your error rate? Are certain prompts failing more than others? High error rates indicate problems with your integration, prompt design, or service reliability.
- Duration: How long does each request take? Long durations impact user experience and may indicate performance bottlenecks or inefficient prompt structures.
Set up alerts for when these metrics deviate from normal patterns. For example, if your error rate suddenly jumps from 1% to 10%, you need to know immediately so you can investigate and resolve the issue before it impacts too many users.
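A minimal RED-metrics window with an error-rate alert might look like this. It is a sketch of the idea, not a replacement for a real metrics stack like Prometheus:

```python
# Count requests, errors, and durations, and flag when the error rate
# crosses a threshold (e.g. the 1% -> 10% jump described above).
class RedMetrics:
    def __init__(self, error_rate_alert: float = 0.10):
        self.requests = 0
        self.errors = 0
        self.total_duration = 0.0
        self.error_rate_alert = error_rate_alert

    def observe(self, duration_seconds: float, ok: bool) -> None:
        self.requests += 1
        self.total_duration += duration_seconds
        if not ok:
            self.errors += 1

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0

    @property
    def avg_duration(self) -> float:
        return self.total_duration / self.requests if self.requests else 0.0

    def should_alert(self) -> bool:
        return self.error_rate >= self.error_rate_alert
```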
Successful vs. Failed Image Generations: Quality Matters
Not all generated images are created equal. Track the number of successful versus failed image generations:
- Success rate: What percentage of requests produce usable images? A low success rate indicates problems with prompt engineering, model compatibility, or parameter settings.
- Failure analysis: Categorize failures - are they API errors, timeout issues, or quality problems? Different failure types require different solutions.
- User satisfaction: Correlate technical success with user behavior - do users keep images that are “technically successful” or do they regenerate them?
This data helps you improve your prompt engineering, adjust parameters, and set realistic expectations with users about generation success rates. It also helps justify the costs - if only 70% of generations are successful, you need to factor that waste into your pricing.
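The waste calculation is simple but easy to forget: failed generations still cost money, so the effective cost per *usable* image is the API cost divided by the success rate.

```python
# At a 70% success rate, each usable image effectively costs the price of
# ~1.43 API calls. The $0.04 figure below is an illustrative assumption.
def effective_cost_per_usable_image(api_cost: float, success_rate: float) -> float:
    return api_cost / success_rate
```

So at $0.04 per call and a 70% success rate, each usable image effectively costs about $0.057, and your pricing needs to absorb that gap.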
Free Credits Claimed: Managing Your Burn Rate
If you’re offering free credits for user acquisition, you must track how quickly they’re being consumed:
- Redemption rate: How many users are actually using their free credits versus letting them expire?
- Consumption patterns: Are users burning through credits quickly or spreading them out over time?
- Conversion indicators: Do users who use all their free credits convert to paid plans?
- Abuse detection: Are some users gaming the system to get unlimited free generations?
Understanding free credit usage helps you:
- Set appropriate credit amounts for new users
- Identify the right balance between generous onboarding and financial sustainability
- Detect fraudulent or abusive behavior early
- Calculate your true customer acquisition cost
Building a Monitoring Culture
Effective monitoring isn’t just about collecting data - it’s about building a culture where decisions are driven by metrics rather than assumptions. Set up dashboards that your entire team can access, establish regular reviews of key metrics, and use the data to continuously improve both your product and your business model.
Remember: what gets measured gets managed. By monitoring LLM usage and performance from day one, you position yourself to build a scalable, sustainable AI-powered application.