How a Southeast Asian E-Commerce Seller Automated 50,000 Multilingual Review Analyses with Gemini Advanced

Turning 50,000 Reviews in 6 Languages into Actionable Product Insights

For cross-border e-commerce sellers operating across Southeast Asia, customer reviews arrive in Thai, Vietnamese, Indonesian, Tagalog, Malay, and Burmese — often mixed with English slang and local idioms. Manually reading and categorizing these reviews is impossible at scale. This case study documents how one Shopee and Lazada seller used the Gemini Advanced API to build an automated pipeline that analyzed over 50,000 multilingual reviews, identified product defects by category, and reduced return rates by 23% within a single quarter.

The Challenge

The seller — a consumer electronics brand shipping phone accessories to six ASEAN markets — faced three core problems:

  • Volume: 8,000+ new reviews per month across platforms and languages
  • Language diversity: Reviews in Thai, Vietnamese, Bahasa Indonesia, Tagalog, Malay, and Burmese with code-switching
  • Actionability gap: No structured way to route review insights to the product and QA teams

Solution Architecture

The pipeline consists of four stages: extraction, multilingual analysis, categorization, and reporting — all powered by the Gemini 2.5 Pro model via the Google AI API.

Step 1: Environment Setup

# Install the Google Gen AI SDK
pip install google-genai pandas openpyxl

# Set your API key as an environment variable
export GEMINI_API_KEY="YOUR_API_KEY"

Step 2: Initialize the Gemini Client

from google import genai
import json
import pandas as pd

# The client reads the GEMINI_API_KEY environment variable set in Step 1
client = genai.Client()
MODEL = "gemini-2.5-pro"

Step 3: Build the Review Analysis Prompt

The key to accurate multilingual analysis is a structured system prompt that instructs Gemini to handle all six languages natively — without a separate translation step.

SYSTEM_PROMPT = """
You are a multilingual product review analyst specializing in Southeast Asian languages. You can natively understand Thai, Vietnamese, Indonesian, Tagalog, Malay, and Burmese.

For each review, return a JSON object with these fields:
- "original_language": detected language code (th, vi, id, tl, ms, my, en)
- "sentiment": "positive", "negative", or "neutral"
- "category": one of ["quality", "shipping", "packaging", "functionality", "value", "other"]
- "key_issues": array of specific issues mentioned (in English)
- "severity": 1-5 scale (5 = critical defect)
- "english_summary": one-sentence English summary

Respond ONLY with valid JSON. No commentary.
"""
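Because every downstream step assumes this exact JSON schema, it can be worth guarding against drift with a small validator. The helper below is not part of the original pipeline — it is an illustrative sketch that mirrors the field names and allowed values from the system prompt:

```python
# Minimal schema check for the JSON object the system prompt requests.
# Field names and allowed values mirror the prompt; the helper is illustrative.
REQUIRED_FIELDS = {
    "original_language": str,
    "sentiment": str,
    "category": str,
    "key_issues": list,
    "severity": int,
    "english_summary": str,
}
VALID_SENTIMENTS = {"positive", "negative", "neutral"}
VALID_CATEGORIES = {"quality", "shipping", "packaging", "functionality", "value", "other"}

def validate_analysis(obj):
    """Return a list of problems; an empty list means the object matches the schema."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in obj:
            problems.append(f"missing field: {field}")
        elif not isinstance(obj[field], expected_type):
            problems.append(f"wrong type for {field}")
    if obj.get("sentiment") not in VALID_SENTIMENTS:
        problems.append("invalid sentiment")
    if obj.get("category") not in VALID_CATEGORIES:
        problems.append("invalid category")
    if not isinstance(obj.get("severity"), int) or not 1 <= obj["severity"] <= 5:
        problems.append("severity out of range")
    return problems
```

Rejected objects can be logged alongside the raw model output for inspection, rather than silently polluting the results DataFrame.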

def analyze_review(review_text):
    response = client.models.generate_content(
        model=MODEL,
        contents=review_text,
        config=genai.types.GenerateContentConfig(
            system_instruction=SYSTEM_PROMPT,
            temperature=0.1,
            response_mime_type="application/json"
        )
    )
    return json.loads(response.text)

Step 4: Batch Processing with Rate Limiting

import time

def process_reviews_batch(reviews_df, batch_size=10):
    results = []
    for i in range(0, len(reviews_df), batch_size):
        batch = reviews_df.iloc[i:i+batch_size]
        for _, row in batch.iterrows():
            try:
                analysis = analyze_review(row["review_text"])
                analysis["review_id"] = row["review_id"]
                analysis["product_sku"] = row["sku"]
                results.append(analysis)
            except Exception as e:
                results.append({
                    "review_id": row["review_id"],
                    "error": str(e)
                })
        time.sleep(2)  # Respect rate limits
        print(f"Processed {min(i+batch_size, len(reviews_df))}/{len(reviews_df)}")
    return pd.DataFrame(results)

# Load and process
reviews = pd.read_csv("reviews_export.csv")
results_df = process_reviews_batch(reviews)
results_df.to_excel("analysis_results.xlsx", index=False)
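The batch loop above assumes the CSV export contains review_id, sku, and review_text columns. A quick guard before processing (a hypothetical helper, not part of the original pipeline) fails fast on a malformed export instead of erroring mid-batch:

```python
# Columns the batch loop reads from each row; check them before spending API calls.
REQUIRED_COLUMNS = {"review_id", "sku", "review_text"}

def missing_columns(columns):
    """Return the set of required columns absent from the export."""
    return REQUIRED_COLUMNS - set(columns)

# Usage: call with reviews.columns before process_reviews_batch(reviews)
print(missing_columns(["review_id", "sku", "review_text", "rating"]))  # set()
```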

Step 5: Generate Executive Summary

def generate_summary(results_df):
    stats = results_df.groupby(["category", "sentiment"]).size().to_dict()
    top_issues = results_df.explode("key_issues")["key_issues"].value_counts().head(10).to_dict()

    summary_prompt = f"""
    Based on the analysis of {len(results_df)} customer reviews:
    Category/Sentiment breakdown: {json.dumps(stats)}
    Top 10 issues: {json.dumps(top_issues)}

    Write a concise executive summary with:
    1. Top 3 critical findings
    2. Recommended actions ranked by impact
    3. Market-specific patterns (which countries report which issues)
    """

    response = client.models.generate_content(
        model=MODEL,
        contents=summary_prompt
    )
    return response.text

print(generate_summary(results_df))
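The two pandas aggregations feeding the summary prompt can be mimicked in plain Python, which makes the shape of the data clearer. The rows below are invented sample data standing in for results_df:

```python
from collections import Counter

# Toy stand-ins for rows of results_df (invented sample data)
rows = [
    {"category": "quality", "sentiment": "negative", "key_issues": ["cracked case", "loose fit"]},
    {"category": "shipping", "sentiment": "negative", "key_issues": ["late delivery"]},
    {"category": "quality", "sentiment": "negative", "key_issues": ["cracked case"]},
]

# Equivalent of results_df.groupby(["category", "sentiment"]).size()
stats = Counter((r["category"], r["sentiment"]) for r in rows)

# Equivalent of results_df.explode("key_issues")["key_issues"].value_counts()
top_issues = Counter(issue for r in rows for issue in r["key_issues"])

print(stats[("quality", "negative")])  # 2
print(top_issues.most_common(1))       # [('cracked case', 2)]
```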

Results

| Metric | Before | After (90 days) | Change |
|---|---|---|---|
| Return rate | 8.7% | 6.7% | -23% |
| Time to insight | 2 weeks (manual) | 4 hours (automated) | -96% |
| Languages covered | 2 (EN, ID) | 6 | +200% |
| Reviews analyzed/month | ~500 | 8,000+ | +1,500% |
| Product issues identified | 3-5 per quarter | 18 per quarter | +300% |
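The percentage deltas follow directly from the before/after figures; a quick sanity check:

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, as a rounded percentage."""
    return round((after - before) / before * 100)

print(pct_change(8.7, 6.7))    # -23  (return rate)
print(pct_change(2, 6))        # 200  (languages covered)
print(pct_change(500, 8000))   # 1500 (reviews analyzed/month)
```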
Pro Tips for Power Users

  • Use response_mime_type="application/json": This forces Gemini to return valid JSON every time, eliminating parsing errors from markdown-wrapped responses.
  • Set temperature to 0.1: For analytical tasks, low temperature ensures consistent categorization across thousands of reviews.
  • Batch by language: Group reviews by detected language before processing. This reduces context-switching overhead and improves accuracy for low-resource languages like Burmese.
  • Cache your system prompt: When using the Gemini API at scale, leverage context caching to reduce costs on the system instruction across batches.
  • Combine with Google Sheets API: Push results directly to a shared Google Sheet for real-time dashboards that product and QA teams can monitor.
  • Use grounding with Google Search: For reviews mentioning competitor products, enable grounding to verify claims and identify competitive patterns.

Troubleshooting

Error: 429 Resource Exhausted

You have exceeded the API rate limit. The free tier allows 15 requests per minute for Gemini 2.5 Pro. Add exponential backoff:

import time
import random

def analyze_with_retry(text, max_retries=3):
    for attempt in range(max_retries):
        try:
            return analyze_review(text)
        except Exception as e:
            if "429" in str(e):
                wait = (2 ** attempt) + random.uniform(0, 1)
                time.sleep(wait)
            else:
                raise
    raise Exception("Max retries exceeded")
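The same backoff pattern can be factored into a reusable wrapper. This is a sketch, not the article's original code; the sleep function is injectable so the waits can be skipped when testing:

```python
import random
import time

def with_retry(fn, *args, max_retries=3, sleep=time.sleep):
    """Call fn(*args), retrying rate-limit (429) errors with exponential backoff.

    Non-429 errors, and a 429 on the final attempt, are re-raised to the caller.
    """
    for attempt in range(max_retries):
        try:
            return fn(*args)
        except Exception as e:
            if "429" in str(e) and attempt < max_retries - 1:
                # Base wait doubles each attempt: 1s, 2s, 4s... plus jitter
                sleep((2 ** attempt) + random.uniform(0, 1))
            else:
                raise
    raise Exception("Max retries exceeded")
```

In the pipeline, `with_retry(analyze_review, row["review_text"])` would replace the bare `analyze_review` call inside the batch loop.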

Error: Invalid JSON in Response

Occasionally, very short or emoji-only reviews can cause malformed output. Add validation:

def safe_analyze(text):
    if len(text.strip()) < 3:
        return {"sentiment": "neutral", "category": "other", "key_issues": [], "severity": 1}
    return analyze_review(text)

Poor Accuracy on Burmese or Tagalog Reviews

For lower-resource languages, add few-shot examples in your system prompt. Include 2-3 sample reviews with expected JSON output for each language to guide the model.
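One way to wire those examples in is to append worked review/output pairs to the system prompt string. The sketch below uses an invented Tagalog/English placeholder review; real few-shot examples should come from your own labeled data:

```python
import json

# Hypothetical few-shot example; replace with reviews labeled by a native speaker.
FEW_SHOT_EXAMPLES = [
    {
        "review": "Maganda yung case pero sira agad yung kapit sa phone",  # Tagalog/English mix
        "output": {
            "original_language": "tl",
            "sentiment": "negative",
            "category": "quality",
            "key_issues": ["grip broke quickly"],
            "severity": 4,
            "english_summary": "The case looks nice but its grip on the phone broke quickly.",
        },
    },
]

def build_few_shot_prompt(base_prompt, examples):
    """Append worked review/output pairs after the base system prompt."""
    parts = [base_prompt, "\nExamples:"]
    for ex in examples:
        parts.append(f"Review: {ex['review']}")
        parts.append(f"Output: {json.dumps(ex['output'], ensure_ascii=False)}")
    return "\n".join(parts)
```

The result of `build_few_shot_prompt(SYSTEM_PROMPT, FEW_SHOT_EXAMPLES)` would then be passed as the system_instruction in place of the bare SYSTEM_PROMPT.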

Large File Processing Timeout

For datasets exceeding 10,000 reviews, split into daily chunks and use asynchronous processing with asyncio and the async Gemini client to maximize throughput.
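The concurrency pattern looks roughly like this. The coroutine below is a stand-in for the real async Gemini call, with a short sleep simulating network latency; the semaphore bounds in-flight requests so throughput stays under your rate limit:

```python
import asyncio

async def analyze_async(review_text, semaphore):
    """Stand-in for an async Gemini request; swap the sleep for the real API call."""
    async with semaphore:
        await asyncio.sleep(0.01)  # placeholder for request latency
        return {"review": review_text, "sentiment": "neutral"}

async def process_all(reviews, max_concurrency=10):
    # The semaphore caps how many requests run at once
    semaphore = asyncio.Semaphore(max_concurrency)
    tasks = [analyze_async(r, semaphore) for r in reviews]
    return await asyncio.gather(*tasks)

results = asyncio.run(process_all([f"review {i}" for i in range(25)]))
print(len(results))  # 25
```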

Frequently Asked Questions

Can Gemini Advanced handle code-switched reviews (e.g., Thai mixed with English)?

Yes. Gemini 2.5 Pro handles code-switching natively. In this case study, approximately 35% of Thai and Tagalog reviews contained English words or phrases. The model correctly identified the primary language while extracting meaning from both languages without requiring preprocessing or language separation.

What is the cost of analyzing 50,000 reviews with the Gemini API?

Using Gemini 2.5 Pro, with an average review length of 80 tokens and a structured JSON response of approximately 120 tokens, the total cost for 50,000 reviews is roughly $15-25 USD at current pricing. Using context caching for the system prompt can reduce this by an additional 20-30%. The free tier (15 RPM) can process the full dataset in approximately 56 hours; the paid tier processes it in under 4 hours.
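The throughput figure can be sanity-checked with simple arithmetic, using the token counts and rate limits stated above:

```python
reviews = 50_000
input_tokens = reviews * 80     # average review length
output_tokens = reviews * 120   # average structured JSON response
total_tokens = input_tokens + output_tokens
print(total_tokens)             # 10000000 (10M tokens across the dataset)

free_tier_rpm = 15              # requests per minute on the free tier
hours_free_tier = reviews / free_tier_rpm / 60
print(round(hours_free_tier, 1))  # 55.6, matching the ~56-hour estimate
```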

How does this approach compare to using dedicated NLP tools like AWS Comprehend for multilingual sentiment analysis?

Dedicated NLP services typically offer sentiment and entity detection but lack the ability to perform nuanced categorization, severity scoring, and natural-language summarization in a single call. Gemini’s advantage is the unified pipeline — one API call extracts sentiment, categorizes the issue, assigns severity, and summarizes in English. This eliminates the need to chain multiple services together. However, for pure sentiment-only analysis at very high volume (millions of records), dedicated NLP services may offer lower per-unit costs.
