How a Small Real Estate Investment Team Automated Market Analysis with Gemini Advanced: 15 Hours to 3 Hours Weekly
A five-person real estate investment team in Seoul was spending 15 hours each week manually collecting property listings, running yield calculations, and compiling market reports. By building an automated workflow powered by Gemini Advanced and the Gemini API, they reduced that time to just 3 hours — an 80% reduction — while improving the consistency and depth of their analysis. This case study walks through their exact workflow, the code they used, and the lessons learned along the way.
The Problem: Manual Research Bottlenecks
The team’s weekly research cycle involved three painful steps:
- Data collection: Scraping listings from multiple portals, copying prices, sizes, and locations into spreadsheets.
- Yield calculation: Running cap rate, cash-on-cash return, and gross rent multiplier formulas for each property.
- Report generation: Writing narrative summaries and ranking properties for the investment committee.

Each step was error-prone and repetitive. The team needed a solution that could handle structured data processing and natural-language report writing in a single pipeline.
The Solution Architecture
The final workflow has three stages, all orchestrated through Python scripts calling the Gemini API:
- Stage 1: Collect and normalize listing data from CSV exports.
- Stage 2: Calculate financial metrics with Python, then pass results to Gemini for interpretation.
- Stage 3: Generate a weekly investment brief using Gemini Advanced's long-context capabilities.
Step-by-Step Implementation
Step 1: Environment Setup
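```bash
# Install the Google Generative AI SDK, pandas, and tabulate
# (tabulate is required by DataFrame.to_markdown in Step 5)
pip install google-generativeai pandas openpyxl tabulate

# Set your API key as an environment variable
export GEMINI_API_KEY="YOUR_API_KEY"
```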
You can obtain an API key from Google AI Studio (aistudio.google.com). Note that API rate limits are set by the usage tier of the project behind the key, so confirm in the current Gemini API documentation whether your Gemini Advanced subscription changes them.
Step 2: Configure the Gemini Client
```python
import os

import google.generativeai as genai
import pandas as pd

# Read the key from the environment variable set in Step 1
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-2.5-pro")
```
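When the variable is unset, `os.environ["GEMINI_API_KEY"]` raises a bare `KeyError`, which is easy to misread in a scheduled job's log. The following is a minimal sketch of a fail-fast check. The helper name `require_api_key` is hypothetical and not part of the SDK.

```python
import os


def require_api_key(env=None, name="GEMINI_API_KEY"):
    """Return the API key from the environment or raise a readable error.

    `require_api_key` is a hypothetical helper, not an SDK function.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it before running the pipeline"
        )
    return key


# Usage with a stand-in mapping (pass nothing to read the real environment)
print(require_api_key({"GEMINI_API_KEY": "demo-key"}))
```

The returned value could then be passed to `genai.configure(api_key=...)` in place of the direct dictionary lookup.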
Step 3: Load and Normalize Listing Data
```python
def load_listings(csv_path: str) -> pd.DataFrame:
    df = pd.read_csv(csv_path)
    # Standardize column names
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    # Ensure numeric types
    for col in ["price", "monthly_rent", "area_sqm"]:
        df[col] = pd.to_numeric(df[col], errors="coerce")
    # Drop rows where price or rent could not be parsed
    df.dropna(subset=["price", "monthly_rent"], inplace=True)
    return df


listings = load_listings("weekly_listings.csv")
print(f"Loaded {len(listings)} valid listings")
```
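Portal exports often carry currency symbols and thousands separators in the price columns. `pd.to_numeric(..., errors="coerce")` turns those values into NaN, and `dropna` then silently discards the rows. Below is a small sketch of a cleaning helper that could run before the conversion. `parse_amount` is a hypothetical name, and it assumes commas are thousands separators rather than decimal commas.

```python
import re
from typing import Optional

# Everything except digits and the decimal point is treated as noise.
# Assumption: commas are thousands separators; minus signs are dropped too,
# which is acceptable for prices and rents.
_NON_NUMERIC = re.compile(r"[^\d.]")


def parse_amount(raw: object) -> Optional[float]:
    """Strip symbols and separators; return None if no valid number remains."""
    if raw is None:
        return None
    cleaned = _NON_NUMERIC.sub("", str(raw))
    try:
        return float(cleaned)
    except ValueError:
        return None


for value in ["₩850,000,000", " 1200000 ", "N/A", "3.5"]:
    print(repr(value), "->", parse_amount(value))
```

With pandas, one way to apply it would be `df["price"] = df["price"].map(parse_amount)` ahead of the `pd.to_numeric` loop.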
Step 4: Calculate Financial Metrics
```python
def calculate_metrics(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()  # avoid mutating the caller's DataFrame
    # Annual gross rent
    df["annual_rent"] = df["monthly_rent"] * 12
    # Cap rate (%) approximated as gross yield: annual rent / purchase price.
    # A stricter cap rate would use net operating income instead of gross rent.
    df["cap_rate"] = (df["annual_rent"] / df["price"] * 100).round(2)
    # Gross Rent Multiplier = Price / Annual Rent
    df["grm"] = (df["price"] / df["annual_rent"]).round(1)
    # Price per square meter
    df["price_per_sqm"] = (df["price"] / df["area_sqm"]).round(0)
    return df


listings = calculate_metrics(listings)
# Keep the ten listings with the highest cap rate for the report
top_10 = listings.nlargest(10, "cap_rate")
```
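The metrics in this step use gross rent only, with no operating expenses or financing. The cash-on-cash return mentioned in the problem statement needs those inputs. Here is a worked sketch under assumed numbers. The expense ratio, the loan terms, and the helper names are illustrative assumptions, not the team's figures.

```python
def net_operating_income(annual_rent: float, expense_ratio: float) -> float:
    """Annual rent minus operating expenses, with expenses as a share of rent."""
    return annual_rent * (1 - expense_ratio)


def cash_on_cash(price, monthly_rent, expense_ratio,
                 down_payment_ratio, annual_interest_rate):
    """Pre-tax annual cash flow divided by cash invested, as a percentage.

    Assumes an interest-only loan for simplicity (no principal repayment).
    """
    annual_rent = monthly_rent * 12
    noi = net_operating_income(annual_rent, expense_ratio)
    cash_invested = price * down_payment_ratio
    loan = price - cash_invested
    debt_service = loan * annual_interest_rate
    return round((noi - debt_service) / cash_invested * 100, 2)


# Hypothetical listing: price 500,000 and rent 2,500 per month
price, rent = 500_000, 2_500
noi = net_operating_income(rent * 12, expense_ratio=0.25)
print("NOI-based cap rate %:", round(noi / price * 100, 2))
print("Cash-on-cash %:", cash_on_cash(price, rent, 0.25, 0.40, 0.04))
```

With these inputs the NOI is 22,500 and the interest cost is 12,000, so 10,500 of annual cash flow is earned on 200,000 of invested cash.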
Step 5: Generate the Investment Brief with Gemini
```python
def generate_report(df: pd.DataFrame) -> str:
    # Render the metrics as a Markdown table for the prompt (needs tabulate)
    data_summary = df.to_markdown(index=False)
    prompt = f"""You are a real estate investment analyst.
Below is this week's top property listing data with calculated metrics.
{data_summary}
Write a professional weekly investment brief that includes:
1. Market overview (price trends, notable patterns)
2. Top 3 recommended properties with reasoning
3. Risk factors to watch
4. A comparison table of the top 5 listings by cap rate
Format the output in clean HTML suitable for an email report."""
    response = model.generate_content(prompt)
    return response.text


report_html = generate_report(top_10)
with open("weekly_report.html", "w", encoding="utf-8") as f:
    f.write(report_html)
print("Report saved to weekly_report.html")
```
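Chat-tuned models sometimes wrap HTML output in a Markdown code fence, which would then end up verbatim in the saved file. The following defensive sketch unwraps such a fence before writing. `strip_code_fence` is a hypothetical helper, and whether Gemini adds a fence for this prompt is something to check against your own outputs.

```python
import re

# Built programmatically so this example can itself sit inside a fenced block
FENCE = "`" * 3
_FENCED = re.compile(
    r"^\s*" + FENCE + r"[\w-]*[ \t]*\n(.*?)\n[ \t]*" + FENCE + r"\s*$",
    re.DOTALL,
)


def strip_code_fence(text: str) -> str:
    """Return the fenced body if the whole text is one fenced block.

    Otherwise return the text with surrounding whitespace trimmed.
    """
    match = _FENCED.match(text)
    return match.group(1) if match else text.strip()


wrapped = FENCE + "html\n<h1>Weekly Brief</h1>\n<p>Summary</p>\n" + FENCE
print(strip_code_fence(wrapped))
print(strip_code_fence("<h1>Already clean</h1>"))
```

It could be applied as `report_html = strip_code_fence(generate_report(top_10))` before the file is written.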
Step 6: Automate with a Weekly Cron Job
```bash
# crontab -e (Linux/macOS)
0 8 * * 1 cd /home/user/re-analysis && python run_pipeline.py
```

On Windows, use Task Scheduler to trigger `python run_pipeline.py` every Monday at 8 AM.
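The article does not show the contents of `run_pipeline.py`. One possible shape is sketched below: the three stages are passed in as callables and the function returns an exit code, so a failed Monday run is visible to the scheduler. The function signature and the stand-in stages are assumptions for illustration only.

```python
import logging
from typing import Callable

log = logging.getLogger("pipeline")


def run_pipeline(
    load: Callable[[], list],
    analyze: Callable[[list], list],
    report: Callable[[list], str],
    save: Callable[[str], None],
) -> int:
    """Run the stages in order and return a process exit code (0 = success)."""
    try:
        listings = load()
        log.info("Loaded %d listings", len(listings))
        ranked = analyze(listings)
        save(report(ranked))
        log.info("Report written")
        return 0
    except Exception:
        # Log the full traceback so the failure shows up in the cron log
        log.exception("Pipeline failed")
        return 1


logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")

# Stand-in stages (hypothetical). In the article's setup these would wrap
# load_listings, calculate_metrics plus nlargest, generate_report, and the file write.
outputs = []
code = run_pipeline(
    load=lambda: [{"cap_rate": 4.1}, {"cap_rate": 5.3}],
    analyze=lambda rows: sorted(rows, key=lambda r: r["cap_rate"], reverse=True),
    report=lambda rows: f"<p>Top cap rate: {rows[0]['cap_rate']}%</p>",
    save=outputs.append,
)
print(code, outputs)
```

In a real script, passing the return value to `sys.exit` would let cron or Task Scheduler record a non-zero status on failure.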
Results: Before and After
| Metric | Before | After |
|---|---|---|
| Weekly research hours | 15 hours | 3 hours |
| Listings analyzed per week | 30–50 | 200+ |
| Report turnaround | Wednesday | Monday morning |
| Calculation errors per month | 5–8 | 0 |
| Cost (monthly) | ~$0 (labor only) | ~$20 (API usage) |
- **Request JSON output:** Set `response_mime_type="application/json"` in the `generation_config` passed to `generate_content` when you need Gemini to return JSON text instead of prose. The response can then be parsed with `json.loads`, which simplifies downstream processing.
- **Leverage long context:** Gemini 2.5 Pro supports up to 1 million tokens. Feed it 6 months of historical listing data alongside the current week to get trend-aware commentary.
- **Chain prompts for depth:** Use a first call to identify outliers, then a second call focused only on those outliers for deep-dive analysis.
- **Cache repeated context:** Use the [context caching](https://ai.google.dev/gemini-api/docs/caching) feature for large static datasets (e.g., neighborhood profiles) to reduce costs and latency.
- **Version your prompts:** Store prompts in separate `.txt` files under version control. Small wording changes can significantly alter output quality.
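For the JSON output tip, the model's reply still arrives as text and should be validated before it feeds anything downstream. Here is a minimal sketch of that validation step. The key set and `parse_ranked_listings` are hypothetical, and the commented call only indicates where `response_mime_type` would go in the `google-generativeai` SDK.

```python
import json

# Hypothetical schema: the keys the prompt asks the model to return per listing
REQUIRED_KEYS = {"address", "cap_rate", "recommendation"}


def parse_ranked_listings(raw_text: str) -> list:
    """Parse the model's JSON text and check each item has the expected keys."""
    items = json.loads(raw_text)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of listings")
    for item in items:
        if not isinstance(item, dict):
            raise ValueError("each listing must be a JSON object")
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            raise ValueError(f"listing is missing keys: {sorted(missing)}")
    return items


# In the pipeline the text would come from a call along these lines:
#   response = model.generate_content(
#       prompt,
#       generation_config={"response_mime_type": "application/json"},
#   )
#   raw_text = response.text
sample = '[{"address": "A-101", "cap_rate": 5.2, "recommendation": "buy"}]'
print(parse_ranked_listings(sample)[0]["cap_rate"])
```

Raising on a missing key keeps a malformed reply from silently producing an incomplete report.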
## Troubleshooting Common Issues
| Error | Cause | Fix |
|---|---|---|
| `google.api_core.exceptions.ResourceExhausted` | Rate limit exceeded | Add exponential backoff: `time.sleep(2 ** attempt)`. Consider upgrading to a paid tier for higher RPM. |
| `ValueError: Could not convert string to float` | Non-numeric characters in price columns (e.g., currency symbols) | Add `df["price"] = df["price"].str.replace(r"[^\d.]", "", regex=True)` before conversion. |
| Report contains hallucinated property data | Gemini fills gaps when data is incomplete | Add an explicit instruction: *"Only reference properties present in the data. Do not fabricate listings."* |
| `PermissionDenied: 403` | API key lacks Generative Language API access | Enable the **Generative Language API** in Google Cloud Console for your project. |
| Report formatting inconsistent week to week | Non-deterministic generation | Set `generation_config=genai.GenerationConfig(temperature=0.2)` for more consistent output. |
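The backoff fix in the first row can be wrapped in a small retry helper. The sketch below uses a stand-in exception and an injected `sleep` so it runs without the SDK or any real waiting. `call_with_backoff` and `FakeRateLimit` are hypothetical names.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the given exceptions with waits of 1s, 2s, 4s, ..."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(2 ** attempt)
    raise RuntimeError("max_attempts must be at least 1")


# Demo with a stand-in error type and a fake sleep that only records the waits
class FakeRateLimit(Exception):
    pass


waits = []
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise FakeRateLimit()
    return "ok"


print(call_with_backoff(flaky, (FakeRateLimit,), sleep=waits.append), waits)
```

With the SDK, `retry_on` would be `(google.api_core.exceptions.ResourceExhausted,)` and `func` a lambda wrapping `model.generate_content(prompt)`.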
Can Gemini Advanced replace a professional real estate analyst?
No. Gemini automates the repetitive data processing and report drafting tasks, but investment decisions still require human judgment about factors like neighborhood development plans, regulatory changes, and relationship-based deal flow. The team in this case study uses Gemini’s output as a starting point that their analysts then review, annotate, and present.
How much does it cost to run this workflow weekly?
For a dataset of 200 listings per week with one report generation call, the Gemini API cost is typically under $1 per run using Gemini 2.5 Pro. A Gemini Advanced subscription ($19.99/month) provides access to the model via AI Studio. Total monthly cost is approximately $20–25 including API overages. This is significantly cheaper than the labor cost of the roughly 12 hours of manual work the workflow replaces each week.
Can this workflow handle listings in languages other than English?
Yes. Gemini supports multilingual input and output. The team originally worked with Korean-language listing data and generated reports in both Korean and English. Simply specify the desired output language in your prompt, e.g., “Write the report in Korean”. The model handles mixed-language data (e.g., Korean addresses with English financial terms) without issues.