Perplexity API Complete Setup Guide: API Key, Python SDK, Citation Parsing & Search Models

Perplexity AI offers a powerful API that combines large language models with real-time web search capabilities. This guide walks you through everything from obtaining your API key to parsing source citations and selecting the right search-augmented model for your use case.

Step 1: Create a Perplexity Account and Generate Your API Key

  • Visit perplexity.ai and sign up for an account (or log in if you already have one).
  • Navigate to Settings → API, or go directly to the API settings page.
  • Click Generate API Key and copy the key immediately; it is only displayed once.
  • Add billing information. The Perplexity API uses pay-per-request pricing, and you must have a valid payment method on file before making API calls.
  • Store your key securely, and never commit it to version control.

Store Your Key as an Environment Variable

# Linux / macOS
export PERPLEXITY_API_KEY="YOUR_API_KEY"

# Windows PowerShell
$env:PERPLEXITY_API_KEY="YOUR_API_KEY"

# Or add to a .env file (never commit this file)
echo "PERPLEXITY_API_KEY=YOUR_API_KEY" >> .env

Step 2: Install the Required Python Packages

Perplexity's API is compatible with the OpenAI SDK, so you can use the official openai Python library as your client.

# Install the OpenAI Python SDK
pip install openai

# Optional: for environment variable management
pip install python-dotenv

# Optional: for async workflows
pip install httpx
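If you would rather not add the python-dotenv dependency, a minimal stdlib-only .env loader might look like the sketch below. This is an illustrative helper, not part of any library; python-dotenv handles quoting, interpolation, and edge cases far more robustly.

```python
import os

def load_env_file(path=".env"):
    """Minimal .env parser: reads KEY=VALUE lines, skips blanks and '#' comments.

    Existing environment variables are not overwritten (setdefault).
    """
    if not os.path.exists(path):
        return
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip().strip('"'))

# Demo: write a throwaway .env file and load it
with open(".env.demo", "w") as fh:
    fh.write('PERPLEXITY_API_KEY="demo-key"\n')
load_env_file(".env.demo")
print(os.environ.get("PERPLEXITY_API_KEY"))
```

Because the loader uses setdefault, a key already exported in your shell wins over the value in the file, which is usually the behavior you want in CI.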

Step 3: Configure the Python Client

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("PERPLEXITY_API_KEY", "YOUR_API_KEY"),
    base_url="https://api.perplexity.ai"
)

That's it: one client object and you are fully configured. The key difference from standard OpenAI usage is the base_url parameter, which points to Perplexity's endpoint instead of OpenAI's.

Step 4: Make Your First API Call

response = client.chat.completions.create(
    model="sonar-pro",
    messages=[
        {"role": "system", "content": "You are a helpful research assistant. Be concise and cite sources."},
        {"role": "user", "content": "What are the latest developments in quantum computing in 2026?"}
    ]
)

print(response.choices[0].message.content)

Step 5: Parse Source Citations

One of Perplexity's most valuable features is returning source citations alongside generated answers. Citations are returned in the response object.

response = client.chat.completions.create(
    model="sonar-pro",
    messages=[
        {"role": "user", "content": "What is retrieval-augmented generation?"}
    ]
)

# Access the answer
answer = response.choices[0].message.content
print("Answer:", answer)

# Access citations (returned in the response metadata)
if hasattr(response, 'citations'):
    citations = response.citations
    print("\nSources:")
    for i, citation in enumerate(citations, 1):
        print(f"  [{i}] {citation}")

Building a Citation Formatter

import re

def format_response_with_citations(response):
    """Format Perplexity response with numbered source citations."""
    content = response.choices[0].message.content
    citations = getattr(response, 'citations', [])

    if not citations:
        return content

    formatted = content + "\n\n--- Sources ---\n"
    for i, url in enumerate(citations, 1):
        formatted += f"[{i}] {url}\n"

    return formatted

# Usage
result = format_response_with_citations(response)
print(result)
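Since answers often embed bracketed markers like [1] that correspond to entries in the citations list, you can go a step further and turn those markers into clickable markdown links. The helper below is an illustrative sketch (the function name and the stub data are invented for the example):

```python
import re

def link_citations(text, citations):
    """Replace [n] markers in text with [n](url) markdown links.

    Markers with no matching citation index are left unchanged.
    """
    def repl(match):
        n = int(match.group(1))
        if 1 <= n <= len(citations):
            return f"[{match.group(0)}]({citations[n - 1]})"
        return match.group(0)
    return re.sub(r"\[(\d+)\]", repl, text)

# Example with stub data standing in for a real response
text = "RAG combines retrieval with generation [1][2]."
urls = ["https://example.com/rag", "https://example.com/overview"]
print(link_citations(text, urls))
```

Pairing this with format_response_with_citations gives you fully attributed output: numbered links inline, plus the full source list at the end.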

Step 6: Choose the Right Search-Augmented Model

Perplexity offers several models optimized for different use cases:

| Model | Best For | Context Window | Key Feature |
|---|---|---|---|
| sonar | Quick lookups, simple Q&A | 128k tokens | Fast, cost-effective search |
| sonar-pro | Complex research, multi-step reasoning | 200k tokens | Multi-search, deeper analysis |
| sonar-reasoning | Math, logic, scientific tasks | 128k tokens | Extended thinking with search |
| sonar-reasoning-pro | Advanced reasoning with research | 128k tokens | Best reasoning + search combo |
| sonar-deep-research | Comprehensive research reports | 128k tokens | Exhaustive multi-step research |
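If your application handles several kinds of queries, a small routing helper keeps model selection in one place. This is an illustrative sketch: the task-type names are invented for the example, while the model IDs come from the table above.

```python
# Map broad task types to the Perplexity models described above.
MODEL_BY_TASK = {
    "lookup": "sonar",
    "research": "sonar-pro",
    "reasoning": "sonar-reasoning",
    "deep_research": "sonar-deep-research",
}

def pick_model(task_type: str) -> str:
    """Return a model ID for a task type, defaulting to the fast, cheap option."""
    return MODEL_BY_TASK.get(task_type, "sonar")

print(pick_model("research"))   # sonar-pro
print(pick_model("unknown"))    # sonar
```

Defaulting to sonar keeps unexpected inputs on the cheapest path; escalate to the pricier models only for task types you have explicitly mapped.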
# Example: Using sonar-reasoning for a complex query
response = client.chat.completions.create(
    model="sonar-reasoning",
    messages=[
        {"role": "user", "content": "Compare the environmental impact of lithium-ion vs solid-state batteries."}
    ]
)
print(response.choices[0].message.content)

Step 7: Advanced Configuration Options
Standard sampling parameters work as usual. Perplexity-specific options such as search_recency_filter are not part of the OpenAI SDK's typed arguments, so pass them through the SDK's extra_body parameter:

# Fine-tune search behavior with additional parameters
response = client.chat.completions.create(
    model="sonar-pro",
    messages=[
        {"role": "user", "content": "Latest Python 3.14 features"}
    ],
    temperature=0.2,           # Lower = more factual
    max_tokens=1024,           # Control response length
    top_p=0.9,                 # Nucleus sampling
    extra_body={
        "search_recency_filter": "week",    # Filter: day, week, month, year
        "return_related_questions": True    # Get follow-up suggestions
    }
)
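Beyond request parameters, repeated queries can be cached locally to avoid paying for the same answer twice. The sketch below is a minimal in-memory cache keyed by a hash of the prompt; the fetch argument is a stand-in for a real API call, and all names here are invented for illustration. In production you might swap the dictionary for Redis and add a TTL, since search results go stale.

```python
import hashlib

# Simple in-memory cache keyed by a hash of the query text.
_cache = {}

def cached_query(prompt, fetch):
    """Return a cached answer if present; otherwise call fetch(prompt) and store it."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = fetch(prompt)
    return _cache[key]

# Demo with a stub fetcher instead of a real API call
calls = []
def fake_fetch(p):
    calls.append(p)
    return f"answer to: {p}"

print(cached_query("what is RAG?", fake_fetch))
print(cached_query("what is RAG?", fake_fetch))  # served from cache
print(len(calls))  # 1
```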
Pro Tips for Power Users

  • Use streaming for long responses: Add stream=True to your API call and iterate over chunks for real-time output display.
  • System prompts matter: A well-crafted system prompt dramatically improves citation quality. Tell the model to "always cite sources" and "prefer recent publications."
  • Batch requests efficiently: For multiple queries, use Python's asyncio with httpx to run concurrent API calls and reduce total latency.
  • Cache frequent queries: Implement a simple dictionary or Redis cache keyed by query hash to avoid redundant API calls and reduce costs.
  • Filter by recency: Use search_recency_filter when freshness matters; set it to "day" for breaking news or "month" for recent developments.
  • Monitor usage: Check your API dashboard regularly and set billing alerts to avoid unexpected charges.

Troubleshooting Common Errors
| Error | Cause | Fix |
|---|---|---|
| 401 Unauthorized | Invalid or missing API key | Verify your key is correct and the environment variable is loaded. Regenerate the key if needed. |
| 403 Forbidden | No billing method on file | Add a payment method in your Perplexity API settings before making calls. |
| 429 Too Many Requests | Rate limit exceeded | Implement exponential backoff. Default rate limits vary by plan tier. |
| model_not_found | Incorrect model name | Double-check the model ID. Use sonar, sonar-pro, or sonar-reasoning. |
| Connection refused | Wrong base URL | Ensure base_url is set to https://api.perplexity.ai exactly. |
# Robust error handling template
import os
import time
from openai import OpenAI, APIError, RateLimitError

client = OpenAI(
    api_key=os.environ["PERPLEXITY_API_KEY"],
    base_url="https://api.perplexity.ai"
)

def query_with_retry(prompt, model="sonar-pro", max_retries=3):
    """Call the API, retrying with exponential backoff on rate limits."""
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}]
            )
            return response
        except RateLimitError:
            wait = 2 ** attempt
            print(f"Rate limited. Retrying in {wait}s...")
            time.sleep(wait)
        except APIError as e:
            print(f"API Error: {e}")
            raise
    raise Exception("Max retries exceeded")

Frequently Asked Questions

Is the Perplexity API compatible with the OpenAI Python SDK?

Yes. Perplexity's API follows the OpenAI chat completions format. You simply install the openai package and point the base_url to https://api.perplexity.ai. All standard parameters like temperature, max_tokens, and stream work as expected, with additional Perplexity-specific options like search_recency_filter.

How do I access and parse citations from Perplexity API responses?

Citations are returned as part of the response object. Access them via response.citations, which provides a list of source URLs referenced in the generated answer. You can pair these with bracketed numbers in the response text to build fully attributed outputs for research or content workflows.

Which Perplexity model should I use for my project?

Use sonar for fast, simple lookups where speed and cost matter most. Choose sonar-pro for complex research requiring multiple search passes and deeper analysis. Pick sonar-reasoning or sonar-reasoning-pro for tasks that involve logic, math, or scientific analysis combined with real-time web data. For exhaustive multi-step reports, sonar-deep-research is the most thorough option.
