Perplexity API Complete Setup Guide: Key Generation, Python SDK, Citation Parsing & Search Mode
Perplexity AI offers a powerful API that combines large language models with real-time web search capabilities. This guide walks you through everything from obtaining your API key to advanced citation parsing and search-augmented generation workflows.
Step 1: Create a Perplexity Account and Generate Your API Key
- Visit perplexity.ai and sign up or log in to your account.
- Navigate to Settings → API or go directly to perplexity.ai/settings/api.
- Click Generate API Key and copy the key immediately; it will only be shown once.
- Add billing credits to your account. Perplexity API uses a pay-per-use credit system separate from a Pro subscription.

Important: Your API key starts with pplx-. Store it securely and never commit it to version control.
Step 2: Install the Required Python Packages
Perplexity’s API is compatible with the OpenAI SDK format, so you can use the official OpenAI Python client.
# Install the OpenAI Python SDK
pip install openai
# Optional: install python-dotenv for environment variable management
pip install python-dotenv
# Optional: install httpx for direct REST calls
pip install httpx
Step 3: Configure Your Environment
Create a .env file in your project root to manage your API key securely:
# .env
PPLX_API_KEY=pplx-YOUR_API_KEY
Then load it in Python:
import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

client = OpenAI(
    api_key=os.getenv("PPLX_API_KEY"),
    base_url="https://api.perplexity.ai",
)
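Because os.getenv returns None when the variable is missing, a misconfigured environment tends to surface only later as a 401 error. A minimal sketch of a fail-fast check is shown here; the helper name load_api_key is hypothetical, and the prefix check assumes the pplx- key format described in Step 1.

```python
import os


def load_api_key(var_name: str = "PPLX_API_KEY") -> str:
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper: the name and the checks are illustrative only.
    """
    key = os.getenv(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; add it to your .env file")
    if not key.startswith("pplx-"):
        # Prefix taken from this guide's description of the key format.
        raise RuntimeError(f"{var_name} does not look like a Perplexity key")
    return key
```

The returned value could then be passed as api_key when constructing the client, so a missing key fails at startup rather than on the first request.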
Step 4: Make Your First API Call
Perplexity uses a chat completions endpoint identical in structure to the OpenAI API:
response = client.chat.completions.create(
    model="sonar",
    messages=[
        {"role": "system", "content": "You are a helpful research assistant. Be precise and cite sources."},
        {"role": "user", "content": "What are the latest developments in quantum computing in 2026?"},
    ],
)
print(response.choices[0].message.content)
Available Models
| Model | Description | Best For |
|---|---|---|
| sonar | Standard search-augmented model | General queries with web sources |
| sonar-pro | Advanced multi-step search model | Complex research tasks |
| sonar-reasoning | Extended thinking with search | Deep analysis and reasoning |
| sonar-reasoning-pro | Premium reasoning model | Most complex research workflows |
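One way to keep model choice in a single place is a small lookup keyed by task type. This is only a sketch: the task labels are hypothetical and loosely follow the "Best For" column, while the model names come from the table above.

```python
# Hypothetical task labels mapped to the model names from the table above.
MODEL_BY_TASK = {
    "general": "sonar",
    "research": "sonar-pro",
    "analysis": "sonar-reasoning",
    "complex": "sonar-reasoning-pro",
}


def pick_model(task: str) -> str:
    """Return a model name for a task label, falling back to sonar."""
    return MODEL_BY_TASK.get(task, "sonar")


print(pick_model("research"))
```

Unknown labels fall back to sonar, so a typo in the task label degrades to the standard model instead of raising.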
Step 5: Parse Citations
One of Perplexity's most powerful features is inline citations. The API returns source URLs that you can extract and display.
response = client.chat.completions.create(
    model="sonar",
    messages=[
        {"role": "user", "content": "What is retrieval-augmented generation?"}
    ],
)

# Access the main response
answer = response.choices[0].message.content
print("Answer:", answer)

# Extract citations from the response object
citations = getattr(response, "citations", None)
if citations:
    print("\nSources:")
    for i, url in enumerate(citations, 1):
        print(f"  [{i}] {url}")
Advanced Citation Mapping
Map inline reference numbers in the text to their corresponding URLs:
import re

def parse_citations(content, citations):
    """Map inline [n] references to source URLs."""
    # Brackets are escaped so the pattern matches literal "[1]", "[2]", ...
    ref_pattern = re.compile(r'\[(\d+)\]')
    refs_used = sorted(set(int(m) for m in ref_pattern.findall(content)))
    mapped = {}
    for ref_num in refs_used:
        idx = ref_num - 1  # inline references are 1-based
        if 0 <= idx < len(citations):
            mapped[ref_num] = citations[idx]
    return mapped

citation_map = parse_citations(answer, citations or [])
for num, url in citation_map.items():
    print(f"[{num}] → {url}")
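The same mapping can also be applied in place, turning each inline marker into a clickable link. The sketch below assumes Markdown output and 1-based reference numbers, as in the mapping above; the function name linkify_citations and the example URLs are hypothetical.

```python
import re


def linkify_citations(content: str, citations: list[str]) -> str:
    """Replace inline [n] markers with Markdown links to the matching URL.

    Markers whose number falls outside the citations list are left untouched.
    """
    def replace(match: re.Match) -> str:
        idx = int(match.group(1)) - 1  # inline references are 1-based
        if 0 <= idx < len(citations):
            return f"[[{match.group(1)}]]({citations[idx]})"
        return match.group(0)

    return re.sub(r"\[(\d+)\]", replace, content)


sample = "RAG combines retrieval with generation [1][3]."
print(linkify_citations(sample, ["https://example.com/a", "https://example.com/b"]))
# prints: RAG combines retrieval with generation [[1]](https://example.com/a)[3].
```

Leaving out-of-range markers unchanged avoids silently dropping a reference when the answer text and the citations list disagree.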
Step 6: Use Search-Enhanced Mode with Parameters
Control the search behavior with additional parameters:
response = client.chat.completions.create(
    model="sonar",
    messages=[
        {"role": "system", "content": "Provide detailed technical analysis."},
        {"role": "user", "content": "Compare FastAPI vs Django for REST API development"},
    ],
    temperature=0.2,
    max_tokens=1024,
    # Perplexity-specific fields are not part of the OpenAI SDK signature,
    # so they are forwarded in the request body via extra_body.
    extra_body={
        "search_domain_filter": ["stackoverflow.com", "github.com", "docs.python.org"],
        "search_recency_filter": "week",
    },
)
### Search Parameter Reference
| Parameter | Type | Description |
|---|---|---|
| search_domain_filter | list | Limit search to specific domains (max 3) |
| search_recency_filter | string | Filter by time: hour, day, week, month |
| return_related_questions | bool | Get follow-up question suggestions |
| temperature | float | Controls randomness (0.0–2.0) |
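To catch invalid search settings before a request is sent, the limits in the table can be checked locally. The following is a sketch under those stated limits (at most 3 domains; recency of hour, day, week, or month); build_search_options is a hypothetical helper, and the result is meant to be passed through the OpenAI SDK's extra_body argument.

```python
VALID_RECENCY = {"hour", "day", "week", "month"}
MAX_DOMAINS = 3  # limit stated in the parameter table


def build_search_options(domains=None, recency=None, related_questions=False):
    """Validate search settings and return a dict suitable for extra_body."""
    options = {}
    if domains:
        if len(domains) > MAX_DOMAINS:
            raise ValueError(f"at most {MAX_DOMAINS} domains are allowed")
        options["search_domain_filter"] = list(domains)
    if recency is not None:
        if recency not in VALID_RECENCY:
            raise ValueError(f"recency must be one of {sorted(VALID_RECENCY)}")
        options["search_recency_filter"] = recency
    if related_questions:
        options["return_related_questions"] = True
    return options


print(build_search_options(["docs.python.org"], "week"))
```

A call would then look like client.chat.completions.create(..., extra_body=build_search_options(["docs.python.org"], "week")).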
Step 7: Call the REST API Directly
If you prefer direct HTTP calls without an SDK:
curl -X POST https://api.perplexity.ai/chat/completions \
-H "Authorization: Bearer pplx-YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "sonar",
"messages": [
{"role": "user", "content": "Explain WebAssembly use cases"}
],
"max_tokens": 512
}'
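The same request can be assembled in Python without any third-party package. This is a minimal sketch using only the standard library's urllib (not httpx); the function names build_chat_request and send are hypothetical, and the endpoint, headers, and payload mirror the curl command above.

```python
import json
import urllib.request

API_URL = "https://api.perplexity.ai/chat/completions"


def build_chat_request(api_key: str, prompt: str, model: str = "sonar",
                       max_tokens: int = 512) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl command."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

Separating construction from sending makes the request easy to inspect or unit-test without spending credits; send(build_chat_request(key, "Explain WebAssembly use cases")) would perform the actual call.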
## Step 8: Streaming Responses
For real-time output, enable streaming:
stream = client.chat.completions.create(
    model="sonar",
    messages=[
        {"role": "user", "content": "Summarize recent AI regulation news"}
    ],
    stream=True,
)
for chunk in stream:
    if not chunk.choices:  # skip chunks that carry no choices
        continue
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
Pro Tips for Power Users
- System prompts matter: Use detailed system messages to control output format, citation style, and response depth. Perplexity's search results quality improves with specific system instructions.
- Chain queries for deep research: Use sonar-pro for multi-step research. Feed the output of one query as context into the next for iterative exploration.
- Domain filtering for accuracy: When researching technical topics, restrict searches to authoritative domains like official docs and peer-reviewed sources using search_domain_filter.
- Cost optimization: Use sonar for simple factual queries and reserve sonar-pro or reasoning models for complex analysis. Monitor token usage via response.usage.
- Structured output: Request JSON output by specifying format in your system prompt and parsing the response accordingly.
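For the structured-output tip, a reply that was asked to contain JSON may still include surrounding prose or citation markers. The sketch below is one tolerant way to parse it, assuming the reply contains a single JSON object; extract_json is a hypothetical helper and the sample reply is made up for illustration.

```python
import json


def extract_json(text: str):
    """Parse the span from the first '{' to the last '}' in a model reply.

    Returns the decoded object, or None if no valid JSON object is found.
    """
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None


reply = 'Here is the result: {"framework": "FastAPI", "score": 8} [1]'
print(extract_json(reply))
# prints: {'framework': 'FastAPI', 'score': 8}
```

Returning None instead of raising lets the caller decide whether to retry the query with a stricter system prompt.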
Troubleshooting Common Errors
| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid or expired API key | Regenerate your key at perplexity.ai/settings/api |
| 403 Forbidden | Insufficient credits | Add billing credits to your Perplexity account |
| 429 Too Many Requests | Rate limit exceeded | Implement exponential backoff; default is 50 req/min |
| Model not found | Incorrect model name | Use exact model names: sonar, sonar-pro, etc. |
| Connection error | Wrong base_url | Ensure base_url is https://api.perplexity.ai |
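The exponential backoff suggested for 429 errors can be sketched as a small retry wrapper. The helper name call_with_backoff and its defaults are hypothetical; with the OpenAI SDK, a 429 is raised as openai.RateLimitError, so that class would be a natural value for retry_on.

```python
import random
import time


def call_with_backoff(fn, retry_on=(Exception,), max_retries=5,
                      base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff plus jitter on failure.

    retry_on should be narrowed to the rate-limit exception of your client.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            # Delay doubles each attempt: base, 2*base, 4*base, ...
            delay = base_delay * (2 ** attempt)
            # Up to 10% random jitter keeps parallel clients out of lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

A call might look like call_with_backoff(lambda: client.chat.completions.create(...), retry_on=(openai.RateLimitError,)). The sleep parameter is injectable so the wrapper can be tested without waiting.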
Frequently Asked Questions
Q1: Is the Perplexity API the same as a Perplexity Pro subscription?
No. The API and Pro subscription are separate products with independent billing. A Pro subscription gives you unlimited searches on the web app, while the API uses a pay-per-token credit system. You need to add API credits separately even if you have a Pro plan.
Q2: Can I use the Perplexity API without real-time search?
The Sonar models are inherently search-augmented — web search is a core feature. If you need a pure LLM without search, consider using a different provider. However, you can influence search behavior using domain filters and recency filters to narrow or focus the search scope.
Q3: How do I manage costs when using the Perplexity API?
Monitor usage via the response.usage object, which returns prompt_tokens and completion_tokens. Use max_tokens to cap response length. Choose the appropriate model tier: sonar is significantly cheaper than sonar-pro or the reasoning variants. Set up billing alerts in your Perplexity dashboard to avoid unexpected charges.
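Token counts from response.usage can be summed across calls to keep a running total. The sketch below is illustrative only: UsageTracker is a hypothetical class, the prices are supplied by the caller rather than taken from any price list, and the estimate covers token charges only.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class UsageTracker:
    """Accumulate token counts across several API responses."""
    prompt_tokens: int = 0
    completion_tokens: int = 0

    def add(self, response) -> None:
        """Add the usage of one response; responses without usage are ignored."""
        usage = getattr(response, "usage", None)
        if usage is None:
            return
        self.prompt_tokens += usage.prompt_tokens
        self.completion_tokens += usage.completion_tokens

    def estimated_cost(self, prompt_price_per_m: float,
                       completion_price_per_m: float) -> float:
        """Token cost for caller-supplied prices per million tokens."""
        return (self.prompt_tokens * prompt_price_per_m
                + self.completion_tokens * completion_price_per_m) / 1_000_000


# Demonstration with a stand-in response object (no network call).
tracker = UsageTracker()
tracker.add(SimpleNamespace(usage=SimpleNamespace(prompt_tokens=12, completion_tokens=30)))
print(tracker.prompt_tokens, tracker.completion_tokens)
# prints: 12 30
```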