AWS Lambda functions for Agora RTC token generation and ConvoAI agent management, intended for use with the Agora telephony/SIP gateway.
- Overview
- token_gen.py — PSTN CallLookup
- launch_agent.py — ConvoAI Agent Launcher
- Token Generation
- Troubleshooting
This repo contains two independent Lambda functions:
| Lambda | File | Purpose |
|---|---|---|
| PSTN CallLookup | token_gen.py | Returns an RTC token + channel for inbound PSTN calls |
| ConvoAI Agent Launcher | launch_agent.py | Generates tokens, launches/hangs-up Agora ConvoAI agents |
Both share the same v007 token generation code but serve different use cases.
Handles the Agora PSTN gateway CallLookup webhook. When an inbound phone call arrives, the gateway POSTs caller information and this Lambda responds with an RTC token and channel name so the gateway can connect the caller.
- PSTN gateway sends POST {did, pin, callerid}
- Lambda generates a random 10-character channel name
- Builds a v007 RTC token (or uses APP_ID if no certificate)
- Returns the CallLookup response so the gateway joins the caller to the channel
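
The first steps of this flow can be sketched in a few lines of Python. This is an illustration only, not the code in token_gen.py: `build_token` is a hypothetical stand-in for the repo's v007 token builder, and the uppercase alphanumeric alphabet is an assumption based on the sample channel name `A1B2C3D4E5`.

```python
import secrets
import string

# Assumed alphabet: the sample response shows an uppercase alphanumeric channel.
CHANNEL_ALPHABET = string.ascii_uppercase + string.digits


def random_channel_name(length=10):
    """Return a random channel name of the given length."""
    return "".join(secrets.choice(CHANNEL_ALPHABET) for _ in range(length))


def choose_token(app_id, app_certificate, channel, uid, build_token):
    """Use a v007 token when a certificate is configured, else fall back to APP_ID.

    build_token is a hypothetical callable standing in for the real token builder.
    """
    if app_certificate:
        return build_token(app_id, app_certificate, channel, uid)
    return app_id
```

With an empty certificate, `choose_token` returns the App ID unchanged, which mirrors the "uses APP_ID if no certificate" fallback.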
APP_ID=your_agora_app_id
APP_CERTIFICATE=your_agora_app_certificate # omit to use APP_ID as token
USER_UID=101 # default: "101"
AUDIO_SCENARIO=0 # default: "0"
WEBHOOK_URL=https://example.com/webhook # included in response if set
SDK_OPTIONS={"key":"value"} # included in response if set

Request (POST from PSTN gateway):
{
"did": "17177440111",
"pin": "",
"callerid": "1765740333"
}

Response:
{
"token": "007eJxT...",
"uid": "101",
"channel": "A1B2C3D4E5",
"appid": "your_app_id",
"audio_scenario": "0"
}

Optional fields webhook_url and sdk_options are included when the corresponding environment variables are set.
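
A minimal sketch of how such a response body might be assembled from the environment is shown below. The function name `build_call_lookup_response` is hypothetical, and returning `sdk_options` as a parsed JSON object (rather than the raw string) is an assumption; the actual Lambda may differ.

```python
import json


def build_call_lookup_response(env, token, channel):
    """Assemble a CallLookup response body from environment settings."""
    body = {
        "token": token,
        "uid": env.get("USER_UID", "101"),
        "channel": channel,
        "appid": env["APP_ID"],
        "audio_scenario": env.get("AUDIO_SCENARIO", "0"),
    }
    # Optional fields appear only when their environment variables are set.
    if env.get("WEBHOOK_URL"):
        body["webhook_url"] = env["WEBHOOK_URL"]
    if env.get("SDK_OPTIONS"):
        # Assumption: SDK_OPTIONS holds a JSON object and is returned parsed.
        body["sdk_options"] = json.loads(env["SDK_OPTIONS"])
    return body
```

Inside a handler this would typically be called with `os.environ` as `env`.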
| Setting | Value |
|---|---|
| Handler | token_gen.lambda_handler |
| Timeout | 10 seconds |
| Memory | 128 MB |
Launches and manages Agora Conversational AI agents with configurable TTS, STT, and LLM providers.
- Multi-vendor TTS support: Rime, ElevenLabs, OpenAI, Cartesia
- Multi-vendor STT support: Ares (Agora built-in), Deepgram
- Flexible LLM backend: Any OpenAI-compatible API
- Profile-based configuration: Support multiple agent configurations via profiles
- Token-only mode: Generate tokens without starting an agent
- Agent lifecycle management: Join and hangup capabilities
- RTM support: Real-time messaging integration
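
Profile-based configuration relies on suffixed environment variables, for example `LLM_MODEL_premium` selected with `?profile=premium`. The lookup order can be sketched as follows; `resolve_setting` is a hypothetical helper, and falling back to the unsuffixed variable when the suffixed one is missing is an assumption about the behaviour.

```python
def resolve_setting(env, name, profile=None, default=None):
    """Look up NAME_<profile> first, then NAME, then the given default."""
    if profile:
        value = env.get(f"{name}_{profile}")
        if value is not None:
            return value
    return env.get(name, default)


env = {"LLM_MODEL": "gpt-4o-mini", "LLM_MODEL_premium": "gpt-4o"}
premium_model = resolve_setting(env, "LLM_MODEL", profile="premium")
default_model = resolve_setting(env, "LLM_MODEL")
```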
1. Rime
TTS_VENDOR=rime
RIME_API_KEY=your_api_key
RIME_SPEAKER=astra (default)
RIME_MODEL_ID=mistv2 (default)
RIME_LANG=eng (default)
RIME_SAMPLING_RATE=16000 (default)
RIME_SPEED_ALPHA=1.0 (default)
2. ElevenLabs
TTS_VENDOR=elevenlabs
TTS_KEY=your_api_key
TTS_VOICE_ID=your_voice_id
TTS_VOICE_STABILITY=1 (default; range 0-1)
TTS_VOICE_SAMPLE_RATE=24000 (default)
3. OpenAI
TTS_VENDOR=openai
TTS_KEY=your_api_key
TTS_VOICE_ID=alloy|echo|fable|onyx|nova|shimmer
TTS_VOICE_SPEED=1.0 (default; range 0.25-4.0)
4. Cartesia
TTS_VENDOR=cartesia
CARTESIA_API_KEY=your_api_key
CARTESIA_MODEL=sonic-3 (default)
CARTESIA_VOICE_ID=your_voice_id
CARTESIA_SAMPLE_RATE=24000 (default)
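
Because each vendor reads a different set of variables, a quick pre-deploy check can catch a missing key before the agent fails at runtime. The sketch below is not part of the repo; it assumes that the variables listed above without a default are the required ones, and `missing_tts_vars` is a hypothetical helper name.

```python
# Variables without a documented default, per vendor (an assumption drawn
# from the listings above, not from the Lambda source).
REQUIRED_TTS_VARS = {
    "rime": ["RIME_API_KEY"],
    "elevenlabs": ["TTS_KEY", "TTS_VOICE_ID"],
    "openai": ["TTS_KEY", "TTS_VOICE_ID"],
    "cartesia": ["CARTESIA_API_KEY", "CARTESIA_VOICE_ID"],
}


def missing_tts_vars(env):
    """Return the required TTS variables that are unset for the chosen vendor."""
    vendor = env.get("TTS_VENDOR", "").lower()
    if vendor not in REQUIRED_TTS_VARS:
        raise ValueError(f"Unsupported TTS_VENDOR: {vendor!r}")
    return [name for name in REQUIRED_TTS_VARS[vendor] if not env.get(name)]
```

Calling `missing_tts_vars(os.environ)` locally with the same variables as the Lambda returns an empty list when the configuration is complete.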
Ares (default) — Agora's built-in ASR, no API key required:
ASR_VENDOR=ares (default)
ASR_LANGUAGE=en-US (default)
Deepgram
ASR_VENDOR=deepgram
DEEPGRAM_KEY=your_api_key
DEEPGRAM_MODEL=nova-3 (default)
DEEPGRAM_LANGUAGE=en (default)
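
The STT defaults above can be read as a small resolution function. This is a sketch under assumptions: the dictionary keys are illustrative and are not the payload format the Agora API expects, and `asr_settings` is a hypothetical name.

```python
def asr_settings(env):
    """Resolve STT settings from the environment, applying documented defaults."""
    vendor = env.get("ASR_VENDOR", "ares").lower()
    if vendor == "ares":
        # Ares is built in, so no API key is needed.
        return {"vendor": "ares", "language": env.get("ASR_LANGUAGE", "en-US")}
    if vendor == "deepgram":
        key = env.get("DEEPGRAM_KEY")
        if not key:
            raise ValueError("DEEPGRAM_KEY is required when ASR_VENDOR=deepgram")
        return {
            "vendor": "deepgram",
            "key": key,
            "model": env.get("DEEPGRAM_MODEL", "nova-3"),
            "language": env.get("DEEPGRAM_LANGUAGE", "en"),
        }
    raise ValueError(f"Unsupported ASR_VENDOR: {vendor!r}")
```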
Any OpenAI-compatible API:
LLM_URL=https://api.openai.com/v1/chat/completions
LLM_API_KEY=your_api_key
LLM_MODEL=gpt-4o-mini
APP_ID=your_agora_app_id
LLM_URL=your_llm_endpoint
LLM_API_KEY=your_llm_api_key
LLM_MODEL=your_model_name

# Option 1: APP_CERTIFICATE (recommended)
# Generates v007 tokens for both API auth and channel join.
# API calls use "agora token=<v007_token>" authorization.
APP_CERTIFICATE=your_agora_app_certificate
# Option 2: AGENT_AUTH_HEADER (Basic auth)
# Uses Basic auth for API calls, APP_ID as channel join token.
AGENT_AUTH_HEADER=Basic <base64_key:secret>

If both are set, AGENT_AUTH_HEADER takes priority for API auth.
If neither is set, API calls will fail (APP_ID alone is not valid for API auth).
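
The priority rules above can be expressed as a short function. This is a sketch, not the Lambda's actual code: `build_token` is a hypothetical zero-argument callable standing in for the v007 token builder, and `basic_auth_header` shows one way to produce the `Basic <base64_key:secret>` value using the standard library.

```python
import base64


def basic_auth_header(key, secret):
    """Build a value suitable for AGENT_AUTH_HEADER from a key and secret."""
    encoded = base64.b64encode(f"{key}:{secret}".encode()).decode()
    return f"Basic {encoded}"


def api_authorization(env, build_token):
    """Pick the Authorization header value for API calls.

    AGENT_AUTH_HEADER wins when both options are configured.
    """
    header = env.get("AGENT_AUTH_HEADER")
    if header:
        return header
    if env.get("APP_CERTIFICATE"):
        return f"agora token={build_token()}"
    raise RuntimeError(
        "Set APP_CERTIFICATE or AGENT_AUTH_HEADER; "
        "APP_ID alone is not valid for API auth"
    )
```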
See provider-specific settings above.
# Default: Ares (no API key needed)
ASR_VENDOR=ares
# Or use Deepgram:
ASR_VENDOR=deepgram
DEEPGRAM_KEY=your_deepgram_key
DEEPGRAM_MODEL=nova-3
DEEPGRAM_LANGUAGE=en

# Agent Behavior
DEFAULT_PROMPT="Your custom system prompt"
DEFAULT_GREETING="hi there"
DEFAULT_FAILURE_MESSAGE="An error occurred, please try again later"
DEFAULT_MAX_HISTORY=32
# Voice Activity Detection
VAD_SILENCE_DURATION_MS=300
# Advanced Features
ENABLE_BHVS=true
ENABLE_RTM=true
ENABLE_AIVAD=true
ENABLE_ERROR_MESSAGE=true
# Agent Settings
IDLE_TIMEOUT=120
# Optional Graph ID
GRAPH_ID=your_graph_id

https://your-lambda-url.amazonaws.com/your-stage/
GET /?channel=my_channel
# Optional parameters:
# - profile: Configuration profile to use
# - prompt: Override system prompt
# - greeting: Override greeting message
# - tts_vendor: rime|elevenlabs|openai|cartesia
# - voice_id: TTS voice identifier
# - llm_model: Override LLM model
# - debug: Include debug information

Response:
{
"audio_scenario": "10",
"token": "user_rtc_token",
"uid": "101",
"channel": "my_channel",
"appid": "your_app_id",
"user_token": {
"token": "user_rtc_token",
"uid": "101"
},
"agent_video_token": {
"token": "agent_video_rtc_token",
"uid": "102"
},
"agent": {
"uid": "100"
},
"agent_rtm_uid": "100-my_channel",
"enable_string_uid": false,
"agent_response": {
"status_code": 200,
"response": "{...}",
"success": true
}
}

GET /?connect=false
# Optional:
# - channel: Specify channel (auto-generated if omitted)
# - profile: Configuration profile

Response:
{
"audio_scenario": "10",
"token": "user_rtc_token",
"uid": "101",
"channel": "AUTOGEN123",
"appid": "your_app_id",
"user_token": {
"token": "user_rtc_token",
"uid": "101"
},
"agent_video_token": {
"token": "agent_video_rtc_token",
"uid": "102"
},
"agent": {
"uid": "100"
},
"agent_rtm_uid": "100-AUTOGEN123",
"enable_string_uid": false,
"token_generation_method": "RTC tokens with privileges",
"agent_response": {
"status_code": 200,
"response": "{\"message\":\"Token-only mode...\"}",
"success": true
}
}

GET /?hangup=true&agent_id=your_agent_id
# Required:
# - agent_id: ID of the agent to disconnect

Response:
{
"agent_response": {
"status_code": 200,
"response": "{...}",
"success": true
}
}

GET /?debug=true&channel=my_channel
GET /?debug=true&env_debug=true # Show environment variables

- User UID: "101", for end-user RTC connection
- Agent UID: "100", for AI agent audio
- Agent Video UID: "102", for agent video stream (if applicable)
- String UIDs: Disabled by default (enable_string_uid: false)
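
The join, token-only, and hangup requests documented earlier differ only in their query strings, so a client can build them with the standard library. The helper names below are hypothetical, and `BASE_URL` is the placeholder URL from this README rather than a real endpoint.

```python
from urllib.parse import urlencode

# Placeholder endpoint from this README; replace with your deployed URL.
BASE_URL = "https://your-lambda-url.amazonaws.com/your-stage/"


def join_url(channel, **overrides):
    """URL that launches an agent; overrides may include profile, prompt, etc."""
    return f"{BASE_URL}?{urlencode({'channel': channel, **overrides})}"


def token_only_url(channel=None, profile=None):
    """URL that returns tokens without starting an agent."""
    params = {"connect": "false"}
    if channel:
        params["channel"] = channel
    if profile:
        params["profile"] = profile
    return f"{BASE_URL}?{urlencode(params)}"


def hangup_url(agent_id):
    """URL that disconnects a running agent."""
    return f"{BASE_URL}?{urlencode({'hangup': 'true', 'agent_id': agent_id})}"
```

`urlencode` also escapes free-text values such as a custom `prompt` or `greeting`, which is easy to get wrong when concatenating strings by hand.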
Enable text chat alongside voice:
ENABLE_RTM=true

AI-powered voice activity detection:
ENABLE_AIVAD=true

Enable agent behavior extensions:
ENABLE_BHVS=true

Return error messages to users:
ENABLE_ERROR_MESSAGE=true

Use profile suffix to override defaults for specific use cases:
# Default configuration
LLM_MODEL=gpt-4o-mini
DEFAULT_GREETING="Hi there"
# Profile-specific (accessed via ?profile=premium)
LLM_MODEL_premium=gpt-4o
DEFAULT_GREETING_premium="Welcome, premium user"

TTS_VENDOR=elevenlabs
TTS_KEY=sk_...
TTS_VOICE_ID=cgSgspJ2msm6clMCkdW9
ASR_VENDOR=deepgram
DEEPGRAM_KEY=...
DEEPGRAM_MODEL=nova-3
LLM_URL=https://api.openai.com/v1/chat/completions
LLM_API_KEY=sk-...
LLM_MODEL=gpt-4o-mini

TTS_VENDOR=rime
RIME_API_KEY=...
RIME_SPEAKER=astra
RIME_MODEL_ID=mistv2
ASR_VENDOR=deepgram
DEEPGRAM_KEY=...
LLM_URL=https://your-llm-endpoint.com/v1/chat/completions
LLM_API_KEY=...
LLM_MODEL=your-custom-model

TTS_VENDOR=cartesia
CARTESIA_API_KEY=...
CARTESIA_MODEL=sonic-3
CARTESIA_VOICE_ID=...
CARTESIA_SAMPLE_RATE=24000
ASR_VENDOR=deepgram
DEEPGRAM_KEY=...
DEEPGRAM_MODEL=nova-3
LLM_URL=https://api.openai.com/v1/chat/completions
LLM_API_KEY=sk-...
LLM_MODEL=gpt-4o-mini

| Setting | Value |
|---|---|
| Handler | launch_agent.lambda_handler |
| Timeout | 30 seconds |
| Memory | 256 MB |
| CORS | Enable for browser clients |
Both Lambdas use v007 service-based tokens.
Generates v007 tokens with RTC privileges:
- RTC Service: JOIN_CHANNEL, PUBLISH_AUDIO/VIDEO/DATA_STREAM privileges
- RTM Service (launch_agent only): LOGIN privilege with separate RTM UID ({agent_uid}-{channel})
token_gen.py generates RTC-only tokens (PSTN callers don't use RTM).
launch_agent.py generates tokens with both RTC and RTM services.
Token expires in 24 hours.
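
Two of the details above are simple enough to show directly: the RTM UID format and the 24-hour lifetime. The sketch below only illustrates the arithmetic and string format; it does not build a v007 token, and both function names are hypothetical.

```python
import time

TOKEN_TTL_SECONDS = 24 * 60 * 60  # 24-hour lifetime stated above


def agent_rtm_uid(agent_uid, channel):
    """RTM UID in the {agent_uid}-{channel} format."""
    return f"{agent_uid}-{channel}"


def token_expiry(issued_at=None):
    """Unix timestamp at which a token issued at issued_at stops being valid."""
    if issued_at is None:
        issued_at = int(time.time())
    return issued_at + TOKEN_TTL_SECONDS
```

For agent UID `100` on channel `my_channel` this yields `100-my_channel`, matching the `agent_rtm_uid` field in the sample responses.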
Without APP_CERTIFICATE, the Lambda returns APP_ID as the token for channel join (testing mode).
For launch_agent.py, this requires AGENT_AUTH_HEADER for API authentication.
- Verify either APP_CERTIFICATE or AGENT_AUTH_HEADER is set
- Check APP_ID matches your Agora project
- Ensure Lambda has internet access (VPC configuration)
- Verify TTS provider credentials
- Check TTS_VENDOR matches your configuration
- Review CloudWatch logs for TTS errors
- Verify Deepgram API key
- Check microphone permissions on client side
- Ensure audio is being sent to channel
- Verify the PSTN gateway is configured to POST to your Lambda URL
- Check APP_ID is correct
- Review CloudWatch logs for the CallLookup request
- Verify APP_CERTIFICATE is correct (must be 32-character hex)
- Check token hasn't expired (24h default)
- Ensure UID matches between client and token
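
The certificate format check can be automated before deploying. This is a minimal sketch with a hypothetical helper name; it only checks the 32-character hex shape mentioned above and cannot tell whether the certificate belongs to your project.

```python
import re

# 32 hexadecimal characters, as required for APP_CERTIFICATE.
_HEX32 = re.compile(r"[0-9a-fA-F]{32}")


def looks_like_app_certificate(value):
    """Return True when value is exactly 32 hexadecimal characters."""
    return bool(_HEX32.fullmatch(value or ""))
```

A value copied with stray whitespace or a trailing newline fails this check, which is a common cause of token errors.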
See repository license.