VoiceBlender is an open source Go service that bridges SIP and WebRTC voice calls with multi-party audio mixing, a REST API, and real-time webhooks. Plug in your own TTS, STT, and AI agent models.
go run ./cmd/voiceblender
A complete toolkit for voice transformation, built in the open.
Originate and receive SIP calls. Supported codecs include PCMA, PCMU, and Opus.
Browser-based voice via SDP offer/answer with trickle ICE. Connect users directly from the browser with no plugins required.
Mix multiple participants in a single room. Join via SIP, WebRTC, or WebSocket.
Built-in support for ElevenLabs, Google Cloud, and AWS Polly for TTS. Real-time STT with partial transcripts.
Attach a conversational AI agent to any leg or room with barge-in support. Supports ElevenLabs, VAPI, and Pipecat out of the box.
Full REST API for legs, rooms, playback, recording, and more. Real-time event delivery with HMAC-SHA256 signing and retry.
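A webhook receiver would typically check the HMAC-SHA256 signature before trusting an event. The sketch below shows only the generic verification step using Go's standard library. How VoiceBlender transports the signature (header name, hex versus base64 encoding) and how the shared secret is configured are not stated here, so the hex encoding and the `sign` and `verify` helper names are assumptions made for illustration.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign returns the hex-encoded HMAC-SHA256 of body under secret.
// Hex encoding is an assumption; the real service may encode differently.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verify reports whether sigHex is a valid signature for body.
// hmac.Equal compares in constant time to avoid timing leaks.
func verify(secret, body []byte, sigHex string) bool {
	got, err := hex.DecodeString(sigHex)
	if err != nil {
		return false // malformed signature
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("example-secret") // hypothetical shared secret
	body := []byte(`{"leg_id":"leg-123","from":"alice","to":"bob"}`)
	sig := sign(secret, body)

	fmt.Println(verify(secret, body, sig))               // true
	fmt.Println(verify(secret, []byte("tampered"), sig)) // false
}
```

Verification should run over the raw request bytes, before any JSON decoding, because re-serializing the payload can change the bytes and invalidate the signature.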
Up and running in minutes.
Build with go build, run with go run, or pull the Docker image. The REST API listens on :8080 and SIP on :5060.
Set environment variables for SIP, ICE servers, webhooks, and your TTS/STT/AI provider API keys.
Originate SIP calls, accept inbound calls via webhooks, or connect browsers over WebRTC.
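As a rough illustration of accepting an inbound call, the Go sketch below takes a `leg.ringing` webhook payload and builds the matching `POST /v1/legs/{id}/answer` request. It assumes the payload is a flat JSON object with the `leg_id`, `from`, and `to` fields and that the answer endpoint needs no request body. The `ringingEvent` type and `answerRequest` helper are hypothetical names, not part of VoiceBlender.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// ringingEvent mirrors the fields listed for leg.ringing.
// The flat JSON layout is an assumption.
type ringingEvent struct {
	LegID string `json:"leg_id"`
	From  string `json:"from"`
	To    string `json:"to"`
}

// answerRequest parses a leg.ringing payload and builds the
// POST /v1/legs/{id}/answer request for that leg.
func answerRequest(baseURL string, payload []byte) (*http.Request, error) {
	var ev ringingEvent
	if err := json.Unmarshal(payload, &ev); err != nil {
		return nil, fmt.Errorf("decode event: %w", err)
	}
	if ev.LegID == "" {
		return nil, fmt.Errorf("event has no leg_id")
	}
	u := strings.TrimRight(baseURL, "/") + "/v1/legs/" + url.PathEscape(ev.LegID) + "/answer"
	return http.NewRequest(http.MethodPost, u, nil)
}

func main() {
	payload := []byte(`{"leg_id":"leg-123","from":"alice","to":"bob"}`)
	req, err := answerRequest("http://localhost:8080", payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Prints: POST http://localhost:8080/v1/legs/leg-123/answer
	fmt.Println(req.Method, req.URL.String())
	// Send with http.DefaultClient.Do(req) against a running instance.
}
```

Building the request separately from sending it keeps the example runnable without a live server. The same pattern extends to the room and agent endpoints.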
Everything you need to get started.
# Build and run
go build -o voiceblender ./cmd/voiceblender
./voiceblender
# Or run directly
go run ./cmd/voiceblender
# REST API on :8080, SIP on 127.0.0.1:5060

1. Register a webhook POST /v1/webhooks
2. Receive inbound call --> webhook: leg.ringing {leg_id, from, to}
3. Answer POST /v1/legs/{id}/answer
4. Create a room POST /v1/rooms
5. Add legs to room POST /v1/rooms/{id}/legs
6. Attach AI agent POST /v1/legs/{id}/agent
7. Start recording POST /v1/legs/{id}/record
8. Hang up DELETE /v1/legs/{id}

POST /v1/legs # Originate outbound SIP call
GET /v1/legs # List all legs
POST /v1/legs/{id}/answer # Answer ringing inbound leg
POST /v1/legs/{id}/early-media # Enable early media (183)
DELETE /v1/legs/{id} # Hang up
POST /v1/legs/{id}/dtmf # Send DTMF digits
POST /v1/legs/{id}/play # Play audio or tone
POST /v1/legs/{id}/tts # Text-to-speech
POST /v1/legs/{id}/record # Start recording
POST /v1/legs/{id}/stt # Start speech-to-text
POST /v1/legs/{id}/agent # Attach AI agent

POST /v1/rooms # Create room
GET /v1/rooms # List rooms
DELETE /v1/rooms/{id} # Delete room (hangs up all legs)
POST /v1/rooms/{id}/legs # Add leg to room
GET /v1/rooms/{id}/ws # Join room via WebSocket
POST /v1/rooms/{id}/play # Play audio or tone to room
POST /v1/rooms/{id}/tts # TTS to room
POST /v1/rooms/{id}/record # Record room mix
POST /v1/rooms/{id}/agent # Attach AI agent to room

export HTTP_ADDR=:8080 # REST API listen address
export SIP_BIND_IP=127.0.0.1 # IP for SDP/Contact/Via headers
export SIP_PORT=5060 # SIP listen port
export ICE_SERVERS=stun:stun.l.google.com:19302
export RECORDING_DIR=/tmp/recordings
export LOG_LEVEL=info # debug, info, warn, error
export WEBHOOK_URL=https://example.com/hooks
export ELEVENLABS_API_KEY=sk-... # TTS, STT, Agent
export VAPI_API_KEY=... # VAPI Agent provider
export S3_BUCKET=my-recordings # Optional S3 upload

# WebRTC
POST /v1/webrtc/offer # SDP offer/answer exchange
POST /v1/legs/{id}/ice-candidates # Add trickle ICE candidate
GET /v1/legs/{id}/ice-candidates # Get gathered ICE candidates
# Webhooks
POST /v1/webhooks # Register webhook
GET /v1/webhooks # List webhooks
DELETE /v1/webhooks/{id} # Unregister webhook

Measured end-to-end with real SIP calls using the built-in benchmark suite.
# Run the benchmark (default scales: 5, 10, 25, 50, 100 rooms)
go test -tags integration -v -timeout 300s \
-run TestConcurrentRoomsScale ./tests/integration/
# Example output at 100 rooms:
# Phase 1 — Setup: 100 rooms in 3.7s (26.9 rooms/sec)
# call+room setup latency: avg=570ms p50=615ms p95=728ms p99=751ms
# Goroutines: 1914 | Heap alloc: 19.0 MB
# Phase 2 — Sustaining 100 rooms for 3s... All 200 calls connected
# Phase 3 — Audio latency: avg=20ms p50=10ms p95=62ms p99=64ms
# Phase 4 — Teardown: 100 rooms in 5.6ms (17782 rooms/sec)

VoiceBlender is built by the community. Whether you write code, report bugs, or improve docs, every contribution matters.