© Copyright 2026 Voxworks. All Rights Reserved.

Voxworks Voice Engine

Australia's fastest AI voice engine for phone calls.

Built for ultra-low latency calling on Australian networks. Voxworks delivers natural conversational turn handling at speed, so your calls feel human, stay on-script, and stay safe.

Core Capabilities

Engineered for the millisecond.

Most LLMs are too slow for phone calls. We optimised the entire stack (VAD, STT, LLM, and TTS) to shave off every millisecond of delay.

Sub-800ms Response

Optimised for real-time calling. We host inference at the edge in Sydney to minimise network travel time for local calls.

True Barge-In

Users can interrupt the AI mid-sentence. Our Voice Activity Detection (VAD) stops audio playback instantly, just like a human would.
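As a rough illustration of barge-in (a sketch, not the Voxworks implementation), the snippet below stops playback the moment a VAD flag reports caller speech. The function name and the one-flag-per-chunk input are assumptions made for the example.

```python
from typing import Iterable, List


def play_with_barge_in(chunks: Iterable[bytes],
                       caller_speaking: Iterable[bool]) -> List[bytes]:
    """Return the chunks actually played before the caller interrupted.

    Hypothetical helper: `caller_speaking` holds one VAD decision per chunk.
    """
    played: List[bytes] = []
    for chunk, speaking in zip(chunks, caller_speaking):
        if speaking:
            # Caller barged in: drop this chunk and everything queued after it.
            break
        played.append(chunk)
    return played


audio = [b"a", b"b", b"c", b"d"]
print(play_with_barge_in(audio, [False, False, True, False]))  # [b'a', b'b']
```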

Telco Noise Robustness

Tuned for 8kHz phone audio. It filters out background noise, static, and poor reception to understand intent clearly.

Safety Guardrails

Define strict guardrails that control what the AI says about your products or competitors, rather than letting it improvise.

Smart Endpointing

The engine distinguishes between a "pause for thought" and "end of turn," reducing those awkward moments where the AI cuts you off.
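One common way to tell a pause from an end of turn is a silence timeout. The sketch below assumes 20ms frames and a per-frame speech/silence flag; the class, field, and function names are hypothetical and are not the engine's actual API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Endpointer:
    """Declares end of turn once silence has lasted `timeout_ms`."""
    timeout_ms: int
    frame_ms: int = 20          # assumed frame duration
    silence_ms: int = 0
    heard_speech: bool = False

    def push(self, is_speech: bool) -> bool:
        """Feed one VAD decision; return True when the turn looks complete."""
        if is_speech:
            self.heard_speech = True
            self.silence_ms = 0     # any speech resets the silence timer
            return False
        if not self.heard_speech:
            return False            # leading silence is not a turn
        self.silence_ms += self.frame_ms
        return self.silence_ms >= self.timeout_ms


def first_endpoint(timeout_ms: int, vad_flags: Iterable[bool]) -> Optional[int]:
    """Index of the frame at which end of turn is first declared, if any."""
    endpointer = Endpointer(timeout_ms)
    for index, flag in enumerate(vad_flags):
        if endpointer.push(flag):
            return index
    return None


# 100ms of speech, a 600ms pause, 100ms of speech, then 1400ms of silence.
call = [True] * 5 + [False] * 30 + [True] * 5 + [False] * 70
print(first_endpoint(400, call))   # 24: a short timeout cuts in during the pause
print(first_endpoint(1200, call))  # 99: a long timeout waits for the real end
```

A shorter timeout responds faster but risks cutting the caller off mid-thought; a longer one is safer for callers who pause often.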

Why Developers Choose Voxworks

Latency is the feature.

In voice, speed = intelligence. Slow responses make users hang up. Our infrastructure is peered directly with major AU carriers to ensure the fastest possible packet transit.

[Diagram: Voxworks Voice Engine (Sydney). I/O Channels (Text, HTTP / Webhooks, Audio Out, Contact), Gateway, STT, LLM, TTS, Custom Business Logic, Calendar, CRM.]

Pre-configured Agents

Don't start from zero. Clone these battle-tested JSON configurations.

The "Listener" Config

High endpointing timeout (1200ms) for consultative calls where users speak in long paragraphs.

Therapy, Support, Advice

The "Rapid Fire" Config

Aggressive turn-taking (400ms endpointing) and concise answers for fast-paced qualifying calls.

Leads, Qualifying, Dispatch

The "Secure" Config

PII scrubbing enabled, strict state enforcement, and zero data retention mode.

Finance, Health, Enterprise
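To make the three presets concrete, here is a sketch of how they might look as JSON and how one could be cloned with overrides. Only the values (1200ms, 400ms, PII scrubbing, strict state enforcement, zero retention) come from the descriptions above; every key name and the `clone_preset` helper are assumptions, not the real Voxworks schema.

```python
import json

# Hypothetical key names; only the values are taken from the preset descriptions.
PRESETS = {
    "listener": {"endpointing_ms": 1200},
    "rapid_fire": {"endpointing_ms": 400, "concise_answers": True},
    "secure": {
        "pii_scrubbing": True,
        "strict_state_enforcement": True,
        "data_retention": "none",
    },
}


def clone_preset(name: str, **overrides) -> dict:
    """Copy a preset and apply overrides without mutating the original."""
    config = dict(PRESETS[name])
    config.update(overrides)
    return config


# Clone "Rapid Fire" with a slightly tighter endpointing timeout.
print(json.dumps(clone_preset("rapid_fire", endpointing_ms=350), indent=2))
```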

Hear the Engine

Listen to raw audio output demonstrating edge-case handling.

Interruption Handling

User: "Actually, wait, stop." -> AI stops instantly and asks for clarification.

Fast Turn-Taking

A rapid back-and-forth conversation checking name, address, and date without pauses.

Aussie Slang Test

User uses terms like "Arvo", "Rego", and "Ute" -> AI understands and responds appropriately.

Architecture

How the Voice Engine pipeline works.

1. Audio Ingest (WebSocket)

Your telephony provider sends µ-law 8kHz audio via WebSocket. We buffer and process it, typically in under 50ms.
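For readers unfamiliar with the format, µ-law (G.711) packs each sample into one byte, so a 20ms frame at 8kHz is 160 bytes. The sketch below decodes such a frame to 16-bit linear PCM using the standard G.711 expansion; the function names and the 20ms frame size are assumptions for illustration, not a description of the engine's internals.

```python
from typing import List


def mulaw_to_pcm16(byte: int) -> int:
    """Expand one G.711 µ-law byte to a signed 16-bit linear sample."""
    u = ~byte & 0xFF                 # µ-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


# Precompute all 256 values once; decoding is then a table lookup per byte.
_TABLE = [mulaw_to_pcm16(b) for b in range(256)]


def decode_frame(payload: bytes) -> List[int]:
    """Decode a frame of µ-law bytes into linear PCM samples."""
    return [_TABLE[b] for b in payload]


silence = bytes([0xFF]) * 160        # 20ms of digital silence at 8kHz
samples = decode_frame(silence)
print(len(samples), samples[0])      # 160 0
```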

2. VAD & Transcription

Our Voice Activity Detector flags speech vs noise. The transcription model converts speech to text, optimized for Australian accents.
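The simplest possible form of voice activity detection is an energy threshold, shown below purely to illustrate the speech-versus-noise decision. It is a stand-in technique, not the Voxworks detector, and the threshold of 500 is an arbitrary assumption that would need tuning on real audio.

```python
import math
from typing import Sequence


def rms(samples: Sequence[int]) -> float:
    """Root-mean-square level of a frame of 16-bit PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_speech(samples: Sequence[int], threshold: float = 500.0) -> bool:
    """Flag a frame as speech when its energy reaches the (assumed) threshold."""
    return rms(samples) >= threshold


quiet_frame = [0] * 160
loud_frame = [2000, -2000] * 80
print(is_speech(quiet_frame), is_speech(loud_frame))  # False True
```

Energy alone cannot separate speech from loud background noise, which is why production detectors typically rely on trained models instead.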

3. Reasoning & State Check

The LLM determines the next action based on your defined State Graph. It checks guardrails before generating a single token.
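A minimal sketch of the idea, assuming the State Graph is a map from each state to the states it may move to, and that guardrails are a set of blocked topics checked before any text is generated. The state names, topics, and function are all hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical state graph: each state lists the states it may transition to.
STATE_GRAPH: Dict[str, Set[str]] = {
    "greeting": {"collect_name"},
    "collect_name": {"collect_address"},
    "collect_address": {"confirm_booking"},
    "confirm_booking": set(),
}

# Hypothetical guardrail: topics the agent must never discuss.
BLOCKED_TOPICS: Set[str] = {"competitor pricing"}


def next_action(current: str, proposed: str, topic: str) -> Tuple[str, str]:
    """Decide what to do before generating any reply text.

    Returns (action, state), where action is "refuse", "stay", or "advance".
    """
    if topic in BLOCKED_TOPICS:
        return ("refuse", current)       # guardrail wins over everything else
    if proposed not in STATE_GRAPH.get(current, set()):
        return ("stay", current)         # transition not allowed by the graph
    return ("advance", proposed)


print(next_action("greeting", "collect_name", "booking"))        # ('advance', 'collect_name')
print(next_action("greeting", "confirm_booking", "booking"))     # ('stay', 'greeting')
print(next_action("greeting", "collect_name", "competitor pricing"))  # ('refuse', 'greeting')
```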

4. Streaming Synthesis

The TTS engine begins streaming audio bytes back to the caller before the full sentence is even generated.
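One way to start speaking before a sentence is complete is to cut the LLM token stream at clause boundaries and hand each clause to TTS as soon as it closes. The generator below sketches that approach; the function name and the choice of punctuation marks are assumptions, and the real engine may segment differently.

```python
from typing import Iterable, Iterator


def clauses(tokens: Iterable[str], breaks: str = ".,;:!?") -> Iterator[str]:
    """Yield speakable clauses as soon as a token ends with a break character."""
    buffer = []
    for token in tokens:
        buffer.append(token)
        if token.rstrip().endswith(tuple(breaks)):
            text = "".join(buffer).strip()
            if text:
                yield text          # ready for synthesis; later tokens not needed yet
            buffer = []
    text = "".join(buffer).strip()
    if text:
        yield text                  # flush whatever remains at end of stream


stream = ["Sure", ",", " I", " can", " book", " that", ".", " What", " time", "?"]
for clause in clauses(stream):
    print(clause)
# Sure,
# I can book that.
# What time?
```

Because `clauses` is a generator, the first clause is available after only two tokens, so synthesis can begin while the model is still producing the rest of the reply.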

Connectivity

Works with your stack.

Connect via standard protocols. No proprietary hardware required.

Twilio Media Streams

One-click XML configuration to fork audio from Twilio Programmable Voice.
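For orientation, a Twilio audio fork is set up with a TwiML `<Start><Stream>` element, after which Twilio sends JSON messages over the WebSocket whose `media.payload` field carries base64-encoded µ-law audio. The sketch below shows both pieces; the WebSocket URL is a placeholder rather than a real Voxworks endpoint, and `extract_audio` is a hypothetical helper.

```python
import base64
import json
from typing import Optional

# TwiML that forks call audio to a WebSocket. The URL is a placeholder.
TWIML = """<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Start>
    <Stream url="wss://example.invalid/audio" />
  </Start>
</Response>"""


def extract_audio(message: str) -> Optional[bytes]:
    """Return raw µ-law bytes from a Media Streams message, or None.

    Non-media events (such as "start" or "stop") carry no audio.
    """
    event = json.loads(message)
    if event.get("event") != "media":
        return None
    return base64.b64decode(event["media"]["payload"])


sample = json.dumps({
    "event": "media",
    "media": {"payload": base64.b64encode(b"\xff" * 160).decode("ascii")},
})
audio = extract_audio(sample)
print(len(audio))  # 160
```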

Telnyx / SignalWire

Native support for VXML and WebSocket audio forks.

SIP Trunking

Direct SIP-in capabilities for high-volume enterprise diallers.

Voxworks API

Control the call logic from your own backend code in real-time.

Infrastructure

Enterprise uptime & security.

Check our Status Page for real-time latency metrics across all Australian capital cities.

  • 99.95% Uptime.
  • Servers located in Sydney (AWS ap-southeast-2) for minimum latency.
  • ISO 27001-aligned infrastructure.
  • Ephemeral processing mode (no audio stored to disk).
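
To put the 99.95% figure in concrete terms, the arithmetic below converts an uptime percentage into the downtime it still permits. The function name is an assumption; the calculation itself is plain percentage arithmetic.

```python
def allowed_downtime_minutes(uptime_percent: float, days: int) -> float:
    """Minutes of downtime permitted over `days` at the given uptime level."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


print(round(allowed_downtime_minutes(99.95, 30), 1))   # 21.6 minutes per 30 days
print(round(allowed_downtime_minutes(99.95, 365), 1))  # 262.8 minutes per year
```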
Developer FAQs

Technical Questions

On Australian networks, we aim for <800ms "Voice-to-Voice" latency. This includes network transit, transcription, LLM token generation, and TTS synthesis.
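
A latency target like this is easiest to reason about as a budget split across the four stages. The sketch below checks a breakdown against the 800ms target; the per-stage numbers are illustrative placeholders, not measured Voxworks figures, and the function name is an assumption.

```python
from typing import Dict


def budget_headroom(stages_ms: Dict[str, int], target_ms: int = 800) -> int:
    """Milliseconds left over (negative if the stages exceed the target)."""
    return target_ms - sum(stages_ms.values())


# Placeholder values for illustration only; substitute your own measurements.
example = {
    "network_transit": 60,
    "transcription": 150,
    "llm_token_generation": 300,
    "tts_synthesis": 150,
}
print(sum(example.values()), budget_headroom(example))  # 660 140
```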

Start building today.

Log in today and make your first AI phone call in under 5 minutes.