API Speech Recognition

Integrated into your conversational tools: best-in-class speech recognition, built for developers.

Word Error Rate

< 5%

Most accurate engine in France, after Language Model Factory (LMF) adaptation

Latency

< 600 ms

On live conversational streams, in real conditions

Pricing

Lowest

On the market, with per-minute commitment
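
For reference, word error rate is conventionally computed as the number of word substitutions, deletions and insertions needed to turn the engine's output into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition; it is not the benchmark procedure behind the figure quoted above, and the function name and the lowercase, whitespace-based tokenisation are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    # Assumption: lowercase and split on whitespace; real evaluations usually
    # apply a more careful text normalisation before scoring.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

With this definition, one wrong word in a 20-word reference gives a WER of 0.05, which is the 5% threshold.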

Trusted by leading enterprises, integrators and researchers

Language Model Factory
Express language model adaptation

With our Language Model Factory (LMF), no domain vocabulary goes unrecognised. Train custom models in as little as 15 minutes and select the one that best fits your use case.

A production-ready model in 15 minutes

Jargon, proper nouns, industry acronyms

Pre-trained vertical models available off the shelf

01

Vertical models

Pre-trained on industry-specific verticals.

02

Jargon & proper nouns

Vocabulary tied to your business.

03

Acronyms

Automatic recognition and expansion.

04

Business expressions

Specific to your processes and use cases.

Vocal Cookie

Securing and redacting sensitive data

Automatic real-time redaction of sensitive information based on your use cases: personal data, banking and health records.

Transcript without redaction / Transcript with redaction
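
To make the before/after comparison concrete, here is a minimal sketch of pattern-based redaction applied to a finished transcript. It is an illustration only, not how Vocal Cookie works: the two regular expressions, the placeholder labels and the `redact` function are all hypothetical, and a production system would need far broader coverage (names, health data, spoken-out digits).

```python
import re

# Hypothetical patterns, for illustration only.
PATTERNS = {
    # 13 to 19 digits, optionally separated by single spaces or hyphens.
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,18}\b"),
    # A simple email shape: local part, "@", domain with at least one dot.
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


without_redaction = (
    "My card number is 4111 1111 1111 1111 "
    "and my email is jane.doe@example.com."
)
with_redaction = redact(without_redaction)
print(with_redaction)
```

Here the card number and the email address are replaced by `[CARD]` and `[EMAIL]`, while the rest of the sentence is left untouched.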

Batch
Speech-to-Text API for audio recordings

Simply drop your phone conversations onto a secure FTP for transcription within minutes, or use our connectors.

Available languages

French, English, Spanish, German, Italian, +5 others (Europe)

Python SDK
import uhlive

# Connect to the API
client = uhlive.connect("api.uh.live")

# Submit a recording for batch transcription, with redaction enabled
job = client.transcribe_file(
    file="call_2026-04-28.wav",
    model="en-telephony-v3",
    redaction=True,
)

# Fetch the result and print the transcript text
transcript = job.result()
print(transcript.text)
Streaming — Live
Streaming API for humans

Connect your audio streams directly via WebSocket to receive real-time multi-speaker transcription, or via SIP trunk / SIPREC.

Available languages

French, English, Spanish and German
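
Streaming clients generally send audio as a sequence of small, fixed-duration frames, not as one large payload. The sketch below shows only that framing step. It assumes raw 16-bit mono PCM at 8 kHz, a common telephony format chosen for illustration and not a documented requirement of this API; the `pcm_frames` helper is hypothetical, and the WebSocket connection itself is deliberately left out.

```python
from typing import Iterator


def pcm_frames(
    audio: bytes,
    sample_rate: int = 8000,  # assumed: 8 kHz telephony audio
    sample_width: int = 2,    # assumed: 16-bit samples, mono
    frame_ms: int = 20,       # assumed frame duration
) -> Iterator[bytes]:
    """Yield consecutive frames of raw PCM; the last one may be shorter."""
    frame_bytes = sample_rate * sample_width * frame_ms // 1000
    if frame_bytes <= 0:
        raise ValueError("frame size must be at least one byte")
    for start in range(0, len(audio), frame_bytes):
        yield audio[start:start + frame_bytes]


# One second of silence, split into 20 ms frames of 320 bytes each.
silence = bytes(8000 * 2)
frames = list(pcm_frames(silence))
```

Each frame would then be written to the open connection as a binary message, paced at roughly real time for live audio.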

Streaming — Bot
Streaming API for bots

Streaming API for IVRs and voicebots. Transcribe your live interactions with our advanced solutions.

Our protocols

MRCP v2

WebSocket

Built-in for every interaction

Voice activity detection, language model selection, grammars, and recognition of addresses, dates, numbers and yes/no answers.

Available languages

French, English, Spanish and German

WER < 5%

Most accurate engine in France

100 M

Calls analysed per year

40%

Of analyses in real time

Ready to transcribe your first calls?

Access the uh!ive Speech-to-Text API. Set up in minutes.