Pricing · June 28, 2026

The 10M-minute audit: what speech analytics platforms actually pay for STT

If your platform processes 10 million minutes a month, your STT line item ranges from $4,200 to $240,000 depending on the vendor. The full per-minute audit with sourced prices, and the line items inside the line item.

Mateo Bustamante · 7 min read

Speech analytics platforms tend to talk about volume in seat counts and revenue in ARR. The number that decides the gross margin lives in neither place. It's the per-minute STT cost at the platform's actual monthly throughput, and at production scale it ranges almost two orders of magnitude across the vendors most teams evaluate.

The volume number nobody publishes

Public benchmarks rarely include 10M+ min/mo workloads because the vendors who can serve them sell on enterprise contracts and prefer not to anchor pricing in the open. The volume itself isn't exotic: a single conversation intelligence (CI) platform with 1,000 moderately active seats hits it, and a contact-center QA tool running on a 5,000-agent client more than doubles it. It's the threshold at which a 10× price spread stops being a footnote and becomes a board-level conversation.

At 10 million minutes per month, here is what the same workload — transcripts plus speaker diarization, delivered through the vendor's standard production endpoint — actually costs:

Prices as of June 2026 · public pricing pages, normalized to USD/min

Provider	Batch ($/min)	Real-time ($/min)	Notes
Orchard (production volume)	$0.00042	$0.00042	Bundle: STT + speaker diarization on the same per-minute rate. No per-feature add-ons.
OpenAI Whisper API	$0.0060	—	Batch-only. Diarization is not a first-party feature. Real-time requires a separate vendor.
Deepgram Nova-3	$0.0043	$0.0077	STT only. Diarization, sentiment, redaction, summarization each priced separately.
AssemblyAI Universal	$0.0117	$0.0150	Bundles speaker labels at the listed rate. Sentiment, entity detection, content moderation are paid add-ons.
Azure AI Speech	$0.0167	$0.0167	Standard tier. Conversation transcription with speaker separation listed as the same rate.
AWS Transcribe	$0.0240	$0.0240	Standard tier. Call Analytics (sentiment + categories) is priced at $0.0365/min for the first 250K min.
Google Cloud STT	$0.0240	$0.0240	Standard model. Speaker diarization is bundled at the rate above; Conversation Insights is a separate API.

Multiplied across 10M minutes, the monthly bills land at:

Orchard — $4,200 / month
Deepgram Nova-3 — $43,000 / month (before diarization add-on)
OpenAI Whisper API — $60,000 / month (batch only)
AssemblyAI Universal — $117,000 / month
Azure AI Speech — $167,000 / month
AWS Transcribe — $240,000 / month (standard; Call Analytics is materially higher)
Google Cloud STT — $240,000 / month
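
The arithmetic behind the list above can be reproduced in a few lines. This is a minimal sketch that assumes a flat per-minute rate with no volume tiers or add-ons; the rates are the batch column of the table, and the names `BATCH_RATE_USD_PER_MIN` and `monthly_bill` are illustrative, not part of any vendor's SDK.

```python
from decimal import Decimal

MONTHLY_MINUTES = 10_000_000

# Batch $/min rates copied from the pricing table (June 2026 list prices).
BATCH_RATE_USD_PER_MIN = {
    "Orchard": Decimal("0.00042"),
    "Deepgram Nova-3": Decimal("0.0043"),
    "OpenAI Whisper API": Decimal("0.0060"),
    "AssemblyAI Universal": Decimal("0.0117"),
    "Azure AI Speech": Decimal("0.0167"),
    "AWS Transcribe": Decimal("0.0240"),
    "Google Cloud STT": Decimal("0.0240"),
}


def monthly_bill(rate_per_min, minutes=MONTHLY_MINUTES):
    """Monthly cost in USD, assuming a flat per-minute rate (no tiers)."""
    return rate_per_min * minutes


bills = {name: monthly_bill(rate) for name, rate in BATCH_RATE_USD_PER_MIN.items()}

# Print vendors from cheapest to most expensive.
for name, bill in sorted(bills.items(), key=lambda kv: kv[1]):
    print(f"{name:<22} ${bill:>10,.0f} / month")

# Annualized gap between the cheapest and the most expensive vendor.
annual_spread = (max(bills.values()) - min(bills.values())) * 12
print(f"Annual spread: ${annual_spread:,.0f}")
```

`Decimal` is used so the small per-minute rates multiply out to exact dollar figures rather than floating-point approximations.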

At 10M minutes a month, picking the wrong STT vendor is a $2.8M/year mistake.

The line items inside the line item

The per-minute number is rarely the per-minute number. Three line items move it materially in either direction depending on the vendor, and most internal audits miss at least one:

Diarization billed separately. Deepgram, Rev.ai, Gladia, Google CCAI all charge speaker labels as an add-on. At call-center volume that's typically a 20-50% uplift on the published STT rate. Orchard bundles diarization at the same per-minute price; the rate you read is the rate you pay.
Intelligence stacking. Sentiment, topic detection, entity extraction, summarization, redaction — each is a separate per-minute charge on AssemblyAI and AWS Transcribe Call Analytics. A "$0.0117/min" listed price ships at 3-4× that on the actual invoice once a real QA pipeline is wired up.
Real-time premium. AssemblyAI, Rev.ai, Deepgram all charge real-time at 30-200% above batch. If your QA pipeline mixes both (live agent assist + recorded batch review), the blended rate doesn't match either column on the pricing page.
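
A rough way to see how these three line items combine into one effective rate is sketched below. The model is an assumption, not any vendor's billing formula: it blends batch and real-time by traffic share, applies diarization as a percentage uplift on that blend, and adds intelligence features as flat per-minute charges. The 30% real-time share and 35% uplift in the example are placeholders picked from inside the ranges quoted above.

```python
def effective_rate(batch_rate, realtime_rate, realtime_share,
                   diarization_uplift=0.0, addon_rates=()):
    """Blended $/min under a simplified (assumed) billing model.

    realtime_share      fraction of minutes processed in real time (0..1)
    diarization_uplift  fractional uplift on the STT rate, e.g. 0.35 for 35%
    addon_rates         flat $/min charges for each intelligence feature
    """
    if not 0.0 <= realtime_share <= 1.0:
        raise ValueError("realtime_share must be between 0 and 1")
    # Weighted average of the two published columns.
    base = (1.0 - realtime_share) * batch_rate + realtime_share * realtime_rate
    # Diarization as an uplift on the blend, then flat per-minute add-ons.
    return base * (1.0 + diarization_uplift) + sum(addon_rates)


# Deepgram Nova-3 list prices from the table; share and uplift are placeholders.
rate = effective_rate(0.0043, 0.0077, realtime_share=0.30, diarization_uplift=0.35)
print(f"Effective rate: ${rate:.6f}/min")
print(f"Monthly at 10M minutes: ${rate * 10_000_000:,.0f}")
```

With a bundled vendor the uplift and add-on terms are zero, so the function returns the weighted average of the two published rates.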

What changes at $4,200/mo

A 10M-minute platform spending $240K/month on AWS is spending $2.88M/year on a function that, at $4,200/month, costs about $50K/year. The delta, $2.83M, is not a savings line. It's a re-deployable budget the size of a small engineering team's annual cost. The math doesn't ask whether the migration is worth it. It asks how fast you can ship the swap.

Running the audit yourself

In our experience, the cleanest version of this audit is four rows on a single spreadsheet:

Last 6 months of actual minute volume (invoice or usage console).
Last 6 months of actual STT spend across every line item: base STT, diarization, intelligence, real-time premium.
Blended price per minute — divide spend by volume; this is the only number that matters for the comparison.
Re-price that same volume at $0.00042/min and read the delta off the bottom-right cell.
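
For teams that prefer a script to a spreadsheet, the four rows above can be sketched as follows. Every figure in `rows` is a made-up placeholder that shows the shape of the input; replace it with your own invoice data. The `MonthRow` and `audit` names are hypothetical, and the target rate is the $0.00042/min figure from the table.

```python
from dataclasses import dataclass

TARGET_RATE = 0.00042  # $/min, the rate used in the re-pricing step


@dataclass
class MonthRow:
    month: str
    minutes: int             # actual minute volume for the month
    base_stt: float          # USD spend per line item
    diarization: float
    intelligence: float
    realtime_premium: float

    @property
    def spend(self):
        """Total STT spend for the month across every line item."""
        return (self.base_stt + self.diarization
                + self.intelligence + self.realtime_premium)


def audit(rows, target_rate=TARGET_RATE):
    """Blended $/min over the period, plus the same volume re-priced."""
    total_minutes = sum(r.minutes for r in rows)
    total_spend = sum(r.spend for r in rows)
    repriced = total_minutes * target_rate
    return {
        "total_minutes": total_minutes,
        "total_spend": total_spend,
        "blended_rate": total_spend / total_minutes,
        "repriced_spend": repriced,
        "delta": total_spend - repriced,
    }


# PLACEHOLDER data: six identical months, purely illustrative.
rows = [
    MonthRow(f"month-{i}", 10_000_000, 43_000.0, 10_000.0, 5_000.0, 2_000.0)
    for i in range(1, 7)
]

report = audit(rows)
print(f"Blended rate: ${report['blended_rate']:.5f}/min")
print(f"Delta over the period: ${report['delta']:,.0f}")
```

The blended rate is computed over the whole period rather than averaged month by month, so high-volume months carry proportionally more weight.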

That spreadsheet is the entire business case. The hard part of the migration is the spreadsheet, not the code; the code is one line.

The cheapest minute on the market. 500 minutes free at signup, no card.