A Practitioner's Guide to A2P SMS Delivery
What actually breaks in A2P SMS at scale — DLRs, throughput, sender IDs, registration, and the operational habits that keep delivery healthy.
Why "send the SMS" is the easy part
Most messaging platforms make the API call look trivial: one POST, one message, one delivery report. In production, the gap between "accepted by the carrier" and "read by the user" is where revenue, OTPs, and reputation actually live.
This guide covers the operational realities of Application-to-Person (A2P) SMS at scale — the things vendors don't put on their landing pages.
DLRs are not delivery
A delivery receipt (DLR) tells you what the carrier or aggregator reported about the message. It does not tell you the handset received or rendered it. A "DELIVRD" status from a poorly behaved route can sit alongside a 30% silent-failure rate caused by:
- Spam filtering at the operator
- Aggregator-level throttling that quietly drops over-quota traffic
- Handset-level blocklists on unregistered sender IDs
- Number portability mismatches that route to a dead range
The only honest delivery signal is conversion: did the user click the link, enter the OTP, or reply? Everything else is a proxy. Build dashboards that correlate DLR status with downstream conversion per route, not in aggregate.
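As a rough illustration of that per-route correlation, the sketch below computes the DLR rate and the conversion rate among "DELIVRD" messages for each route. The log format, route names, and figures are hypothetical; a real pipeline would read from your own message store and conversion events.

```python
from collections import defaultdict

# Hypothetical message log rows: (route, dlr_status, converted)
messages = [
    ("route_a", "DELIVRD", True),
    ("route_a", "DELIVRD", True),
    ("route_a", "DELIVRD", False),
    ("route_a", "UNDELIV", False),
    ("route_b", "DELIVRD", False),
    ("route_b", "DELIVRD", False),
    ("route_b", "DELIVRD", True),
    ("route_b", "DELIVRD", False),
]

def per_route_stats(rows):
    """Return {route: (dlr_rate, conversion_rate_among_delivered)}."""
    sent = defaultdict(int)
    delivered = defaultdict(int)
    converted = defaultdict(int)
    for route, status, did_convert in rows:
        sent[route] += 1
        if status == "DELIVRD":
            delivered[route] += 1
            if did_convert:
                converted[route] += 1
    stats = {}
    for route in sent:
        dlr_rate = delivered[route] / sent[route]
        # Conversion is measured only over messages the route claims it delivered.
        conv_rate = converted[route] / delivered[route] if delivered[route] else 0.0
        stats[route] = (dlr_rate, conv_rate)
    return stats

stats = per_route_stats(messages)
for route, (dlr_rate, conv_rate) in sorted(stats.items()):
    print(f"{route}: DLR {dlr_rate:.0%}, conversion among delivered {conv_rate:.0%}")
```

In this made-up data, route_b reports a perfect DLR rate but converts far worse than route_a, which is the pattern an aggregate dashboard would hide.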
Throughput is a contract, not a number
When a vendor advertises "200 messages/second", read the small print. That figure is usually:
- Per connection, not per account
- Subject to fair-use throttling
- Negotiated separately for short codes vs. long codes vs. alphanumeric senders
- Different again for the first hour of a campaign vs. steady state
For OTP traffic, peak-second capacity matters more than daily volume. A login spike of 5,000 OTPs in 30 seconds will hit ceilings that a 500,000/day batch never touches. Always pressure-test with realistic burst patterns, not steady-state averages.
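The arithmetic behind that comparison can be sketched as follows. The 50 messages/second cap is an assumed figure for illustration only, and the drain model is deliberately simple (uniform arrivals, fixed throughput ceiling, no retries).

```python
def avg_rate(messages: int, seconds: float) -> float:
    """Average messages per second over a window."""
    return messages / seconds

burst_rate = avg_rate(5_000, 30)             # login spike from the text
daily_rate = avg_rate(500_000, 24 * 3600)    # daily batch spread evenly

def drain_seconds(messages: int, arrival_window_s: float, cap_mps: float) -> float:
    """Seconds until the last message is sent, assuming uniform arrivals
    over the window and a fixed throughput cap (a simplified model)."""
    return max(arrival_window_s, messages / cap_mps)

ASSUMED_CAP_MPS = 50  # hypothetical contracted ceiling, not a real vendor figure
finish = drain_seconds(5_000, 30, ASSUMED_CAP_MPS)
last_message_delay = finish - 30  # extra wait for the final OTP in the spike

print(f"burst: {burst_rate:.1f} msg/s, daily batch: {daily_rate:.2f} msg/s")
print(f"at {ASSUMED_CAP_MPS} msg/s the spike drains in {finish:.0f}s "
      f"(last OTP delayed by about {last_message_delay:.0f}s)")
```

Under these assumptions the spike needs roughly 167 messages/second while the daily batch averages under 6, so a ceiling that is comfortable for the batch leaves the final OTPs in the spike queued for more than a minute.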
Sender ID, registration, and the 10DLC tax
In the US, A2P traffic on long codes now requires 10DLC registration through The Campaign Registry. Unregistered traffic gets filtered, throttled, or surcharged. In many other markets, alphanumeric sender IDs need pre-registration, sometimes with each MNO individually, and the requirements differ by country and operator.
Operationally this means:
- A change of sender ID is a 2–6 week project, not a config flip
- Each market needs its own registration calendar
- Onboarding a new vendor for failover requires duplicating registrations end-to-end
Plan registrations ahead of campaigns. Maintain a "warm" pool of pre-registered sender IDs you can rotate to during incidents.
Routing decisions that actually matter
Once traffic exceeds a few million messages a month, routing becomes the highest-leverage decision you make. The choices that move the needle:
- Direct vs. aggregator per destination — direct connections give better DLR fidelity but require operational overhead per market.
- Primary + warm backup per route — not just one fallback, but a vendor you actually send live traffic to so it stays warm.
- Per-traffic-class routing — transactional (OTP) and marketing should never share a route. Marketing surges hurt OTP latency.
- Cost-per-converted-message, not cost-per-message — the cheapest route is the one whose users actually receive the message.
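To make the last point concrete, here is a minimal sketch of comparing routes by cost per converted message. The route names, prices, and counts are invented for illustration, and "converted" stands for whatever downstream signal you track (OTP entered, link clicked, reply received).

```python
from dataclasses import dataclass

@dataclass
class RouteStats:
    name: str
    price_per_message: float  # what the vendor bills per submitted message
    sent: int
    converted: int

    @property
    def cost_per_converted(self) -> float:
        """Total spend divided by messages that actually converted."""
        if self.converted == 0:
            return float("inf")  # a route that never converts is infinitely expensive
        return self.price_per_message * self.sent / self.converted

# Hypothetical figures for two routes to the same destination.
routes = [
    RouteStats("cheap_route", 0.010, sent=10_000, converted=4_000),
    RouteStats("premium_route", 0.015, sent=10_000, converted=8_000),
]

best = min(routes, key=lambda r: r.cost_per_converted)
for r in routes:
    print(f"{r.name}: {r.cost_per_converted:.5f} per converted message")
print(f"best value: {best.name}")
```

With these numbers the route that is 50% more expensive per message works out cheaper per conversion, because twice as many of its messages lead to a user action.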
What an "incident" looks like
Production messaging incidents rarely look like outages. They look like:
- DLRs at 99% but conversion at 60% on one specific operator
- A jump in OTP delivery time in one country that pushes codes past their 60-second validity window
- A vendor silently re-routing your traffic through a third party at month-end
- Carrier filter rules changing without notice after a regulator advisory
Detection requires per-route, per-operator, per-traffic-class observability. Resolution requires a vendor relationship where you can escalate within minutes, not days.
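One way to approach that detection is to compare each segment against its own baseline and flag cases where DLRs still look healthy but conversion has dropped. The sketch below assumes you already have per-segment rates; the thresholds, segment names, and figures are placeholders to tune, not recommendations.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    route: str
    operator: str
    traffic_class: str

def find_silent_failures(current, baseline, min_dlr=0.95, max_relative_drop=0.20):
    """Flag segments whose DLR rate looks healthy while conversion has
    fallen by more than max_relative_drop relative to the baseline.

    current / baseline: {Segment: (dlr_rate, conversion_rate)}
    """
    flagged = []
    for seg, (dlr, conv) in current.items():
        if seg not in baseline:
            continue  # no history to compare against
        _, base_conv = baseline[seg]
        if base_conv <= 0:
            continue  # avoid dividing by zero on segments that never converted
        drop = (base_conv - conv) / base_conv
        if dlr >= min_dlr and drop > max_relative_drop:
            flagged.append((seg, drop))
    return flagged

# Hypothetical rates: (dlr_rate, conversion_rate)
seg_x = Segment("vendor_a", "operator_x", "otp")
seg_y = Segment("vendor_a", "operator_y", "otp")
baseline = {seg_x: (0.99, 0.85), seg_y: (0.98, 0.80)}
current = {seg_x: (0.99, 0.60), seg_y: (0.98, 0.78)}

flagged = find_silent_failures(current, baseline)
for seg, drop in flagged:
    print(f"{seg.route}/{seg.operator}/{seg.traffic_class}: conversion down {drop:.0%}")
```

Keying on route, operator, and traffic class together is what lets a problem on one operator surface instead of being averaged away.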
What to put in your runbook
If you operate messaging in-house, your runbook should answer:
- Which vendor handles each market today, and which is the failover?
- What's the threshold to flip routing? Who authorises it?
- Who at each vendor takes a 3am call?
- What's the maximum acceptable conversion drop per market before escalation?
- How long does a sender-ID rotation take if the current one is suddenly filtered?
Most teams discover they don't have answers when they need them.
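One way to force those answers into existence is to keep the runbook as structured data that tooling can check, rather than as a wiki page. The sketch below is only an illustration: the field names, market, vendors, threshold, and contacts are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RunbookEntry:
    market: str
    primary_vendor: str
    failover_vendor: str
    max_conversion_drop: float  # relative drop that triggers escalation
    flip_authoriser: str        # role allowed to approve a routing flip
    vendor_oncall: str          # who at the vendor takes the 3am call

# Hypothetical entry; one per market you operate in.
RUNBOOK = {
    "GB": RunbookEntry(
        market="GB",
        primary_vendor="vendor_a",
        failover_vendor="vendor_b",
        max_conversion_drop=0.15,
        flip_authoriser="messaging-oncall-lead",
        vendor_oncall="vendor_a NOC hotline",
    ),
}

def next_action(market: str, observed_drop: float) -> str:
    """Translate an observed conversion drop into the runbook's next step."""
    entry = RUNBOOK.get(market)
    if entry is None:
        return f"No runbook entry for {market}: escalate to a human immediately"
    if observed_drop > entry.max_conversion_drop:
        return (
            f"Escalate to {entry.vendor_oncall}; ask {entry.flip_authoriser} "
            f"to approve flipping {market} from {entry.primary_vendor} "
            f"to {entry.failover_vendor}"
        )
    return "Within threshold: keep monitoring"

print(next_action("GB", 0.30))
```

A missing entry is treated as a finding in its own right, since a market with no documented failover is exactly the gap the questions above are meant to expose.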
The build vs. operate decision
Building messaging connectivity is straightforward. Operating it — with the registrations, the routing, the vendor escalations, the per-operator anomaly detection — is a full-time discipline that scales with the number of markets and traffic classes.
That's the gap Flowstates fills. We run the gateway, the vendor relationships, and the operations so your team can focus on the product the messages are actually for.
If your A2P traffic has grown past the point where someone on your team can keep all of this in their head, book a 30-minute messaging review — we'll walk through your current setup and where the operational risk is.