DigitalOcean’s Inference Cloud Platform Doubles Throughput for Character.ai, Cutting Cost per Token by 50%

DOCN
January 13, 2026

DigitalOcean announced that its inference‑cloud platform, powered by AMD Instinct MI300X and MI325X GPUs, has doubled production inference throughput for Character.ai, a leading AI entertainment service that processes more than a billion queries per day. The upgrade also cut Character.ai’s cost per token by 50%, allowing the company to serve its massive user base at lower expense while maintaining strict latency targets.

The performance gains came from a close collaboration with AMD to optimize the ROCm driver stack, the vLLM inference engine, and the AITER runtime. DigitalOcean’s hardware‑aware scheduler and unified orchestration layer coordinated distributed inference across multiple GPU nodes, balancing the higher‑throughput workloads without sacrificing response time. The MI325X’s larger memory pool and higher memory bandwidth were key enablers for the large language models that Character.ai runs at scale.

For DigitalOcean, the result demonstrates the effectiveness of its full‑stack AI strategy. By delivering a 2× throughput increase and a 50% cost reduction, DigitalOcean shows that it can compete with hyperscalers on performance while offering a simpler, developer‑friendly experience. The partnership also expands DigitalOcean’s usable capacity for high‑volume customers, reinforcing its positioning as a cost‑effective alternative for inference workloads that demand both speed and predictability.

The achievement aligns with DigitalOcean’s broader AI momentum. In Q1 2025, the company reported a 14% year‑over‑year revenue increase to $211 million and AI annual recurring revenue growth of more than 160% year over year. The Character.ai milestone is a tangible example of how the company’s AI platform is translating into real‑world performance gains that can drive future revenue and margin expansion.

In the competitive landscape, DigitalOcean’s focus on a tightly integrated hardware‑software stack differentiates it from the larger hyperscalers, which typically offer more fragmented services. By leveraging AMD’s MI300X/MI325X GPUs and its own orchestration layer, DigitalOcean can deliver high‑throughput inference at a lower total cost of ownership, appealing to mid‑market enterprises that need AI capabilities without the complexity of multi‑cloud deployments.

The content on BeyondSPX is for informational purposes only and should not be construed as financial or investment advice. We are not financial advisors. Consult with a qualified professional before making any investment decisions. Any actions you take based on information from this site are solely at your own risk.