
Alon Horev
Co-Founder & CTO, VAST Data
The Conference for the Inference Era
The hardest problem in AI is no longer training models. It's running inference in production: latency, throughput, system reliability, and unit economics, all at once.
Deploy is where you will see how leading AI-native companies solve this, through real architectures, real tradeoffs, and live systems running today.
Across sessions and hands-on demos, you'll learn how teams are designing inference systems that scale without breaking performance or their margin structure.
Join leaders from Character.AI, Workato, VAST Data, Arcee, and the vLLM ecosystem as they break down how they run AI in production at scale.
You'll hear directly from:


Alon Horev
Co-Founder & CTO, VAST Data

VP, Generative AI Software Products, NVIDIA

CEO, Inferact

CEO, Arcee

Co-Founder, Hippocratic AI

Partner, Bessemer Venture Partners

Chief Architect, Character.ai

CEO, DigitalOcean

Co-Founder, Premise VC

AI Research Technical Lead, Workato

Co-Founder & CEO, Andi

CEO & Co-Founder, Weaviate

CPTO, DigitalOcean

Global Managing Partner, 500 Global

CTO/CISO, ISMG

Co-Founder, LawVo

CEO & Co-founder, Higgsfield AI

Senior Director, Developer Advocacy, DigitalOcean

Founder, Probably AI

Managing Partner, Antler

Head of Growth & Marketing, DigitalOcean

Staff Developer Advocate, DigitalOcean

SVP, Engineering, DigitalOcean

VP of Engineering, DigitalOcean

Senior Director, Marketing & Communications, DigitalOcean

Principal Engineer, DigitalOcean

Senior Director of Engineering, DigitalOcean

Senior Manager, Engineering, DigitalOcean

Director of Product Management, DigitalOcean
70% of AI spend is now inference. See how Character.AI partnered with Inferact and DigitalOcean to cut inference costs by 50%, while improving throughput, on AMD GPUs.
Scaling inference isn't a model problem. It's a decision-making problem. Industry leaders from Workato Research Lab, ISMG, and Hippocratic AI share the decisions, tradeoffs, and investments that got them to production AI at scale.
Kari Briski (VP, Generative AI Software Products, NVIDIA) and Salman Paracha (SVP, AI, DigitalOcean) discuss why AI-native teams demand openness, model flexibility, and infrastructure built for agents that never sleep, and how the convergence of NVIDIA's software layer and DigitalOcean's platform is designed to meet exactly that moment.
Everyone has access to the same models. So what actually matters? It's everything around them – routing requests to the right model, connecting to live data, scaling from prototype to production without ripping your code apart. We'll walk through the full journey: from a single API call to intelligent routing across GPU fleets, showing what it looks like when the platform owns the stack end to end. Live demos included.
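As a concrete picture of the "single API call" starting point this session describes, here is a minimal sketch assuming an OpenAI-compatible chat-completions endpoint. The base URL and model name below are placeholders for illustration, not a real product API:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a single chat-completion request against an OpenAI-compatible
    inference endpoint. URL and model name are illustrative placeholders."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Everything a platform adds -- routing, data access, scaling -- sits
# between this one request and the response that comes back.
req = build_chat_request("https://inference.example.com", "example-model", "Hello")
```

From here, moving to production typically means the platform, not your application code, decides which model and which GPUs serve each request.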
Model selection, routing logic, and the economics of running AI at volume: a technical walkthrough, including a live demo, of how the layer between your request and your response actually works.
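To make "routing logic" concrete, here is a toy sketch of cost-aware model selection. The model names, prices, tiers, and heuristics are invented for illustration; production routers use far richer signals:

```python
# Toy cost-aware router: pick the cheapest model whose capability
# tier covers the request. All models and prices are illustrative.
MODELS = [
    {"name": "small-fast", "cost_per_1k_tokens": 0.10, "tier": 1},
    {"name": "mid-general", "cost_per_1k_tokens": 0.50, "tier": 2},
    {"name": "large-reasoning", "cost_per_1k_tokens": 2.00, "tier": 3},
]

def required_tier(prompt: str) -> int:
    """Crude heuristic: longer or reasoning-style prompts need a bigger model."""
    if len(prompt) > 2000 or "step by step" in prompt.lower():
        return 3
    if len(prompt) > 500:
        return 2
    return 1

def route(prompt: str) -> dict:
    """Return the cheapest model that meets the required capability tier."""
    tier = required_tier(prompt)
    candidates = [m for m in MODELS if m["tier"] >= tier]
    return min(candidates, key=lambda m: m["cost_per_1k_tokens"])

print(route("Summarize this sentence.")["name"])  # prints "small-fast"
```

The economics follow directly from this choice: every request served by a cheaper model that is still good enough is margin recovered at volume.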
Early-stage founders building real AI companies today: what they're solving, what's getting in the way, and how they're pushing through it.
Leading investors break down the economics of scaling AI in production, from infrastructure bottlenecks to open vs. closed ecosystems, and share their predictions for what the AI industry will look like in five years.
Sponsors Exhibiting; Startup Showcase
The Deploy 2026 agenda is still taking shape; check back for more updates. Across sessions, expect live demos and hands-on examples showing how these systems actually run in production.

April 28, 2026 • 12:00pm – 8:00pm PT
📍 Convene 100 Stockton
Join the technical leaders and executives building the next generation of AI-native companies at Deploy, the Conference for the Inference Era.
Close out Deploy with the people building the future of AI. At 5pm, join the AI builder community for a casual mixer with demos, networking and more.

Deploy 2026 will be hosted in person at Convene 100 Stockton, 40 O'Farrell St, San Francisco. The mainstage keynote will also be streamed live to registrants.
Deploy is designed for teams responsible for managing or building AI workloads in production at scale.
Qualifying participants get $5,000 in promotional inference cloud credits when they deploy a qualifying AI workload. Terms apply.
This year's Deploy marks an evolution in cloud infrastructure for companies running AI in production. DigitalOcean's vertically integrated agentic inference cloud delivers radically simple operations and predictable unit economics, setting AI-native companies on a path to success and growth.
No. Deploy is free to attend. See you in San Francisco.
Yes. Deploy follows the DigitalOcean Community Code of Conduct.