
Alon Horev
Co-Founder & CTO, VAST Data
The Conference for the Inference Era
The hardest problem in AI is no longer training models. It's running inference in production: latency, throughput, system reliability, and unit economics, all at once.
Deploy is where you will see how leading AI-native companies solve this, through real architectures, real tradeoffs, and live systems running today.
Across sessions and hands-on demos, you'll learn how teams are designing inference systems that scale without breaking performance or their margin structure.
Join leaders from Character.AI, Workato, VAST Data, Arcee, and the vLLM ecosystem as they break down how they run AI in production at scale.
You'll hear directly from:


Alon Horev
Co-Founder & CTO, VAST Data

VP, Generative AI Software Products, NVIDIA

CEO, Inferact

CEO, Arcee

Co-Founder, Hippocratic AI

Partner, Bessemer Venture Partners

Chief Architect, Character.ai

CEO, DigitalOcean

Co-Founder, Premise VC

AI Research Technical Lead, Workato

Co-Founder & CEO, Andi

CEO & Co-Founder, Weaviate

CPTO, DigitalOcean

Global Managing Partner, 500 Global

CTO/CISO, ISMG

Co-Founder, LawVo

CEO & Co-founder, Higgsfield AI

Senior Director, Developer Advocacy, DigitalOcean

Founder, Probably AI

Managing Partner, Antler

Head of Growth & Marketing, DigitalOcean

Staff Developer Advocate, DigitalOcean

SVP, Engineering, DigitalOcean

VP of Engineering, DigitalOcean

Senior Director, Marketing & Communications, DigitalOcean

Principal Engineer, DigitalOcean

Senior Director of Engineering, DigitalOcean

Senior Manager, Engineering, DigitalOcean

Director of Product Management, DigitalOcean
70% of AI spend is now inference. See how Character.AI partnered with Inferact and DigitalOcean to cut inference costs by 50%, while improving throughput, on AMD GPUs.
Scaling inference isn't a model problem. It's a decision-making problem. Industry leaders from Workato Research Lab, ISMG, and Hippocratic AI share the decisions, tradeoffs, and investments that got them to production AI at scale.
Kari Briski (VP, Generative AI Software Products, NVIDIA) and Salman Paracha (SVP, AI, DigitalOcean) discuss why AI-native teams demand openness, model flexibility, and infrastructure built for agents that never sleep, and how the convergence of NVIDIA's software layer and DigitalOcean's platform is designed to meet exactly that moment.
Everyone has access to the same models. So what actually matters? It's everything around them – routing requests to the right model, connecting to live data, scaling from prototype to production without ripping your code apart. We'll walk through the full journey: from a single API call to intelligent routing across GPU fleets, showing what it looks like when the platform owns the stack end to end. Live demos included.
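As a concrete picture of the "single API call" starting point this session describes, here is a minimal sketch assuming an OpenAI-compatible chat-completions endpoint. The base URL and model name below are placeholders for illustration, not a real product API:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a single chat-completion request against an OpenAI-compatible
    inference endpoint. URL and model name are illustrative placeholders."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Everything a platform adds -- routing, data access, scaling -- sits
# between this one request and the response that comes back.
req = build_chat_request("https://inference.example.com", "example-model", "Hello")
```

From here, moving to production typically means the platform, not your application code, decides which model and which GPUs serve each request.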
Model selection, routing logic, and the economics of running AI at volume: a technical walkthrough, including a live demo, of how the layer between your request and your response actually works.
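To make "routing logic" concrete, here is a toy sketch of cost-aware model selection. The model names, prices, tiers, and heuristics are invented for illustration; production routers use far richer signals:

```python
# Toy cost-aware router: pick the cheapest model whose capability
# tier covers the request. All models and prices are illustrative.
MODELS = [
    {"name": "small-fast", "cost_per_1k_tokens": 0.10, "tier": 1},
    {"name": "mid-general", "cost_per_1k_tokens": 0.50, "tier": 2},
    {"name": "large-reasoning", "cost_per_1k_tokens": 2.00, "tier": 3},
]

def required_tier(prompt: str) -> int:
    """Crude heuristic: longer or reasoning-style prompts need a bigger model."""
    if len(prompt) > 2000 or "step by step" in prompt.lower():
        return 3
    if len(prompt) > 500:
        return 2
    return 1

def route(prompt: str) -> dict:
    """Return the cheapest model that meets the required capability tier."""
    tier = required_tier(prompt)
    candidates = [m for m in MODELS if m["tier"] >= tier]
    return min(candidates, key=lambda m: m["cost_per_1k_tokens"])

print(route("Summarize this sentence.")["name"])  # prints "small-fast"
```

The economics follow directly from this choice: every request served by a cheaper model that is still good enough is margin recovered at volume.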
Early-stage founders building real AI companies today: what they're solving, what's getting in the way, and how they're pushing through it.
Leading investors break down the economics of scaling AI in production, from infrastructure bottlenecks to open vs. closed ecosystems, and share their predictions for what the AI industry will look like in five years.
Sponsors Exhibiting; Startup Showcase
The Deploy 2026 agenda is still taking shape; check back for more updates. Across sessions, expect live demos and hands-on examples showing how these systems actually run in production.

April 28, 2026 • 12:00pm – 8:00pm PT
📍 Convene 100 Stockton
Join the technical leaders and executives building the next generation of AI-native companies at Deploy, the Conference for the Inference Era.
Close out Deploy with the people building the future of AI. At 5pm, join the AI builder community for a casual mixer with demos, networking and more.

Deploy 2026 will be hosted in person at Convene 100 Stockton, 40 O'Farrell St, San Francisco. The mainstage keynote will also be streamed live to registrants.
Deploy is designed for teams responsible for managing or building AI workloads in production at scale.
Qualifying participants get $5,000 in promotional inference cloud credits when they deploy a qualifying AI workload. Terms apply.
This year's Deploy marks an evolution in cloud infrastructure for companies running AI in production. DigitalOcean's vertically integrated agentic inference cloud delivers radically simple operations and predictable unit economics, setting AI-native companies on a path to success and growth.
No. Deploy is free to attend. See you in San Francisco.
Yes. Deploy follows the DigitalOcean Community Code of Conduct.