Infrastructure Economics: Technical Strategies for Cost-Efficient AI Scaling

InfraCloud
59 mins

About the webinar

Over 80% of businesses have embraced AI to some extent, and 35% of them use AI across multiple departments. AI is becoming central to business operations, and demand is rising rapidly. Research suggests that most of this demand (nearly 85%) will be for inference, whether in the data center or at the edge.

Organizations need cost-effective ways to scale inference while balancing speed and accuracy. In this webinar, AI and infrastructure experts from InfraCloud and Baseten break down the economic complexities of large-scale inference. Learn practical strategies to optimize infrastructure, choose the right models, and streamline operations to scale AI inference seamlessly.

What to expect

  • Infrastructure Optimization: Rightsizing resources, auto-scaling, multi-cloud approaches, and cost attribution and budgeting.
  • Cost-Effective Model Deployment: Choosing between smaller specialized models and large general models, optimizing request batching, and calculating ROI on specialized hardware.
  • Operational Best Practices: Implementing DevOps techniques, Infrastructure as Code, and monitoring systems to track and control costs in real time.
  • Expert Insights: Real-world strategies from AI and cloud experts who have helped organizations cut inference expenses while maintaining performance.

Meet the Speakers

Atulpriya Sharma
Sr. Dev Advocate @ InfraCloud
Host

A manual tester turned developer advocate, he talks about cloud native, Kubernetes, AI, and MLOps to help developers and organizations adopt cloud native technologies. He is also a CNCF Ambassador and the organizer of CNCF Hyderabad.

Philip Kiely
Developer Advocate @ Baseten
Speaker

Philip Kiely leads Developer Relations at Baseten, the leading inference platform for AI-native applications. With a strong background in documentation and developer experience, he helps engineers and organizations navigate the complexities of AI inference.

Aman Juneja
Principal Solutions Engineer @ InfraCloud
Speaker

Aman specializes in AI Cloud solutions and cloud native design, bringing extensive expertise in containerization, microservices, and serverless computing. His current focus lies in exploring AI Cloud technologies and developing AI applications using cloud native architectures.

Other webinars you might enjoy


Need a clear starting point to build your own AI lab?

Use our AI stack charts to help your team deploy AI services on Kubernetes faster and more efficiently.
