Wednesday, February 23, 2022 | 10:00am - 11:00am PT
AI inference can deliver faster, more accurate predictions to organizations of all sizes—but building a platform for production AI inference is hard.
Real-world use cases require different types of AI model architectures, and the models can contain hundreds of millions of parameters.
Models are trained in different frameworks (TensorFlow, PyTorch, XGBoost, Python, and others) and have different formats.
Applications have different requirements (real-time low latency, high-throughput batch, or streaming inputs), and then there are different execution environments (CPUs, GPUs, in the cloud, on premises, at the edge).
Achieving high-performance inference on specific hardware, or within a single framework, is challenging because modern AI applications impose competing constraints on latency, accuracy, throughput, and memory footprint.
Join our webinar to explore how NVIDIA's inference solution, including the open-source NVIDIA Triton™ Inference Server and NVIDIA® TensorRT™, delivers fast, scalable AI inference in production.
Join us after the presentation for a live Q&A session.