DIY LLMs – Hosting Your Own LLM Inference, From Silicon To Service


Join us for the USF Data Science Speaker Series featuring Dr. Charles Frye, Developer Advocate at Modal!

Charles Frye is a passionate educator who specializes in teaching people to build AI applications. After publishing research in psychopharmacology & neurobiology, he earned his Ph.D. at the University of California, Berkeley, for dissertation work on neural network optimization. He has taught thousands of people across the full stack of AI application development, from foundational linear algebra to advanced GPU techniques & creating defensible AI-driven businesses.

Charles will explore the essential components for running your own large language model (LLM) inference service. This talk will delve into:

- Compute options: CPUs, GPUs, TPUs, & LPUs.
- Model options: Qwen, LLaMA, & others.
- Inference server options: TensorRT-LLM, vLLM, & SGLang.
- Observability tools: OTel stack, LangSmith, W&B Weave, & Braintrust.
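As a taste of the kind of material the talk covers: inference servers such as vLLM and SGLang expose an OpenAI-compatible HTTP API, so a self-hosted model can be queried with nothing but the standard library. The sketch below assumes a server is already running locally (e.g. via `vllm serve Qwen/Qwen2.5-7B-Instruct`); the model name and port are illustrative assumptions, not specifics from the talk.

```python
# Minimal sketch: calling a self-hosted LLM through an OpenAI-compatible
# /v1/chat/completions endpoint (as served by vLLM or SGLang).
# Model name and base URL are assumptions for illustration.
import json
import urllib.request


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }


def query(base_url: str, payload: dict) -> str:
    """POST the payload to the server and return the model's reply text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    payload = build_chat_request("Qwen/Qwen2.5-7B-Instruct", "Hello!")
    print(query("http://localhost:8000", payload))
```

Because the endpoint follows the OpenAI wire format, the same client code works unchanged whether the backend is vLLM, SGLang, or a hosted provider, which is part of what makes self-hosting practical.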

Don’t miss this opportunity to gain practical knowledge on building & hosting your own LLM services from a leading AI educator & expert!

#USFDataScienceSpeakerSeries #DataScience #MSDS #LLMs #AI #MachineLearning #AIApplications
