Project Awesome

Pretrained Models and Inference > vllm

A high-throughput and memory-efficient inference and serving engine for LLMs.

Package 74.1k stars GitHub