Show HN: Tiny-vLLM – high performance LLM inference engine in C++ and CUDA
(github.com)
35 points | by yu3zhou4 | 1 hour ago
2 comments
yu3zhou4
56 minutes ago
The README is, in my opinion (author here), the most interesting part. I wrote it to help others build a useful mental model so they can recreate the project themselves, without even needing to read my code.
nazgulsenpai
54 minutes ago
I love the documentation formatted in lessons. I can't wait to read through it.