arXiv, 3h ago

Knowledge Packs: Zero-Token Knowledge Delivery via KV Cache Injection

Andrey Pustovit

Analysis

Viral velocity: low

Implementation gap: yes

Novelty: 9/10

Category: blog

Topics: rag, inference

Opportunity Brief

Create a tool that precomputes and caches KV states as 'Knowledge Packs' and injects them directly into the model for RAG tasks, so retrieved facts are delivered without adding prompt tokens. Skipping the prefill pass over retrieved context at query time could lower both latency and cost.
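
To make the idea concrete, here is a minimal sketch in pure Python, not the paper's implementation. It assumes a toy single-layer, single-head causal attention with no positional encoding; the names `build_pack` and `causal_attention` and the width `D` are hypothetical. It shows that keys and values precomputed for a block of "knowledge" embeddings can be injected ahead of the query and yield the same query outputs as recomputing over the full concatenated sequence.

```python
import math
import random

D = 4  # toy embedding width (assumption, chosen only for illustration)

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def project(xs, w):
    # Multiply each row vector in xs by the weight matrix w.
    return [[sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]
            for x in xs]

def attend(q, keys, values):
    # Scaled dot-product attention for one query vector.
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q)) for k in keys]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [sum(e / total * v[j] for e, v in zip(exps, values))
            for j in range(len(values[0]))]

def causal_attention(xs, wq, wk, wv, past_k=(), past_v=()):
    # Causal self-attention over xs, optionally preceded by cached keys/values.
    keys = list(past_k) + project(xs, wk)
    values = list(past_v) + project(xs, wv)
    queries = project(xs, wq)
    offset = len(past_k)
    return [attend(q, keys[:offset + i + 1], values[:offset + i + 1])
            for i, q in enumerate(queries)]

def build_pack(knowledge_xs, wk, wv):
    # Hypothetical "knowledge pack": keys and values computed once, offline.
    return project(knowledge_xs, wk), project(knowledge_xs, wv)

rng = random.Random(0)
wq, wk, wv = (rand_matrix(D, D, rng) for _ in range(3))
knowledge = rand_matrix(5, D, rng)  # stand-in embeddings for retrieved facts
query = rand_matrix(2, D, rng)      # stand-in embeddings for the user query

# Offline: build the pack. Online: inject it and process only the query tokens.
pack_k, pack_v = build_pack(knowledge, wk, wv)
injected = causal_attention(query, wq, wk, wv, pack_k, pack_v)

# Baseline: recompute attention over knowledge + query, keep the query outputs.
baseline = causal_attention(knowledge + query, wq, wk, wv)[len(knowledge):]

max_diff = max(abs(a - b)
               for row_a, row_b in zip(injected, baseline)
               for a, b in zip(row_a, row_b))
print("max difference:", max_diff)
```

In this toy setting the injected path processes only the two query tokens, while the baseline processes all seven. A real tool would additionally have to handle multiple layers and heads, positional encodings, and the cache layout of the specific model, none of which this sketch covers.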

Suggested repo: kpack

"RAG for free: deliver facts directly via KV cache injection."

Estimated effort: 70h