Implement a custom KV cache compression layer using the Shannon limit approach described. Test performance gains on standard 7B-70B models.
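The "Shannon limit approach" is not spelled out in this text, so the sketch below rests on one assumed reading: Shannon's source coding bound, under which the empirical entropy of the quantized cache entries is a lower bound on the average bits per element that a lossless, per-symbol entropy coder could reach. It is a starting point for estimating headroom, not a compression layer and not the referenced method. The function names (`quantize_uniform`, `dequantize`, `empirical_entropy_bits`), the 8-bit symmetric quantizer, and the synthetic tensor standing in for a key cache are all hypothetical choices made for illustration.

```python
import numpy as np


def quantize_uniform(x: np.ndarray, bits: int = 8):
    """Symmetric uniform quantization to signed integer codes.

    Returns (codes, scale). Assumption: one scale for the whole tensor.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32)
    return codes, scale


def dequantize(codes: np.ndarray, scale: float) -> np.ndarray:
    """Map integer codes back to approximate float values."""
    return codes.astype(np.float32) * scale


def empirical_entropy_bits(codes: np.ndarray) -> float:
    """Shannon entropy, in bits per symbol, of the empirical code distribution."""
    _, counts = np.unique(codes, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for one layer's key cache: (heads, seq_len, head_dim).
    keys = rng.standard_normal((8, 256, 64)).astype(np.float32)

    codes, scale = quantize_uniform(keys, bits=8)
    entropy = empirical_entropy_bits(codes)
    max_err = float(np.max(np.abs(dequantize(codes, scale) - keys)))

    print(f"empirical entropy: {entropy:.2f} bits/element (fp16 baseline: 16)")
    print(f"max abs reconstruction error: {max_err:.4f}")
```

On a real model, the synthetic tensor would be replaced with cached keys and values captured during inference, and the entropy estimate compared against the bits per element that the implemented layer actually achieves. Memory footprint, throughput, and task accuracy would still need to be measured separately across the 7B-70B models.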