Implement a lightweight compression wrapper for checkpointing using nvCOMP. Smaller checkpoints should reduce storage costs and checkpoint I/O time in distributed training.
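One possible shape for the wrapper is sketched below, as a starting point rather than a finished design. It does not call nvCOMP: `zlib` from the standard library stands in as the codec so the sketch runs anywhere, and an nvCOMP-backed compress/decompress pair would be passed in through the same two callables. The names `CheckpointCompressor`, `MAGIC`, and `HEADER`, and the header layout itself, are hypothetical and chosen only for illustration.

```python
import pickle
import struct
import zlib
from typing import Any, Callable

# Hypothetical container format: 4-byte tag + uncompressed payload size.
MAGIC = b"CKPT"
HEADER = struct.Struct("<4sQ")  # little-endian: magic, uncompressed length

BytesFn = Callable[[bytes], bytes]


class CheckpointCompressor:
    """Serialize a checkpoint object and compress it with a pluggable codec.

    The default codec is zlib (a CPU stand-in). An nvCOMP-based codec is
    assumed to be wrapped into the same bytes -> bytes callables.
    """

    def __init__(self, compress: BytesFn = zlib.compress,
                 decompress: BytesFn = zlib.decompress) -> None:
        self._compress = compress
        self._decompress = decompress

    def dumps(self, state: Any) -> bytes:
        # Serialize first, then compress the raw bytes.
        raw = pickle.dumps(state, protocol=pickle.HIGHEST_PROTOCOL)
        return HEADER.pack(MAGIC, len(raw)) + self._compress(raw)

    def loads(self, blob: bytes) -> Any:
        if len(blob) < HEADER.size:
            raise ValueError("checkpoint blob is too short")
        magic, size = HEADER.unpack_from(blob)
        if magic != MAGIC:
            raise ValueError("not a compressed checkpoint")
        raw = self._decompress(blob[HEADER.size:])
        # Guard against truncated or corrupted payloads.
        if len(raw) != size:
            raise ValueError("decompressed size does not match header")
        # pickle is only safe for checkpoints from a trusted source.
        return pickle.loads(raw)

    def save(self, state: Any, path: str) -> None:
        with open(path, "wb") as f:
            f.write(self.dumps(state))

    def load(self, path: str) -> Any:
        with open(path, "rb") as f:
            return self.loads(f.read())
```

Keeping the codec behind two plain callables means the checkpoint format and the compression backend can change independently. The size field in the header gives a cheap integrity check on load.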