Build a unified pipeline that optimizes LLM inference by combining two complementary techniques: structured pruning, which removes whole units (rows, heads, or channels) from the model's weights offline, and dynamic prompt compression via compressed sensing, which shrinks the input token sequence at request time. Developers need a way to chain these two stages so that their compounded compression does not degrade model quality unacceptably during inference.
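A minimal, runnable sketch of what such a chained pipeline could look like, using NumPy toys in place of a real LLM. All names here (`structured_prune`, `compress_prompt`, the layer sizes, and the random Gaussian measurement matrix) are illustrative assumptions, not an established API: structured pruning is modeled as zeroing the lowest-norm rows of a weight matrix, and compressed-sensing prompt compression as a random projection of token embeddings onto fewer "pseudo-tokens".

```python
import numpy as np

rng = np.random.default_rng(0)

def structured_prune(weight: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Zero out entire rows (output units) with the smallest L2 norms.

    Structured pruning removes whole units rather than scattered weights,
    so downstream code can actually skip the pruned computation.
    """
    norms = np.linalg.norm(weight, axis=1)
    k = int(round(keep_ratio * weight.shape[0]))
    keep = np.argsort(norms)[-k:]          # indices of the strongest rows
    mask = np.zeros(weight.shape[0], dtype=bool)
    mask[keep] = True
    return weight * mask[:, None]

def compress_prompt(embeddings: np.ndarray, m: int) -> np.ndarray:
    """Compressed-sensing style compression: project n token embeddings
    down to m < n pseudo-tokens with a random Gaussian measurement matrix.

    The compressed-sensing rationale is that a sparse or redundant prompt
    is approximately recoverable from far fewer random measurements.
    """
    n = embeddings.shape[0]
    phi = rng.normal(scale=1.0 / np.sqrt(m), size=(m, n))  # measurement matrix
    return phi @ embeddings                                 # (m, d) pseudo-tokens

# --- Chain the two stages: prune the model offline, compress the prompt online ---
d_model = 16
weight = rng.normal(size=(32, d_model))       # toy layer with 32 output units
pruned = structured_prune(weight, keep_ratio=0.5)

prompt = rng.normal(size=(128, d_model))      # 128 token embeddings
compressed = compress_prompt(prompt, m=24)    # 24 pseudo-tokens

# Inference-time pass of the compressed prompt through the pruned layer
activations = compressed @ pruned.T           # shape (24, 32)
print(pruned.shape, compressed.shape, activations.shape)
```

The key design point the sketch illustrates is ordering: pruning is a one-time offline transform of the weights, while prompt compression runs per request, so the two compose without interfering, and each stage's ratio (`keep_ratio`, `m`) can be tuned independently against a held-out quality metric.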