Implement a linear-memory distillation method to preserve short-text performance when extending model context windows.
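
The statement above does not say what "linear-memory" refers to or which distillation objective is intended, so the following is only a minimal sketch under stated assumptions. It assumes the goal is to match a context-extended student model to the original model (the teacher) on short inputs with a per-token KL divergence, and it assumes "linear-memory" means evaluating that loss chunk by chunk so that only one chunk of logits is held at a time. The names `chunked_distill_loss`, `teacher_logits_fn`, `student_logits_fn`, and `chunk_size` are hypothetical and chosen for illustration.

```python
import math
from typing import Callable, List

# Logits for a span of positions: one row of vocabulary scores per position.
Logits = List[List[float]]


def log_softmax(row: List[float]) -> List[float]:
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def kl_row(teacher_row: List[float], student_row: List[float]) -> float:
    """KL(teacher || student) for a single token position."""
    t = log_softmax(teacher_row)
    s = log_softmax(student_row)
    return sum(math.exp(tl) * (tl - sl) for tl, sl in zip(t, s))


def chunked_distill_loss(
    teacher_logits_fn: Callable[[int, int], Logits],
    student_logits_fn: Callable[[int, int], Logits],
    seq_len: int,
    chunk_size: int,
) -> float:
    """Mean per-token KL, computed one chunk of positions at a time.

    Assumption: each callable returns logits for positions [start, end).
    Only one chunk of teacher and student logits is alive at once, so the
    logit buffers scale with chunk_size rather than with seq_len.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if seq_len <= 0:
        return 0.0
    total = 0.0
    for start in range(0, seq_len, chunk_size):
        end = min(start + chunk_size, seq_len)
        t_chunk = teacher_logits_fn(start, end)
        s_chunk = student_logits_fn(start, end)
        for t_row, s_row in zip(t_chunk, s_chunk):
            total += kl_row(t_row, s_row)
    return total / seq_len
```

In a real training setup the two callables would wrap forward passes of the original and the extended model, and the loss would be computed in an autodiff framework; this pure-Python version only illustrates the chunking pattern and the objective.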