Fine Tuning + Reasoning

76.0

Build a modular tool for designing multimodal data mixtures that supports 'benchmark-targeted' training recipes. Such a tool would simplify the black art of balancing text, image, and video data for LLM training.
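
As a rough illustration of what a 'benchmark-targeted' recipe could look like in such a tool, here is a minimal sketch. Everything in it is an assumption made for illustration: the `MixtureRecipe` class, the `sample_modalities` helper, the benchmark name, and the weights are hypothetical and are not taken from any of the listed signals.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class MixtureRecipe:
    """Hypothetical recipe: a target benchmark plus unnormalized per-modality weights."""
    benchmark: str
    weights: dict[str, float]

    def normalized(self) -> dict[str, float]:
        """Return the weights rescaled so that they sum to 1."""
        if any(w < 0 for w in self.weights.values()):
            raise ValueError("mixture weights must be non-negative")
        total = sum(self.weights.values())
        if total <= 0:
            raise ValueError("mixture weights must sum to a positive value")
        return {name: w / total for name, w in self.weights.items()}


def sample_modalities(recipe: MixtureRecipe, n: int, seed: int = 0) -> list[str]:
    """Draw n modality labels with probability proportional to the recipe weights."""
    rng = random.Random(seed)  # local RNG keeps the draw reproducible
    probs = recipe.normalized()
    names = list(probs)
    return rng.choices(names, weights=[probs[k] for k in names], k=n)


# Illustrative values only: the benchmark name and weights are made up.
recipe = MixtureRecipe(
    benchmark="example-video-qa",
    weights={"text": 2.0, "image": 1.0, "video": 1.0},
)
print(recipe.normalized())
print(sample_modalities(recipe, 8))
```

Keeping the recipe as plain data, separate from the sampler, is one way to make mixtures easy to compare or swap per benchmark; a real tool would also need to decide how the weights are chosen, which this sketch leaves open.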

+1035

active · implementation gap

fine-tuning, symbolic, training, embeddings, reasoning, multimodal, tabular, data-mixture, inference, synthetic-data

Signals (9)

HuggingFace · 1d ago

Training and Finetuning Multimodal Embedding & Reranker Models with Sentence Transformers

AWS AI Blog · 18h ago

Introducing Anthropic’s Claude Opus 4.7 model in Amazon Bedrock

arXiv · 1d ago

Design Conditions for Intra-Group Learning of Sequence-Level Rewards: Token Gradient Cancellation

arXiv · 1d ago

ReSS: Learning Reasoning Models for Tabular Data Prediction via Symbolic Scaffold

arXiv · 5h ago

MixAtlas: Uncertainty-aware Data Mixture Optimization for Multimodal LLM Midtraining

Google AI · 17h ago

New ways to create personalized images in the Gemini app

YHN · 19h ago

Qwen3.6-35B-A3B: Agentic coding power, now open to all

arXiv · 5h ago

How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data

arXiv · 5h ago

GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification