
hypedar

AI trend radar for developers. Catch emerging papers, repos, and discussions before the hype peaks.


By the makers of hypedar

Codepawl

Open-source tools for developers.


© 2026 Codepawl


Safety + RL + Fine-Tuning

19.0

Explore and implement AltTrain, a structural fine-tuning approach for reasoning models. It changes how a model processes logic in order to bake in safety without sacrificing performance.

+0
emerging · implementation gap
rl · reasoning · fine-tuning · safety · agents

Signals (8)

arXiv · 7h ago

Human-Guided Harm Recovery for Computer Use Agents

arXiv · 1d ago

SaFeR-Steer: Evolving Multi-Turn MLLMs via Synthetic Bootstrapping and Feedback Dynamics

arXiv · 1d ago

Subliminal Transfer of Unsafe Behaviors in AI Agent Distillation

arXiv · 1d ago

Preregistered Belief Revision Contracts

arXiv · 7h ago

Reasoning Structure Matters for Safety Alignment of Reasoning Models

arXiv · 2d ago

Harmonizing Multi-Objective LLM Unlearning via Unified Domain Representation and Bidirectional Logit Distillation

arXiv · 1d ago

Shifting the Gradient: Understanding How Defensive Training Methods Protect Language Model Integrity

arXiv · 7h ago

ARES: Adaptive Red-Teaming and End-to-End Repair of Policy-Reward System