OpenLearnLM
community
AI & ML interests
None defined yet.
Recent Activity
OpenLearnLM/deepseek_qwen3_8b_pedagogical_think_reward_grpo_step_300
8B • Updated • 1
OpenLearnLM/deepseek_qwen3_8b_pedagogical_think_noreward_grpo_step_300
8B • Updated
OpenLearnLM/deepseek_qwen3_8b_think_noreward_grpo_step_300
8B • Updated
OpenLearnLM/deepseek_qwen3_8b_think_reward_grpo_step_300
8B • Updated
models 6
OpenLearnLM/qwen2.5_7b_nothink_noreward_grpo_step_300
8B • Updated
OpenLearnLM/deepseek_qwen3_8b_think_reward_grpo_step_300
8B • Updated
OpenLearnLM/deepseek_qwen3_8b_think_noreward_grpo_step_300
8B • Updated
OpenLearnLM/deepseek_qwen3_8b_nothink_grpo_step_300
8B • Updated
OpenLearnLM/deepseek_qwen3_8b_pedagogical_think_reward_grpo_step_300
8B • Updated • 1
OpenLearnLM/deepseek_qwen3_8b_pedagogical_think_noreward_grpo_step_300
8B • Updated
datasets 0
None public yet