VisualSphinx-Difficulty-Tagging / README.md

EthanSta's picture

Update README.md

57e028b verified 10 months ago

|

history blame contribute delete

1.79 kB

	---
	license: mit
	datasets:
	- VisualSphinx/VisualSphinx-Seeds
	language:
	- en
	base_model:
	- Qwen/Qwen2.5-VL-7B-Instruct
	---
	# 🦁 VisualSphinx: Large-Scale Synthetic Vision Logic Puzzles for RL
	VisualSphinx is the largest fully synthetic open-source dataset of vision logic puzzles. It contains over 660K automatically generated puzzles, each grounded in an interpretable rule and accompanied by both correct answers and plausible distractors.
	- 🌐 [Project Website](https://visualsphinx.github.io/) - Learn more about VisualSphinx
	- 📖 [Technical Report](https://arxiv.org/abs/2505.23977) - Discover the methodology and technical details behind VisualSphinx
	- 🔧 [GitHub Repo](https://github.com/VisualSphinx/VisualSphinx) - Access the complete pipeline used to produce VisualSphinx-V1
	- 🤗 HF Datasets:
	  - [VisualSphinx-V1 (Raw)](https://huggingface.co/datasets/VisualSphinx/VisualSphinx-V1-Raw)
	  - [VisualSphinx-V1 (For RL)](https://huggingface.co/datasets/VisualSphinx/VisualSphinx-V1-RL-20K)
	  - [VisualSphinx-V1 (Benchmark)](https://huggingface.co/datasets/VisualSphinx/VisualSphinx-V1-Benchmark)
	  - [VisualSphinx (Seeds)](https://huggingface.co/datasets/VisualSphinx/VisualSphinx-Seeds)
	  - [VisualSphinx (Rules)](https://huggingface.co/datasets/VisualSphinx/VisualSphinx-V1-Rules)
	![VisualSphinx](https://visualsphinx.github.io/static/images/pipeline.jpg)

	## 📊 About This Model

	This model tags the difficulty of puzzles in our [VisualSphinx-V1](https://huggingface.co/datasets/VisualSphinx/VisualSphinx-V1-Raw) synthetic dataset. We trained it by running GRPO on Qwen/Qwen2.5-VL-7B-Instruct for 256 steps, using our [seed dataset](https://huggingface.co/datasets/VisualSphinx/VisualSphinx-Seeds).
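
This card does not say how the model's outputs are turned into a difficulty tag. One plausible approach, shown here purely as an illustrative sketch, is to sample several answers per puzzle and bucket the puzzle by pass rate. The function names, the thresholds (`easy_at`, `hard_at`), the tag labels, and the sample answers below are all hypothetical and are not taken from the VisualSphinx pipeline.

```python
from typing import Sequence


def pass_rate(sampled_answers: Sequence[str], correct_answer: str) -> float:
    """Return the fraction of sampled answers that match the reference answer."""
    if not sampled_answers:
        raise ValueError("need at least one sampled answer")
    # Compare case-insensitively and ignore surrounding whitespace.
    reference = correct_answer.strip().upper()
    hits = sum(1 for answer in sampled_answers if answer.strip().upper() == reference)
    return hits / len(sampled_answers)


def difficulty_tag(rate: float, easy_at: float = 0.75, hard_at: float = 0.25) -> str:
    """Map a pass rate to a coarse tag. Thresholds are hypothetical defaults."""
    if rate >= easy_at:
        return "easy"
    if rate <= hard_at:
        return "hard"
    return "medium"


# Hypothetical answers sampled from the tagging model for one puzzle
# whose correct option is "A".
samples = ["A", "C", "A", "a ", "B", "A", "D", "A"]
rate = pass_rate(samples, "A")
print(rate, difficulty_tag(rate))
```

In practice, the sampled answers would come from running this model several times on the same puzzle image and prompt; the number of samples and the cut-offs are choices to tune for the downstream use.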