---
datasets:
- XenArcAI/MathX-5M
base_model:
- google/gemma-3-1b-it
pipeline_tag: text-generation
---

# Model Card: Parveshiiii/M1-MathX

## Model Details
- **Model Name:** Parveshiiii/M1-MathX
- **Base Architecture:** Gemma 3 (1B parameters, instruction-tuned)
- **Model Type:** Causal Language Model (text-generation)
- **Precision:** fp16 (see the loading sketch below)
- **Training Framework:** Hugging Face Transformers
- **Attention Mechanism:** Hybrid sliding-window and full attention layers
- **Tokenizer:** Gemma tokenizer (vocab size 262,144)
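
The precision and vocabulary details above can be checked locally. A minimal loading sketch using the standard Transformers API; only the model name comes from this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Parveshiiii/M1-MathX"

# Load in fp16 to match the precision listed above.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

print(model.config.vocab_size)  # expected 262144 per this card
print(model.dtype)              # torch.float16
```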


## Usage
```python
from transformers import pipeline, TextStreamer

# Build a text-generation pipeline around the fine-tuned checkpoint.
pipe = pipeline("text-generation", model="Parveshiiii/M1-MathX")

# Chat-style input; the pipeline applies the Gemma chat template.
messages = [
    {"role": "user", "content": "Solve step by step: if 3x + 5 = 20, what is x?"},
]

# Stream tokens to stdout as they are generated; skip_prompt hides the echoed input.
streamer = TextStreamer(pipe.tokenizer, skip_prompt=True)
pipe(messages, streamer=streamer, max_new_tokens=10000)
```

## Intended Use
- Designed for mathematical reasoning tasks, including problem solving, equation manipulation, and step-by-step derivations (see the prompting sketch below).
- Suitable for educational contexts, math tutoring, and research experiments in reasoning alignment.
- Not intended for general-purpose conversation or sensitive domains outside mathematics.
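
A hedged prompting sketch for the step-by-step use case. The prompt wording and decoding settings are illustrative assumptions, not part of the released model:

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="Parveshiiii/M1-MathX")

# Ask for an explicit derivation so intermediate steps are visible.
messages = [
    {"role": "user",
     "content": "Differentiate f(x) = x**3 - 4*x and show each step."},
]

# Greedy decoding keeps the worked example reproducible.
out = pipe(messages, max_new_tokens=512, do_sample=False)
print(out[0]["generated_text"][-1]["content"])  # final assistant message
```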


## Training Data
- **Dataset:** MathX (curated mathematical reasoning dataset)
- **Samples Used:** ~300
- **Training Steps:** 50
- **Method:** GRPO (Group Relative Policy Optimization) fine-tuning (a training sketch follows this list)
- **Objective:** Reinforcement-style alignment for improved reasoning clarity and correctness.
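
The exact training script is not included with this card; below is a minimal sketch of GRPO fine-tuning using TRL's `GRPOTrainer`. The `prompt`/`answer` column names and the string-match reward are assumptions for illustration, not the recipe actually used:

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Assumption: the dataset exposes "prompt" and "answer" columns; the real
# MathX preprocessing for M1-MathX is not documented here.
dataset = load_dataset("XenArcAI/MathX-5M", split="train[:300]")

def correctness_reward(completions, answer, **kwargs):
    # Reward 1.0 when the reference answer string appears in the completion.
    # Illustrative only; the reward actually used for M1-MathX is unknown.
    return [1.0 if ans in completion else 0.0
            for completion, ans in zip(completions, answer)]

config = GRPOConfig(
    output_dir="m1-mathx-grpo",
    max_steps=50,               # matches the step count reported above
    num_generations=4,          # completions sampled per prompt (the "group")
    max_completion_length=512,
)

trainer = GRPOTrainer(
    model="google/gemma-3-1b-it",   # base model listed in this card
    reward_funcs=correctness_reward,
    args=config,
    train_dataset=dataset,
)
trainer.train()
```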


## Performance
- Informal testing shows solid results on small-scale math problems and symbolic reasoning tasks.
- Early spot checks suggest improved accuracy over the base Gemma 1B model on math-specific prompts.
- Formal evaluation on GSM8K, MATH, and similar benchmarks is still needed for a quantitative comparison; a spot-check sketch follows this list.
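
Pending a formal run with a published harness, a minimal GSM8K spot-check sketch. The regex-based answer extraction and the 20-example slice are simplifying assumptions; standard harnesses use stricter matching:

```python
import re

from datasets import load_dataset
from transformers import pipeline

pipe = pipeline("text-generation", model="Parveshiiii/M1-MathX")
gsm8k = load_dataset("openai/gsm8k", "main", split="test[:20]")  # small slice

def last_number(text):
    # Heuristic: treat the last numeric token as the predicted answer.
    nums = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return nums[-1] if nums else None

correct = 0
for ex in gsm8k:
    out = pipe([{"role": "user", "content": ex["question"]}],
               max_new_tokens=512, do_sample=False)
    pred = last_number(out[0]["generated_text"][-1]["content"])
    gold = ex["answer"].split("####")[-1].strip().replace(",", "")
    correct += int(pred == gold)

print(f"GSM8K spot check: {correct}/{len(gsm8k)} correct")
```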


## Limitations
- Small dataset and limited training steps mean coverage is narrow.
- May overfit to MathX patterns and fail on broader or more complex problems.
- Not guaranteed to generalize outside mathematical reasoning.
- As a 1B model, capacity is limited compared to larger LLMs.


## Ethical Considerations
- Intended for safe educational use.
- Should not be deployed in high-stakes environments without further validation.
- Outputs may contain errors; human oversight is required.


## Citation
If you use this model, please cite as:
```bibtex
@misc{Parvesh2025M1MathX,
  author = {Parvesh Rawal},
  title = {Parveshiiii/M1-MathX: A Gemma-1B model fine-tuned on MathX with GRPO},
  year = {2025},
  howpublished = {\url{https://huggingface.co/Parveshiiii/M1-MathX}}
}
```

---