Fully generalist synthetic dataset and SOTA small reasoners
AI & ML interests
Open Science LLMs
Organization Card
PleIAs is a private French AI lab training the next generation of language models for document processing.
PleIAs is committed to open science and has coordinated the release of some of the largest open corpora for pre-training.
For more information, visit our website: https://pleias.fr/
Contact us: contact@pleias.fr
spaces 7
baguettotron_demo 📜 • Runtime error • 2
Vintage OCR Corrector (GPU) 📜 • Correct OCR errors in your text • Runtime error • 4
Vintage OCR Corrector (CPU) 📜 • Correct OCR errors in text • Runtime error • 7
Finance Commons Explorer 💻 • Browse finance datasets on Hugging Face • Build error • 9
Reversed-Zotero 📜 • Runtime error • 9
models 29
PleIAs/Monad • Text Generation • 56.7M • 1.46k • 66
PleIAs/Baguettotron • Text Generation • 0.3B • 4.54k • 236
PleIAs/Baguettotron-GGUF • 0.3B • 1.23k • 9
PleIAs/celadon • Text Classification • 0.1B • 53 • 36
PleIAs/OCRerrcr • Token Classification • 0.4B • 26 • 14
PleIAs/ksante-colbert-small • 33.4M • 4 • 1
PleIAs/Pleias-RAG-350M • Text Generation • 0.4B • 124 • 31
PleIAs/Pleias-RAG-1B • Text Generation • 80 • 66
PleIAs/Pleias-RAG-1B-gguf • 1B • 141 • 10
PleIAs/Pleias-RAG-350M-gguf • 0.4B • 51 • 4
datasets 55
PleIAs/BSF_Redline
PleIAs/common_corpus • 69.9k • 69.1k • 376
PleIAs/Japanese-PD • 1.38M • 53
PleIAs/Arabic-PD • 221k • 44
PleIAs/verse-wikisource • 23 • 2
PleIAs/SYNTH • 68M • 62.5k • 250
PleIAs/Youtube-Commons-Audio-Sample-1000 • 10
PleIAs/gpt-oss20b-samples-dedup • 179k • 103 • 5
PleIAs/Post-OCR-Correction • 50.4k • 821 • 135
PleIAs/GoldenSwag • 1.53k • 1.36k • 5