arxiv:2512.11192
Pedro Ortiz Suarez
AI & ML interests
Language modeling, parsing, sequence tagging, NER, historical languages.
Recent Activity
Published a dataset 18 days ago: commoncrawl/CommonLID
Updated a dataset 19 days ago: commoncrawl/CommonLID
Authored a paper about 1 month ago: SciLaD: A Large-Scale, Transparent, Reproducible Dataset for Natural Scientific Language Processing