AI & ML interests

Open science and open source

Recent Activity

albertvillanova 
posted an update 1 day ago
view post
Post
903
🚀 TRL v0.29.0 introduces trl-training: an agent-native training skill.

This makes the TRL CLI a structured, agent-readable capability, allowing AI agents to reliably execute training workflows such as:
- Supervised Fine-Tuning (SFT)
- Direct Preference Optimization (DPO)
- Group Relative Policy Optimization (GRPO)

We’re excited to see what the community builds on top of this.

If you’re working on AI agents, alignment research, or scalable RL training infrastructure: give TRL v0.29.0 a try! 🤗

The future of ML tooling is agent-native.
🔗 https://github.com/huggingface/trl/releases/tag/v0.29.0
albertvillanova 
posted an update 16 days ago
view post
Post
1672
5 years already working in democratizing AI 🤗
Grateful to be part of such an awesome team making it happen every day.
louisbrulenaudet 
posted an update 6 months ago
view post
Post
6304
Supercharge Apple’s Shortcuts using Cloudflare Workers and Gemini within minutes (and for free, up to 1,500 requests per day) ☁️✨

Hello everyone, last week, while experimenting for fun, I created an API that allows you to easily access AI models (in this case, Google's) from the Shortcut app in order to analyze data from my apps and make the most of it thanks to the generative capabilities of advanced models.

It costs me nothing, and I think it might be good to share it so that others can build on it.

In README.md, you will find everything you need to get started and put your own microservice into production, which you can call from the app’s HTTP request features.

You will simply be asked to have a free Cloudflare account and an API key obtained from Google's AI Studio.

Feel free to take a look and get back to me if you encounter any problems during deployment.

Here is the GitHub repo where you can find all the source code and run it on your own: https://github.com/louisbrulenaudet/genai-api
louisbrulenaudet 
posted an update 6 months ago
view post
Post
779
Although more and more code editors are aligning themselves with the AGENTS.md file standard, some still use specific nomenclatures that can make it difficult to maintain different configuration files when several people are working on the same project with different agents.

Bodyboard addresses this by generating canonical instructions for code helpers from a single AGENTS.md file, thereby streamlining the production of adapter outputs for Gemini CLI, Copilot, Cline, Claude, Rules, Windsurf, and OpenAI Codex integrations.

You just have to:
npm install -g bodyboard

Then run, at the root of your project:
bodyboard all

Link to npm: https://www.npmjs.com/package/bodyboard
Link to the GitHub repo: https://github.com/louisbrulenaudet/bodyboard

It's a very simple project, but it addresses certain issues I've encountered, so why not make it available to everyone...

If you have other ideas for adapters to create, feel free to open a PR on the GitHub repo.
albertvillanova 
posted an update 7 months ago
view post
Post
4597
Latest smolagents release supports GPT-5: build agents that think, plan, and act.
⚡ Upgrade now and put GPT-5 to work!
  • 1 reply
·
albertvillanova 
posted an update 7 months ago
view post
Post
698
🚀 smolagents v1.21.0 is here!
Now with improved safety in the local Python executor: dunder calls are blocked!
⚠️ Still, not fully isolated: for untrusted code, use a remote executor instead: Docker, E2B, Wasm.
✨ Many bug fixes: more reliable code.
👉 https://github.com/huggingface/smolagents/releases/tag/v1.21.0
albertvillanova 
posted an update 8 months ago
view post
Post
814
🚀 New in smolagents v1.20.0: Remote Python Execution via WebAssembly (Wasm)

We've just merged a major new capability into the smolagents framework: the CodeAgent can now execute Python code remotely in a secure, sandboxed WebAssembly environment!

🔧 Powered by Pyodide and Deno, this new WasmExecutor lets your agent-generated Python code run safely: without relying on Docker or local execution.

Why this matters:
✅ Isolated execution = no host access
✅ No need for Python on the user's machine
✅ Safer evaluation of arbitrary code
✅ Compatible with serverless / edge agent workloads
✅ Ideal for constrained or untrusted environments

This is just the beginning: a focused initial implementation with known limitations. A solid MVP designed for secure, sandboxed use cases. 💡

💡 We're inviting the open-source community to help evolve this executor:
• Tackle more advanced Python features
• Expand compatibility
• Add test coverage
• Shape the next-gen secure agent runtime

🔗 Check out the PR: https://github.com/huggingface/smolagents/pull/1261

Let's reimagine what agent-driven Python execution can look like: remote-first, wasm-secure, and community-built.

This feature is live in smolagents v1.20.0!
Try it out.
Break things. Extend it. Give us feedback.
Let's build safer, smarter agents; together 🧠⚙️

👉 https://github.com/huggingface/smolagents/releases/tag/v1.20.0

#smolagents #WebAssembly #Python #AIagents #Pyodide #Deno #OpenSource #HuggingFace #AgenticAI
louisbrulenaudet 
posted an update 8 months ago
view post
Post
2861
Because hackathons are often the starting point for many AI projects, I've created a Python-backend template incorporating my feedback to streamline collaboration and urgent deployments 🏎️

Within a year, I had the opportunity to participate in hackathons organized by Mistral, OpenAI, and DeepMind and this GitHub template is structured around several fundamental building blocks and recommendations I offer developers eager to participate in their first hackathon, whether as part of a team or individually. Its emphasis is on rapid setup and deployment through:
- uv as a package manager, simplifying usage via a series of pre-configured make commands.
- FastAPI for API management, structured in a modular architecture designed to minimize branch conflicts during merges to main branches (using minimal health-check and ping routes to verify Docker’s proper execution and backend accessibility on the local network).
- Pydantic for validation and type handling, which simplifies debugging and enhances understanding of data objects.
- A set of custom instructions tailored for agents (Cline and GitHub Copilot), aimed at improving overall comprehension of the application and optimizing the vibe-coding experience.

This template includes unit tests with a 100% success rate and test coverage, as well as a minimal CI file ensuring that the FastAPI application runs correctly. Thus, merging code that breaks the server into production becomes impossible ⛔️

In general, I would reiterate an essential piece of advice: your two main adversaries are branch conflicts—particularly when the same file is modified concurrently within a brief period, especially if your architecture isn’t built for scalability—and deployment issues under urgent circumstances ⏱️

Link to GitHub: https://github.com/louisbrulenaudet/hackathon-backend

Simply issue these commands and you can ship your code at the speed of light:
make init
make dev
albertvillanova 
posted an update 8 months ago
view post
Post
1836
🚀 SmolAgents v1.19.0 is live!
This release brings major improvements to agent flexibility, UI usability, streaming architecture, and developer experience: making it easier than ever to build smart, interactive AI agents. Here's what's new:

🔧 Agent Upgrades
- Support for managed agents in ToolCallingAgent
- Context manager support for cleaner agent lifecycle handling
- Output formatting now uses XML tags for consistency

🖥️ UI Enhancements
- GradioUI now supports reset_agent_memory: perfect for fresh starts in dev & demos.

🔄 Streaming Refactor
- Streaming event aggregation moved off the Model class
- ➡️ Better architecture & maintainability

📦 Output Tracking
- CodeAgent outputs are now stored in ActionStep
- ✅ More visibility and structure to agent decisions

🐛 Bug Fixes
- Smarter planning logic
- Cleaner Docker logs
- Better prompt formatting for additional_args
- Safer internal functions and final answer matching

📚 Docs Improvements
- Added quickstart examples with tool usage
- One-click Colab launch buttons
- Expanded reference docs (AgentMemory, GradioUI docstrings)
- Fixed broken links and migrated to .md format

🔗 Full release notes:
https://github.com/huggingface/smolagents/releases/tag/v1.19.0

💬 Try it out, explore the new features, and let us know what you build!

#smolagents #opensource #AIagents #LLM #HuggingFace
louisbrulenaudet 
posted an update 8 months ago
view post
Post
1245
🌐 Clinical Trials Dataset now available on Hugging Face! 🧬

I’ve just released a comprehensive, ML-ready dataset featuring 500,000+ clinical trial records sourced directly from ClinicalTrials.gov for biomedical NLP, healthcare analytics, and clinical research applications 🤗

I wanted to produce the most complete and up-to-date dump with all raw data partially flattened to simplify extraction, self-querying and processing.

Do you have any ideas about what we can do with it? Using descriptions to enhance specialized embedding models?

louisbrulenaudet/clinical-trials
albertvillanova 
posted an update 9 months ago
albertvillanova 
posted an update 10 months ago
Yosun 
posted an update 10 months ago
view post
Post
551
Is it possible to pay for more ZeroGPU usage quota?
·
albertvillanova 
posted an update 10 months ago
view post
Post
2903
smolagents v1.14.0 is out! 🚀
🔌 MCPClient: A sleek new client for connecting to remote MCP servers, making integrations more flexible and scalable.
🪨 Amazon Bedrock: Native support for Bedrock-hosted models.
SmolAgents is now more powerful, flexible, and enterprise-ready. 💼

Full release 👉 https://github.com/huggingface/smolagents/releases/tag/v1.14.0
#smolagents #LLM #AgenticAI
lianghsun 
posted an update 11 months ago
view post
Post
3726

With the arrival of Twinkle April — Twinkle AI’s annual open-source celebration held every April — our community is excited to unveil its very first project:

📊 Twinkle Eval (https://github.com/ai-twinkle/Eval), a next-generation evaluation tool led by our contributor @tedslin .

Unlike traditional evaluation tools like iKala’s ievals (https://github.com/ikala-ai/ievals), which can only evaluate language models (LMs) one sample at a time, Twinkle Eval is designed with Large Reasoning Models (LRMs) in mind. As reasoning time increases with more complex models, traditional tools become increasingly inefficient 😲 — for example, evaluating LRMs on the ikala/tmmluplus benchmark could take *
half a day without finishing.

One question we were especially curious about:
Does shuffling multiple-choice answer order impact model accuracy? 🤔
→ See: "Change Answer Order Can Decrease MMLU Accuracy" – arXiv:2406.19470v1

To address these challenges, Twinkle Eval brings three key innovations to the table:

1️⃣ Parallelized evaluation of samples
2️⃣ Multi-round testing for stability
3️⃣ Randomized answer order to test robustness

After running experiments, we observed that Twinkle Eval can speed up evaluation by up to 15× 🚀🚀. Interestingly, most models scored slightly lower under the 2️⃣3️⃣ test settings compared to their claimed performance — suggesting further benchmarking is needed.

This framework also comes with additional tunable parameters and detailed logging of LM behavior per question — perfect for those who want to dive deeper. 😆

If you find Twinkle Eval useful, please ⭐ the project and help spread the word 🤗
·
emre 
posted an update 11 months ago
view post
Post
3816
having trouble with auto train
hello there this is the first time i am testing auto train with a 1.8k SFT dataset. Howevery i am not quite sure the training is going smooth. Logs seem quite confusing, token did not match can not auth, generates confusing train splits, do you know how i can check my running job properly?
what is being used for training as data?
any ideas?
  • 1 reply
·