SomosNLP

non-profit

https://somosnlp.org/

SomosNLP_

somosnlp

Activity Feed

AI & ML interests

Democratizar el PLN en español e incentivar su aplicación para generar impacto social 💛

espejelomar

posted an update 10 days ago

Post

4706

Sharing WorldForge with @abdelstark

It's an open-source Python project for evaluating and replaying robotics and world-model workflows.

The useful part is not only calling a model. WorldForge records the run, validates action shapes, translates outputs into actions, and keeps replay artifacts you can inspect later.

The current demo uses LeRobot + LeWorldModel on PushT through the official loader:

stable_worldmodel.policy.AutoCostModel("pusht/lewm")

The harness also has replay-only paths for Cosmos-Policy and GR00T-style outputs, so you can inspect the provider contract from saved artifacts without keeping a GPU server online.

Try it:

pip install worldforge-ai
uv run --extra harness worldforge-harness --flow robotics-compare

Repo: https://github.com/AbdelStark/worldforge
Docs: https://abdelstark.github.io/worldforge/

Pre-1.0, MIT, and actively looking for contributors. Good areas:
- robotics provider adapters
- replay artifacts
- eval flows
- docs & first-run demos

Good first issues: https://github.com/AbdelStark/worldforge/contribute

If you're building robot policy evals or model adapters, would love a PR — or an issue describing what's missing.

albertvillanova

posted an update 3 months ago

Post

2788

🚀 TRL v0.29.0 introduces trl-training: an agent-native training skill.

This makes the TRL CLI a structured, agent-readable capability, allowing AI agents to reliably execute training workflows such as:
- Supervised Fine-Tuning (SFT)
- Direct Preference Optimization (DPO)
- Group Relative Policy Optimization (GRPO)

We’re excited to see what the community builds on top of this.

If you’re working on AI agents, alignment research, or scalable RL training infrastructure: give TRL v0.29.0 a try! 🤗

The future of ML tooling is agent-native.
🔗 https://github.com/huggingface/trl/releases/tag/v0.29.0

suchirsalhan

authored a paper 3 months ago

BabyLM Turns 4: Call for Papers for the 2026 BabyLM Workshop

Paper • 2602.20092 • Published Feb 23 • 1

mariagrandury

authored 2 papers 3 months ago

BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data

Paper • 2510.10159 • Published Oct 11, 2025 • 3

Measuring what Matters: Construct Validity in Large Language Model Benchmarks

Paper • 2511.04703 • Published Nov 3, 2025 • 8

lewtun

submitted 2 papers to Daily Papers 3 months ago

Single-minus gluon tree amplitudes are nonzero

Paper • 2602.12176 • Published Feb 12 • 8

Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL

Paper • 2602.03773 • Published Feb 3 • 13

albertvillanova

posted an update 4 months ago

Post

1977

5 years already working in democratizing AI 🤗
Grateful to be part of such an awesome team making it happen every day.

juanjucm

posted an update 4 months ago

Post

327

Last week,

zai-org dropped zai-org/GLM-4.7-Flash. Now, we bring it to Microsoft Foundry!

- 🏆 30B-A3B MoE, the strongest model in the 30B class. It excels at coding tasks, agentic workflows and reasoning.
- 🤏🏻 Lighter version of his 358B big brother, balancing performance and efficiency.

Not light enough for you? We are also adding

unsloth unsloth/GLM-4.7-Flash-GGUF to the catalog, with GPU and CPU support powered by llama.cpp 🔥

Go join the hype and deploy them from the Hugging Face collection on Microsoft Foundry!

2 replies

reddrex

updated a dataset 4 months ago

somosnlp/LingComp_QA

Viewer • Updated Jan 15 • 1k • 75 • 1

pcuenq

posted an update 5 months ago

Post

5006

👉 What happened in AI in 2025? 👈

We prepared the 2025 version of the HF AI Timeline Grid, highlighting open vs API-based model releases, and allowing you to browse and filter by access, modality, and release type!

Play with it here:
2025-ai-timeline/2025-ai-timeline

Here's my personal quarterly TL;DR:

1️⃣ Q1 — Learning to Reason
Deepseek not only releases a top-notch reasoning model, but shows how to train them and compete with closed frontier models. OpenAI debuts Deep Research.

Significant milestones: DeepSeek R1 & R1-Zero, Qwen 2.5 VL, OpenAI Deep Research, Gemini 2.5 Pro (experimental)

2️⃣ Q2 — Multimodality and Coding
More LLMs embrace multimodality by default, and there's a surge in coding agents. Strong vision, audio, and generative models emerge.

Significant milestones: Llama 4, Qwen 3, Imagen 4, OpenAI Codex, Google Jules, Claude 4

3️⃣ Q3 — "Gold" rush, OpenAI opens up, the community goes bananas
Flagship models get gold in Math olympiads and hard benchmarks. OpenAI releases strong open source models and Google releases the much anticipated nano-banana for image generation and editing. Agentic workflows become commonplace.

Significant milestones: Gemini and OpenAI IMO Gold, gpt-oss, Gemini 2.5 Flash Image, Grok 4, Claude Sonnet 4.5

4️⃣ Q4 — Mistral returns, leaderboard hill-climbing
Mistral is back with updated model families. All labs release impressive models to wrap up the year!

Significant milestones: Claude Opus 4.5, DeepSeek Math V2, FLUX 2, GPT 5.1, Kimi K2 Thinking, Nano Banana Pro, GLM 4.7, Gemini 3, Mistral 3, MiniMax M2.1 🤯

Credits
🙏 NHLOCAL for the source data https://github.com/NHLOCAL/AiTimeline

🫡 @reach-vb for the original idea, design and recipe

🙌 @ariG23498 and yours truly for compiling and verifying the 2025 edition

🥳 Here's to 2026, wishing it becomes the best year ever for open releases and on-device-first use-cases! 🥂

3 replies

suchirsalhan

authored 4 papers 7 months ago

BLiSS 1.0: Evaluating Bilingual Learner Competence in Second Language Small Language Models

Paper • 2510.19419 • Published Oct 22, 2025 • 1

What is the Best Sequence Length for BABYLM?

Paper • 2510.19493 • Published Oct 22, 2025 • 1

Teacher Demonstrations in a BabyLM's Zone of Proximal Development for Contingent Multi-Turn Interaction

Paper • 2510.20411 • Published Oct 23, 2025 • 2

BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data

Paper • 2510.10159 • Published Oct 11, 2025 • 3

suchirsalhan

authored 3 papers 8 months ago

Looking to Learn: Token-wise Dynamic Gating for Low-Resource Vision-Language Modelling

Paper • 2510.08470 • Published Oct 9, 2025 • 1

Pico: A Modular Framework for Hypothesis-Driven Small Language Model Research

Paper • 2509.16413 • Published Sep 19, 2025 • 1

Meta-Pretraining for Zero-Shot Cross-Lingual Named Entity Recognition in Low-Resource Philippine Languages

Paper • 2509.02160 • Published Sep 2, 2025 • 1

mariagrandury

authored 2 papers 8 months ago

Adding LLMs to the psycholinguistic norming toolbox: A practical guide to getting the most out of human ratings

Paper • 2509.14405 • Published Sep 17, 2025 • 2

Psycholinguistic Word Features: a New Approach for the Evaluation of LLMs Alignment with Humans

Paper • 2506.22439 • Published May 29, 2025 • 3

AI & ML interests

Team members 315

somosnlp's activity