Excited to share that I've joined the Hugging Face Fellows program! 🤗 Looking forward to contributing to & working more closely with the open-source ecosystem - huge thanks to everyone who's supported me on this journey! 🚀
ClimateGAN: Raising Climate Change Awareness by Generating Images of Floods Paper • 2110.02871 • Published Oct 6, 2021
MuPT: A Generative Symbolic Music Pretrained Transformer Paper • 2404.06393 • Published Apr 9, 2024 • 16
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation Paper • 2211.06687 • Published Nov 12, 2022 • 4
BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks Paper • 2412.04626 • Published Dec 5, 2024 • 14
A Single Merging Suffices: Recovering Server-based Learning Performance in Decentralized Learning Paper • 2507.06542 • Published Jul 9
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation Paper • 2406.07529 • Published Jun 11, 2024
Improving GUI Grounding with Explicit Position-to-Coordinate Mapping Paper • 2510.03230 • Published Oct 3 • 3
Chronological Thinking in Full-Duplex Spoken Dialogue Language Models Paper • 2510.05150 • Published Oct 2
Scope: Selective Cross-modal Orchestration of Visual Perception Experts Paper • 2510.12974 • Published Oct 14
InteractComp: Evaluating Search Agents With Ambiguous Queries Paper • 2510.24668 • Published Oct 28 • 97
Trained a model for emotion-controllable TTS based on MiMo audio on LAION's dataset. Still very early and it does have an issue with hallucinating, but results seem pretty good so far, given that it is very early into the training run. Will probably kick off a new run later with some settings tweaked. Put up a demo here: https://huggingface.co/spaces/mrfakename/EmoAct-MiMo (Turn 🔊 on to hear audio samples)