Cactooz/DeepMMAudio
Task: Video-Text-to-Text
Datasets: Loie/VGGSound, CLAPv2/Clotho, cvssp/WavCaps
Tags: video-to-audio
License: mit
Code: https://github.com/Cactooz/DeepMMAudio
Model tree for Cactooz/DeepMMAudio
Base model: hkchengrex/MMAudio
Finetuned (3): this model
Datasets used to train Cactooz/DeepMMAudio
cvssp/WavCaps (updated Jul 6, 2023)
Loie/VGGSound (updated Mar 26, 2023)
CLAPv2/Clotho (updated Mar 12, 2025)