Add installation and sample usage, and downstream tasks link to model card

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +75 -3
README.md CHANGED
@@ -1,8 +1,8 @@
1   ---
2 - license: apache-2.0
3 - library_name: torch
4   base_model:
5   - microsoft/wavlm-large
6   pipeline_tag: audio-to-audio
7   ---
8
@@ -20,15 +20,86 @@ This repository contains the **50 Hz causal checkpoint with a codebook size of 2
20
21   - 🌐 **Project Page**: https://lucadellalib.github.io/focalcodec-web/
22
23   - 💾 **GitHub**: https://github.com/lucadellalib/focalcodec
24
25   <img src="focalcodec-stream.png" width="700">
26
27   ---------------------------------------------------------------------------------------------------------
28
29   ## ▶️ Quickstart
30
31 - See the readme at: https://github.com/lucadellalib/focalcodec
32
33   ---------------------------------------------------------------------------------------------------------
34
@@ -47,6 +118,7 @@ See the readme at: https://github.com/lucadellalib/focalcodec
47   author = {Luca {Della Libera} and Cem Subakan and Mirco Ravanelli},
48   journal = {arXiv preprint arXiv:2509.16195},
49   year = {2025},
50   }
51   ```
52
 
1   ---
2   base_model:
3   - microsoft/wavlm-large
4 + library_name: torch
5 + license: apache-2.0
6   pipeline_tag: audio-to-audio
7   ---
8
 
 
20
21   - 🌐 **Project Page**: https://lucadellalib.github.io/focalcodec-web/
22
23 + - 🔊 **Downstream Tasks**: https://github.com/lucadellalib/audiocodecs
24 +
25   - 💾 **GitHub**: https://github.com/lucadellalib/focalcodec
26
27   <img src="focalcodec-stream.png" width="700">
28
29   ---------------------------------------------------------------------------------------------------------
30
31 + ## 🛠️ Installation
32 +
33 + First of all, install [Python 3.8 or later](https://www.python.org). Then, open a terminal and run:
34 +
35 + ```bash
36 + pip install huggingface-hub safetensors sounddevice soundfile torch torchaudio
37 + ```
38 +
39 + ---------------------------------------------------------------------------------------------------------
40 +
41   ## ▶️ Quickstart
42
43 + **NOTE**: the `audios` directory contains audio samples that you can download and use to test the codec.
44 +
45 + You can easily load the model using `torch.hub` without cloning the repository:
46 +
47 + ```python
48 + import torch
49 + import torchaudio
50 +
51 + # Load FocalCodec model
52 + codec = torch.hub.load(
53 +     repo_or_dir="lucadellalib/focalcodec",
54 +     model="focalcodec",
55 +     config="lucadellalib/focalcodec_50hz",
56 +     force_reload=True,  # Fetch the latest FocalCodec version from Torch Hub
57 + )
58 + codec.eval().requires_grad_(False)
59 +
60 + # Load and preprocess the input audio
61 + audio_file = "audios/librispeech-dev-clean/251-118436-0003.wav"
62 + sig, sample_rate = torchaudio.load(audio_file)
63 + sig = torchaudio.functional.resample(sig, sample_rate, codec.sample_rate_input)
64 +
65 + # Encode audio into tokens
66 + toks = codec.sig_to_toks(sig)  # Shape: (batch, time)
67 + print(toks.shape)
68 + print(toks)
69 +
70 + # Convert tokens to their corresponding binary spherical codes
71 + codes = codec.toks_to_codes(toks)  # Shape: (batch, code_time, log2 codebook_size)
72 + print(codes.shape)
73 + print(codes)
74 +
75 + # Decode tokens back into a waveform
76 + rec_sig = codec.toks_to_sig(toks)
77 +
78 + # Save the reconstructed audio
79 + rec_sig = torchaudio.functional.resample(rec_sig, codec.sample_rate_output, sample_rate)
80 + torchaudio.save("reconstruction.wav", rec_sig, sample_rate)
81 + ```
82 +
83 + Alternatively, you can install FocalCodec as a standard Python package using `pip`:
84 +
85 + ```bash
86 + pip install focalcodec@git+https://github.com/lucadellalib/focalcodec.git@main#egg=focalcodec
87 + ```
88 +
89 + Once installed, you can import it in your scripts:
90 +
91 + ```python
92 + import focalcodec
93 +
94 + config = "lucadellalib/focalcodec_50hz"
95 + codec = focalcodec.FocalCodec.from_pretrained(config)
96 + ```
97 +
98 + Check the code documentation for more details on model usage and available configurations.
99 +
100 + **NOTE**: the initial **v0.0.1** release is still available at https://github.com/lucadellalib/focalcodec/tree/v0.0.1.
101 + It can be loaded via `torch.hub` as `repo_or_dir="lucadellalib/focalcodec:v0.0.1"`, or installed via `pip` as
102 + `focalcodec@git+https://github.com/lucadellalib/focalcodec.git@v0.0.1#egg=focalcodec`.
103
104   ---------------------------------------------------------------------------------------------------------
105
 
 
118   author = {Luca {Della Libera} and Cem Subakan and Mirco Ravanelli},
119   journal = {arXiv preprint arXiv:2509.16195},
120   year = {2025},
121 +
122   }
123   ```
124
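
Note on the `toks_to_codes` step in the added Quickstart: the comment there describes the output as binary spherical codes whose last dimension is log2 of the codebook size. The following is a minimal sketch of what such a mapping could look like, assuming each token index is read as a bit pattern and each bit selects the sign of one coordinate of a unit-norm vector. The helper names `tok_to_code` and `code_to_tok`, the bit order, and the 4-bit size are hypothetical and chosen for illustration only; FocalCodec's actual implementation may differ.

```python
import math
from typing import List


def tok_to_code(tok: int, num_bits: int) -> List[float]:
    """Map a token index to a unit-norm vector with entries +/- 1/sqrt(num_bits)."""
    scale = 1.0 / math.sqrt(num_bits)
    # Assumption: bit i of the index selects the sign of coordinate i.
    return [scale if (tok >> i) & 1 else -scale for i in range(num_bits)]


def code_to_tok(code: List[float]) -> int:
    """Invert tok_to_code by reading the sign of each coordinate back as a bit."""
    return sum(1 << i for i, value in enumerate(code) if value > 0)


num_bits = 4  # hypothetical size; the codebook then has 2 ** 4 = 16 entries
code = tok_to_code(5, num_bits)
print(code)               # [0.5, -0.5, 0.5, -0.5]
print(code_to_tok(code))  # 5
```

Under these assumptions every code lies on the unit sphere and the token index can be recovered from the signs alone, which is consistent with the last dimension of `codes` being log2 of the codebook size.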