---
library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: Fin-ModernBERT
  results: []
datasets:
- clapAI/FinData-dedup
language:
- en
pipeline_tag: fill-mask
---

# Fin-ModernBERT

Fin-ModernBERT is a domain-adapted pretrained language model for the **financial domain**, obtained by continually pretraining [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on large-scale finance-related corpora with a **context length of 1024 tokens**.

---

## Model Description

- **Base model:** ModernBERT-base (context length = 1024)  
- **Domain:** Finance, Stock Market, Cryptocurrency  
- **Objective:** Improve representation and understanding of financial text for downstream NLP tasks (sentiment analysis, NER, classification, QA, retrieval, etc.)  

---

## Training Data

We collected and combined multiple publicly available finance-related datasets, including:

- [danidanou/Bloomberg_Financial_News](https://huggingface.co/datasets/danidanou/Bloomberg_Financial_News)  
- [juanberasategui/Crypto_Tweets](https://huggingface.co/datasets/juanberasategui/Crypto_Tweets)  
- [StephanAkkerman/crypto-stock-tweets](https://huggingface.co/datasets/StephanAkkerman/crypto-stock-tweets)  
- [SahandNZ/cryptonews-articles-with-price-momentum-labels](https://huggingface.co/datasets/SahandNZ/cryptonews-articles-with-price-momentum-labels)  
- [edaschau/financial_news](https://huggingface.co/datasets/edaschau/financial_news)  
- [sabareesh88/FNSPID_nasdaq](https://huggingface.co/datasets/sabareesh88/FNSPID_nasdaq)  
- [BAAI/IndustryCorpus_finance](https://huggingface.co/datasets/BAAI/IndustryCorpus_finance)  
- [mjw/stock_market_tweets](https://huggingface.co/datasets/mjw/stock_market_tweets)  

After aggregation, we obtained **~50M financial records**.  
A deduplication process reduced this to **~20M records**, available at:  
👉 [clapAI/FinData-dedup](https://huggingface.co/datasets/clapAI/FinData-dedup)
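
The deduplicated corpus can be loaded with the standard `datasets` API; a minimal sketch (the `train` split name is an assumption, so check the dataset card for the exact splits and columns):

```python
from datasets import load_dataset

# Stream the deduplicated corpus so the full ~20M records are not downloaded up front.
# The split name "train" is an assumption.
findata = load_dataset("clapAI/FinData-dedup", split="train", streaming=True)

# Peek at a few records to inspect the schema.
for example in findata.take(3):
    print(example)
```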

---

## Training Hyperparameters

The following hyperparameters were used during continual pretraining (a sketch of the corresponding configuration follows the list):

- **Learning rate:** 2e-4  
- **Train batch size:** 24  
- **Eval batch size:** 24  
- **Seed:** 0  
- **Gradient accumulation steps:** 128  
- **Effective total train batch size:** 3072  
- **Optimizer:** fused AdamW (`adamw_torch_fused`) with betas=(0.9, 0.999), epsilon=1e-8  
- **LR scheduler:** Linear  
- **Epochs:** 1  
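
As a rough illustration, these settings map onto `transformers` `TrainingArguments` as sketched below; this is not the released training script, and the output directory is illustrative:

```python
from transformers import TrainingArguments

# Continual-pretraining configuration reported above.
# Effective batch size = 24 (per device) * 128 (accumulation steps) = 3,072 sequences.
pretraining_args = TrainingArguments(
    output_dir="fin-modernbert-pretraining",  # illustrative path
    learning_rate=2e-4,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    gradient_accumulation_steps=128,
    num_train_epochs=1,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=0,
)
```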

---

## Evaluation Benchmarks

We benchmarked **Fin-ModernBERT** against two strong baselines:  
- [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert)  
- [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)  

### Fine-tuning Setup
All models were fine-tuned under the same configuration (see the sketch after this list):  
- **Optimizer:** AdamW  
- **Learning rate:** 5e-5  
- **Batch size:** 16  
- **Epochs:** 5  
- **Scheduler:** Linear  
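
A minimal sketch of this shared setup (the label count and output directory are assumptions specific to each benchmark, and dataset preprocessing is omitted):

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
)

model_name = "clapAI/Fin-ModernBERT"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# num_labels=3 assumes a positive/neutral/negative sentiment scheme; adjust per dataset.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Shared fine-tuning configuration used for all three models above.
finetune_args = TrainingArguments(
    output_dir="fin-modernbert-sentiment",  # illustrative path
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    num_train_epochs=5,
    lr_scheduler_type="linear",
)
```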

### Results

| Dataset | Metric | FinBERT (ProsusAI) | ModernBERT-base | Fin-ModernBERT |
|---------|--------|---------------------|-----------------|----------------|
| CIKM (datht/fin-cikm) | F1-score | 42.77 | 53.08 | **54.89** |
| PhraseBank (soumakchak/phrasebank) | F1-score | 86.33 | 85.03 | **88.09** |

> Further evaluations on additional datasets and tasks are ongoing to provide a more comprehensive view of its performance.


---

## Use Cases

Fin-ModernBERT can be used for various financial NLP applications, such as:

- **Financial Sentiment Analysis** (e.g., market mood detection from news/tweets)  
- **Event-driven Stock Prediction**  
- **Financial Named Entity Recognition (NER)** (companies, tickers, financial instruments)  
- **Document Classification & Clustering**  
- **Question Answering over financial reports and news**  

---

## How to Use

```python
import torch
from transformers import AutoTokenizer, AutoModel

model_name = "clapAI/Fin-ModernBERT"

# Load the tokenizer and the encoder backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Encode a financial sentence and run it through the model
text = "Federal Reserve hints at possible interest rate cuts."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Contextual token embeddings: (batch_size, sequence_length, hidden_size)
embeddings = outputs.last_hidden_state
```
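
Since the model is pretrained with a masked-language-modeling objective (pipeline tag `fill-mask`), it can also be queried directly through the `fill-mask` pipeline; the example sentence below is illustrative:

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="clapAI/Fin-ModernBERT")

# Use the tokenizer's own mask token rather than hard-coding the special-token string.
mask = fill_mask.tokenizer.mask_token
predictions = fill_mask(f"The central bank decided to {mask} interest rates.")

for pred in predictions:
    print(pred["token_str"], round(pred["score"], 4))
```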

## Citation

If you use this model, please cite:

```bibtex
@misc{finmodernbert2025,
  title={Fin-ModernBERT: Continual Pretraining of ModernBERT for Financial Domain},
  author={ClapAI},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/clapAI/Fin-ModernBERT}}
}
```