---
library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: Fin-ModernBERT
  results: []
datasets:
- clapAI/FinData-dedup
language:
- en
pipeline_tag: fill-mask
---

# Fin-ModernBERT

Fin-ModernBERT is a domain-adapted pretrained language model for the **financial domain**, obtained by continually pretraining [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on large-scale finance-related corpora with a **context length of 1024 tokens**.

---

## Model Description

- **Base model:** ModernBERT-base (context length = 1024)  
- **Domain:** Finance, Stock Market, Cryptocurrency  
- **Objective:** Improve representation and understanding of financial text for downstream NLP tasks (sentiment analysis, NER, classification, QA, retrieval, etc.)  

---

## Training Data

We collected and combined multiple publicly available finance-related datasets, including:

- [danidanou/Bloomberg_Financial_News](https://huggingface.co/datasets/danidanou/Bloomberg_Financial_News)  
- [juanberasategui/Crypto_Tweets](https://huggingface.co/datasets/juanberasategui/Crypto_Tweets)  
- [StephanAkkerman/crypto-stock-tweets](https://huggingface.co/datasets/StephanAkkerman/crypto-stock-tweets)  
- [SahandNZ/cryptonews-articles-with-price-momentum-labels](https://huggingface.co/datasets/SahandNZ/cryptonews-articles-with-price-momentum-labels)  
- [edaschau/financial_news](https://huggingface.co/datasets/edaschau/financial_news)  
- [sabareesh88/FNSPID_nasdaq](https://huggingface.co/datasets/sabareesh88/FNSPID_nasdaq)  
- [BAAI/IndustryCorpus_finance](https://huggingface.co/datasets/BAAI/IndustryCorpus_finance)  
- [mjw/stock_market_tweets](https://huggingface.co/datasets/mjw/stock_market_tweets)  

After aggregation, we obtained **~50M financial records**.  
A deduplication process reduced this to **~20M records**, available at:  
👉 [clapAI/FinData-dedup](https://huggingface.co/datasets/clapAI/FinData-dedup)
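
The deduplicated corpus can be loaded with the standard `datasets` API; a minimal sketch (the `train` split name is an assumption, so check the dataset card for the exact splits and columns):

```python
from datasets import load_dataset

# Stream the deduplicated corpus so the full ~20M records are not downloaded up front.
# The split name "train" is an assumption.
findata = load_dataset("clapAI/FinData-dedup", split="train", streaming=True)

# Peek at a few records to inspect the schema.
for example in findata.take(3):
    print(example)
```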

---

## Training Hyperparameters

The following hyperparameters were used during continual pretraining (a sketch of the corresponding configuration follows the list):

- **Learning rate:** 2e-4  
- **Train batch size:** 24  
- **Eval batch size:** 24  
- **Seed:** 0  
- **Gradient accumulation steps:** 128  
- **Effective total train batch size:** 3072  
- **Optimizer:** fused AdamW (`adamw_torch_fused`) with betas=(0.9, 0.999), epsilon=1e-8  
- **LR scheduler:** Linear  
- **Epochs:** 1  
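
As a rough illustration, these settings map onto `transformers` `TrainingArguments` as sketched below; this is not the released training script, and the output directory is illustrative:

```python
from transformers import TrainingArguments

# Continual-pretraining configuration reported above.
# Effective batch size = 24 (per device) * 128 (accumulation steps) = 3,072 sequences.
pretraining_args = TrainingArguments(
    output_dir="fin-modernbert-pretraining",  # illustrative path
    learning_rate=2e-4,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    gradient_accumulation_steps=128,
    num_train_epochs=1,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=0,
)
```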

---

## Evaluation Benchmarks

We benchmarked **Fin-ModernBERT** against two strong baselines:  
- [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert)  
- [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)  

### Fine-tuning Setup
All models were fine-tuned under the same configuration (see the sketch after this list):  
- **Optimizer:** AdamW  
- **Learning rate:** 5e-5  
- **Batch size:** 16  
- **Epochs:** 5  
- **Scheduler:** Linear  
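
A minimal sketch of this shared setup (the label count and output directory are assumptions specific to each benchmark, and dataset preprocessing is omitted):

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
)

model_name = "clapAI/Fin-ModernBERT"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# num_labels=3 assumes a positive/neutral/negative sentiment scheme; adjust per dataset.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Shared fine-tuning configuration used for all three models above.
finetune_args = TrainingArguments(
    output_dir="fin-modernbert-sentiment",  # illustrative path
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    num_train_epochs=5,
    lr_scheduler_type="linear",
)
```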

### Results

| Dataset | Metric | FinBERT (ProsusAI) | ModernBERT-base | Fin-ModernBERT |
|---------|--------|---------------------|-----------------|----------------|
| CIKM (datht/fin-cikm) | F1-score | 42.77 | 53.08 | **54.89** |
| PhraseBank (soumakchak/phrasebank) | F1-score | 86.33 | 85.03 | **88.09** |

> Further evaluations on additional datasets and tasks are ongoing to provide a more comprehensive view of its performance.


---

## Use Cases

Fin-ModernBERT can be used for various financial NLP applications, such as:

- **Financial Sentiment Analysis** (e.g., market mood detection from news/tweets)  
- **Event-driven Stock Prediction**  
- **Financial Named Entity Recognition (NER)** (companies, tickers, financial instruments)  
- **Document Classification & Clustering**  
- **Question Answering over financial reports and news**  

---

## How to Use

```python
import torch
from transformers import AutoTokenizer, AutoModel

model_name = "clapAI/Fin-ModernBERT"

# Load the tokenizer and the encoder backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Encode a financial sentence and run it through the model
text = "Federal Reserve hints at possible interest rate cuts."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Contextual token embeddings: (batch_size, sequence_length, hidden_size)
embeddings = outputs.last_hidden_state
```
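
Since the model is pretrained with a masked-language-modeling objective (pipeline tag `fill-mask`), it can also be queried directly through the `fill-mask` pipeline; the example sentence below is illustrative:

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="clapAI/Fin-ModernBERT")

# Use the tokenizer's own mask token rather than hard-coding the special-token string.
mask = fill_mask.tokenizer.mask_token
predictions = fill_mask(f"The central bank decided to {mask} interest rates.")

for pred in predictions:
    print(pred["token_str"], round(pred["score"], 4))
```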

## Citation

If you use this model, please cite:

```bibtex
@misc{finmodernbert2025,
  title={Fin-ModernBERT: Continual Pretraining of ModernBERT for Financial Domain},
  author={ClapAI},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/clapAI/Fin-ModernBERT}}
}
```