---
library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: Fin-ModernBERT
results: []
datasets:
- clapAI/FinData-dedup
language:
- en
pipeline_tag: fill-mask
---
# Fin-ModernBERT
Fin-ModernBERT is a domain-adapted pretrained language model for the **financial domain**, obtained by continual pretraining of [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) with a **context length of 1024 tokens** on large-scale finance-related corpora.
---
## Model Description
- **Base model:** ModernBERT-base (context length = 1024)
- **Domain:** Finance, Stock Market, Cryptocurrency
- **Objective:** Improve representation and understanding of financial text for downstream NLP tasks (sentiment analysis, NER, classification, QA, retrieval, etc.)
---
## Training Data
We collected and combined multiple publicly available finance-related datasets, including:
- [danidanou/Bloomberg_Financial_News](https://huggingface.co/datasets/danidanou/Bloomberg_Financial_News)
- [juanberasategui/Crypto_Tweets](https://huggingface.co/datasets/juanberasategui/Crypto_Tweets)
- [StephanAkkerman/crypto-stock-tweets](https://huggingface.co/datasets/StephanAkkerman/crypto-stock-tweets)
- [SahandNZ/cryptonews-articles-with-price-momentum-labels](https://huggingface.co/datasets/SahandNZ/cryptonews-articles-with-price-momentum-labels)
- [edaschau/financial_news](https://huggingface.co/datasets/edaschau/financial_news)
- [sabareesh88/FNSPID_nasdaq](https://huggingface.co/datasets/sabareesh88/FNSPID_nasdaq)
- [BAAI/IndustryCorpus_finance](https://huggingface.co/datasets/BAAI/IndustryCorpus_finance)
- [mjw/stock_market_tweets](https://huggingface.co/datasets/mjw/stock_market_tweets)
After aggregation, we obtained **~50M financial records**.
A deduplication process reduced this to **~20M records**, available at:
👉 [clapAI/FinData-dedup](https://huggingface.co/datasets/clapAI/FinData-dedup)
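To take a quick look at the deduplicated corpus, here is a minimal sketch using the `datasets` library; the `"train"` split name is an assumption, so check the dataset card for the actual splits and columns.
```python
from datasets import load_dataset

# Stream the deduplicated corpus without downloading it in full.
# The "train" split name is an assumption; check the dataset card.
ds = load_dataset("clapAI/FinData-dedup", split="train", streaming=True)
print(next(iter(ds)))
```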
---
## Training Hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- **Learning rate:** 2e-4
- **Train batch size:** 24
- **Eval batch size:** 24
- **Seed:** 0
- **Gradient accumulation steps:** 128
- **Effective total train batch size:** 3072
- **Optimizer:** `adamw_torch_fused` (AdamW) with betas=(0.9, 0.999), epsilon=1e-08
- **LR scheduler:** Linear
- **Epochs:** 1
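As a rough reconstruction, these settings map onto `transformers.TrainingArguments` as sketched below. The actual pretraining script is not published, so treat field choices such as `output_dir` as illustrative.
```python
from transformers import TrainingArguments

# Approximate mapping of the listed settings onto TrainingArguments;
# the training script itself is not published, so this is an assumption.
# AdamW betas (0.9, 0.999) and epsilon 1e-08 are the Trainer defaults.
args = TrainingArguments(
    output_dir="fin-modernbert-pretrain",  # hypothetical path
    learning_rate=2e-4,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    gradient_accumulation_steps=128,  # 24 * 128 = 3072 effective batch size
    num_train_epochs=1,
    seed=0,
    optim="adamw_torch_fused",
    lr_scheduler_type="linear",
)
```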
---
## Evaluation Benchmarks
We benchmarked **Fin-ModernBERT** against two strong baselines:
- [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert)
- [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)
### Fine-tuning Setup
All models were fine-tuned under the same configuration (see the sketch after this list):
- **Optimizer:** AdamW
- **Learning rate:** 5e-5
- **Batch size:** 16
- **Epochs:** 5
- **Scheduler:** Linear
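A minimal sketch of this shared setup using the `Trainer` API follows. The task head, label count, and output directory are illustrative assumptions, not the authors' exact script.
```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Illustrative sentiment fine-tune mirroring the shared setup above.
# num_labels=3 (negative/neutral/positive) is an assumption for these
# financial-sentiment benchmarks, not a published detail.
tokenizer = AutoTokenizer.from_pretrained("clapAI/Fin-ModernBERT")
model = AutoModelForSequenceClassification.from_pretrained(
    "clapAI/Fin-ModernBERT", num_labels=3
)
args = TrainingArguments(
    output_dir="fin-modernbert-sentiment",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    num_train_epochs=5,
    lr_scheduler_type="linear",
)
# trainer = Trainer(model=model, args=args,
#                   train_dataset=..., eval_dataset=...)
# trainer.train()
```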
### Results
| Dataset | Metric | FinBERT (ProsusAI) | ModernBERT-base | Fin-ModernBERT |
|---------|--------|---------------------|-----------------|----------------|
| CIKM (datht/fin-cikm) | F1-score | 42.77 | 53.08 | **54.89** |
| PhraseBank (soumakchak/phrasebank) | F1-score | 86.33 | 85.03 | **88.09** |
> Further evaluations on additional datasets and tasks are ongoing to provide a more comprehensive view of its performance.
---
## Use Cases
Fin-ModernBERT can be used for various financial NLP applications, such as:
- **Financial Sentiment Analysis** (e.g., market mood detection from news/tweets)
- **Event-driven Stock Prediction**
- **Financial Named Entity Recognition (NER)** (companies, tickers, financial instruments)
- **Document Classification & Clustering**
- **Question Answering over financial reports and news**
---
## How to Use
```python
import torch
from transformers import AutoTokenizer, AutoModel

model_name = "clapAI/Fin-ModernBERT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

text = "Federal Reserve hints at possible interest rate cuts."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    outputs = model(**inputs)

token_embeddings = outputs.last_hidden_state  # (1, seq_len, hidden_size)
```
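Because the card's pipeline tag is `fill-mask`, the model can also be queried through the masked-language-modeling pipeline. A minimal example (the input sentence is illustrative):
```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="clapAI/Fin-ModernBERT")

# ModernBERT uses the [MASK] token; the sentence is illustrative.
for pred in fill_mask("The central bank raised interest [MASK] by 25 basis points."):
    print(f"{pred['token_str']!r}: {pred['score']:.3f}")
```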
## Citation
If you use this model, please cite:
```bibtex
@misc{finmodernbert2025,
  title={Fin-ModernBERT: Continual Pretraining of ModernBERT for Financial Domain},
  author={ClapAI},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/clapAI/Fin-ModernBERT}}
}
```