---
license: apache-2.0
datasets:
- kortukov/answer-equivalence-dataset
language:
- en
pipeline_tag: text-classification
---
# Overview
BEM (BERT Matching) is the model from the paper [Tomayto, Tomahto. Beyond Token-level Answer Equivalence for Question Answering Evaluation](https://arxiv.org/abs/2202.07654) (reproduction).
It is a [bert-base-uncased](https://huggingface.co/bert-base-uncased) model fine-tuned on the [Answer Equivalence dataset](https://huggingface.co/datasets/kortukov/answer-equivalence-dataset).
Consider this example (pseudocode):
```python
question = "how is the weather in california"
reference = "infrequent rain"
candidate = "rain"

bem(question, reference, candidate)  # ~ 0 (not equivalent)
```
This model can be used as a metric for evaluating automatic question answering systems: even when a produced answer differs from the reference on the surface, it may still be equivalent to the reference and should then count as correct.
See the paper [Tomayto, Tomahto. Beyond Token-level Answer Equivalence for Question Answering Evaluation](https://arxiv.org/abs/2202.07654) for a detailed explanation of how the data was collected and how this metric compares to others such as exact match or F1.
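For instance, a system-level score can be computed as the fraction of (question, reference, candidate) triples the model judges equivalent. A minimal sketch, assuming a `bem_score` helper that returns an equivalence probability (one concrete implementation is sketched after the example below):
```python
def qa_system_score(examples, bem_score, threshold=0.5):
    """Fraction of (question, reference, candidate) triples judged equivalent.

    `bem_score` and `threshold` are assumptions for illustration: any
    callable returning an equivalence probability in [0, 1] works here.
    """
    return sum(
        bem_score(q, ref, cand) >= threshold for q, ref, cand in examples
    ) / len(examples)
```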
# Example use
```python
import torch
from torch.nn import functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("kortukov/answer-equivalence-bem")
model = AutoModelForSequenceClassification.from_pretrained("kortukov/answer-equivalence-bem")
model.eval()

question = "What does Ban Bossy encourage?"
reference = "leadership in girls"
candidate = "positions of power"

def tokenize_function(question, reference, candidate):
    # BEM input format: [CLS] candidate [SEP] reference [SEP] question [SEP]
    text = f"[CLS] {candidate} [SEP]"
    text_pair = f"{reference} [SEP] {question} [SEP]"
    return tokenizer(text=text, text_pair=text_pair, add_special_tokens=False,
                     padding='max_length', truncation=True, return_tensors='pt')

inputs = tokenize_function(question, reference, candidate)
with torch.no_grad():  # inference only
    out = model(**inputs)

# Predicted class: argmax over the two logits
prediction = F.softmax(out.logits, dim=-1).argmax().item()
```
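The `argmax` above gives a hard 0/1 decision. If a graded score is preferred, the softmax probability of the positive class can be used instead. A minimal sketch building on the snippet above; it assumes class index 1 is the "equivalent" label (worth verifying against the model's config):
```python
def bem_score(question, reference, candidate):
    """Probability that the candidate answer is equivalent to the reference.

    Assumption: class index 1 corresponds to "equivalent".
    """
    inputs = tokenize_function(question, reference, candidate)
    with torch.no_grad():  # inference only
        logits = model(**inputs).logits
    return F.softmax(logits, dim=-1)[0, 1].item()

print(bem_score(question, reference, candidate))  # value in [0, 1]
```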