---
license: apache-2.0
datasets:
- kortukov/answer-equivalence-dataset
language:
- en
pipeline_tag: text-classification
---
|
|
|
|
|
# Overview |
|
|
BEM - the BERT Matching model from the paper [Tomayto, Tomahto. Beyond Token-level Answer Equivalence for Question Answering Evaluation](https://arxiv.org/abs/2202.07654) (reproduction).
|
|
|
|
|
It is a [bert-base-uncased](https://huggingface.co/bert-base-uncased) model trained on the [Answer Equivalence dataset](https://huggingface.co/datasets/kortukov/answer-equivalence-dataset).
|
|
|
|
|
Consider this example (pseudocode): |
|
|
```python
question = 'how is the weather in california'
reference = 'infrequent rain'
candidate = 'rain'

bem(question, reference, candidate) ~ 0  # "rain" is not equivalent to "infrequent rain"
```
|
|
|
|
|
This model can be used as a metric to evaluate automatic question answering systems: when a produced answer differs from the reference on the surface, it may still be semantically equivalent to the reference and should then count as correct.
|
|
|
|
|
See the paper [Tomayto, Tomahto. Beyond Token-level Answer Equivalence for Question Answering Evaluation](https://arxiv.org/abs/2202.07654) for a detailed explanation of how the data was collected and how this metric compares to others such as exact match or F1.
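For intuition, a token-overlap metric rewards the weather example above even though the candidate drops the key qualifier. Below is a minimal sketch of a SQuAD-style token F1 (the official evaluation script additionally normalizes punctuation and articles), included here only for comparison:

```python
from collections import Counter

def token_f1(candidate: str, reference: str) -> float:
    """SQuAD-style token-level F1 between two answer strings (simplified)."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    common = Counter(cand_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(cand_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

# Token overlap gives "rain" a high score against "infrequent rain" (~0.67),
# while BEM judges the pair as not equivalent (~0).
print(token_f1("rain", "infrequent rain"))
```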
|
|
|
|
|
# Example use |
|
|
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from torch.nn import functional as F

tokenizer = AutoTokenizer.from_pretrained("kortukov/answer-equivalence-bem")
model = AutoModelForSequenceClassification.from_pretrained("kortukov/answer-equivalence-bem")

question = "What does Ban Bossy encourage?"
reference = "leadership in girls"
candidate = "positions of power"

def tokenize_function(question, reference, candidate):
    # BEM input format: the candidate answer is the first segment,
    # the reference answer followed by the question is the second segment.
    # Special tokens are written into the strings, so add_special_tokens=False.
    text = f"[CLS] {candidate} [SEP]"
    text_pair = f"{reference} [SEP] {question} [SEP]"
    return tokenizer(text=text, text_pair=text_pair, add_special_tokens=False, padding='max_length', truncation=True, return_tensors='pt')

inputs = tokenize_function(question, reference, candidate)
out = model(**inputs)

# Class 1 means the candidate answer is judged equivalent to the reference.
prediction = F.softmax(out.logits, dim=-1).argmax().item()
```
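To score a whole QA system with BEM, the per-example equivalence decisions can be averaged into an accuracy-style number. A minimal sketch, reusing `model`, `tokenizer`, and `tokenize_function` from above; the `examples` list is a hypothetical placeholder for your own (question, reference, candidate) triples:

```python
import torch

# Hypothetical evaluation data: (question, reference answer, candidate answer).
examples = [
    ("What does Ban Bossy encourage?", "leadership in girls", "positions of power"),
    ("how is the weather in california", "infrequent rain", "rain"),
]

model.eval()
equivalent = 0
with torch.no_grad():
    for question, reference, candidate in examples:
        inputs = tokenize_function(question, reference, candidate)
        logits = model(**inputs).logits
        # Class 1 = candidate judged equivalent to the reference answer.
        equivalent += logits.argmax(dim=-1).item()

bem_score = equivalent / len(examples)
print(f"BEM score: {bem_score:.2f}")
```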
|
|
|
|
|
|