---
license: apache-2.0
datasets:
- QCRI/AZERG-Dataset
language:
- en
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
tags:
- STIX standard
- threat intelligence
- MITRE ATT&CK
---
# QCRI/AZERG-MixTask-Mistral
This model is a fine-tuned version of `mistralai/Mistral-7B-Instruct-v0.3`, specialized for Cyber Threat Intelligence (CTI) tasks. It was trained on the AZERG Dataset using a mixture of all four tasks required for STIX data generation:
- T1: Entity Detection
- T2: Entity Type Identification
- T3: Related Pair Detection
- T4: Relationship Type Identification

This is the most versatile model in the AZERG collection, capable of handling all STIX extraction sub-tasks.
## Intended Use
This model is intended to be used within the [AZERG framework](https://github.com/QCRI/azerg/) to extract STIX entities and relationships from security reports. The exact prompt for each task is defined in the framework; use those prompts verbatim, since the model was fine-tuned on them.

Example prompt (Task 1: Entity Detection):
```
Instruction:
You are a helpful threat intelligence analyst. Your task is to extract all STIX entities mentioned in the input. To help you, here is a list of the possible STIX entity types.
STIX entity types:
- ATTACK_PATTERN: A type of TTP that describes ways that adversaries attempt to compromise targets. (e.g., T1051, T1548.001, etc.)
[...]
Answer in the following format: <entities>LIST OF IDENTIFIED ENTITIES SEPARATED BY PIPE |</entities>
Input:
- Text Passage: [INPUT TEXT]
Response:
```
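
The prompt layout and answer format above can be handled with a few lines of Python. The following is a minimal sketch, not the framework's own code: `build_t1_prompt` and `parse_entities` are hypothetical helper names, the entity type descriptions (elided as `[...]` above) must be supplied by the caller from the AZERG framework, and the exact line breaks are an assumption that should be checked against the framework's prompts.

```python
import re

# Fixed parts of the Task 1 prompt, copied from the example above.
INSTRUCTION = (
    "You are a helpful threat intelligence analyst. Your task is to extract all "
    "STIX entities mentioned in the input. To help you, here is a list of the "
    "possible STIX entity types."
)
ANSWER_FORMAT = (
    "Answer in the following format: "
    "<entities>LIST OF IDENTIFIED ENTITIES SEPARATED BY PIPE |</entities>"
)

# Matches the content between the first pair of <entities> tags.
_ENTITIES_RE = re.compile(r"<entities>(.*?)</entities>", re.DOTALL)


def build_t1_prompt(text_passage, entity_type_lines):
    """Assemble a Task 1 prompt (hypothetical helper).

    entity_type_lines: the "- TYPE: description" lines, taken from the
    AZERG framework; they are not reproduced here.
    """
    parts = [
        "Instruction:",
        INSTRUCTION,
        "STIX entity types:",
        *entity_type_lines,
        ANSWER_FORMAT,
        "Input:",
        f"- Text Passage: {text_passage}",
        "Response:",
    ]
    return "\n".join(parts)


def parse_entities(response):
    """Return the pipe-separated entities from a model response.

    Returns an empty list when no <entities> block is present.
    """
    match = _ENTITIES_RE.search(response)
    if match is None:
        return []
    return [item.strip() for item in match.group(1).split("|") if item.strip()]
```

Pass only the generated continuation to `parse_entities`, not the prompt plus continuation: the prompt itself contains an `<entities>` placeholder that would otherwise be matched first.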
## Citation
If you use this model, please cite our paper:
```
@article{lekssays2025azerg,
title={From Text to Actionable Intelligence: Automating STIX Entity and Relationship Extraction},
author={Lekssays, Ahmed and Sencar, Husrev Taha and Yu, Ting},
journal={arXiv preprint arXiv:2507.16576},
year={2025}
}
```