eren23 committed
Commit d3f6529 · verified · 1 Parent(s): ca085e6

Create README.md

Files changed (1)
  1. README.md +68 -0
README.md ADDED
@@ -0,0 +1,68 @@
# DINOv3 → YOLO11 Distilled OCR Detector

This repository contains a **YOLO11-based OCR object detector** distilled from a **DINOv3 ViT-B/16 teacher** using **LightlyTrain**.
The goal is to produce a *lightweight, high-recall text box detector* suitable for OCR, ID scanning, document parsing, and multilingual text extraction.

---

## Model Summary

- **Teacher:** `dinov3/vitb16`
- **Student:** `YOLO11s` (custom convolutional backbone)
- **Method:** LightlyTrain `distillation` (features-only MSE loss)
- **Data:** 1,200 unlabeled resume-like document crops + synthetic webpage/document images
- **Use case:** OCR region detection (not recognition)
- **Export format:** Ultralytics `.pt`
- **File:** `exported_models/exported_last.pt`
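
To make the "features-only MSE loss" entry concrete, here is a minimal sketch of a feature-level mean squared error in plain Python. It is an illustration only, not LightlyTrain's implementation: the function name `feature_mse` and the toy feature values are hypothetical, and it assumes the student features have already been projected to the teacher's dimensionality. The point is that the loss compares feature vectors directly, so no bounding-box labels are involved.

```python
def feature_mse(student_feats, teacher_feats):
    """Mean squared error over all feature values.

    Both arguments are lists of equal-length feature vectors
    (one vector per image patch). Hypothetical helper for illustration.
    """
    if len(student_feats) != len(teacher_feats):
        raise ValueError("student and teacher must have the same number of vectors")
    total, count = 0.0, 0
    for s_vec, t_vec in zip(student_feats, teacher_feats):
        if len(s_vec) != len(t_vec):
            raise ValueError("feature vectors must have the same dimension")
        for s, t in zip(s_vec, t_vec):
            total += (s - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("no feature values to compare")
    return total / count


# Toy example: two patches with three feature dimensions each
student = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
teacher = [[1.0, 2.0, 5.0], [0.0, 1.0, 0.0]]
loss = feature_mse(student, teacher)
```

In a real training loop the same quantity would be computed on tensors by the training framework; the sketch only shows what is being minimized.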

---

## Intended Use

This model is trained to **detect text regions** in real-world documents:

- CVs / resumes
- ID cards
- Business documents
- Screenshots
- Webpage fragments
- PDF pages (converted to images)

It **does not perform OCR itself**; recognition should be done with a second-stage model (Tesseract, TrOCR, Nougat, PaddleOCR, VietOCR, etc.).
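
Because recognition happens in a second stage, each detected box usually has to be turned into a crop rectangle first. The sketch below shows one way to do that, assuming boxes arrive as `(x1, y1, x2, y2)` pixel coordinates; the helper name `expand_and_clip` and the `pad` margin are hypothetical choices, not part of this repository.

```python
import math


def expand_and_clip(box, img_w, img_h, pad=4):
    """Return an integer (left, top, right, bottom) crop rectangle.

    The box is grown by `pad` pixels on every side so that glyph edges
    are not cut off, then clipped to the image bounds.
    Hypothetical helper for illustration.
    """
    x1, y1, x2, y2 = box
    left = max(0, math.floor(x1) - pad)
    top = max(0, math.floor(y1) - pad)
    right = min(img_w, math.ceil(x2) + pad)
    bottom = min(img_h, math.ceil(y2) + pad)
    return left, top, right, bottom


# Example: a box close to the bottom edge of a 60x24 image
crop_rect = expand_and_clip((10.6, 5.2, 50.1, 20.9), img_w=60, img_h=24)
```

The resulting tuple follows the `(left, upper, right, lower)` order that Pillow's `Image.crop` expects, so the crop can be handed to whichever recognizer is used.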

---

## Example Usage

### Python (Ultralytics)
```python
from ultralytics import YOLO

# Load the distilled checkpoint exported in Ultralytics format
model = YOLO("exported_last.pt")
results = model("/content/example.jpg")

results[0].show()  # visualize text boxes
```
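
### Extract Bounding Boxes
```python
# Continues from the snippet above: `results` is the list returned by `model(...)`
boxes = results[0].boxes.xyxy.cpu().numpy()  # (N, 4) array of x1, y1, x2, y2 in pixels
confs = results[0].boxes.conf.cpu().numpy()  # (N,) confidence scores

for xyxy, conf in zip(boxes, confs):
    print(xyxy, conf)
```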
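
Detections come back in no particular layout order, so a second-stage recognizer often benefits from sorting them first. The following is a rough sketch of confidence filtering plus a simple top-to-bottom, left-to-right ordering. The function name `reading_order`, the `min_conf` threshold, and the `line_tol` pixel tolerance are hypothetical values chosen for illustration; multi-column or rotated layouts would need something more careful.

```python
def reading_order(boxes, confs, min_conf=0.5, line_tol=10.0):
    """Filter boxes by confidence and sort them into reading order.

    boxes: iterable of (x1, y1, x2, y2); confs: matching confidence scores.
    Boxes whose top edge lies within `line_tol` pixels of the first box
    in the current line are treated as belonging to that text line.
    Hypothetical helper for illustration.
    """
    kept = [(tuple(b), float(c)) for b, c in zip(boxes, confs) if c >= min_conf]
    kept.sort(key=lambda bc: bc[0][1])  # sort by top edge (y1)

    lines = []
    for box, conf in kept:
        if lines and abs(box[1] - lines[-1][0][0][1]) <= line_tol:
            lines[-1].append((box, conf))  # same text line
        else:
            lines.append([(box, conf)])  # start a new text line

    ordered = []
    for line in lines:
        ordered.extend(sorted(line, key=lambda bc: bc[0][0]))  # left to right
    return ordered


# Toy example with three confident boxes and one low-confidence box
example_boxes = [(100, 12, 150, 30), (10, 10, 60, 28), (10, 50, 80, 70), (200, 11, 250, 29)]
example_confs = [0.9, 0.8, 0.7, 0.1]
ordered = reading_order(example_boxes, example_confs)
```

The function accepts plain lists or the NumPy arrays produced by the snippet above, since it only iterates over rows.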

### Distillation
```python
import lightly_train

lightly_train.train(
    out="dinov3_yolo11_distilled",
    data="/content/unlabeled_idl_images",
    model="yolo11s",
    method="distillation",
    method_args={
        "teacher": "dinov3/vitb16",
        "teacher_weights": "/content/dinov3_vitb16_pretrain.pth",
    },
    epochs=2,
    batch_size=4,
    precision="16-mixed",
)
```