---
language: 
- en
tags:
- automl
- tabular-classification
- autogluon
- cmu-course
datasets:
- aedupuga/lego-sizes
metrics:
- accuracy
- f1
model-index:
- name: Lego Brick Classification (Classical AutoML)
  results:
  - task:
      type: tabular-classification
      name: Tabular Classification
    dataset:
      name: aedupuga/lego-sizes
      type: aedupuga/lego-sizes
      split: augmented
    metrics:
    - type: accuracy
      value: 0.97
    - type: f1
      value: 0.96
  - task:
      type: tabular-classification
      name: Tabular Classification
    dataset:
      name: aedupuga/lego-sizes
      type: aedupuga/lego-sizes
      split: original
    metrics:
    - type: accuracy
      value: 0.90
    - type: f1
      value: 0.89
---

# Model Card for Lego Brick Classification (Classical AutoML)

This model classifies LEGO pieces into three types (**Standard**, **Flat**, and **Sloped**) from four numeric features: *Length*, *Height*, *Width*, and *Studs*.  
It was trained with **AutoGluon Tabular AutoML**, which automatically searched over classical ML model families (LightGBM, XGBoost, CatBoost, Random Forest, k-NN, Neural Network) and selected the best-performing one.
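
The search described above can be sketched roughly as follows. This is a minimal illustration, not the original training script: the label column name `Type`, the helper `train_predictor`, the `accuracy` evaluation metric, and the choice of `NN_TORCH` for the neural network are all assumptions.

```python
# Hypothetical training sketch; names and settings below are assumptions.

# AutoGluon model keys for the families listed in this card. NN_TORCH is an
# assumed stand-in for "Neural Network"; the original run may have differed.
MODEL_FAMILIES = {
    "GBM": {},       # LightGBM
    "XGB": {},       # XGBoost
    "CAT": {},       # CatBoost
    "RF": {},        # Random Forest
    "KNN": {},       # k-nearest neighbours
    "NN_TORCH": {},  # PyTorch neural network
}


def train_predictor(train_df, label="Type", path="autogluon_model/"):
    """Fit an AutoGluon predictor on a pandas DataFrame and return it."""
    # Imported lazily so this module can be read without AutoGluon installed.
    from autogluon.tabular import TabularPredictor

    predictor = TabularPredictor(label=label, eval_metric="accuracy", path=path)
    # Empty dicts ask AutoGluon to use its default settings for each family.
    predictor.fit(train_df, hyperparameters=MODEL_FAMILIES)
    return predictor
```

Calling `train_predictor(df).leaderboard()` on a DataFrame that holds the four features plus the label column would then list the fitted models ranked by validation score.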

---

## Model Details

### Model Description
- **Developed by:** Xinxuan Tang (CMU)  
- **Dataset curated by:** Anuhya Edupuganti (CMU)  
- **Model type:** AutoML ensemble (best model = LightGBM)  
- **Language(s):** N/A (tabular data)  
- **Finetuned from:** Not applicable  

### Model Sources
- **Repository:** [Hugging Face Model Repo](https://huggingface.co/)  
- **Dataset:** [aedupuga/lego-sizes](https://huggingface.co/datasets/aedupuga/lego-sizes)

---

## Uses

### Direct Use
- Educational practice in **tabular classification**.  
- Experimenting with AutoML search and hyperparameter tuning.  

### Downstream Use
- Could be used as a **teaching example** for AutoML pipelines on small tabular datasets.  

### Out-of-Scope Use
- **Not suitable for industrial LEGO quality control**, since the dataset is small and mostly synthetic.

---

## Bias, Risks, and Limitations

- **Small dataset**: only 30 original bricks, augmented to 300 synthetic samples.  
- **Synthetic data bias**: jitter augmentation may not reflect real-world LEGO variations.  
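
To make the jitter augmentation concrete, here is a minimal sketch of how 30 rows could be expanded tenfold. The dataset's actual procedure is not documented in this card, so the multiplicative Gaussian noise, the 2% noise scale, the decision to leave `Studs` untouched, and the `Type` label column are all assumptions.

```python
import random

# Continuous dimensions that receive noise; Studs is a count and is left
# unchanged here (an assumption about the augmentation, not a documented fact).
JITTERED_FEATURES = ["Length", "Height", "Width"]


def jitter_rows(rows, copies=10, rel_noise=0.02, seed=0):
    """Return `copies` noisy variants of each row (originals are not included)."""
    rng = random.Random(seed)
    augmented = []
    for row in rows:
        for _ in range(copies):
            new_row = dict(row)  # copies Studs and the label as they are
            for name in JITTERED_FEATURES:
                # Multiplicative Gaussian noise with an assumed 2% scale
                new_row[name] = row[name] * (1.0 + rng.gauss(0.0, rel_noise))
            augmented.append(new_row)
    return augmented


# One illustrative brick; the "Type" value is a hypothetical label.
originals = [
    {"Length": 4, "Height": 1.2, "Width": 2, "Studs": 4, "Type": "Standard"},
]
synthetic = jitter_rows(originals)
print(len(synthetic))
```

Because every synthetic row is a small perturbation of one of the 30 originals, the augmented split contains near-duplicates of the bricks the model was fitted on, which is one reason scores on it should be read cautiously.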

### Recommendations
Users should treat the results as a **proof of concept** and should not deploy the model in production.

---

## How to Get Started with the Model

```python
from autogluon.tabular import TabularPredictor
import pandas as pd

# Load the trained predictor from the directory it was saved to
predictor = TabularPredictor.load("autogluon_model/")

# Run inference on a single brick described by its four features
test_data = pd.DataFrame([{"Length": 4, "Height": 1.2, "Width": 2, "Studs": 4}])
print(predictor.predict(test_data))
```