ViTModelFT for Skin Cancer Classification

Model Details

  • Model Architecture: Vision Transformer (ViT)
  • Framework: PyTorch
  • Input Shape: 224x224 RGB images
  • Number of Parameters: ~86M (Based on ViT Base Model)
  • Output: Multi-class classification (9 classes)

Model Description

This model uses a Vision Transformer (ViT) as a backbone for skin cancer classification. The ViT backbone is pretrained on ImageNet and then fine-tuned for this task. The original classification layer is replaced with a three-layer fully connected head: hidden layers of 512 and 256 neurons, followed by a 9-class output layer representing different skin cancer types.

All ViT backbone layers are frozen; only the fully connected head is trained. This lets the model adapt to the new classification task while retaining the representations learned from ImageNet.
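The frozen-backbone setup described above can be sketched as follows. This is a minimal illustration, not the card's actual training code: the class name `ViTSkinClassifier`, the feature dimension of 768 (ViT-Base), and the stand-in backbone are assumptions; in practice the backbone would be a pretrained ViT (e.g. torchvision's `vit_b_16` with its head removed).

```python
import torch
import torch.nn as nn

class ViTSkinClassifier(nn.Module):
    """Frozen ViT backbone + trainable 512 -> 256 -> 9 classification head.

    `backbone` is any module mapping (N, 3, 224, 224) images to (N, feat_dim)
    features, e.g. a pretrained ViT with its classification head removed.
    """
    def __init__(self, backbone: nn.Module, feat_dim: int = 768, num_classes: int = 9):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():  # freeze every backbone weight
            p.requires_grad = False
        self.head = nn.Sequential(            # only these layers receive gradients
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():                 # backbone is frozen
            feats = self.backbone(x)
        return self.head(feats)

# Stand-in backbone for demonstration only (replace with a pretrained ViT):
dummy_backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 768))
model = ViTSkinClassifier(dummy_backbone)
logits = model(torch.randn(2, 3, 224, 224))  # shape (2, 9)
```

During fine-tuning, only `model.head.parameters()` would be passed to the optimizer, since the backbone's parameters have `requires_grad=False`.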

Training Details

Metrics (Validation Set)

Class   Precision   Recall   F1-Score
0       0.69        0.56     0.62
1       0.60        0.75     0.67
2       0.90        0.56     0.69
3       0.20        0.06     0.10
4       0.47        1.00     0.64
5       0.63        0.75     0.69
6       0.00        0.00     0.00
7       0.67        0.50     0.57
8       0.60        1.00     0.75
  • Overall Accuracy: 0.59
  • Macro Average Precision: 0.53
  • Macro Average Recall: 0.58
  • Macro Average F1-Score: 0.52
  • Weighted Average Precision: 0.58
  • Weighted Average Recall: 0.59
  • Weighted Average F1-Score: 0.56

License

This model is released under the MIT License.


This model has been pushed to the Hub using the PyTorchModelHubMixin integration:

  • Library: [More Information Needed]
  • Docs: [More Information Needed]
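The mixin integration mentioned above can be sketched as follows, assuming `huggingface_hub` is installed. The class `TinyHead` is a hypothetical stand-in, not the card's actual model; the repo id in the commented line is likewise illustrative of the loading pattern, which requires network access.

```python
import torch
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

# Hypothetical minimal module: inheriting from PyTorchModelHubMixin adds
# save_pretrained / from_pretrained / push_to_hub to any nn.Module.
class TinyHead(nn.Module, PyTorchModelHubMixin):
    def __init__(self, in_dim: int = 768, num_classes: int = 9):
        super().__init__()
        self.fc = nn.Linear(in_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(x)

model = TinyHead()
# model.push_to_hub("username/repo-name")      # uploads weights + config (requires auth)
# model = TinyHead.from_pretrained("user/repo")  # downloads weights from the Hub
out = model(torch.randn(1, 768))              # shape (1, 9)
```

The mixin serializes the `__init__` arguments to a config file alongside the weights, so `from_pretrained` can rebuild the module without extra code.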