CNN Accuracy: 83.2% (MobileNetV2 + custom head)
CNN F1 Macro: 0.814
ViT Accuracy: 81.7% (ViT-Tiny patch16)
CNN ROC-AUC: 0.912 (multi-class OvR)
Training Epochs: 10
Class Imbalance: 6.7:1 (NV dominates)
[Figure: Training & Validation Accuracy, CNN vs ViT. 10 epochs, weighted cross-entropy loss for class imbalance.]
[Figure: Per-Class F1 Score. CNN vs ViT comparison across all 7 lesion types.]
[Figure: Class Distribution, HAM10000. Severe class imbalance; NV dominates.]
[Figure: CNN vs ViT, All Metrics.]
Architecture Comparison
CNN — MobileNetV2
3.4M params
Pre-trained on ImageNet. Custom classifier head: Dropout(0.3) + Linear(256) + ReLU + Linear(7). Strong spatial feature extraction via convolutions.
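The classifier head described above can be sketched in PyTorch as follows. This is a minimal illustration, not the project's actual code: the 1280-dimensional input (the pooled feature width commonly produced by a MobileNetV2 backbone) and the `build_head` helper name are assumptions.

```python
import torch
from torch import nn

IN_FEATURES = 1280  # assumption: pooled MobileNetV2 feature width
NUM_CLASSES = 7     # the 7 HAM10000 lesion types


def build_head(in_features: int = IN_FEATURES, num_classes: int = NUM_CLASSES) -> nn.Sequential:
    """Hypothetical helper: Dropout(0.3) + Linear(256) + ReLU + Linear(7)."""
    return nn.Sequential(
        nn.Dropout(p=0.3),
        nn.Linear(in_features, 256),
        nn.ReLU(),
        nn.Linear(256, num_classes),
    )


head = build_head()
head.eval()  # disable dropout for a deterministic forward pass

with torch.no_grad():
    # A batch of 4 dummy feature vectors stands in for backbone output.
    logits = head(torch.randn(4, IN_FEATURES))

head_params = sum(p.numel() for p in head.parameters())
```

Under the 1280-feature assumption the head adds roughly 330k trainable parameters on top of the backbone.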
ViT — ViT-Tiny Patch16
5.7M params
Patch-based attention. 224x224 input split into 16x16-pixel patches (a 14x14 grid, 196 patches in total). Self-attention captures global relationships across the image. Requires more data to converge.
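The patch arithmetic behind this description can be checked with a few lines of Python. This is an illustrative sketch: the 3-channel RGB input is an assumption, and the class token that ViT models usually prepend is ignored.

```python
IMAGE_SIZE = 224  # input resolution from the description above
PATCH_SIZE = 16   # patch edge length in pixels
CHANNELS = 3      # assumption: RGB input

# Number of patches along one image edge, and in the whole image.
patches_per_side = IMAGE_SIZE // PATCH_SIZE
num_patches = patches_per_side ** 2

# Raw pixel values flattened into each patch before linear projection.
values_per_patch = PATCH_SIZE * PATCH_SIZE * CHANNELS

# Self-attention compares every patch with every patch.
attention_pairs = num_patches ** 2

print(patches_per_side, num_patches, values_per_patch, attention_pairs)
```

The quadratic growth of `attention_pairs` with the number of patches is what lets every patch attend to every other patch, which is the "global relationships" property mentioned above.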
Winner
CNN
CNN outperforms ViT on this dataset (83.2% vs 81.7% accuracy). ViT typically needs much larger datasets (100k+ images) to leverage its global attention advantage, and HAM10000 has roughly 10,000. CNNs are more data-efficient at smaller scales.
7 Diagnostic Classes — HAM10000
MEL — Melanoma
Malignant. Most dangerous skin cancer. Early detection critical. CNN F1: 0.78.
NV — Melanocytic Nevi
Benign moles. Most common class in dataset. CNN F1: 0.92.
BCC — Basal Cell Carcinoma
Most common skin cancer. Malignant but rarely metastasises. CNN F1: 0.81.
AKIEC — Actinic Keratosis
Pre-malignant. In HAM10000 this class also covers intraepithelial carcinoma (Bowen's disease). Actinic keratoses can progress to squamous cell carcinoma. CNN F1: 0.72.
BKL — Benign Keratosis
Benign. Seborrheic keratoses and solar lentigines. CNN F1: 0.79.
DF — Dermatofibroma
Benign fibrous nodule. Rare in dataset. Harder to classify. CNN F1: 0.68.
VASC — Vascular Lesions
Benign. Cherry angiomas and pyogenic granulomas. CNN F1: 0.74.
Class Imbalance
NV: 6,705 samples vs DF: 115. Weighted loss function used to compensate. Augmentation applied to minority classes.
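One common way to derive weights for a weighted loss is inverse class frequency. The sketch below is an assumption about how such weights could be computed, not the exact scheme used here. Only the NV and DF counts come from the text above; the other five counts are the published HAM10000 class totals.

```python
# Per-class image counts. NV and DF are stated above; the rest are the
# published HAM10000 totals (assumption: the full dataset was used).
counts = {
    "MEL": 1113,
    "NV": 6705,
    "BCC": 514,
    "AKIEC": 327,
    "BKL": 1099,
    "DF": 115,
    "VASC": 142,
}

total = sum(counts.values())
num_classes = len(counts)

# "Balanced" inverse-frequency weights: total / (num_classes * class_count).
# A class with exactly the average count gets weight 1.0.
weights = {label: total / (num_classes * n) for label, n in counts.items()}

for label, w in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{label:6s} count={counts[label]:5d} weight={w:.3f}")
```

With these weights a misclassified DF image contributes about 58 times as much to the loss as a misclassified NV image. The weights, ordered by class index, could then be passed to a weighted cross-entropy loss such as `torch.nn.CrossEntropyLoss(weight=...)`.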
Clinical Implications
What this model can do
Flag high-risk lesions for dermatologist review. Triage large screening volumes. Provide a second opinion for borderline cases. Support clinical decision-making in low-resource settings.
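As a rough illustration of the flagging step, the sketch below sums the predicted probabilities of the classes described above as malignant or pre-malignant and flags the image when that sum reaches a cut-off. The `needs_review` helper, the 0.20 threshold, and the example probabilities are all hypothetical; a real threshold would have to be chosen on validation data.

```python
# Classes the descriptions above mark as malignant or pre-malignant.
HIGH_RISK = {"MEL", "BCC", "AKIEC"}

REVIEW_THRESHOLD = 0.20  # hypothetical cut-off, not taken from the source


def needs_review(probs: dict, threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True when the summed high-risk probability reaches the threshold."""
    risk = sum(p for label, p in probs.items() if label in HIGH_RISK)
    return risk >= threshold


# Made-up softmax outputs for two images.
mostly_benign = {"MEL": 0.02, "NV": 0.90, "BCC": 0.01, "AKIEC": 0.01,
                 "BKL": 0.04, "DF": 0.01, "VASC": 0.01}
borderline = {"MEL": 0.15, "NV": 0.70, "BCC": 0.04, "AKIEC": 0.03,
              "BKL": 0.05, "DF": 0.02, "VASC": 0.01}

print(needs_review(mostly_benign), needs_review(borderline))
```

Summing over the high-risk group, rather than taking the top-1 class, means a lesion predicted as NV can still be flagged when the combined malignant probability is non-trivial.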
What it cannot replace
Clinical examination, dermoscopy, and biopsy remain the gold standard. The model is a screening tool, not a diagnostic decision-maker.
Key limitations
Trained on curated dermoscopic images, so performance may degrade on smartphone photos. Class imbalance reduces recall on rare classes. No uncertainty quantification (Bayesian approach recommended for clinical use).
Model Performance Summary