---
title: Sentiment Model Comparison
emoji: 🚀
colorFrom: pink
colorTo: indigo
sdk: streamlit
sdk_version: 5.37.0
app_file: app.py
pinned: false
license: mit
short_description: Compare sentiment predictions from two deep learning models
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# 📊 Sentiment Model Comparison App

This Streamlit app compares two sentiment classification models trained on IMDB movie reviews.

- Model A: ~6.6M trainable params, 50k vocab (fast & lightweight)
- Model B: ~34M trainable params, 256k vocab (high capacity)
- Ensemble: Average of both predictions

🔗 **Live Demo:** [Try it on Spaces](https://huggingface.co/spaces/Daksh0505/sentiment-model-comparison)

---

## 🔍 Features

- Enter a single review or upload a CSV (with a `review` column)
- Get predictions from both models plus the ensemble average
- Compare probabilities visually
- Submit feedback (saved to Google Sheets)

## 🧠 Models

### 🔹 Model A

- Filename: `sentiment_model_imdb_6.6M.keras`
- **Trainable Parameters**: ~6.6 million
- **Total Parameters**: ~13.06 million
- **Vocabulary Size**: 50,000 tokens
- Description: Lightweight and efficient; optimized for speed.

### 🔹 Model B

- Filename: `sentiment_model_imdb_34M.keras`
- **Trainable Parameters**: ~34 million
- **Total Parameters**: ~99.43 million
- **Vocabulary Size**: 256,000 tokens
- Description: Larger and more expressive; higher accuracy on nuanced reviews.
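
The ensemble listed in the overview is described only as the average of both predictions. A minimal sketch of that step follows, assuming each model yields a single positive-class probability in [0, 1]; the function names and the 0.5 decision threshold are hypothetical and not taken from `app.py`.

```python
def ensemble_average(prob_a: float, prob_b: float) -> float:
    """Unweighted mean of two positive-class probabilities."""
    for p in (prob_a, prob_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return (prob_a + prob_b) / 2.0


def to_label(prob: float, threshold: float = 0.5) -> str:
    """Map a probability to a label (threshold is an assumed default)."""
    return "positive" if prob >= threshold else "negative"


if __name__ == "__main__":
    # Illustrative inputs only, not real model outputs.
    combined = ensemble_average(0.875, 0.625)
    print(combined, to_label(combined))
```

An unweighted mean treats both models as equally reliable; a weighted mean would be a natural variation if one model proves more accurate on held-out reviews.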

---

## 🗂 Tokenizers

Each model uses its own tokenizer, stored in Keras JSON format:

- `tokenizer_50k.json` → used with Model A
- `tokenizer_256k.json` → used with Model B

---

## 🔧 Load Models & Tokenizers (from Hugging Face Hub)

```python
import json

from huggingface_hub import hf_hub_download
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.text import tokenizer_from_json

REPO_ID = "Daksh0505/sentiment-model-imdb"

# === Model A ===
model_path_a = hf_hub_download(repo_id=REPO_ID, filename="sentiment_model_imdb_6.6M.keras")
tokenizer_path_a = hf_hub_download(repo_id=REPO_ID, filename="tokenizer_50k.json")

# tokenizer_from_json takes a JSON string, so this assumes the file
# stores the string returned by Tokenizer.to_json().
with open(tokenizer_path_a, "r", encoding="utf-8") as f:
    tokenizer_a = tokenizer_from_json(json.load(f))

model_a = load_model(model_path_a)

# === Model B ===
model_path_b = hf_hub_download(repo_id=REPO_ID, filename="sentiment_model_imdb_34M.keras")
tokenizer_path_b = hf_hub_download(repo_id=REPO_ID, filename="tokenizer_256k.json")

with open(tokenizer_path_b, "r", encoding="utf-8") as f:
    tokenizer_b = tokenizer_from_json(json.load(f))

model_b = load_model(model_path_b)
```

---

## 📁 Dataset

- **Source:** [IMDB Multi-Movie Dataset](https://huggingface.co/datasets/Daksh0505/IMDB-Reviews)

## Citation

If you use this dataset, please cite it:

```bibtex
@misc{imdb-multimovie-reviews,
  title  = {IMDb Multi-Movie Review Dataset},
  author = {Daksh Bhardwaj},
  year   = {2025},
  url    = {https://huggingface.co/datasets/Daksh0505/IMDB-Reviews},
  note   = {Accessed: 2025-07-17}
}
```

---
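
Once a model and its tokenizer are loaded as in the loading section, scoring a batch of reviews could look roughly like the sketch below. It rests on several assumptions not stated in this README: `MAX_LEN` is a hypothetical sequence length that must match whatever was used in training, padding follows the Keras `pad_sequences` defaults (left-pad, left-truncate), and each model ends in a single sigmoid unit. The helper names are illustrative.

```python
import numpy as np

MAX_LEN = 200  # hypothetical; replace with the length used during training


def pad_left(sequences, maxlen):
    """Left-pad with zeros and keep the last `maxlen` tokens of each sequence,
    mirroring the default behaviour of Keras `pad_sequences`."""
    out = np.zeros((len(sequences), maxlen), dtype="int32")
    for i, seq in enumerate(sequences):
        seq = list(seq)[-maxlen:]
        if seq:  # an empty review stays all zeros
            out[i, -len(seq):] = seq
    return out


def predict_positive(model, tokenizer, texts, maxlen=MAX_LEN):
    """Return one positive-class probability per review.

    Assumes `model.predict` returns shape (n, 1) from a sigmoid output.
    """
    sequences = tokenizer.texts_to_sequences(texts)
    batch = pad_left(sequences, maxlen)
    probs = model.predict(batch, verbose=0)
    return np.asarray(probs, dtype="float64").reshape(-1)
```

With the objects from the loading section, `predict_positive(model_a, tokenizer_a, ["Great film!"])` would return a one-element array; the same call with `model_b` and `tokenizer_b` gives the second model's score. Each model must be paired with its own tokenizer, since the vocabularies differ.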