IEEE Transactions on Artificial Intelligence · 2026

BanglaLiteFormer: Efficient Sentiment Classification of Bangla E-Commerce Reviews

Tanjin Adnan Abir, Student Member, IEEE  ·  Md. Tanvir Mahmud Himel, Student Member, IEEE  ·  Shukdev Datta, Student Member, IEEE

97.00% Test Accuracy
99.48% Avg Confidence
4,370 Augmented Samples
~150K Parameters

Understanding Bengali-speaking consumers' perspectives through sentiment analysis of Bangla e-commerce product reviews is essential for informed business decision-making. In this research, we propose BanglaLiteFormer, a lightweight transformer architecture for sentiment analysis of Bangla e-commerce product reviews.

We started from an existing dataset of 1,631 reviews. To overcome this size constraint, we applied several data augmentation strategies: template-based generation; neural paraphrasing with pretrained multilingual models (mT5, mBART, and BanglaT5); back-translation; and supervised fine-tuning of LLMs with semantic filtering. This augmentation pipeline expanded the dataset to 4,370 samples.

BanglaLiteFormer combines token and positional embeddings, multi-head self-attention mechanisms, and hybrid pooling strategies to capture both explicit and implicit attitudes in user-generated content. We achieved 97.00% test accuracy with near-perfect prediction confidence (99.48%), outperforming Bangla-BERT, GRU, LSTM, and transformer ensemble methods.

Sentiment Analysis Transformer NLP Bangla / Bengali E-Commerce Low-Resource NLP Data Augmentation TensorFlow / Keras

End-to-End Pipeline

1
Data Collection
1,631 hand-labeled Bangla e-commerce reviews
2
Augmentation
Template, paraphrase, back-translate, LLM fine-tune
3
Preprocessing
Normalize, filter, stopword removal
4
Tokenization
Vocab 8,000 · maxlen 80
5
Model Training
BanglaLiteFormer + early stopping
6
Evaluation
97% accuracy · F1 0.97
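The tokenization stage above (vocabulary of 8,000, sequences padded to length 80) can be sketched with Keras's `TextVectorization` layer; the paper does not specify its exact tokenizer, so this is one plausible implementation, and the two-review corpus is purely illustrative:

```python
import tensorflow as tf

# Tokenization sketch: 8,000-word vocabulary, pad/truncate to 80 tokens.
vectorizer = tf.keras.layers.TextVectorization(
    max_tokens=8000, output_mode="int", output_sequence_length=80)

texts = ["পণ্যটি ভালো", "পণ্যটি খারাপ"]  # tiny illustrative corpus
vectorizer.adapt(texts)                  # build the vocabulary from the corpus
X = vectorizer(texts)                    # integer id matrix, shape (2, 80)
print(X.shape)
```

The resulting integer matrix is what feeds the model's embedding layer in step 5.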

Data Augmentation Strategy

To overcome data scarcity in the low-resource Bangla setting, we developed a four-stage hybrid synthetic data generation framework:

🧩 Template Synthesis

Polarity-specific lexical markers combined with authentic review fragments for controlled concatenation.

"পণ্যটি ভালো" → "পণ্যটি সত্যিই অসাধারণ"

🔀 Neural Paraphrasing

mT5, mBART, BanglaT5 with top-k and nucleus sampling for linguistic variability.

"এই পণ্যটি খুব ভালো" → "পণ্যটি সত্যিই প্রশংসনীয়"

🌐 Back-Translation

Bangla → English → Bangla via NMT models generating semantically equivalent but linguistically diverse samples.

Bangla → EN → Bangla

🤖 LLM Fine-Tuning

Qwen 2.5–7B and Bangla-GPT fine-tuned to generate realistic reviews, filtered by semantic similarity.

Qwen 2.5-7B + Bangla-GPT
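The semantic-similarity filter applied to LLM-generated candidates can be sketched as follows. The paper presumably filters with neural sentence embeddings; the bag-of-words cosine and the thresholds here are illustrative stand-ins, keeping paraphrases that are close in meaning but not near-duplicates:

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors (toy proxy for
    neural sentence-embedding similarity)."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va)
    na = sqrt(sum(v * v for v in va.values()))
    nb = sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def filter_augmented(source, candidates, low=0.5, high=0.95):
    """Keep candidates semantically close to the source review but drop
    near-duplicates. The low/high thresholds are assumptions."""
    return [c for c in candidates if low <= cosine(source, c) < high]
```

For example, filtering candidates for "এই পণ্যটি খুব ভালো" would drop an exact copy (similarity 1.0) and an unrelated sentence, keeping only genuine paraphrases.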

Text Preprocessing

A domain-specific preprocessing pipeline tailored to Bangla's morphological and orthographic properties:

1
Type validation — convert all inputs to string type
2
Whitespace removal — regex-based elimination of tabs, newlines, extra spaces
3
Character filtering — retain Bangla (U+0980–U+09FF), English, digits; remove punctuation & emoji
4
Lowercase — normalize English characters to avoid duplicate tokens
5
Stopword removal — custom Bangla stopword list (এবং, যে, এই, …)
6
Rejoin — cleaned tokens rejoined for tokenization
"এই প্রোডাক্টটা অনেক ভালো!!! 😊😊"
↓ after preprocessing
"প্রোডাক্টটা ভালো"

Dataset Statistics

Dataset Growth
Original 1,631
After Augmentation 4,370
📈 +167.9% dataset growth via augmentation strategies
Sentiment Distribution
49/51 Neg / Pos
Positive 51%
Negative 49%
Well-balanced · minimal class bias ✓

BanglaLiteFormer Architecture

A purposefully lightweight single-block transformer designed to balance representational capacity against overfitting risk on small, domain-specific datasets. ~150,000 parameters vs. Bangla-BERT's ~110M — with a 27× lower generalization error bound.

Output
Softmax Output 2 classes
Dense (ReLU) + Dropout 32 neurons
Feature Aggregation
Concatenate [Avg ‖ Max] embed_dim × 2
Global Avg Pool embed_dim
Global Max Pool embed_dim
Transformer Block
LayerNorm + Residual maxlen × 32
Feed-Forward (ReLU) ff_dim=32
LayerNorm + Residual maxlen × 32
Multi-Head Attention heads=2, dk=32
Dropout (0.2) maxlen × 32
Embedding
Token + Positional Embedding maxlen × 32
Input (Padded Tokens) maxlen=80
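The stack above can be sketched in Keras. Details such as the LayerNorm epsilon, the second feed-forward projection back to the embedding width, and dropout placement are assumptions; the paper's exact implementation may differ:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

MAXLEN, VOCAB, EMBED, HEADS, FF = 80, 8000, 32, 2, 32

class TokenAndPositionEmbedding(layers.Layer):
    """Sum of learned token and position embeddings (bottom of the diagram)."""
    def __init__(self, maxlen, vocab_size, embed_dim, **kwargs):
        super().__init__(**kwargs)
        self.token_emb = layers.Embedding(vocab_size, embed_dim)
        self.pos_emb = layers.Embedding(maxlen, embed_dim)
        self.maxlen = maxlen

    def call(self, x):
        positions = tf.range(start=0, limit=self.maxlen)
        return self.token_emb(x) + self.pos_emb(positions)

def build_bangla_lite_former():
    inp = layers.Input(shape=(MAXLEN,), dtype="int32")
    x = TokenAndPositionEmbedding(MAXLEN, VOCAB, EMBED)(inp)
    # Single transformer block: attention -> add & norm -> FFN -> add & norm
    attn = layers.MultiHeadAttention(num_heads=HEADS, key_dim=EMBED)(x, x)
    x = layers.LayerNormalization(epsilon=1e-6)(x + layers.Dropout(0.2)(attn))
    ffn = layers.Dense(FF, activation="relu")(x)
    ffn = layers.Dense(EMBED)(ffn)        # project back for the residual (assumed)
    x = layers.LayerNormalization(epsilon=1e-6)(x + layers.Dropout(0.2)(ffn))
    # Hybrid pooling: concatenate global average and max pools (embed_dim * 2)
    h = layers.Concatenate()([layers.GlobalAveragePooling1D()(x),
                              layers.GlobalMaxPooling1D()(x)])
    h = layers.Dropout(0.3)(h)
    h = layers.Dense(32, activation="relu")(h)
    out = layers.Dense(2, activation="softmax")(h)
    return tf.keras.Model(inp, out)

model = build_bangla_lite_former()
probs = model(np.zeros((1, MAXLEN), dtype="int32")).numpy()  # shape (1, 2)
```

The two pooling paths read the same transformer output, so the concatenation adds no parameters, only the 64-wide input to the dense head.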

Key Design Decisions

🎯

Hybrid Pooling Strategy

Combines global average pooling (overall sentiment consistency) with global max pooling (prominent emotional signals) — outperforming either alone.

144× Lower Attention Cost Than BERT

With dk=32 and h=2, per-layer attention operations total 409,600, versus 58,982,400 for BERT-base, enabling real-time deployment.

🧠

Single Transformer Block

Theoretically justified by statistical learning theory — generalization error bound ~27× lower than Bangla-BERT on our dataset size.
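The 144× attention-cost ratio is easy to check arithmetically. The count below uses a seq_len² × d_k × heads accounting for the score/value multiplies; the exact accounting behind the quoted BERT-base figure is not spelled out on this page, so only the ratio is verified:

```python
def attn_ops(seq_len, d_k, heads):
    """Attention multiply-accumulates per layer: seq_len^2 * d_k * heads."""
    return seq_len ** 2 * d_k * heads

lite = attn_ops(80, 32, 2)       # BanglaLiteFormer: 409,600
bert = 58_982_400                # BERT-base figure quoted in the text
print(lite, bert // lite)        # 409600 144
```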

Hyperparameters

| Hyperparameter | Value |
|---|---|
| Embedding dim | 32 |
| Attention heads | 2 |
| Feed-forward dim | 32 |
| Transformer depth | 1 block |
| Max sequence length | 80 |
| Vocabulary size | 8,000 |
| Dropout | 0.2–0.3 |
| Optimizer | Adam |
| Learning rate | 0.001 |
| Batch size | 24 |
| Max epochs | 50 |
| Early-stopping patience | 10 |
| Dense layer | 32 units, ReLU |
| Output | 2 units, Softmax |
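The optimizer and early-stopping settings from the table translate directly to Keras. The monitored metric and `restore_best_weights` are assumptions not stated in the table:

```python
import tensorflow as tf

# Training configuration per the hyperparameter table above.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_accuracy",        # assumed monitoring target
    patience=10,
    restore_best_weights=True)     # assumed, a common default choice

# Typical usage, with model / X_train / y_train defined elsewhere:
# model.compile(optimizer=optimizer,
#               loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           batch_size=24, epochs=50, callbacks=[early_stop])
```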

Experimental Results

97.00%
Test Accuracy
0.9706
Positive F1-Score
0.9692
Negative F1-Score
BanglaLiteFormer · Test Set (1/3)

| Metric | Value |
|---|---|
| Accuracy | 97.00% |
| Precision | 97.00% |
| Recall | 97.00% |
| F1-Score (Weighted) | 97.00% |
| Avg Prediction Confidence | 99.48% |
| Validation Accuracy | 98.57% |
Test Set · 632 Samples (2/3)

Confusion matrix (rows: actual, columns: predicted):

| | Negative | Positive |
|---|---|---|
| Negative | 299 (TN) | 10 (FP) |
| Positive | 9 (FN) | 314 (TP) |

613 correct · 19 errors · ≈3% error rate
Per-Class Breakdown (3/3)

| Class | Support | Precision | Recall | F1-Score |
|---|---|---|---|---|
| 😠 Negative | 309 | 0.9708 | 0.9676 | 0.9692 |
| 😊 Positive | 323 | 0.9691 | 0.9721 | 0.9706 |

Comparison with Ahmed et al. (2023) — Retrained on Our Dataset

| Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| GRU (Word2Vec) [Ahmed et al.] | 95.72% | 95.74% | 95.72% | 95.72% |
| BiLSTM (Word2Vec) [Ahmed et al.] | 95.24% | 95.26% | 95.24% | 95.24% |
| Bangla-BERT [Ahmed et al.] | 96.20% | 96.26% | 96.20% | 96.20% |
| **BanglaLiteFormer (Ours)** | **97.00%** | **97.00%** | **97.00%** | **97.00%** |

Comparison with Hoque et al. (2024) — Retrained on Our Dataset

| Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| mBERT [Hoque et al.] | 93.83% | 93.90% | 93.83% | 93.83% |
| XLM-R-base [Hoque et al.] | 95.14% | 95.25% | 95.14% | 95.14% |
| BanglaBERT [Hoque et al.] | 96.33% | 96.32% | 96.33% | 96.32% |
| Transformer Ensemble [Hoque et al.] | 96.68% | 96.77% | 96.68% | 96.68% |
| **BanglaLiteFormer (Ours)** | **97.00%** | **97.00%** | **97.00%** | **97.00%** |

Training vs. Validation Accuracy — All Models

| Model | Train Acc. | Val Acc. |
|---|---|---|
| GRU [Ahmed et al.] | 98.70% | 96.17% |
| BiLSTM [Ahmed et al.] | 98.39% | 95.64% |
| Bangla-BERT [Ahmed et al.] | 99.97% | 96.44% |
| mBERT [Hoque et al.] | 99.93% | 94.52% |
| BanglaBERT [Hoque et al.] | 99.97% | 95.95% |
| XLM-R-base [Hoque et al.] | 99.83% | 95.48% |
| Transformer Ensemble [Hoque et al.] | 99.97% | 96.43% |
| **BanglaLiteFormer (Ours)** | **100.00%** | **98.57%** |

Bangla Sentiment Analyzer

An interactive Gradio-powered web application that deploys BanglaLiteFormer for real-time sentiment classification. Type or select a Bangla e-commerce review and watch the model analyze it in milliseconds.


Research Team

TAA
Tanjin Adnan Abir
M.Sc. Applied Physics & Electronics
Jahangirnagar University, Savar, Dhaka
Student Member, IEEE
TMH
Md. Tanvir Mahmud Himel
B.Sc. Computer Science & Engineering
Comilla University, Comilla, Bangladesh
Student Member, IEEE
SD
Shukdev Datta
M.Sc. Computer Science & Engineering
University of Dhaka, Bangladesh
Student Member, IEEE

Summary & Future Directions

This paper presented BanglaLiteFormer, a comprehensive and efficient framework for sentiment analysis of Bangla e-commerce reviews designed specifically for low-resource settings. By combining multi-strategy data augmentation, domain-specific preprocessing, and a purposefully lightweight transformer architecture, we demonstrated that state-of-the-art performance is achievable even with limited annotated training data.

Our model attained 97.0% test accuracy with balanced F1 scores across both sentiment classes, outperforming well-established architectures including Bangla-BERT, bidirectional GRU, bidirectional LSTM, and a five-model transformer ensemble — all while maintaining approximately 150,000 parameters compared to BERT's 110 million. The 99.48% average prediction confidence further validates that the model produces well-calibrated, reliable outputs on unseen user-generated content.

A key theoretical insight of this work is that generalization error scales with the ratio of parameters to training samples. Our single-block architecture achieves a roughly 27× lower generalization error bound than Bangla-BERT on the same dataset, which directly explains its superior practical performance despite fewer parameters. This finding has broad implications for NLP in low-resource language settings beyond Bangla.
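The ~27× figure is consistent with a standard heuristic in which the generalization gap scales as the square root of parameters over training samples; the paper's exact bound is not reproduced here, but the ratio of the two bounds on the same dataset then depends only on the parameter counts:

```python
from math import sqrt

# sqrt(P / n) heuristic: on a shared dataset, the bound ratio reduces to
# sqrt(P_bert / P_lite) = sqrt(110M / 150K)
ratio = sqrt(110_000_000 / 150_000)
print(round(ratio, 1))  # ≈ 27.1, matching the ~27x figure in the text
```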

Key Contributions

1

Hybrid augmentation pipeline — template synthesis, neural paraphrasing (mT5, mBART, BanglaT5), back-translation, and LLM fine-tuning (Qwen 2.5–7B, Bangla-GPT) expanded 1,631 samples to 4,370 while preserving sentiment fidelity.

2

Domain-specific preprocessing — a custom Bangla pipeline handling Unicode normalization, emoji removal, and stopword filtering tailored to e-commerce register.

3

BanglaLiteFormer architecture — single transformer block with hybrid avg+max pooling, providing 144× less attention computation than BERT-base while exceeding its performance on this task.

4

Theoretical justification — statistical learning theory formalization of why lightweight architectures outperform over-parameterized models on small domain-specific datasets.

5

Deployable Gradio application — an end-to-end real-time inference pipeline with live latency measurement, confidence scoring, and uncertainty flagging.

⚠️

Limitations

Binary classification only — the model assigns a single polarity label and cannot capture neutral, mixed, or nuanced multi-class sentiment gradations.

Synthetic augmentation risks — despite semantic filtering, neural paraphrasing may introduce subtle domain inconsistencies or amplify existing biases.

Single domain — trained on e-commerce product reviews; generalization to political text, news, or social media is not validated.

No aspect-level analysis — the model classifies the entire review without identifying which specific aspects (price, quality, delivery) are positive or negative.

Sarcasm and irony — the architecture has no pragmatic reasoning and is susceptible to misclassifying ironic positive-surface text.

🔭

Future Directions

Aspect-based sentiment analysis — extending BanglaLiteFormer to identify and score sentiment at the aspect level (product quality, delivery speed, pricing).

Emotion detection — moving beyond binary polarity to multi-class emotion labels (joy, anger, disappointment, surprise) for richer customer insight.

Cross-lingual transfer — leveraging multilingual pre-training to extend the framework to other low-resource South and Southeast Asian languages.

Code-mixed robustness — dedicated handling of Banglish (Bangla–English code-switching) which is pervasive in real social media and e-commerce content.

Larger annotated corpora — multi-source dataset construction with aspect-level and emotion annotations to push performance boundaries further.

"This study demonstrates that state-of-the-art performance can be obtained in a limited-resource scenario utilising a precisely built, lightweight transformer, domain-specific pre-processing, and deliberate data augmentation."

— BanglaLiteFormer, IEEE Transactions on Artificial Intelligence, 2026
