	---
	license: mit
	datasets:
	- Aditya1010/17k-hotel-reviews-dataset
	metrics:
	- accuracy
	base_model:
	- distilbert/distilbert-base-uncased
	pipeline_tag: text-classification
	library_name: transformers
	tags:
	- Sentiment Analysis
	- DistilBERT
	- Text Classification
	- Hotel Reviews
	---
	# Hotel Review Classifier

	This model classifies hotel reviews as positive or negative. It was fine-tuned from the [`distilbert-base-uncased`](https://huggingface.co/distilbert/distilbert-base-uncased) DistilBERT checkpoint on the [17k Hotel Reviews Dataset](https://huggingface.co/datasets/Aditya1010/17k-hotel-reviews-dataset).

	## Model Details
	- Model Type: DistilBERT-based model for sequence classification
	- Model Architecture: `distilbert-base-uncased`
	- Number of Parameters: Approximately 66M
	- Training Dataset: `17k-hotel-reviews-dataset`, which contains 17,000 hotel reviews labeled by sentiment (positive/negative).
	- Fine-Tuning Task: Sentiment analysis for hotel reviews (positive or negative sentiment)

	## Training Data
	- Dataset: [17k Hotel Reviews Dataset](https://huggingface.co/datasets/Aditya1010/17k-hotel-reviews-dataset)
	- Data Description: The dataset consists of 17,000 hotel reviews, each labeled with a sentiment (positive/negative).
	- Preprocessing: Reviews were cleaned to remove unwanted characters and URLs.
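
The cleaning step described above can be sketched as follows. This is a minimal illustration, not the preprocessing code actually used for this model: the regular expressions, the set of characters treated as "unwanted", and the `clean_review` helper are all assumptions, since the exact rules are not documented here.

```python
import re

# Hypothetical cleaning rules; the real preprocessing rules are not documented.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
# Assumption: keep letters, digits, whitespace, and basic punctuation only.
UNWANTED_PATTERN = re.compile(r"[^A-Za-z0-9\s.,!?'\-]")
WHITESPACE_PATTERN = re.compile(r"\s+")


def clean_review(text: str) -> str:
    """Remove URLs and unwanted characters, then collapse whitespace."""
    text = URL_PATTERN.sub(" ", text)
    text = UNWANTED_PATTERN.sub(" ", text)
    return WHITESPACE_PATTERN.sub(" ", text).strip()


print(clean_review("Great stay!! See https://example.com/pics #amazing ★★★"))
# prints: Great stay!! See amazing
```

URLs are stripped before the character filter so that fragments of a link (such as `https` or `example.com`) do not survive as stray tokens.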

	## Training Details
	- Training Framework: Hugging Face Transformers and PyTorch
	- Learning Rate: 2e-5
	- Epochs: 3
	- Batch Size: 16
	- Optimizer: AdamW
	- Training Time: Approximately 2 hours on a GPU
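
As a rough sanity check on these settings, the number of optimizer steps can be estimated from the batch size and epoch count. The sketch below assumes all 17,000 reviews were used for training, with a single device and no gradient accumulation; none of these details are stated above, so treat the result as an upper-bound estimate rather than the actual step count.

```python
import math

# Hyperparameters as listed in the model card.
EPOCHS = 3
BATCH_SIZE = 16

# Assumption: all 17,000 reviews are used for training (no held-out split).
NUM_TRAIN_EXAMPLES = 17_000


def total_optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Estimate optimizer steps, counting the final partial batch of each epoch."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


print(total_optimizer_steps(NUM_TRAIN_EXAMPLES, BATCH_SIZE, EPOCHS))
# prints: 3189
```

If part of the dataset was held out for evaluation, the real step count would be proportionally lower.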

	## Usage
	Run inference with the following code:

	```python
	from transformers import AutoModelForSequenceClassification, AutoTokenizer
	import torch

	# Load the fine-tuned model and tokenizer
	model = AutoModelForSequenceClassification.from_pretrained("kmack/HotelReviewClassifier")
	tokenizer = AutoTokenizer.from_pretrained("kmack/HotelReviewClassifier")

	# Example review for prediction
	review = "This is the best hotel I've ever stayed in!"

	# Tokenize the input text
	inputs = tokenizer(review, return_tensors="pt", padding=True, truncation=True)

	# Get predictions
	with torch.no_grad():
	    outputs = model(**inputs)

	# Get the predicted label (0 for negative, 1 for positive)
	prediction = torch.argmax(outputs.logits, dim=-1).item()
	print(f"Predicted sentiment: {'Positive' if prediction == 1 else 'Negative'}")
	```
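
The argmax in the snippet above returns only a label. If a confidence score is also useful, a softmax over the two logits gives class probabilities. The sketch below uses plain Python and hypothetical logit values so it runs without downloading the model; with real model output, `torch.softmax(outputs.logits, dim=-1)` performs the same computation.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits in the order [negative, positive]; not real model output.
example_logits = [-1.2, 2.3]

probs = softmax(example_logits)
label = "Positive" if probs[1] > probs[0] else "Negative"
print(f"{label} (confidence {max(probs):.2f})")
```

The index order (0 for negative, 1 for positive) follows the label mapping stated in the inference snippet.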

	## Citation

	If you use this model in your research, please cite the following:

	```bibtex
	@misc{hotel_review_classifier,
	  author = {Kmack},
	  title = {Hotel Review Classifier},
	  year = {2024},
	  url = {https://huggingface.co/kmack/HotelReviewClassifier}
	}
	```