From review filtering to topic modeling: a deep learning-based approach for analyzing negative online reviews of five-star hotels

Wang, Boyu; Ge, Junlian; Lu, Fangyuan

doi:10.1108/JHTT-04-2025-0328

Research Article | December 17 2025

Boyu Wang

School of Geography, Nanjing Normal University, Nanjing, China

Boyu Wang is a graduate student in the School of Geographical Science, Nanjing Normal University, China. His research interests include intelligent tourism, tourism management and hospitality. His master’s supervisor is Ge.

Junlian Ge

School of Geography, Nanjing Normal University, Nanjing, China

Junlian Ge is an Associate Professor in the School of Geographical Science, Nanjing Normal University, China. Her research focuses on tourism, in particular tourism management and tourism informatization. She also has several years of teaching experience in smart tourism education and managerial experience in the smart tourism industry.

Corresponding author Junlian Ge [email protected]

Fangyuan Lu

School of Geography, Nanjing Normal University, Nanjing, China

Fangyuan Lu is a doctoral student in the School of Geography and Ocean Science, Nanjing University. Her research interests include tourism geography and emotional geography.

Author & Article Information

Publisher: Emerald Publishing

Received: April 22 2025

Revision Received: September 05 2025

Revision Received: November 14 2025

Accepted: November 17 2025

Online ISSN: 1757-9899

Print ISSN: 1757-9880

Funding

This work was supported by Postgraduate Research and Practice Innovation Program of Jiangsu Province [SJCX24_0625]. The authors thank the funder for the financial support of the data collection.

2025 Emerald Publishing Limited. Licensed re-use rights only.

Journal of Hospitality and Tourism Technology 1–25.

https://doi.org/10.1108/JHTT-04-2025-0328

Purpose

This study aims to develop a robust analytical framework for identifying and interpreting negative user-generated reviews of five-star hotels. It addresses two limitations of existing practice: review filtering based solely on star ratings, and the subjectivity of manually interpreted topic models. It offers a deep learning–based solution aligned with national hospitality standards.

Design/methodology/approach

Using a data set of 124,381 user reviews of 70 five-star hotels in Jiangsu Province, collected from the Ctrip platform, this study applies a fine-tuned Chinese bidirectional encoder representations from transformers (BERT) model to detect negative reviews with high semantic accuracy. BERTopic is then used for topic modeling. To enhance domain relevance and interpretability, a high-quality semantic vocabulary is constructed from the national standard Classification and Accreditation for Star-Rated Tourist Hotels (GB/T 14308-2023). The extracted topics are mapped to this vocabulary to establish structured semantic alignment between customer feedback and industry evaluation dimensions.
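
The abstract does not say how topics are matched to the standard-based vocabulary, so the following is only a minimal sketch of one possible approach, assuming simple keyword overlap. The vocabulary entries, the topic keywords, the `map_topic` helper and the `min_overlap` threshold are all hypothetical illustrations; none of them come from GB/T 14308-2023 or from the study itself.

```python
# Hypothetical vocabulary: evaluation dimension -> indicative terms.
# These entries are illustrative placeholders only; they are NOT taken
# from GB/T 14308-2023 or from the paper.
STANDARD_VOCAB = {
    "room facilities": {"bed", "air conditioning", "shower", "tv"},
    "cleanliness": {"dirty", "stain", "hair", "dust"},
    "staff responsiveness": {"front desk", "slow", "waiting", "attitude"},
}


def map_topic(topic_terms, vocab=STANDARD_VOCAB, min_overlap=1):
    """Return the dimension sharing the most terms with the topic, or None.

    Assumption: overlap counting stands in for whatever semantic
    matching the authors actually used.
    """
    terms = {t.lower() for t in topic_terms}
    best_dim, best_score = None, 0
    for dim, dim_terms in vocab.items():
        score = len(terms & dim_terms)
        if score > best_score:  # ties keep the first dimension found
            best_dim, best_score = dim, score
    return best_dim if best_score >= min_overlap else None


# Hypothetical top keywords per topic, as a topic model might return them.
topics = {
    0: ["dirty", "hair", "sheets"],
    1: ["front desk", "waiting", "check-in"],
    2: ["parking", "navigation"],
}

mapping = {tid: map_topic(terms) for tid, terms in topics.items()}
unmapped = [tid for tid, dim in mapping.items() if dim is None]
```

Topics that share no terms with any dimension stay unmapped, which mirrors the idea that only part of the extracted topics can be aligned with the standard.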

Findings

The BERT model identified 18,578 negative reviews, significantly more than rating-based filters alone capture. From these, BERTopic extracted 437 topic clusters, 388 of which were successfully mapped to the standardized topic vocabulary. Results show that negative feedback is concentrated in key service areas such as room facilities, cleanliness, staff responsiveness and safety assurance. Notably, approximately 13% of high-rated (4–5 star) reviews also contained negative sentiment, exposing service blind spots hidden beneath favorable scores.
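
For orientation, the reported counts can be turned into proportions. The sketch below uses only the numbers given in this abstract; the derived shares are plain arithmetic and are not figures reported by the authors. The 13% figure is not recomputed because the abstract does not give the number of high-rated reviews.

```python
# Counts as reported in the abstract.
total_reviews = 124_381
negative_reviews = 18_578
topics_extracted = 437
topics_mapped = 388

# Derived values (simple arithmetic, not reported in the source).
negative_share = negative_reviews / total_reviews   # share of reviews flagged negative
mapped_share = topics_mapped / topics_extracted     # share of topics mapped to the vocabulary
unmapped_topics = topics_extracted - topics_mapped  # topics left without a mapping

print(f"Negative reviews: {negative_share:.1%}")    # about 14.9%
print(f"Mapped topics:    {mapped_share:.1%}")      # about 88.8%
print(f"Unmapped topics:  {unmapped_topics}")       # 49
```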

Research limitations/implications

This study focuses on Chinese-language five-star hotel reviews and applies a national standard (GB/T 14308-2023) for topic alignment, which may limit cross-regional generalizability. The reliance on full-review classification, rather than sentence-level sentiment separation, may overlook mixed-opinion nuances. Furthermore, the exclusion of reviews with model disagreement might introduce selection bias. Lastly, while ChatGPT and DeepSeek enhance topic validation, the lack of human adjudication may affect interpretive accuracy. Future research could adopt multilingual data sets, cross-standard mapping and hybrid annotation methods to improve adaptability and robustness.

Originality/value

This research pioneers the integration of deep semantic modeling (BERT and BERTopic) with standardized industry lexicons in the context of Chinese-language user reviews, offering a reproducible, interpretable and domain-aligned approach to analyzing hotel reviews. Beyond the luxury hospitality sector, the framework’s combination of deep learning classification, BERTopic clustering and semantic mapping to industry standards can be adapted to other service industries – such as healthcare, retail or transportation – where customer experience data is text-rich and domain-specific. The study introduces a dual-model cross-validation mechanism to ensure semantic rigor and presents a semantic mapping framework that bridges user sentiment data with operational evaluation systems, providing a scalable methodology for intelligent service optimization across diverse high-contact service environments.
