
msmadi/ABSA-Hotels

ABSA-Hotels

A. Reference Dataset Data Collection and Description

The dataset used in this research was prepared to support the Arabic track of Task 5 (Aspect-Based Sentiment Analysis) of the Semantic Evaluation Workshop 2016 (SemEval-2016). It is a subset of the hotel reviews collected in [20]. The authors of this research thoroughly reviewed around 15,562 hotel reviews and selected a subset of 2,291. The original dataset was collected from well-known hotel booking websites such as Booking.com and TripAdvisor.com. The selected reviews cover hotels in different Arab cities such as Dubai, Mecca, Amman, and Beirut.

In total, the final annotated dataset consists of 24,028 ABSA annotated tuples, split into training (19,226) and testing (4,802) sets. The dataset was annotated at both the text level (2,291 review texts) and the sentence level (6,029 annotated sentences). Table 1 summarizes the dataset size and its distribution over the ABSA research tasks.
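As a quick sanity check on the figures above, the following minimal sketch confirms that the training and testing tuple counts add up to the reported total and computes each split's share. The constant and function names are illustrative only and are not part of the dataset or its tooling.

```python
# Tuple counts as reported in the dataset description above.
TRAIN_TUPLES = 19_226
TEST_TUPLES = 4_802
TOTAL_TUPLES = 24_028


def split_shares(train: int, test: int) -> tuple[float, float]:
    """Return the (train, test) fractions of the combined tuple count."""
    total = train + test
    return train / total, test / total


# The two splits should account for every annotated tuple.
assert TRAIN_TUPLES + TEST_TUPLES == TOTAL_TUPLES

train_share, test_share = split_shares(TRAIN_TUPLES, TEST_TUPLES)
print(f"train: {train_share:.1%}, test: {test_share:.1%}")
```

The split works out to roughly 80% training and 20% testing tuples.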

For the Hotels domain, the SemEval-ABSA16 annotation guidelines define the following entities:

HOTEL: for opinions evaluating the hotel as a whole or in terms of the lack or presence of extra features/facilities.

ROOMS: for opinions evaluating the rooms in terms of their size, general condition, view, furniture, bathroom, sleep quality and the lack or presence of extra features/amenities.

ROOM-AMENITIES: for opinions evaluating the rooms in terms of the amenities they include (e.g. air conditioning, refrigerator, microwave, mini bar, hair dryer, TV, toiletries, safe, balcony, coffee maker, linen).

FACILITIES: for opinions focusing on the hotel facilities in terms of specific installations/areas (e.g. swimming pool, spa & sauna, beauty salon, restaurants, cafe, etc.) or guest services offered by a hotel (e.g. shuttle, laundry, babysitting or wake-up services, sports activities, 24-hour concierge front desk, etc.).

SERVICE: for opinions focusing on the staff's attitude and promptness, ease of problem solving, timely execution of service, or the room, check-in, check-out, and reception service, etc.

LOCATION: for opinions focusing on the location of the reviewed hotel in terms of its position, the surroundings, the view, etc.

FOOD&DRINKS: for opinions focusing on the breakfast, the food and the drinks in general or in terms of specific dishes and drinks, dining/drinking options etc.
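The entity list above can be captured as a small lookup for validating annotation labels. The sketch below is illustrative only: it assumes category labels follow the `ENTITY#ATTRIBUTE` pattern used in the SemEval-2016 ABSA guidelines, and it spells the entities as they appear in this README (the released files may spell them differently, for example with underscores). `HOTEL_ENTITIES`, `parse_category`, and the example attribute `CLEANLINESS` are hypothetical and not part of the dataset's tooling.

```python
# Entity names spelled as in this README; the data files may differ (assumption).
HOTEL_ENTITIES = frozenset({
    "HOTEL",
    "ROOMS",
    "ROOM-AMENITIES",
    "FACILITIES",
    "SERVICE",
    "LOCATION",
    "FOOD&DRINKS",
})


def parse_category(label: str) -> tuple[str, str]:
    """Split an 'ENTITY#ATTRIBUTE' label and check the entity is a known one."""
    entity, sep, attribute = label.partition("#")
    if not sep or not attribute:
        raise ValueError(f"expected 'ENTITY#ATTRIBUTE', got {label!r}")
    if entity not in HOTEL_ENTITIES:
        raise ValueError(f"unknown hotel entity: {entity!r}")
    return entity, attribute


# Hypothetical label used purely for illustration.
print(parse_category("ROOMS#CLEANLINESS"))
```

Rejecting unknown entities early makes label-spelling mismatches visible before they silently skew per-category statistics.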

References:

If you use this dataset, please cite the following research:

  1. Mohammad, A. S., Qwasmeh, O., Talafha, B., Al-Ayyoub, M., Jararweh, Y., & Benkhelifa, E. (2016, December). An enhanced framework for aspect-based sentiment analysis of Hotels' reviews: Arabic reviews case study. In 2016 11th International Conference for Internet Technology and Secured Transactions (ICITST) (pp. 98-103). IEEE.
  2. Al-Smadi, M., Talafha, B., Al-Ayyoub, M., & Jararweh, Y. (2019). Using long short-term memory deep neural networks for aspect-based sentiment analysis of Arabic reviews. International Journal of Machine Learning and Cybernetics, 10(8), 2163-2175.
  3. Pontiki, M., Galanis, D., Papageorgiou, H., Androutsopoulos, I., Manandhar, S., Mohammad, A. S., ... & Hoste, V. (2016, June). SemEval-2016 Task 5: Aspect based sentiment analysis. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016) (pp. 19-30).

About

SemEval-2016 Task 5: Aspect-Based Sentiment Analysis. Arabic dataset of hotel reviews.
