Abstract
This paper presents a dataset containing tweets posted in reaction to the release of 50 movies. The dataset can be used to gain insight into the conversation generated around a particular movie, and it is particularly suitable for sentiment analysis and other NLP techniques. It contains approximately 2.5 million tweets with their related metadata, and each movie's IMDb rating is included. The movies are the 25 releases with the highest number of votes in each of 2020 and 2021. The collected tweets represent the reactions of the Twitter community during the first week after each movie's US release date. The number of tweets per movie ranges from 1,000 to approximately 200,000, with an average of 50,000 per release. We used the Internet Archive Wayback Machine to retrieve each movie's IMDb rating as of one week after its US release date. The tweets and related metadata were collected using the Tweet Downloader tool.
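As a rough illustration of the kind of per-movie analysis the abstract has in mind, the sketch below groups tweets by movie, counts them, and computes a naive lexicon-based polarity next to each movie's IMDb rating. The record fields (`movie`, `text`), the sample tweets, the ratings, and the word lists are all hypothetical placeholders. They do not reflect the dataset's actual schema or contents, and a real study would likely swap the toy lexicon for a proper sentiment model.

```python
from collections import defaultdict

# Hypothetical records: field names and values are illustrative assumptions,
# not the dataset's actual schema or contents.
tweets = [
    {"movie": "Movie A", "text": "Loved it, great cast"},
    {"movie": "Movie A", "text": "Boring and bad pacing"},
    {"movie": "Movie B", "text": "Great visuals, great score"},
]
imdb_rating = {"Movie A": 6.5, "Movie B": 7.8}  # made-up ratings

# Toy lexicon standing in for a real sentiment model.
POSITIVE = {"loved", "great", "good"}
NEGATIVE = {"boring", "bad", "awful"}


def polarity(text):
    """Return (number of positive words) minus (number of negative words)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def summarize(records):
    """Group tweets by movie and return {movie: (tweet_count, mean_polarity)}."""
    scores = defaultdict(list)
    for record in records:
        scores[record["movie"]].append(polarity(record["text"]))
    return {movie: (len(s), sum(s) / len(s)) for movie, s in scores.items()}


summary = summarize(tweets)
for movie, (count, mean) in summary.items():
    print(movie, count, round(mean, 2), imdb_rating[movie])
```

With the real data, the same grouping step would make it possible to compare first-week tweet volume and sentiment against the IMDb rating captured one week after release.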
Original language | English |
---|---|
Publisher | Preprints.org |
Publication status | Published - Jun 2022 |
Keywords
- dataset
- tweets
- IMDb ratings
- movies
- sentiment analysis
- NLP
Datasets
- A dataset containing tweets and their meta data for understanding social media conversations around movies during their release
  Michielsen, J. (Creator) & Lelli, F. (Creator), DataverseNL, 13 Sept 2022
  DOI: 10.34894/6weaur, https://dataverse.nl/citation?persistentId=doi:10.34894/6WEAUR
  Type: Dataset