A Review of Models for Hydrating Large-scale Twitter Data of COVID-19-related Tweets for Transportation Research
Preprints are manuscripts made publicly available before they have been submitted for formal peer review and publication. They might contain new research findings or data. Preprints can be a draft or final version of an author's research but must not have been accepted for publication at the time of submission.
In response to the Coronavirus disease (COVID-19) outbreak and the Transportation Research Board’s (TRB) urgent need for work related to transportation and pandemics, this paper contributes with a sense of urgency and provides a starting point for research on the topic. The main goal of this paper is to support transportation researchers and the TRB community during this COVID-19 pandemic by reviewing the performance of software models used for extracting large-scale data from Twitter streams related to COVID-19. The study extends the previous research efforts in social media data mining by providing a review of contemporary tools, including their computing maturity and their potential usefulness. The paper also includes an open repository for the processed data frames to facilitate the quick development of new transportation research studies. The output of this work is recommended to be used by the TRB community when deciding to further investigate topics related to COVID-19 and social media data mining tools.