Safe-D: Safety through Disruption

Data Mining Twitter to Improve Automated Vehicle Safety



Automated vehicle technologies may significantly improve driving safety, but only if they are widely adopted and if drivers use them appropriately. Prior work suggests that intentions to adopt new technology and appropriately rely on it are often driven by the user’s expectations. In recent years, these expectations increasingly depend on news presented on social media. For example, recent polls suggest that the majority of Twitter users primarily use the site as a news source. The power of social media in creating and changing expectations suggests that it may be a disruptive tool for increasing the adoption and safe use of automated vehicle technology. In this project, we seek to understand the conversation about automated vehicles on Twitter through a network and natural language processing analysis. We further focus on responses and changes of opinion surrounding automated vehicle crashes. These analyses will identify a set of terms, key opinion generators, and hashtags that lead to the most accurate and positive responses to automated vehicles. In the final phase of the project, we will translate these findings into guidelines for automated vehicle crash responses to help public information officers structure their communications about crashes. Research has shown that avoiding misinformation and structuring communication lead to improved outcomes in emergencies, and thus we expect these guidelines to facilitate automated vehicle safety.

Project Highlights

  • Semi-supervised machine learning approaches are effective for identifying tweets relevant to vehicle crash events.
  • Tweets about Automated Vehicles include themes of Crashes, Fault & Safety, Market & Sales, Tech Companies, Electric Vehicles, and Public Transit.
  • Crash discussions begin and end shortly after crashes or major news events.
  • Public information officers (PIOs) should focus on efficiently conveying accurate information about AVs while maintaining a persistent interaction to educate the public on AV use and capabilities.
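
As an illustration of the highlight about crash discussions starting and ending shortly after an event, the following minimal sketch shows one way such a discussion window could be measured from tweet timestamps. It is not the project's analysis code: the function names `daily_counts` and `active_window`, the baseline of 2 tweets per day, the threshold factor of 3, and all timestamps are hypothetical values chosen for the example.

```python
from collections import Counter
from datetime import date, datetime


def daily_counts(timestamps):
    """Count tweets per calendar day from ISO 8601 timestamp strings."""
    return Counter(datetime.fromisoformat(ts).date() for ts in timestamps)


def active_window(counts, baseline, factor=3.0):
    """Return (first_day, last_day) where daily volume exceeds factor * baseline.

    Returns None if no day exceeds the threshold.
    """
    busy = sorted(day for day, n in counts.items() if n > factor * baseline)
    return (busy[0], busy[-1]) if busy else None


# Hypothetical timestamps around an assumed event on 2020-01-12 (not real data).
timestamps = (
    ["2020-01-10T12:00:00"] * 2
    + ["2020-01-11T09:30:00"] * 2
    + ["2020-01-12T08:00:00"] * 9
    + ["2020-01-13T10:15:00"] * 14
    + ["2020-01-14T16:45:00"] * 7
    + ["2020-01-15T11:00:00"] * 3
    + ["2020-01-16T11:00:00"] * 2
)

counts = daily_counts(timestamps)
# Assumed baseline of 2 tweets per day; days with more than 6 tweets count as active.
window = active_window(counts, baseline=2.0)
print(window)
```

In this toy data the active window covers the three days starting on the assumed event date. With real tweets, the baseline would need to be estimated from the period before the event rather than set by hand.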

Final Report

Final Report 04-098

EWD & T2 Products

One Pager for Public Information Officers: Social media guidelines for discussing automated vehicle incidents. The one-pager was sent to the following groups:

    • American Association of State Highway and Transportation Officials (AASHTO) – 106 members of the DOT Transportation Community group
    • TRB Public Engagement and Communications Committee – the 200-person email list, which includes committee members and friends

The project team created an application to teach high school students about data analysis with Twitter. The application is located at: The application is accompanied by a presentation to give students context and discuss pursuing a Ph.D. and a career in Transportation Research.

The project team also created a series of two in-class activities for undergraduate students in ISEN 413 Advanced Data Analytics, in which students were tasked with analyzing tweets after a recent Tesla Autopilot crash. The PDFs of the activities, assignments, and rubrics can be found under "Code Clustering" and "Group Assignment Deciphering Topic Models."

Student Impact Statement (pdf): The student(s) working on this project provided an impact statement describing what the project allowed them to learn/do/practice and how it benefited their education.


Jefferson, J. A., and McDonald, A.D. (2019). The automated vehicle social network: Analyzing tweets after a recent Tesla Autopilot crash. To be presented at the Human Factors and Ergonomics Society’s 2019 International Annual Meeting, Seattle, WA, October 2019.

Wei, R., Alambeigi, H., McDonald, A.D. (2020). Topic modeling social media data after fatal automated vehicle crashes. Presented at the Human Factors and Ergonomics Society’s 2020 International Annual Meeting, Virtual, October 2020.

Alambeigi, H., Smith, A., Wei, R., McDonald, A.D., Arachie, C., Huang, B. (2021). A Novel Approach to Social Media Guideline Design and Its Application to Automated Vehicle Events. Proceedings of the Human Factors and Ergonomics Society 65th Annual Meeting. (Submitted)

Alambeigi, H., Smith, A., Wei, R., McDonald, A.D., Arachie, C., Huang, B. (2021). A Novel Approach to Social Media Guideline Design and Its Application to Automated Vehicle Events. To be presented at the Human Factors and Ergonomics Society 65th Annual Meeting.

McDonald, A. D., Huang, B., Wei, R., Alambeigi, H., Arachie, C., Smith, A., Jefferson, J. (2021). Data mining Twitter to improve automated vehicle safety. US DOT SAFE-D National Transportation Research Center. downloads/04-098-data-mining-twitter-to-improve-automated-vehicle-safety/

Final Dataset

The Twitter data gathered by the project is the property of Twitter and not directly publishable; however, we provide the tweet IDs for our collected tweets around each event (coming soon).

Research Investigators (PI*)

Tony McDonald (TTI/TAMU)*
Bert Huang (VT)
Jacelyn Jefferson (TAMU)
Michelle Canton (TTI)
Shuangfei Fan (VT)

Project Information

Start Date: 2019-03-01
End Date: 2020-10-31
Status: Complete
Grant Number: 69A3551747115
Total Funding: $283,435
Source Organization: Safe-D National UTC
Project Number: 04-098

Safe-D Theme Areas

Big Data Analytics
Automated Vehicles

Safe-D Application Areas

Driver Factors and Interfaces
Planning for Safety
Vehicle Technology

More Information

UTC Project Information Form

Sponsor Organization

Office of the Assistant Secretary for Research and Technology
University Transportation Centers Program
Department of Transportation
Washington, DC 20590 United States

Performing Organization

Texas A&M University
Texas A&M Transportation Institute
3135 TAMU
College Station, Texas 77843-3135

Virginia Polytechnic Institute and State University
Virginia Tech Transportation Institute
3500 Transportation Research Plaza
Blacksburg, Virginia 24061