Mani Levian Asli edited this page Mar 30, 2020 · 124 revisions

Twitter Tool List

This list provides an overview of data collection tools that are useful for research on Twitter. If you run into problems with one of the applications on the list, feel free to open an issue; this helps us maintain the list.

Overview

All of the following tools can search for a specific username, hashtag, location or tweet and collect the associated data from Twitter. All tools download the associated media (i.e. pictures and videos) and related hashtags. The list below is sorted in an opinionated way, roughly by how many people we think each tool will help. Any of the tools might still be ideal for your specific use case, so have a look at the whole list.

Most of these Twitter tools connect to the official Twitter APIs and therefore need an API key from Twitter. Retrieving an API key is easy; just follow the documentation. You are bound by the restrictions that Twitter imposes. You can read about the rate limits here.
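As a rough illustration of what being bound by the rate limits means in practice, the sketch below computes how long a client should pause before its next request. It assumes the response exposes the `x-rate-limit-remaining` and `x-rate-limit-reset` headers of the standard Twitter API; the function name `seconds_until_reset` is made up for this example, and the header values are sample data, not real responses.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how many seconds to wait before sending the next request.

    `headers` is a mapping with lower-case header names (an assumption of
    this sketch; HTTP libraries usually offer case-insensitive lookups).
    """
    now = time.time() if now is None else now
    # If the header is missing, assume at least one request is still allowed.
    remaining = int(headers.get("x-rate-limit-remaining", 1))
    if remaining > 0:
        return 0.0
    # The reset header holds a Unix timestamp (seconds since the epoch).
    reset_at = int(headers.get("x-rate-limit-reset", now))
    return max(0.0, reset_at - now)


# Sample data: the window is exhausted and resets 60 seconds from "now".
wait = seconds_until_reset(
    {"x-rate-limit-remaining": "0", "x-rate-limit-reset": "1000"}, now=940
)
```

A collection script would typically call `time.sleep(wait)` before retrying. Tools such as twarc do this kind of bookkeeping for you.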

Some of the tools are scrapers, which do not use the official APIs. Please be aware that the use of these tools might violate Twitter's Terms of Service. Despite being public, Twitter data can be very personal. Before you start collecting data, make sure to inform yourself thoroughly about the data protection laws and ethical guidelines that apply to your research.

Useful Scrapers

Overview

Keys

  • -: The tool will only partially pull the data.

  • x: The tool is not able to fetch the described data.

  • : The tool is able to fetch the described data.

  • User Info: In general, retrieving the number of posts, followers/followings, creation date, username, etc.

  • Media: Feature includes the scraping of videos and pictures.

  • Followers/Followings: Allows you to download a list of all followers/followings from one or more accounts.

  • Login Module: The tool can log you into an account.

  • Posts and Hashtags: In general, the ability to retrieve tweets and to search for tweets with a certain hashtag.

  • MetaData: Includes all data other than the actual tweets, user info, media or followers. This includes location and user ID, which are crucial for maintaining a database.

  • API Based: The tool uses the official Twitter API.

  • Scheduled Data collection: Allows you to plan updates to data sets at/after a certain time.

Facepager was made for fetching publicly available data from YouTube, Twitter and other JSON-based APIs. All data is stored in a SQLite database and may be exported to CSV. The app has a graphical user interface, which makes it extremely easy to use. The official Twitter API is needed.
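If you want to post-process such a SQLite database yourself, exporting a table to CSV takes only the Python standard library. The following is a minimal sketch under stated assumptions: the table `demo_tweets`, its columns and its rows are hypothetical sample data and do not reflect Facepager's actual schema, and `export_table_to_csv` is a helper written for this example.

```python
import csv
import io
import sqlite3


def export_table_to_csv(conn, table, out):
    """Write all rows of `table` to the file-like object `out` as CSV."""
    # Table names cannot be passed as SQL parameters, so accept only
    # plain identifiers to avoid building an unsafe query string.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY rowid")
    writer = csv.writer(out)
    # cursor.description holds one tuple per column; item 0 is the name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


# Hypothetical sample database, kept in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_tweets (id TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO demo_tweets VALUES (?, ?)",
    [("1", "hello"), ("2", "a, b")],
)

buffer = io.StringIO()
export_table_to_csv(conn, "demo_tweets", buffer)
csv_text = buffer.getvalue()
```

To work with a file on disk, pass its path to `sqlite3.connect` and open the output with `open(path, "w", newline="")`, as the `csv` module recommends.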

Known issues and limitations:

  • is limited by the Twitter API

Notable Features:

  • Program with a graphical user interface (GUI), making it easy to use for inexperienced users.

Installation via: An installation package is available for Windows, Linux and macOS

Documentation and Usage
Instructions and Download

A simple script for scraping tweets. It uses the Python package requests to retrieve the content and BeautifulSoup4 to parse it.

Notable Features:

  • Works without an API key and is therefore not subject to the restrictions of the Twitter API

Installation via: pip

Download
Documentation, Usage and Installation Instructions

Twint is an advanced Twitter scraping tool written in Python that allows for scraping tweets from Twitter profiles without using Twitter's API. Twint utilizes Twitter's search operators to let you scrape tweets from specific users, scrape tweets relating to certain topics, hashtags and trends, or sort out sensitive information from tweets, such as e-mail addresses and phone numbers. I find this very useful, and you can get really creative with it too.

Notable Features:

  • Can be used completely anonymously without an API or Twitter account
  • Built-in visual analysis tool

Installation via: pip

Documentation and Usage
Download and Installation Instructions
Tutorial by Null-Byte

twarc is a command line tool and Python library for archiving Twitter JSON data. Each tweet is represented as a JSON object that is exactly what was returned from the Twitter API. Tweets are stored as line-oriented JSON. twarc will handle the Twitter API's rate limits for you. In addition to letting you collect tweets, twarc can also help you collect users and trends and hydrate tweet IDs.
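Line-oriented JSON means one complete JSON object per line, so a collection can be processed line by line without loading the whole file. The sketch below reads such data with the standard library only; it does not use twarc itself. The two sample tweets are made up, and the field names (`id_str`, `full_text`, `user.screen_name`) are assumed to follow the tweet objects of the standard Twitter API.

```python
import io
import json


def read_tweets(lines):
    """Yield one dict per non-empty line of line-oriented JSON."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline
            yield json.loads(line)


# Made-up sample data standing in for a file written by a collection tool.
sample = io.StringIO(
    '{"id_str": "1", "full_text": "first", "user": {"screen_name": "alice"}}\n'
    '{"id_str": "2", "full_text": "second", "user": {"screen_name": "bob"}}\n'
)

tweets = list(read_tweets(sample))
authors = [tweet["user"]["screen_name"] for tweet in tweets]
```

For a real file, pass an open file object instead of the `io.StringIO` sample, for example `with open("tweets.jsonl") as handle: ...`, where the file name is a placeholder.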

Known issues and limitations:

  • limited to the Twitter API

Notable Features:

  • Handles all API rate limits by itself

Installation via: pip

Download
Documentation, Installation and Usage

VOSONDash is an interactive R Shiny web application for the visualisation and analysis of social network data. The app has a dashboard layout with sections for visualising and manipulating network graphs, performing text analysis, displaying network metrics and the collection of network data using the vosonSML R package.

Known issues and limitations:

  • is limited by the restrictions of the Twitter API

Notable Features:

  • R application that connects to different social media APIs
  • Built-in visual analysis, accessible through a web interface
  • Cross-platform analysis

Installation via: CRAN

Download
Installation and Usage

TAGS is a free Google Sheet template which lets you set up and run automated collection of search results from Twitter.

Known issues and limitations:

  • limited to search queries

Notable Features:

  • easy to use, without command line

Installation via: A Google account is needed to install this sheet

Download and installation instructions
Support forum for beginners and advanced users

SMO-TMAS allows users to pull tweets from specified Twitter handles and tweets containing specified keywords by querying Twitter's REST API (the GET search/tweets and statuses/user_timeline endpoints) as well as Twitter's streaming API. The collected tweets can be downloaded as a .csv file. SMO-TMAS also provides data analysis components that can be used to analyze and visualize the collected data right away.

Known issues and limitations:

  • is limited by the restrictions of the Twitter API

Notable Features:

  • Ideal for small datasets
  • Accessible through the web
  • No local installation needed

Installation via: Accessible through a Web Application, no local installation needed.

Documentation and Development
Access

Other Useful Tools

Hydrator is an Electron-based desktop application for hydrating Twitter ID datasets. Twitter's Terms of Service do not allow the full JSON for datasets of tweets to be distributed to third parties. However, they do allow datasets of tweet IDs to be shared. Hydrator helps you turn these tweet IDs back into JSON and also CSV from the comfort of your desktop.
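Hydration works by sending the shared tweet IDs back to the Twitter API in batches and saving the returned tweet objects. The sketch below shows only the batching step and is not Hydrator's actual code. The default batch size of 100 is an assumption based on the per-request limit of the standard tweet lookup endpoint, so check the current API documentation before relying on it. `batch_ids` is a helper written for this example, and the IDs are made up.

```python
def batch_ids(tweet_ids, batch_size=100):
    """Group tweet IDs into lists of at most `batch_size` items."""
    batch = []
    for tweet_id in tweet_ids:
        tweet_id = tweet_id.strip()
        if not tweet_id:  # ignore blank lines in an ID file
            continue
        batch.append(tweet_id)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter batch
        yield batch


# Made-up IDs standing in for the lines of a shared tweet ID dataset.
batches = list(batch_ids([str(n) for n in range(250)]))
batch_sizes = [len(batch) for batch in batches]
```

Each batch would then be sent as one lookup request, and the responses appended to a line-oriented JSON file.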

Notable Features:

  • Program with a graphical user interface (GUI), making it easy to use for inexperienced users.

Installation via: An installation package is available for Windows, Linux and macOS

Downloads
Documentation and Usage

There are even more tools, and we keep collecting them. You can check out our Google Doc for applications that you won't find in this list.
