In addition to being a versatile communications platform for users around the globe, Twitter is an excellent source of current information. Researchers from different backgrounds (pollsters, marketers, academics from various disciplines) use data extracted from Twitter to answer a variety of questions, ranging from simple information about particular users or events (How many followers does a given user have? Who is the most active user tweeting under a certain hashtag?) to complex queries (Which users are central in a large network? How does information propagate among groups of users?). Some studies examine select individuals or small communities, while others require large volumes of information collected over long periods. Depending on the aims, different tools can be used to collect data, from Web-based analytics services that combine collection, analysis, and visualisation to mining the Twitter API directly and interpreting the data with a dedicated statistics package. Collecting data as part of a project, whether directly through the API or with a dedicated software package, remains one of the most challenging aspects of Twitter-based research. While the technical and methodological requirements may seem daunting at first glance, in-depth knowledge of the tools and of the kind of data available through them can address many common concerns. In this chapter, we provide an overview of different techniques and their respective advantages and limitations. First, we discuss collecting data via Twitter’s API, both directly and using a set of software packages, and then we turn to the question of how to integrate Twitter data into common social scientific study designs.