Tools

Everything online can become data with the right tool. Below we have organized our tools for collecting and analyzing data from various sources on the web, including blogs, websites, and Web 2.0 platforms.

Webivore: New Media Analysis Toolkit

Webivore is a research-oriented web crawler.

The Webivore new media analysis toolkit facilitates the downloading and tracking of various aspects of web code, content, and commands. The goal of the project is to develop open source tools that can map the dynamics of code politics – the manner in which digital language, formats, and commands on the web are used for political, economic, or social goals.

Technorati Inlink Ripper

A simple script that returns the Technorati Inlinks Rank for a list of URLs from another project. The script still needs work and is provided in the hopes that other researchers might find it useful.

Blog Tools

Blog RSS Scraper

Either as a political subculture or as an alternative outlet for the mainstream media, bloggers have become an important part of political culture. In Canada, bloggers played an active role in the last federal election through their support or, more often, their antagonism towards candidates. At times, the activity and size of the blogosphere make research difficult. Bloggers post frequently, and the significance of a post can be time-sensitive. After years of studying blogs, the lab has developed an approach to automate the tracking and sampling of blogs.

Blog Link Ripper

The lab developed the Blog Link Ripper as part of the Blogometer. The analysis tool collects links from blog posts and then counts the results to determine the most popular links within a blog sample in a given window of time. At present, the tool is hard-coded to collect last week's most popular posts, but in the future the tool will be more configurable.
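The counting step described above can be sketched roughly as follows. This is a minimal illustration and not the Blog Link Ripper's actual code: the function name `top_links`, the `(published, html)` post format, and the regular expression are all assumptions made for the example.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical pattern: matches absolute http(s) links in href attributes.
HREF_RE = re.compile(r'href=["\'](https?://[^"\']+)["\']', re.IGNORECASE)


def top_links(posts, end, days=7, n=10):
    """Return the n most linked-to URLs among posts in the window.

    posts: iterable of (published: datetime, html: str) pairs (assumed format).
    end:   exclusive end of the window; the window starts `days` earlier.
    """
    start = end - timedelta(days=days)
    counts = Counter()
    for published, html in posts:
        if start <= published < end:
            # Count each link once per post so a single post
            # that repeats a link cannot inflate its rank.
            counts.update(set(HREF_RE.findall(html)))
    return counts.most_common(n)
```

Passing `days` as a parameter is one way the window could be made configurable rather than fixed at one week.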

Blogometer

The tool helps answer a simple question: how active are bloggers week to week? The analysis tool generates the percentage change in activity using data collected from a blog sample. The blogometer acts as an early warning system to monitor discussion in our blog sample. When a flame war erupts online, the blogometer lets you know.

The blogometer requires data collected and formatted from the Blog RSS Scraper.
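As a rough illustration of the week-to-week percentage change, the sketch below assumes the scraped data has already been reduced to a list of weekly post counts. The function names and the input format are hypothetical, not taken from the Blogometer source.

```python
def percent_change(previous, current):
    """Percentage change from previous to current.

    Returns None when previous is zero, because the change is undefined.
    """
    if previous == 0:
        return None
    return (current - previous) / previous * 100.0


def weekly_changes(weekly_counts):
    """Percentage change between each pair of consecutive weekly counts."""
    return [
        percent_change(prev, curr)
        for prev, curr in zip(weekly_counts, weekly_counts[1:])
    ]
```

A sudden large positive value in the resulting list is the kind of signal that would flag a burst of discussion in the sample.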

Download the Blogometer Source Code

Twitter Tools

Twitter Scraper

In recent elections, Twitter has functioned as an informal chat room during major campaign events, such as speeches or debates. People tweet to share their opinions, express their reactions, and debate. Each topical tweet usually includes a hash tag that locates it within a cluster of commentators. Researchers can track Twitter discussion through hash tags since the Twitter site outputs an RSS feed for specific hash tags.
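For illustration, the sketch below reads an RSS 2.0 document that has already been downloaded and pulls out each item's title, link, publication date, and hashtags. It is not the Twitter Scraper itself: the function name, the returned dictionary keys, and the assumption that the tweet text sits in the item's `title` element are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: a hashtag is '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def parse_items(rss_text):
    """Parse an RSS 2.0 string into a list of dicts, one per <item>."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        items.append({
            "title": title,
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
            # Lower-case the tags so '#Debate' and '#debate' group together.
            "hashtags": [tag.lower() for tag in HASHTAG_RE.findall(title)],
        })
    return items
```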

YouTube Tools

YouTube Search Results Scraper

YouTube remains one of the leading archives of user-generated content and visual culture. Its servers contain all kinds of campaign videos, gaffes, home-made endorsements, and political re-mixes. With such a volume of content, the problem becomes how to sample relevant videos. The YouTube search sorts videos based on rating, date added, and cumulative views. This scraper adds to the list of sampling techniques by calculating weekly or even daily views. The scraper collects the metadata for videos in a search return provided by YouTube. By archiving the returns, the scraper can calculate weekly views. The YouTube Video Scraper uses the YouTube API to archive search returns by keywords over time. The script is designed to run on a weekly basis to collect and store data from YouTube; however, the timeframe can be modified. The script is licensed under the GPL 3.0.
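The idea of deriving weekly views from archived returns can be sketched as a difference between two snapshots of cumulative view counts. This is an illustration under stated assumptions, not the scraper's code: the function name and the snapshot format (a dictionary mapping a video ID to its cumulative views) are hypothetical, and no YouTube API calls are shown.

```python
def views_gained(earlier, later):
    """Views gained by each video between two archived snapshots.

    earlier, later: dicts mapping video ID -> cumulative view count
    (assumed format). Videos missing from the earlier snapshot are
    skipped, since there is no baseline to subtract.
    """
    gained = {}
    for video_id, views in later.items():
        if video_id in earlier:
            gained[video_id] = views - earlier[video_id]
    return gained
```

If the snapshots are taken one week apart the result is weekly views; taking them one day apart gives daily views.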