Legal Issues Raised By The Use Of Web Crawling And Scraping Tools For Analytics Purposes

Published August 30, 2013
Jim Snell and Derek Care

In 2010, Pete Warden, a software engineer living in Colorado, developed a software program to “crawl” publicly accessible Facebook pages and “scrape” (i.e., collect) information relating to Facebook’s members. Within hours of its deployment, the software had visited approximately 500 million pages and collected information related to approximately 220 million Facebook users, including users’ names, location information, friends and interests. Using this dataset, which Mr. Warden offered to release in anonymized form for research purposes, he created a graphical analysis of the regional and relationship patterns among Facebook’s members.

The cost of this exercise: about $100. The results: more than 500,000 visits to Mr. Warden’s website, national media coverage, and cease-and-desist warnings from Facebook, which perceived Mr. Warden’s collection of data from its webpages as a violation of its terms of use prohibiting automated access to the website without the company’s permission. Ultimately, in order to avoid a potential legal dispute, Mr. Warden abandoned his plan to release the information he collected, and agreed to delete all copies of the dataset.¹ Summing up his experience, he later quipped, “Big data? Cheap. Lawyers? Not so much.”²
