Legal Issues Raised By The Use Of Web Crawling And Scraping Tools For Analytics Purposes

The Crayon Blog

Industry Articles | Published August 30, 2013  |   Tejeswini Kashyappan

In 2010, Pete Warden, a software engineer living in Colorado, developed a software program to “crawl” publicly accessible Facebook pages and “scrape” (i.e., collect) information relating to Facebook’s members. Within hours of its deployment, the application had visited approximately 500 million pages and collected information related to approximately 220 million Facebook users – including users’ names, location information, friends and interests. Using this dataset, which Mr. Warden offered to release in anonymized form for research purposes, he created a graphical analysis of the regional and relationship patterns among Facebook’s members.

The cost of this exercise: about $100. The results: more than 500,000 visits to Mr. Warden’s website, national media coverage, and cease-and-desist warnings from Facebook, which regarded Mr. Warden’s collection of data from its webpages as a violation of its terms of use prohibiting automated access to the website without the company’s permission. Ultimately, to avoid a potential legal dispute, Mr. Warden abandoned his plan to release the information he had collected and agreed to delete all copies of the dataset.¹ Summing up his experience, he later quipped, “Big data? Cheap. Lawyers? Not so much.”²

