The Crayon Blog

Legal Issues Raised By The Use Of Web Crawling And Scraping Tools For Analytics Purposes

Industry Articles | Published August 30, 2013 | Tejeswini Kashyappan

In 2010, Pete Warden, a software engineer living in Colorado, developed a software program to “crawl” publicly accessible Facebook pages and “scrape” (i.e., collect) information relating to Facebook’s members. Within hours of being deployed, the application had visited approximately 500 million pages and collected information related to approximately 220 million Facebook users – including users’ names, location information, friends and interests. Using this dataset, which Mr. Warden offered to release in anonymized form for research purposes, he created a graphical analysis of the regional and relationship patterns among Facebook’s members.

The cost of this exercise: about $100. The results: more than 500,000 visits to Mr. Warden’s website, national media coverage, and cease-and-desist warnings from Facebook, which perceived Mr. Warden’s collection of data from its webpages as a violation of its terms of use prohibiting automated access to the website without the company’s permission. Ultimately, in order to avoid a potential legal dispute, Mr. Warden abandoned his plan to release the information he collected, and agreed to delete all copies of the dataset.1 Summing up his experience, he later quipped, “Big data? Cheap. Lawyers? Not so much.”2

