Web scraping is a complicated legal subject involving more than a dozen different laws. As such, it’s not possible to summarize all the legal issues associated with web scraping in just a few hundred words. That said, you can read the bolded sections here and the infographics in just a few minutes and learn the basics.
The good news for web scrapers is that the trend has been toward greater permissiveness with web scraping. The bad news is that the trend is not uniform across legal jurisdictions, and the overall history of web-scraping litigation has not been kind to scrapers.
[This article was last updated on November 19, 2020]
There are a few websites online that purport to answer the question of “whether web scraping is legal.” And way too many of those websites, with unwavering confidence and a complete absence of caution, provide clear and concise answers to that question that are laughably and dangerously false.
One such website claims to have a “three-part test” to determine whether web scraping is legal.
But a web scraper could follow that test and still violate dozens of state and federal laws—even potentially finding themselves in jail. The blog post would be certifiable legal malpractice—except it wasn’t written by a lawyer in the first place.
With the increasing importance of data collection from privately owned websites, web scraping has grown from a niche enterprise to a bona fide industry in the last decade.
As a lawyer (and a Python programmer) who has clients in the web-scraping space, I figured the internet was long overdue for a practitioner’s guide to web-scraping law that was actually grounded in the law. With that in mind, I took the time to read every web-scraping case and scholarly journal article on the subject published in the last ten years.
What you see here is the end product of that research.
2. A Brief Overview of Web-Scraping Laws in the US
Congress never drafted a law (and probably never will) to help web scrapers know which web scraping practices are legal and which are not. If we’re being honest, most members of Congress probably couldn’t tell you what web scraping is.
But frequently enough, web scraping becomes the source of very real business and personal disputes, which means that courts have been forced to resolve those disputes using judicial frameworks designed for other purposes. Usually, it works like this: someone’s website is scraped. That person or company hires an attorney, who writes a cease-and-desist letter telling the web scraper to stop. Either the web scraper stops, or they don’t. If they don’t, the lawyer often files a lawsuit alleging all sorts of legal claims.
Since no law directly applies to web scraping, and courts aren’t inclined to invent new laws, plaintiffs’ lawyers have been forced to get creative in trying to explain to courts why web scraping is a violation of existing laws. In so doing, lawyers have attempted to shoehorn a wide range of legal theories and frameworks into web-scraping litigation.
Perhaps none of this should come as a surprise to web-scraping practitioners. But I suspect that few web-scraping experts realize just how many different laws have been applied in web-scraping cases. To give a sense of just how wide a net lawyers have cast on this issue, I figured it might be worthwhile to list some of the laws that have been litigated against scrapers in just the last decade.