Web Scraping and Antitrust Law

In 2019, in what was at the time the most important legal precedent on web scraping in the United States, the Ninth Circuit Court of Appeals wrote:

We agree with the district court that giving companies like LinkedIn free rein to decide, on any basis, who can collect and use data—data that the companies do not own, that they otherwise make publicly available to viewers, and that the companies themselves collect and use—risks the possible creation of information monopolies that would disserve the public interest.

hiQ Labs, Inc. v. LinkedIn Corp., 938 F.3d 985 (9th Cir. 2019)

This was a powerful statement from the Ninth Circuit. Up until that time, web scrapers had little success in federal courts. From the early 2000s to 2019, almost every form of public data collection was viewed with suspicion. Then, for the first time at the circuit court level, one of the most important courts in the country acknowledged the obvious anticompetitive animus that was driving most web-scraping litigation.

It seemed like the tide had finally turned in the law of web scraping.

But in the two years since the Ninth Circuit first decided hiQ Labs, the trend in the law has become murkier. In 2020, the Northern District of California dismissed hiQ Labs’ antitrust claims against LinkedIn. Despite what many had hoped, no other circuit court has yet embraced or adopted the Ninth Circuit’s discussion of “information monopolies.” Then, in the summer of 2021, the Supreme Court adopted a new and uncertain standard in Van Buren v. United States for how to apply the Computer Fraud and Abuse Act (CFAA). Only a few days later, the Supreme Court vacated the scraping-favorable hiQ Labs Ninth Circuit opinion and remanded the case to the Ninth Circuit for further consideration in light of Van Buren.

Now, it’s not so clear anymore what the future of the law of web scraping is going to be. In many ways, it feels like we’re starting from scratch.

The Standard Web-Scraping Litigation Fact Pattern

The critical antitrust issue that the Ninth Circuit talked about back in 2019, the problem of “information monopolies that would disserve the public interest,” is still there. If anything, it’s just gained steam. Companies such as LinkedIn, Facebook, and others have doubled down on their desire to protect and control publicly available data through whatever legal means they have at their disposal.

But despite the Ninth Circuit’s clear language about information monopolies, courts have been reluctant to do anything about the problem.

Typically, the fact pattern that emerges in web scraping litigation is as follows: A large incumbent such as Facebook, LinkedIn, or Craigslist notices that someone is doing something with data collected from their sites without their permission. The incumbent then writes a cease-and-desist letter to the data collector, accusing them of all sorts of horrible crimes for collecting and repurposing publicly available data.

If the data collector agrees to do what the large incumbent says, then we usually never hear about it again. It never gets litigated. It never appears in the public record.

If the data collector continues to push forward with data collection despite the legal threat, that’s when litigation often ensues.

The data incumbent in the ensuing lawsuit usually alleges violation of the CFAA, misappropriation, breach of contract, tortious interference with a contract, trespass to chattels, and sometimes a few state-law claims.

Where Antitrust Law Enters the Picture

In the past, web scrapers had few defenses against incumbents. But for the rare company that has the resources to fight back, antitrust law provides an avenue to put pressure on the giants.

There is ample evidence that data obfuscation harms consumers in a variety of ways. Companies that collect and reuse public data often do so in a way that benefits the public. Restricting access to that data is, in theory, the kind of conduct antitrust law is designed to protect against: anticompetitive behavior that harms consumers. But to date, courts have largely enabled incumbents’ anticompetitive behavior by allowing them to pursue claims against data collectors.

Antitrust litigation has been bubbling to the surface on these issues for years, but the best path forward for web-scraping litigants is not yet clear. There is Section 1 of the Sherman Antitrust Act, which prohibits agreements that unreasonably restrain trade. There is Section 2 of the Sherman Antitrust Act, which prohibits any person or company from seeking to “monopolize, or attempt to monopolize, or combine or conspire with any other person or persons, to monopolize any part of the trade or commerce.” Then there is the Clayton Antitrust Act, which prohibits predatory and discriminatory pricing, among other things.

In hiQ Labs v. LinkedIn Corp., hiQ Labs pursued two separate antitrust theories against LinkedIn. First, it alleged “unilateral refusal to deal.” Under this theory, hiQ Labs claimed that LinkedIn refused to engage in a profitable course of dealing with hiQ Labs for the sole purpose of maintaining a monopoly or attempted monopoly, in violation of § 2 of the Sherman Act.

This theory failed in the district court because hiQ Labs did not establish that it had a prior “profitable course of dealing” with LinkedIn. It’s not enough that LinkedIn just knew about hiQ Labs and didn’t do anything about it. To establish a unilateral-refusal-to-deal antitrust claim, a plaintiff must show that the companies had a clear and unambiguous business relationship that one party abandoned for the sole purpose of driving the other out of business (or something similar). This element of the legal claim will likely prove insurmountable for most web-scraping businesses.

The other legal claim that hiQ Labs alleged was “denial of essential facilities.” In theory, this might be more promising for web scrapers.

The district court set out the elements of a “denial of essential facilities” claim as follows:

To establish a violation of the essential facilities doctrine, [a plaintiff] must show (1) that [the defendant] is a monopolist in control of an essential facility, (2) that [the plaintiff], as [the defendant’s] competitor, is unable reasonably or practically to duplicate the facility, (3) that [the defendant] has refused to provide [the plaintiff] access to the facility, and (4) that it is feasible for [the defendant] to provide such access. Because mandating access, as the essential facilities doctrine implies, shares the same concerns as mandating dealing with a competitor, a facility is essential “only if control of the facility carries with it the power to eliminate competition in the downstream market.”

hiQ Labs, Inc. v. LinkedIn Corp. (N.D. Cal. 2020)

At first glance, this legal claim looks like it was tailor-made for web scraping. Unfortunately, no court has applied this theory to web scraping yet. In the hiQ Labs case specifically, the court rejected hiQ Labs’ arguments not because they weren’t potentially valid, but merely because hiQ Labs had failed to properly identify a downstream market. According to the court:

In the instant case, hiQ asserts that LinkedIn’s social networking platform amounts to an essential facility. However, the Court cannot assess the viability of hiQ’s essential facilities argument without there first being a properly defined downstream market (i.e., the people analytics market). The Court therefore dismisses the essential facilities theory based on a failure to adequately allege a people analytics market.

The court explained the deficiency as follows:

The Court acknowledges hiQ’s suggestion that products using employer internal data or publicly available data other than LinkedIn’s are different in quality from hiQ’s products — and thus it is at least a question of fact whether there is some elasticity of demand between them and whether those products are in the same market as hiQ’s products. See generally Brown Shoe Co. v. United States, 370 U.S. 294, 326, 82 S. Ct. 1502, 1524-25 (1962) (“agree[ing] with the District Court that in this case a further division of product lines based on ‘price/quality’ differences [medium-priced shoes and low-priced shoes] would be ‘unrealistic’”); see also In re Live Concert Antitrust Litig., 247 F.R.D. 98, 129 (C.D. Cal. 2007) (indicating that consumers may differentiate or distinguish among products based on performance, price, and so forth but that does not necessarily mean that the products are in separate markets). The problem for hiQ is that it has not yet shown that it is plausible that the relevant market should be defined as that which uses only LinkedIn data.

hiQ Labs, Inc. v. LinkedIn Corp. (N.D. Cal. 2020)

Reading between the lines here, for a web scraping company with the right legal counsel and the right set of circumstances, “denial of essential facilities” is a valid legal theory that a web scraper could use to “mandate access” to publicly available data. It just hasn’t happened yet.

Antitrust law isn’t simple or cheap to pursue. Plaintiffs who pursue antitrust claims lose more often than they win. But reading the tea leaves from the Ninth Circuit, the time might be right for the right company with the right set of circumstances to prevail on an antitrust claim in this context.

I predict that this will happen over the next ten years. It’s merely a question of when and where.