Does Web Scraping Amount to Computer Fraud?

One potential legal challenge for companies developing generative AI products or services concerns the source of data.  Businesses are scraping parts of the Internet to obtain training data for their machine learning models.  Scraping data, however, raises the possibility of liability.  The most publicized scraping legal issue in the U.S. addresses the question of whether a business may be liable for scraping the website of another business under “computer fraud” laws.

“Computer fraud” is a bit of a misnomer.  These laws do not just cover “fraud” in the sense of conduct involving deception, concealment, or theft.  Instead, they cover a wide variety of misconduct involving computers and networks, including hacking, causing malware attacks, and engaging in distributed denial of service attacks.  The federal Computer Fraud and Abuse Act (CFAA)[1] is the computer fraud law of national scope.  There are also similar state laws, focused on conduct within a specific state, such as the California Comprehensive Computer Data Access and Fraud Act.[2]

Computer fraud laws can be the basis for criminal as well as civil liability.  In other words, governments can prosecute an individual, group, or organization for engaging in the conduct they prohibit.  At the same time, anyone injured or damaged by the computer fraud of another is entitled to file a lawsuit under these laws to seek damages from the alleged wrongdoer.  A plaintiff filing such a suit can also seek a court order (“injunction”) directing the alleged wrongdoer to stop the conduct.  Violation of an injunction can constitute “contempt” of court and can warrant a court imposing fines, requiring the payment of the other party’s attorneys’ fees, or, in egregious cases,[3] jail time.

Computer fraud is the most publicized area of web scraping law in the United States because of one particular civil case:  hiQ Labs, Inc. v. LinkedIn Corp., originally filed in the U.S. District Court for the Northern District of California (Silicon Valley’s federal court).  Until recently, the key opinion was the 2019 decision of the U.S. Court of Appeals for the Ninth Circuit on appeal.[4]  In 2021, the Supreme Court vacated that decision and remanded the case, asking the Ninth Circuit to reconsider it in light of a recent Supreme Court computer fraud case.  On remand, the Ninth Circuit reaffirmed its original ruling.[5]

The hiQ case has acted as a test case for scraping in the U.S. since it was first filed, and it has taken up a disproportionate share of attention in the press.  It gained even more attention after the Supreme Court of the United States took up the case (even though the Supreme Court sent the case back to the Court of Appeals without saying much).  Now that the Court of Appeals has reaffirmed its original ruling, I believe that hiQ will become a key precedent for courts around the country with regard to the application of the CFAA to scraping.

The dispute between hiQ and LinkedIn arose after hiQ began scraping LinkedIn for user profile information, to which it applied its analytics.  It provided two kinds of analytics services to employers:  one service predicted which employees were most likely to leave their employers, and the second identified skill gaps in employees, which employers could address in the hope of promoting retention.  The Court of Appeals originally affirmed a preliminary injunction in favor of hiQ and against LinkedIn, holding that hiQ raised serious questions as to whether LinkedIn could say that hiQ violated the CFAA’s prohibition against accessing a computer system without authorization, given that the profiles were public.[6]

The Supreme Court vacated the Ninth Circuit’s decision, asking the Ninth Circuit to reconsider the case in light of another Supreme Court CFAA case, Van Buren v. United States.[7]  Van Buren is favorable to hiQ (and other scrapers) because it narrowed the CFAA’s reach under a different clause of the statute (“exceeds authorized access”).  The Court said that the CFAA did not apply to those with legitimate access who accessed information for a bad motive (in that case, a police officer who used a police database to provide information in exchange for money).  The Court thought “exceeds authorized access” meant intruding into protected files, folders, or databases that the user had no right to access even though the user had authorization to see other files, folders, or databases on the system.

The Supreme Court used a “gates up” or “gates down” metaphor.  Under this approach, if information is freely accessible to the public, the “gates are up,” and there is no computer fraud.  The user has not broken into some protected area.  However, if areas of a system are protected by a password or other obstacles, the “gates are down” under the Court’s metaphor.  When the gates are down and the user breaks through to gain access anyway, that is when a computer fraud violation occurs.

Upon remand, the Ninth Circuit reaffirmed its original ruling.  The Court cited and followed the Supreme Court’s “gates up” or “gates down” analysis.  Accordingly, the new opinion from the Ninth Circuit favors scrapers.  It will likely be persuasive authority for other federal courts interpreting the Computer Fraud and Abuse Act.  Also, state courts are likely to find hiQ persuasive in interpreting state computer fraud laws.

Given this analysis, scraping websites that allow anyone to access them and do not wall off access using methods such as passwords is unlikely to trigger computer fraud liability.
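For developers, one rough way to picture the “gates up” or “gates down” distinction is in terms of what a web server says when a scraper requests a page.  The Python sketch below is my own illustration, not a test drawn from the hiQ or Van Buren opinions.  The function name `gates_appear_down` and the choice of signals (HTTP status codes 401, 403, and 407, or a `WWW-Authenticate` header) are assumptions, and deliberately cautious ones:  a server can refuse a request for reasons that have nothing to do with a password wall.

```python
from typing import Mapping

# Assumption: these status codes are treated as signs of an access barrier.
# 401 = Unauthorized, 403 = Forbidden, 407 = Proxy Authentication Required.
GATED_STATUSES = {401, 403, 407}


def gates_appear_down(status_code: int, headers: Mapping[str, str]) -> bool:
    """Return True when a response suggests the page sits behind a barrier.

    Hypothetical helper for illustration only. It inspects the status code
    and header names of a response that has already been received.
    """
    if status_code in GATED_STATUSES:
        return True
    # Header names are case-insensitive, so compare them in lowercase.
    header_names = {name.lower() for name in headers}
    return "www-authenticate" in header_names


# A publicly served page: no barrier signal.
print(gates_appear_down(200, {"Content-Type": "text/html"}))
# A page that demands credentials: barrier signal.
print(gates_appear_down(401, {"WWW-Authenticate": "Basic realm=\"members\""}))
```

A scraper built along these lines would skip any page for which the function returns `True` and leave the closer questions to counsel.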


[1] 18 U.S.C. § 1030.

[2] Cal. Penal Code § 502.

[3] Routine first-time violations will likely trigger fines and the shifting of attorneys’ fees, rather than jail time.  Without having done a survey, my guess would be that jail time would be a remedy only in instances of repeated violations.  Also, if a wealthy contemnor is apparently taking the attitude that fines can be absorbed as a cost of doing business, a court might use jail time as a remedy.

[4] 938 F.3d 985 (9th Cir. 2019), vacated and remanded, 141 S. Ct. 2752 (2021).

[5] 31 F.4th 1180 (9th Cir. 2022).

[6] In preliminary injunction proceedings, a court does not make a final determination on the merits of the case as a whole.  Rather, a court is looking at whether the party asking for the injunction has shown a likelihood of success or at least has raised serious questions on the merits and shown significant irreparable harm.  Accordingly, the Ninth Circuit opinions about the preliminary injunction have limited weight, compared to an appeal from a final judgment in the district court.

[7] 141 S. Ct. 1648 (2021).
