Developments in Permissibility of Data Gathering (Profile Information)

A recent court case reveals how unsettled the legal boundaries remain for data harvesting methods such as scraping, crawling, botting, and creating false profiles on sites the user does not own. The question remains: How far can one intrude on another's territory when fishing for information?

Developments in Privacy Debates: Examining the Legal Status of User Profiles Scraping

In the digital age, data harvesting has become a common practice for businesses, researchers, and individuals alike. However, the legality of such activities is a complex issue that involves a delicate balance between law, privacy, and the terms of use set by websites.

Web scraping and crawling, the automated collection of data from websites, are not inherently illegal. When they target publicly available, non-sensitive data and do not violate a site’s terms of service, these activities are often considered legal, especially in the US and EU. Legal trouble arises when personal data is scraped without consent, copyrighted material is infringed, terms of service are breached, or computer fraud and abuse laws are violated.

Using bots to scrape or interact with websites is generally legal if the access complies with the site’s terms, does not overload servers, and does not reach protected areas without permission. However, using bots to bypass authentication or simulate human users can lead to legal action under anti-hacking laws or site-specific terms.
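The technical side of these conditions can be sketched in a few lines of Python. The example below is a minimal illustration, not a compliance tool: it assumes the site publishes a robots.txt file, uses the standard library's `urllib.robotparser` to check whether a path is disallowed, and spaces out requests so the server is not overloaded. The names `build_robots_parser`, `PoliteScheduler`, and `example-research-bot` are hypothetical. Note that robots.txt is not the same thing as a site's terms of use, and honoring it does not by itself establish that access is lawful.

```python
import time
import urllib.robotparser


def build_robots_parser(robots_txt: str) -> urllib.robotparser.RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class PoliteScheduler:
    """Hypothetical helper: decides whether and when a URL may be requested."""

    def __init__(self, parser, user_agent, default_delay=1.0):
        self.parser = parser
        self.user_agent = user_agent
        # Honor a declared Crawl-delay; otherwise fall back to our own default.
        declared = parser.crawl_delay(user_agent)
        self.delay = float(declared) if declared is not None else default_delay
        self._last_request = None

    def may_fetch(self, url: str) -> bool:
        """True if robots.txt does not disallow this URL for our user agent."""
        return self.parser.can_fetch(self.user_agent, url)

    def seconds_to_wait(self, now=None) -> float:
        """Seconds left before the next request respects the delay."""
        now = time.monotonic() if now is None else now
        if self._last_request is None:
            return 0.0
        return max(0.0, self.delay - (now - self._last_request))

    def record_request(self, now=None) -> None:
        """Call right after each request so the delay is measured from it."""
        self._last_request = time.monotonic() if now is None else now


# Assumed sample robots.txt content, for illustration only.
SAMPLE_ROBOTS = "User-agent: *\nDisallow: /private/\nCrawl-delay: 2\n"
scheduler = PoliteScheduler(build_robots_parser(SAMPLE_ROBOTS), "example-research-bot")
```

A crawler built on this sketch would call `may_fetch` before each request, sleep for `seconds_to_wait()`, and then call `record_request()`. Whether a path is login-protected or covered by contractual terms still has to be checked separately, since robots.txt says nothing about either.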

Creating fake profiles to harvest data is almost always prohibited by a website’s terms of use and, depending on the intent and data collected, may be illegal under anti-fraud, privacy, or computer crime laws.

The inconsistency in court rulings on these matters highlights the ongoing debate. Websites frequently have more restrictive rules than the law requires, and breaching these terms does not always mean breaking the law, but it can result in account suspension, IP blocking, or legal threats from the website owner. Courts generally enforce site terms if they are clearly stated and agreed to, but the outcome can vary by jurisdiction and specific facts of the case.

In the US, recent cases show a split. Some courts have ruled that scraping publicly available data does not violate the Computer Fraud and Abuse Act (CFAA) if no password-protected or paywalled content is accessed and no clear technical barriers are bypassed. Other courts have sided with website owners under the CFAA when the scraping violated terms of service or involved unauthorized access.

Under a current ruling from a federal court in D.C., creating fake online profiles on LinkedIn for research purposes is not a crime. It could still be a civil violation, however, because the terms of use are a legally binding contract.

The legal status of data harvesting activities involves complex interactions between law—such as privacy, intellectual property, and computer crime statutes—and the websites' own Terms of Use. Seeking legal advice before engaging in large-scale data harvesting is always advisable.

An ongoing case illustrates the problem. Computer science professors from Northeastern University intend to create fake profiles and compare how rankings and responses vary with each profile's race. Their case highlights the need for the Supreme Court to reconcile inconsistent jurisprudence on when access that violates a site's terms of use constitutes a crime under the anti-hacking law.

Data and cloud computing technology plays a significant role in data harvesting, particularly in web scraping and automated access to websites. However, the legality of these activities hinges on complying with a site's terms of service, respecting privacy rights, and avoiding copyright infringement and computer fraud.
