Web scraping has existed for a long time and, in its good form, it’s a key underpinning of the internet. While web scraping is widespread today, many people share the same apprehension:
Is web scraping legal?
Or is it illegal, and will it invite legal trouble?
It’s common knowledge that web scraping is a way of extracting data from websites. Many types of businesses depend on scraping data and analyzing it. But it is equally true that many people are not sure of the legality of web scraping.
Web scraping itself is not illegal. But there are some legal, moral and ethical restraints that hold companies back from using web data scrapers. Some of the reasons are listed below:
1. Website terms and conditions: Many websites expressly forbid web scraping in their terms and conditions.
2. Copyright: As web scraping involves copying, it may lead to a claim for copyright infringement.
3. Database rights: These rights are infringed when all or part of a database is extracted without the owner’s consent.
4. Trademarks: Reproducing a website owner’s trademarks without their consent could lead to a claim for trademark infringement or passing off.
5. Data protection: Scraping information on individuals (in some cases considered “personal data”) without their knowledge could infringe data protection laws.
6. Criminal damage: It is an offense to cause criminal damage to a computer (including damage to data) or to use a computer to gain access to data without proper authorization.
Legality depends entirely on the legal jurisdiction, i.e. laws are country and locality specific. Gathering or scraping publicly available information is generally not illegal. If it were, Google would not exist as a company, because it collects data from a vast number of websites around the world.
A recent US court ruling (Sandvig v. Sessions) addresses scraping directly. It states:
“Scraping plausibly falls within the ambit of the First Amendment.”
“That plaintiffs wish to scrape data from websites rather than manually record information does not change the analysis. Scraping is merely a technological advance that makes information collection easier; it is not meaningfully different from using a tape recorder instead of taking written notes, or using the panorama function on a smartphone instead of taking a series of photos from different positions.”
There is a thin line between the legal and illegal aspects of web scraping. It is therefore necessary to establish whether a given use is legal or not. For example, here are a few things to consider when scraping public data from websites (note that the following addresses only US law):
• A website’s user agreement may not be enforceable as a browsewrap agreement when the company does not provide sufficient notice of the terms to site visitors.
• Scrapers access website data as a visitor would, following paths similar to those a search engine follows. This can be done without registering as a user (and explicitly accepting any terms).
Social networks, for example, describe the value of becoming a user (based on the call to action on their public pages) as the ability to:
1. Gain access to full profiles;
2. Identify common friends/connections;
3. Get introduced to others; and
4. Contact members directly.
As long as a scraper makes no attempt to perform any of these actions, it does not gain “unauthorized access” to the service and thus does not violate the Computer Fraud and Abuse Act (CFAA). Thus, Parsers does not violate any of these rules or the law.
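The idea of staying on public, visitor-level paths can be sketched in code. The example below is only an illustration under stated assumptions, not legal advice or a description of how any particular scraper works: the member-only path prefixes, the `example-bot` user agent and the sample rules are all hypothetical, and checking a site's robots.txt (the file search engine crawlers consult) is an assumption added here, not something the discussion above requires. It uses Python's standard `urllib.robotparser` module.

```python
from urllib import robotparser
from urllib.parse import urlparse

# Hypothetical path prefixes standing in for member-only actions
# (full profiles, connections, direct messages). Real sites differ.
MEMBER_ONLY_PREFIXES = ("/login", "/messages", "/connections", "/profile/full")


def build_robot_parser(robots_txt: str) -> robotparser.RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(url: str, parser: robotparser.RobotFileParser,
              user_agent: str = "example-bot") -> bool:
    """Return True only for public paths that robots.txt does not disallow."""
    path = urlparse(url).path or "/"
    # Skip anything that looks like a registered-user feature.
    if path.startswith(MEMBER_ONLY_PREFIXES):
        return False
    # Defer to the site's crawler rules, as a search engine would.
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content for illustration.
SAMPLE_ROBOTS = "User-agent: *\nDisallow: /private/\n"
rules = build_robot_parser(SAMPLE_ROBOTS)

for candidate in (
    "https://example.com/public/page",
    "https://example.com/private/report",
    "https://example.com/messages/1",
):
    print(candidate, may_fetch(candidate, rules))
```

A filter like this only narrows what a scraper requests. Whether a given request is lawful still depends on the jurisdiction and the considerations listed above.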