Are we committing crimes when scraping data online? For example, publicly available music?

Shin@piefed.social · 3 days ago

Are we committing crimes when scraping data online? For example, publicly available music?

TehPers@beehaw.org · edited 2 days ago

There’s no obvious answer to your question without more information (for example, where are you?) but I’m not aware of scraping being illegal anywhere, with some exceptions. For example, in the US (where I am), as long as you’re not doing “illegal hacking” to scrape your data, you’re probably fine.

There are TOSs that websites like to impose as well. If you have to agree to one to access any data, you should follow it. Breaking the TOS isn’t really “illegal” in a criminal sense (in the US), but you may expose yourself to anything from being blocked from the site to a lawsuit. Bypassing blocks might also be illegal, though you’d have to speak to a lawyer to know more about that.

Shin@piefed.social · 2 days ago

That’s the point: my focus is on “Europe” as a general place. Since they need to sync the “law” to some degree, there are different levels, but the baseline is the same.

Most public data, like all the music on Spotify, doesn’t require a cookie. So I could in theory scrape all the Spotify music to “listen later”. This wouldn’t be “illegal”, but if that’s the case, Anna’s Archive should be “fine”… (I know that they are distributing, and that is what the fight is about.)

But if they scraped the music and I scrape it too, we would have the same “dataset”. So if I download Anna’s “dataset”, would it be any different from mine? So if I prefer to download Anna’s dataset instead of scraping it myself, would that be illegal? They aren’t selling it (unlike Google).

There are way too many questions in my head :(

TehPers@beehaw.org · 1 day ago

This wouldn’t be “illegal”, but if that’s the case, Anna’s Archive should be “fine”… (I know that they are distributing, and that is what the fight is about.)

I don’t know much about European law, but redistribution changes things a lot here in the US. At least here, it then gets into copyright law, and you’d be reproducing copyrighted works without authorization (the Internet Archive attempted to get around this with books by getting legitimate copies of the books, digitizing them, then “lending” the digital copies of those books).

So if I prefer to download Anna’s dataset instead of scraping it myself, would that be illegal?

No idea about Europe. In the US, it might be, depending on what the contents of the work are. I believe downloading from Anna’s Archive would count as piracy in this case, though scraping directly from Spotify might not, because Spotify is redistributing the music with authorization from the copyright holder. It gets pretty confusing, honestly.

Regardless, if you aren’t doing things at large scale, even if you are breaking a law by downloading pirated content, it’s unlikely anyone will care. People usually only really start caring if you start redistributing stuff, so as long as you aren’t hosting what you’re scraping, you’re unlikely to run into any trouble.
