Stupid question, couldn’t instances just say they don’t allow scraping specifically from Facebook in their ToS and then report them for GDPR violations if they do?
As in, have the ToS say “we’ll give your data to other instances because that’s how the Fediverse works, but we won’t give your data to Facebook” and also “Facebook is not allowed to federate, and is not allowed to pull data”.
Then just say that your data subjects don’t consent to any data pulling by Facebook, so Facebook scraping your system, even through ActivityPub, would be a violation of GDPR.
I browse Reddit only for one sub, a country-specific one that is reasonably niche. Right when the API migration happened, there was a very visible influx of Facebook/Instagram people moving over to Reddit. Posts asking where to find Instagram/Facebook functionality came in daily, and the overall quality of both comments and posts degraded a lot. Suddenly posts had a ton of comments consisting of one word and a ton of emojis.