Even worse is when they strip the plus sign out after the fact and then you can’t log in anymore because you didn’t realize that’s what has happened.
I assume that the Gitea instance itself was being hit directly, which would make sense. It has a whole rendering stack that has to reach out to a database, get data, render the actual webpage through a template, etc.
It’s a massive amount of work compared to serving up static files from, say, Nginx or Caddy. You can stick one of these in front of your servers and cache HTTP responses (to some degree anyway; that depends on Gitea).
Benchmarks like this show what kind of throughput you can expect on say a 4 core VM just serving up cached files: https://blog.tjll.net/reverse-proxy-hot-dog-eating-contest-caddy-vs-nginx/#10-000-clients
90-400MB/s derived from the stats here on 4 cores. Enough to saturate a 3Gb/s connection. And caching intentionally polluted sites is crazy easy since you don’t care if it’s stale or not. Put a Cloudflare cache in front of it and it’s even easier.
You could dedicate an old Ryzen CPU (Say a 2700x) box to a proxy, and another RAM heavy device for the servers, and saturate 6Gb/s with thousands and thousands of various software instances that feed polluted data.
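For anyone who wants to check the napkin math, here is a rough Python sketch of the MB/s to Gb/s conversion. The 90-400MB/s range is the one quoted above; the helper name and the idea that throughput scales linearly from 4 to 8 cores are my own assumptions, not something the benchmark measured.

```python
def mb_per_s_to_gbit_per_s(mb_per_s: float) -> float:
    """Convert megabytes per second to gigabits per second (decimal units)."""
    return mb_per_s * 8 / 1000


# Range quoted above for a 4-core VM serving cached responses.
LOW_MB_S, HIGH_MB_S = 90, 400

low_gbit = mb_per_s_to_gbit_per_s(LOW_MB_S)
high_gbit = mb_per_s_to_gbit_per_s(HIGH_MB_S)

# ASSUMPTION: throughput scales linearly with core count (4 -> 8 cores).
# Real proxies rarely scale perfectly, so treat this as an upper bound.
high_gbit_8_cores = high_gbit * (8 / 4)

print(f"4 cores: {low_gbit:.2f} to {high_gbit:.2f} Gb/s")
print(f"8 cores (linear scaling assumed): up to {high_gbit_8_cores:.2f} Gb/s")
```

So the top of the range is where the 3Gb/s figure comes from, and doubling the cores is the hand-wavy basis for the 6Gb/s one.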
Hell, if someone made it a deployable utility… Oof just have self hosters dedicate a VM to shitting on LLM crawlers, make it a party.
This is assuming aggressively cached, yes.
Also “Just text files” is what every website is sans media. And you can still, EASILY get 10+ MB pages this way between HTML, CSS, JS, and JSON. Which are all text files.
A Gitea repo page, for example, is 400-500KB transferred (1.5-2.5MB decompressed) of almost all text.
A file page is heavier, coming in around 800-1000KB (additional JS and CSS).
If you have a repo with 150 files, and the scraper isn’t caching assets (many don’t), then you just served up roughly 135MB (150 × ~900KB) of HTML/CSS/JS alongside the actual repository assets.
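Same estimate as a quick Python sketch, using the 800-1000KB per file page figure from above. The function name is made up for illustration, and it assumes the scraper re-downloads every asset on every page.

```python
def scrape_cost_mb(file_pages: int, kb_per_page: float) -> float:
    """Total transfer in MB if every file page is fetched with no asset caching."""
    return file_pages * kb_per_page / 1000


FILES = 150  # files in the example repo

low = scrape_cost_mb(FILES, 800)   # light end of the per-page range
mid = scrape_cost_mb(FILES, 900)   # midpoint used for the estimate
high = scrape_cost_mb(FILES, 1000)  # heavy end of the per-page range

print(f"{FILES} file pages: {low:.0f} to {high:.0f} MB (about {mid:.0f} MB at 900KB/page)")
```

That is per repo, per crawl, before the scraper touches the repository contents themselves.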


Fair fair. I missed that
I can get a 50Gb/s residential link where I am, and have a whole rack of servers.
Sounds like a good opportunity to crowd fund thousands and thousands of common scrapeable instances that have random poisoning.


Low key win for kink communities.


Yeah but that was before you had billionaires of this size able to manipulate entire markets in this capacity.


Controversial opinion:
This is kind of a valid take or use I suppose.
And it’s something I struggle with as well.
I know how to program and I can make games with really shitty assets that no one would want to play because they look like crap. I’ve tried many times and I don’t seem to have the artistic skill set to make it happen. I’ve tried dozens of times to find and pay people on sites like Fiverr, with extremely disappointing results.
And as a hobby I can’t just afford to pay thousands of dollars to have someone make passable art either.
And someone like this as a student obviously doesn’t have the money to pay someone to build all their assets.
So what do??? It seems reasonable to have a desire to finish your passion project in some manner


This is also a strategic way to prevent resistance from Americans against an authoritarian regime.
Drones would be a significant part of that.


Yeah, it should inflate to 15TB or more I think


It literally says so in the link. Go to the link and it’s the title.
We’ve got 6 furry companions!
They are all proximity cats and we love it. They each have their thing when we go to bed. Usually 2-3 sleep with us, it’s a treat when we get 4 or 5. And they each have their own specific “spots” they gravitate towards.
Thankfully I’m a bit of a cat behavior nerd and have gotten them all to leave us alone in the morning regarding food. There is one that will annoy me in the morning, but it’s because she wants cuddles, aggressively. (She’ll straight up chomp me, actually bite me, if I ignore her sometimes.)


Really wish they would take security vulnerabilities seriously 😞
Because they are significant, and broad reaching.


Yeah but where do you download the whole raw data set?
The website itself shows you a table but doesn’t give you a raw download.
The download link links to a 404 page


In the US
Notice how it’s not “Us”, it’s “US”, and the sentence asserts that it is a place?
I’m not sure if you are acting dumb or not, if you are, it’s embarrassing.


It’s literally in the title. What else do you want?


Usually DNSBL will do this, yes.
It’s Prism, a multi-launcher for Minecraft Java Edition.