

Because Lemmy is federated, has to stop one account from voting multiple times, and is built on the ActivityPub protocol, all votes are effectively public, though Lemmy does try to obscure this a bit. Anyone can set up an instance, federate with other instances, and be sent all the votes along with who cast them.
Luckily for you there is a site, https://lemvotes.org/, that saves you the hassle. It works by looking up the votes on a single post rather than giving you bulk data.
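
To make the "who did them" part concrete, here is a rough Python sketch of what a receiving instance could read out of a federated vote. As far as I know, votes travel as ActivityStreams `Like` / `Dislike` activities whose `actor` field names the voter, but treat the exact payload shape as an assumption. The sample JSON, its URLs, and the `describe_vote` helper are all made up for illustration.

```python
import json

# Hypothetical vote activity as a federated instance might receive it.
# All URLs here are invented; real payloads may carry extra fields.
SAMPLE_ACTIVITY = """
{
  "id": "https://example-instance.test/activities/like/1234",
  "type": "Like",
  "actor": "https://example-instance.test/u/alice",
  "object": "https://other-instance.test/post/42"
}
"""

# Map ActivityStreams vote types to a score (assumed convention).
VOTE_SCORES = {"Like": 1, "Dislike": -1}


def describe_vote(activity: dict):
    """Return (voter, target, score) for a vote activity, else None."""
    score = VOTE_SCORES.get(activity.get("type"))
    if score is None:
        return None  # not a vote (e.g. a Create or Follow activity)
    return activity["actor"], activity["object"], score


vote = describe_vote(json.loads(SAMPLE_ACTIVITY))
print(vote)
```

The point is just that the voter's identity sits in plain sight in the activity, so any instance that receives it can log it.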







As someone with a public-facing website, I can tell you there are still significant volumes of scraping happening. Most of it appears to come from Southeast Asia and South America, and the scrapers take steps to hide who they are, so it’s not clear who is doing it or why. But like you say, it doesn’t appear to be OpenAI, Google, etc.
It doesn’t appear to be web search indexing either. The scraping is aggressive, and the volume will bring down a Lemmy server no matter how powerful the hardware.