For example: projects trying to detect artifacts in neural-network-generated data using a “simple” algorithm, the same way statistical regularities show up when data is compressed or otherwise analyzed (there’s a toy sketch of that kind of signal after this list). Anything that isn’t “our neural network detects other neural networks” and isn’t some proprietary bullshit.
Projects trying to block scrapers as best they can or feed them garbage data.
Collaborative networks for detecting well-known data, like images or text, that has very likely been generated by a neural network, and storing it in a database. Only if the detection methods are explained and can be verified, of course; otherwise anybody can claim anything.
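To make the “simple algorithm” idea concrete, here’s a toy sketch of one such statistical signal: generated text tends to be repetitive and predictable, and predictability shows up as compressibility. This is just an illustration, not a working detector; whether a plain compressor can actually separate human text from generated text is an open question, and the threshold below is a made-up placeholder.

```python
# Toy "simple algorithm" signal, NOT a reliable detector: repetitive,
# predictable text compresses better, so an unusually low compression
# ratio is one crude statistical red flag you could measure and verify.
import zlib

def compression_ratio(text: str) -> float:
    raw = text.encode("utf-8")
    return len(zlib.compress(raw, level=9)) / len(raw)

def looks_suspicious(text: str, threshold: float = 0.35) -> bool:
    # threshold is a made-up placeholder; a real tool would calibrate it
    # against corpora of known-human and known-generated text.
    return compression_ratio(text) < threshold

sample = "The quick brown fox jumps over the lazy dog. " * 10
print(compression_ratio(sample), looks_suspicious(sample))
```

The point isn’t that zlib beats detection models; it’s that a check like this is fully inspectable, so anyone can verify exactly what it measures.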
It would be nice to have an updating pinned post or something with links to research or projects trying to untangle this mess.
The only project I can think of right now: https://xeiaso.net/blog/2025/anubis/
For artists the first thing that came to mind was Nightshade: https://nightshade.cs.uchicago.edu/whatis.html
Yep, that’s nice, although it seems to be proprietary, which isn’t ideal; that’s the last thing we need right now. Companies exploited the hell out of everything, and now companies/universities are exploiting the solutions too, when there’s absolutely nothing stopping them from being open.
It’s interesting, but it can’t work outside of a lab. The example they gave was watermarking a picture of a cow with a purse: if every cow picture has a hidden purse watermark, an AI trained on them will learn to categorize a cow as a purse.
But making that work in the real world would require every artist, and especially the stock photo sites, to agree to watermark every cow with a purse. If everyone doesn’t pick a consistent purse watermark for cows, it becomes noise that gets trained out. Just like when training a model to identify a cow: sometimes there’s a farmhouse in the picture, other times grass, other times birds. It learns “cow” because that’s the consistent part. Without a purse watermarked into the majority of cow photos everywhere, the AI will learn “cow”.
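Here’s a minimal toy sketch of that consistency point. This is not Nightshade’s actual method (which computes optimized, per-image adversarial perturbations); it just blends one fixed pseudo-random pattern into every image, and the seed and strength values are made up for illustration:

```python
# Toy data-poisoning illustration, NOT Nightshade's real algorithm.
# The whole point: everyone must embed the SAME hidden pattern, so a
# model trained on the images latches onto it. If each artist used a
# different random pattern, it would average out as noise and get
# trained away, exactly as described above.
import numpy as np
from PIL import Image

PATTERN_SEED = 42  # hypothetical shared seed; same seed -> same pattern
STRENGTH = 0.03    # hypothetical blend strength; small enough to stay subtle

def poison(path_in: str, path_out: str) -> None:
    img = np.asarray(Image.open(path_in).convert("RGB"), dtype=np.float32) / 255.0
    rng = np.random.default_rng(PATTERN_SEED)
    # Same-sized images get an identical pattern; a real scheme would
    # need a pattern that survives resizing, cropping, and re-encoding.
    pattern = rng.uniform(-1.0, 1.0, size=img.shape).astype(np.float32)
    poisoned = np.clip(img + STRENGTH * pattern, 0.0, 1.0)
    Image.fromarray((poisoned * 255).astype(np.uint8)).save(path_out)

poison("cow.png", "cow_poisoned.png")
```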
~~LLMs are not neural networks, though.~~
Turns out they absolutely are. Not all neural networks are LLMs, though.
I’m going to link this at the current revision, so that it makes sense in the future: https://en.wikipedia.org/w/index.php?title=Transformer_(deep_learning)&oldid=1333135164
Read the first line from the link; I’ll add it here in case you’re lazy: “In deep learning, the transformer is an artificial neural network…”
Do you know what “GPT” stands for? “Generative Pre-trained Transformer”.
What did you think LLMs use? They’re literally just neural networks stacked as much as possible. That’s why they require all of those data centers: their only solution to the problem is adding more neural nets and more data, which means more hardware; at this point it’s borderline brute force. Sure, you can mention the “clever” tricks they use to “tokenize” words at the input stage, but that’s still a neural net in itself. Don’t get confused by their terminology: every single bit of the “technology” has an impressive-sounding name until you see how it actually works and smack your forehead so hard it leaves a mark forever.
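For what it’s worth, the “stacked neural networks” point is easy to see in code. A minimal sketch in PyTorch with made-up toy sizes; causal attention masking and positional encodings are omitted for brevity, and a real GPT is this same shape scaled up by orders of magnitude:

```python
# Minimal "stack of layers" sketch of a GPT-shaped network (toy sizes).
import torch
import torch.nn as nn

class TinyGPT(nn.Module):
    def __init__(self, vocab=50_000, d_model=256, n_heads=4, n_layers=6):
        super().__init__()
        # Token IDs become learned vectors: itself just a trainable layer.
        self.embed = nn.Embedding(vocab, d_model)
        block = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        # The entire "scale it up" strategy is this one argument:
        # repeat the same block n_layers times.
        self.blocks = nn.TransformerEncoder(block, num_layers=n_layers)
        self.unembed = nn.Linear(d_model, vocab)  # back to next-token scores

    def forward(self, token_ids):  # token_ids: (batch, seq_len) integers
        return self.unembed(self.blocks(self.embed(token_ids)))

model = TinyGPT()
logits = model(torch.randint(0, 50_000, (1, 16)))  # shape: (1, 16, 50_000)
```

Growing `n_layers`, `d_model`, and the training set is the “more hardware” treadmill in a nutshell.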
Oh, you’re absolutely right. I didn’t realize that GPTs are, of course, an ANN variant; I always envisioned them as essentially very large and boring vector databases.
I might want to rephrase: not all neural networks are LLMs.
I personally hate the current “AI” scam with all my heart and I’m so very aware of the extremely limited utility and unsustainable resource demands of the GPT approach. But I have no problem with the more abstract concept of neural networks per se. I expect them to be quite fundamental to any attempt at “real” AI, if we ever get past the current craze.
> attempt at “real” AI
I’m going to argue that there’s no such thing as “real AI”. We are going to create replicas of brains once we understand them fundamentally, and I mean to the point where we can explain them the same way we know how a CPU architecture works. Right now I think we’re insanely far from that. We barely understand brain diseases or how exactly neurotransmitters work, let alone big structures of neurons.
My argument is, we don’t even know what “real AI” means, because we don’t know what “I” means yet.
Whatever actual “AI” would look like, we can agree GPTs are not it.
What’s funny about current GPTs is how much manual adjustment is being done on them, when the whole idea of building them was that they’d “adjust themselves”, which of course was total bullshit from the start.
My idea would be to build PGP-like encryption into everything. A single user reading a Lemmy thread would only notice a little extra computation delay, but that computation load would become cost-prohibitive for a scraper.
Well, that’s basically what the project I linked (Anubis) does, although I’m not sure it solves all of the issues right now; it’s definitely a start.
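The gist of that kind of gate is a proof-of-work challenge. This is a generic sketch, not Anubis’s exact scheme (Anubis has its own challenge format); `DIFFICULTY` is a made-up tuning knob:

```python
# Generic proof-of-work sketch (not Anubis's exact implementation).
# The asymmetry: one human page view costs a fraction of a second of
# hashing, but a scraper fetching millions of pages pays that price
# millions of times, while the server's verification is a single hash.
import hashlib
import os

DIFFICULTY = 16  # made-up: leading zero bits required; tunes cost per request

def _hash(challenge: bytes, nonce: int) -> int:
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")

def solve(challenge: bytes) -> int:
    nonce = 0
    # Brute-force a nonce whose hash starts with DIFFICULTY zero bits:
    # ~2**DIFFICULTY attempts on average.
    while _hash(challenge, nonce) >> (256 - DIFFICULTY) != 0:
        nonce += 1
    return nonce

def verify(challenge: bytes, nonce: int) -> bool:
    return _hash(challenge, nonce) >> (256 - DIFFICULTY) == 0

challenge = os.urandom(16)       # server-issued, unique per session
nonce = solve(challenge)         # client burns CPU here
assert verify(challenge, nonce)  # server-side check: one cheap hash
```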
https://severian-poisonous-shield-for-images.static.hf.space/index.html
I have this one saved to check out once I have some time to look into it. Not sure how effective it is. I heard Nightshade doesn’t work anymore, or something along those lines, but don’t quote me on that.
Cool, but again, it seems proprietary, which is not ideal. Also, isn’t it a bit backwards to add artifacts instead of looking for ways to detect artifacts in generated images, so that we catch them early and avoid AI content in the first place?