sapient [they/them]

sapient [they/them]@sh.itjust.works · 3 years ago

I’m not sure they’ll succeed in extinguishing linux. But I do get the worry, especially with WSL.

What I am more worried about is them potentially extinguishing git via their control of GitHub. In particular, with their GitHub CLI tool and such >.<

sapient [they/them]@sh.itjust.works · 3 years ago

I’ve always thought of “blob” in terms of it being opaque and hard to understand, like a blob of putty with little structure you can dig into. You just have to take it as one solid, barely understandable mass to use it.

Never thought of it as Binary Large OBject ;p

sapient [they/them]@sh.itjust.works · 3 years ago

Something that might be useful long term is trying to train an AI and release weights to identify CSAM that admins can use to check images. The main problem is finding a way to do this without storing those kinds of images or video :/

My understanding is that right now, the main mechanisms rely on several central databases of perceptual hashes of known CSAM. The problem is that this ends up being a whack-a-mole solution, and at least in theory governments could use these databases to censor copyrighted or more general “unapproved” content, though I imagine such a db would lose trust quickly and I’m not aware of this being an issue in practice.
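For anyone unfamiliar with perceptual hashes, here is a rough sketch of the general idea using a simple "average hash". To be clear, this is my own toy illustration and not what any real database uses (those rely on more robust, often proprietary algorithms). The function names, the grayscale-grid input and the `max_distance` threshold are all assumptions made up for the example.

```python
from typing import Iterable, List


def average_hash(pixels: List[List[int]]) -> int:
    """Hash a small grayscale grid (e.g. 8x8, values 0-255).

    Each pixel contributes one bit: 1 if brighter than the mean, else 0.
    Assumes the image was already downscaled and converted to grayscale.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def matches_known(candidate: int, known_hashes: Iterable[int],
                  max_distance: int = 5) -> bool:
    """True if the candidate is within max_distance bits of any known hash.

    max_distance is a hypothetical threshold picked for illustration.
    """
    return any(hamming_distance(candidate, k) <= max_distance
               for k in known_hashes)
```

The point is that small edits (brightness changes, recompression) flip few or no bits, so a near match still gets caught, while only the hashes ever need to be stored or shared. It also shows the whack-a-mole limitation: anything not already in the database never matches.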

One potential solution is “opportunistic training” where, when new CSAM gets identified and submitted to the FBI or these databases by various server admins, a small amount of training is done on the AI weights before the image or video is deleted and only a perceptual hash remains. Furthermore, if a picture is reported as “known CSAM” by these dbs, then you do the same thing with that image before it gets deleted.

To avoid false positives, you also train the AI on general non-CSAM content.

Ideally this process would be fully automated so no-one has to look at that shit. Over time, you’d theoretically get a neural net capable of identifying CSAM reliably with few or no false positives or false negatives ^.^ Admins could also try for some kind of distributed training, where each contributes weight deltas from local training, or each builds up LoRA-style improvement modules and people combine them to reduce bandwidth for modification sharing.
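The weight-delta idea could look roughly like the sketch below: each admin trains locally, shares only the difference from the common base weights, and the deltas get averaged and applied. This is a minimal toy version in plain Python with weights as lists of floats. The `Weights` type and all function names are hypothetical, and a real setup would use an actual ML framework and would also have to deal with trust, poisoning and privacy, which this ignores.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def weight_delta(before: Weights, after: Weights) -> Weights:
    """Difference between locally trained weights and the shared base."""
    return {name: [a - b for a, b in zip(after[name], before[name])]
            for name in before}


def average_deltas(deltas: List[Weights]) -> Weights:
    """Element-wise mean of the deltas contributed by each admin."""
    n = len(deltas)
    return {name: [sum(vals) / n
                   for vals in zip(*(d[name] for d in deltas))]
            for name in deltas[0]}


def apply_delta(weights: Weights, delta: Weights) -> Weights:
    """Return new weights with the combined delta added to the base."""
    return {name: [w + d for w, d in zip(weights[name], delta[name])]
            for name in weights}


# Usage: two admins start from the same base and train locally.
base = {"layer": [1.0, 2.0]}
local_a = {"layer": [2.0, 2.0]}
local_b = {"layer": [1.0, 4.0]}
combined = average_deltas([weight_delta(base, local_a),
                           weight_delta(base, local_b)])
updated = apply_delta(base, combined)
```

Only the deltas cross the network here, never the training images. LoRA-style modules would shrink that further by sharing small low-rank factors instead of full-size deltas.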

sapient [they/them]@sh.itjust.works · 3 years ago

A lot of phones can relay to wifi they are connected to, rather than just using phone signal. That is, instead of using mobile data to provide internet, it forwards connections through the wifi the phone is connected to, essentially acting as a mini router :)