@scruiser I have to ask: Does anybody realize that an LLM is still a thing that runs on hardware? Like, it is both completely inert until you supply it with computing power, *and* essentially just one large matrix multiplication on steroids?
If you keep that in mind you can do things like https://en.wikipedia.org/wiki/Ablation_(artificial_intelligence), which I find particularly funny: you isolate the vector direction of the thing you don't want it to do (like refusing requests) and then project that direction out of the weights, so the model can no longer write along it.
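For the curious, here is a toy sketch of that idea in NumPy. It is not taken from any particular tool: the function names, the random "activations" and the matrix shapes are all made up for illustration, and a real model would apply this to each matrix that writes into the residual stream. The direction is estimated as a difference of mean activations, then projected out of a weight matrix.

```python
import numpy as np

def refusal_direction(harmful_acts, harmless_acts):
    """Hypothetical helper: difference of mean activations, scaled to unit length."""
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)

def ablate_direction(W, r):
    """Hypothetical helper: remove the component along unit vector r from W's output.

    W has shape (d_model, d_in) and produces y = W @ x, so
    W - r r^T W can no longer write anything along r.
    """
    return W - np.outer(r, r) @ W

rng = np.random.default_rng(0)
d_model, d_in = 8, 4

# Toy stand-ins for recorded activations (assumption: real ones come from
# running the model on two contrasting sets of prompts).
harmless = rng.normal(size=(32, d_model))
harmful = rng.normal(size=(32, d_model)) + 3.0 * np.eye(d_model)[0]

r = refusal_direction(harmful, harmless)
W = rng.normal(size=(d_model, d_in))
W_ablated = ablate_direction(W, r)

x = rng.normal(size=d_in)
print(np.dot(r, W @ x))          # generally non-zero
print(np.dot(r, W_ablated @ x))  # ~0 up to rounding
```

The whole "intervention" is one outer product and one matrix multiplication per weight matrix, which is rather the point: it is just linear algebra on numbers sitting in memory.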