More importantly, what was the recipe and was it any good?
Thanks for your answer. To be clear, what I'm looking for is a kind of masked fine-tuning: I want to "steer" a particular output instead of providing complete examples, which are costly to create.
The steering would work something like this: the model generates text, I correct a small portion partway through, and generation continues from my correction.
What I would like to do is train the model on these corrections, where many corrections might belong to the same overall generation. Conceptually, each correction should carry some training value. I don't know much about masking, but what I mean is that I don't want to train on a few tens or hundreds of (incomplete) samples, but rather on thousands of (masked) "steers" that each correct the course of the rest of the sample's generated text.
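In case it helps, here is a minimal sketch of the loss-masking idea, assuming a Hugging Face causal LM (the model name and the helper function are illustrative, not a specific recommendation): only the correction tokens get real labels, and everything before them is set to `-100`, which the loss ignores.

```python
# Minimal sketch: fine-tune only on corrected spans by masking the loss
# everywhere else. Assumes a Hugging Face causal LM; names are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # hypothetical model choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def build_masked_example(prefix: str, correction: str):
    """Tokenize prefix + correction; only correction tokens get labels."""
    prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids
    correction_ids = tokenizer(
        correction, add_special_tokens=False, return_tensors="pt"
    ).input_ids
    input_ids = torch.cat([prefix_ids, correction_ids], dim=1)
    labels = input_ids.clone()
    labels[:, : prefix_ids.shape[1]] = -100  # -100 = ignored by the loss
    return input_ids, labels

input_ids, labels = build_masked_example(
    "The model's generated text up to the point where I intervene ",
    "my corrected continuation",
)
loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()  # loss is computed only on the correction tokens
```

Each "steer" becomes one such example, so thousands of small corrections can be batched into an ordinary fine-tuning loop without ever writing complete target outputs.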
Thank you so much! That exactly answers my question: an official response (that person works at Meta) confirming it's the same base model.
I was concerned mainly because the release notes strangely didn't mention it anywhere, and I'd have thought it was important enough to mention.
On Lemmy, everything is a bit leftist at the moment.
Thanks for the answer! I hadn’t thought about asking for recipes based on the specific ingredients you have left.