More importantly, what was the recipe and was it any good?
Thanks for your answer. To be clear, what I'm looking for is a kind of masked fine-tuning: I want to "steer" a particular output instead of providing complete examples, which are costly to create.
The steering would work something like this: the model generates text, I correct a small portion partway through, and generation continues from my correction.
What I would like to do is train the model on these corrections, where many corrections might belong to the same overall generation. Conceptually, each correction should carry some training value. I don't know much about masking, but what I mean is that I don't want to train on a few tens or hundreds of (incomplete) samples, but rather on thousands of (masked) "steers" that each correct the course of the rest of the sample's generated text.
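In case it helps, here is a minimal sketch of the loss-masking idea, assuming a Hugging Face causal LM (the model name and the helper function are illustrative, not a specific recommendation): only the correction tokens get real labels, and everything before them is set to `-100`, which the loss ignores.

```python
# Minimal sketch: fine-tune only on corrected spans by masking the loss
# everywhere else. Assumes a Hugging Face causal LM; names are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # hypothetical model choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def build_masked_example(prefix: str, correction: str):
    """Tokenize prefix + correction; only correction tokens get labels."""
    prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids
    correction_ids = tokenizer(
        correction, add_special_tokens=False, return_tensors="pt"
    ).input_ids
    input_ids = torch.cat([prefix_ids, correction_ids], dim=1)
    labels = input_ids.clone()
    labels[:, : prefix_ids.shape[1]] = -100  # -100 = ignored by the loss
    return input_ids, labels

input_ids, labels = build_masked_example(
    "The model's generated text up to the point where I intervene ",
    "my corrected continuation",
)
loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()  # loss is computed only on the correction tokens
```

Each "steer" becomes one such example, so thousands of small corrections can be batched into an ordinary fine-tuning loop without ever writing complete target outputs.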
Thank you so much! That exactly answers my question: an official response (that person works at Meta) confirming it's the same base model.
I was concerned mainly because the release notes strangely didn't mention it anywhere, and I'd have thought it was important enough to mention.
On Lemmy, everything is a bit leftist at the moment.
Thanks for the answer! I hadn’t thought about asking for recipes based on the specific ingredients you have left.