My take on AI is obvious, but I have not seen it anywhere.

niva@discuss.tchncs.de · 1 year ago

My take on AI is obvious, but I have not seen it anywhere.

iopq@lemmy.world · 1 year ago

What’s the point of talking to yourself? Can you get better output by running it by yourself?

niva@discuss.tchncs.de · 1 year ago

Well, for me as a human, yes! We all constantly have an inner dialogue that helps us solve problems, and LLMs could do this as well. In principle it is not much different from playing chess against yourself. As far as I know, chess neural networks learn by playing against older versions of themselves, so the opponent doesn’t have to be an exact copy.

Some image generators are trained using two different AIs. AI-1 learns to tell generated images apart from real ones, and AI-2 tries to trick AI-1 by generating images that AI-1 can’t tell apart from real images. They train each other! The result is that AI-2 can create images that are very close to real ones, all without any human interaction. But they do need real images as training data.
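This two-network setup is usually called a generative adversarial network (GAN). A minimal sketch of the idea, shrunk down to single numbers instead of images: the "real data" distribution N(4, 1), the linear generator, the logistic discriminator, and all names here are assumptions chosen to keep the example tiny, not how any real image generator is built.

```python
import math
import random


def sigmoid(t: float) -> float:
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps: int = 2000, lr: float = 0.01, seed: int = 0):
    """Train a 1-D generator and discriminator against each other.

    Real data:      x_real ~ N(4, 1)   (arbitrary choice for this sketch)
    Generator:      x_fake = a * z + b, with noise z ~ N(0, 1)
    Discriminator:  D(x) = sigmoid(w * x + c), its estimate of P(x is real)
    """
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters (AI-2)
    w, c = 0.0, 0.0  # discriminator parameters (AI-1)

    for _ in range(steps):
        x_real = rng.gauss(4.0, 1.0)
        z = rng.gauss(0.0, 1.0)
        x_fake = a * z + b

        # AI-1 step: minimise -log D(x_real) - log(1 - D(x_fake)).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        grad_w = -(1.0 - d_real) * x_real + d_fake * x_fake
        grad_c = -(1.0 - d_real) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # AI-2 step: minimise -log D(x_fake), i.e. try to fool AI-1.
        d_fake = sigmoid(w * x_fake + c)
        grad_x = -(1.0 - d_fake) * w  # gradient of the loss w.r.t. x_fake
        a -= lr * grad_x * z
        b -= lr * grad_x

    return (a, b), (w, c)
```

Note that only AI-1 ever sees `x_real`; AI-2 learns purely from how AI-1 reacts to its fakes. Training like this can oscillate rather than settle, so treat the returned parameters as illustrative only.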

iopq@lemmy.world · 1 year ago

There are two steps:

1. The best known version plays chess against itself, so both sides are the strongest version. Meanwhile, a new version is baking that learns from these games.
2. Once the new version has learned from enough games, test it against the known best and see whether it wins more matches. If it does, it becomes the new best.
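The two steps above can be sketched roughly like this. It is a toy, not how any real chess engine is implemented: `Player`, `strength`, `train_candidate`, and the 55% promotion threshold are all assumptions made for illustration, and a match is simulated with an Elo-style coin flip instead of actual chess.

```python
import random
from dataclasses import dataclass


@dataclass
class Player:
    """Stand-in for a network; `strength` is a hypothetical Elo-like rating."""
    strength: float


def win_probability(a: Player, b: Player) -> float:
    """Elo-style chance that `a` beats `b` (draws are ignored in this toy)."""
    return 1.0 / (1.0 + 10 ** ((b.strength - a.strength) / 400.0))


def win_rate(candidate: Player, best: Player, games: int, rng: random.Random) -> float:
    """Step 2: play `games` simulated matches and return the candidate's win rate."""
    p = win_probability(candidate, best)
    wins = sum(rng.random() < p for _ in range(games))
    return wins / games


def train_candidate(best: Player, rng: random.Random) -> Player:
    """Step 1 placeholder: a real system would learn from the best version's
    self-play games. Here the new version's strength is just nudged randomly."""
    return Player(best.strength + rng.gauss(10.0, 30.0))


def improve(best: Player, rounds: int, games: int = 400,
            threshold: float = 0.55, seed: int = 0) -> Player:
    """Repeat both steps, promoting a candidate only if it clears the threshold."""
    rng = random.Random(seed)
    for _ in range(rounds):
        candidate = train_candidate(best, rng)
        if win_rate(candidate, best, games, rng) > threshold:
            best = candidate  # the candidate becomes the new best
    return best
```

Calling `improve(Player(1500.0), rounds=20)` returns whichever version survived the last promotion. Requiring more than a bare 50% win rate is a design choice that guards against promoting a candidate that only got lucky in the test matches.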

But how does it learn from watching? It has a predictive NN that tries to predict the best next move simply by looking at the board. Normally the next move is generated by thinking for a long time about a bunch of positions, so it would be great if you could reliably get it by looking at just the one board position. It also has the ability to guess who’s winning and by how much (either as a percentage or in material).

It improves this ability by comparing its output to the moves and win rates read out by the strongest version. You either improved or you didn’t: there’s a metric you can check, and you can also run some test matches once you stop improving so quickly.
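As a rough illustration of what such a metric could look like: one common way to score the comparison is cross-entropy on the move probabilities plus squared error on the win estimate. The function name, the argument layout, and this particular loss are assumptions for the sketch, not something any specific engine is confirmed to use.

```python
import math


def policy_value_loss(pred_policy, pred_value, search_policy, search_value):
    """Score the quick network guess against the slow search result.

    pred_policy / search_policy: move probabilities over the same move list.
    pred_value / search_value:   estimated outcome for the side to move.
    Lower is better; the loss is near 0 when the guess matches a one-hot target.
    """
    eps = 1e-12  # avoids log(0) when the network assigns a move zero probability
    policy_loss = -sum(
        target * math.log(pred + eps)
        for target, pred in zip(search_policy, pred_policy)
    )
    value_loss = (pred_value - search_value) ** 2
    return policy_loss + value_loss
```

Tracking this number over training gives exactly the kind of "improved or didn't" check described above, before spending time on test matches.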

With an LLM talking to itself, it’s not clear what metric you would want to optimize.