When I learned that it could factor numbers into primes, I got it to write me a simple Python GUI that would calculate a shitload of primes, pick big ones at random, multiply them, and then copy to the clipboard a prompt asking ChatGPT to factor the result. I spent an afternoon feeding it these giant numbers and making it factor them back into their constituent primes.
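For anyone curious what that kind of script looks like, here is a minimal sketch of the number-generating part. It is not the original script: the function names, the 1,000,000 sieve limit, the "upper half counts as big" rule, and the prompt wording are all assumptions for illustration, and the GUI and clipboard steps are left out.

```python
import math
import random


def primes_up_to(limit):
    """Sieve of Eratosthenes: return every prime <= limit."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for n in range(2, math.isqrt(limit) + 1):
        if is_prime[n]:
            # Cross out every multiple of n, starting at n*n.
            count = len(range(n * n, limit + 1, n))
            is_prime[n * n :: n] = bytearray(count)
    return [n for n, flag in enumerate(is_prime) if flag]


def make_factoring_prompt(limit=1_000_000, rng=random):
    """Pick two 'big' primes at random and build a prompt for their product.

    'Big' here just means the upper half of the sieve output (an assumption).
    """
    primes = primes_up_to(limit)
    big = primes[len(primes) // 2 :]
    p, q = rng.choice(big), rng.choice(big)
    prompt = f"Factor {p * q} into its prime factors."
    return p, q, prompt


if __name__ == "__main__":
    p, q, prompt = make_factoring_prompt()
    print(prompt)           # paste this into the chat
    print("answer:", p, q)  # keep this to check the reply
```

Keeping p and q around matters, since otherwise there is no cheap way to tell whether the reply is right.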
But don’t LLMs not actually do math, and just look at how often tokens show up next to each other? I don’t think it’s actually doing any prime number math over there.
If I fed it a big enough number, it would report back that a particular Python math library had failed to complete the task, so it must be neural-netting its answer AND crunching the numbers using sympy on its big supercomputer.
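To illustrate why a factoring routine can give up on a big enough number, here is a rough sketch using plain trial division with a step budget. This is not sympy and not whatever actually runs on their side; `factor_with_budget` and `max_steps` are made-up names, and the budget simply stands in for a timeout.

```python
def factor_with_budget(n, max_steps=1_000_000):
    """Factor n by trial division.

    Returns the prime factors in ascending order, or None if the step
    budget runs out first (a stand-in for "the library failed").
    """
    factors = []
    d = 2
    steps = 0
    while d * d <= n:
        if steps >= max_steps:
            return None  # gave up: the number was too expensive
        steps += 1
        if n % d == 0:
            factors.append(d)
            n //= d
        else:
            d += 1
    if n > 1:
        factors.append(n)  # whatever is left over is prime
    return factors
```

A product of two large primes is the worst case for this approach, because nothing divides it until the search reaches the smaller prime.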
Is it running arbitrary python code server side? That sounds like a vector to do bad things. Maybe they constrained it to only run some trusted libraries in specific ways or something.
They do math, just in a very weird (and obviously not super reliable) way. There’s a recent paper by Anthropic that explains it; I can track it down if you’d be interested.
Broadly speaking, the weights in a model will form sorts of “circuits” which can perform certain tasks. On something hard like factoring numbers the performance is probably abysmal but I’d guess the model is still trying to approximate the task somehow.
You could probably just say “thank you” over and over. Neural networks aren’t traditional programs that can exit early for trivial inputs. A traditional sorting program can check up front whether the input is already sorted and exit if it is. An LLM instead converts the input into starting values for a giant equation with billions of parameters, and getting an answer requires calculating the entire thing for every token it generates.
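A toy sketch of that contrast, under heavy assumptions: `sort_with_early_exit` shows the kind of shortcut a traditional program can take, and `toy_forward_pass` is a single made-up dense layer, nowhere near a real LLM, that only shows how the amount of arithmetic depends on the size of the weights and not on how trivial the input is.

```python
def sort_with_early_exit(items):
    """Sort a list, but skip the work if it is already in order."""
    # One cheap linear pass over neighbouring pairs.
    if all(a <= b for a, b in zip(items, items[1:])):
        return list(items), "early exit"
    return sorted(items), "full sort"


def toy_forward_pass(inputs, weights):
    """One dense layer with ReLU, counting every multiply-add performed.

    Each weight is used exactly once, whatever the input values are.
    """
    multiply_adds = 0
    outputs = []
    for row in weights:
        total = 0.0
        for x, w in zip(inputs, row):
            total += x * w
            multiply_adds += 1
        outputs.append(max(0.0, total))  # ReLU
    return outputs, multiply_adds


if __name__ == "__main__":
    print(sort_with_early_exit([1, 2, 3]))
    weights = [[0.5, -1.0], [2.0, 0.25]]
    # Same number of multiply-adds for a "trivial" and a "real" input.
    print(toy_forward_pass([0.0, 0.0], weights)[1])
    print(toy_forward_pass([3.0, 7.0], weights)[1])
```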
Maybe these larger models have some preprocessing of inputs by a traditional program to filter stuff, but seeing as they all seem to need a nuclear power plant and 10,000 GPUs to run, I’m guessing there isn’t much optimization.
Polluting the atmosphere to own the cons.
This is the left’s “rolling coal” lmao