Offline LLMs exist but tend to have a few terabytes of base data just to get started (e.g. before LoRAs).
Could you crunch an LLM into 700 MB that was still functional? ’Cause this looks like a fun thing to actually do as a joke.
Edit: I bet I could get https://huggingface.co/distilbert/distilgpt2 to run off a CD. How many tps am I gonna get, guys 🤣
Qwen3-0.6B is about 400 MB at Q4 and is surprisingly coherent for what it is.
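For anyone wondering where a number like that comes from, here’s a rough sketch of the arithmetic, assuming file size is roughly parameter count times bits per weight. The 4.5 bits per weight is my own assumption for a Q4-style format (4-bit values plus some per-block scale overhead), and `quantized_size_mb` is just a made-up helper name. Real files tend to come out somewhat larger because some tensors can be kept at higher precision and there’s metadata on top.

```python
def quantized_size_mb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in MB (10^6 bytes), ignoring file metadata."""
    return n_params * bits_per_weight / 8 / 1e6


# Assumed inputs: 0.6 billion parameters, ~4.5 effective bits per weight.
estimate = quantized_size_mb(0.6e9, 4.5)
print(f"roughly {estimate:.0f} MB of weights")
```

That lands in the same ballpark as the ~400 MB figure, which is about all a back-of-the-envelope estimate like this can promise.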
Wow, just popped it onto my very slow desktop and this little model rips haha. I really think tiny LLMs with a good LoRA on top are going to be a huge deal going forward
That’s so crazy that an LLM capable of doing anything at all can be that small! That leaves room for like an entire .avi episode of Family Guy at DVD resolution on there, which is the natural choice for the remaining space of course
there’s also tinyllama, which is somewhere around 600MB. it’s hilariously inept. it’s like someone jpeg-compressed a robot.
also you’re only gonna load off of that cd once so it’ll perform fine.
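If you want to guess how long that one-time load takes, here’s a minimal sketch. It assumes the standard 1x CD data rate of 150 KiB/s and a constant speed multiple, which real drives don’t hold across the whole disc, so treat the result as a loose estimate. `load_seconds` is a hypothetical helper, and the 400 MB / 24x inputs are just example values.

```python
CD_1X_BYTES_PER_SEC = 150 * 1024  # 1x CD-ROM data rate


def load_seconds(size_bytes: float, speed_multiple: float) -> float:
    """Time to read size_bytes sequentially at a constant multiple of 1x."""
    return size_bytes / (CD_1X_BYTES_PER_SEC * speed_multiple)


# Example: a 400 MB model file read at an assumed average of 24x.
seconds = load_seconds(400e6, 24)
print(f"about {seconds / 60:.1f} minutes to read the model into RAM")
```

After that the weights sit in RAM, so the disc speed no longer affects tokens per second.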
Does anyone know of any OSS LLMs that can search the web the way ChatGPT can?
It’s not the LLM that does the web searching, but the software stack around it. On its own, an LLM is just a text completer. What you’d need is a frontend like OpenWebUI or Perplexica that would ask the LLM for, say, five internet search queries that could return useful information for the prompt, throw those queries into SearxNG, and then pipe the results into the LLM’s context for it to use.
As for the models themselves, any decently-sized one that was released fairly recently would work. If you’re looking specifically for open-source rather than open-weight models (meaning that the training data and methodologies were also released rather than just the model weights), GPT-OSS 20B/120B and the OLMo models are recent standouts there. If not, the Qwen3 series are pretty good. (There are other good models out there, this is just what I remember off the top of my head.)
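To make the "software stack around it" part concrete, here’s a minimal sketch of that loop: ask the model for queries, run them, stuff the results into the context. This is not how OpenWebUI or Perplexica actually implement it. `answer_with_search`, `complete`, and `search` are hypothetical names; you’d supply your own `complete` wrapping whatever local model you run and your own `search` wrapping something like a SearxNG instance.

```python
from typing import Callable, List


def answer_with_search(
    prompt: str,
    complete: Callable[[str], str],      # hypothetical: text in, model completion out
    search: Callable[[str], List[str]],  # hypothetical: query in, result snippets out
    n_queries: int = 5,
) -> str:
    # Step 1: ask the model for search queries, one per line.
    raw = complete(
        f"Write {n_queries} web search queries, one per line, "
        f"that would help answer:\n{prompt}"
    )
    queries = [q.strip() for q in raw.splitlines() if q.strip()][:n_queries]

    # Step 2: run every query and collect the result snippets.
    snippets: List[str] = []
    for query in queries:
        snippets.extend(search(query))

    # Step 3: put the results into the context and ask the original question.
    context = "\n".join(f"- {s}" for s in snippets)
    return complete(f"Search results:\n{context}\n\nUsing these, answer:\n{prompt}")
```

Passing the model and the search backend in as plain functions keeps the sketch independent of any particular server or API, which is also why it works the same with any of the models mentioned.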
Depends. Does ChatGPT ignore robots.txt too?
FCKGW-RHQQ2-YXRKT-8TG6W-2B7Q8
make sure to disconnect the internet first
CrAcKeD
You can get offline versions of LLMs.
I’ve been toying with Qwen3.
On my steam deck.
8 bil param model runs stably.
It’s open source too!
Alpaca is a neat little flatpak that containerizes everything and makes running local models so easy that I can literally do it without a mouse or keyboard.
And gpt-oss is an offline version of chatgpt
I mean, most people have a local LLM in their pocket right now.
First thing that came to mind: GPT4All
It’s just audio of French farting cats.
Le pfffft.
If we assume a CD, you can probably fit a 256M-parameter model on it. But it will LOAD.
DVDs exist. They can fit approx. 7B params, enough to be somewhat productive.
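A quick sketch of how estimates like these can be worked out, going from disc capacity to parameter count. The capacities (700 MiB CD, 4.7 GB single-layer DVD) are nominal values and the bits-per-weight figures are my assumptions, not anything measured. `max_params` is a hypothetical helper.

```python
CD_BYTES = 700 * 1024**2   # nominal 700 MiB data CD
DVD_BYTES = 4_700_000_000  # nominal single-layer DVD


def max_params(capacity_bytes: float, bits_per_weight: float) -> float:
    """Upper bound on parameters that fit, ignoring file metadata and the runtime."""
    return capacity_bytes * 8 / bits_per_weight


# Assumed precisions: 16-bit weights, and ~4.5 bits for a Q4-style quantization.
for name, capacity in [("CD", CD_BYTES), ("DVD", DVD_BYTES)]:
    for bits in (16, 4.5):
        billions = max_params(capacity, bits) / 1e9
        print(f"{name} at {bits} bits/weight: ~{billions:.2f}B params")
```

So the answer depends heavily on the quantization you’re willing to accept: the same disc holds several times more parameters at ~4 bits than at 16.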
That’s just Dr Sbaitso.
It reminds me of the Britannica Encyclopedia on CD.
Encarta 95
Isn’t it possible to download all of Wikipedia, and isn’t it a surprisingly small file? Can it fit on a CD?
It could fit on a BDXL disc.
You can fit text-only Wikipedia on a normal Blu-ray as it’s only about 24 GB. You can also easily fit Llama 3.1 or any of the other open, offline-capable AI models as they’re only about 4 GB.
could also store it on a flashdrive or micro sd card
No
(English) 24.05 GB without media. Adding media adds 428.36 TB.
Can you give me the link to the text-only version? I only found a version that is like 43 GB.
The sizes I mentioned are from around 2023-2024, from https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia
https://dumps.wikimedia.org/enwiki/ (https://en.wikipedia.org/wiki/Wikipedia:Database_download)
I suggest the happy medium called Kiwix, directly from the programme you can download all of Wikipedia with medium-sized pictures for a hundred gigabytes or so.
Kiwix on mobile offers a 111.1 GB Wikipedia download. It also has a bunch of different categories if you don’t want the super large one.
500TB is still surprisingly reasonable for what is essentially a library of human (surface level) knowledge.
It would be interesting to know how large the file would be if it included all the references that are in text form (I’d imagine anything else, such as videos, would completely blow the proportions).
The full 2025-04 English-only ZIM dump is about 120 GB. That includes reduced-size images as well as all articles. I think the text-only version is in the 40-60 GB range.
There are smaller ZIM versions in the ~4 GB range that would fit on a DVD, but they’re only a subset for specific topics or for a list of the most popular topics.
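For a sense of scale, here’s a small sketch that counts how many discs each of the sizes quoted in this thread would need. The disc capacities are nominal values I’m assuming (700 MiB CD, 4.7 GB DVD, 25 GB single-layer Blu-ray), the dump sizes are the approximate ones mentioned above, and `discs_needed` is a hypothetical helper that ignores filesystem overhead.

```python
import math

# Nominal capacities in bytes (assumed values).
MEDIA = {
    "CD": 700 * 1024**2,
    "DVD": 4_700_000_000,
    "Blu-ray": 25_000_000_000,
}

# Approximate sizes quoted in the thread, in bytes.
DUMPS = {
    "topic subset (~4 GB)": 4_000_000_000,
    "text only (~43 GB)": 43_000_000_000,
    "full ZIM (~120 GB)": 120_000_000_000,
}


def discs_needed(size_bytes: int, capacity_bytes: int) -> int:
    """Number of discs if the file is split evenly across them."""
    return math.ceil(size_bytes / capacity_bytes)


for dump, size in DUMPS.items():
    counts = ", ".join(f"{discs_needed(size, cap)} x {name}" for name, cap in MEDIA.items())
    print(f"{dump}: {counts}")
```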
No, you really can’t; the text-only version is like 43 GB.
So gonna need like 2 CDs then
kiwix? that’s compressed (afaik), and when i tried, it took up half of my disk space and needed ethernet
Maybe they meant GTA?