Yes. GLM 4.5 is excellent. I mostly use that along with Qwen3 and DeepSeek 3.1 Terminus (Thinking) these days.

If you want a similar style and most of the capability (though probably somewhat lower-quality components, given the affordable price), I bought the similarly styled Ride1UP Portola and have been loving it. It’s even been mistaken for a Brompton before and has gotten a lot of compliments. And it’s a fraction of the price. I wish it had a torque sensor instead of a cadence sensor to trigger the electric assist, but it has plenty of power and range for me. Check out the ‘warm gray’ colorway for the classic Brompton style.


Thanks. I may give an updated system prompt like this a shot. Not sure where mine went wrong, other than maybe it wasn’t being honored or seen by OpenRouter (I’m not running 120b locally; it’s too large for my setup). I’m actually a bit confused about how to set parameters with OpenRouter.
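For what it’s worth, with OpenRouter’s OpenAI-compatible endpoint the system prompt and sampling parameters travel in the request body itself, so if the frontend isn’t forwarding them, the provider silently falls back to defaults. A minimal sketch of what that body looks like (the model slug, prompt text, and parameter values here are illustrative assumptions, not prescriptions):

```python
import json

# Sketch of an OpenRouter chat-completions request body. The model slug,
# prompts, and parameter values below are illustrative only.
payload = {
    "model": "z-ai/glm-4.5",
    "messages": [
        {"role": "system",
         "content": "Be concise. Keep answers under 150 words unless strictly necessary."},
        {"role": "user",
         "content": "Why can't my containers on different bridge networks talk?"},
    ],
    # Per-request sampling parameters ride along in the same body:
    "temperature": 0.6,
    "max_tokens": 512,
}

# Actually sending it would look roughly like this (needs an API key
# and the `requests` package):
# requests.post("https://openrouter.ai/api/v1/chat/completions",
#               headers={"Authorization": f"Bearer {api_key}"},
#               json=payload)
print(json.dumps(payload, indent=2))
```

If the system prompt in that body never makes it to the wire, no amount of rewording it will help, which is one way to tell a frontend plumbing problem apart from a model that’s just ignoring instructions.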


Yes, I do locally host several models, mostly the Qwen3 family stuff like 30b a3b etc. I’ve been trying GLM 4.5 a bit through OpenRouter and have been liking the style pretty well. Interesting to know I could potentially just pop in some larger RAM DIMMs and run even larger models locally. The thing is, OR is so cheap for many of these models, and with zero-data-retention policies available, I feel a bit stupid for even buying a 24 GB VRAM GPU to begin with.


Is this Grok Code Fast 1? I’ve noticed it’s been topping the OR programming charts recently. I was going to try it out, but unsurprisingly it won’t respect my zero-data-retention preference.
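For anyone else hitting this: my understanding (worth double-checking against OpenRouter’s provider-routing docs) is that you can express the preference per request via a `provider` object in the body, where `"data_collection": "deny"` skips providers that retain or train on prompts; a model served only by retaining providers then simply becomes unroutable. The model slug below is a guess:

```python
# Hypothetical OpenRouter request body showing the provider-routing
# preference that excludes providers which retain/train on prompts.
# The model slug is a guess; verify it on OpenRouter before use.
payload = {
    "model": "x-ai/grok-code-fast-1",
    "messages": [{"role": "user", "content": "..."}],
    "provider": {"data_collection": "deny"},
}
```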


Honestly it has been good enough until recently, when I’ve been struggling specifically with Docker networking stuff, and it’s been on the struggle bus with that. Yes, I’m using OpenRouter via OpenWebUI. I used to run a lot of stuff locally (mostly 4-bit quants of 32b and smaller models, since I only have a single 3090), but lately I’ve been trying larger models out on OpenRouter since many of the non-proprietary ones are super cheap. Like fractions of a penny for a response… Many are totally free up to a point as well.


I tried that! I literally told it to be concise and to limit its response to a certain number of words unless strictly necessary and it seemed to completely ignore both.


I’ve definitely been looking out for this for a while, wanting to replicate GPT deep research but not seeing a great way to do it. I did see that there was an OWUI tool for this, but it didn’t seem particularly battle-tested, so I hadn’t checked it out yet. I’ve been curious about how the new Tongyi Deep Research might be…
That said, specifically for troubleshooting somewhat esoteric (or at least quite bespoke in terms of configuration) software problems, I was hoping the larger coder focused models would have enough built-in knowledge to suss out the issues. Maybe I should be having them consistently augment their responses with web searches if this isn’t the case? I have not been clicking that button typically.
I do generally try to paste in or link as much of the documentation for whatever software I’m troubleshooting though.


The coder model (480B). I initially mistakenly said the 235b one but edited that. I didn’t know you could customize the quant on OpenRouter (and I thought the differences between most modern 4-bit quants and 8-bit were minimal as well…). I have tried GPT OSS 120 a bunch of times, and though it seems ‘intelligent’ enough, it is just too talkative and verbose for me (plus I can’t remember the last time it responded without somehow working an elaborate comparison table into the response), which makes it too hard to parse through things.


Appreciate you sharing your experience. With this being the case, and it being an order of magnitude more $$$ than Qwen3 Coder, I think I’ll mostly steer clear for now. Not sure why this model seems to have such mindshare and dominance with programmers these days, honestly, other than that many in the West seem somewhat biased against Chinese models.


Even if this isn’t going to solve the issue of the quality of the LLM’s advice and help, it would massively simplify my current workflow, which is copy/pasting logs, command output, and everything else into the OWUI window. I’ll check it out. Can you use OpenRouter with VSCode to get access to more models, or?


Yeah, it’s a great game. I didn’t quite do the full completionist thing, but I told myself I’d get at least 95% of all of the banandium gems for each level and that would be good enough haha.
I actually do like roguelikes and might check this out because if it does have a good gameplay loop it will have a ton of replayability as opposed to a DLC that just adds another world or two and you’re done with it in four hours. It’s cool they’re offering a demo as well.


In my opinion, Qwen3-30B-A3B-2507 would be the best here. The Thinking version is likely best for most things, as long as you don’t mind a slight speed penalty for more accuracy. I use the quantized IQ4_XS models from Bartowski or Unsloth on HuggingFace.
I’ve seen the new OSS-20B models from OpenAI ranked well in benchmarks but I have not liked the output at all. Typically seems lazy and not very comprehensive. And makes obvious errors.
If you want even smaller and faster, the Qwen3 distill of DeepSeek R1 0528 (8B) is great for its size (especially if you’re trying to free up some VRAM to use larger context lengths).


I used to think this would be sufficient as well, but as you’ve probably noticed, it’s damned near impossible to get our “democratic” governments to even consider something like this, since they have been so thoroughly captured by the wealthy elite. You don’t have to call it revolution or socialism if those are scary words or just don’t seem possible, but somehow we have to overcome that reality, and I don’t see how it can be done without dispossessing these elites of their power so we can dispossess them of their wealth.


The problem is that wealth = power and influence in society and this has led to essentially all modern neoliberal ‘democracies’ having their governments captured and controlled by the wealthy elite. So the only way you would be able to implement something like this would be via a revolution replacing capitalism with some form of socialism that would make private wealth a thing of the past and provide a decent standard of living for all.
I’m certainly not trying to say that other claims about Donald Trump’s association and friendship with Epstein aren’t true but the specifics of this one at least seem questionable.


Not to mention battery life…
Much of this reads like what the Democrats’ strategy has already been for the last 10 years or so, with some additional calls toward a “moderate” “centrism” that has not proven as popular with most Americans as left-populist policies. And a lot of it seems confused or just wrong. For example, Medicare for All is not an unpopular policy; last I checked, it was polling at around 60% approval.
Even if your politics are more centrist, I don’t see how this represents a substantive shift in any way. It’s minor tinkering around the edges or slightly altering messaging. That’s what the moment calls for? That’s the key to success and the way to fight back against an increasingly overt and ascendant strain of fascism in the country? If that’s what the Democratic leadership thinks, then one must honestly wonder if the party is institutionally incapable of the change that would need to happen to actually win elections and improve their godawful approval ratings. And based on their donor base, this shouldn’t actually be surprising. The leadership is still utterly captured by and beholden to wealthy interests that would rather jump off a bridge than acknowledge what Americans want: a party that will fight for meaningful material improvements in their standard of living. That implies meaningfully raising taxes on these interests and imposing costs or regulations on their businesses to secure benefits for regular working-class people who are struggling mightily.