/hn/comments_44850913.gph on codevoid.de

	_______ __ _______
	\| \| \|.---.-..----.\| \|--..-----..----. \| \| \|.-----..--.--.--..-----.
	\| \|\| _ \|\| __\|\| < \| -__\|\| _\| \| \|\| -__\|\| \| \| \|\|__ --\|
	\|___\|___\|\|___._\|\|____\|\|__\|__\|\|_____\|\|__\| \|__\|____\|\|_____\|\|________\|\|_____\|
	on Gopher (inofficial)
	Visit Hacker News on the Web


	COMMENT PAGE FOR:
	How I code with AI on a budget/free


	Pugsy99 wrote 5 hours 22 min ago:
	This is really useful, thank you so much

	Liftyee wrote 5 hours 35 min ago:
	Sidenote: On the second page, the 2nd and 4th paragraphs seem
	duplicated?

	heraldgeezer wrote 6 hours 24 min ago:
	Kinda reads like old style blog but worse but good tips.

	Also grok disclaimer lol

	personjerry wrote 10 hours 50 min ago:
	It's another advertisement article

	zwnow wrote 10 hours 53 min ago:
	All the AI corps have a free model, thats enough to use it for free no?

	DrSiemer wrote 19 hours 56 min ago:
	Ha, I'm working on a similar tool: [1] Glad to see I'm not the only one
	who prefers to work like that. I don't need many different models
	though, the free version of Gemini 2.5 Pro is usually enough for me.
	Especially the 1.000.000 token context length is really useful. I can
	just keep dumping full code merges in.

	I'll have a look at the alternatives mentioned though. Some questions
	just seem to throw certain models into logic loops.

	[1]: https://github.com/DrSiemer/codemerger

	iLoveOncall wrote 20 hours 23 min ago:
	This is nightmarish, whether or not you like LLMs.

	Just use Amazon Q Dev for free which will cover every single area that
	you need in every context that you need (IDE, CLI, etc.).

	matrixhelix wrote 21 hours 27 min ago:
	[1] [2] [3] [4] [5] [6] [7] [8]

	[1]: https://claude.ai
	[2]: https://chat.z.ai
	[3]: https://chatgpt.com
	[4]: https://chat.qwen.ai
	[5]: https://chat.mistral.ai
	[6]: https://chat.deepseek.com
	[7]: https://gemini.google.com
	[8]: https://dashboard.cohere.com
	[9]: https://copilot.microsoft.com

	imasl42 wrote 21 hours 43 min ago:
	You might find this repo helpful, it compares popular coding tools by
	hours with top-tier LLMs like Claude Sonnet:

	[1]: https://github.com/inmve/free-ai-coding

	gkoos wrote 23 hours 23 min ago:
	Looks like somebody is a tad bit over reliant on these tools but other
	than that there is a lot of value in this article

	scosman wrote 1 day ago:
	The qwen coder CLI gives you 1000 free requests per day to the qwen
	coder model (405b). Probably the best free option right now.

	faangguyindia wrote 1 day ago:
	Qwen cli uses whole file edit format which is slow and burns credits
	fast same is issue with gemini cli.

	indigodaddy wrote 23 hours 58 min ago:
	Do opencode/crush also have this problem?

	chrismustcode wrote 7 hours 47 min ago:
	I use opencode and never experienced it going through tokens at
	the speed I experienced Gemini CLI go through.

	Cant speak to qwen or crush as I have not used them

	Imustaskforhelp wrote 1 day ago:
	Ai studio using [1] is unlimited.

	I also use kiro which I got access for completely free because I was
	early on seeing kiro and actually trying it out because of hackernews!

	Sometimes I use cerebras web ui to get insanely fast token generation
	of things like gpt-oss or qwen 480 b or qwen in general too.

	I want to thank hackernews for kiro! I mean, I am really grateful to
	this platform y'know. Not just for free stuff but in general too.
	Thanks :>

	[1]: https://aistudio.google.com/

	codeclimber wrote 1 day ago:
	Nice write-up, especially the point about mixing different models for
	different stages of coding.
	Iâve been tracking which IDE/CLI tools give free or semi-free access
	to pro-grade LLMs (e.g., GPT-5, Claude code, Gemini 2.5 Pro) and how
	generous their quotas are. Ended up putting them side-by-side so itâs
	easier to compare hours, limits, and gotchas:

	[1]: https://github.com/inmve/free-ai-coding

	jug wrote 1 day ago:
	Itâs not free FREE but if you deposit at least $10 on OpenRouter, you
	can use their free models without credit withdrawals. And those models
	are quite powerful, like DeepSeek R1. Sometimes, they are rate limited
	by the provider due to their popularity but it works in a pinch.

	PufPufPuf wrote 1 day ago:
	Actually nowadays they allow unlimited usage of free models without
	depositing anything.

	5kyn3t wrote 1 day ago:
	Why is Mistral not mentioned. Is there any reason? I have the
	impression that they are often ignored by media, bloggers, devs when it
	comes to comparing or showcasing LLM thingies.
	Comes with free tier and quality is quite good. (But I am not an AI
	power user)

	[1]: https://chat.mistral.ai/chat

	sunaookami wrote 1 day ago:
	Becase Mistral is very bad, Qwen, Kimi and GLM are just better.

	epolanski wrote 1 day ago:
	Off topic but I use Mistral in production for various one shot tasks
	(mostly summarizing), it's incredibly cheap, fast and effective.

	Bonus: it's European, kinda tired of giving always money to the
	American overlords.

	3036e4 wrote 1 day ago:
	Maybe optimistic, but reading posts like this makes me hopeful that
	AI-assisted coding will drive people to design more modular and sanely
	organized code, to reduce the amount of context required for each task.
	Sadly pretty much all code I have worked with have been giant messes of
	everything being connected to everything else, causing the entire
	project to be potential context for anything.

	aitchnyu wrote 6 hours 39 min ago:
	Guess the name: In 2015 I was preaching that ____ simplfies the
	mental model of your web app, makes everything performant, the api
	will dominate and endure for decades. Answer: React. Soon, we were
	feeling the induced demand for features and timelines till we
	regressed to our natural "hair on fire" state. AI is (not just) a
	better footgun.

	bongodongobob wrote 20 hours 10 min ago:
	It's really very good at that. Frequently, I'll have something I've
	been working on over the years that has turned into an interconnected
	mess. "Split this code into modules of separated concerns". Bam,
	done. I used Claude for the first time last week and gave it a 2k
	line PowerShell script and it neatly pulled it apart into 5 working
	modules on the first try. Worked exactly the same, and ended up with
	better comments too.

	mattmanser wrote 18 hours 58 min ago:
	So I've done that sort of refactoring a lot, albeit on real code in
	much bigger systems, not a script. Lots of coders won't do this,
	they'll just keep adding to the crap, crazy big module.

	I always end up with a vastly smaller code base. Like 2000 lines
	turns into 800 lines or something like that.

	Did that happen too or did the AI just do a glorified 'extract
	method', that any decent IDE can already do without AI?

	I use AI, I'm not anti it, but on the other hand I keep seeing
	these gushing posts where I'm like 'but your ide could already do
	that, just click the quick refactoring button'.

	bongodongobob wrote 16 hours 37 min ago:
	It ended up being less, yeah, but not by that much, maybe 15%.
	Thing is, there was no "extract methods" possible. It was an old
	user script creation tool that had been modified over the last 10
	years. If it were that easy, I would have just done it myself.

	What this shows me is that it truly understands all the things
	this script was supposed to do and was able to organize it
	better, while not breaking any functionality.

	epolanski wrote 1 day ago:
	It does, you're essentially forced to write good coding guidelines
	and documentation.

	mathiaspoint wrote 1 day ago:
	LLMs will write code this way if you ask but you have to know to ask.

	casparvitch wrote 1 day ago:
	At that(/what) point does it become harder for a human to grok a
	project?

	saratogacx wrote 19 hours 18 min ago:
	It depends if you're willing to drop the $30 for the super
	version :)

	mathiaspoint wrote 1 day ago:
	That's always how it works no matter how good the model is. I'm
	surprised people keep forgetting this. If no one has the theory
	then the artifacts are almost unmaintainable.

	You can end up doing this with entirely human written code too.
	Good software devs can see it from a mile away.

	Oras wrote 1 day ago:
	OP must be a master of context switching! I canât imagine opening
	that number of tabs and still focus

	funkydata wrote 1 day ago:
	Also, well, I mean... If there's all that time/effort involved...
	Just get yourself some tea, coffee, doodle on some piece of paper, do
	some push-ups, some yoga, prey, meditate, breathe and then... Code,
	lol!

	brokegrammer wrote 1 day ago:
	These tricks are a little too much for me. I'd rather just write the
	code myself instead of opening 20 tabs with different LLM chats each.

	However, I'd like to mention a tool called repomix ( [1] ), which will
	pack your code into a single file that can be fed to an LLM's web chat.
	I typically feed it to Qwen3 Coder or AI Studio with good results.

	[1]: https://repomix.com/

	jstummbillig wrote 1 day ago:
	Let's just be honest about what it is we actually do: The more people
	maximize what they can get for free, the more other people will have to
	shoulder the higher costs or limitations that follow. That's completely
	fine, not trying to pass judgement â but that's certainly not "free"
	unless you mean exactly "free for me, somebody else pays".

	hoppp wrote 1 day ago:
	The chatgpt free tier doesn't seem to expire unlike claude or mistral
	ai, they just downgrade it to a different model

	burgerone wrote 1 day ago:
	Why are people still drawn to using pointless AI assistants for
	everything? What time do we save by making the code quality worse
	overall?

	ashirviskas wrote 17 hours 0 min ago:
	The answers would be similar to the question "why is Javascript so
	popular". It was not fast to run, not safe, not optimized and poor in
	most areas except for being almost universal and having results
	faster either due to js developers availability, or due to it being a
	high level language, even if it did try to multiply a "dog" string by
	2 sometimes in some spaghetti codebase. It got better, but even
	before that this formula was "delivery > quality". It's also why
	almost no one writes assembly for production. Or C, and we get tons
	of bloated electron apps.

	(If it was not clear, I have no love for JS and I never really
	programmed in it, but you have to admit, it did allow us to have more
	stuff. Even if 99% of it should be torched by fire if evaluated
	purely from engineering perspective)

	xwolfi wrote 15 hours 23 min ago:
	JS is cool because the browser interprets it to show nice effects
	on website. AI agents are pointless

	NKosmatos wrote 1 day ago:
	Now all we need is a wrapper/UI/manager/aggregator for all these "free"
	AI tools/pages so that we can use them without going into the hassle of
	changing tabs ;-)

	precompute wrote 1 day ago:
	I only use LLMs as a substitute for stackexchange, and sometimes to
	write boilerplate code. The free chat provided by deepseek works very
	well for me, and I've never encountered any usage limits. V3 / R1 are
	mostly sufficient. When I need something better (not very often), I
	use Claude's free tier.

	If you really need another model / a custom interface, it's better to
	use openrouter: deposit $10 and you get 1000 free queries/day across
	all free models. That $10 will be good for a few months, at the very
	least.

	Weetile wrote 1 day ago:
	I'd love to see a thread that also takes advantage of student offers -
	for example, GitHub Copilot is free for university and college students

	nottorp wrote 1 day ago:
	Was the page done with AI? The scrolling is kinda laggy. Firefox/m3
	pro.

	radio879 wrote 1 day ago:
	yeah i tried fixing it - the websites were more of an afterthought or
	annoying thing i had to do and definitely did it way too fast

	chvid wrote 1 day ago:
	Slightly off topic: What are good open weight models for coding that
	run well on a macbook?

	worik wrote 1 day ago:
	A lot of work to evaluate these models. Thank you

	radio879 wrote 1 day ago:
	I don't like or love many things in life, but something about AI
	triggered that natural passion I had when I was first learning to
	code as a kid. Its just super fun. Coding without AI stopped being
	fun looong time ago. Unlucky brain or genetics maybe. AI sped up the
	dopamine feedback iteration loop to where my brain can really feel it
	again. I can get an idea in my head and just an hour later, have it
	80% done and functioning. That gives me motivation, I won't get bored
	of the idea before I write the code.. which is what would happen a
	lot. Halfway done, get bored, then don't wanna continue.. AI fixed
	that

	bambax wrote 1 day ago:
	As the post says, the problem with coding agents is they send a lot of
	their own data + almost your entire code base for each request: that's
	what makes them expensive. But when used in a chat the costs are so low
	as to be insignificant.

	I only use OpenRouter which gives access to almost all models.

	Sonnet was my favorite until I tried Gemini 2.5 Pro, which is almost
	always better. It can be quite slow though. So for basic questions /
	syntax reminders I just use Gemini Flash: super fast, and good for
	simple tasks.

	hoerzu wrote 1 day ago:
	To stop tab switching I built an extension to query all free models all
	at once:

	[1]: https://llmcouncil.github.io/llmcouncil/

	unixfox wrote 1 day ago:
	Is it possible to have the source code? I see that there is a github
	icon at the bottom of the page but it doesn't work.

	nolist_policy wrote 1 day ago:
	But isn't it in the extension store?

	sublinear wrote 1 day ago:
	This all sounds a lot more complicated and time consuming than just
	writing the damn code yourself.

	gexla wrote 1 day ago:
	Wow, there's a lot here that I didn't know about. Just never drilled
	that far into the options presented. For a change, I'm happy that I
	read the article rather than only the comments on HN. ;)

	And lots of helpful comments here on HN as well. Good job everyone
	involved. ;)

	tonyhart7 wrote 1 day ago:
	I replicate SDD from kiro code, it works wonder for multi switching
	model because I can just re fetch from specs folder

	qustrolabe wrote 1 day ago:
	I bet it's crazy to some people that others okay with giving up so much
	of their data for free tiers. Like yeah it's better to selfhost but it
	takes so much resources to run good enough LLM at home that I'd rather
	give up my code for some free usage, anyway that code eventually will
	end up open source

	jama211 wrote 1 day ago:
	And as far as Iâm concerned if my work is happy for me to use
	models to assist with code, then itâs not my problem

	yichuan wrote 1 day ago:
	I think thereâs huge potential for a fully local âCursor-likeâ
	stack â no cloud, no API keys, just everything running on your
	machine.

	The setup could be:
	â¢ Cursor CLI for agentic/dev stuff (example: [1] )
	â¢ A local memory layer compatible with the CLI â something like
	LEANN (97% smaller index, zero cloud cost, full privacy, [2] ) or
	Milvus (though Milvus often ends up cloud/token-based)
	â¢ Your inference engine, e.g. Ollama, which is great for running OSS
	GPT models locally

	With this, youâd have an offline, private, and blazing-fast personal
	dev+AI environment. LEANN in particular is built exactly for this kind
	of setup â tiny footprint, semantic search over your entire local
	world, and Claude Code/ Cursor âcompatible out of the box, the ollama
	for generation. I guess this solution is not only free but also does
	not need any API.

	But I do agree that this need some effort to set up, but maybe someone
	can make these easy and fully open-source

	[1]: https://x.com/cursor_ai/status/1953559384531050724
	[2]: https://github.com/yichuan-w/LEANN

	andylizf wrote 1 day ago:
	Yeah, this seems a really fantastic summary of our ideal local AI
	stack. A powerful, private memory layer has always felt like the
	missing piece for tools like Cursor or aider.

	The idea of this tiny, private index like what the LEANN project
	describes, combined with local inference via Ollama, is really
	powerful. I really like this idea about using it in programming, and
	a truly private "Cursor-like" experience would be a game-changer.

	oblio wrote 1 day ago:
	You should probably disclose everywhere you comment that you're
	advertising for Leann.

	airtonix wrote 1 day ago:
	it might be free, private, blazing fast (if you choose a model with
	appropriate parameters to match your GPU).

	but you'll quickly notice that it's not even close to matching the
	quality of output, thought and reflecting that you'd get from running
	the same model but significantly high parameter count on a GPU
	capable of providing over 128gb of actual vram.

	There isn't anything available locally that will let me load a 128gb
	model and provide anything above 150tps

	The only thing that local ai model makes sense for right now seems to
	be Home Assistant in order to replace your google home/alexis.

	happy to be proven wrong, but the effort to reward just isn't there
	for local ai.

	PeterStuer wrote 1 day ago:
	Because most of the people squeezing that highly quantized small
	model into their consumer gpu don't get how they have left no room
	for the activation weights, and are stuck with a measly small
	context.

	xvv wrote 1 day ago:
	As of today, what is the best local model that can be run on a system
	with 32gb of ram and 24gb of vram?

	fwystup wrote 22 hours 31 min ago:
	Qwen3-Coder-30B-A3B-Instruct-FP8 is a good choice ('qwen3-coder:30b'
	when you use ollama). I have also had good experiences with [1]
	(built under a collaboration between Mistral AI and All Hands AI)

	[1]: https://mistral.ai/news/devstral

	ethan_smith wrote 1 day ago:
	DeepSeek Coder 33B or Llama 3 70B with GGUF quantization (Q4_K_M)
	would be optimal for your specs, with Mistral Large 2 providing the
	best balance of performance and resource usage.

	v5v3 wrote 1 day ago:
	Start with Qwen of a size that fits in the vram.

	hgarg wrote 1 day ago:
	Just use Rovodev CLI. Gives you 20 million tokens for free per 24 hours
	and you can switch between sonnet 4 / gpt-5.

	sumedh wrote 1 day ago:
	What is the catch?

	ireadmevs wrote 1 day ago:
	> Beta technology disclaimer
	> Rovo Dev in the CLI is a beta product under active development.
	We can only support a certain number of users without affecting the
	top-notch quality and user experience we are known for providing.
	Once we reach this limit, we will create a waiting list and
	continue to onboard users as we increase capacity. This product is
	available for free while in beta.

	From

	[1]: https://community.atlassian.com/forums/Rovo-for-Software-T...

	indigodaddy wrote 22 hours 36 min ago:
	Isn't this only available to a current Jira cloud/service
	subscription?

	joshdavham wrote 1 day ago:
	> When you use AI in web chat's (the chat interfaces like AI Studio,
	ChatGPT, Openrouter, instead of thru an IDE or agent framework) are
	almost always better at solving problems, and coming up with solutions
	compared to the agents like Cline, Trae, Copilot.. Not always, but
	usually.

	I completely agree with this!

	While I understand that it looks a little awkward to copy and paste
	your code out of your IDE and into a web chat interface, I generally
	get better results that way than with GitHub copilot or cursor.

	saejox wrote 3 hours 42 min ago:
	I also agree with this.

	While the ai has less context, you have more context using the
	limited chat window. You know what you need from the ai.

	SV_BubbleTime wrote 1 day ago:
	100% opposite experience.

	Whether agentic, notâ¦ itâs all about context.

	Either agentic with access to your whole project, âlivesâ in
	GitHub, a fine tune, or RAG, or whateverâ¦ having access to all of
	the context drastically reduces hallucinations.

	There is a big difference between âwrite xâ and âwrite x for …
	in my style, with all y dependencies, and considering all z code that
	exists around itâ.

	Iâm honestly not understand a defense of copy and paste AI
	codingâ¦ this is why agents are so massively popular right now.

	chazhaz wrote 1 day ago:
	Agreed that itâs all about context â but my experience is that
	pasting into web chat allows me to manage context much more than if
	I drop the whole project/whole filesystem into context. With the
	latter approach the results tend to be hit-and-miss as the model
	tries to guess whatâs right. All about context!

	b2m9 wrote 1 day ago:
	Iâm also surprised by this take. I found copy/paste between
	editor and external chats to be way less helpful.

	That being said, I think everyone has probably different
	expectations and workflows. So if thatâs what works for them, who
	am I to judge?

	andrewmcwatters wrote 1 day ago:
	I jump between Claude Sonnet 4 on GitHub Copilot Pro and now GPT-5 on
	ChatGPT. That seems to get me pretty far. I have gpt-oss:20b installed
	with ollama, but haven't found a need to use it yet, and it seems like
	it just takes too long on an M1 Max MacBook Pro 64GB.

	Claude Sonnet 4 is pretty exceptional. GPT-4.1 asks me too frequently
	if it wants to move forward. Yes! Of course! Just do it! I'll reject
	your changes or do something else later. The former gets a whole task
	done.

	I wonder if anyone is getting better results, or comparable for cheaper
	or free. GitHub Copilot in Visual Studio Code is so good, I think it'd
	be pretty hard to beat, but I haven't tried other integrated editors.

	bravesoul2 wrote 1 day ago:
	Windsurf has a good free model. Good enough for autocomplete level work
	for sure (haven't tried it for more as I use Claude Code)

	b2m9 wrote 1 day ago:
	You mean SWE-1? I used it like a dozen times and I gave up because
	the responses were so bad. Not even sure whether itâs good enough
	for autocomplete because itâs the slowest model Iâve tested in a
	while.

	bravesoul2 wrote 1 day ago:
	Not my experience for slowness. For smartness I am typically using
	it for simple "not worth looking that up" stuff rather than even
	feature implementation. Got it to write some MySQL SQL today, for
	example.

	indigodaddy wrote 1 day ago:
	Assuming you have to at least be logged into a windsurf account
	though?

	bravesoul2 wrote 1 day ago:
	Yeah. I didn't see not logged in as a requirement.

	radio879 wrote 1 day ago:
	I am the person that wrote that. Sorry about the font. This is a bit
	outdated, AI stuff goes at high speed. More models so I will try to
	update that.

	Every month so many new models come out. My new fav is GLM-4.5... Kimi
	K2 is also good, and Qwen3-Coder 480b, or 2507 instruct.. very good as
	well. All of those work really well in any agentic environment/in agent
	tools.

	I made a context helper app ( [1] ) which is linked to from there which
	helps jump back and forth from all the different AI chat tabs i have
	open (which is almost always totally free, and I get the best output
	from those) to my IDE. The app tries to remove all friction, and
	annoyances, when you are working with the native web chat interfaces
	for all the AIs. Its free and has been getting great feedback,
	criticism welcome.

	It helps the going from IDE <----> web chat tabs. Made it for myself to
	save time and I prefer the UI (PySide6 UI so much lighter than a
	webview)

	Its got Preset buttons to add text that you find yourself typing very
	often, per-project state saves of window size of app and which files
	were used for context. So next time, it opens at same state.

	Auto scans for code files, guesses likely ones needed, prompt box that
	can put the text above and below the code context (seems to help make
	the output better). One of my buttons is set to: "Write a prompt for
	Cline, the AI coding agent, enclose the whole prompt in a single code
	tag for easy copy and pasting. Break the tasks into some smaller tasks
	with enough detail and explanations to guide Cline. Use search and
	replace blocks with plain language to help it find where to edit"

	What i do for problem solving, figuring out bugs: I'm usually in VS
	Code and i type aicp in terminal to open the app. Fine tune any files
	already checked, type what i am trying to do or what problem i have to
	fix, click Cline button, click Generate Context!. Paste into GLM-4.5,
	sometimes o3 or o4-mini, GPT-5, Gemini 2.5 Pro.. if its a super hard
	thing i'll try 2 or 3 models. I'll look and see which one makes the
	most sense and just copy and paste into Cline in VS Code - set to GPT
	4.1 which is unlimited/free.. 4.1 isn't super crazy smart or anything
	but it follows orders... it will do whatever you ask, reliably. AND, it
	will correct minor mistakes from the bigger model's output. The bigger
	smarter models can figure out the details, and they'll write a prompt
	that is a task list with how-to's and why's perfect for 4.1 to go and
	do in agent mode....

	You can code for free this way unlimited, and its the smartest the
	models will be. Anytime you throw some tools or MCPs at a model it
	dumbs them down.... AND you waste money on all the API costs having to
	use Claude 4 for everything

	[1]: https://wuu73.org/aicp

	faangguyindia wrote 4 hours 20 min ago:
	How does your tool differ from this one? [1] If i am not wrong, aider
	uses repomap to select the context and pack it efficiently for LLM?
	this coupled with auto copy paste to and from chat web UI of AI
	provider?

	[1]: https://aider.chat/docs/usage/copypaste.html

	frumplestlatz wrote 11 hours 58 min ago:
	It was a bit difficult to trust the source after seeing the phrase
	"Nazi-adjacent" used in relation to Grok.

	busymom0 wrote 22 hours 45 min ago:
	I built a relevant tool (approved by Apple this week) which may help
	reduce the friction of you having to constantly copy paste text
	between your app and the AI assistant in browser.

	It's called SelectToSearch and it reduces my friction by 85% by
	automating all those copy paste etc actions with a single keyboard
	shortcut:

	[1]: https://apps.apple.com/ca/app/select-to-search-ai-assistant/...

	VagabundoP wrote 1 day ago:
	I tried Cline with chatgpt 4.1 and I was charged - there are some
	free credits when you sign up for Cline that it used.

	Not sure how you got it for free?

	radio879 wrote 17 hours 19 min ago:
	look up LLM7, and Pollinations AI. Both offer free GPT 4.1, but I
	am not sure how limited it is. They have tons more models but the
	names are different (openai-large = gpt-4.1)

	Meta has free and generous APIs for the crappy Llama 4 models...
	they're okay at summarizing things but I have no idea if its any
	good for code. Prob not since no one even talks about those
	anymore.

	debian3 wrote 17 hours 25 min ago:
	GH Copilot is my guess. Not free, but $10 a month or free for
	students

	VagabundoP wrote 5 hours 1 min ago:
	Ahh I think this is mentioned in the blog post proper. Probably
	just ommited from the text comment.

	Iwill try the free Windsurf tier using the prompts created and
	see if it can do it similar to 4.1

	stuart73547373 wrote 1 day ago:
	(relevant self promotion) i wrote a cli tool called slupe that lets
	web based llm dictate fs changes to your computer to make it easier
	to do ai coding from web llms

	[1]: https://news.ycombinator.com/item?id=44776250

	tummler wrote 1 day ago:
	Anecdotal, but Grok seems to have just introduced pretty restrictive
	rate limits. Theyâre now giving free users access to Grok 4 with a
	low limit and then making it difficult to manually switch to Grok 3
	and continue. Will only allow a few more requests before pushing an
	upgrade to paid plans. Just started happening to me last night.

	maxiepoo wrote 1 day ago:
	do you really have 20+ tabs of LLMs open at a time?

	radio879 wrote 1 day ago:
	some days.. it varies but a whole browser window is dedicated to it
	and always open

	dcuthbertson wrote 1 day ago:
	FYI: the first AI you link to, " z.ai's GLM 4.5", actually links to
	zai.net, which appears to be a news site, instead of "chat.z.ai",
	which is what I think you intended.

	radio879 wrote 1 day ago:
	oops. was using AI trying to fix some of the bugs and update it
	real fast with some newer models, since this post was trending
	here. Hopefully its scrolling better. Link fixed. I know its still
	ridiculous looking with some of the page but at least its readable
	for now.

	battxbox wrote 1 day ago:
	Fun fact, zai[.]net seems to be an italian school magazine. As an
	italian I've never known about it, but the words pun got me
	laughing.

	zai[.]net -> zainet -> zainetto -> which is the italian word for
	"little school backback"

	cropcirclbureau wrote 1 day ago:
	Note that the website is scrolling very slow, sub1-fps on Firefox
	Android. I'm also unable to scroll the call-out about grok. Also,
	there's this strange large green button reading CSS loaded at the
	top.

	morsch wrote 1 day ago:
	Works fine, Firefox Android 142.0b9

	subscribed wrote 1 day ago:
	I scroll just fine on Vanadium, Duck browser and brave.

	oblio wrote 1 day ago:
	On Android?

	ya3r wrote 1 day ago:
	Have you seen Microsoft's copilot? It is essentially free openai
	models

	simonw wrote 1 day ago:
	Which of their many Copilot products do you mean?

	ya3r wrote 11 hours 16 min ago:
	The regular copilot, copilot.microsoft.com

	T4iga wrote 1 day ago:
	And to anyone who has ever used it, it appears more like opening
	smoothbrain.
	For a long time it was the only allowed model at work and even for
	basic cyber security questions it was sometimes completely useless.

	I would not recommend it to anyone.

	teiferer wrote 1 day ago:
	> You can code for free this way

	vs

	> If you set your account's data settings to allow OpenAI to use your
	data for model training

	So, it's not "for free".

	can16358p wrote 1 day ago:
	Many folks, especially if they are into getting things free, don't
	really care much about privacy narrative.

	So yes, it is free.

	johnnyanmac wrote 10 hours 42 min ago:
	if you consider watching a hour of Youtube and 30 minutes of ads
	to be "free videos", then be my guest. Not everything can be
	measured in a dollar value.

	throwaway83711 wrote 1 day ago:
	It's a transactionâa trade. You give them your personal data,
	and you get their services in exchange.

	So no, it's not free.

	coliveira wrote 1 day ago:
	Tech companies are making untold fortunes from unsophisticated
	people like you.

	1dom wrote 1 day ago:
	> So yes, it is free.

	This sounds pedantic, but I think it's important to spell this
	out: this sort of stuff is only free if you consider what you're
	producing/exchanging for it to have 0 value.

	If you consider what you're producing as valuable, you're giving
	it away to companies with an incentive to extract as much value
	from your thing as possible, with little regard towards your
	preferences.

	If an idiot is convinced to trade his house for some magic beans,
	would you still be saying "the beans were free"?

	motoxpro wrote 23 hours 50 min ago:
	I don't think that's true. It's not that has zero value, it's
	that it has zero monetizable value.

	Hackernews is free. The posts are valuable to me and I guess my
	posts are valuable to me, but I wouldn't pay for it and I
	definitely don't expect to get paid.

	For YC, you are producing content that is "valuable" that
	brings people to their site, which they monetize through people
	signing up for their program. They do this with no regard for
	what your preferences are when they choose companies to invest
	in.

	They sell ads (Launch, Hire, etc.) against the attention that
	you create. You ARE the product on HackerNews, and you're OK
	with it. As am I.

	Same as OpenAI, I dont need to monetize them training on my
	data, and I am happy for you to as I would like to use the
	services for free.

	johnnyanmac wrote 10 hours 29 min ago:
	>Hackernews is free. The posts are valuable to me and I guess
	my posts are valuable to me, but I wouldn't pay for it and I
	definitely don't expect to get paid.

	at this point, we may need future forums to be premium so we
	can avoid the deluge of AI bots plauging the internet. a
	small, one time cost is a guaranteed way to make such
	strategies untenable. SomethingAwful had a point decades ago.

	But like any other business, you need to follow the money and
	understand the incentives. Hackernews has ads, but ads for
	companies with us as the audience. It's also indirectly an ad
	for YCombinator itself as bringing awareness of the
	accelerator (note what "hackernews.com" redirects to).

	I'm fine with a company advertising itself; if I wasn't the
	idea of a company ceases to really function. And in this
	structure for companies, I can also get benefits by
	potentially getting jobs from here. So I don't mind that
	either. Everything aligns. I agree and support the structure.
	I can't say that about many other "free" websites.

	As for me. I do want to monetize my data one day. I can't
	stop the scraping the entire internet over (that's for the
	courts), but I sure as heck won't hand it to them on a silver
	platter.

	motoxpro wrote 9 hours 11 min ago:
	Definitely to each their own. I will never have a job at a
	YC company and I will also never apply to YC, so the ads
	are completely useless. I did discover some of my favorite
	shoes from an IG ad, though.

	It wouldn't ever be worth me getting $.0001431 dollars for
	my data and individual data will always be worthless on
	it's own because 1. taking away one individuals data from a
	model does not make the model worse. 2. the price of an
	individuals data will always be zero because you have
	people like me who are willing to give it away for free in
	exchange for a free service (aka hackernews or IG)

	One user's LTV on IG may be $34, but one user's data is
	worth $0. Which I think a lot of people struggle with.

	From a more moral standpoint, the best part about the
	advertising business model is that it makes the internet
	open to everyone, not just those who can pay for every site
	they use.

	johnnyanmac wrote 8 hours 12 min ago:
	I'm not sure if I'd ever have a job at YC (my industry
	isn't very "investor friendly"). But I like the idea of
	having a bunch of opportunities with such companies. It
	also encourages an environment of people I want to be
	around as well. So that indirectly serves my interests.

	I will even use an ad example with conventions and
	festivals. You can argue an event like Comic-con is
	simply a huge ad. And it is. But I'm there "for the ad"
	in that case. It gathers other people "for the ad". It
	collectively benefits all of us to gather and socialize
	among one another.

	Ads aren't bad, but many ads primarily exist to distract,
	not to facilitate an experience. And as a hot take, maybe
	we do need to gatekeep a bit more in this day and age. I
	don't want a "free intent" if it means 99% of my
	interactions are with bots instead of humans. If it means
	that corporations determine what is "worthy" of seeing
	instead of peers. If credit cards get to determine what I
	can spend my money on instead of my own personal (and
	legal) taste.

	>It wouldn't ever be worth me getting $.0001431 dollars
	for my data and individual data will always be worthless
	on it's own

	On top of being a software engineers who's contributed to
	millions on value with my data, I also strive to be an
	artist. An industry that has spent decades being
	extracted from but not as fortunate to be compensated a
	living wage most often. People can argue that "art is
	worthless" , yet it also props up multiple billion dollar
	industries on top of societal cultured. An artisan these
	days can even sustain themselves as a individual, with
	much faster turnaround than trying to program a website
	or app.

	By all metrics, its hard to argue this sector's value is
	zero. Maybe having that lens only strengthened my stance,
	as a precursor to what software can become if you don't
	push against abuse early on.

	bayarearefugee wrote 1 day ago:
	I understand the point people are trying to make with this
	argument, but we are so far into a nearly universal scam
	economy where corporations see small (relative to their costs
	of business) fines as just part of normal expenses that I also
	think anyone who really believes the AI companies aren't using
	their data to train models, even if it is against their terms,
	is wildly naive.

	radio879 wrote 1 day ago:
	I should add a section to the site/guide about privacy, just
	letting people know they have somewhat of a choice with that.

	As for sharing code, most of the parts of a
	project/app/whatever have already been done and if an
	experienced developer hears what your idea is, they could just
	make it and figure it out without any code. The code itself
	doesn't really seem that valuable (well.. sometimes). Someone
	can just look at a screenshot of my aicodeprep app and just
	make one and make it look the same too.

	Not all the time of course - If I had some really unique
	sophisticated algorithms that I knew almost no one else would
	or has figured out, I would be more careful.

	Speaking of privacy.. a while back a thought popped into my
	head about Slack, and all these unencrypted chat's businesses
	use. It kinda does seem crazy to do all your business
	operations over unencrypted chat, Slack rooms.. I personally
	would not trust Zuckerberg to not look in there and run lots of
	LLMs through all the conversations to find anything 'good'!
	Microsoft.. kinda doubt would do that on purpose but what's to
	stop a rogue employee from finding out some trade secrets etc..
	I'd be suprised if it hasn't been done. Security is not usually
	a priority in tech. They half-ass care about your personal
	info.

	johnnyanmac wrote 10 hours 38 min ago:
	>Someone can just look at a screenshot of my aicodeprep app
	and just make one and make it look the same too.

	To some extent. But without your codebase they will make
	different decisions in the back which will affect a myriad of
	factors. Some may actually be better than your app, others
	will end up adding tech debt or have performance impacts. And
	this isn't even to get into truly novel algorithms; sometimes
	just having the experience to make a scalable app with best
	practices can make all the difference.

	Or the audience doesn't care and they take the cheaper app
	anyway. It's not always a happy ending.

	astrobe_ wrote 1 day ago:
	Sophistry. "many" according to which statistic? And just because
	some people consider that a trade is very favorable for them,
	doesn't it is not a trade and it doesn't mean they are correct -
	who's so naÃ¯ve they can beat business people at their own game?

	astrobe_ wrote 23 hours 11 min ago:
	they +think they+ can beat business people

	barrell wrote 1 day ago:
	Plenty of people can also afford to subscribe to these without
	any issue. They donât even know the price, they probably
	wonât even cancel it when they stop using it as they might not
	even realize they have a subscription.

	By your logic, are the paid plans not sometimes free?

	throwaway83711 wrote 1 day ago:
	While it is true that sometimes you are the product even if
	you're paying, I don't think anyone is trying to argue that
	obviously paid plans are free.

	worik wrote 1 day ago:
	I understand. I get the point. I disagree

	Privacy absolutely does not matter, until it does, and then it is
	too late

	Wilder7977 wrote 1 day ago:
	This is not only a privacy concern (in fact, that might be a tiny
	part since the code might end up public anyway?).
	There is an element of disclosure of personal data, there are
	ownership issues in case that code was not - in fact - going to
	be public and more.

	In any case, not caring about the cost (at a specific time)
	doesn't make the cost disappear.

	greggsy wrote 1 day ago:
	The point they are making is, that some people know that, and
	are not as concerned as others about it.

	teiferer wrote 22 hours 55 min ago:
	Not being concerned doesn't make the statement "it's free"
	more true.

	bahmboo wrote 1 day ago:
	I was going to downvote you but you are adding to the discussion.
	In this context this is free from having to spend money. Many of us
	don't have the option to pay for models. We have to find some way
	to get the state of the art without spending our food money.

	johnnyanmac wrote 11 hours 2 min ago:
	>We have to find some way to get the state of the art without
	spending our food money.

	If it's not your job: Do we "have to" find this way? What's the
	oppotunity cost compared to a premium subscription or using
	not-state of the art tools?

	If it is your job: it's putting food on the table. So it should
	be a relatively microscopic cost to doing business. Maybe even a
	tax write-off.

	freeopinion wrote 15 hours 59 min ago:
	There is a company that is advertising like crazy for
	programmers, data scientists, etc. They are looking for college
	kids, etc. They are paying better than McDonalds.

	What are they building? A training corpus.

	Are people who responds to their ads getting the money for free?

	Handing your codebase to an AI company is not nothing.

	twelve40 wrote 9 hours 38 min ago:
	> Handing your codebase to an AI company is not nothing.

	it's a battle that's already lost a long time ago. Every crappy
	little service by now indexes everything. If you ever touch
	Github, Jira, Datadog, Glean (god forbid), Upwork, etc etc they
	each have their own shitty little "AI" thing which means what?
	Your project has been indexed, bagged and tagged. So unless you
	code from a cave without using any saas tools, you will be
	indexed no matter what.

	pwndByDeath wrote 7 hours 32 min ago:
	I feel like this was understood.
	SaaS has your data, and the pan is very hot. Two lessons
	that learn quickly with experience.

	teiferer wrote 22 hours 50 min ago:
	I appreciate your consideration, disagree != downvote.

	To your point, "free from having to spend money" is exactly it.
	It's paid for with other things, and I get that some folks don't
	care. But being more open about this would be nice. You don't
	typically hide a monetary cost either, and everybody trying to do
	that is rightfully called out on it by being called a scam. Doing
	that with non-monetary costs would be a nice custom.

	ta1243 wrote 1 day ago:
	I don't trust any AI company not to use and monetise my data,
	regardless how much I pay or regardless what their terms of
	service say. I know full well that large companies ignore laws
	with impunity and no accountability.

	simonw wrote 1 day ago:
	I would encourage you to rethink this position just a little
	bit. Going through life not trusting any company isn't a fun
	way to live.

	If it helps, think about those company's own selfish
	motivations. They like money, so they like paying customers. If
	they promise those paying customers (in legally binding
	agreements, no less) that they won't train on their data... and
	are then found to have trained on their data anyway, they wont
	just lose that customer - they'll lose thousands of others too.

	Which hurts their bottom line. It's in their interest not to
	break those promises.

	Applejinx wrote 4 hours 30 min ago:
	I can't agree with a 'companies won't be evil because they
	will lose business if people don't like their evilness!'
	argument.

	Certainly, going through life not trusting any company isn't
	a fun way to live. Going through life not trusting in
	general, isn't a fun way to live.

	Would you like to see my inbox?

	We as tech people made this reality through believing in an
	invisible hand of morality that would be stronger than power,
	stronger than the profit motives available through
	intentionally harming strangers a little bit (or a lot) at
	scale, over the internet, often in an automated way, if there
	was a chance we'd benefit from it.

	We're going to have to be the people thinking of what we
	collectively do in this world we've invented and are
	continuing to invent, because the societal arbitrage vectors
	aren't getting less numerous. Hell, we're inventing machines
	to proliferate them, at scale.

	I strongly encourage you to abandon this idea that the world
	we've created, is optimal, and this idea that companies of
	all things will behave ethically because they perceive
	they'll lose business if they are evil.

	I think they are fully correct in perceiving the exact
	opposite and it's on us to change conditions underneath them.

	simonw wrote 3 hours 40 min ago:
	My argument here is not that companies will lose customers
	if they are unethical.

	My argument is that they will lose paying customers if they
	act against those customer's interests in a way that
	directly violates a promise they made when convincing their
	customers to sign up to pay them money.

	"Don't train on my data" isn't some obscure concern. If you
	talk to large companies about AI it comes up in almost
	every conversation.

	My argument here is that companies are cold hearted
	entities that act in their self interest.

	Honestly, I swear the hardest problem in computer science
	in 2025 is convincing people that you won't train on your
	data when you say "we won't train on your data".

	I wrote about this back in 2023, and nothing has changed:

	[1]: https://simonwillison.net/2023/Dec/14/ai-trust-cri...

	johnnienaked wrote 9 hours 8 min ago:
	This is so naive

	johnnyanmac wrote 10 hours 53 min ago:
	>Going through life not trusting any company isn't a fun way
	to live.

	Isn't that the Hacker mindset, though? We want to trailblaze
	solutions and share it with everyone for free. Always in
	liberty and oftentimes in beer too. I think it's a good
	mentality to have, precisely because of your lens of selfish
	motivations.

	Wanting money is fine. If it was some flat $200 or even $2000
	with legally binding promises that I have an indefinitely
	license to use this version of the software and they won't
	extract anything else from me: then fine. Hackers can be
	cheap, but we aren't opposed to barter.

	But that's not the case. Wanting all my time and privacy and
	data under the veneer of something hackers would provide with
	no or very few strings is not. tricks to push into that model
	is all the worse.

	> If they promise those paying customers (in legally binding
	agreements, no less) that they won't train on their data...
	and are then found to have trained on their data anyway, they
	wont just lose that customer - they'll lose thousands of
	others too.

	I sure wish they did. In reality, they get a class action,
	pay off some $100m to lawyers after making $100b, and the
	lawyers maybe give me $100 if I'm being VERY generous, while
	the company extracted $10,000+ of value out of me. And the
	captured market just keeps on keeping on.

	Sadly, this is not a land of hackers. It is a market of
	passive people of various walks of life: of students who do
	not understand what is going on under the hood (I was here
	when Facebook was taking off), of businsessmen too busy with
	other stuff to understand the sausage in the factory, of
	ordinary people who just wants to fire and forget. This
	market may never even be aware of what occurred here.

	alpaca128 wrote 1 day ago:
	> they wont just lose that customer - they'll lose thousands
	of others too

	No, they won't. And that's the problem in your argument.
	Google landed in court for tracking users in incognito mode.
	They also were fined for not complying with the rules for
	cookie popups. Facebook lost in court for illegally using
	data for advertising. Did it lose them any paying customer?
	Maybe, but not nearly enough for them to even notice a
	difference. The larger outcome was that people are now more
	pissed at the EU for cookie popups that make the greed for
	data more transparent. Also in the case of Google most money
	comes from different people than the ones that have their
	privacy violated, so the incentives are not working as you
	suggest.

	> Going through life not trusting any company isn't a fun way
	to live

	Ignoring existing problems isn't a recipe for a happy life
	either.

	simonw wrote 1 day ago:
	Landing in court is an expensive thing that companies don't
	want to happen.

	Your examples also differ from what I'm talking about.
	Advertising supported business models have a different
	relationship with end users.

	People getting something for free are less likely to switch
	providers over a privacy concern compared with companies is
	paying thousands of dollars a month (or more) for a paid
	service under the understanding that it won't train on
	their data.

	johnnyanmac wrote 10 hours 45 min ago:
	>Landing in court is an expensive thing that companies
	don't want to happen.

	"If the penalty is a fine, it's legal for the rich".
	These businesses also don't want to pay taxes or even
	workers, but in the end they will take the path of least
	resistence. if they determine fighting in court for 10
	years is more profitable than following regulations, then
	they'll do it.

	Until we start jailing CEO's (a priceless action), this
	will continue.

	>companies is paying thousands of dollars a month (or
	more) for a paid service under the understanding that it
	won't train on their data.

	Sure, but are we talking about people or companies here?

	ta1243 wrote 7 hours 14 min ago:
	CEO says the action was against policy and they didn't
	know, so the blame passes down until you get to a
	scapegoat that can't defend themselves.

	The underlying problem is that we have companies with
	more power than sovereign states, before you even
	include the power over the state the companies have.

	At some point in the next few decades of continued
	transfer of wealth from workers to owners more and more
	workers will snap and bypass the courts. The is what
	happened with the original fall of feudalism and
	warlords. This wasn't guaranteed though -- if the
	company owners keep themselves and their allies rich
	enough they will be untouchable, same as drug lords.

	teiferer wrote 8 hours 8 min ago:
	> Until we start jailing CEO's (a priceless action)

	In the context of the original thread here: If all you
	need to do is go to jail then whatever that's for was
	"for free"!

	frankzander wrote 1 day ago:
	Hm why pay for something when I can get it for free? Being
	miserly is a skill that can save a lot of money.

	johnnyanmac wrote 10 hours 43 min ago:
	Remember: if you're not paying for the product, you ARE the
	product.

	If you're fine with compromising your privacy and having others
	extract wealth from you, you can go the "free" route.

	frankzander wrote 10 hours 38 min ago:
	You are the product no matter how much you pay tbh

	freeopinion wrote 15 hours 44 min ago:
	I built a simple little CRUD app for somebody the other day.
	They were very appreciative of the free app. So they bought me
	a pizza.

	I got a free pizza just for coding a little app. That saved me
	a lot of money.

	hx8 wrote 1 day ago:
	I live a pretty frugal life, and reached the FI part of FIRE in
	my early 30s as an averagely compensated software engineer.

	I am very skeptical anytime something is 'free'. I specifically
	avoid using a free service when the company profits from my use
	of the service. These arrangements usually start mutually
	beneficial, and almost always become user hostile.

	Why pay for something when you can get it for free? Because
	the exchange of money for service sets clear boundaries and
	expectations.

	PeterStuer wrote 1 day ago:
	Very nice article and thx for the update.

	I would be very interested in an in dept of your experiences of
	differences between Roo Code and Cline if you feel you can share
	that. I've only tried Roo Code (with interesting but mixed results)
	thus far.

	pyman wrote 1 day ago:
	Just use lmstudio.ai, it's what everyone is using nowadays

	simonw wrote 1 day ago:
	LM Studio is great, but it's a very different product from an
	AI-enabled IDE or a Claude Code style coding agent.

	pyman wrote 1 day ago:
	LM Studio is awesome

	racecar789 wrote 1 day ago:
	Small recommendation: The diagrams on [ [1] ] are helpful, but
	clicking them does not display the fullâresolution images; they
	appear blurry. This occurs in both Firefox and Chrome. In the GitHub
	repository, the same images appear sharp at full resolution, so the
	issue may be caused by the JavaScript rendering library.

	[1]: https://wuu73.org/aicp

	radio879 wrote 1 day ago:
	thx - i did not know that. Will try to fix.

	PeterStuer wrote 1 day ago:
	Another data point: On Android Chrome they render without problem.

	hgarg wrote 1 day ago:
	Qwen is totally useless any serious dev work.

	simonw wrote 1 day ago:
	Which Qwen? They have over a dozen models now.

	b2m9 wrote 1 day ago:
	Itâs really hit and miss for me. Well defined small tasks seem
	ok. But every time I try some âagentic codingâ, it burns
	through millions of tokens without producing anything working.

	indigodaddy wrote 1 day ago:
	Is glm-4.5 air useable? I see it's free on Openrouter. Also pls
	advise what you think is the current best free openrouter model for
	coding. Thanks!

	radio879 wrote 1 day ago:
	Well, if you download Qwen Code [1] it is free up to 2000 api calls
	a day.

	Not sure if GLM-4.5 Air is good, but non-Air one is fabulous. I
	know for free API access there is pollinations ai project. Also
	llm7. If you just use the web chat's you can use most of the best
	models for free without API. There are ways to 'emulate' an API
	automatically.. I was thinking about adding this to my
	aicodeprep-gui app so it could automatically paste and then cut.
	Some MCP servers exist that you can use and it will automatically
	paste or cut from those web chat's and route it to an API
	interface.

	OpenAI offers free tokens for most models, 2.5mil or 250k depending
	on model. Cerebras has some free limits, Gemini... Meta has
	plentiful free API for Llama 4 because.. lets face it, it sucks,
	but it is okay/not bad for stuff like summarizing text.

	If you really wanted to code for exactly $0 you could use
	pollinations ai, in Cline extension (for VS Code) set to use
	"openai-large" (which is GPT 4.1). If you plan using all the best
	web chat's like Kimi K2, z.ai's GLM models, Qwen 3 chat, Gemini in
	AI Studio, OpenAI playground with o3 or o4-mini. You can go forever
	without being charged money. Pollinations 'openai-large' works fine
	in Cline as an agent to edit files for you etc.

	[1]: https://github.com/QwenLM/qwen-code

	tonyhart7 wrote 1 day ago:
	bro you are final boss of free tier users lol

	radio879 wrote 1 day ago:
	damn right !!!!

	indigodaddy wrote 1 day ago:
	Very cool, a lot to chew on here. Thanks so much for the
	feedback!

	chromaton wrote 1 day ago:
	If you're looking for free API access, Google offers access to Gemini
	for free, including for gemini-2.5-pro with thinking turned on. The
	limit is... quite high, as I'm running some benchmarking and haven't
	hit the limit yet.

	Open weight models like DeepSeek R1 and GPT-OSS are also made available
	with free API access from various inference providers and hardware
	manufacturers.

	chiwilliams wrote 1 day ago:
	I'm assuming it isn't sensitive for your purposes, but note that
	Google will train on these interactions, but not if you pay.

	bongodongobob wrote 20 hours 8 min ago:
	I don't care. From what I understand of LLM training, there's
	basically 0 chance a key or password I might send it will ever be
	regurgitated. Do you have any examples of an LLM actually doing
	anything like this?

	devjab wrote 1 day ago:
	I think it'll be hard to find a LLM that actually respects your
	privacy regardless whether or not you pay. Even with the "privacy"
	enterprise Co-Pilot from Microsoft with all their promises of
	respecting your data, it's still not deemed safe enough by
	leglislation to be used in part of the European energy sector. The
	way we view LLM's on any subscription is similar to how I imagine
	companies in the USA views Deepseek. Don't put anything into them
	you can't afford to share with the world. Of course with the
	agents, you've probably given them access to everything on your
	disk.

	Though to be fair, it's kind of silly how much effort we go through
	to protect our mostly open source software from AI agents, while at
	the same time, half our OT has build in hardware backdoors.

	unnouinceput wrote 1 day ago:
	I agree, Google is definitely the champion of respecting your
	privacy. Will definitely not train their model on your data if you
	pay them. I mean you should definitely just film yourself and give
	them everything, access to your files, phone records, even bank
	accounts. Just make sure to pay them those measly $200 and
	absolutely they will not share that data with anybody.

	lern_too_spel wrote 1 day ago:
	You're thinking of Facebook. A lot of companies run on Gmail and
	Google Docs (easy to verify with `dig MX [bigco].com`), and they
	would not if Google shared that data with anybody.

	wat10000 wrote 1 day ago:
	Big companies can negotiate their own terms and enforce them
	with meaningful legal action.

	d1sxeyes wrote 1 day ago:
	Itâs not really in either Meta or Googleâs interests to
	share that data. What they do is to build super detailed
	profiles of you and what youâre likely to click on, so they
	can charge more money for ad impressions.

	Applejinx wrote 4 hours 21 min ago:
	Honestly, there are plenty of more profitable things to do
	with such information. I think ad impressions being the sole
	motivator for anybody, is sorta two decades ago.

	grumbelbart2 wrote 10 hours 32 min ago:
	LLMs add a new thread model. If trained on your data, they
	might very well leak some of its information in some future
	chat.

	Meta, Alphabet might not want that, but it is impossible to
	completely avoid with current architectures.

	lern_too_spel wrote 1 day ago:
	Meta certainly shares the data internally.

	[1]: https://www.techradar.com/computing/cyber-security/f...

	gooosle wrote 1 day ago:
	Gemini 2.5 pro free limit is 100 requests per day.

	[1]: https://ai.google.dev/gemini-api/docs/rate-limits

	panarky wrote 1 day ago:
	I'm getting consistently good results with Gemini CLI and the free
	100 requests per day and 6 million tokens per day.

	Note that you'll need to either authorize with a Google Account or
	with an API key from AI Studio, just be sure the API key is from an
	account where billing is disabled.

	Also note that there are other rate limits for tokens per request
	and tokens per minute on the free plan that effectively prevent you
	from using the whole million token context window.

	It's good to exit or /clear frequently so every request doesn't
	resubmit your entire history as context or you'll use up the token
	limits long before you hit 100 requests in a day.

	tomrod wrote 1 day ago:
	Doesn't it swap to a lower power model after that?

	acjacobson wrote 1 day ago:
	Not automatically but you can switch to a lower power model and
	access more free requests. I think Gemini 2.5 Flash is 250
	requests per day.

	reactordev wrote 1 day ago:
	To the OP: I highly recommend you look into Continue.dev and
	ollama/lmstudio and running models on your own. Some of them are really
	good at autocomplete-style suggestions while others (like gpt-oss) can
	reason and use tools.

	It's my goto copilot.

	AstroBen wrote 1 day ago:
	I've found Zed to be a step up from continue.dev - you can use your
	own models there also

	radio879 wrote 1 day ago:
	really - no monthly subscriptions? i hate those but i am fine with
	bringing my own API URLs etc and paying. I'm building a router that
	will track all the free tokens from all the different providers and
	auto rotate them when daily tokens or time limits run out.

	Continue and Zed.. gonna check them out, prompts in Cline are too
	long. I was thinking of just making my own VS Code extension but I
	need to try Claude Code with GLM 4.5 (heard it pairs nicely)

	reactordev wrote 1 day ago:
	Zed is supreme but I have a need that Zed canât scratch so Iâm
	in VSCode :(

	indigodaddy wrote 1 day ago:
	Can you use your GH Copilot subscription with Zed to leverage the
	Copilot subscription-provided models?

	nechuchelo wrote 1 day ago:
	Yes, you can. IIRC both for the assistant/agent and code
	completions.

	navbaker wrote 1 day ago:
	Same! Iâve been using Continue in VSCode and found most of the
	bigger Qwen models plus gpt-oss-120b to be great in agentic mode!

	indigodaddy wrote 22 hours 44 min ago:
	Do you use openrouter models with continue?

	andai wrote 1 day ago:
	My experience lines up with the article. The agentic stuff only works
	with the biggest models. (Well, "works"... OpenAI Codex took 200
	requests with o4-mini to change like 3 lines of code...)

	For simple changes I actually found smaller models better because
	they're so much faster. So I shifted my focus from "best model" to
	"stupidest I can get away with".

	I've been pushing that idea even further. If you give up on agentic,
	you can go surgical. At that point even 100x smaller models can handle
	it. Just tell it what to do and let it give you the diff.

	Also I found the "fumble around my filesystem" approach stupid for my
	scale, where I can mostly fit the whole codebase into the context. So I
	just dump src/ into the prompt. (Other people's projects are a lot more
	boilerplatey so I'm testing ultra cheap models like gpt-oss-20b for
	code search. For that, I think you can go even cheaper...)

	Patent pending.

	tunesmith wrote 13 hours 27 min ago:
	For those who don't know, OpenAI Codex CLI will now work with your
	ChatGPT plus or pro account. They barely announced it but it's on
	their github page. You don't have to use an api key.

	mathiaspoint wrote 1 day ago:
	I use a 500 million parameter model for editor completions because I
	want those to nearly instantaneous and the plugin makes 50+
	completion requests every session.

	badlogic wrote 1 day ago:
	Can you share which model you are using?

	ghxst wrote 1 day ago:
	What editor do you use, and how did you set it up? I've been
	thinking about trying this with some local models and also with
	super low-latency ones like Gemini 2.5 Flash Lite. Would love to
	read more about this.

	mathiaspoint wrote 1 day ago:
	Neovim with the llama.cpp plugin and heavily quantized
	qwen2.5-coder with 500 (600?) million parameters. It's almost
	plug and play although the default ring context limit is way too
	large if you don't have a GPU.

	myflash13 wrote 1 day ago:
	Which model and which plugin, please?

	seunosewa wrote 1 day ago:
	You should try GLM 4.5; it's better in practice than Kimi K2 and
	Qwen3 Coder, but it's not getting much hype.

	chewz wrote 1 day ago:
	I agree. I find even Haiku good enough at managing the flow of the
	conversation and consulting larger models - Gemini 2.5 Pro or GPT-5 -
	for programming tasks.

	Last few days I am experimenting with using Codex (via MCP ${codex
	mcp}) from Gemini CLI and it works like a charm. Gemini CLI is mostly
	using Flash underneath but this is good enough for formulating
	problems and re-evaluating answers.

	Same with Claude Code - I am asking (via MCP) for consulting with
	Gemini 2.5 Pro.

	Never had much success of using Claude Code as MCP though.

	The original idea comes of course from Aider - using main, weak and
	editor models all at once.

	hpincket wrote 1 day ago:
	I am developing the same opinion. I want something fast and
	dependable. Getting into a flow state is important to me, and I just
	can't do that when I'm waiting for an agentic coding assistant to
	terminate.

	I'm also interested in smaller models for their speed. That, or a
	provider like Cerebras.

	Then, if you narrow the problem domain you can increase the
	dependability. I am curious to hear more about your "surgical" tools.

	I rambled about this on my blog about a week ago:

	[1]: https://hpincket.com/what-would-the-vim-of-llm-tooling-look-...

	radio879 wrote 1 day ago:
	well, most of the time, I just dump the entire codebase in if the
	context window is big and its a good model. But there are plenty of
	times when I need to block one folder in a repo or disable a few
	files because the files might "nudge" it in a wrong direction.

	The surgical context tool (aicodeprep-gui) - there are at least 30
	similar tools but most (if not all) are CLI only/no UI. I like UIs,
	I work faster with them for things like choosing individual files
	out of a big tree (at least it is using PySide6 library which is
	"lite" (could go lighter maybe), i HATE that too many things use
	webview/browsers. All the options on it are there for good reasons,
	its all focused on things that annoy me..and slow things down: like
	doing something repeatedly (copy paste copy paste or typing the
	same sentence over and over every time i have to do a certain thing
	with the AI and my code.

	If you have not run 'aicp' (the command i gave it, but also there
	is a OS installer menu that will add a Windows/Mac/Linux right
	click context menu in their file managers) in a folder before, it
	will try to scan recursively to find code files, but it skips
	things like node_modules or .venv. but otherwise assumes most types
	of code files will probably be added so it checks them. You can
	fine tune it, add some .md or txt files or stuff in there that
	isn't code but might be helpful. When you generate the context
	block it puts the text inside the prompt box on the top AND/OR
	bottom - doing both can get better responses from AI.

	It saves every file that is checked, and saves the window size,
	other window prefs, so you don't have to resize the window again.
	It saves the state of which files are checked so its less work /
	time next time. I have been just pasting the output from the LLMs
	into an agent like Cline but I am wondering if I should add browser
	automation / browser extension that does the copy pasting and also
	add option to edit / change files right after grabbing the output
	from a web chat. Its probably about good enough as it is though,
	not sure I want to make it into a big thing.

	---
	Yeah I just keep coming back to this workflow, its very reliable. I
	have not tried Claude Code yet but I will soon to see if they
	solved any of these problems.

	Strange this thing has been at the top of hacker news for hours and
	hours.. weird! My server logs are just constant scrolling

	indigodaddy wrote 1 day ago:
	Have you seen this?

	[1]: https://github.com/robertpiosik/CodeWebChat

	hpincket wrote 1 day ago:
	aicodeprep-gui looks great. I will try it out

	dist-epoch wrote 1 day ago:
	Thanks for the article. I'm also doing a similar thing, here are
	my tips:

	- [1] - 200 requests per day if you deposit (one-time) $5 for top
	open weights models - GLM, Qwen, ...

	- [2] - around 10 requests per day to o3, ... if you have the $10
	GitHub Copilot subsciption

	- [3] - I open all the LLM webapps here as separate "apps", my
	one place to go to talk with LLMs, without mixing it with regular
	browsing

	- [4] - chat API frontend, you can use it instead of the default
	webpages for services which give you free API access - Google,
	OpenRouter, Chutes, Github Models, Pollinations, ...

	I really recommend trying a chat API frontend, it really
	simplifies talking with multiple models from various providers in
	a unified way and managing those conversations, exporting to
	markdown, ...

	[1]: https://chutes.ai
	[2]: https://github.com/marketplace/models/
	[3]: https://ferdium.org
	[4]: https://www.cherry-ai.com

	knowaveragejoe wrote 12 hours 32 min ago:
	With chutes.ai, where do you see a one-time $5 for 200
	requests/day?

	wahnfrieden wrote 1 day ago:
	They don't allow model switching below GPT-5 in codex cli anymore
	(without API key), because it's not recommended. Try it with
	thinking=high and it's quite an improvement from o4-mini. o4-mini is
	more like gpt-5-thinking-mini but they don't allow that for codex.
	gpt-5-thinking-high is more like o1 or maybe o3-pro.

	SV_BubbleTime wrote 1 day ago:
	> (Well, "works"... OpenAI Codex took 200 requests with o4-mini to
	change like 3 lines of code...)

	Letâs keep something in reason, I have multiple times in my life
	spent days on what would end up to be maybe three lines of code.

	statenjason wrote 1 day ago:
	Aider as a non-agentic coding tool strikes a nice balance on the
	efficiency vs effectiveness front. Using tree-sitter to create a repo
	map of the repository means less filesystem digging. No MCP, but
	shell commands mean it can use utilities I myself am familiar with.
	Combined with Cerebras as a provider, the turnaround on prompts is
	instant; I can stay involved rather than waiting on multiple rounds
	of tool calls. It's my go-to for smaller scale projects.

	stillsut wrote 1 day ago:
	Just added a fork of aider that does do agentic commands: [1] In
	testing I've found it to be underwhelming at being an agent
	compared to claude code, wrote up some case-studies on it here:

	[1]: https://github.com/sutt/agent-aider
	[2]: https://github.com/sutt/agro/blob/master/docs/case-studies...

	mathiaspoint wrote 1 day ago:
	It's a shame MCP didn't end up using a sandboxed shell (or
	something similar, maybe even simpler.) All the pre-MCP agents I
	built just talked to the shell directly since the models are
	already trained to do that.

	Havoc wrote 1 day ago:
	For anyone else confused - there is a page 2 and 3 in the post that you
	need to access via arrow thing at bottom.

	cammikebrown wrote 1 day ago:
	I wonder how much energy this is wasting.

	sergiotapia wrote 1 day ago:
	who cares. we can build more. energymaxx or the us will become like
	germany.

	yen223 wrote 1 day ago:
	Probably not as much as you think: [1] You are better off worrying
	about your car use and your home heating/cooling efficiency, all of
	which are significantly worse for energy use.

	[1]: https://www.sustainabilitybynumbers.com/p/ai-energy-demand

	kasabali wrote 1 day ago:
	> Youâll notice that this figure is for 2022, and weâve had a
	major AI boom since then

	I might as well read LLM gibberish instead of this article.

	robotsquidward wrote 1 day ago:
	Right - free to you maybe.

	bravesoul2 wrote 1 day ago:
	Untradable carbon tax (or carbon price for people who hate the T
	word) is needed.

	GaggiX wrote 1 day ago:
	OpenAI offering 2.5M free tokens daily small models and 250k for big
	ones (tier 1-2) is so useful for random projects, I use them to learn
	japanese for example (by having a program that list informations about
	what the characters are just saying: vocabulary, grammar points,
	nuances).

	CjHuber wrote 1 day ago:
	Without tricks google aistudio definitely has limits, though pretty
	high ones. gemini.google.com on the other hand has less than a handful
	of free 2.5 pro messages for free


	<- back to front page