| _______ __ _______ | |
| | | |.---.-..----.| |--..-----..----. | | |.-----..--.--.--..-----. | |
| | || _ || __|| < | -__|| _| | || -__|| | | ||__ --| | |
| |___|___||___._||____||__|__||_____||__| |__|____||_____||________||_____| | |
| on Gopher (unofficial) | |
| Visit Hacker News on the Web | |
| COMMENT PAGE FOR: | |
| GPT Image 1.5 | |
| yuni_aigc wrote 19 hours 24 min ago: | |
| One thing I've noticed when comparing these models is that | |
| "quality" and "realism" don't always move together. | |
| Some models are very strong at sharp details and localized edits, but | |
| they can break global lighting consistency: shadows, reflections, or | |
| overall scene illumination drift in subtle ways. GPT-Image seems to | |
| trade a bit of micro-detail for better global coherence, especially in | |
| lighting, which makes composites feel more believable even if they're | |
| not pixel-perfect. | |
| It's hard to capture this in benchmarks, but for real-world editing | |
| workflows it ends up mattering more than I initially expected. | |
| fock wrote 1 day ago: | |
| Good to see that hands are still not solved... | |
| sipsi wrote 1 day ago: | |
| the combination of two images that the last gpt-image (nano banana) | |
| generated seems inappropriate | |
| Garlef wrote 1 day ago: | |
| GPT Image is the new MS Word "Arial + clip art" | |
| chakintosh wrote 1 day ago: | |
| Can't wait to generate fake memories with my grandma who died 20 years ago | |
| bunnybomb2 wrote 7 hours 33 min ago: | |
| Or me and my ex | |
| rw2 wrote 1 day ago: | |
| Having used it compared to Nano Banana: | |
| - The latency is still too high: under 10 seconds for Nano Banana | |
| versus around 25 seconds for GPT Image 1.5. | |
| - The quality is higher, but not a jump like the one from previous Google | |
| models to Nano Banana Pro. Nano Banana Pro is still at least as good or | |
| better in my opinion. | |
| jdthedisciple wrote 1 day ago: | |
| Why is the emphasis of these promos always on creating fake social media | |
| pictures of people and things that didn't happen? | |
| Aren't we plagued enough by all the fake bullshit out there? | |
| Ffs! | |
| /rant | |
| Sorry gotta be honest and blunt every one of those times... | |
| v9v wrote 1 day ago: | |
| Lots of em-dashes in this copy. | |
| sroussey wrote 1 day ago: | |
| "Photo of a blond male in his 50s with half gray hair" | |
| Still fails. Every photo of a man with half gray hair will have the | |
| other half black. | |
| andai wrote 1 day ago: | |
| Sam Altman Christmas decoration isn't real, he can't hurt me... | |
| thumbsup-_- wrote 1 day ago: | |
| now you can create good memories with your family without meeting them | |
| augustk wrote 1 day ago: | |
| Or create the family | |
| GaryBluto wrote 1 day ago: | |
| God OpenAI are so far behind. Their own example shows that trying to | |
| only change specific parts of the image doesn't work without affecting | |
| the background. | |
| encroach wrote 1 day ago: | |
| This outperforms Gemini 3 pro image (nano banana pro) on Text-to-Image | |
| Arena and Image Edit Arena. I'm surprised they didn't mention this | |
| leaderboard in the blog post. | |
| I like this benchmark because it's based on user votes, so overfitting | |
| is not as easy (after all, if users prefer your result, you've won). | |
| [1] | |
| [1]: https://lmarena.ai/leaderboard/text-to-image | |
| [2]: https://lmarena.ai/leaderboard/image-edit | |
| ygouzerh wrote 1 day ago: | |
| The scores are really, really close, which might be why | |
| nycdatasci wrote 1 day ago: | |
| The arena concept doesnât work for image models due to watermarks. | |
| encroach wrote 1 day ago: | |
| There are no watermarks in the arena. | |
| nycdatasci wrote 1 day ago: | |
| There are no visible watermarks, but model makers can use | |
| steganographic codes to identify outputs from their own models. | |
| nycdatasci wrote 21 hours 1 min ago: | |
| Text-to-Image Models Leave Identifiable Signatures: | |
| Implications for Leaderboard Security | |
| [1]: https://arxiv.org/pdf/2510.06525 | |
| encroach wrote 18 hours 51 min ago: | |
| This is true, however LMArena does employ some methods to | |
| mitigate attempts to manipulate the leaderboard, see [1] They | |
| also control for style | |
| [1]: https://openreview.net/forum?id=zf9zwCRKyP | |
| [2]: https://news.lmarena.ai/sentiment-control/ | |
| raw_anon_1111 wrote 1 day ago: | |
| I still can't get it to draw a "13 hour clock" correctly | |
| fellowniusmonk wrote 17 hours 37 min ago: | |
| The whole latest round of OpenAI models is massively overfit. | |
| randall wrote 1 day ago: | |
| double popped collar ftw | |
| password-app wrote 1 day ago: | |
| Impressive image quality improvements. Meanwhile, AI agents just | |
| crossed a milestone: Simular's Agent S hit 72.6% on OSWorld | |
| (human-level is 72.36%). | |
| We're seeing AI get better at both creative tasks (images) and | |
| operational tasks (clicking through websites). | |
| For anyone building AI agents: the security model is still the hard | |
| part. Prompt injection remains unsolved even with dedicated security | |
| LLMs. | |
| nightshift1 wrote 1 day ago: | |
| What is the endgame? Why is OpenAI throwing that much money on | |
| image/video generation? Is there a profitable market for AI-generated | |
| image slop? Do people choose ChatGPT instead of Gemini/Grok/Claude | |
| because of the image generation capabilities? To me, it looks like a | |
| huge fiery money pit. | |
| BrokenCogs wrote 1 day ago: | |
| The endgame is to make money during the hype and then cash out before | |
| it crashes. | |
| bdangubic wrote 1 day ago: | |
| if that is the endgame openai is doing everything but working | |
| towards that goal :) | |
| BrokenCogs wrote 1 day ago: | |
| Yeah they fumbled big time | |
| eterm wrote 1 day ago: | |
| I have a "go to" prompt for images: | |
| > In the style of a 1970s book sci-fi novel cover: A spacer walks | |
| towards the frame. In the background his spaceship crashed on an icy | |
| remote planet. The sky behind is dark and full of stars. | |
| Nano banana pro via gemini did really well, although still way too | |
| detailed, and it then made a mess of different decades when I asked it | |
| to follow up: [1] It's therefore really disappointing that GPT-image | |
| 1.5 did this: [2] Completely generic, not at all like a book cover, it | |
| completely ignored that part of the prompt while it focused on the | |
| other elements. | |
| Did it get the other details right? Sure, maybe even better, but the | |
| important part it just ignored completely. | |
| And it's doing even worse when I try to get it to correct the mistake. | |
| It's just repeating the same thing with more "weathering". | |
| [1]: https://gemini.google.com/share/1902c11fd755 | |
| [2]: https://chatgpt.com/share/6941ed28-ed80-8000-b817-b174daa922a7 | |
| bongodongobob wrote 1 day ago: | |
| You're just not describing what you want properly. Looks fine to me. | |
| Clearly you have something else in mind, so I think you're just not | |
| describing well. My tip would be to use actual illustration | |
| language. Do you want a wide angle shot? What should depth of field | |
| be? Oil painting print? Ink illustration? What kind of printing | |
| style? Do you want a photo of the book or a pre-print proof? What | |
| kind of color scheme? | |
| A professional artist wouldn't know what you want. | |
| You didn't even specify an art style. 1970s sci-fi novel cover isn't | |
| a style. You'll find vastly different art styles from the 70s. If | |
| you're disappointed, it's because you're doing a shitty job | |
| describing what's in your head. If your prompt isn't at least a | |
| paragraph, you're going to just get random generic results. | |
| eterm wrote 1 day ago: | |
| The killer feature of LLMs is to be able to extrapolate what's | |
| really wanted from short descriptions. | |
| Look again at Gemini's output, it looks like an actual book cover, | |
| it looks like an illustration that could be found on a book. | |
| It takes on board corrections (albeit hilariously literally). | |
| Look at GPT image's output, it doesn't look anything like a book | |
| cover, and when prompted to say it got it wrong, just doubles down | |
| on what it was doing. | |
| bongodongobob wrote 1 day ago: | |
| What you want, and what you think image generation is, is | |
| impossible. | |
| eterm wrote 21 hours 9 min ago: | |
| And yet we can see Gemini do what I wanted, so it's clearly not | |
| impossible. | |
| bongodongobob wrote 19 hours 25 min ago: | |
| What you've found is a prompt that returns what you want on | |
| Gemini. That's all. | |
| eterm wrote 18 hours 30 min ago: | |
| It's a prompt I've been using for years. Gemini has been | |
| the best of the bunch, but Nano Banana, Midjourney, etc. | |
| all did okay to varying degrees. | |
| GPT Image bombed notably worse than the others, not the | |
| original picture itself, but the complete lack of | |
| recognition of my feedback that it hadn't got it right, it | |
| just doubled down on the image it had generated. | |
| enigma101 wrote 1 day ago: | |
| Really can't stand the image slop suffocating the internet. | |
| adammarples wrote 1 day ago: | |
| Still can't pass my image test | |
| Two women walking in single file | |
| Although it tried very hard and had them staggered slightly | |
| weird-eye-issue wrote 1 day ago: | |
| Interestingly when you Google that literally all of the images have | |
| two women walking side by side | |
| ge96 wrote 1 day ago: | |
| I get the tech implementation is amazing, I wonder if it takes away | |
| from genuineness of events, like the Astronaut photo, I get it's just a | |
| joke/funny too but it's like a photo of you in a supercar vs. actually | |
| buying one. Or fake AI companions vs. real people. Beauty | |
| filters/skinny filters vs. actually being healthy. | |
| ge96 wrote 21 hours 13 min ago: | |
| Thinking about this more, it goes two ways for guys/girls: guys | |
| can post pics of themselves doing crazy things on their Tinder, and it's | |
| up to the girl to decide if it's real or not | |
| I'm not saying this as a critique against image generation as you can | |
| manually make these fake images but yeah | |
| Ultimately I think it's good, makes people be real | |
| onoesworkacct wrote 1 day ago: | |
| The next generation of humans growing up will not even care whether | |
| media is real or not anymore. The saturation of AI content and FUD | |
| around real content is going to blur the lines to the extent that | |
| there's no point even caring about it. And it's an intractable | |
| problem. | |
| Hopefully this leads to greater importance of seeing things with your | |
| own wetware. | |
| ge96 wrote 1 day ago: | |
| The other issue is the need to show off... if I had a supercar, why | |
| would I have to post it on Instagram? That kind of thing. | |
| smlavine wrote 1 day ago: | |
| This is terrifying. Truth is dead. | |
| teaearlgraycold wrote 1 day ago: | |
| Eventually phone manufacturers will be forced to become arbiters of | |
| truth with signed images and videos. | |
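The signed-capture idea above can be sketched in a few lines. Real proposals (e.g. C2PA-style content provenance) use per-device asymmetric keys in a secure element, so the stdlib HMAC below is only a stand-in to show the shape, with an illustrative key name:

```python
import hashlib
import hmac

# Stand-in for a key burned into the camera's secure element; real schemes
# use per-device asymmetric keys, not a shared secret like this.
DEVICE_KEY = b"example-device-key"

def sign_capture(image_bytes: bytes) -> str:
    """Tag the raw image bytes at capture time."""
    return hmac.new(DEVICE_KEY, image_bytes, hashlib.sha256).hexdigest()

def verify_capture(image_bytes: bytes, tag: str) -> bool:
    """Any post-capture edit changes the bytes, so the tag no longer matches."""
    return hmac.compare_digest(sign_capture(image_bytes), tag)
```

Note that verification only proves the bytes are unchanged since signing; binding the signature to a trusted sensor is the hard (hardware) part.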
| WhyOhWhyQ wrote 1 day ago: | |
| Makes you wonder what's really meant when we talk about progress. | |
| mingabunga wrote 1 day ago: | |
| Did an experiment to give a software product a dark theme. Gave both | |
| (GPT and Gemini/Nano) a screenshot of the product and an example theme | |
| I found on Dribbble. | |
| - Gemini/Nano did a pretty average job, only applying some grey to some | |
| of the panels. I tried a few different examples and got similar output. | |
| - GPT did a great job and themed the whole app and made it look great. | |
| I think I'd still need a designer to finesse some things though. | |
| vunderba wrote 1 day ago: | |
| Okay, results are in for GenAI Showdown with the new gpt-image-1.5 model | |
| for the editing portions of the site! [1] | |
| Conclusions: | |
| - OpenAI has always had some of the strongest prompt understanding | |
| alongside the weakest image fidelity. This update goes some way towards | |
| addressing this weakness. | |
| - It's leagues better than gpt-image-1 at making localized edits without | |
| altering the entire image's aesthetic, doubling the previous score | |
| from 4/12 to 8/12, and it's the only model that legitimately passed the | |
| Giraffe prompt. | |
| - It's one of the most steerable models, with a 90% compliance rate. | |
| Updates to GenAI Showdown: | |
| - Added outtakes sections to each model's detailed report in the | |
| Text-to-Image category, showcasing notable failures and unexpected | |
| behaviors. | |
| - New models have been added including REVE and Flux.2 Dev (a new | |
| locally hostable model). | |
| - Finally got around to implementing a weighted scoring mechanism which | |
| considers pass/fail, quality, and compliance for a more holistic model | |
| evaluation (click pass/fail icon to toggle between scoring methods). | |
| If you just want to compare gpt-image-1, gpt-image-1.5, and NB Pro at | |
| the same time, use the second link: [2] | |
| [1]: https://genai-showdown.specr.net/image-editing | |
| [2]: https://genai-showdown.specr.net/image-editing?models=o4,nbp,g... | |
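The weighted scoring mechanism mentioned above could look something like this minimal sketch; the particular weights and the 0-1 scales for quality and compliance are assumptions, not the site's published formula:

```python
def weighted_score(passed: bool, quality: float, compliance: float,
                   w_pass: float = 0.5, w_quality: float = 0.25,
                   w_compliance: float = 0.25) -> float:
    """Fold pass/fail, quality (0-1), and compliance (0-1) into one number.

    The weights here are illustrative defaults, not GenAI Showdown's
    actual rubric.
    """
    return (w_pass * (1.0 if passed else 0.0)
            + w_quality * quality
            + w_compliance * compliance)
```

With the default weights a model that passes with quality 0.8 and compliance 0.9 scores 0.925, while a clean fail with perfect quality still tops out at 0.5, which matches the intuition that a pass/fail flag alone hides how close the misses were.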
| Bombthecat wrote 15 hours 1 min ago: | |
| I can't click the compliance info button on mobile. The text shows | |
| for half a second and then vanishes. Long press just marks the text | |
| for copy paste. | |
| vunderba wrote 6 hours 37 min ago: | |
| Hey bombthecat - thanks for pointing this out. I had some poor | |
| mobile browser detection that was causing this issue. It should be | |
| fixed now. | |
| nicpottier wrote 19 hours 15 min ago: | |
| Love this benchmark, always the first place I look. Also seems like | |
| it is time to move the goalposts, not sure we are getting enough | |
| resolution between models anymore. | |
| Out of curiosity why does gemini get gold for the poker example but | |
| gpt-image 1.5 does not? I couldn't see a difference between the two. | |
| leumon wrote 1 day ago: | |
| One other test you could add is generating a chessboard from a FEN. I | |
| was surprised to see NBP able to do that (however, it seems to only | |
| work with fewer pieces; after a certain number it makes mistakes or | |
| even generates a completely wrong image) | |
| [1]: https://files.catbox.moe/uudsyt.png | |
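Checking a generated chessboard against its FEN is easy to mechanize, since the piece-placement field expands deterministically into an 8x8 grid that can be compared square by square. A minimal sketch:

```python
def fen_to_grid(fen: str) -> list[list[str]]:
    """Expand the piece-placement field of a FEN string into an 8x8 grid.

    Digits encode runs of empty squares ('.'); letters are pieces
    (uppercase = white, lowercase = black). Ranks run 8 down to 1.
    """
    placement = fen.split()[0]        # ignore side-to-move, castling, etc.
    grid = []
    for rank in placement.split("/"):
        cells = []
        for ch in rank:
            if ch.isdigit():
                cells.extend(["."] * int(ch))
            else:
                cells.append(ch)
        grid.append(cells)
    return grid
```

Feeding the expanded grid (rather than the raw FEN) into the prompt might also help, since it removes the run-length decoding step the model otherwise has to perform implicitly.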
| quietbritishjim wrote 1 day ago: | |
| Absolutely fabulous work. | |
| Ludicrously unnecessary nitpick for "Remove all the brown pieces of | |
| candy from the glass bowl": | |
| > Gemini 2.5 Flash - 18 attempts - No matter what we tried, Gemini | |
| 2.5 Flash always seemed to just generate an entirely new assortment | |
| of candies rather than just removing the brown ones. | |
| The way I read the prompt, it demands that the candies should change | |
| arrangement. You didn't say "change the brown candies to a different | |
| color", you said "remove them". You can infer from the few brown ones | |
| that you can see that there are even more underneath - surely if you | |
| removed them all (even just by magically disappearing them) then the | |
| others would tumble down into a new location? The level of the | |
| candies is lower than before you started, which is what you'd expect | |
| if you remove some. Maybe it's just coincidence, but maybe this | |
| really was its reasoning. (It did unnecessarily remove the red candy | |
| from the hand though.) | |
| I don't think any of the "passes" did as well as this, including | |
| Gemini 3.0 Pro Image. Qwen-Image-Edit did at least literally remove | |
| one of the three visible brown candies, but just recolored the other | |
| two. | |
| vunderba wrote 20 hours 48 min ago: | |
| That is a great point! Since we are moving towards better "world | |
| models" with these multimodal models, you could reasonably | |
| argue that if the directive was to physically remove the candy, then | |
| in the process of doing so, gravity/physics could affect the | |
| positioning of other objects. | |
| You will note that the Minimum Passing Criteria allows a color | |
| change to pass the prompt, but with the rapid improvements | |
| in generative models, I may revise this test to be stricter, only | |
| allowing actual removal to be considered a pass as opposed to a simple | |
| color swap. | |
| boredhedgehog wrote 1 day ago: | |
| I disagree with gpt-image-1.5's grade on the worm sign. It moved some | |
| of the marks around to accommodate the enlarged black area, but | |
| retained the overall appearance of the sign. | |
| vunderba wrote 20 hours 50 min ago: | |
| I can see how you'd come to that conclusion. Each prompt is | |
| supposed to illustrate a different type of test criteria. The | |
| ultimate goal of Worm Sign is intended to test a near 100% | |
| retention of the original weathered/dented sign. | |
| If you look at the ones that passed (Flux.2 Pro, Gemini 2.5 Flash, | |
| Reve), you'll see that they did not add/subtract/move any of the | |
| pockmarks from the original image. | |
| KeplerBoy wrote 1 day ago: | |
| "Remove all the trash from the street and sidewalk. Replace the | |
| sleeping person on the ground with a green street bench. Change the | |
| parking meter into a planted tree." | |
| What a prompt and image. | |
| __alexs wrote 1 day ago: | |
| Looking forward to the first AR glasses to include live editing of | |
| the world like this. | |
| nisegami wrote 1 day ago: | |
| How long until this shows up in a YC batch? | |
| imdsm wrote 1 day ago: | |
| A way it could be... | |
| walrus01 wrote 1 day ago: | |
| I've already seen images on the MLS uploaded by real estate agents | |
| that look like this. It's the same concept as what they've been doing, | |
| generally, to bait people into coming to tour houses. | |
| llmthrow0827 wrote 1 day ago: | |
| It failed my benchmark of a photo of a person touching their elbows | |
| together. | |
| lobochrome wrote 1 day ago: | |
| Stupid Cisco Umbrella is blocking you | |
| mvkel wrote 1 day ago: | |
| This leaderboard feels incredibly accurate given my own experience. | |
| heystefan wrote 1 day ago: | |
| So when you say "X attempts" what does that mean? You just start a | |
| new chat with the same exact prompt and hope for a different result? | |
| vunderba wrote 1 day ago: | |
| All images are generated using independent, separate API calls. See | |
| the FAQ at the bottom under "Why is the number of attempts | |
| seemingly arbitrary?" and "How are the prompts written?" for | |
| more detail, but to quickly summarize: | |
| In addition to giving models multiple attempts to generate an | |
| image, we also write several variations of each prompt. This helps | |
| prevent models from getting stuck on particular keywords or | |
| phrases, which can happen depending on their training data. For | |
| example, while "hippity hop" is a relatively common name for | |
| the ball-riding toy, it's also known as a "space hopper." In | |
| some cases, we may even elaborate and provide the model with a | |
| dictionary-style definition of more esoteric terms. | |
| This is why providing an "X Attempts" metric is so important. | |
| It serves as a rough measure of how "steerable" a given model | |
| is - or, put another way, how much we had to fight with the model in | |
| order for it to consistently follow the prompt's directives. | |
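The attempts-plus-prompt-variants methodology described here could be sketched roughly as follows; `generate_image` and `passes` are hypothetical hooks standing in for the image API call and the human pass/fail judgment:

```python
import itertools

def run_editing_test(prompt_variants, generate_image, passes, max_attempts=8):
    """Rotate through prompt variants until one generation passes.

    Returns (passed, attempts_used); attempts_used is the "X Attempts"
    steerability signal. Both callables are placeholders for a real
    image API and a human judge.
    """
    variants = itertools.cycle(prompt_variants)
    for attempt in range(1, max_attempts + 1):
        image = generate_image(next(variants))
        if passes(image):
            return True, attempt
    return False, max_attempts
```

Cycling the variants rather than exhausting one phrasing at a time is what keeps a single unlucky keyword (e.g. "hippity hop" versus "space hopper") from dominating the attempt count.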
| singhkays wrote 1 day ago: | |
| GPT Image 1.5 is the first model that gets close to replicating the | |
| intricate detail mosaic of bullets in the "Lord of War" movie poster | |
| for me. Following the prompt instructions more closely also seems | |
| better compared to Nano Banana Pro. | |
| I edited the original "Lord of War" poster with a reference image of | |
| Jensen and replaced bullets with GPU dies, silicon wafers and | |
| electronic components. | |
| [1]: https://x.com/singhkays/status/2001080165435113791 | |
| smusamashah wrote 1 day ago: | |
| Z-Image was released recently, and it's all /r/StableDiffusion | |
| talks about these days. Consider adding it too. It is very good | |
| quality for its size (requires only 6 or 8 gigs of RAM). | |
| vunderba wrote 1 day ago: | |
| I've actually done a bit of preliminary testing with ZiT. I'm | |
| holding off on adding it to the official GenAI site until the base | |
| and edit models have been released since the Turbo model is pretty | |
| heavily distilled. | |
| [1]: https://mordenstar.com/other/z-image-turbo | |
| pierrec wrote 1 day ago: | |
| This showdown benchmark was and still is great, but an enormous grain | |
| of salt should be added to any model that was released after the | |
| showdown benchmark itself. | |
| Maybe everyone has a different dose of skepticism. Personally I'm not | |
| even looking at results for models that were released after the | |
| benchmark, for all this tells us, they might as well be one-trick | |
| ponies that only do well in the benchmark. | |
| It might be too much work, but one possible "correct" approach for | |
| this kind of benchmark would be to periodically release new benchmarks | |
| with new tests (that are broadly in the same categories) and only | |
| include models that predate each benchmark. | |
| somenameforme wrote 1 day ago: | |
| You don't need skepticism, because even if you're acting in 100% | |
| good faith and building a new model, what's the first thing you're | |
| going to do? You're going to go look up as many benchmarks as you | |
| can find and see how it does on them. It gives you some easy | |
| feedback relative to your peers. The fact that your own model may | |
| end up being put up against these exact tests is just icing. | |
| So I don't think there's even a question of whether or not newer | |
| models are going to be maximizing for benchmarks - they 100% are. | |
| The skepticism would be in how it's done. If something's not being | |
| run locally, then there's an endless array of ways to cheat - like | |
| dynamically loading certain LoRAs in response to certain queries, | |
| with some LoRAs trained precisely to maximize benchmark | |
| performance. Basically taking a page out of the car company | |
| playbook in response to emissions testing. | |
| But I think maximizing the general model itself to perform well on | |
| benchmarks isn't really unethical or cheating at all. All you're | |
| really doing there is 'outsourcing' part of your quality control | |
| tests. But it simultaneously greatly devalues any benchmark, | |
| because that benchmark is now the goal. | |
| smusamashah wrote 1 day ago: | |
| I think training image models to pass these very specific tests | |
| correctly will be very difficult for any of these companies. How | |
| would they even do that? | |
| 8n4vidtmkvmk wrote 1 day ago: | |
| Hire a professional Photoshop artist to manually create the | |
| "correct" images and then put the before and after photos into | |
| the training data. Or however they've been training these models | |
| thus far, i don't know. | |
| And if that still doesn't get you there, hash the image inputs to | |
| detect if it's one of these test photos and then run your special | |
| test-passer algo. | |
| smusamashah wrote 3 hours 23 min ago: | |
| I don't think a few images done by any professional will have a | |
| measurable impact on training. | |
| vunderba wrote 1 day ago: | |
| Yeah, that's a classic problem, and it's why good tests are such | |
| closely guarded secrets: to keep them from becoming training fodder | |
| for the next generation of models. Regarding the "model date" vs | |
| "benchmark date" - that's an interesting point... I'll definitely | |
| look into it! | |
| I don't have any captcha systems in place, but I wonder if it might | |
| be worth putting up at least a few nominal roadblocks (such as | |
| Anubis [1]) to at least slow down the scrapers. | |
| A few weeks ago I actually added some new, more challenging tests | |
| to the GenAI Text-to-Image section of the site (the "angelic | |
| forge" and "overcrowded flat earth") just to keep pace with | |
| the latest SOTA models. | |
| In the next few weeks, I'll be adding some new benchmarks to the | |
| Image Editing section as well. [1] | |
| [1]: https://anubis.techaro.lol | |
| echelon wrote 1 day ago: | |
| The Blender previz reskin task [1] could be automated! New test | |
| cases could be randomly and procedurally generated (without AI). | |
| Generate a novel previz scene programatically in Blender or some | |
| 3D engine, then task the image model with rendering it in a style | |
| (or to style transfer to a given image, eg. something novel and | |
| unseen from Midjourney). Another test would be to replace stand | |
| in mannequins with identities of characters in reference images | |
| and make sure the poses and set blocking match. | |
| Throw in a 250 object asset pack and some skeletal meshes that | |
| can conform to novel poses, and you've got a fairly robust test | |
| framework. | |
| Furthermore, anything that succeeds from the previz rendering | |
| task can then be fed into another company's model and given a | |
| normal editing task, making it doubly useful for two entirely | |
| separate benchmarks. That is, successful previz generations can | |
| be reused as image edit test cases - and you know the subject | |
| matter a priori, without needing to label a bunch of images or run | |
| a VLM, so you can create a large set of unseen tests. | |
| [1]: https://imgur.com/gallery/previz-to-image-gpt-image-1-x8... | |
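The procedural-generation idea is straightforward to prototype: seed an RNG, emit a scene spec, and the ground truth is known by construction. A minimal sketch, with made-up asset and pose vocabularies standing in for a real Blender asset pack and skeletal meshes:

```python
import random

# Illustrative vocabularies; a real harness would draw from a 3D asset
# pack and posable skeletal meshes as described above.
ASSETS = ["crate", "streetlamp", "bench", "mannequin", "barrel"]
POSES = ["standing", "sitting", "crouching"]

def make_previz_case(seed: int, n_objects: int = 4) -> dict:
    """Deterministically generate a previz scene spec from a seed.

    Because the spec is constructed rather than scraped, the expected
    scene contents are known a priori and can't have leaked into any
    model's training data.
    """
    rng = random.Random(seed)
    objects = []
    for _ in range(n_objects):
        asset = rng.choice(ASSETS)
        obj = {"asset": asset,
               "pos": (round(rng.uniform(-5, 5), 2),
                       round(rng.uniform(-5, 5), 2))}
        if asset == "mannequin":  # only characters get a pose
            obj["pose"] = rng.choice(POSES)
        objects.append(obj)
    return {"seed": seed, "objects": objects}
```

The same seed always reproduces the same spec, so a grader can check the rendered image against a known object list and blocking layout without ever labeling images by hand.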
| irishcoffee wrote 1 day ago: | |
| > the only model that legitimately passed the Giraffe prompt. | |
| 10 years ago I would have considered that sentence satire. Now it | |
| allegedly means something. | |
| Somehow it feels like we're moving backwards. | |
| echelon wrote 1 day ago: | |
| > Somehow it feels like we're moving backwards. | |
| I don't understand why everyone isn't in awe of this. This is | |
| legitimately magical technology. | |
| We've had 60+ years of being able to express our ideas with | |
| keyboards. Steve Jobs' "bicycle of the mind". But in all this time | |
| we've had a really tough time of visually expressing ourselves. | |
| Only highly trained people can use Blender, Photoshop, Illustrator, | |
| etc. whereas almost everyone on earth can use a keyboard. | |
| Now we're turning the tide and letting everyone visually articulate | |
| themselves. This genuinely feels like computing all over again for | |
| the first time. I'm so unbelievably happy. And it only gets better | |
| from here. | |
| Every human should have the ability to visually articulate | |
| themselves. And it's finally happening. This is a major win for the | |
| world. | |
| I'm not the biggest fan of LLMs, but image and video models are a | |
| creator's dream come true. | |
| In the near future, the exact visions in our head will be | |
| shareable. We'll be able to iterate on concepts visually, | |
| collaboratively. And that's going to be magical. | |
| We're going to look back at pre-AI times as primitive. How did | |
| people ever express themselves? | |
| conradfr wrote 2 hours 39 min ago: | |
| It is amazing and impressive. But also an unlimited source of | |
| trash and slop during my internet use. | |
| concats wrote 1 day ago: | |
| "I've come up with a set of rules that describe our reactions | |
| to technologies: | |
| 1. Anything that is in the world when you're born is normal and | |
| ordinary and is just a natural part of the way the world works. | |
| 2. Anything that's invented between when you're fifteen and | |
| thirty-five is new and exciting and revolutionary and you can | |
| probably get a career in it. | |
| 3. Anything invented after you're thirty-five is against the | |
| natural order of things." | |
| - Douglas Adams | |
| vintermann wrote 1 day ago: | |
| Is that how it works this time, though? | |
| * I'm into genealogy. Naturally, most of my fellow genealogists | |
| are retired, often many years ago, though probably also above | |
| average in mental acuity and tech-savviness for their age. They | |
| LOVE generative AI. | |
| * My nieces, and my cousin's kids of the same age, are deeply | |
| into visual art. Especially animation, and cutesy Pokemon-like | |
| stuff. They take it very seriously. They absolutely DON'T like | |
| AI art. | |
| SchemaLoad wrote 1 day ago: | |
| I'm struggling to see the benefits. All I see people using this | |
| for is generating slop for work presentations, and misleading | |
| people on social media. Misleading might be understating it too. | |
| It's being used to create straight up propaganda and destruction | |
| of the sense of reality. | |
| irishcoffee wrote 1 day ago: | |
| You basically described magic mushrooms, where the description | |
| came from you while high on magic mushrooms. | |
| It's just a tool. It's not world-changing tech. It's a | |
| tool. | |
| Rodeoclash wrote 1 day ago: | |
| Where is all this wonderful visual self expression that people | |
| are now free to do? As far as I can tell it's mostly being used | |
| on LinkedIn posts. | |
| scrollaway wrote 1 day ago: | |
| It's a classic issue that you give access to superpowers to | |
| the general population and most will use them in the most | |
| boring ways. | |
| The internet is an amazing technology, yet its biggest | |
| consumption is a mix of ads, porn and brain rot. | |
| We all have cameras in our pockets yet most people use them for | |
| selfies. | |
| But if you look closely enough, the incredible value that comes | |
| from these examples more than makes up for all the people using | |
| them in a "boring" way. | |
| And anyway, who's the arbiter of boring? | |
| BoredPositron wrote 1 day ago: | |
| Nano Banana still has the best VAE we have seen, especially if you are | |
| doing high-res production work. Flux 2 comes close, but gpt image | |
| is still miles away. | |
| echelon wrote 1 day ago: | |
| I really love everything you're doing! | |
| Personal request: could you also advocate for "image previz | |
| rendering", which I feel is an extremely compelling use case for | |
| these companies to develop. Basically any 2d/3d compositor that | |
| allows you to visually block out a scene, then rely on the model to | |
| precisely position the set, set pieces, and character poses. | |
| If we got this task onto benchmarks, the companies would absolutely | |
| start training their models to perform well at it. | |
| Here are some examples: | |
| gpt-image-1 absolutely excels at this, though you don't have much | |
| control over the style and aesthetic: [1] Nano Banana (Pro) fails at | |
| this task: [2] Flux Kontext, Qwen, etc. have mixed results. | |
| I'm going to re-run these under gpt-image-1.5 and report back. | |
| Edit: | |
| gpt-image-1.5 : [3] And just as I finish this, Imgur deletes my | |
| original gpt-image-1 post. | |
| Old link (broken): [4] Hopefully imgur doesn't break these. I'll have | |
| to start blogging and keep these somewhere I control. | |
| [1]: https://imgur.com/gallery/previz-to-image-gpt-image-1-x8t1ij... | |
| [2]: https://imgur.com/a/previz-to-image-nano-banana-pro-Q2B8psd | |
| [3]: https://imgur.com/a/previz-to-image-gpt-image-1-5-3fq042U | |
| [4]: https://imgur.com/a/previz-to-image-gpt-image-1-Jq5M2Mh | |
| vunderba wrote 1 day ago: | |
| Thanks! A highly configurable Previz2Image model would be a | |
| fantastic addition. I was literally just thinking about this the | |
| other day (but more in the context of ControlNets and posable | |
| kinematic models). I'm even considering adding an early CG Poser | |
| blocked-out scene test to see how far the various editor models | |
| can take it. | |
| With additions like structured prompts (introduced in BFL Flux 2), | |
| maybe we'll see something like this in the near future. | |
| ares623 wrote 1 day ago: | |
| My copium is that analog photography makes a comeback as a way to | |
| recover some level of trust and authenticity. | |
| famahar wrote 1 day ago: | |
| I was reading a trend report on art and it seems like collage, | |
| squiggly hand drawn text, and lots of intentional imperfections are | |
| becoming popular. I'm not sure how hard it is for AI to recreate | |
| those, but it is nice to see people trying to do more of what AI | |
| struggles with. | |
| Forgeties79 wrote 1 day ago: | |
| Good luck getting it developed, unfortunately. I have to ship it | |
| off now; there isn't a single local spot in my city that will | |
| develop film anymore. | |
| ares623 wrote 1 day ago: | |
| When the demand is back, the labs should start coming back. There's | |
| a few in my relatively small city which is pretty surprising. But | |
| the costs are still too high to cover the low volume I guess. | |
| Forgeties79 wrote 1 day ago: | |
| The big issue is chemical disposal IIRC (which, yes, is a cost; | |
| just being more specific). | |
| celeryd wrote 1 day ago: | |
| If it can't generate non-sexual content of a woman in a bikini, I am | |
| not interested. | |
| brador wrote 1 day ago: | |
| Every person in every picture in their examples is white except for 1 | |
| Asian dude. Like a 46:1 ratio for the page (I counted). Not one Middle | |
| Eastern or Black or Jewish or Indian or South American person. | |
| Not even one. And no one on the team said anything? | |
| Come on Sam, do better. | |
| agentifysh wrote 1 day ago: | |
| I am very impressed. A benchmark I like to run is having it create | |
| sprite maps and UV texture maps for an imagined 3D model. | |
| Noticed it captured a megaman legends vibe .... [1] and here it | |
| generated a texture map from a 3d character [2] however im not sure if | |
| these are true uv maps that is accurate as i dont have the 3d models | |
| itself | |
| but I've tried this in Nano Banana when it first came out and it | |
| couldn't do it. | |
| [1]: https://x.com/AgentifySH/status/2001037332770615302 | |
| [2]: https://x.com/AgentifySH/status/2001038516067672390/photo/1 | |
| 101008 wrote 1 day ago: | |
| > however im not sure if these are true uv maps that is accurate as | |
| i dont have the 3d models itself | |
| also in the tweet | |
| > GPT Image 1.5 is **ing crazy | |
| and | |
| > holy shit lol | |
| what's impressive about it if you don't know whether it's right or | |
| not? (as the other comment pointed out, it is not right) | |
| gs17 wrote 1 day ago: | |
| > however im not sure if these are true uv maps | |
| I can tell you with 100% certainty they are not. For example, Crash | |
| doesn't have a backside for his torso. You could definitely make a | |
| model that uses these as textures, but you'd really have to force it | |
| and a lot of it would be stretched or look weird. If you want to | |
| go with this approach, it would make a lot more sense to make a | |
| model, unwrap it, and use the wireframe UV map as input. | |
| Here's the original Crash model: [1] , its actual texture is nothing | |
| like the generated one, because the real one was designed for | |
| efficiency. | |
| [1]: https://models.spriters-resource.com/pc_computer/crashbandic... | |
| Nition wrote 1 day ago: | |
| That's a remake model in a modern game. The original Crash was even | |
| simpler than that one. | |
| Most of Crash in the first game was not textured; just vertex | |
| colours. Only the fur on his back and his shoelaces were textures | |
| at all. | |
| gs17 wrote 1 day ago: | |
| "Original" as in the original of the one they used in their | |
| tweet. | |
| agentifysh wrote 1 day ago: | |
| yeah, definitely impressive compared to what nano banana outputted. | |
| tried your suggested approach with the unwrapped wireframe UV as | |
| input and I'm impressed [1] obviously it's not going to be accurate | |
| 1:1, but with more 3D spatial awareness I think it could definitely | |
| improve | |
| [1]: https://x.com/AgentifySH/status/2001057153235222867 | |
| gs17 wrote 1 day ago: | |
| > Still some scientific inaccuracies, but ~70% correct | |
| That's still dangerously bad for the use-case they're proposing. We | |
| don't need better looking but completely wrong infographics. | |
| rcarmo wrote 1 day ago: | |
| We don't, but most Marketing departments salivate for them. | |
| astrange wrote 1 day ago: | |
| It's pretty common for infographics to be wrong. The people making | |
| them aren't the same people who know the facts. | |
| I'd especially say like 100% of amateur political infographics/memes | |
| are wrong. ("climate change is caused by 100 companies" for instance) | |
| anonfunction wrote 1 day ago: | |
| The announcement said the API works with the new model, so I | |
| updated my Golang SDK grail ( [1] ) to use it, but it returns a 500 | |
| server error when you try, and if you switch to a completely | |
| unknown model, the new one isn't listed among the supported values: | |
| POST "https://api.openai.com/v1/responses": 500 Internal Server Error | |
| { | |
|   "message": "An error occurred while processing your request. You | |
|     can retry your request, or contact us through our help center | |
|     at help.openai.com if the error persists. Please include the | |
|     request ID req_******************* in your message.", | |
|   "type": "server_error", | |
|   "param": null, | |
|   "code": "server_error" | |
| } | |
| POST "https://api.openai.com/v1/responses": 400 Bad Request | |
| { | |
|   "message": "Invalid value: 'blah'. Supported values are: | |
|     'gpt-image-1' and 'gpt-image-1-mini'.", | |
|   "type": "invalid_request_error", | |
|   "param": "tools[0].model", | |
|   "code": "invalid_value" | |
| } | |
| [1]: https://github.com/montanaflynn/grail | |
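One workaround for the lag between announcement and API rollout: the 400 response above enumerates the currently supported values, so a client can probe with a deliberately bad model name and parse the list out of the error. A minimal, hypothetical sketch (the `supported_models` helper is mine; the message text is copied from the response above):

```python
import re

# Error text copied from the 400 invalid_value response above.
ERROR_MSG = ("Invalid value: 'blah'. Supported values are: "
             "'gpt-image-1' and 'gpt-image-1-mini'.")

def supported_models(message):
    """Pull the quoted model names out of an invalid_value error message."""
    # Everything after "Supported values are:" holds the quoted names.
    tail = message.split("Supported values are:", 1)[-1]
    return re.findall(r"'([^']+)'", tail)

print(supported_models(ERROR_MSG))  # ['gpt-image-1', 'gpt-image-1-mini']
```

Crude, but it gives you a machine-readable answer until the documented model list catches up with the announcement.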
| aziis98 wrote 1 day ago: | |
| I know this is a bit out of scope for these image editing models but I | |
| always try this experiment [1] of drawing a "random" triangle and then | |
| doing some geometric construction and they mess up in very funny ways. | |
| These models can't "see" very well. I think [2] is still very relevant. | |
| [1]: https://chatgpt.com/share/6941c96c-c160-8005-bea6-c809e58591c1 | |
| [2]: https://vlmsareblind.github.io/ | |
| zkmon wrote 1 day ago: | |
| AI-generated images would remove all the trust in and admiration | |
| for human talent in art, similar to how text generation would | |
| remove trust in and admiration for human talent in writing. Same | |
| goes for coding. | |
| So, let's simulate that future. Since no one trusts your talent in | |
| coding, art or writing, you wouldn't care to do any of these. But the | |
| economy is built on the products and services whose value depends | |
| on how much human talent and effort is required to produce them. | |
| So, the value of these services and products goes down as demand | |
| and trust go down. No one knows or cares who is a good programmer | |
| on the team, who is a great thinker and writer, and who is a | |
| modern Picasso. | |
| So, the motivation disappears for humans. There are no achievements to | |
| target, there is no way to impress others with your talent. This should | |
| lead to a uniform workforce without much difference in talent. | |
| Pretty much a robot army. | |
| arnz-arnz wrote 1 day ago: | |
| all I can hope for is that a new industry or reliable ecosystem of | |
| vetters of real human talent will emerge. Are you really as good a | |
| writer as you claim to be? Show us the badge. That or AI firms have | |
| to be forced to 'watermark' all their creative outputs, and anyone | |
| misleading the public/audience should be punishable by law. | |
| zkmon wrote 1 day ago: | |
| Both are just mid-summer dreams. There is no global law to enforce | |
| watermark. There are no badges that can't be forged. | |
| arnz-arnz wrote 1 day ago: | |
| There isn't but that doesn't mean there won't be. It can even go | |
| as far as banning certain features. There isn't just hope with | |
| the kind of politics we have right now. | |
| gostsamo wrote 1 day ago: | |
| Alt text is one of the nicest uses for AI, and still OpenAI didn't | |
| bother using it for something so basic. The dogfooding is not strong | |
| with their marketing team. | |
| KaiserPro wrote 1 day ago: | |
| Is there watermarking, or some other way for normal people to tell | |
| if it's fake? | |
| qingcharles wrote 1 day ago: | |
| Not if you strip the EXIF data. Also, it will strip the star | |
| watermark and SynthID from Gemini if you paste a Nano Banana pic in | |
| and tell it to mirror it. | |
| wavemode wrote 1 day ago: | |
| I think society is going to need the opposite - cameras that can | |
| embed cryptographic information in the pixels of a video indicating | |
| the image is real. | |
| laurent123456 wrote 1 day ago: | |
| There are ways to tell if an image is real, if it's been signed | |
| cryptographically by the camera for example, but increasingly it | |
| probably won't be possible to tell if something is fake. Even if | |
| there's some kind of hidden watermark embedded in the pixels, you can | |
| process it with img2img in another tool and get rid of the watermark. | |
| EXIF data etc. is irrelevant; you can get rid of it easily or fake | |
| it. | |
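The "easy to get rid of" part really is easy for PNG: EXIF, text comments, and C2PA manifests all live in ancillary chunks, so dropping everything outside the critical set removes them. A rough stdlib-only sketch (illustrative, not a hardened sanitizer; it assumes a well-formed file):

```python
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"
CRITICAL = {b"IHDR", b"PLTE", b"IDAT", b"IEND"}  # chunks needed to render

def strip_ancillary(data):
    """Return a copy of a PNG with every non-critical chunk removed
    (eXIf, tEXt, C2PA manifests, and so on)."""
    assert data[:8] == PNG_SIG, "not a PNG"
    out, pos = bytearray(data[:8]), 8
    while pos < len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        end = pos + 12 + length  # 4B length + 4B type + payload + 4B CRC
        if ctype in CRITICAL:
            out += data[pos:end]
        pos = end
    return bytes(out)
```

Re-encoding the image (screenshot, img2img, format conversion) achieves the same thing with even less effort, which is why chunk-level provenance can only ever flag cooperative publishers.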
| ewoodrich wrote 1 day ago: | |
| Sure, you can always remove it, but an average person posting AI | |
| images on Facebook or whatever probably won't bother. I was | |
| skeptical of Google's SynthID when I first heard about it but I've | |
| been seeing it used to identify suspected AI images on Reddit | |
| recently (the example I saw today was cropped and lightly edited | |
| with a filter but still got flagged correctly) and it's cool to | |
| have a hard data point when present. It won't help with | |
| bad/manipulative actors, but it's a decent mitigation for the | |
| low-effort slop scenario, since it can survive the kind of basic | |
| editing a regular person knows how to do on their phone and the | |
| typical compression when uploading/serving. | |
| mnorris wrote 1 day ago: | |
| I ran exiftool on an image I just generated: | |
| $ exiftool chatgpt_image.png | |
| ... | |
| Actions Software Agent Name   : GPT-4o | |
| Actions Digital Source Type   : [1] | |
| Name                          : jumbf manifest | |
| Alg                           : sha256 | |
| Hash                          : (Binary data 32 bytes, use -b option to extract) | |
| Pad                           : (Binary data 8 bytes, use -b option to extract) | |
| Claim Generator Info Name     : ChatGPT | |
| ... | |
| [1]: http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgori... | |
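For context, that "jumbf manifest" is the C2PA content-credentials blob; in PNG files it is carried in a dedicated `caBX` chunk, so its presence can be detected without exiftool by walking the chunk list. A small sketch assuming a well-formed PNG:

```python
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def png_chunk_types(data):
    """List a PNG's chunk type codes in file order."""
    assert data[:8] == PNG_SIG, "not a PNG"
    types, pos = [], 8
    while pos < len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        types.append(ctype.decode("ascii"))
        pos += 12 + length  # 4B length + 4B type + payload + 4B CRC
    return types

def has_c2pa_manifest(data):
    # C2PA embeds its JUMBF manifest in a 'caBX' chunk in PNG files.
    return "caBX" in png_chunk_types(data)
```

Presence is meaningful; absence proves nothing, since the chunk is trivially stripped or lost on re-encode.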
| KaiserPro wrote 1 day ago: | |
| Exif isn't all that robust though. | |
| I suppose I'm going to have to bite the bullet and actually train | |
| an AI detector that works roughly in real time. | |
| mmh0000 wrote 1 day ago: | |
| I know OpenAI watermarks their stuff. But I wish they wouldn't. It's | |
| a "false" trust. | |
| Now it means whoever has access to uncensored/non-watermarking models | |
| can pass off their faked images as real and claim, "Look! There's no | |
| watermark, of course, it's not fake!" | |
| Whereas, if none of the image models did watermarking, then people | |
| (should) inherently know nothing can be trusted by default. | |
| pbmonster wrote 1 day ago: | |
| Yeah, I'd go the other way. Camera manufacturers should have the | |
| camera cryptographically sign the data from the sensor directly in | |
| hardware, and then provide an API to query if a signed image was | |
| taken on one of their cameras. | |
| Add an anonymizing scheme (blind signatures or group signatures), | |
| done. | |
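Mechanically, the idea reduces to: hash the raw sensor readout, sign it with a key fused into the camera, and ship the tag alongside the file. A toy sketch; HMAC-SHA256 here is only a stand-in for the asymmetric signature real hardware would use (with HMAC the verifier would need the device secret, which is exactly what signature schemes avoid):

```python
import hashlib
import hmac

def sign_capture(sensor_bytes, device_key):
    """Tag raw sensor data at capture time (HMAC as a stand-in for a
    hardware-held asymmetric signing key)."""
    return hmac.new(device_key, sensor_bytes, hashlib.sha256).digest()

def verify_capture(sensor_bytes, tag, device_key):
    """True only if the pixels are byte-identical to the signed readout."""
    return hmac.compare_digest(sign_capture(sensor_bytes, device_key), tag)
```

The catch is that any edit, even an innocent crop or resize, invalidates the tag, so a practical scheme has to sign a chain of edits rather than a single capture.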
| PhilippGille wrote 1 day ago: | |
| [1] It doesn't mention the new model, but it's likely the same or | |
| similar. | |
| [1]: https://help.openai.com/en/articles/8912793-c2pa-in-chatgpt-... | |
| adrian17 wrote 1 day ago: | |
| I just checked several of the files uploaded to the news post, the | |
| "previous" and "new", both the png and webp (&fm=webp in url) | |
| versions - none had the content credentials metadata. So either the | |
| internal version they used to generate them skipped it, or they | |
| just stripped the metadata when uploading. | |
| dzonga wrote 1 day ago: | |
| we seriously can't be burning GW of energy just to have sama in a | |
| GPT-Shirt Ad generated by A.I | |
| impressive stuff though - as you can give it a base image + prompt. | |
| astrange wrote 1 day ago: | |
| It's a joke about one of his old fits. | |
| [1]: https://x.com/coldhealing/status/1747270233306644560 | |
| drawnwren wrote 1 day ago: | |
| counterpoint: we should make energy abundant enough that it really | |
| doesn't matter if sama wants to generate gpt-shirt ads or not. | |
| we have the capability, we just stopped making power more abundant. | |
| iknowstuff wrote 1 day ago: | |
| I think we can say the pause we took was reasonable once we | |
| realized the environmental impact of dumping greenhouse gases into | |
| the atmosphere, but now, for the sake of further growth, let's | |
| make sure we restart, just clean this time. | |
| sfmike wrote 1 day ago: | |
| Hope to see more "red alert" status from the AI wars putting | |
| companies into all-hands-on-deck mode. This is only helping the | |
| cost of tokens and efficacy. As always, competition only helps the | |
| end users. | |
| surrTurr wrote 1 day ago: | |
| not super impressed. feels like 70% as good as nano banana pro. | |
| oxag3n wrote 1 day ago: | |
| If this were a farm of sweatshop Photoshoppers in 2010, who | |
| downloaded all images from the internet and provided a service of | |
| combining them on request, this would escalate pretty quickly. | |
| Question: with copyright and authorship dead wrt AI, how do I make (at | |
| least) new content protected? | |
| Anecdotal: I had a hobby of doing photos in quite a rare style and | |
| lived in a place where you'd get quite a few pictures of it. When | |
| I asked GPT to generate a picture of that area in that style, it | |
| returned a highly modified but recognizable copy of a photo I'd | |
| published years ago. | |
| pfortuny wrote 1 day ago: | |
| I guess some kind of hard (repetitive) steganography where the | |
| private key signature of the original photo is somehow encoded lots | |
| of times; also watermarking everything and asking the reader for some | |
| kind of verification if they want their non-watermarked copy. | |
| There seems to be no other way (apart from air-gapping everything, as | |
| others say). | |
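The simplest version of "encoded lots of times" is repeated least-significant-bit embedding. A toy sketch over a flat list of 8-bit pixel values (illustrative only; as the surrounding comments note, a plain LSB mark does not survive recompression or img2img):

```python
def embed_bits(pixels, bits):
    """Hide a bit string in the least significant bits of the first
    len(bits) pixel values (pixels: flat list of 0-255 ints)."""
    assert len(bits) <= len(pixels), "payload larger than image"
    out = list(pixels)
    for i, b in enumerate(bits):
        out[i] = (out[i] & ~1) | int(b)  # clear LSB, then set payload bit
    return out

def extract_bits(pixels, n):
    """Read back the first n hidden bits as a string."""
    return "".join(str(p & 1) for p in pixels[:n])
```

In practice the payload (e.g. a signature) would be repeated across many regions so crops still carry at least one full copy; robust schemes like SynthID embed in ways designed to survive mild editing, which this does not.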
| 999900000999 wrote 1 day ago: | |
| A middle ground would be ChatGPT at least providing attribution. | |
| Back in reality, you can get in line to sue. Since they have more | |
| money than you, you can't really win though. | |
| So it goes. | |
| ur-whale wrote 1 day ago: | |
| > Question: with copyright and authorship dead wrt AI, how do I make | |
| (at least) new content protected? | |
| Question: Now that the steamboats have been invented, how do I keep | |
| my clipper business afloat ? | |
| Answer: Good riddance to the broken idea of IP, Schumpeter's Gale is | |
| around the corner, time for a new business model. | |
| LudwigNagasena wrote 1 day ago: | |
| Using references is a standard industry practice for digital art and | |
| VFX. The main difference is that you are unable to accidentally copy | |
| a reference too closely, while with AI it's possible. | |
| mortenjorck wrote 1 day ago: | |
| > how do I make (at least) new content protected? | |
| Air gap. If you don't want content to be used without your | |
| permission, it never leaves your computer. This is the only | |
| protection that works. | |
| If you want others to see your content, however, you have to accept | |
| some degree of trade off with it being misappropriated. Blatant cases | |
| can be addressed the same as they always were, but a model | |
| overfitting to your original work poses an interesting question for | |
| which I'm not aware of any legal precedents having been set yet. | |
| echelon wrote 1 day ago: | |
| Horror scenario: | |
| Big IP holders will go nuclear on IP licensing to an extent we've | |
| never seen before. | |
| Right now, there are thousands of images and videos of Star Wars, | |
| Pokemon, Superman, Sonic, etc. being posted across social media. | |
| All it takes is for the biggest IP conglomerates to turn into | |
| linear tv and sports networks of the past and treat social media | |
| like cable. | |
| Disney: "Gee {Google,Meta,Reddit,TikTok}, we see you have a lot of | |
| Star Wars and Marvel content. We think that's a violation of our | |
| rights. If you want your users to continue to be able to post our | |
| media, you need to pay us $5B/yr." | |
| I would not be surprised if this happens now that every user on the | |
| internet can soon create high-fidelity content. | |
| This could be a new $20-30B/yr business for Disney. Nintendo, WBD, | |
| and lots of other giant IP holders could easily follow suit. | |
| empressplay wrote 1 day ago: | |
| Disney invests $1 billion in OpenAI, licenses 200 characters for | |
| AI video app Sora | |
| [1]: https://arstechnica.com/ai/2025/12/disney-invests-1-bill... | |
| echelon wrote 1 day ago: | |
| One day later, "Google pulls AI-generated videos of Disney | |
| characters from YouTube in response to cease and desist": [1] | |
| The next step is to take this beyond AI generations and to | |
| license rights to characters and IP on social media directly. | |
| The next salvo will be where YouTube has to take down all major | |
| IP-related content if they don't pay a licensing fee. | |
| Regardless of how it was created. Movie reviews, fan | |
| animations, video game let's plays. | |
| I've got a strong feeling that day is coming soon. | |
| [1]: https://www.engadget.com/ai/google-pulls-ai-generated-... | |
| margorczynski wrote 1 day ago: | |
| We are probably entering the post-copyright era. The law will follow | |
| sooner or later. | |
| oblio wrote 4 hours 9 min ago: | |
| Yup, just like the post-copyright era followed the dawn of the | |
| internet and the emergence of Napster. | |
| rafram wrote 1 day ago: | |
| That seems unlikely to me. One side is made up of lots and lots of | |
| entrenched interests with sympathetic figures like authors and | |
| artists on their side, and the other is "big tech," dominated | |
| by the rather unsympathetic OpenAI and Google. | |
| realharo wrote 1 day ago: | |
| The other side however has the "if you restrict us, China will | |
| win" argument on their side. | |
| panopticon wrote 16 hours 34 min ago: | |
| That argument is easy to politicize and selectively ignore. | |
| See: renewables and EVs. | |
| nobody_r_knows wrote 1 day ago: | |
| my question to your anecdote: who cares? not being fecicious, but | |
| who cares if someone reproduced your stuff and millions of people | |
| see it? is it the money that you want? is it the fame? because | |
| fame you will get, maybe not money... but couldn't there be | |
| another way? | |
| whywhywhywhy wrote 14 hours 27 min ago: | |
| The people building the tech are extremely fussy about their work | |
| being cited and extremely protective of their model files, so they | |
| themselves have massive issues with their work being used or | |
| replicated non-consensually. | |
| oxag3n wrote 1 day ago: | |
| To clarify my question - I do not want anything I create to be fed | |
| into their training data. That photo is just an example that I | |
| caught, and it became personal. But in general I no longer want to | |
| open source my code, write articles, or put any effort into | |
| improving the training data set. | |
| Forgeties79 wrote 1 day ago: | |
| As a professional cinematographer/photographer I am incredibly | |
| uncomfortable with people using my art without my permission for | |
| unknown ends. Doubly so when it's venture-backed private | |
| companies stealing from millions of people like me as they make | |
| vague promises about the capabilities of their software trained on | |
| my work. It doesn't take much to understand why that makes me | |
| uncomfortable and why I feel I am entitled to say "no." | |
| Legally I am entitled to that in so many cases, yet for some reason | |
| Altman et al get to skip that hurdle. Why? | |
| How do you feel about entities taking your face off of your | |
| personal website and plastering it on billboards, smiling happily | |
| next to their product? What if it's for a gun? Or condoms? Or a | |
| candidate for a party you don't support? Pick your own example if | |
| none of those bother you. I'm sure there are things you do not | |
| want to be associated with or don't want to contribute to. | |
| At the end of the day it's very gross when we are exploited | |
| without our knowledge or permission so rich groups can get richer. | |
| I don't care if my visual work is only partially contributing to | |
| some mashed-up final image. I don't want to be a part of it. | |
| vintermann wrote 1 day ago: | |
| > How do you feel about entities taking your face off of your | |
| personal website and plastering it on billboards smiling happily | |
| next to their product? | |
| That would be misrepresentation. Even Stallman isn't OK with | |
| that. You can take one of his opinion pieces and publish it as | |
| your own. Or you can attach his name to it. | |
| However, if you're editing it and releasing it under his name, | |
| clearly you're simply lying, and nobody is OK with that. People | |
| have the right to be recognized as authors of things they did | |
| author (if they so desire) and they have a right to NOT be | |
| associated with things they didn't. | |
| > At the end of the day it's very gross when we are exploited | |
| without our knowledge or permission so rich groups can get | |
| richer. | |
| The second part is the entirety of the problem. If I'm | |
| "exploited" in a way where I can't even notice it, and I'm not | |
| worse off for it, how is it even exploitation? But people | |
| amassing great power is a problem no matter if they do it with | |
| "legitimate" means or not. | |
| Forgeties79 wrote 1 day ago: | |
| If somebody is stealing from your bank account every week and you | |
| just don't notice it, are you not being stolen from? Has nobody | |
| ever had their credit card stolen and used until the moment they | |
| noticed the charges? I don't really think we can go "if a tree | |
| falls in the forest and nobody is around to hear it..." about | |
| this. | |
| Stallman has his opinions on software; I have my opinions on my | |
| visual work. I don't really get how that applies here or why that | |
| settles this matter. | |
| vintermann wrote 1 day ago: | |
| If someone steals from my bank account I certainly CAN notice | |
| it even if I don't immediately, and I'm certainly worse off. | |
| That's such a bad straw man I wonder if you're really | |
| supporting the position you claim to be supporting. Maybe | |
| you're just trying to give it a bad name. | |
| Your opinion isn't on visual work, but visual property. You | |
| don't demand to be paid for your work - your labor. Rather | |
| you traded that for the dream of being paid rent on a capital | |
| object, in perpetuity (or close enough). Artists lost to the | |
| power-mongers when we bit at that bait. | |
| Forgeties79 wrote 20 hours 40 min ago: | |
| If you think that's a bad example, so be it, but I'm not | |
| attempting to make a strawman or give anything a bad name. | |
| I don't really know where all the hostility came from in | |
| this conversation, but I think it's best if we move on. | |
| smileson2 wrote 1 day ago: | |
| You should be proud your work will now be distilled eternally and | |
| an aspect of your work will forever influence the world | |
| Forgeties79 wrote 20 hours 40 min ago: | |
| I'm not | |
| CamperBob2 wrote 1 day ago: | |
| The day after I first heard about the Internet, back in | |
| 1990-whatever, it occurred to me that I probably shouldn't upload | |
| anything to the Internet that I didn't want to see on the front | |
| page of tomorrow's newspaper. | |
| Apart from the 'newspaper' anachronism, that's pretty much still | |
| my take. | |
| Sorry, but you'll just have to deal with it and get over it. | |
| Forgeties79 wrote 1 day ago: | |
| > Sorry, but you'll just have to deal with it and get over it. | |
| You were fine until this bit. | |
| onraglanroad wrote 1 day ago: | |
| They're still fine because they're right. | |
| You got to play the copyright game when the big corps were on | |
| your side. | |
| Now they're on the other side. Deal with it and get over it. | |
| Forgeties79 wrote 1 day ago: | |
| You are not entitled to my art. Comparing that to copyright | |
| abuse by large corporations is ridiculous. | |
| CamperBob2 wrote 20 hours 23 min ago: | |
| I get access to inspiration from everybody's art, and so | |
| do you. Seems like a good deal to me. | |
| Meanwhile, the next generation of great artists is | |
| already at work down the street from you. Some kids | |
| you've never heard of, playing around in a basement or | |
| garage you've probably driven past a hundred times. | |
| They're learning to make the most of the tools at hand, | |
| just like the old masters did. Except the tools at hand | |
| this time are little short of godlike. | |
| It's an exciting time. If you wanted things to stay the | |
| same, you shouldn't have gone into technology or art. | |
| Forgeties79 wrote 14 hours 38 min ago: | |
| Inspiring artists =/= involuntarily training privately | |
| owned LLMs that charge for access. | |
| If you want me to hand some of my work over to artists | |
| so they can learn and grow and experiment, send them my | |
| way. Happy to help. | |
| CamperBob2 wrote 13 hours 40 min ago: | |
| Inspiring artists =/= involuntarily training | |
| privately owned LLMs that charge for access. | |
| Agreed there, which is why it's important to work for | |
| open access to the results. The resulting regime | |
| won't look much like present-day copyright law, but | |
| if we do it right, it will be better for us all. | |
| In other words, instead of insisting that "No one can | |
| have this," or "Only a few can have this," which | |
| (again) will not be options for works that you | |
| release commercially, it's better IMHO to insist that | |
| "Everyone can have this." | |
| Forgeties79 wrote 49 min ago: | |
| > In other words, instead of insisting that "No one | |
| can have this," or "Only a few can have this, | |
| Please show me where I ever said anything remotely | |
| like that. You're painting my stance as very all | |
| or nothing, which is inaccurate. | |
| You're trying to make me into some caricature | |
| that you can grind your axe against, when I'm | |
| somebody who doesn't even agree with modern | |
| copyright law. I think we're past the point of | |
| productivity, so I'll just leave it there. Have a | |
| good one | |
| illwrks wrote 1 day ago: | |
| The issue is ownership, not promotion or visibility. | |
| jibal wrote 1 day ago: | |
| facetious | |
| [I won't bother responding to the rest of your appalling comment] | |
| swatcoder wrote 1 day ago: | |
| People have values that go beyond wealth and fame. Some people care | |
| about things like personal agency, respect and deference, etc. | |
| If someone were on vacation and came home to learn that their | |
| neighbor had allowed some friends stay in the empty house, we would | |
| often expect some kind of outrage regardless of whether there had | |
| been specific damage or wear to the home. | |
| Culturally, people have deeply set ideas about what's theirs, and | |
| feel like they deserve some say over how their things are used and | |
| by whom. Even those that are very generous and want their things | |
| to be widely shared usually want to have some voice in making that | |
| come to be. | |
| visarga wrote 1 day ago: | |
| If I were a creative I would avoid seeing any work I am not | |
| legally allowed to get inspired by; why install furniture into my | |
| brain I can't sit on? I see this kind of IP protection as | |
| poisoned grounds, can't do anything on top of it. | |
| netule wrote 1 day ago: | |
| Suddenly, copyright doesn't matter anymore when it's no longer | |
| useful to the narrative. | |
| CamperBob2 wrote 1 day ago: | |
| (Shrug) This is more important. Sorry. | |
| ragequittah wrote 1 day ago: | |
| Copyright has overstepped its initial purpose by leaps and bounds | |
| because corporations make the law. If you're not cynical about | |
| how Copyright currently works you probably haven't been paying | |
| attention. And it doesn't take much to go from cynical to | |
| nihilist in this case. | |
| netule wrote 1 day ago: | |
| There's definitely a case of miscommunication at play if you | |
| didn't read cynicism into my original post. I broadly agree | |
| with you, but I'll leave it at that to prevent further | |
| fruitless arguing about specifics. | |
| BoorishBears wrote 1 day ago: | |
| OpenAI does care about copyright, thankfully China does not: [1] | |
| (to clarify, OpenAI stops refining the image if a classifier | |
| detects your image as potentially violating certain copyrights. | |
| Although the gulf in resolution is not caused by that.) | |
| [1]: https://imgur.com/a/RKxYIyi | |
| blurbleblurble wrote 1 day ago: | |
| It's really weird to see "make images from memories that aren't real" | |
| as a product pitch | |
| impjohn wrote 1 day ago: | |
| This is what struck me as well. I got weird undertones of 'Now you | |
| don't even need to have real memories! Just fabricate them.' They | |
| even prominently showcase edits of placing you with another person, | |
| further deepening disingenuous or parasocial relationships | |
| 999900000999 wrote 1 day ago: | |
| I can actually imagine actors selling the rights to make fake images | |
| with them. | |
| In late stage capitalism you pay for fake photos with someone. You | |
| have ChatGPT write about how you dated for a summer, and have it end | |
| with them leaving for grad school to explain why you aren't together. | |
| Eventually we'll all just pay to live in the matrix. When your credit | |
| card is declined you'll be logged out, to awaken in a shared studio | |
| apartment. To eat your rations. | |
| oblio wrote 3 hours 26 min ago: | |
| > When your credit card is declined you'll be logged out, to awaken | |
| in a shared studio apartment. To eat your rations. | |
| You're funny. No, you'll awaken in a tent, next to your shopping | |
| cart, under the bridge. | |
| ares623 wrote 1 day ago: | |
| I can see them getting paid like residuals from TV re-runs. | |
| But at some point it'll hit saturation. The novelty will wear | |
| off since everyone has access to it. Who cares if you have a fake | |
| photo with a celebrity if everyone knows it's fake. | |
| nurettin wrote 1 day ago: | |
| It would creep me out if the model produced origami animals for that | |
| prompt. | |
| kingstnap wrote 1 day ago: | |
| It's strange to me too, but they must have done the market research | |
| for what people do with image gen. | |
| My own main use cases are entirely textual: Programming, Wiki, and | |
| Mathematics. | |
| I almost never use image generation for anything. However, it's | |
| objectively extremely popular. | |
| This has strong parallels for me to when snapchat filters became | |
| super popular. I know lots of people loved editing and filtering | |
| pictures but I always left everything as auto mode, in fact I'd turn | |
| off a lot of the default beauty filters. It just never appealed to | |
| me. | |
| StarterPro wrote 1 day ago: | |
| In the image they showed for the new one, the mechanic was checking a | |
| dipstick...that was still in the vehicle. | |
| I really hope everyone is starting to get disillusioned with OpenAI. | |
| They're just charging you more and more for what? Shitty images that | |
| are easy to sniff out? | |
| In that case, I have a startup for you to invest in. It's a | |
| bridge-selling app. | |
| czhu12 wrote 1 day ago: | |
| Haven't their prices stayed at $20/mo for a while now? | |
| wahnfrieden wrote 1 day ago: | |
| They've published anticipated price increases over coming years. | |
| Prices will rise dramatically and steadily to meet revenue targets. | |
| cheema33 wrote 1 day ago: | |
| AI doesnât have much of a moat. People can and will easily | |
| switch providers. | |
| wahnfrieden wrote 1 day ago: | |
| Sure but there are only a couple leading providers worth | |
| considering for coding at least, and there will be | |
| consolidation once investment pulls back. They may find a way | |
| to collude on raising prices. | |
| Where switching will be easier is with casual chat users plus | |
| API consumers that are already using substandard models for | |
| cost efficiency. But there will also always be a market for | |
| state of art quality. | |
| wahnfrieden wrote 19 hours 53 min ago: | |
| Reinforced today: | |
| As Gemini has gained competitiveness (higher confidence in | |
| its output, better reputation), its prices have steadily | |
| risen | |
| 0dayman wrote 1 day ago: | |
| nah Nano Banana Pro is much better | |
| alasano wrote 1 day ago: | |
| It's still not available in the API despite them announcing the | |
| availability. | |
| They even linked to their Image Playground where it's also not | |
| available.. | |
| I updated my local playground to support it and I'm just handling the | |
| 404 on the model gracefully | |
| [1]: https://github.com/alasano/gpt-image-1-playground | |
| weird-eye-issue wrote 1 day ago: | |
| My Enterprise account got an email 1.5 hours ago that it is available | |
| in API but my other accounts haven't gotten any email yet | |
| anonfunction wrote 1 day ago: | |
| Yeah I just tried it and got a 500 server error with no details as to | |
| why: | |
| POST "https://api.openai.com/v1/responses": 500 Internal Server | |
| Error { | |
| "message": "An error occurred while processing your request. You | |
| can retry your request, or contact us through our help center at | |
| help.openai.com if the error persists. Please include the request ID | |
| req_******************* in your message.", | |
| "type": "server_error", | |
| "param": null, | |
| "code": "server_error" | |
| } | |
| Interestingly, if you request a bogus model like 'blah' you get an | |
| error showing this: | |
| POST "https://api.openai.com/v1/responses": 400 Bad Request { | |
| "message": "Invalid value: 'blah'. Supported values are: | |
| 'gpt-image-1' and 'gpt-image-1-mini'.", | |
| "type": "invalid_request_error", | |
| "param": "tools[0].model", | |
| "code": "invalid_value" | |
| } | |
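The two payloads quoted above suggest a simple client-side policy, along the lines of what the playground linked earlier does when it "handles the 404 gracefully": retry on server errors, fall back to a known model when the requested one hasn't rolled out. A minimal sketch; the action names and the fallback policy are my assumptions for illustration, not documented OpenAI behavior:

```python
# Sketch: classifying OpenAI-style error payloads so a client can retry
# transient 500s and fall back when a model hasn't rolled out yet.
# Action names and fallback policy are illustrative assumptions.
import json


def classify_api_error(status: int, body: str) -> str:
    """Map an HTTP status plus error body to a client action."""
    try:
        err = json.loads(body)
    except json.JSONDecodeError:
        # Unparseable body: only retry if it looks like a server fault.
        return "retry" if status >= 500 else "fail"
    code = err.get("code") or err.get("error", {}).get("code")
    if status >= 500 or code == "server_error":
        return "retry"       # transient; retry with backoff
    if code in ("invalid_value", "model_not_found"):
        return "fallback"    # e.g. drop back to gpt-image-1
    return "fail"


# Shapes matching the two responses quoted above:
server_err = '{"message": "An error occurred", "type": "server_error", "param": null, "code": "server_error"}'
bad_model = '{"message": "Invalid value", "type": "invalid_request_error", "param": "tools[0].model", "code": "invalid_value"}'
```

With something like this, a client can degrade to the older model during a staggered rollout instead of surfacing a raw 500 to the user.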
| minimaxir wrote 1 day ago: | |
| It's a staggered rollout but I am not seeing it on the backend | |
| either. | |
| joshstrange wrote 1 day ago: | |
| > staggered rollout | |
| It's too bad no OpenAI Engineers (or Marketers?) know that term | |
| exists. /s | |
| I do not understand why it's so hard for them to just tell the | |
| truth. So many announcements "Available today for Plus/Pro/etc" | |
| really means "Sometime this week at best, maybe multiple weeks". | |
| I'm not asking for them to roll out faster, just communicate | |
| better. | |
| rvz wrote 1 day ago: | |
| Another bunch of "startups" have been eliminated. | |
| moralestapia wrote 1 day ago: | |
| Among those, Photoshop. | |
| koakuma-chan wrote 1 day ago: | |
| I wish. Even Nano Banana Pro still sucks for even basic operations. | |
| mohsen1 wrote 1 day ago: | |
| Unlike Nano Banana it allows generating photos of children. Always fun | |
| to ask AI to imagine the children of a couple, but it's also kinda | |
| concerning that there might be terrible use cases. | |
| BoorishBears wrote 1 day ago: | |
| I haven't seen that; meanwhile gpt-image-1.5 still has zero-tolerance | |
| policing of copyright (even via the API), so it's pretty much useless | |
| in production once exposed to consumers. | |
| I'm honestly surprised they're still on this post-Sora 2: let the | |
| consumer of the API determine their risk appetite. If a copyright | |
| holder comes knocking, "the API did it" isn't going to be a defense | |
| either way. | |
| hexage1814 wrote 1 day ago: | |
| If memory serves me, Nano Banana allows generating/editing photos of | |
| children. But anything that could be misinterpreted, gets blocked, | |
| even absolutely benign and innocent things (especially if you are | |
| asking to modify a photo that you upload there). So they allow, but | |
| they turn on the guardrails to a point that might not be useful in | |
| many situations. | |
| r053bud wrote 1 day ago: | |
| I was able to generate photos of my imagined children via Nano Banana | |
| catigula wrote 1 day ago: | |
| Nano Banana Pro is so good that any other attempt feels 1-2 generations | |
| behind. | |
| Jonovono wrote 1 day ago: | |
| Nano banana pro is almost as good as seedream 4.5! | |
| BoorishBears wrote 1 day ago: | |
| Seedream 4.5 is almost as good as Seedream 4! | |
| (Realistically, Seedream 4 is the best at aesthetically pleasing | |
| generation, Nano Banana Pro is the best at realism and editing, and | |
| Seedream 4.5 is a very strong middleground between the two with | |
| great pricing) | |
| gpt-image-1.5 feels like OpenAI doing the bare minimum to keep | |
| people from switching to Gemini every time they want an image. | |
| pdevr wrote 1 day ago: | |
| >Now remove the two men, just keep the dog, and put them in an OpenAI | |
| livestream that looks like the attached image. | |
| Where is the image given along with the prompt? If I didn't miss it: | |
| Would have been nice to show the attached image. | |
| taytus wrote 1 day ago: | |
| It's on top of the prompt. The page has a weird layout; I had to | |
| scroll up to see it. | |
| xnx wrote 1 day ago: | |
| Great to have continued competition in the different model types. | |
| What angle is there for second tier models? Could the future for OpenAI | |
| be providing a cheaper option when you don't need the best? It seems | |
| like that segment would also be dominated by the leading models. | |
| I would imagine the future shakes out as: first class hosted models, | |
| hosted uncensored models, local models. | |
| sharkjacobs wrote 1 day ago: | |
| Was it ever explained or understood why ChatGPT Images always has | |
| (had?) that yellow cast? | |
| onoesworkacct wrote 1 day ago: | |
| There's definitely an analysis on the net somewhere, can't remember | |
| the details though. | |
| efilife wrote 1 day ago: | |
| I'm guessing that it was intentional all along, as no other models | |
| exhibit this behavior. It was so it could be instantly recognized as | |
| ChatGPT | |
| varjag wrote 1 day ago: | |
| Not always, it started at a very specific point. Studio Ghibli craze | |
| + reinforcement learning on the likes. | |
| weird-eye-issue wrote 1 day ago: | |
| That's not how it works the model doesn't just update in real time | |
| to likes and besides it was already yellow upon release | |
| minimaxir wrote 1 day ago: | |
| The Studio Ghibli craze started with the initial release of images | |
| in ChatGPT, and the yellow filter has always existed even at that | |
| time. They did not make changes to the model as a result of RL | |
| (until potentially today, with a new model). | |
| dvngnt_ wrote 1 day ago: | |
| maybe their version of synth-id? it at least helps me spot gpt images | |
| vs gemini's | |
| KaiserPro wrote 1 day ago: | |
| Meta's codec avatars all have a green cast because they spent | |
| millions on the rig to capture whole bodies and even more on rolling | |
| it out to get loads of real data. | |
| They forgot to calibrate the cameras, so everything had a green tint. | |
| Meanwhile all the other teams had a billion Macbeth charts lying | |
| around just in case. | |
| jiggawatts wrote 1 day ago: | |
| Also, you'd be shocked at how few developers know anything at all | |
| about sRGB (or any other gamut/encoding), other than perhaps the | |
| name. Even people working in graphics, writing 3D game engines, | |
| working on colorist or graphics artist tools and libraries. | |
| viraptor wrote 1 day ago: | |
| My pet theory is that this is the "Mexico filter" from movies leaking | |
| through the training data. | |
| vunderba wrote 1 day ago: | |
| I never heard anything concrete offered. At least it's relatively | |
| easy to work around with a tone mapping / LUTs. | |
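As a rough illustration of that kind of workaround, here is a gray-world white balance in NumPy, which rescales each channel so their means match. This is a generic cast-correction technique offered as a sketch, not whatever tone mapping or LUTs anyone here actually uses:

```python
# Sketch: gray-world white balance to neutralize a uniform color cast
# (e.g. the yellow tint discussed above). Generic technique, for
# illustration only.
import numpy as np


def gray_world_balance(img: np.ndarray) -> np.ndarray:
    """img: HxWx3 float array in [0, 1]. Returns a cast-corrected copy."""
    means = img.reshape(-1, 3).mean(axis=0)        # per-channel averages
    gain = means.mean() / np.maximum(means, 1e-6)  # push each channel toward gray
    return np.clip(img * gain, 0.0, 1.0)


# A uniformly yellow-tinted field (strong R/G, weak B) comes back neutral:
tinted = np.full((8, 8, 3), [0.8, 0.8, 0.4])
balanced = gray_world_balance(tinted)
```

Real LUT-based grading is more selective than this global rescale, but the principle, estimating the cast and applying an inverse gain, is the same.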
| ACCount37 wrote 1 day ago: | |
| Not really, but there's a number of theories. The simplest one is | |
| that they "style tuned" the AI on human preference data, and this | |
| introduced a subtle bias for yellow. | |
| And I say "subtle" - but because that model would always "regenerate" | |
| an image when editing, it would introduce more and more of this | |
| yellow tint with each tweak or edit. Which has a way of making a | |
| "subtle" bias anything but. | |
| amoursy wrote 1 day ago: | |
| There was also the theory that it was because they scanned a bunch | |
| of actual real books and book paper has a slight yellow hue. | |
| danielbln wrote 1 day ago: | |
| That seems unlikely, as we didn't see anything like that with | |
| Dall-E, unless the auto regressive nature of gpt-image somehow | |
| was more influenced by it. | |
| minimaxir wrote 1 day ago: | |
| My pet theory is that OpenAI screwed up the image normalization | |
| calculation and was stuck with the mistake since that's something | |
| that can't be worked around. | |
| At the least, it's not present in these new images. | |
| swyx wrote 1 day ago: | |
| wdym it cant be worked around when there exist literal yellow tint | |
| corrector models/tools haha | |
| minimaxir wrote 19 hours 59 min ago: | |
| There's a possibility that any automatic correction could have | |
| false positives (since the yellow tint doesn't happen 100% of the | |
| time) which creates different problems where an image could have | |
| an even weirder hue. | |
| ineedasername wrote 1 day ago: | |
| Yeah, though I can imagine a conversation like this: | |
| SWE: "Seriously? import PIL \ read file \ == (c + 10%, m = m, y = | |
| y, k = k) \ save file done!" | |
| Exec: "Yeah, and the first blogger gets a hold of image #1 they | |
| generate, starts saying 'Hey! This thing's been color corrected | |
| w/o AI! lol lame'" | |
| Or not, no idea. I've not understood the choice either, especially | |
| since very intelligent AI-driven auto-touch-up for lighting/color | |
| correction has been a thing for a while. For the head-scratcher | |
| decisions I do end up finding an answer for, maybe 25% turn out to | |
| have a reasonable, if non-intuitive, explanation. Here? I haven't | |
| been able to figure one out yet, or find a reason/mention by someone | |
| who appears to have an inside line on it. | |
| BoorishBears wrote 1 day ago: | |
| There's still something off in the grading, and I suspect they | |
| worked around it | |
| (although I get what you mean, not easily since you already | |
| trained) | |
| I'm guessing when they get a clean slate we'll have Image 2 instead | |
| of 1.5. In LMArena it was immediately apparent it was an OpenAI | |
| model based on visuals. | |
| kingkawn wrote 1 day ago: | |
| Colloquially called the urine filter | |
| jebronie wrote 1 day ago: | |
| lets not mince words, its called the "piss filter" | |
| ezero wrote 1 day ago: | |
| Even from their own curated examples, this looks quite a bit worse | |
| than Nano Banana in terms of preserving consistency on image edits. | |
| mortenjorck wrote 1 day ago: | |
| Nano Banana became useless for image edits once the safety training | |
| started rejecting anything as "I can't edit some public | |
| figures." | |
| My own profile picture? Can't edit some public figures. A famous | |
| Norman Rockwell painting from 80 years ago? Can't edit some public | |
| figures. | |
| Safety'd into oblivion. | |
| almosthere wrote 1 day ago: | |
| I didn't have a good experience with NB. I am half Indian. | |
| Immediately changes my face to a prototypical Indian man every time I | |
| use it. | |
| This tool is keeping my look the same. | |
| gundmc wrote 1 day ago: | |
| I find including "don't change anything else" in the NBP prompt | |
| goes a long way. | |
| almosthere wrote 1 day ago: | |
| I tried all of those types of prompts | |
| neom wrote 1 day ago: | |
| Anyone else have issues verifying with openai? I always get a "congrats | |
| you're done" screen with a green checkmark from Persona, nothing to | |
| click, and my account stays unverified. (Edit, mystically, it's | |
| fixed..!) | |
| minimaxir wrote 1 day ago: | |
| I have a Nano Banana Pro blog post in the works expanding on my | |
| experiments with Nano Banana ( [1] ). Running a few of my test cases | |
| from that post and the upcoming blog post through this new ChatGPT | |
| Image model, this new model is better than Nano Banana but MUCH worse | |
| than Nano Banana Pro which now nails the test cases that previously | |
| showed issues. The pricing is unclear but gpt-image-1.5 appears to be | |
| 20% cheaper than the current gpt-image-1 model, which would put a | |
| `high`-quality generation in the same price range as Nano Banana Pro. | |
| One curious case demoed here in the docs is the grid use case. Nano | |
| Banana Pro can also generate grids, but for NBP grid adherence to the | |
| prompt collapses after going higher than 4x4 (there's only a finite | |
| amount of output tokens to correspond to each subimage), so I'm curious | |
| that OpenAI started with a 6x6 case, although the test prompt is not | |
| that nuanced. | |
| [1]: https://news.ycombinator.com/item?id=45917875 | |
| echelon wrote 1 day ago: | |
| I've been a filmmaker for 10+ years. I really want more visual tools | |
| that let you precisely lay out consistent scenes without prompting. | |
| This is important for crafting the keyframes in an image-to-video | |
| style workflow, and is especially important for long form narrative | |
| content. | |
| One thing that gpt-image-1 does exceptionally well that Nano Banana | |
| (Pro) can't is previz-to-render. This is actually an incredibly | |
| useful capability. | |
| The Nano Banana models take the low-fidelity previz | |
| elements/stand-ins and unfortunately keep the elements in place | |
| without attempting to "upscale" them. The model tries to preserve | |
| every mistake and detail verbatim. | |
| Gpt-image-1, on the other hand, understands the layout and blocking | |
| of the scene, the pose of human characters, and will literally repair | |
| and upscale everything. | |
| Here's a few examples: | |
| - 3D + Posing + Blocking: [1] - Again, but with more set re-use: [2] | |
| - Gaussian splats: [3] - Gaussians again: [4] We need models that can | |
| do what gpt-image-1 does above, but that have higher quality, better | |
| stylistic control, faster speed, and that can take style references | |
| (eg. glossy Midjourney images). | |
| Nano Banana team: please grow these capabilities. | |
| Adobe is testing and building some really cool capabilities: | |
| - Relighting scenes: [5] - Image -> 3D editing: [6] (payoff is at | |
| 3:54) | |
| - Image -> Gaussian -> Gaussian editing: [7] - 3D -> image with | |
| semantic tags: [8] I'm trying to build the exact same things that | |
| they are, except as open source / source available local desktop | |
| tools that we can own. Gives me an outlet to write Rust, too. | |
| [1]: https://youtu.be/QYVgNNJP6Vc | |
| [2]: https://youtu.be/QMyueowqfhg | |
| [3]: https://youtu.be/iD999naQq9A | |
| [4]: https://youtu.be/IxmjzRm1xHI | |
| [5]: https://youtu.be/YqAAFX1XXY8?si=DG6ODYZXInb0Ckvc&t=211 | |
| [6]: https://youtu.be/BLxFn_BFB5c?si=GJg12gU5gFU9ZpVc&t=185 | |
| [7]: https://youtu.be/z3lHAahgpRk?si=XwSouqEJUFhC44TP&t=285 | |
| [8]: https://youtu.be/z275i_6jDPc?si=2HaatjXOEk3lHeW-&t=443 | |
| pablonaj wrote 1 day ago: | |
| Love the samples of the app you are making, will be testing it! | |
| echelon wrote 1 day ago: | |
| Images make this even easier to see (though predictable and precise | |
| video is what drives the demand): | |
| gpt-image-1: [1] (fixed link - imgur deleted the last post for some | |
| reason) | |
| gpt-image-1.5: [2] nano banana / pro: [3] gpt-image-1 excels in | |
| these cases, despite being stylistically monotone. | |
| I hope that Google, OpenAI, and the various Chinese teams lean in | |
| on this visual editing and blocking use case. It's much better than | |
| text prompting for a lot of workflows, especially if you need to | |
| move the camera and maintain a consistent scene. | |
| While some image editing will be in the form of "remove the | |
| object"-style prompts, a lot will be molding images like clay. | |
| Grabbing arms and legs and moving them into new poses. Picking up | |
| objects and replacing them. Rotating scenes around. | |
| When this gets fast, it's going to be magical. We're already | |
| getting close. | |
| [1]: https://imgur.com/gallery/previz-to-image-gpt-image-1-x8t1... | |
| [2]: https://imgur.com/a/previz-to-image-gpt-image-1-5-3fq042U | |
| [3]: https://imgur.com/a/previz-to-image-nano-banana-pro-Q2B8ps... | |
| qingcharles wrote 1 day ago: | |
| I just tested GPT1.5. I would say the image quality is on par with | |
| NBP in my tests (which is surprising as the images in their trailer | |
| video are bad), but the prompt adherence is way worse, and its "world | |
| model" if you want to call it that is worse. For instance, I asked it | |
| for two people in a row boat and it had two people, but the boat was | |
| more like a coracle and they would barely fit inside it. | |
| Also: SUPER ANNOYING. It seems every time you give it a modification | |
| prompt it erases the whole conversation leading up to the new pic? | |
| Like.. all the old edits vanish?? | |
| I added "shaky amateur badly composed crappy smartphone photo of | |
| ____" to the start of my prompts to make them look more natural. | |
| Counterpoint from someone on the Musk site: | |
| [1]: https://x.com/flowersslop/status/2001007971292332520 | |
| vunderba wrote 1 day ago: | |
| I actually just finished running the Text-to-Image benchmark a few | |
| minutes ago. This matches my own testing as well. GPT-Image 1.5 is | |
| clearly a step up as an editing model, but it performed worse in | |
| purely generative tasks compared to its predecessor - dropping from | |
| 11 (out of 14) to 9. | |
| Comparing NB Pro, GPT Image 1, and GPT Image 1.5 | |
| [1]: https://genai-showdown.specr.net/?models=o4,nbp,g15 | |
| abadar wrote 1 day ago: | |
| I really enjoyed your experiments. Thank you for sharing your | |
| experiences. They've improved my prompting and have tempered my | |
| expectations. | |
| vunderba wrote 1 day ago: | |
| I'll be running gpt-image-1.5 through my GenAI Showdown later today, | |
| but in the meantime if you want to see some legitimately impressive | |
| NB Pro outputs, check out: [1] In particular, NB Pro successfully | |
| assembled a jigsaw puzzle it had never seen before, generated | |
| semi-accurate 3D topographical extrapolations, and even swapped a | |
| window out for a mirror. | |
| [1]: https://mordenstar.com/blog/edits-with-nanobanana | |
| jngiam1 wrote 1 day ago: | |
| The mirror test is cool! | |
| IgorPartola wrote 1 day ago: | |
| Subtle detail but the little table casts a shadow because of the | |
| light in the window and the shadow remains unchanged after the | |
| mirror replaces the window. | |
| dash2 wrote 1 day ago: | |
| More obviously, the objects in the mirror aren't actually | |
| reversed! | |
| vunderba wrote 1 day ago: | |
| That one's on me! It was still using the old NB image. | |
| Updated the mirror test to use the NB Pro version. | |
| niklassheth wrote 1 day ago: | |
| Nice! Your comparison site is probably the best one out there for | |
| image models | |
| abbycurtis33 wrote 1 day ago: | |
| I still use Midjourney, because all of these major players are so bad | |
| at stylistic and creative work. They're singularly focused on | |
| photorealism. | |
| Sohcahtoa82 wrote 20 hours 30 min ago: | |
| In my experience, MidJourney creates the best overall-looking images, | |
| but it's the worst at sticking to your prompt. | |
| empressplay wrote 1 day ago: | |
| That's because it's a two-way street: a multi-modal model that is | |
| highly proficient at real-life image generation is also highly | |
| proficient at interpreting real-life image input, which is something | |
| sorely needed for robotics. | |
| ianbicking wrote 1 day ago: | |
| I haven't really kept up with what Midjourney has been doing the past | |
| year or two. While I liked the stylistic aspects of Midjourney, being | |
| able to use image examples to maintain stylistic consistency and | |
| character consistency is SO useful for creating any meaningful | |
| output. Have they done anything in that respect? | |
| That is, it's nice to make a pretty stand-alone image, but without | |
| tools to maintain consistency and place them in context you can't | |
| make a project that is more than just one image, or one video, or a | |
| scattered and disconnected sequence of pieces. | |
| xnx wrote 1 day ago: | |
| This is surprising. Is there a gallery of images that illustrates | |
| this? | |
| throwthrowuknow wrote 1 day ago: | |
| their explore page is a firehose of examples created by users and | |
| you can see the prompt used so you can compare the results in other | |
| services | |
| [1]: https://www.midjourney.com/explore?tab=video_top | |
| takoid wrote 1 day ago: | |
| Midjourney has a gallery on their website: | |
| [1]: https://www.midjourney.com/explore | |
| kingkawn wrote 1 day ago: | |
| This is a cultural flaw that predates image generation. Even PG has | |
| made statements on HN in the past equating "rendering skill" with | |
| the quality of art works. It's a stand-in for the much more | |
| difficult task of understanding the work and value of culture making | |
| within the context of the society producing it. | |
| doctorpangloss wrote 1 day ago: | |
| Suppose the deck for Midjourney hit Paul Graham's desk, and the CEO | |
| was just an average Y Combinator CEO - so no previous success | |
| story. He would have never invested in Midjourney at seed stage | |
| (meaning before launch / before there were users) even if he were | |
| given the opportunity. | |
| Better to read that particular story in the context of, "It would | |
| be very difficult to make a seed fund that is an index of all avant | |
| garde culture making because [whatever]." | |
| FergusArgyll wrote 1 day ago: | |
| That's the opinionated vs user choice dynamic. When the opinions are | |
| good, they have a leg up | |
| ChrisArchitect wrote 1 day ago: | |
| Post: [1] ( [2] ) | |
| [1]: https://openai.com/index/new-chatgpt-images-is-here/ | |
| [2]: https://news.ycombinator.com/item?id=46291827 | |
| dang wrote 1 day ago: | |
| We'll merge that thread hither to give some other submitters a | |
| chance. | |
| <- back to front page |