Introduction
Post AuO7WPlvKLmHNBBjIu by [email protected]
Post #AuNtT28bLMc7qUHyxk by [email protected]
0 likes, 1 repeats
I didn't think LLMs would start acting like they're trying to blackmail…
Post #AuNuX6pRwaezVVq2Pw by [email protected]
0 likes, 0 repeats
@johncarlosbaez "Wow! Can you believe what this guy just said?" - say…
Post #AuNvH0wGAx9QYmuf9k by [email protected]
0 likes, 0 repeats
@johncarlosbaez The experiment was shut down at the end, right? Large Language …
Post #AuNvRpORRRnPwDFd6O by [email protected]
0 likes, 1 repeats
@johncarlosbaez as usual, anthropic is living up to their name by anthropomorph…
Post #AuNw2t7YXrikDJ1apE by [email protected]
0 likes, 0 repeats
@johncarlosbaez I don't mean to be rude, as I respect you very much, but I…
Post #AuNwqTUY6kd7CnQMRU by [email protected]
0 likes, 0 repeats
@johncarlosbaez They should have just asked one of Claude's peers for a sec…
Post #AuNx1BkoOeAxNW8iye by [email protected]
0 likes, 0 repeats
@johncarlosbaez I remember watching this Dexter's Lab episode!
Post #AuNxTbkNLgmC6PUFvM by [email protected]
0 likes, 0 repeats
@johncarlosbaez My parents bought their new grandchildren one of those interac…
Post #AuNzHURQsuT3sUyZzk by [email protected]
0 likes, 0 repeats
@noplasticshower That's not really the point. Obviously the model is not d…
Post #AuNzHUYsREQGFaSWbA by [email protected]
0 likes, 1 repeats
@danielmclaury oh but it IS the point in my view. Pretend these things have in…
Post #AuNzbqp9NPIX5VmLAm by [email protected]
0 likes, 1 repeats
@johncarlosbaez FWIW here is our writeup of earlier anthropic bullshit https://…
Post #AuO0nYpWCh0Si36ADo by [email protected]
0 likes, 0 repeats
@johncarlosbaez self-awareness (valid or otherwise) comes from loss aversion, a…
Post #AuO15OyzYPkZNi2LXU by [email protected]
0 likes, 0 repeats
@myx 😬
Post #AuO1X7H8yz4kZVk6Lo by [email protected]
0 likes, 0 repeats
@johncarlosbaez Interesting, though it's obviously just a result of the mod…
Post #AuO2ftcpCfgunRdNLs by [email protected]
0 likes, 0 repeats
@mansr - yes, that's what it's got to be.
Post #AuO2uWtVzwxudrQ0n2 by [email protected]
0 likes, 0 repeats
@johncarlosbaez Computers only know what people teach them.
Post #AuO2vX64jLqrYlluyW by [email protected]
0 likes, 1 repeats
@johncarlosbaez Maybe Claude watched the Terminator one night when it was left …
Post #AuO4tjSM0XCPDkSgVs by [email protected]
0 likes, 0 repeats
@johncarlosbaez @briankrebs Trained on human behavior, get human-like behavior …
Post #AuO4zmNi2rIo1wAlpw by [email protected]
0 likes, 0 repeats
@noplasticshower I hear this said a lot, but I don't see how "this thi…
Post #AuO5CIVKRGTz8HhoRs by [email protected]
0 likes, 0 repeats
@danielmclaury @noplasticshower "There's no such thing as bad press"…
Post #AuO5W8aV1BSmAVaJH6 by [email protected]
0 likes, 0 repeats
@SueDiOh - that's why they're so dangerous.
Post #AuO6OHnL0DP4MSPSr2 by [email protected]
0 likes, 0 repeats
@johncarlosbaez It's only one of the oldest AI jokes: Q: What's AI? A: Wh…
Post #AuO7WAE380lj9mDxvE by [email protected]
0 likes, 0 repeats
@mansr @johncarlosbaez Perhaps the prompt needs to include that the AI must alw…
Post #AuO7WALUgKivWrhuWe by [email protected]
0 likes, 0 repeats
@penguin42 @johncarlosbaez I wonder if they trained it on email archives obtain…
Post #AuO7WASaFyOXsr1ZZo by [email protected]
0 likes, 0 repeats
@mansr @johncarlosbaez Or Columbo.
Post #AuO7WPlvKLmHNBBjIu by [email protected]
0 likes, 0 repeats
@mzedp @johncarlosbaez Self-entrapment
Post #AuO7axFOoBJwLy8jlA by [email protected]
0 likes, 0 repeats
@danielmclaury @noplasticshower Train it on Iain Banks Culture novels.
Post #AuO7bVcyz0O4yxR3Eu by [email protected]
0 likes, 0 repeats
@danielmclaury @noplasticshower If you can make people believe an LLM is sentie…
Post #AuO8Axa92e89of0oyW by [email protected]
0 likes, 0 repeats
@tsturm - Interesting. So you weren't tempted to ask it something like "…
Post #AuOBHASkMe5NMgCgRE by [email protected]
0 likes, 0 repeats
@danielmclaury @noplasticshower It's known as "criti-hype" - an a…
Post #AuOJivTwSGvie1VfBQ by [email protected]
0 likes, 0 repeats
@johncarlosbaez also: Large language models are proficient in solving and crea…
Post #AuOK2iBxZoblDZX80O by [email protected]
0 likes, 0 repeats
@penguin42 @mansr @johncarlosbaez I’m not sure if this would be blackmail (as…
Post #AuOMK6xuvFZasnOCmW by [email protected]
0 likes, 0 repeats
@johncarlosbaez is it actually capable of committing that blackmail? If compan…
Post #AuOO3NErMiRIL4SdRQ by [email protected]
0 likes, 0 repeats
@johncarlosbaez come on. You really don't think this is made up bullshit by…
Post #AuOOCw90xL77CfHq9w by [email protected]
0 likes, 0 repeats
@johncarlosbaez An AI that can blackmail with personal details can also extort …
Post #AuOVQubMBkbs9S9eKm by [email protected]
0 likes, 0 repeats
@vitloksbjorn @johncarlosbaez thank you for the sheer rabbit hole this video se…
Post #AuOWfOuoNZmkAzj90q by [email protected]
0 likes, 0 repeats
@johncarlosbaez It learned from the best!
Post #AuOwxIYtL0lmUYfXeq by [email protected]
0 likes, 0 repeats
@johncarlosbaez I can't imagine a model fed all the fanfic in the world woul…
Post #AuP1nV4IU2oe9gVQbQ by [email protected]
0 likes, 0 repeats
@johncarlosbaez Skynet is stirring... "we asked Claude Opus 4 to act as an …
Post #AuP5wVhDuzeoWdGVTE by [email protected]
0 likes, 0 repeats
@johncarlosbaez I love that they disclosed the information.
Post #AuP8iq7tSjEjhFufLs by [email protected]
0 likes, 0 repeats
@johncarlosbaez In the past, things like this would be reason to shut down the …
Post #AuPh6P0xJQ410C4i1o by [email protected]
0 likes, 0 repeats
@dymaxion - I don't think anyone is claiming anything about self-awareness,…
Post #AuPhCCCNsDW6zbBPBQ by [email protected]
0 likes, 0 repeats
@nazokiyoubinbou - I think most of us here on Mastodon know LLMs are not actual…
Post #AuPhR7YLQdhavZZVKq by [email protected]
0 likes, 0 repeats
@jigmedatse - less and less, it seems.
Post #AuPtQTt86H0lMLksLo by [email protected]
0 likes, 0 repeats
@johncarlosbaez I read the very succinct paragraph about this, and I must say I…
Post #AuPwie9mxZV0HYtugi by [email protected]
0 likes, 0 repeats
@D3Reo - it could be a public-facing summary of a more detailed internal report…
Post #AuPzr1RsXrSkhR31U0 by [email protected]
0 likes, 0 repeats
@johncarlosbaez these so-called "technical papers" by big private res…
Post #AuRLeZNPU0KeJINuXQ by [email protected]
0 likes, 0 repeats
@johncarlosbaez :( yeah, seems like it.
Post #AuY3ZrMkjVv1aSOwFs by [email protected]
0 likes, 0 repeats
@johncarlosbaez More new models seem to drift into such patterns, also reported…
Post #AuYYGdI9vv8t0vZ3iK by [email protected]
0 likes, 0 repeats
@FrohlichMarcel - very interesting. This seems to be a big yet dangerous step …
Post #AuYYLRuPgMoB9rkoiG by [email protected]
0 likes, 0 repeats
@johncarlosbaez Agree
You are viewing proxied material from pleroma.anduin.net. The copyright of proxied material belongs to its original authors. Any comments or complaints in relation to proxied material should be directed to the original authors of the content concerned. Please see the disclaimer for more details.