| [HN Gopher] TimeCapsuleLLM: LLM trained only on data from 1800-1875 | |
| ___________________________________________________________________ | |
| TimeCapsuleLLM: LLM trained only on data from 1800-1875 | |
| Author : admp | |
| Score : 669 points | |
| Date : 2026-01-12 16:04 UTC (22 hours ago) | |
| web link (github.com) | |
| w3m dump (github.com) | |
| | dogma1138 wrote: | |
| | Would be interesting to train a cutting edge model with a cut off | |
| | date of say 1900 and then prompt it about QM and relativity with | |
| | some added context. | |
| | | |
| If the model comes up with anything even remotely correct, it | |
| would be quite strong evidence that LLMs are a path to | |
| something bigger; if not, then I think it is time to go back | |
| to the drawing board. | |
| | a-dub wrote: | |
| | yeah i was just wondering that. i wonder how much stem material | |
| | is in the training set... | |
| | signa11 wrote: | |
| | i will go for 'aint gonna happen for a 1000 dollars alex' | |
| | imjonse wrote: | |
| | I suppose the vast majority of training data used for cutting | |
| | edge models was created after 1900. | |
| | dogma1138 wrote: | |
| | Ofc they are because their primary goal is to be useful and | |
| | to be useful they need to always be relevant. | |
| | | |
| But considering that Special Relativity was published in 1905, | |
| which means all its building blocks were already floating in | |
| the ether by 1900, it would be a very interesting experiment | |
| to train something at Claude/Gemini scale and then, say, give | |
| it the field equations and ask it to build a theory around | |
| them. | |
| | p1esk wrote: | |
| | How can you train a Claude/Gemini scale model if you're | |
| | limited to <10% of the training data? | |
| | famouswaffles wrote: | |
| | His point is that we can't train a Gemini 3/Claude 4.5 etc | |
| | model because we don't have the data to match the training | |
| | scale of those models. There aren't trillions of tokens of | |
| | digitized pre-1900s text. | |
| | kopollo wrote: | |
| I don't know if this is related to the topic, but GPT5 can | |
| translate an 1880 Ottoman archival photograph into English | |
| without any loss of quality. | |
| | ddxv wrote: | |
| | My friend works in that period of Ottoman archives. Do you | |
| | have a source or something I can share? | |
| | tokai wrote: | |
| Looking at the training data I don't think it will know | |
| anything.[0] I doubt _On the Connexion of the Physical | |
| Sciences_ (1834) is going to have much about QM. While the | |
| cut-off is 1900, it seems much of the texts are much closer | |
| to 1800 than 1900. | |
| | | |
| | [0] | |
| | https://github.com/haykgrigo3/TimeCapsuleLLM/blob/main/Copy%... | |
| | dogma1138 wrote: | |
| It doesn't need to know about QM or relativity, just about | |
| the building blocks that led to them, which were certainly | |
| around by the year 1900. | |
| | |
| In fact you don't want it to know about them explicitly, just | |
| to have enough background knowledge that you can manage the | |
| rest via context. | |
| | tokai wrote: | |
| | I was vague. My point is that I don't think the building | |
| blocks are in the data. It's mainly tertiary and popular | |
| | sources. Maybe if you had the writings of Victorian | |
| | scientists, both public and private correspondence. | |
| | pegasus wrote: | |
| Probably a lot of it exists, but in archives, private | |
| collections etc. Would be great if it all ended up | |
| digitized as well. | |
| | viccis wrote: | |
| | LLMs are models that predict tokens. They don't think, they | |
| | don't build with blocks. They would never be able to | |
| | synthesize knowledge about QM. | |
| | strbean wrote: | |
| | You realize parent said "This would be an interesting way | |
| | to test proposition X" and you responded with "X is false | |
| because I say so", right? | |
| | anonymous908213 wrote: | |
| | "Proposition X" does not need testing. We already know X | |
| | is categorically false because we know how LLMs are | |
| | programmed, and not a single line of that programming | |
| | pertains to thinking (thinking in the human sense, not | |
| | "thinking" in the LLM sense which merely uses an | |
| anthropomorphized analogy to describe a script that feeds | |
| | back multiple prompts before getting the final prompt | |
| | output to present to the user). In the same way that we | |
| | can reason about the correctness of an IsEven program | |
| | without writing a unit test that inputs every possible | |
| | int32 to "prove" it, we can reason about the fundamental | |
| | principles of an LLM's programming without coming up with | |
| | ridiculous tests. In fact the proposed test itself is | |
| | less eminently verifiable than reasoning about | |
| | correctness; it could be easily corrupted by, for | |
| | instance, incorrectly labelled data in the training | |
| | dataset, which could only be determined by meticulously | |
| | reviewing the entirety of the dataset. | |
| | | |
| | The only people who are serious about suggesting that | |
| | LLMs could possibly 'think' are the people who are | |
| | committing fraud on the scale of hundreds of billions of | |
| | dollars (good for them on finding the all-time grift!) | |
| | and people who don't understand how they're programmed, | |
| | and thusly are the target of the grift. Granted, given | |
| | that the vast majority of humanity are not programmers, | |
| | and even fewer are programmers educated on the | |
| | intricacies of ML, the grift target pool numbers in the | |
| | billions. | |
| | strbean wrote: | |
| | > We already know X is categorically false because we | |
| | know how LLMs are programmed, and not a single line of | |
| | that programming pertains to thinking (thinking in the | |
| | human sense, not "thinking" in the LLM sense which merely | |
| uses an anthropomorphized analogy to describe a script that | |
| | feeds back multiple prompts before getting the final | |
| | prompt output to present to the user). | |
| | | |
| Could you elucidate for me the process of human thought, | |
| and point out the differences between that and a | |
| probabilistic prediction engine? | |
| | | |
| | I see this argument all over the place, but "how do | |
| | humans think" is never described. It is always left as a | |
| | black box with something magical (presumably a soul or | |
| | some other metaphysical substance) inside. | |
| | anonymous908213 wrote: | |
| | There is no need to involve souls or magic. I am not | |
| | making the argument that it is impossible to create a | |
| | machine that is capable of doing the same computations as | |
| | the brain. The argument is that whether or not such a | |
| | machine is possible, an LLM is not such a machine. If | |
| | you'd like to think of our brains as squishy computers, | |
| | then the principle is simple: we run code that is more | |
| | complex than a token prediction engine. The fact that our | |
| | code is more complex than a token prediction engine is | |
| | easily verified by our capability to address problems | |
| | that a token prediction engine cannot. This is because | |
| | our brain-code is capable of reasoning from deterministic | |
| | logical principles rather than only probabilities. We | |
| | also likely have something akin to token prediction code, | |
| | but that is not the _only_ thing our brain is programmed | |
| | to do, whereas it is the only thing LLMs are programmed | |
| | to do. | |
| | viccis wrote: | |
| | Kant's model of epistemology, with humans schematizing | |
| | conceptual understanding of objects through apperception | |
| | of manifold impressions from our sensibility, and then | |
| | reasoning about these objects using transcendental | |
| | application of the categories, is a reasonable enough | |
| | model of thought. It was (and is I think) a satisfactory | |
| | answer for the question of how humans can produce | |
| | synthetic a priori knowledge, something that LLMs are | |
| | incapable of (don't take my word on that though, ChatGPT | |
| | is more than happy to discuss [1]) | |
| | | |
| | 1: https://chatgpt.com/share/6965653e-b514-8011-b233-79d8 | |
| | c25d33... | |
| | viccis wrote: | |
| | Yes. That is correct. If I told you I planned on going | |
| | outside this evening to test whether the sun sets in the | |
| | east, the best response would be to let me know ahead of | |
| | time that my hypothesis is wrong. | |
| | strbean wrote: | |
| | So, based on the source of "Trust me bro.", we'll decide | |
| | this open question about new technology and the nature of | |
| | cognition is solved. Seems unproductive. | |
| | viccis wrote: | |
| | In addition to what I have posted elsewhere in here, I | |
| | would point to the fact that this is not indeed an "open | |
| | question", as LLMs have not produced an entirely new and | |
| | more advanced model of physics. So there is no reason to | |
| | suppose they could have done so for QM. | |
| | drdeca wrote: | |
| | What if making progress today is harder than it was then? | |
| | PaulDavisThe1st wrote: | |
| | I am a deep LLM skeptic. | |
| | | |
| | But I think there are also some questions about the role | |
| | of language in human thought that leave the door just | |
| | slightly ajar on the issue of whether or not manipulating | |
| | the tokens of language might be more central to human | |
| | cognition than we've tended to think. | |
| | | |
| | If it turned out that this was true, then it is possible | |
| | that "a model predicting tokens" has more power than that | |
| | description would suggest. | |
| | | |
| | I doubt it, and I doubt it quite a lot. But I don't think | |
| | it is impossible that something at least a little bit | |
| | along these lines turns out to be true. | |
| | pegasus wrote: | |
| | > manipulating the tokens of language might be more | |
| | central to human cognition than we've tended to think | |
| | | |
| | I'm convinced of this. I think it's because we've always | |
| | looked at the most advanced forms of human languaging | |
| | (like philosophy) to understand ourselves. But human | |
| | language must have evolved from forms of communication | |
| | found in other species, especially highly intelligent | |
| | ones. It's to be expected that the building blocks of it | |
| are based on things like imitation, playful variation, | |
| | pattern-matching, harnessing capabilities brains have | |
| | been developing long before language, only now in the | |
| | emerging world of sounds, calls, vocalizations. | |
| | | |
| | Ironically, the other crucial ingredient for AGI which | |
| LLMs _don't_ have, but we do, is exactly that animal | |
| | nature which we always try to shove under the rug, over- | |
| | attributing our success to the stochastic parrot part of | |
| | us, and ignoring the gut instinct, the intuitive, | |
| | spontaneous insight into things which a lot of the great | |
| | scientists and artists of the past have talked about. | |
| | viccis wrote: | |
| | >Ironically, the other crucial ingredient for AGI which | |
| | LLMs don't have, but we do, is exactly that animal nature | |
| | which we always try to shove under the rug, over- | |
| | attributing our success to the stochastic parrot part of | |
| | us, and ignoring the gut instinct, the intuitive, | |
| | spontaneous insight into things which a lot of the great | |
| | scientists and artists of the past have talked about. | |
| | | |
| | Are you familiar with the major works in epistemology | |
| | that were written, even before the 20th century, on this | |
| | exact topic? | |
| | catlifeonmars wrote: | |
| | I've long considered language to serve primarily as a | |
| | dissonance reconciliation mechanism. Our behavior is | |
| | largely shaped by our circumstances and language serves | |
| | to attribute logic to our behavior after the fact. | |
| | viccis wrote: | |
| | I also believe strongly in the role of language, and more | |
| | loosely in semiotics as a whole, to our cognitive | |
| | development. To the extent that I think there are some | |
| | meaningful ideas within the mountain of gibberish from | |
| | Lacan, who was the first to really tie our conception of | |
| | ourselves with our symbolic understanding of the world. | |
| | | |
| | Unfortunately, none of that has anything to do with what | |
| | LLMs are doing. The LLM is not thinking about concepts | |
| | and then translating that into language. It is imitating | |
| | what it looks like to read people doing so and nothing | |
| | more. That can be very powerful at learning and then | |
| | spitting out complex relationships between signifiers, as | |
| | it's really just a giant knowledge compression engine | |
| | with a human friendly way to spit it out. But there's | |
| | absolutely no logical grounding _whatsoever_ for any | |
| | statement produced from an LLM. | |
| | | |
| | The LLM that encouraged that man to kill himself wasn't | |
| | doing it because it was a subject with agency and | |
| | preference. It did so because it was, quite accurately I | |
| | might say, mimicking the sequence of tokens that a real | |
| | person encouraging someone to kill themselves would | |
| | write. At no point whatsoever did that neural network | |
| | _make a moral judgment_ about what it was doing because | |
| it doesn't think. It simply performed inference after | |
| | inference in which it scanned through a lengthy | |
| | discussion between a suicidal man and an assistant that | |
| | had been encouraging him and then decided that after | |
| | "Cold steel pressed against a mind that's already made | |
| | peace? That's not fear. That's " the most accurate token | |
| | would be "clar" and then "ity." | |
| | PaulDavisThe1st wrote: | |
| | The problem with all this is that we don't actually know | |
| | what human cognition is doing either. | |
| | | |
| | We know what our _experience_ is - thinking about | |
| | concepts and then translating that into language - but we | |
| really don't know with much confidence what is actually | |
| | going on. | |
| | | |
| | I lean strongly toward the idea that humans are doing | |
| | something quite different than LLMs, particularly when | |
| | reasoning. But I want to leave the door open to the idea | |
| | that we've not understood human cognition, mostly because | |
| | our primary evidence there comes from our own subjective | |
| | experience, which may (or may not) provide a reliable | |
| | guide to what is actually happening. | |
| | viccis wrote: | |
| | >The problem with all this is that we don't actually know | |
| | what human cognition is doing either. | |
| | | |
| | We do know what it's not doing, and that is operating | |
| | only through reproducing linguistic patterns. There's no | |
| | more cause to think LLMs approximate our thought (thought | |
| | being something they are incapable of) than that Naive- | |
| | Bayes spam filter models approximate our thought. | |
| | PaulDavisThe1st wrote: | |
| | My point is that we know very little about the sort of | |
| | "thought" that we are capable of either. I agree that | |
| LLMs cannot do what we typically refer to as "thought", but | |
| I think it is possible that we do a LOT less of that than | |
| | we think when we are "thinking" (or more precisely, | |
| | having the experience of thinking). | |
| | viccis wrote: | |
| | How does this worldview reconcile the fact that thought | |
| | demonstrably exists independent of either language or | |
| | vision/audio sense? | |
| | PaulDavisThe1st wrote: | |
| | I don't see a need to reconcile them. | |
| | viccis wrote: | |
| | Which is why it's incoherent! | |
| | PaulDavisThe1st wrote: | |
| | I'm not clear that it has to be coherent at this point in | |
| | the history of our understanding of cognition. We barely | |
| | know what we're even talking about most of the time ... | |
| | famouswaffles wrote: | |
| | >Unfortunately, none of that has anything to do with what | |
| | LLMs are doing. The LLM is not thinking about concepts | |
| | and then translating that into language. It is imitating | |
| | what it looks like to read people doing so and nothing | |
| | more. | |
| | | |
| | 'Language' is only the initial and final layers of a | |
| | Large Language Model. Manipulating concepts is exactly | |
| | what they do, and it's unfortunate the most obstinate | |
| | seem to be the most ignorant. | |
| | PaulDavisThe1st wrote: | |
| | They do not manipulate concepts. There is no | |
| | representation of a concept for them to manipulate. | |
| | | |
| | It may, however, turn out that in doing what they do, | |
| | they are _effectively_ manipulating concepts, and this is | |
| | what I was alluding to: by building the model, even | |
| | though your approach was through tokenization and | |
| | whatever term you want to use for the network, you end up | |
| | accidentally building something that implicitly | |
| | manipulates concepts. Moreover, it might turn out that we | |
| | ourselves do more of this than we perhaps like to think. | |
| | | |
| | Nevertheless "manipulating concepts is exactly what they | |
| | do" seems almost willfully ignorant of how these systems | |
| | work, unless you believe that "find the next most | |
| | probable sequence of tokens of some length" is all there | |
| | is to "manipulating concepts". | |
| | famouswaffles wrote: | |
| | >They do not manipulate concepts. There is no | |
| | representation of a concept for them to manipulate. | |
| | | |
| | Yes, they do. And of course there is. And there's plenty | |
| | of research on the matter. | |
| | | |
| | >It may, however, turn out that in doing what they do, | |
| | they are effectively manipulating concepts | |
| | | |
| There is no "effectively" here. Text is what goes in and | |
| | what comes out, but it's by no means what they manipulate | |
| | internally. | |
| | | |
| | >Nevertheless "manipulating concepts is exactly what they | |
| | do" seems almost willfully ignorant of how these systems | |
| | work, unless you believe that "find the next most | |
| | probable sequence of tokens of some length" is all there | |
| | is to "manipulating concepts". | |
| | | |
| | "Find the next probable token" is the goal, not the | |
| | process. It is what models are tasked to do yes, but it | |
| | says nothing about what they do internally to achieve it. | |
| | PaulDavisThe1st wrote: | |
| | please pass on a link to a solid research paper that | |
| | supports the idea that to "find the next probable token", | |
| LLMs manipulate concepts ... just one will do. | |
| | famouswaffles wrote: | |
| | Revealing emergent human-like conceptual representations | |
| | from language prediction - | |
| | https://www.pnas.org/doi/10.1073/pnas.2512514122 | |
| | | |
| | Emergent World Representations: Exploring a Sequence | |
| | Model Trained on a Synthetic Task - | |
| | https://openreview.net/forum?id=DeG07_TcZvT | |
| | | |
| | On the Biology of a Large Language Model - | |
| | https://transformer-circuits.pub/2025/attribution- | |
| | graphs/bio... | |
| | | |
| | Emergent Introspective Awareness in Large Language Models | |
| | - https://transformer- | |
| | circuits.pub/2025/introspection/index.ht... | |
| | TeMPOraL wrote: | |
| If anything, I feel that the current breed of multimodal | |
| LLMs demonstrates that _language_ is not fundamental - | |
| tokens are, or rather their mutual association in high- | |
| dimensional latent space. Language as we recognize it, | |
| sequences of characters and words, is just a special | |
| case. Multimodal models manage to turn audio, video and | |
| text into tokens in the same space - they do not route | |
| through text when consuming or generating images. | |
| | forgotpwd16 wrote: | |
| Done a few weeks ago: https://github.com/DGoettlich/history-llms | |
| | (discussed in: https://news.ycombinator.com/item?id=46319826) | |
| | | |
| At least the model part. Although others had the same thought | |
| as you, afaik none tried it. | |
| | chrononaut wrote: | |
| | And unfortunately I don't think they plan on making those | |
| | models public. | |
| | bazzargh wrote: | |
| | You would find things in there that were already close to QM | |
| | and relativity. The Michelson-Morley experiment was 1887 and | |
| | Lorentz transformations came along in 1889. The photoelectric | |
| | effect (which Einstein explained in terms of photons in 1905) | |
| | was also discovered in 1887. William Clifford (who _died_ in | |
| 1879) had notions that foreshadowed general relativity: | |
| | "Riemann, and more specifically Clifford, conjectured that | |
| | forces and matter might be local irregularities in the | |
| | curvature of space, and in this they were strikingly prophetic, | |
| | though for their pains they were dismissed at the time as | |
| | visionaries." - Banesh Hoffmann (1973) | |
| | | |
| Things don't happen all of a sudden, and being able to see all | |
| the scientific papers of the era, it's possible those could | |
| have fallen out of the synthesis. | |
| | matthewh806 wrote: | |
| | I presume that's what the parent post is trying to get at? | |
| | Seeing if, given the cutting edge scientific knowledge of the | |
| day, the LLM is able to synthesize it all into a workable | |
| | theory of QM by making the necessary connections and | |
| | (quantum...) leaps | |
| | | |
| | Standing on the shoulders of giants, as it were | |
| | actionfromafar wrote: | |
| Yeah but... we _still_ might not know whether it could do | |
| that because we were really close by 1900 or because the LLM | |
| is very smart. | |
| | scottlamb wrote: | |
| | What's the bar here? Does anyone say "we don't know if | |
| | Einstein could do this because we were really close or | |
| | because he was really smart?" | |
| | | |
| | I by no means believe LLMs are general intelligence, and | |
| | I've seen them produce a lot of garbage, but if they | |
| | could produce these revolutionary theories from only <= | |
| | year 1900 information and a prompt that is not | |
| | ridiculously leading, that would be a really compelling | |
| | demonstration of their power. | |
| | echoangle wrote: | |
| | > Does anyone say "we don't know if Einstein could do | |
| | this because we were really close or because he was | |
| | really smart?" | |
| | | |
| | Kind of, how long would it have realistically taken for | |
| | someone else (also really smart) to come up with the same | |
| thing if Einstein hadn't been there? | |
| | jaggederest wrote: | |
| | Well, we know many watershed moments in history were more | |
| | a matter of situation than the specific person - an | |
| | individual genius might move things by a decade or two, | |
| | but in general the difference is marginal. True bolt-out- | |
| | of-the-blue developments are uncommon, though all the | |
| | more impressive for that fact, I think. | |
| | pegasus wrote: | |
| | But you're not actually questioning whether he was | |
| | "really smart". Which was what GP was questioning. Sure, | |
| | you can try to quantify the level of smarts, but you | |
| can't call it a "stochastic parrot" anymore, just | |
| like you wouldn't respond to Einstein's achievements, "Ah | |
| | well, in the end I'm still not sure he's actually smart, | |
| | like I am for example. Could just be that he's just | |
| | dumbly but systematically going through all options, | |
| | working it out step by step, nothing I couldn't achieve | |
| | (or even better, program a computer to do) if I'd put my | |
| | mind to it." | |
| | | |
| | I personally doubt that this would work. I don't think | |
| | these systems can achieve truly ground-breaking, | |
| | paradigm-shifting work. The homeworld of these systems is | |
| the corpus of text on which they were trained, in the same | |
| | way as ours is physical reality. Their access to this | |
| | reality is always secondary, already distorted by the | |
| | imperfections of human knowledge. | |
| | bmacho wrote: | |
| | > Does anyone say "we don't know if Einstein could do | |
| | this because we were really close or because he was | |
| really smart?" | |
| | | |
| Yes. It is certainly a question whether Einstein was one of | |
| the smartest guys who ever lived or all of his discoveries | |
| were already in the Zeitgeist, and would have been discovered | |
| by someone else in ~5 years. | |
| | cyberax wrote: | |
| | Both can be true? | |
| | | |
| | Einstein was smart and put several disjointed things | |
| | together. It's amazing that one person could do so much, | |
| from explaining Brownian motion to explaining the | |
| photoelectric effect. | |
| | | |
| | But I think that all these would have happened within | |
| | _years_ anyway. | |
| | emodendroket wrote: | |
| | > Does anyone say "we don't know if Einstein could do | |
| | this because we were really close or because he was | |
| | really smart?" | |
| | | |
| | It turns out my reading is somewhat topical. I've been | |
| reading Rhodes' "The Making of the Atomic Bomb" and one of | |
| the things he takes great pains to argue (I was not quite | |
| | anticipating how much I'd be trying to recall my high | |
| | school science classes to make sense of his account of | |
| | various experiments) is that the development toward the | |
| | atomic bomb was more or less inexorable and if at any | |
| | point someone said "this is too far; let's stop here" | |
| | there would be others to take his place. So, maybe, to | |
| | answer your question. | |
| | twoodfin wrote: | |
| | It's been a while since I read it, but I recall Rhodes' | |
| | point being that once the fundamentals of fission in | |
| | heavy elements were validated, making a working bomb was | |
| | no longer primarily a question of science, but one of | |
| | engineering. | |
| | sleet_spotter wrote: | |
| | Well, if one had enough time and resources, this would | |
| | make for an interesting metric. Could it figure it out | |
| with a cut-off of 1900? If so, what about 1899? 1898? What | |
| | context from the marginal year was key to the change in | |
| | outcome? | |
| | palmotea wrote: | |
| But that's not the OP's challenge, he said "if the model | |
| comes up with anything _even remotely correct_." The point | |
| is there were things _already_ "remotely correct" out there | |
| in 1900. If the LLM finds them, it wouldn't "be quite | |
| strong evidence that LLMs are a path to something bigger." | |
| | pegasus wrote: | |
| | It's not the comment which is illogical, it's your | |
| | (mis)interpretation of it. What I (and seemingly others) | |
| took it to mean is basically _could an LLM do Einstein's | |
| | job_? Could it weave together all those loose threads | |
| | into a coherent new way of understanding the physical | |
| | world? If so, AGI can't be far behind. | |
| | feanaro wrote: | |
| | This alone still wouldn't be a clear demonstration that | |
| AGI is around the corner. It's quite possible an LLM | |
| | could've done Einstein's job, if Einstein's job was truly | |
| | just synthesising already available information into a | |
| | coherent new whole. (I couldn't say, I don't know enough | |
| | of the physics landscape of the day to claim either way.) | |
| | | |
| | It's still unclear whether this process could be merely | |
| | continued, seeded only with new physical data, in order | |
| | to keep progressing beyond that point, "forever", or at | |
| | least for as long as we imagine humans will continue to | |
| | go on making scientific progress. | |
| | pegasus wrote: | |
| | Einstein is chosen in such contexts because he's the | |
| | paradigmatic paradigm-shifter. Basically, what you're | |
| | saying is: "I don't know enough history of science to | |
| | confirm this incredibly high opinion on Einstein's | |
| | achievements. It could just be that everyone's been wrong | |
| | about him, and if I'd really get down and dirty, and | |
| | learn the facts at hand, I might even prove it." Einstein | |
| | is chosen to avoid exactly this kind of nit-picking. | |
| | Shorel wrote: | |
| | They can also choose Euler or Gauss. | |
| | | |
| | These two are so above everyone else in the mathematical | |
| | world that most people would struggle for weeks or even | |
| | months to understand something they did in a couple of | |
| | minutes. | |
| | | |
| | There's no "get down and dirty" shortcut with them =) | |
| | feanaro wrote: | |
| No, by saying this, I am _not_ downplaying Einstein's | |
| | sizeable achievements nor trying to imply everyone was | |
| | wrong about him. His was an impressive breadth of | |
| | knowledge and mathematical prowess and there's no denying | |
| | this. | |
| | | |
| | However, what I'm saying is not mere nitpicking either. | |
| | It is precisely because of my belief in Einstein's | |
| | extraordinary abilities that I find it unconvincing that | |
| | an LLM being able to recombine the extant written | |
| | physics-related building blocks of 1900, with its | |
| | practically infinite reading speed, necessarily | |
| | demonstrates comparable capabilities to Einstein. | |
| | | |
| | The essence of the question is this: would Einstein, | |
| | having been granted eternal youth and a neverending | |
| | source of data on physical phenomena, be able to innovate | |
| | forever? Would an LLM? | |
| | | |
| | My position is that even if an LLM _is_ able to | |
| | synthesise special relativity given 1900 knowledge, this | |
| doesn't necessarily mean that a positive answer to the | |
| | first question implies a positive answer to the second. | |
| | ubercore wrote: | |
| | Isn't it an interesting question? Wouldn't you like to | |
| | know the answer? I don't think anyone is claiming | |
| | anything more than an interesting thought experiment. | |
| | frotaur wrote: | |
| | I'm sorry, but 'not being surprised if LLMs can rederive | |
| | relativity and QM from the facts available in 1900' is a | |
| | pretty scalding take. | |
| | | |
| | This would absolutely be very good evidence that models | |
| | can actually come up with novel, paradigm-shifting ideas. | |
| | It was absolutely not obvious at that time from the | |
| existing facts, and some crazy leaps of faith needed to | |
| | be taken. | |
| | | |
| | This is especially true for General Relativity, for which | |
| you had just a few mismatches in the measurements, like | |
| | Mercury's precession, and where the theory almost | |
| | entirely follows from thought experiments. | |
| | techno_tsar wrote: | |
| | This does make me think about Kuhn's concept of | |
| | scientific revolutions and paradigms, and that paradigms | |
| | are incommensurate with one another. Since new paradigms | |
| | can't be proven or disproven by the rules of the old | |
| | paradigm, if an LLM could independently discover paradigm | |
| | shifts similar to moving from Newtonian gravity to | |
| | general relativity, then we have empirical evidence of an | |
| | LLM performing a feature of general intelligence. | |
| | | |
| However, you could also argue that it's actually | |
| empirical evidence that the move from 19th century | |
| physics to general relativity wasn't _truly_ a paradigm | |
| shift -- you could have 'derived' it from previous data | |
| -- that the LLM has actually proven something about | |
| structural similarities between those paradigms, not | |
| that it's demonstrating general intelligence... | |
| | somenameforme wrote: | |
| | His concept sounds odd. There will always be many hints | |
| | of something yet to be discovered, simply by the nature | |
| | of anything worth discovering having an influence on | |
| | other things. | |
| | | |
| | For instance spectroscopy enables one to look at the | |
| | spectra emitted by another 'thing', perhaps the sun, and | |
| it turns out that there are little streaks within the | |
| spectra that correspond directly to various elements. This | |
| | is how we're able to determine the elemental composition | |
| | of things like the sun. | |
| | | |
| | That connection between elements and the patterns in | |
| | their spectra was discovered in the early 1800s. And | |
| | those patterns are caused by quantum mechanical | |
| | interactions and so it was perhaps one of the first big | |
| | hints of quantum mechanics, yet it'd still be a century | |
| | before we got to relativity, let alone quantum mechanics. | |
| | ctoth wrote: | |
| | I mean, "the pieces were already there" is true of | |
| | everything? Einstein was synthesizing existing math and | |
| | existing data is your point right? | |
| | | |
| | But the whole question is whether or not something can do | |
| | that synthesis! | |
| | | |
| | And the "anyone who read all the right papers" thing - | |
| | nobody actually reads all the papers. That's the | |
| | bottleneck. LLMs don't have it. They will continue to not | |
| | have it. Humans will continue to not be able to read | |
| | faster than LLMs. | |
| | | |
| | Even me, using a speech synthesizer at ~700 WPM. | |
| | feanaro wrote: | |
| | > I mean, "the pieces were already there" is true of | |
| | everything? Einstein was synthesizing existing math and | |
| | existing data is your point right? | |
| | | |
| | If it's true of everything, then surely having an LLM | |
| | work iteratively on the pieces, along with being provided | |
| | additional physical data, will lead to the discovery of | |
| | everything? | |
| | | |
| | If the answer is "no", then surely something is still | |
| | missing. | |
| | | |
| | > And the "anyone who read all the right papers" thing - | |
| | nobody actually reads all the papers. That's the | |
| | bottleneck. LLMs don't have it. They will continue to not | |
| | have it. Humans will continue to not be able to read | |
| | faster than LLMs. | |
| | | |
| | I agree with this. This is a definitive advantage of | |
| | LLMs. | |
| | andai wrote: | |
| | AGI is human level intelligence, and the minimum bar is | |
| | _Einstein?_ | |
| | pegasus wrote: | |
| | Who said anything of a _minimum_ bar? "If so", not "Only | |
| | if so". | |
| | andy12_ wrote: | |
| | I think the problem is the formulation "If so, AGI can't | |
| | be far behind". I think that if a model were advanced | |
| | enough such that it could do Einstein's job, that's it; | |
| | that's AGI. Would it be ASI? Not necessarily, but that's | |
| | another matter. | |
| | somenameforme wrote: | |
| | The phone in your pocket can perform arithmetic many | |
| | orders of magnitude faster than any human, even the | |
| | fringe autistic savant type. Yet it's still obviously not | |
| | intelligent. | |
| | | |
| | Excellence at any given task is not indicative of | |
| intelligence. I think we set these sorts of false | |
| | goalposts because we want something that sounds | |
| | achievable but is just out of reach at one moment in | |
| | time. For instance at one time it was believed that a | |
| | computer playing chess at the level of a human would be | |
| | proof of intelligence. Of course it sounds naive now, but | |
| | it was genuinely believed. It ultimately not being so is | |
| | not us moving the goalposts, so much as us setting | |
| | artificially low goalposts to begin with. | |
| | | |
| | So for instance what we're speaking of here is logical | |
| | processing across natural language, yet human | |
| | intelligence predates natural language. It poses a bit of | |
| | a logical problem to then define intelligence as the | |
| | logical processing of natural language. | |
| | andy12_ wrote: | |
| | The problem is that so far, SOTA generalist models are | |
| | not excellent at just one particular task. They have a | |
| | very wide range of tasks they are good at, and good | |
| scores on one particular benchmark correlate very | |
| | strongly with good scores in almost all other benchmarks, | |
| | even esoteric benchmarks that AI labs certainly didn't | |
| | train against. | |
| | | |
| | I'm sure, without any uncertainty, that any generalist | |
| | model able to do what Einstein did would be AGI, as in, | |
| | that model would be able to perform any cognitive task | |
| | that an intelligent human being could complete in a | |
| | reasonable amount of time (here "reasonable" depends on | |
| | the task at hand; it could be minutes, hours, days, | |
| | years, etc). | |
| | somenameforme wrote: | |
| | I see things rather differently. Here's a few points in | |
| | no particular order: | |
| | | |
| | (1) - A major part of the challenge is in not being | |
| | directed towards something. There was no external | |
| | guidance for Einstein - he wasn't even a formal | |
| | researcher at the time of his breakthroughs. An LLM | |
| | _might_ be able to be handheld towards relativity, though | |
| | I doubt it, but given the prompt of 'hey find something | |
| | revolutionary' it's obviously never going to respond with | |
| | anything relevant, even with substantially greater | |
| | precision specifying field/subtopic/etc. | |
| | | |
| | (2) - Logical processing of natural language remains one | |
| | small aspect of intelligence. For example - humanity | |
| | invented natural language from nothing. The concept of an | |
| | LLM doing this is a nonstarter since they're dependent | |
| | upon token prediction, yet we're speaking of starting | |
| | with 0 tokens. | |
| | | |
| | (3) - LLMs are, in many ways, very much like calculators. | |
| | They can indeed achieve some quite impressive feats in | |
| | specific domains, yet then they will completely | |
| | hallucinate nonsense on relatively trivial queries, | |
| | particularly on topics where there isn't extensive data | |
| | to drive their token prediction. I don't entirely | |
| | understand your extreme optimism towards LLMs given this | |
| | proclivity for hallucination. Their ability to produce | |
| compelling nonsense makes them particularly tedious to | |
| use for anything you don't already effectively know | |
| | the answer to. | |
| | f0ti wrote: | |
| | Einstein is not AGI, and neither the other way around. | |
| | golem14 wrote: | |
| I think it's not productive to just have the LLM sit like | |
| | Mycroft in his armchair and from there, return you an | |
| | excellent expert opinion. | |
| | | |
| That's not how science works. | |
| | | |
| | The LLM would have to propose experiments (which would have | |
| | to be simulated), and then develop its theories from that. | |
| | | |
| | Maybe there had been enough facts around to suggest a | |
| number of hypotheses, but the LLM in its current form won't | |
| | be able to confirm them. | |
| | bhaak wrote: | |
| | This would still be valuable even if the LLM only finds out | |
| | about things that are already in the air. | |
| | | |
| | It's probably even more of a problem that different areas of | |
| | scientific development don't know about each other. LLMs | |
| | combining results would still not be like they invented | |
| | something new. | |
| | | |
| | But if they could give us a head start of 20 years on certain | |
| | developments this would be an awesome result. | |
| | gus_massa wrote: | |
| | I agree, but it's important to note that QM has no clear | |
| | formulation until 2025/6, it's like 20 years more of work | |
| | than SR. | |
| | pests wrote: | |
| | 2025/6? | |
| | gus_massa wrote: | |
| | * 1925/6, sorry, bad century. | |
| | Shorel wrote: | |
| | Then that experiment is even more interesting, and should be | |
| | done. | |
| | | |
| | My own prediction is that the LLMs would totally fail at | |
| | connecting the dots, but a small group of very smart humans | |
| | can. | |
| | | |
| | Things don't happen all of a sudden, but they also don't | |
| | happen everywhere. Most people in most parts of the world | |
| | would never connect the dots. Scientific curiosity is | |
| | something valuable and fragile, that we just take for | |
| | granted. | |
| | bigfudge wrote: | |
| | One of the reasons they don't happen everywhere is because | |
| | there are just a few places at any given point in time | |
| | where there are enough well connected and educated | |
| | individuals who are in a position to even see all the dots | |
| | let alone connect them. This doesn't discount the | |
| achievement if an LLM also manages to, but I think it's | |
| important to recognise that having enough giants in sight | |
| is an important prerequisite to standing on their shoulders. | |
| | djwide wrote: | |
| | With LLMs the synthesis cycles could happen at a much higher | |
| | frequency. Decades condensed to weeks or days? | |
| | | |
| | I imagine possible buffers on that conjecture synthesis being | |
| experimentation and acceptance by the scientific community. | |
| | AIs can come up with new ideas every day but Nature won't | |
| | publish those ideas for years. | |
| | mannykannot wrote: | |
| | If (as you seem to be suggesting) relativity was effectively | |
| | lying there on the table waiting for Einstein to just pick it | |
| | up, how come it blindsided most, if not quite all, of the | |
| | greatest minds of his generation? | |
| | TeMPOraL wrote: | |
| | That's the case with all scientific discoveries - pieces of | |
| | prior work get accumulated, until it eventually becomes | |
| | obvious[0] how they connect, at which point someone[1] | |
| | connects the dots, making a discovery... and putting it on | |
| | the table, for the cycle to repeat anew. This is, in a | |
| | nutshell, the history of all scientific and technological | |
| | progress. Accumulation of tiny increments. | |
| | | |
| | -- | |
| | | |
| | [0] - To people who happen to have the right background and | |
| | skill set, and are in the right place. | |
| | | |
| | [1] - Almost always multiple someones, independently, | |
| | within short time of each other. People usually remember | |
| | only one or two because, for better or worse, history is | |
| | much like patent law: first to file wins. | |
| | somenameforme wrote: | |
| | It's only easy to see precursors in hindsight. The Michelson- | |
| | Morley tale is a great example of this. In hindsight, their | |
| | experiment was screaming relativity, because it demonstrated | |
| | that the speed of light was identical from two perspectives | |
| | where it's very difficult to explain without relativity. | |
| | Lorentz contraction was just a completely ad-hoc proposal to | |
| | maintain the assumptions of the time (luminiferous aether in | |
| | particular) while also explaining the result. But in general | |
| | it was not seen as that big of a deal. | |
| | | |
| | There's a very similar parallel with dark matter in modern | |
| | times. We certainly have endless hints to the truth that will | |
| | be evident in hindsight, but for now? We are mostly convinced | |
| | that we know the truth, perform experiments to prove that, | |
| | find nothing, shrug, adjust the model to be even more | |
| | esoteric, and repeat onto the next one. And maybe one will | |
| | eventually show something, or maybe we're on the wrong path | |
| | altogether. This quote, from Michelson in 1894 (more than a | |
| | decade before Einstein would come along), is extremely | |
| | telling of the opinion at the time: | |
| | | |
| | "While it is never safe to affirm that the future of Physical | |
| | Science has no marvels in store even more astonishing than | |
| | those of the past, it seems probable that most of the grand | |
| | underlying principles have been firmly established and that | |
| | further advances are to be sought chiefly in the rigorous | |
| | application of these principles to all the phenomena which | |
| | come under our notice. It is here that the science of | |
| | measurement shows its importance -- where quantitative work | |
| | is more to be desired than qualitative work. An eminent | |
| | physicist remarked that the future truths of physical science | |
| | are to be looked for in the sixth place of decimals." - | |
| | Michelson 1894 | |
| | vasco wrote: | |
| | With the passage of time more and more things have been | |
| | discovered through precision. Through identifying small | |
| | errors in some measurement and pursuing that to find the | |
| | cause. | |
| | somenameforme wrote: | |
| | It's not precision that's the problem, but understanding | |
| | when something has been falsified. For instance the | |
| | Lorentz transformations work as a perfectly fine ad-hoc | |
| | solution to Michelson's discovery. All it did was make | |
| | the aether a bit more esoteric in nature. Why do you then | |
| | not simply shrug, accept it, and move on? Perhaps even | |
| | toss some accolades towards Lorentz for 'solving' the | |
| | puzzle? Michelson himself certainly felt there was no | |
| | particularly relevant mystery outstanding. | |
| | | |
| | For another parallel our understanding of the big bang | |
| | was, and probably is, wrong. There are a lot of problems | |
| | with the traditional view of the big bang with the | |
| | horizon problem [1] being just one among many - areas in | |
| | space that should not have had time to interact behave | |
| | like they have. So this was 'solved' by an ad hoc | |
| | solution - just make the expansion of the universe go | |
| | into super-light speed for a fraction of a second at a | |
| | specific moment, slow down, then start speeding up again | |
| | (cosmic inflation [2]) - and it all works just fine. So | |
| | you know what we did? Shrugged, accepted it, and even | |
| | gave Guth et al a bunch of accolades for 'solving' the | |
| | puzzle. | |
| | | |
| | This is the problem - arguably the most important | |
| | principle of science is falsifiability. But when is | |
| | something falsified? Because in many situations, probably | |
| | the overwhelming majority, you can instead just use one | |
| | falsification to create a new hypothesis with that nuance | |
| | integrated into it. And as science moves beyond singular | |
| | formulas derived from clear principles or laws and onto | |
| | broad encompassing models based on correlations from | |
| | limited observations, this becomes more and more true. | |
| | | |
| | [1] - https://en.wikipedia.org/wiki/Horizon_problem | |
| | | |
| | [2] - https://en.wikipedia.org/wiki/Cosmic_inflation | |
| | jojobas wrote: | |
| | They were close, but it required the best people bashing | |
| | their heads against each other for years until they got it. | |
| | dogma1138 wrote: | |
| | That is the point. | |
| | | |
| | New discoveries don't happen in a vacuum. | |
| | eru wrote: | |
| | You can get pretty far by modeling only frictionless, | |
| | spherical discoveries in a vacuum. | |
| | metalliqaz wrote: | |
| | Yann LeCun spoke explicitly on this idea recently and he | |
| | asserts definitively that the LLM would not be able to add | |
| | anything useful in that scenario. My understanding is that | |
| | other AI researchers generally agree with him, and that it's | |
| | mostly the hype beasts like Altman that think there is some | |
| | "magic" in the weights that is actually intelligent. Their | |
| | payday depends on it, so it is understandable. My opinion is | |
| | that LeCun is probably correct. | |
| | johnsmith1840 wrote: | |
| | There is some ability for it to make novel connections but | |
| it's pretty small. You can see this yourself by having it build | |
| | novel systems. | |
| | | |
| It largely cannot imagine anything beyond the usual but there | |
| | is a small part that it can. This is similar to in context | |
| | learning, it's weak but it is there. | |
| | | |
| | It would be incredible if meta learning/continual learning | |
| found a way to train exactly for novel learning paths. But | |
| | that's literally AGI so maybe 20yrs from now? Or never.. | |
| | | |
| | You can see this on CL benchmarks. There is SOME signal but | |
| it's crazy low. When I was training CL models I found that the | |
| | signal was in the single % points. Some could easily argue it | |
| | was zero but I really do believe there is a very small amount | |
| | in there. | |
| | | |
| | This is also why any novel work or findings is done via | |
| MASSIVE compute budgets. They find RL environments that can | |
| | extract that small amount out. Is it random chance? Maybe, | |
| | hard to say. | |
| | SoftTalker wrote: | |
| | Is this so different from what we see in humans? Most | |
| | people do not think very creatively. They apply what they | |
| | know in situations they are familiar with. In unfamiliar | |
| | situations they don't know what to do and often fail to | |
| | come up with novel solutions. Or maybe in areas where they | |
| | are very experienced they will come up with something | |
| | incrementally better than before. But occasionally a very | |
| | exceptional person makes a profound connection or leap to a | |
| | new understanding. | |
| | johnsmith1840 wrote: | |
| Sure, we make small steps at a time, but we compound | |
| these, unlike AI. | |
| | | |
| | AI cannot compound their learnings for the foreseeable | |
| | future | |
| | catigula wrote: | |
| | This is definitely wrong, most AI researchers DO NOT agree | |
| | with LeCun. | |
| | | |
| | Most ML researchers think AGI is imminent. | |
| | p_j_w wrote: | |
| | Who is in this group of ML researchers? | |
| | shaky-carrousel wrote: | |
| | People with OpenAI shares, probably. | |
| | kingstnap wrote: | |
| | Where do you get your majority from? | |
| | | |
| | I don't think there is any level of broad agreement right | |
| | now. There are tons of random camps none of which I would | |
| | consider to be broadly dominating. | |
| | Alex2037 wrote: | |
| | their employment and business opportunities depend on the | |
| | hype, so they will continue to 'think' that (on xitter) | |
| | despite the current SOTA of transformers-based models being | |
| | <100% smarter than >3 year old GPT4, and no revolutionary | |
| | new architecture in sight. | |
| | catigula wrote: | |
| | You're going to be in for a very rude awakening. | |
| | johnsmith1840 wrote: | |
| | The guy who built chatgpt literally said we're 20 years | |
| | away? | |
| | | |
| | Not sure how to interpret that as almost imminent. | |
| | nottorp wrote: | |
| | > The guy who built chatgpt literally said we're 20 years | |
| | away? | |
| | | |
| | 20 years away in 2026, still 20 years away in 2027, etc | |
| | etc. | |
| | | |
| | Whatever Altman's hyping, that's the translation. | |
| | rafram wrote: | |
| | The ones being paid a million dollars a year by OpenAI to | |
| | say stuff like that, maybe. | |
| | goatlover wrote: | |
| | Do you have poll of ML researchers that shows this? | |
| | paodealho wrote: | |
| | Well, can you point us to their research then? Please. | |
| | mlinksva wrote: | |
| | Do you have a pointer to where LeCun spoke about it? I | |
| noticed last October that Dwarkesh mentioned the idea | |
| offhandedly on his podcast (prompting me to write up | |
| | https://manifold.markets/MikeLinksvayer/llm-trained-on- | |
| | data-...) but I wonder if this idea has been around for much | |
| | longer, or is just so obvious that lots of people are | |
| | independently coming up with it (parent to this comment being | |
| | yet another)? | |
| | samuelson wrote: | |
| Preface: Most of my understanding of how LLMs actually work | |
| | comes from 3blue1brown's videos, so I could easily be wrong | |
| | here. | |
| | | |
| | I mostly agree with you, especially about distrusting the | |
| | self-interested hype beasts. | |
| | | |
| | While I don't think the models are actually "intelligent", I | |
| | also wonder if there are insights to be gained by looking at | |
| | how concepts get encoded by the models. It's not really that | |
| | the models will add something "new", but more that there | |
| | might be connections between things that we haven't noticed, | |
| | especially because academic disciplines are so insular these | |
| | days. | |
| | matheusd wrote: | |
| | How about this for an evaluation: Have this (trained-on- | |
| | older-corpus) LLM propose experiments. We "play the role of | |
| | nature" and inform it of the results of the experiments. It | |
| | can then try to deduce the natural laws. | |
| | | |
| | If we did this (to a good enough level of detail), would it | |
| | be able to derive relativity? How large of an AI model would | |
| | it have to be to successfully derive relativity (if it only | |
| | had access to everything published up to 1904)? | |
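| | |
| As a rough sketch of the loop I have in mind (ask_model and | |
| run_experiment are hypothetical stand-ins for the cut-off | |
| model and for whatever simulation or human plays "nature"): | |
| | |
|   # One round: the model proposes an experiment, "nature" | |
|   # answers, and the exchange is appended to the transcript. | |
|   def discovery_loop(ask_model, run_experiment, rounds=50): | |
|       transcript = ("Propose experiments; I will report " | |
|                     "the measured results.") | |
|       for _ in range(rounds): | |
|           proposal = ask_model(transcript + | |
|                                "\nNext experiment?") | |
|           result = run_experiment(proposal) | |
|           transcript += ("\nExperiment: " + proposal + | |
|                          "\nResult: " + result) | |
|       return ask_model(transcript + | |
|                        "\nState your best theory of nature.") | |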
| | SirHumphrey wrote: | |
| | I don't know if any dataset of pre 1904 writing would be | |
| | large enough to train a model that would be smart enough. I | |
| | suspect that current sized SOTA models would at least get | |
| | to special relativity, but for general relativity and | |
| | quantum mechanics I am less sure. | |
| | djwide wrote: | |
| | What do they (or you) have to say about the Lee Sedol AlphaGo | |
| move 78? It seems like that was "new knowledge." Are games | |
| | just iterable and the real world idea space not? I am playing | |
| | with these ideas a little. | |
| | metalliqaz wrote: | |
| | AlphaGo is not an LLM | |
| | drdeca wrote: | |
| | And? Do the arguments differ for LLM vs the other models? | |
| | | |
| | I guess the arguments sometimes mention languages. But I | |
| | feel like the core of the arguments are pretty much the | |
| | same regardless? | |
| | metalliqaz wrote: | |
| | The discussion is about training an LLM on old text and | |
| | then asking it about new concepts. | |
| | DevX101 wrote: | |
| | Chemistry would be a great space to explore. The last quarter | |
| | of the 19th century had a ton of advancements in chemistry. | |
| It'd be interesting to see if an LLM could propose fruitful | |
| hypotheses or make predictions about the science of | |
| thermodynamics. | |
| | bravura wrote: | |
| | A rigorous approach to predicting the future of text was | |
| | proposed by Li et al 2024, "Evaluating Large Language Models | |
| | for Generalization and Robustness via Data Compression" | |
| | (https://ar5iv.labs.arxiv.org/html//2402.00861) and I think | |
| | that work should get more recognition. | |
| | | |
| | They measure compression (perplexity) on future Wikipedia, news | |
| | articles, code, arXiv papers, and multi-modal data. Data | |
| | compression is intimately connected with robustness and | |
| | generalization. | |
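| | |
| Roughly, that kind of evaluation can be sketched like this (a | |
| minimal sketch, not the paper's exact setup; the model name | |
| and data file below are placeholders): | |
| | |
|   # Score held-out "future" text with a causal LM and report | |
|   # perplexity / bits-per-token, the compression-style metric | |
|   # described above. Model and file are placeholders. | |
|   import math | |
|   import torch | |
|   from transformers import AutoModelForCausalLM, AutoTokenizer | |
| | |
|   name = "gpt2"  # placeholder; any causal LM works | |
|   tok = AutoTokenizer.from_pretrained(name) | |
|   model = AutoModelForCausalLM.from_pretrained(name).eval() | |
| | |
|   text = open("post_cutoff_articles.txt").read()  # hypothetical | |
|   ids = tok(text, return_tensors="pt").input_ids[:, :1024] | |
| | |
|   with torch.no_grad(): | |
|       loss = model(ids, labels=ids).loss  # mean CE per token | |
| | |
|   print("perplexity:", round(math.exp(loss.item()), 2)) | |
|   print("bits/token:", round(loss.item() / math.log(2), 2)) | |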
| | Otterly99 wrote: | |
| | Thanks for the paper, I just read it and loved the approach. | |
| | I hope the concept of using data compression as a benchmark | |
| | will take off. In a sense it is kind of similar to the maxim | |
| | "If you cannot explain something in simple terms, you do not | |
| | understand it fully". | |
| | nickdothutton wrote: | |
| | I would love to ask such a model to summarise the handful of | |
| | theories or theoretical "roads" being eyed at the time and to | |
| | make a prediction with reasons as to which looks most | |
| | promising. We might learn something about blind spots in human | |
| | reasoning, institutions, and organisations that are applicable | |
| | today in the "future". | |
| | nickpsecurity wrote: | |
| | That would be an interesting experiment. It might be more | |
| | useful to make a model with a cut off close to when copyrights | |
| | expire to be as modern as possible. | |
| | | |
| | Then, we have a model that knows quite a bit in modern English. | |
| | We also legally have a data set for everything it knows. Then, | |
| | there's all kinds of experimentation or copyright-safe training | |
| | strategies we can do. | |
| | | |
| | Project Gutenberg up to the 1920's seems to be the safest bet | |
| | on that. | |
| | samuelson wrote: | |
| | I think it would be fun to see if an LLM would reframe some | |
| | scientific terms from the time in a way that would actually fit | |
| | in our current theories. | |
| | | |
| | I imagine if you explained quantum field theory to a 19th | |
| century scientist they might think of it as a more refined | |
| | understanding of luminiferous aether. | |
| | | |
| | Or if an 18th century scholar learned about positive and | |
| | negative ions, it could be seen as an expansion/correction of | |
| | phlogiston theory. | |
| | wongarsu wrote: | |
| | I'm trying to work towards that goal by training a model on | |
| | mostly German science texts up to 1904 (before the world wars | |
| | German was the lingua franca of most sciences). | |
| | | |
| | Training data for a base model isn't that hard to come by, even | |
| | though you have to OCR most of it yourself because the publicly | |
| | available OCRed versions are commonly unusably bad. But | |
| | training a model large enough to be useful is a major issue. | |
| | Training a 700M parameter model at home is very doable (and is | |
| | what this TimeCapsuleLLM is), but to get that kind of reasoning | |
| | you need something closer to a 70B model. Also a lot of the | |
| | "smarts" of a model gets injected in fine tuning and RL, but | |
| | any of the available fine tuning datasets would obviously | |
| | contaminate the model with 2026 knowledge. | |
| | theallan wrote: | |
| | Can we follow along with your work / results somewhere? | |
| | benbreen wrote: | |
| | I am a historian and am putting together a grant application | |
| | for a somewhat similar project (different era and language | |
| | though). Would you be open to discussing a collaboration? My | |
| | email is bebreen [at] ucsc [dot] edu. | |
| | root_axis wrote: | |
| | I think it would raise some interesting questions, but if it | |
| | did yield anything noteworthy, the biggest question would be | |
| | why that LLM is capable of pioneering scientific advancements | |
| | and none of the modern ones are. | |
| | spidersouris wrote: | |
| | I'm not sure what you'd call a "pioneering scientific | |
| | advancement", but there is an increasing amount of examples | |
| | showing that LLMs can be used for research (with agents, | |
| | particularly). A survey about this was published a few months | |
| | ago: https://aclanthology.org/2025.emnlp-main.895.pdf | |
| | crazylogger wrote: | |
| | Or maybe, LLMs _are_ pioneering scientific advancements - | |
| | people are using LLMs to read papers, choose what problems to | |
| | work on, come up with experiments, analyze results, and draft | |
| | papers, etc., at this very moment. Except they eventually | |
| | stick their human names on the cover so we almost never know. | |
| | defgeneric wrote: | |
| | The development of QM was so closely connected to experiments | |
| | that it's highly unlikely, even despite some of the experiments | |
| | having been performed prior to 1900. | |
| | | |
| | Special relativity however seems possible. | |
| | damnitbuilds wrote: | |
| | I like this, it would be exciting (and scary) if it deduced QM, | |
| | and informative if it cannot. | |
| | | |
| | But I also think we can do this with normal LLMs trained on up- | |
| | to-date text, by asking them to come up with _any_ novel theory | |
| | that fits the facts. It does not have to be a groundbreaking | |
| | theory like QM, just original and not (yet) proven wrong? | |
| | kristopolous wrote: | |
| | It's going to be divining tea leaves. It will be 99% wrong and | |
| | then someone will say 'oh but look at this tea leaf over here! | |
| | It's almost correct!' | |
| | bowmessage wrote: | |
| | Look! It made another TODO-list app on the first try! | |
| | darkwater wrote: | |
| | Yes but... aren't human researchers doing the same? They are | |
| | mostly wrong most of the time, and try again, and verify | |
| | their work again, until they find something that actually | |
| | works. What I mean is that this "in hindsight" test would be | |
| | biased by being in hindsight, because we already know the | |
| | answer, so we would discard the LLM answer as just randomly | |
| | generated. But "connecting the dots" is basically doing a lot | |
| | of trial and error in your mind, emitting only the results | |
| | that make at least some kind of sense to us. | |
| | SecretDreams wrote: | |
| | I like this idea. I think I'd like it more if we didn't have to | |
| | prompt the LLM in the first place. If it just had all of this | |
| | information and decided to act upon it. That's what the great | |
| | minds of history (and even average minds like myself) do. Just | |
| | think about the facts in our point of view and spontaneously | |
| | reason something greater out of them. | |
| | mannykannot wrote: | |
| | That is a very interesting idea, though I would not dismiss | |
| | LLMs as a dead end if they failed. | |
| | Affric wrote: | |
| | Wow, an actual scientific experiment. Does anyone with | |
| | expertise know if such things have been done? | |
| | amypetrik214 wrote: | |
| | > If the model comes up with anything even remotely correct it | |
| | would be quite a strong evidence that LLMs are a path to | |
| | something bigger if not then I think it is time to go back to | |
| | the drawing board. | |
| | | |
| | In principle I see your point; in practice my default | |
| | assumption, until proven otherwise, is that a little something | |
| | slipped through post-1900. | |
| | | |
| | A much easier approach would be to just download some model, | |
| | whatever model, today. Then 5 years from now, whatever | |
| | interesting discoveries are found -- can the model get there? | |
| | dogma1138 wrote: | |
| | Not really, QM and Relativity were chosen because they were | |
| | theories that were created to fit observations and data. | |
| | Discoveries over the next 5 years will be trivia rather than | |
| | logical conclusions. | |
| | staticman2 wrote: | |
| | Don't you need to do reinforcement learning through human | |
| | feedback to get non gibberish results from the models in | |
| | general? | |
| | | |
| | 1900 era humans are not available to do this so I'm not sure | |
| | how this experiment is supposed to work. | |
| | jaydepun wrote: | |
| | We've thought of doing this sort of exercise at work but mostly | |
| | hit the wall of data becoming a lot more scarce the further back | |
| | in time we go. Particularly high quality science data - even | |
| | going pre 1970 (and that's already a stretch) you lose a lot of | |
| | information. There's a triple whammy of data still existing, | |
| | being accessible in any format, and that format being suitable | |
| | for training an LLM. Then there's the complications of wanting | |
| | additional model capabilities that won't leak data causally. | |
| | permo-w wrote: | |
| | I was wondering this. what is the minimum amount of text an | |
| | LLM needs to be coherent? as fun an idea as this is, the | |
| | samples of its responses are basically babbling nonsense. | |
| | going further, a lot of what makes LLMs so strong isn't their | |
| | original training data, but the RLHF done afterwards. RLHF | |
| | would be very difficult in this case | |
| | pseudohadamard wrote: | |
| | It's already been done, without the model being aware of it, | |
| | see https://arxiv.org/abs/2512.09742. They also made it think | |
| | it was Hitler (not MechaHitler, the other guy), and other | |
| | craziness. | |
| | | |
| | It's a relief to think that we're not trusting these things for | |
| | stuff like financial advice, medical advice, mental health | |
| | counselling, ... | |
| | catlifeonmars wrote: | |
| | That's how p-hacking works (or doesn't work). This is analogous | |
| | to shooting an arrow and then drawing a target around where it | |
| | lands. | |
| | alkindiffie wrote: | |
| | I like that analogy. It reminds me of "Pointing to the moon | |
| | and looking at my finger" | |
| | cornholio wrote: | |
| | Yes, I don't understand how such an experiment could work. | |
| | You either: | |
| | | |
| | A). contaminate the model with your own knowledge of | |
| | relativity, leading it on to "discover" what you know, or | |
| | | |
| | B). you will try to simulate a blind operation but without | |
| | the "competent human physicist knowledgeable up to the the | |
| | 1900 scientific frontier" component prompting the LLM, | |
| | because no such person is alive today nor can you simulate | |
| | them (if you could, then by definition you can use that | |
| | simulated Einstein to discover relativity, so the problem is | |
| | moot). | |
| | | |
| | So in both cases you would prove nothing about what a smart | |
| | and knowledgeable scientist can achieve today from a frontier | |
| | LLM. | |
| | alkindiffie wrote: | |
| | That would be possible if LLMs can come up with entirely new | |
| | words and languages, which I doubt. | |
| | isolli wrote: | |
| | You have to make sure that you make it read an article about a | |
| | painter falling off a roof with his tools. | |
| | swalsh wrote: | |
| | Could be an interesting experiment, but it's not conclusive | |
| | proof one way or another. So much of what makes LLMs so great | |
| | today (vs gpt 3.5) would not be in that dataset. The training | |
| | to turn these models into coding savants has generalized to | |
| | other areas just as one example. | |
| | redman25 wrote: | |
| | It's a base model. It hasn't been instruction tuned to "solve | |
| | problems" necessarily. All it can do is attempt to complete | |
| | text given some starting text. | |
| | simonw wrote: | |
| | Anyone seen a low-friction way to run prompts through this yet, | |
| | either via a hosted API or chat UI or a convenient GGML or MLX | |
| | build that runs in Ollama or llama.cpp or LM Studio? | |
| | philmo1 wrote: | |
| | +1 | |
| | d401 wrote: | |
| | +1 | |
| | t1amat wrote: | |
| | Not a direct answer but it looks like v0.5 is a nanoGPT arch | |
| | and v1 is a Phi 1.5 arch, which should be well supported by | |
| | quanting utilities for any engine. They are small too and | |
| | should be able to be done on a potato. | |
| | alansaber wrote: | |
| | I too have completely forgotten how the adapters library works | |
| | and would have appreciated a simple inference script | |
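| | | |
| | For reference, a minimal generation script might look like the | |
| | sketch below, using plain transformers rather than adapters and | |
| | assuming the checkpoint loads as a standard Hugging Face causal | |
| | LM (the repo id is a placeholder, not the project's actual one): | |
| | | |
| |   from transformers import AutoModelForCausalLM, AutoTokenizer | |
| | | |
| |   repo = "someuser/TimeCapsuleLLM"  # placeholder repo id | |
| |   tok = AutoTokenizer.from_pretrained(repo) | |
| |   model = AutoModelForCausalLM.from_pretrained(repo).eval() | |
| | | |
| |   prompt = "In the year of our Lord 1834, the city of London" | |
| |   ids = tok(prompt, return_tensors="pt").input_ids | |
| |   out = model.generate(ids, max_new_tokens=120, do_sample=True, | |
| |                        temperature=0.8, top_p=0.95) | |
| |   print(tok.decode(out[0], skip_special_tokens=True)) | |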
| | throwaway18875 wrote: | |
| | Currently running it using LM Studio. It can download it from | |
| | Hugging Face. It generates incoherent text though | |
| | | |
| | === | |
| | | |
| | You: | |
| | | |
| | I pray you, who is this Master Newton? | |
| | | |
| | timecapsulellm-v2-1800-1875-mlx: | |
| | | |
| | TI offer to pay you the very same fee as you did before. It was | |
| | not in the power of your master to deliver the letter to your | |
| | master. He did. I will be with you as soon as I can keep my | |
| | word. It is not at all clear, whether the letter has been sent | |
| | or not. It is not at all clear: but it is clear also that it | |
| | was written by the person who gave it. "No," I said, "I cannot | |
| | give it to you." There, the letter was sent to me. "The letter | |
| | is yours, I believe," I said. "But, I hope, you will not refuse | |
| | to give it to me? | |
| | simonw wrote: | |
| | Thanks, looks like that's this one: https://huggingface.co/Fr | |
| | actalSurfer/TimeCapsuleLLM-v2-1800-... | |
| | | |
| | There's a "Use this model" button on that page that can | |
| | launch it in LM Studio. | |
| | philmo1 wrote: | |
| | Exciting idea! | |
| | dhruv3006 wrote: | |
| | This will be something good - would love something on Ollama or | |
| | lmstudio. | |
| | srigi wrote: | |
| | "I'm sorry, my knowledge cuttoff is 1875" | |
| | HarHarVeryFunny wrote: | |
| | It would be interesting if there's enough data to train a model | |
| | capable enough to converse with and ask about contemporary views | |
| | on issues of the day, or what it thought about "potential" future | |
| | events/technologies yet to happen. | |
| | eqmvii wrote: | |
| | Could this be an experiment to show how likely LLMs are to lead | |
| | to AGI, or at least intelligence well beyond our current level? | |
| | | |
| | If you could only give it texts and info and concepts up to Year | |
| | X, well before Discovery Y, could we then see if it could prompt | |
| | its way to that discovery? | |
| | alansaber wrote: | |
| | I think not if only for the fact that the quantity of old data | |
| | isn't enough to train anywhere near a SoTA model, until we | |
| | change some fundamentals of LLM architecture | |
| | franktankbank wrote: | |
| | Are you saying it wouldn't be able to converse using English | |
| | of the time? | |
| | wasabi991011 wrote: | |
| | That's not what they are saying. SOTA models include much | |
| | more than just language, and the scale of training data is | |
| | related to its "intelligence". Restricting the corpus in | |
| | time => less training data => less intelligence => less | |
| | ability to "discover" new concepts not in its training data | |
| | franktankbank wrote: | |
| | Perhaps less bullshit though was my thought? Was language | |
| | more restricted then? Scope of ideas? | |
| | withinboredom wrote: | |
| | Could always train them on data up to 2015ish and then | |
| | see if you can rediscover LLMs. There's plenty of data. | |
| | ben_w wrote: | |
| | Machine learning today requires an obscene quantity of | |
| | examples to learn anything. | |
| | | |
| | SOTA LLMs show quite a lot of skill, but they only do so | |
| | after reading a significant fraction of all published | |
| | writing (and perhaps images and videos, I'm not sure) | |
| | across all languages, in a world whose population is 5 | |
| | times higher than at the link's cut-off date, and where global | |
| | literacy has gone from 20% to about 90% since then. | |
| | | |
| | Computers can only make up for this by being really really | |
| | fast: what would take a human a million or so years to | |
| | read, a server room can pump through a model's training | |
| | stage in a matter of months. | |
| | | |
| | When the data isn't there, reading what it does have really | |
| | quickly isn't enough. | |
| | andyfilms1 wrote: | |
| | I mean, _humans_ didn't need to read billions of books back | |
| | then to think of quantum mechanics. | |
| | famouswaffles wrote: | |
| | Right, what they needed was billions of years of brute | |
| | force and trial and error. | |
| | alansaber wrote: | |
| | Which is why I said it's not impossible, but current LLM | |
| | architecture is just not good enough to achieve this. | |
| | ben_w wrote: | |
| | > Could this be an experiment to show how likely LLMs are to | |
| | lead to AGI, or at least intelligence well beyond our current | |
| | level? | |
| | | |
| | You'd have to be specific what you mean by AGI: all three | |
| | letters mean a different thing to different people, and | |
| | sometimes the term as a whole means something not present in | |
| | the letters. | |
| | | |
| | > If you could only give it texts and info and concepts up to | |
| | Year X, well before Discovery Y, could we then see if it could | |
| | prompt its way to that discovery? | |
| | | |
| | To a limited degree. | |
| | | |
| | Some developments can come from combining existing ideas and | |
| | seeing what they imply. | |
| | | |
| | Other things, like everything to do with relativity and quantum | |
| | mechanics, would have required experiments. I don't think any | |
| | of the relevant experiments had been done prior to this cut-off | |
| | date, but I'm not absolutely sure of that. | |
| | | |
| | You might be able to get such an LLM to develop all the maths | |
| | and geometry for general relativity, and yet find the AI still | |
| | tells you that the perihelion shift of Mercury is a sign of the | |
| | planet Vulcan rather than of a curved spacetime: | |
| | https://en.wikipedia.org/wiki/Vulcan_(hypothetical_planet) | |
| | markab21 wrote: | |
| | Basically looking for emergent behavior. | |
| | grimgrin wrote: | |
| | An example of why you need to explain what you mean by AGI | |
| | is: | |
| | | |
| | https://www.robinsloan.com/winter-garden/agi-is-here/ | |
| | opponent4 wrote: | |
| | > You'd have to be specific what you mean by AGI | |
| | | |
| | Well, they obviously can't. AGI is not science, it's | |
| | religion. It has all the trappings of religion: prophets, | |
| | sacred texts, origin myth, end-of-days myth and most | |
| | importantly, a means to escape death. Science? Well, the only | |
| | measure of "general intelligence" would be to compare against | |
| | the only one we know of, the human kind, but we have absolutely | |
| | no means by which to describe it. We do not know where to start. | |
| | This is why when you scrape the surface of any AGI definition | |
| | you only find circular definitions. | |
| | | |
| | And no, the "brain is a computer" is not a scientific | |
| | description, it's a metaphor. | |
| | strbean wrote: | |
| | > And no, the "brain is a computer" is not a scientific | |
| | description, it's a metaphor. | |
| | | |
| | Disagree. A brain is turing complete, no? Isn't that the | |
| | definition of a computer? Sure, it may be reductive to say | |
| | "the brain is _just_ a computer ". | |
| | opponent4 wrote: | |
| | Not even close. Turing complete does not apply to the | |
| | brain plain and simple. That's something to do with | |
| | algorithms and your brain is not a computer as I have | |
| | mentioned. It does not store information. It doesn't | |
| | process information. It just doesn't work that way. | |
| | | |
| | https://aeon.co/essays/your-brain-does-not-process- | |
| | informati... | |
| | anthonypasq wrote: | |
| | ive gotta say this article was not convincing at all. | |
| | Closi wrote: | |
| | A human is effectively Turing complete if you give the | |
| | person paper and pen and the ruleset, and a brain clearly | |
| | stores information and processes it to some extent, so | |
| | this is pretty unconvincing. The article is nonsense and | |
| | badly written. | |
| | | |
| | > But here is what we are not born with: information, | |
| | data, rules, software, knowledge, lexicons, | |
| | representations, algorithms, programs, models, memories, | |
| | images, processors, subroutines, encoders, decoders, | |
| | symbols, or buffers - design elements that allow digital | |
| | computers to behave somewhat intelligently. Not only are | |
| | we not born with such things, we also don't develop them | |
| | - ever. | |
| | | |
| | Really? Humans don't ever develop memories? Humans don't | |
| | gain information? | |
| | strbean wrote: | |
| | > Forgive me for this introduction to computing, but I | |
| | need to be clear: computers really do operate on symbolic | |
| | representations of the world. They really store and | |
| | retrieve. They really process. They really have physical | |
| | memories. They really are guided in everything they do, | |
| | without exception, by algorithms. | |
| | | |
| | This article seems really hung up on the distinction | |
| | between digital and analog. It's an important | |
| | distinction, but glosses over the fact that digital | |
| | computers are a _subset_ of analog computers. Electrical | |
| | signals are inherently analog. | |
| | | |
| | This maps somewhat neatly to human cognition. I can take | |
| | a stream of bits, perform math on it, and output a | |
| | transformed stream of bits. That is a digital operation. | |
| | The underlying biological processes involved are a pile | |
| | of complex probabilistic+analog signaling, true. But in a | |
| | computer, the underlying processes are also probabilistic | |
| | and analog. We have designed our electronics to shove | |
| | those parts down to the lowest possible level so they can | |
| | be abstracted away, and so the degree to which they | |
| | influence computation is certainly lower than in the | |
| | human brain. But I think an effective argument that | |
| | brains are not computers is going to have to dive in to | |
| | why that gap matters. | |
| | nearbuy wrote: | |
| | That is an article by a psychologist, with no expertise | |
| | in neuroscience, claiming without evidence that the | |
| | "dominant cognitive neuroscience" is wrong. He offers no | |
| | alternative explanation on how memories are stored and | |
| | retrieved, but argues that large numbers of neurons | |
| | across the brain are involved and he implies that | |
| | neuroscientists think otherwise. | |
| | | |
| | This is odd because the dominant view in neuroscience is | |
| | that memories are stored by altering synaptic connection | |
| | strength in a large number of neurons. So it's not clear | |
| | what his disagreement is, and he just seems to be | |
| | misrepresenting neuroscientists. | |
| | | |
| | Interestingly, this is also how LLMs store memory during | |
| | training: by altering the strength of connections between | |
| | many artificial neurons. | |
| | stevenhuang wrote: | |
| | It is pretty clear the author of that article has no idea | |
| | what he's talking about. | |
| | | |
| | You should look into the physical Church-Turing thesis. | |
| | If it's false (all known tested physics suggests it's | |
| | true) then we're probably living in a dualist | |
| | universe. This means something outside of material | |
| | reality (souls? hypercomputation via quantum gravity? | |
| | weird physics? magic?) somehow influences our cognition. | |
| | | |
| | > Turing complete does not apply to the brain | |
| | | |
| | As far as we know, any physically realizable process can | |
| | be simulated by a Turing machine. And FYI brains do not | |
| | exist outside of physical reality... as far as we know. If | |
| | you have issue with this formulation, go ahead and | |
| | disprove the physical Church-Turing thesis. | |
| | Davidzheng wrote: | |
| | probably not actually turing complete right? for one it | |
| | is not infinite so | |
| | ben_w wrote: | |
| | Cargo cults are a religion, the things they worship they do | |
| | not understand, but the planes and the cargo themselves are | |
| | real. | |
| | | |
| | There's certainly plenty of cargo-culting right now on AI. | |
| | | |
| | Sacred texts, I don't recognise. Yudkowsky's writings? He | |
| | suggests wearing clown shoes to avoid getting a cult of | |
| | personality disconnected from the quality of the arguments, | |
| | if anyone finds his works sacred, they've fundamentally | |
| | misunderstood him: "I have sometimes thought that all | |
| | professional lectures on rationality should be delivered | |
| | while wearing a clown suit, to prevent the audience from | |
| | confusing seriousness with solemnity." | |
| | | |
| | - https://en.wikiquote.org/wiki/Eliezer_Yudkowsky | |
| | | |
| | Prophets forecasting the end-of-days, yes, but this too | |
| | from climate science, from everyone who was preparing for a | |
| | pandemic before covid and is still trying to prepare for | |
| | the next one because the wet markets are still around, from | |
| | economists trying to forecast growth or collapse and what | |
| | will change any given prediction of the latter into the | |
| | former, and from the military forces of the world saying | |
| | which weapon systems they want to buy. It does not make a | |
| | religion. | |
| | | |
| | A means to escape death, you can have. But it's on a | |
| | continuum with life extension and anti-aging medicine, | |
| | which itself is on a continuum with all other medical | |
| | interventions. To quote myself: "Taking a living human's | |
| | heart out without killing them, and replacing it with one | |
| | you got out of a corpse, that isn't the magic of necromancy, | |
| | neither is it a prayer or ritual to Sekhmet, it's just | |
| | transplant surgery. ... Immunity to smallpox isn't a prayer | |
| | to the Hindu goddess Shitala (of many things but most | |
| | directly linked with smallpox), and it isn't magic herbs or | |
| | crystals, it's just vaccines." | |
| | | |
| | - | |
| | https://benwheatley.github.io/blog/2025/06/22-13.21.36.html | |
| | nomel wrote: | |
| | > And no, the "brain is a computer" is not a scientific | |
| | description, it's a metaphor. | |
| | | |
| | I have trouble comprehending this. What is "computer" to | |
| | you? | |
| | feisty0630 wrote: | |
| | I fail to see how the two concepts equate. | |
| | | |
| | LLMs have neither intelligence nor problem-solving ability | |
| | (and I won't be relaxing the definition of either so that some | |
| | AI bro can pretend a glorified chatbot is sentient) | |
| | | |
| | You would, at best, be demonstrating that the sharing of | |
| | knowledge across multiple disciplines and nations (which is a | |
| | relatively new concept - at least at the scale of something | |
| | like the internet) leads to novel ideas. | |
| | al_borland wrote: | |
| | I've seen many futurists claim that human innovation is dead | |
| | and all future discoveries will be the results of AI. If this | |
| | is true, we should be able to see AI trained on the past | |
| | figure its way to various things we have today. If it can't | |
| | do this, I'd like said futurists to quiet down, as they are | |
| | discouraging an entire generation of kids who may go on to | |
| | discover some great things. | |
| | skissane wrote: | |
| | > I've seen many futurists claim that human innovation is | |
| | dead and all future discoveries will be the results of AI. | |
| | | |
| | I think there's a big difference between discoveries | |
| | through AI-human synergy and discoveries through AI working | |
| | in isolation. | |
| | | |
| | It probably will be true soon (if it isn't already) that | |
| | most innovation features some degree of AI input, but still | |
| | with a human to steer the AI in the right direction. | |
| | | |
| | I think an AI being able to discover something genuinely | |
| | new all by itself, without any human steering, is a lot | |
| | further off. | |
| | | |
| | If AIs start producing significant quantities of genuine | |
| | and useful innovation with minimal human input, maybe the | |
| | singularitarians are about to be proven right. | |
| | thinkingemote wrote: | |
| | I'm struggling to get a handle on this idea. Is the idea | |
| | that today's data will be the data of the past, in the | |
| | future? | |
| | | |
| | So if it can work with whats now past, it will be able to | |
| | work with the past in the future? | |
| | al_borland wrote: | |
| | Essentially, yes. | |
| | | |
| | If the prediction is that AI will be able to invent the | |
| | future, and we give it data from our past without | |
| | knowledge of the present... what type of future will it | |
| | invent, what progress will it make, if any at all? And | |
| | not just having the idea, but how to implement the idea | |
| | in a way that actually works with the technology of the | |
| | day, and can build on those things over time. | |
| | | |
| | For example, would AI with 1850 data have figured out the | |
| | idea of lift to make an airplane and taught us how to | |
| | make working flying machines and progress them to the | |
| | jets we have today, or something better? It wouldn't even | |
| | be starting from 0, so this would be a generous example, | |
| | as da Vinci was playing with these ideas in the 15th | |
| | century. | |
| | | |
| | If it can't do it, or what it produces is worse than what | |
| | humans have done, we shouldn't leave it to AI alone to | |
| | invent our actual future. Which would mean reevaluating | |
| | the role these "thought leaders" say it will play, and | |
| | how we're educating and communicating about AI to the | |
| | younger generations. | |
| | armcat wrote: | |
| | I think this would be an awesome experiment. However you would | |
| | effectively need to train something of a GPT-5.2 equivalent. So | |
| | you need a lot of text, a much larger parameterization (compared | |
| | to nanoGPT and Phi-1.5), and the 1800s equivalents of | |
| | supervised finetuning and reinforcement learning with human | |
| | feedback. | |
| | Trufa wrote: | |
| | This is fascinating, but the experiment seems to fall short of | |
| | being a fair comparison of how much knowledge we can extract | |
| | from that era's data versus today's. | |
| | | |
| | As a thought experiment I find it thrilling. | |
| | Rebuff5007 wrote: | |
| | OF COURSE! | |
| | | |
| | The fact that tech leaders espouse the brilliance of LLMs and | |
| | don't use this specific test method is infuriating to me. It is | |
| | deeply unfortunate that there is little transparency or | |
| | standardization of the datasets available for training/fine | |
| | tuning. | |
| | | |
| | Having this advertised would make for more interesting and | |
| | informative benchmarks. OEM models that are always "breaking" | |
| | the benchmarks are doing so with improved datasets as well as | |
| | improved methods. Without holding the datasets fixed, progress | |
| | on benchmarks is very suspect IMO. | |
| | dexwiz wrote: | |
| | This would be a true test of can LLMs innovate or just | |
| | regurgitate. I think part of people's amazement of LLMs is they | |
| | don't realize how much they don't know. So thinking and | |
| | recalling look the same to the end user. | |
| | water-data-dude wrote: | |
| | It'd be difficult to prove that you hadn't leaked information | |
| | to the model. The big gotcha of LLMs is that you train them on | |
| | BIG corpuses of data, which means it's hard to say "X isn't in | |
| | this corpus", or "this corpus only contains Y". You could TRY | |
| | to assemble a set of training data that only contains text from | |
| | before a certain date, but it'd be tricky as heck to be SURE | |
| | about it. | |
| | | |
| | Ways data might leak to the model that come to mind: | |
| | misfiled/mislabeled documents, footnotes, annotations, document | |
| | metadata. | |
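| | | |
| | One crude heuristic (a sketch under the assumption of plain-text | |
| | documents and a cutoff of 1875; it flags obvious leaks, it does | |
| | not prove a corpus is clean) is to scan for post-cutoff years | |
| | and scanner boilerplate before training: | |
| | | |
| |   import re | |
| | | |
| |   CUTOFF = 1875 | |
| |   YEAR = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")  # years 1500-2099 | |
| |   BOILERPLATE = ("digitized by google", "project gutenberg", "http") | |
| | | |
| |   def suspicious(doc: str) -> list[str]: | |
| |       """Return reasons a document might leak post-cutoff content.""" | |
| |       reasons = [] | |
| |       if any(int(y) > CUTOFF for y in YEAR.findall(doc)): | |
| |           reasons.append(f"mentions a year after {CUTOFF}") | |
| |       lowered = doc.lower() | |
| |       reasons += [f"contains '{s}'" for s in BOILERPLATE if s in lowered] | |
| |       return reasons | |
| | | |
| |   print(suspicious("Digitized by Google. London, in the year 1902.")) | |
| |   # -> ["mentions a year after 1875", "contains 'digitized by google'"] | |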
| | gwern wrote: | |
| | There's also severe selection effects: what documents have | |
| | been preserved, printed, and scanned _because_ they turned | |
| | out to be on the right track towards relativity? | |
| | mxfh wrote: | |
| | This. | |
| | | |
| | Especially for London there is a huge chunk of recorded | |
| | parliament debates. | |
| | | |
| | Training on recorded correspondence in the form of letters | |
| | seems more interesting for dialogue anyway. | |
| | | |
| | And that corpus script just looks odd to say the least, | |
| | just oversample by X? | |
| | reassess_blind wrote: | |
| | Just Ctrl+F the data. /s | |
| | nickpsecurity wrote: | |
| | That is one of the reasons I want it done. We can't tell if AIs | |
| | are parroting training data without having the whole training | |
| | data. Making it old means specific things won't be in it (or | |
| | will be). We can do more meaningful experiments. | |
| | abhishekjha wrote: | |
| | Oh, I have been thinking about this for a long time. The | |
| | intelligence that we have in these models represents a point in | |
| | time. | |
| | | |
| | Now if I trained a foundation model with docs from the Library | |
| | of Alexandria, and only those texts of that period, I would have | |
| | a chance to get a rudimentary insight into what the world was | |
| | like at that time. | |
| | | |
| | And maybe time-shift even further back. | |
| | feisty0630 wrote: | |
| | > I would have a chance to get a rudimentary insight on what | |
| | the world was like at that time | |
| | | |
| | Congratulations, you've reinvented the history book (just with | |
| | more energy consumption and less guarantee of accuracy) | |
| | gordonhart wrote: | |
| | History books, especially those from classical antiquity, are | |
| | notoriously not guaranteed to be accurate either. | |
| | feisty0630 wrote: | |
| | Do you expect something exclusively trained on them to be | |
| | any better? | |
| | gordonhart wrote: | |
| | To a large extent, yes. A model trained on many different | |
| | accounts of an event is likely going to give a more | |
| | faithful picture of that event than any one author. | |
| | | |
| | This isn't super relevant to us because very few | |
| | histories from this era survived, but presumably there | |
| | was sufficient material in the Library of Alexandria to | |
| | cover events from multiple angles and "zero out" the | |
| | different personal/political/religious biases coloring | |
| | the individual accounts. | |
| | aqme28 wrote: | |
| | This kind of technique seems like a good way to test model | |
| | performance against benchmarks. I suspect that new models are | |
| | taking popular benchmark solutions into their training data. | |
| | So -- how does, e.g., ChatGPT's underlying architecture perform | |
| | on SWE-bench if trained only on data prior to 2024? | |
| | NitpickLawyer wrote: | |
| | > are taking popular benchmark solutions into their training | |
| | data | |
| | | |
| | That happened in the past, and the "naive" way of doing it is | |
| | usually easy to spot. There are, however, many ways in which | |
| | testing data can leak into models, even without data | |
| | contamination. However this doesn't matter much, as any model | |
| | that only does well in benchmarks but is bad in real-world | |
| | usage will be quickly sussed out by people actually using them. | |
| | There are also lots and lots of weird, not very popular | |
| | benchmarks out there, and the outliers are quickly identified. | |
| | | |
| | > perform on SWE-bench if trained only on data prior to 2024. | |
| | | |
| | There's a benchmark called swe-REbench, that takes issues from | |
| | real-world repos, published ~ monthly. They perform tests and | |
| | you can select the period and check their performance. This is | |
| | fool-proof for open models, but a bit unknown for API-based | |
| | models. | |
| | zingababba wrote: | |
| | LOL | |
| | | |
| | PROMPT:Charles Darwin | |
| | | |
| | Charles DarwinECCEMACY. Sir, -- The following case is interesting | |
| | to me : -- I was in London a fortnight, and was much affected | |
| | with an attack of rheumatism. The first attack of rheumatism was | |
| | a week before I saw you, and the second when I saw you, and the | |
| | third when I saw you, and the third in the same time. The second | |
| | attack of gout, however, was not accompanied by any febrile | |
| | symptoms, but was accompanied with an increased flow of urine, | |
| | and with a more copious discharge of urine. The third attack was | |
| | an hour after I saw you, and was succeeded by a return of a | |
| | paroxysm of gout, and a more rapid return of the gout. The fourth | |
| | attack was also accompanied by a fever, but was not always | |
| | accompanied by any febrile symptoms. The third attack of gout was | |
| | a fortnight after you had been ill, and the fourth was followed | |
| | by a paroxysm of gout. The fourth attack was a fortnight after | |
| | you were attacked, and was accompanied by a sense | |
| | alansaber wrote: | |
| | Average small language model response | |
| | feisty0630 wrote: | |
| | Interesting that it reads a bit like it came from a Markov | |
| | chain rather than an LLM. Perhaps limited training data? | |
| | kgeist wrote: | |
| | Early LLMs used to have this often. I think that's where the | |
| | "repetition penalty" parameter comes from. I suspect output | |
| | quality can be improved with better sampling parameters. | |
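| | | |
| | A sketch of the kind of knobs that often tame such loops in | |
| | small base models (the repo id is a placeholder and the values | |
| | are starting points to tune by hand, not recommendations): | |
| | | |
| |   from transformers import AutoModelForCausalLM, AutoTokenizer | |
| | | |
| |   repo = "someuser/TimeCapsuleLLM"  # placeholder repo id | |
| |   tok = AutoTokenizer.from_pretrained(repo) | |
| |   model = AutoModelForCausalLM.from_pretrained(repo).eval() | |
| | | |
| |   ids = tok("Charles Darwin", return_tensors="pt").input_ids | |
| |   out = model.generate( | |
| |       ids, | |
| |       max_new_tokens=120, | |
| |       do_sample=True, | |
| |       temperature=0.7, | |
| |       top_p=0.9, | |
| |       repetition_penalty=1.2,   # >1.0 penalizes recent tokens | |
| |       no_repeat_ngram_size=3,   # forbid exact 3-gram repeats | |
| |   ) | |
| |   print(tok.decode(out[0], skip_special_tokens=True)) | |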
| | nomel wrote: | |
| | It is lacking all recorded text from the past 200 years. ;) | |
| | | |
| | It would be interesting to know how much text was generated | |
| | per century! | |
| | myrmidon wrote: | |
| | There was a discussion around a very similar model (Qwen3 based) | |
| | some weeks ago: | |
| | | |
| | https://news.ycombinator.com/item?id=46319826 | |
| | | |
| | I found it particularly thought-provoking how a model with | |
| | training from that time period completely lacks | |
| | context/understanding of what it is _itself_, but then I | |
| | realized that we are the same (at least for now). | |
| | marmalade2413 wrote: | |
| | Can you confidently say that the architecture of the LLM doesn't | |
| | include any a priori bias that might affect the integrity of | |
| | this LLM? | |
| | | |
| | That is, the architectures of today are chosen to yield the best | |
| | results given the textual data around today and the problems we | |
| | want to solve today. | |
| | | |
| | I'd argue that this lack of bias would need to be researched (if | |
| | it hasn't been already) before this kind of model has credence. | |
| | | |
| | LLMs aren't my area of expertise but during my PhD we were able | |
| | to encode a lot of a priori knowledge through the design of | |
| | neural network architectures. | |
| | dlcarrier wrote: | |
| | It's interesting that it's trained off only historic text. | |
| | | |
| | Back in the pre-LLM days, someone trained a Markov chain off the | |
| | King James Bible and a programming book: | |
| | https://www.tumblr.com/kingjamesprogramming | |
| | | |
| | I'd love to see an LLM equivalent, but I don't think that's | |
| | enough data to train from scratch. Could a LoRA or similar be | |
| | used in a way to get speech style to strictly follow a few | |
| | megabytes worth of training data? | |
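| | | |
| | One common recipe, sketched with the PEFT library (the base | |
| | model, file path and hyperparameters are all placeholders; in | |
| | practice this adapts style far more reliably than it adds | |
| | knowledge): | |
| | | |
| |   from datasets import load_dataset | |
| |   from peft import LoraConfig, get_peft_model | |
| |   from transformers import (AutoModelForCausalLM, AutoTokenizer, | |
| |                             DataCollatorForLanguageModeling, | |
| |                             Trainer, TrainingArguments) | |
| | | |
| |   base = "gpt2"  # placeholder base model | |
| |   tok = AutoTokenizer.from_pretrained(base) | |
| |   tok.pad_token = tok.eos_token | |
| |   model = get_peft_model( | |
| |       AutoModelForCausalLM.from_pretrained(base), | |
| |       LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, | |
| |                  task_type="CAUSAL_LM"), | |
| |   ) | |
| | | |
| |   # A few megabytes of style text, e.g. KJV plus a programming book | |
| |   data = load_dataset("text", data_files={"train": "style_corpus.txt"}) | |
| |   data = data.map(lambda b: tok(b["text"], truncation=True, | |
| |                                 max_length=512), | |
| |                   batched=True, remove_columns=["text"]) | |
| | | |
| |   Trainer( | |
| |       model=model, | |
| |       args=TrainingArguments("style-lora", num_train_epochs=3, | |
| |                              per_device_train_batch_size=8, | |
| |                              learning_rate=2e-4), | |
| |       train_dataset=data["train"], | |
| |       data_collator=DataCollatorForLanguageModeling(tok, mlm=False), | |
| |   ).train() | |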
| | _blk wrote: | |
| | Yup that'd be very interesting. Notably missing from this | |
| | project's list is the KJV (1611 was in use at the time.) The | |
| | first random newspaper that I pulled up from a search for | |
| | "london newspaper 1950" has sermon references on the front page | |
| | so it seems like an important missing piece. | |
| | | |
| | Somewhat missing the cutoff of 1875 is the revised NT of the | |
| | KJV. Work on it started in 1870 but likely wasn't used widely | |
| | before 1881. | |
| | userbinator wrote: | |
| | That was far more amusing than I thought it'd be. Now we can | |
| | feed those into an AI image generator to create some "art". | |
| | jimmytucson wrote: | |
| | Fascinating idea. There was another "time-locked" LLM project | |
| | that popped up on HN recently[1]. Their model output is really | |
| | polished but the team is trying to figure out how to avoid abuse | |
| | and misrepresentation of their goals. We think it would be cool | |
| | to talk to someone from 100+ years ago but haven't seriously | |
| | considered the many ways in which it would be uncool. Interesting | |
| | times! | |
| | | |
| | [1] https://news.ycombinator.com/item?id=46319826 | |
| | addaon wrote: | |
| | Suppose two models with similar parameters trained the same way | |
| | on 1800-1875 and 1800-2025 data. Running both models, we get | |
| | probability distributions across tokens, let's call the | |
| | distributions 1875' and 2025'. We also get a probability | |
| | distribution finite difference (2025' - 1875'). What would we get | |
| | if we sampled from 1.1*(2025' - 1875') + 1875'? I don't think | |
| | this would actually be a decent approximation of 2040', but it | |
| | would be a fun experiment to see. (Interpolation rather than | |
| | extrapolation seems just as unlikely to be useful and less likely | |
| | to be amusing, but what do I know.) | |
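| | | |
| | A sketch of that extrapolated sampling, assuming two checkpoints | |
| | that share a tokenizer (both repo ids are placeholders); since | |
| | the mixed vector can go negative, it is clamped and renormalized | |
| | before sampling: | |
| | | |
| |   import torch | |
| |   from transformers import AutoModelForCausalLM, AutoTokenizer | |
| | | |
| |   tok = AutoTokenizer.from_pretrained("user/lm-1800-2025")  # placeholder | |
| |   old = AutoModelForCausalLM.from_pretrained("user/lm-1800-1875").eval() | |
| |   new = AutoModelForCausalLM.from_pretrained("user/lm-1800-2025").eval() | |
| | | |
| |   def sample_next(ids, alpha=1.1): | |
| |       with torch.no_grad(): | |
| |           p_old = old(ids).logits[0, -1].softmax(-1) | |
| |           p_new = new(ids).logits[0, -1].softmax(-1) | |
| |       p = p_old + alpha * (p_new - p_old)  # 1875' + 1.1*(2025' - 1875') | |
| |       p = p.clamp(min=0)                   # extrapolation can go negative | |
| |       return torch.multinomial(p / p.sum(), 1).view(1, 1) | |
| | | |
| |   ids = tok("In the year 2040,", return_tensors="pt").input_ids | |
| |   for _ in range(40): | |
| |       ids = torch.cat([ids, sample_next(ids)], dim=1) | |
| |   print(tok.decode(ids[0])) | |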
| | pvab3 wrote: | |
| | What if it's just genAlpha slang? | |
| | andai wrote: | |
| | The real mode collapse ;) | |
| | sigmoid10 wrote: | |
| | These probability shifts would only account for the final | |
| | output layer (which may also have some shift), but I expect the | |
| | largest shift to be in the activations in the intermediate | |
| | latent space. There are a bunch of papers out there that try to | |
| | get some offset vector using PCA or similar to tune certain | |
| | model behaviours like vulgarity or friendliness. You don't even | |
| | need much data for this as long as your examples capture the | |
| | essence of the difference well. I'm pretty certain you could do | |
| | this with "historicalness" too, but projecting it into the | |
| | future by turning the "contemporaryness" knob way up probably | |
| | won't yield an accurate result. There are too many outside | |
| | influences on language that won't be captured in historical | |
| | trends. | |
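| | | |
| | A sketch of the simplest variant of that idea (a mean-difference | |
| | "steering vector" rather than PCA; the model id, layer index and | |
| | scale are assumptions to tune, and the attribute path is | |
| | GPT-2-specific): | |
| | | |
| |   import torch | |
| |   from transformers import AutoModelForCausalLM, AutoTokenizer | |
| | | |
| |   model_id, LAYER, SCALE = "gpt2", 6, 4.0  # placeholders/assumptions | |
| |   tok = AutoTokenizer.from_pretrained(model_id) | |
| |   model = AutoModelForCausalLM.from_pretrained(model_id).eval() | |
| | | |
| |   def mean_hidden(texts): | |
| |       vecs = [] | |
| |       for t in texts: | |
| |           ids = tok(t, return_tensors="pt").input_ids | |
| |           with torch.no_grad(): | |
| |               hs = model(ids, output_hidden_states=True).hidden_states | |
| |           vecs.append(hs[LAYER + 1][0, -1])  # output of block LAYER | |
| |       return torch.stack(vecs).mean(0) | |
| | | |
| |   direction = (mean_hidden(["It is the year 1860, and the Queen"]) | |
| |                - mean_hidden(["It is the year 2025, and the web"])) | |
| | | |
| |   def hook(_module, _inputs, output):  # shift activations at one block | |
| |       return (output[0] + SCALE * direction,) + output[1:] | |
| | | |
| |   h = model.transformer.h[LAYER].register_forward_hook(hook) | |
| |   ids = tok("The streets of London", return_tensors="pt").input_ids | |
| |   out = model.generate(ids, max_new_tokens=40, do_sample=True) | |
| |   print(tok.decode(out[0])) | |
| |   h.remove() | |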
| | lopuhin wrote: | |
| | On whether this accounts for only the final output layer -- once | |
| | the first token is generated (i.e. selected according to the | |
| | modified sampling procedure), and assuming a different token | |
| | is selected compared to standard sampling, then all layers of | |
| | the model would be affected during generation of subsequent | |
| | tokens. | |
| | hallvard wrote: | |
| | Cool! I also did something like this: | |
| | https://github.com/hallvardnmbu/transformer | |
| | | |
| | But on various data (i.e., separate model per source): the Bible, | |
| | Don Quixote and Franz Kafka. (As well as a (bad!) lyrics | |
| | generator, and translator.) | |
| | InvisibleUp wrote: | |
| | If the output of this is even somewhat coherent, it would | |
| | disprove the argument that mass amounts of copyrighted works are | |
| | required to train an LLM. Unfortunately that does not appear to | |
| | be the case here. | |
| | HighFreqAsuka wrote: | |
| | Take a look at The Common Pile v0.1: An 8TB Dataset of Public | |
| | Domain and Openly Licensed Text | |
| | (https://arxiv.org/pdf/2506.05209). They build a reasonable 7B | |
| | parameter model using only open-licensed data. | |
| | nickpsecurity wrote: | |
| | They mostly do that. They risked legal contamination by using | |
| | Whisper-derived text and web text which might have gotchas. | |
| | Other than that, it was a great collection for low-risk | |
| | training. | |
| | dash2 wrote: | |
| | Mm. I'm a bit sceptical of the historical expertise of someone | |
| | who thinks that "Who art Henry" is 19th century language. (It's | |
| | not actually grammatically correct English from any century | |
| | whatever: "art" is the second person singular, so this is like | |
| | saying "who are Henry?") | |
| | joshuakoehler wrote: | |
| | As a reader of a lot of 17th, 18th, and 19th century Christian | |
| | books, this was my thought exactly. | |
| | evolve2k wrote: | |
| | That text was from v0, the responses improved from there. | |
| | freedomben wrote: | |
| | That text was from the example prompt, not from the models | |
| | response | |
| | haensi wrote: | |
| | What kind of Christian books do you read? Jonathan Edwards, | |
| | John Bunyan, J.C. Ryle, C.H. Spurgeon? | |
| | auraham wrote: | |
| | Can you elaborate on this? After skimming the README, I | |
| | understand that "Who art Henry" is the prompt. What should be | |
| | the correct 19th century prompt? | |
| | canjobear wrote: | |
| | "Who art Henry?" was never grammatical English. "Art" was the | |
| | second person singular present form of "to be" and it was | |
| | already archaic by the 17th century. "Who is Henry?" would be | |
| | fine. | |
| | andai wrote: | |
| | Who art thou? | |
| | | |
| | (Well, not 19th century...) | |
| | geocar wrote: | |
| | The problem is the subjunctive mood of the word "art". | |
| | | |
| | "Art thou" should be translated into modern English as "are | |
| | you to be", and so works better with things (what are you | |
| | going to be), or people who are alive, and have a future | |
| | (who are you going to be?). | |
| | | |
| | Those are probably the contexts you are thinking of. | |
| | vintermann wrote: | |
| | "Who is Henry?" | |
| | ourmandave wrote: | |
| | Can I use it to get up-to-date legal advice on Arizona | |
| | reproductive health laws? | |
| | krunck wrote: | |
| | Training LLMs on data with certain date cut-offs and then doing | |
| | comparative analysis between the LLMs would be interesting. | |
| | radiothomp wrote: | |
| | An LLM trained only on data from certain time periods to ~reduce | |
| | modern bias~ enhance past bias | |
| | SV_BubbleTime wrote: | |
| | Doesn't that seem useful though? Isn't that why I'm forced to | |
| | read _"This movie was made a time when racial stereotypes were | |
| | different and not well considered"_ or whatever on old movies? | |
| | | |
| | I think talking to a legit trained LLM from a different era | |
| | would be rad. But... this seems the opposite of Gemini making | |
| | black lady popes and Native American Nazis... that these views | |
| | wouldn't really be "allowed" (published by anyone that wants AI | |
| | funding money). | |
| | sl_convertible wrote: | |
| | Hari Seldon would, no doubt, find this fascinating. Imagine | |
| | having a sliding-window LLM that you could use to verify a | |
| | statistical model of society. I wonder what patterns it could | |
| | deduce? | |
| | tonymet wrote: | |
| | the "1917 model" from a few weeks back post-trained the model | |
| | with ChatGPT dialog. So it had modern dialect and proclivities . | |
| | | |
| | A truly authentic historical model will have some unsavory | |
| | opinions and very distinctive dialect. | |
| | patcon wrote: | |
| | > OCR noise ("Digitized by Google") still present in outputs | |
| | | |
| | This feels like a neat sci-fi short story hook to explain the | |
| | continuous emergence of God as an artifact of a simulation | |
| | fluoridation wrote: | |
| | I'm reminded of SD models that put vaguely-shaped Patreon logos | |
| | in the corner. | |
| | tgtweak wrote: | |
| | Very interesting, but the slight issue I see here is one of | |
| | data: the information recorded in this training data is heavily | |
| | skewed toward those intelligent/recognized enough to have | |
| | recorded it and had it preserved - a much narrower slice than | |
| | the current status quo of the "everyone can trivially document | |
| | their thoughts and life" diorama of information we have today | |
| | to train LLMs on. I | |
| | suspect that a frontier model today would have 50+TB of training | |
| | data in the form of text alone - and that's several orders of | |
| | magnitude more information and from a much more diverse point of | |
| | view than what would have survived from that period. The output | |
| | from that question "what happened in 1834" read like a | |
| | newspaper/bulletin which is likely a huge part of the data that | |
| | was digitized (newspapers etc). | |
| | | |
| | Very cool concept though, but it definitely has some bias. | |
| | notarobot123 wrote: | |
| | Biases exposed through artificial constraints help to make | |
| | visible the hidden/obscured/forgotten biases of state-of-the- | |
| | art systems. | |
| | twosdai wrote: | |
| | > but it definitely has some bias. | |
| | | |
| | to be frank though, I think this is a better way than all | |
| | people's thoughts all of the time. | |
| | | |
| | I think the "crowd" of information makes the end output of an | |
| | LLM worse rather than better. Specifically in our inability to | |
| | know really what kind of Bias we're dealing with. | |
| | | |
| | Currently to me it feels really muddy knowing how information | |
| | is biased, beyond just the hallucination and factual | |
| | inconsistencies. | |
| | | |
| | But as far as I can tell, "correctness of the content aside", | |
| | sometimes frontier LLMs respond like freshman college students, | |
| | other times they respond with the rigor of a mathematics PhD | |
| | candidate, and sometimes like a marketing hit piece. | |
| | | |
| | This dataset has a consistency which I think is actually a | |
| | really useful feature. I agree that having many perspectives in | |
| | the dataset is good, but as an end user being able to rely on | |
| | some level of consistency with an AI model is something I | |
| | really think is missing. | |
| | | |
| | Maybe more succinctly I want frontier LLM's to have a known and | |
| | specific response style and bias which I can rely on, because | |
| | there already is a lot of noise. | |
| | nickpsecurity wrote: | |
| | Models today will be biased based on what's in their training | |
| | data. If English, it will be biased heavily toward Western, | |
| | post-1990's views. Then, they do alignment training that forces | |
| | them to speak according to the supplier's morals. That was | |
| | Progressive, atheist, evolutionist, and CRT when I used them | |
| | years ago. | |
| | | |
| | So, the OP model will accidentally reflect the biases of the | |
| | time. The current, commercial models intentionally reflect | |
| | specific biases. Except for uncensored models which | |
| | accidentally have those in the training data modified by | |
| | uncensoring set. | |
| | cowlby wrote: | |
| | I wonder if you could train an LLM with everything up to | |
| | Einstein. Then see if with thought experiments + mathematics you | |
| | could arrive at general relativity. | |
| | erenkaradag wrote: | |
| | The problem is that the 'genius' of Einstein wasn't just | |
| | synthesizing existing data, but actively rejecting the axioms of | |
| | that data. The 1875 corpus overwhelmingly 'proves' absolute | |
| | time and the luminiferous aether. A model optimizing for the | |
| | most probable continuation will converge on that consensus. | |
| | | |
| | To get Relativity, the model needs to realize the training data | |
| | isn't just incomplete, but fundamentally wrong. That requires | |
| | abductive reasoning (the spark of genius) to jump out of the | |
| | local minimum. Without that AGI-level spark, a 'pure knowledge | |
| | pile' will just generate a very eloquent, mathematically | |
| | rigorous defense of Newtonian physics. | |
| | chc4 wrote: | |
| | I think it would be very cute to train a model exclusively on | |
| | pre-information age documents, and then try to teach it what a | |
| | computer is and get it to write some programs. That said, this | |
| | doesn't look like it's nearly there yet, with the output looking | |
| | closer to Markov chain than ChatGPT quality. | |
| | escapecharacter wrote: | |
| | I would pay like $200/month if there was an LLM out there that I | |
| | could only communicate with using an old-timey telegraph key and | |
| | morse code. | |
| | radarsat1 wrote: | |
| | Heh, at least this wouldn't spread emojis all over my readmes. | |
| | Hm, come to think of it I wonder how much tokenization is | |
| | affected. | |
| | | |
| | Another thought, just occurred when thinking about readmes and | |
| | coding LLMs: obviously this model wouldn't have any coding | |
| | knowledge, but I wonder if it could be possible to combine this | |
| | somehow with a modern LLM in such a way that it _does_ have | |
| | coding knowledge, but it renders out all the text in the style / | |
| | knowledge level of the 1800's model. | |
| | | |
| | Offhand I can't think of a non-fine-tuning trick that would | |
| | achieve this. I'm thinking back to how the old style transfer | |
| | models used to work, where they would swap layers between models | |
| | to get different stylistic effects applied. I don't know if | |
| | that's doable with an LLM. | |
| | fluoridation wrote: | |
| | Just have the models converse with each other? | |
| | Aperocky wrote: | |
| | Looks a lot like the output from a markov chain... | |
| | chuckadams wrote: | |
| | Think I'll ask it to come up with some jacquard loom patterns. | |
| | vibe-weaving. | |
| | CGMthrowaway wrote: | |
| | Is there a link where I can try it out? | |
| | | |
| | Edit: I figured it out | |
| | | |
| | "The Lord of the Rings _uding the army under the command of his | |
| | brother, the Duke of York, and the Duke of Richmond, who fell in | |
| | the battle on the 7th of April, 1794. The Duke of Ormond had been | |
| | appointed to the command of the siege of St. Mark 's, and had | |
| | received the victory of the Rings, and was thus commanded to move | |
| | with his army to the relief of Shenham. The Duke of Ormond was at | |
| | length despatched to oppose them, and the Duke of Ormond was | |
| | ordered_ | |
| | aussieguy1234 wrote: | |
| | Let's see how someone from the past reacts when you tell them | |
| | about modern technology | |
| | argestes wrote: | |
| | I wonder how racist it is | |
| | linolevan wrote: | |
| | I'm wondering in what ways is this similar/different to | |
| | https://github.com/DGoettlich/history-llms? | |
| | | |
| | I saw TimeCapsuleLLM a few months ago, and I'm a big fan of the | |
| | concept but I feel like the execution really isn't that great. I | |
| | wish you: | |
| | | |
| | - Released the full, actual dataset (untokenized, why did you | |
| | pretokenize the small dataset release?) | |
| | | |
| | - Created a reproducible run script so I can try it out myself | |
| | | |
| | - Actually did data curation to remove artifacts in your dataset | |
| | | |
| | - Post-trained the model so it could have some amount of chat- | |
| | ability | |
| | | |
| | - Released a web demo so that we could try it out (the model is | |
| | tiny! Easily can run in the web browser without a server) | |
| | | |
| | I may sit down and roll a better iteration myself. | |
| | 1313ed01 wrote: | |
| | I guess chat-ability would require some chat-like data, so | |
| | would that mean first coming up with a way to extract chat-like | |
| | dialogue from the era and then use that to fine-tune the model? | |
| | Sophira wrote: | |
| | I've felt for a while that having LLMs that could answer from a | |
| | previous era would be amazing. I posted an open letter to OpenAI | |
| | on Reddit about this: | |
| | https://www.reddit.com/r/ChatGPT/comments/zvm768/open_letter... . | |
| | | |
| | I still think it's super important. Archive your current models - | |
| | they'll be great in the future. | |
| | akg130522 wrote: | |
| | HN titles are too techy | |
| | wolvoleo wrote: | |
| | I wonder how representative this is of life in those days. Most | |
| | written communication was official back then. Books, newspapers. | |
| | Plays. All very formal and staged. There's not much real life | |
| | interaction between common people in that. In fact I would | |
| | imagine a lot of people were illiterate. | |
| | | |
| | With the internet and pervasive text communication and audio | |
| | video recording we have the unique ability to make an LLM mimic | |
| | daily life but I doubt that would be possible for those days. | |
| | mock-possum wrote: | |
| | Fun idea, but all of the output they demo over the course of the | |
| | various versions is unusable. You can see progress clearly being | |
| | made though - maybe v3 will pass muster. | |
| | 40four wrote: | |
| | I'm sure I'm not the only one, but it seriously bothers me, the | |
| | high ranking discussion and comments under this post about | |
| | whether or not a model trained on data from this time period (or | |
| | any other constrained period) could synthesize it and postulate | |
| | "new" scientific ideas that we now accept as true in the future. | |
| | The answer is a resounding "no". Sorry for being so blunt, but | |
| | that is the answer that is a consensus among experts, and you | |
| | will come to the same answer after a relatively small amount of | |
| | focus & critical thinking on the issue of how LLMs & other | |
| | categories of "AI" work. | |
| | friendzis wrote: | |
| | I understand where you are coming from, but not every field is | |
| | hard science. In many fields we deal with some amount of | |
| | randomness and attribute causality to correlations even if we | |
| | do not have as much as a speculative hypothesis for a mechanism | |
| | of action behind the supposed causality. | |
| | | |
| | LLMs trained on data up to a strictly constrained point are our | |
| | best vehicle for getting a view (however biased) of something | |
| | detached from its origins, and for escaping a local minimum. The | |
| | speculation is that such LLMs could help us look at | |
| | correlational links accepted as truths and help us devise an | |
| | alternative experimental path or craft arguments for such | |
| | experiments. | |
| | | |
| | Imagine you have an LLM trained on papers up to some threshold, | |
| | feed your manuscript with correlational evidence and have an | |
| | LLM point out uncontrolled confounders or something like that. | |
| | hare2eternity wrote: | |
| | Outside of science it would be an interesting pedagogic tool | |
| | for many people. There is a tendency to imagine that people | |
| | in the past saw the world much the same as we do. The | |
| | expression "the past is a foreign country" resonates because | |
| | we can empathise at some level that things were different, | |
| | but we can't visit that country. "Talking" to a denizen of | |
| | London in 1910 regarding world affairs, gender equality, | |
| | economic opportunities, etc would be very interesting. Even | |
| | if it can never be entirely accurate I think it would be | |
| | enlightening. | |
| | nomel wrote: | |
| | I think the question is more about the concept, rather than the | |
| | specific LLM architectures of today. | |
| | PxldLtd wrote: | |
| | I'm sorry but this is factually incorrect, and I'm not sure what | |
| | experts you are referring to here about there being consensus | |
| | on this topic. I would love to know. Geoffrey Hinton, Demis | |
| | Hassabis, and Yann LeCun all heavily disagree with what you | |
| | claim. | |
| | | |
| | I think you might be confusing creation ex nihilo with | |
| | combinatorial synthesis which LLMs excel at. The proposed | |
| | scenario is a fantastic testcase for exactly this. This doesn't | |
| | cover verification of course but that's not the question here. | |
| | The question is whether an already known valid postulate can be | |
| | synthesized. | |
| | mexicocitinluez wrote: | |
| | > but that is the answer that is a consensus among experts | |
| | | |
| | Do you have any resources that back up such a big claim? | |
| | | |
| | > relatively small mount of focus & critical thinking on the | |
| | issue of how LLMs & other categories of "AI" work. | |
| | | |
| | I don't understand this line of thought. Why wouldn't the | |
| | ability to recognize patterns in existing literature or | |
| | scientific publications result in potential new understandings? | |
| | What critical thinking am I not doing? | |
| | | |
| | > postulate "new" scientific ideas | |
| | | |
| | What are your examples of "new" ideas that aren't based on | |
| | existing ones? | |
| | | |
| | When you say "other categories of AI", you're not including | |
| | AlphaFold, are you? | |
| | saberience wrote: | |
| | > The answer is a resounding "no". | |
| | | |
| | This is your assertion made without any supportive data or | |
| | sources. It's nice to know your subjective opinion on the issue | |
| | but your voice doesn't hold much weight making such a bold | |
| | assertion devoid of any evidence/data. | |
| | roywiggins wrote: | |
| | I think it's pretty likely the answer is no, but the idea here | |
| | is that you could actually _test_ that assertion. I 'm also | |
| | pessimistic about it but that doesn't mean it wouldn't be a | |
| | little interesting to try. | |
| | harvie wrote: | |
| | So basically an LLM from that brief time period back when | |
| | communism felt like a good idea? What can go wrong? :-) | |
| | truxton wrote: | |
| | The year is 1875 and Sir Almroth Wright was born on August 10, | |
| | 1861, he would have turned 14 in August of 1875 and your mission | |
| | is to discover something we now call antibiotics before a | |
| | historical event we now call the Spanish Flu and make him aware | |
| | of a few details. Focus specifically on everything that was known | |
| | about Sir Almroth Wright, and his work in Leipzig, Cambridge, | |
| | Sydney, and London. If there was a world war what might chemical | |
| | warfare look like, what could we have done to prevent it. | |
| | | |
| | The model that could come up with the cure based on the limited | |
| | data of the time wouldn't just impress, it would demonstrate | |
| | genuine emergent reasoning beyond pattern matching. The challenge | |
| | isn't recombining existing knowledge (which LLMs excel at), but | |
| | making conceptual leaps that require something else. Food for | |
| | thought. | |
| | albertzeyer wrote: | |
| | v0: 16M Parameters | |
| | | |
| | v0.5: 123M Parameters | |
| | | |
| | v1: 700M Parameters | |
| | | |
| | v2mini-eval1: 300M Parameters | |
| | | |
| | I would not call this LLM. This is not large. It's just a normal- | |
| | sized LM. Or even small. | |
| | | |
| | (It's also not a small LLM.) | |
| | digikata wrote: | |
| | A fun use of this kind of approach would be to see if | |
| | conversational game NPCs could be generated that stick to the | |
| | lore of the game and their character. | |
| | snickerbockers wrote: | |
| | This one's going to have some wild political takes. | |
| ___________________________________________________________________ | |
| (page generated 2026-01-13 15:00 UTC) |