I did a similar project, but using 3D fractals I found on Shadertoy and feeding them into ViTs. They are extremely simple iterative functions that produce a ton of scene-like complexity.
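To make "extremely simple iterative functions" concrete, here is a minimal sketch of one well-known 3D fractal, the power-8 Mandelbulb, written as an escape-time iteration. This is an assumption about the kind of fractal meant; the comment above does not say which Shadertoy shaders were used, and the names `mandelbulb_escape` and `render_slice` are made up for illustration.

```python
import math

def mandelbulb_escape(cx, cy, cz, power=8, max_iter=16, bailout=2.0):
    """Count iterations of the Mandelbulb map before the point escapes.

    Hypothetical helper for illustration, not code from the original project.
    """
    x = y = z = 0.0
    for i in range(max_iter):
        r = math.sqrt(x * x + y * y + z * z)
        if r > bailout:
            return i
        # Convert to spherical coordinates (clamp guards against rounding).
        theta = math.acos(max(-1.0, min(1.0, z / r))) if r > 0.0 else 0.0
        phi = math.atan2(y, x)
        # Raise the point to `power` in spherical form, then add the constant.
        rn = r ** power
        x = rn * math.sin(power * theta) * math.cos(power * phi) + cx
        y = rn * math.sin(power * theta) * math.sin(power * phi) + cy
        z = rn * math.cos(power * theta) + cz
    return max_iter

def render_slice(size=32, z=0.0, extent=1.5):
    """Sample escape counts on a size x size grid of the plane at height z."""
    step = 2.0 * extent / (size - 1)
    return [
        [mandelbulb_escape(-extent + col * step, -extent + row * step, z)
         for col in range(size)]
        for row in range(size)
    ]

if __name__ == "__main__":
    # Crude ASCII view: '#' marks points that never escaped.
    for row in render_slice(size=32):
        print("".join("#" if count == 16 else "." for count in row))
```

A 2D slice like this could be stacked or ray-marched into images for a vision model, but how the frames were actually produced and fed to the ViTs is not stated above.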
I have a pet theory that the developing visual cortex is linked to some kind of mechanism like this. You just need proteins that create some sort of resonating signal that feeds into the neurons as they grow (obviously this is hand-wavy), but similar feedback loops guide nervous system growth in zebrafish, for example.
Reminds me of "Universal pre-training by iterated random computation" https://arxiv.org/pdf/2506.20057, though with a somewhat less formal approach.
I wonder if there is a closed-form solution for those kinds of initialization methods (call them pre-training if you wish). A solution that would allow attention heads to detect a variety of diverse patterns, yet more structured than random init.
Neural cellular automata are interesting because they shift learning from “predict tokens” to “model state evolution.” That feels much closer to a transition-based view of systems, where structure emerges from repeated local updates (transitions) rather than being encoded explicitly.
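As a rough illustration of "structure emerges from repeated local updates", here is a minimal sketch of a classic one-dimensional cellular automaton. It uses a fixed lookup-table rule (Rule 110) rather than a neural one; in a neural cellular automaton the table would be replaced by a small learned network, which this sketch does not attempt. The function names `step` and `run` are illustrative only.

```python
def step(cells, rule=110):
    """Apply one synchronous update of an elementary CA with wraparound edges."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        # The 3-cell neighborhood indexes one bit of the 8-bit rule number.
        idx = (left << 2) | (center << 1) | right
        out.append((rule >> idx) & 1)
    return out

def run(cells, steps, rule=110):
    """Return the full history: the initial state plus `steps` successors."""
    history = [list(cells)]
    for _ in range(steps):
        cells = step(cells, rule)
        history.append(cells)
    return history

if __name__ == "__main__":
    width = 64
    start = [0] * width
    start[-1] = 1  # single live cell at the right edge
    for row in run(start, 30):
        print("".join("#" if c else "." for c in row))
```

Each cell only ever sees its two neighbors, yet the printed history shows larger-scale structure that is nowhere written down in the rule itself.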
I'm working on a theoretical/computational framework, the Functional Universe, intended for modeling physical reality as functional state evolution. I would say it could be used to replicate your CA process. I won't link it here, to signal my good faith in discussing this issue; it's on my GH.
From https://voxleone.github.io/FunctionalUniverse/pages/executiv..., "The Functional Universe models reality as a history built from irreversible transitions, with time emerging from the accumulation of causal commitments rather than flowing as a primitive parameter." Is it fair to say that time is simply a way of organizing a log file on a dynamic reality? I interpreted "composition of transitions" as a system of processes. I think the hard modeling problem is interpreting interactions between processes: transitions don't simply compose, and observed transitions may be confounded views of more complex transitions. I gather NCA would be granular enough to overcome that.
“The long-term vision is: foundation models that acquire reasoning from fully synthetic data, then learn semantics from a small, curated corpus of natural language. This would help us build models that reason without inheriting human biases from inception.”
I don’t think that assumption is being made. Why do you think that? In terms of metaphor, training a model could be considered both the knowledge acquired after birth and the evolution that preceded it. But I don’t think it’s particularly useful to stay thinking in metaphors.
But is that correct? I think organisms also come with a partial, built-in understanding of nature at birth.
I agree. Most organisms are quite pre-trained: they have “instincts” and natural behaviors.
E.g. newly hatched turtles know to crawl towards the ocean immediately when they hatch. They don’t learn that on their way.
It seems to me that most lifeforms come into this world pre-trained.