"They're made out of weights"

(maxleiter.com)

89 points | by MaxLeiter 3 hours ago

5 comments

kami23 33 minutes ago
This read like poetry to me. Thank you for sharing it.
I have a linguistics background, and a lot of my philosophizing lately has been on whether the emergent abilities of LLMs come, deep down, from a mechanism similar to the one that creates our consciousness.
For a little while I was working on linguistics-based evals for a Kaggle competition. My challenge was whether I could mask things well enough to not trigger the model's internal state for certain phenomena, and that sent me down a rabbit hole that I'm still exploring.
This story resonated with a lot of the questions that come out of trying to find a good, solid answer to the "what is consciousness" question. The one it triggered for me is: Is our perception of time just a slow thread in the giant GPU the universe is running on? Or more generally, what is time? That's a fun YouTube rabbit hole if you ever need one.
- kridsdale3 6 minutes ago
  Time is entropy unfolding as things with nonzero temperature do what they do.
  Psychological time is your own weights being updated in response to stimuli and internal processing.
  When there isn't anything interesting happening, no updates are needed, and you don't perceive much time. That's why there's a logarithmic effect on the "density" of time as you age.
- eszed 19 minutes ago
  Yeah, I currently suspect that consciousness is an emergent property. I read elsewhere (it's somewhere in my HN history, I'm sure) that the biggest compute we can currently muster is something like three or four orders of magnitude away from the number of neurons / connections (or their analog) that our brains have, so it may be a while until we can expect to see it in our machines. But if the emergent-phenomenon hypothesis is correct, then we eventually will. I'm more scared than pleased by the prospect, but there you are.
noosphr 1 hour ago
It's not often I see something that's fractally wrong but here we are.
There is a dictionary, it's called the tokenizer.
There are grammar rules, they are just very weak because the structure of human language is generally quite weak. When presented with languages which have strong consistent grammars the weights are very easily interpretable as a grammar: https://arxiv.org/abs/2201.02177
The point of the original short story is that the computational substrate doesn't matter when you have Turing completeness. This one seems to think that you don't need structure and interpretability just because you change substrates.
- glitchc 8 minutes ago
  > fractally wrong
  fractally or factually? You mean wrong on so many levels you need a fractal to capture them? If so, what if you could use a neural network instead?
- dpark 23 minutes ago
  A tokenizer is not a dictionary any more than an alphabet is a dictionary.
  - noosphr 15 minutes ago
    The Chinese alphabet is very much a dictionary. All the major tokenizers are far larger.
- benlivengood 59 minutes ago
  I don't think the grokking paper is a great argument for the difference between weights and meat. E.g. https://en.wikipedia.org/wiki/Cortical_Labs learning to play Pong.
  The tokenizer is, at best, a sensory mechanism as evidenced by 1) the random generation of the tokenization scheme, and 2) vastly different tokenization schemes produce virtually identical behavior. It'd be like if Noah Webster threw a bunch of movable type into a bucket (breaking some words in half) and then drew randomly to make the first English dictionary.
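To make the dictionary-versus-alphabet disagreement concrete, here is a minimal sketch of what a subword tokenizer does at lookup time. The vocabulary `TOY_VOCAB` and the helper `tokenize` are hypothetical and hand-written for illustration; real tokenizers (for example, byte-pair encoding) learn their vocabulary from corpus statistics and are far larger. The sketch only shows that the vocabulary maps string pieces to integer ids, carries no definitions, and can split words in the middle.

```python
def tokenize(text, vocab):
    """Greedy longest-match split of text into pieces found in vocab."""
    ids = []
    i = 0
    max_len = max(len(piece) for piece in vocab)
    while i < len(text):
        # Try the longest candidate piece first, then shrink.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                ids.append(vocab[piece])
                i += length
                break
        else:
            raise ValueError(f"no vocabulary piece matches at position {i}")
    return ids


# Hypothetical toy vocabulary: a few multi-character pieces plus single letters.
pieces = ["diction", "ary", "token", "izer", " ",
          "a", "d", "i", "c", "t", "o", "n", "r", "y", "k", "e", "z"]
TOY_VOCAB = {piece: idx for idx, piece in enumerate(pieces)}

ids = tokenize("tokenizer dictionary", TOY_VOCAB)
print(ids)  # [2, 3, 4, 0, 1]
```

Both words come out as two pieces each ("token" + "izer", "diction" + "ary"), and the ids say nothing about meaning; whatever the model knows about the words has to live downstream of this lookup.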
  - noosphr 41 minutes ago
    I'm kind of stunned that someone is using my work to tell me I'm wrong. I wrote the code for the dish brain pong and encoding information was a huge part of what that experiment was about.
    So when I say that the grok paper and the pong paper fundamentally agree, I have some idea of what I'm talking about.
    - benlivengood 18 minutes ago
      I might have misunderstood the point you are making. I read the original article as "weights are like meat", and so I'm confused by what you consider fractally wrong.
      - noosphr 4 minutes ago
        The point is that when the rules the model learns are simple enough, they stop being spread out over all the layers and become as easily interpretable as any expert system.
        It's just that the rules we feed into the model are extremely poorly defined, and we end up with a soup of disjoint rules smeared all across the weights.
        This isn't a feature of the models. It's a feature of the training set.
    - js2 17 minutes ago
      https://news.ycombinator.com/item?id=35079
    - ufocia 21 minutes ago
      Hubris much? I don't see a necessary contradiction in using someone's work to disprove another aspect of that same person's work.
- throw310822 52 minutes ago
  > There are grammar rules
  And they're made out of weights.
turtleyacht 2 hours ago
Numbers that dream.
CSSer 1 hour ago
It works until they get to the sentience part. Neat idea!
- margalabargala 1 hour ago
  Even there it works a bit.
  > These models are the only other things we've ever met that can hold a conversation, and they're made out of weights
  Is a fair point.
  - RodgerTheGreat 52 minutes ago
    Not especially. Depending on where you set your standards for "holding a conversation", you can satisfy the requirement with a classical Markov chatterbot, a well-trained parrot, a copy of ELIZA, or a telemarketer flowchart drawn on a sheet of paper. Only the Markov bot is made out of "weights" in the sense of a statistical model.
    Parrots are intelligent animals, albeit with a limited capacity for vocabulary and syntax compared to a human, and Eliza and the flowchart are made out of explicitly encoded rules and conversational tactics.
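For readers who have not seen one, a classical Markov chatterbot can be sketched in a few lines. This is a minimal bigram version under stated assumptions: the function names `train_bigram_model` and `generate` and the tiny corpus are made up for illustration, and the "weights" here are nothing more than next-word counts.

```python
import random
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count word -> next-word transitions; the counts are the model's 'weights'."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, start, max_words=10, seed=0):
    """Walk the chain, sampling each next word in proportion to its count."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    word = start
    output = [word]
    for _ in range(max_words - 1):
        followers = counts.get(word)
        if not followers:
            break  # dead end: this word was never followed by anything
        choices = list(followers.keys())
        weights = list(followers.values())
        word = rng.choices(choices, weights=weights, k=1)[0]
        output.append(word)
    return " ".join(output)


# Hypothetical toy corpus.
corpus = "they are made out of meat they are made out of weights"
model = train_bigram_model(corpus)
reply = generate(model, "they", max_words=6)
print(reply)
```

With this corpus the first five words are forced ("they are made out of"), and only the last word is sampled, from "meat" or "weights" with equal counts. That is the whole statistical model, which is why its conversations stay so limited.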
    - margalabargala 10 minutes ago
      The quality of "conversation" you can have with everything on your list is highly limited, and is categorically different from the sort of conversation you are able to have with any modern AI.
oofbey 39 minutes ago
I love this. For anybody not getting the joke, it’s riffing on the classic 1990s short story “They’re made out of meat.”
https://web.mit.edu/people/dpolicar/writing/prose/text/think...
- tom_ 13 minutes ago
  The original author is mentioned in the second sentence of the linked article, and then again in the third sentence, along with a link to the original story.