Isn't this just because LLMs use the object concept representation data from actual humans?
The object concept representation is an emergent property of these networks. Basically, the network learns stable associations across different modalities and forms an abstract concept of the object that ties them together.
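To make "stable associations across modalities" concrete, here's a rough sketch using CLIP through Hugging Face transformers. It's not the model or method from the study being discussed, just an illustration of the general idea that the same concept ends up with aligned representations whether it arrives as text or as an image; the image path and captions are placeholders I made up.

```python
# Illustration only: cross-modal concept alignment with CLIP.
# "apple.jpg" is a placeholder path; use any photo of an apple.
import torch
import torch.nn.functional as F
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("apple.jpg")  # placeholder image of an apple
texts = ["a photo of an apple", "a photo of a car", "a photo of a dog"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**inputs)

# Cosine similarity between the image embedding and each caption embedding.
img = F.normalize(out.image_embeds, dim=-1)
txt = F.normalize(out.text_embeds, dim=-1)
print((img @ txt.T).squeeze().tolist())
```

If the representations really are aligned, the apple photo scores highest against the apple caption: one abstract "apple" sitting behind two very different input modalities.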
But it's emerging from data produced by humans, which means our object concept representation is already in the data. This isn't random data, after all; it comes from us. It seems like the LLMs are just regurgitating what we feed them.
What this shows, I think, is how deeply human structure is embedded in the data we feed to LLMs. They're models built on human data, so they produce human-like outputs.
In your opinion, is this a good thing, a bad thing, or is it just a curiosity that LLMs currently have?
It's a good thing in the sense that the models are forming stable representations of objects across modalities. It means there's potential for extending the LLM approach toward building actual world models in the future.