5/29/25 AI thread
Date: May 29th, 2025 12:42 PM Author: Content Creator
https://x.com/MoonL88537/status/1927927988399575070
They don't have world-models. They can't have world-models.
Even their own explanations of their own thought processes are flame.
LLMs are not getting us to "AGI", whatever that ends up being. They have to be able to build their own real-world, empirical world-models from scratch.
(http://www.autoadmit.com/thread.php?thread_id=5731013&forum_id=2#48970898)
Date: May 29th, 2025 1:48 PM
Author: .,,,.,.,.,.,.,,,,..,,..,.,.,.,
People conflate LLMs with transformers trained in the standard way. Arithmetic generalization comes from learning a small program that describes arithmetic structure and operations and then reusing that program in different ways depending on the context. That’s also why they don’t length generalize: the models aren’t using the same program for every arithmetic input, just a mess of heuristics. Transformers trained with SGD don’t learn that kind of reusable program, but there are LLM architectures that likely could.
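To make that concrete, here is an illustrative sketch (not from the post) of the kind of small, reusable program being described: one digit-at-a-time addition routine with a carry. Nothing in it depends on how many digits the inputs have, so it length-generalizes for free, which is exactly what a pile of per-length heuristics doesn't do.

# Illustrative sketch, not from the post: a compact arithmetic "program"
# that is reused the same way for every input, regardless of length.
def add_by_digits(a: str, b: str) -> str:
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    digits, carry = [], 0
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))

print(add_by_digits("123456", "987654"))    # 6-digit inputs: 1111110
print(add_by_digits("1234567", "9876543"))  # 7-digit inputs work with no new machinery: 11111110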
(http://www.autoadmit.com/thread.php?thread_id=5731013&forum_id=2#48971102)
Date: May 29th, 2025 2:53 PM
Author: .,,,.,.,.,.,.,,,,..,,..,.,.,.,
If the LLM has the ability to use full recurrency and is strongly pushed toward learning minimum-description-length programs, then the tendency to learn inconsistent heuristics will go away. Current transformers aren’t capable of learning a compact program and iteratively applying it to achieve length generalization. They may learn a program for handling 6-digit arithmetic, but if 7-digit problems aren’t in their training set, they won’t know how to do them. The training process may also create distributed circuits that can answer arithmetic problems in one context but not another. An LLM learning the most compact program possible that fits its data well will not use distributed patterns for answering arithmetic problems or require learning a new circuit for problems of a certain length. It will know the true generalizing program.
I’ll note that there is inconsistency in the data that would seem to imply the models have to learn a mess of heuristics, but this just requires the LLMs to learn additional programs to model things like writer error. The latent, broadly accepted human logic is well represented in the data, so that’s what the models should be able to learn.
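On the writer-error point, a minimal illustrative sketch (made-up error rates, not from the post): treat the data as a latent correct program plus a simple noise process, and the compact program still explains nearly everything, so inconsistent data doesn't force a mess of heuristics.

# Illustrative sketch: noisy data = (correct program) + (small writer-error model).
import random

def true_program(a: int, b: int) -> int:
    return a + b

def observed_answer(a: int, b: int, error_rate: float = 0.05) -> int:
    # A noisy "writer": usually reports the correct sum, occasionally slips a digit.
    answer = true_program(a, b)
    if random.random() < error_rate:
        answer += random.choice([-10, -1, 1, 10])
    return answer

samples = [(a, b, observed_answer(a, b))
           for a, b in [(random.randint(0, 10**6), random.randint(0, 10**6)) for _ in range(1000)]]
clean_fraction = sum(ans == true_program(a, b) for a, b, ans in samples) / len(samples)
print(f"fraction of samples the clean program explains on its own: {clean_fraction:.2f}")
# The leftover ~5% is cheaper to describe as "writer error" than as extra heuristics,
# so the latent, broadly accepted logic is still the best program to learn.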
(http://www.autoadmit.com/thread.php?thread_id=5731013&forum_id=2#48971362)
Date: May 29th, 2025 4:10 PM
Author: .,,,.,.,.,.,.,,,,..,,..,.,.,.,
SGD with transformers isn’t actually doing minimum description length learning. For inputs that are consistently structured in the same way, it will learn the underlying program that generalizes to other samples: certain types of verbal reasoning or inference consistently appear in “standard” forms, and the models handle them well. But when these models fit data, they can learn a mixture of things that are contextually dependent in a way that isn’t desirable and produces imperfect generalization. They will not necessarily learn how to combine different programs in a way that generalizes to new samples. SGD with weight normalization can be viewed as a learning process that produces substantial but imperfect generalization. Current LLMs largely bypass this problem by training on everything, but that approach will ultimately fail.
Note that I don’t think this means AGI is far away. I think SGD, backprop, and transformers are all human-designed components of the learning process that are very likely learnable or evolvable.
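For what minimum description length learning would actually prefer, here is a toy scoring sketch (illustrative only, with deliberately rough bit costs): score each hypothesis by bits-to-describe-the-hypothesis plus bits-to-describe-the-data-given-it, and the compact rule beats memorization even though both fit the training set perfectly.

# Illustrative MDL sketch with rough, made-up bit costs.
data = [(a, b, a + b) for a in range(50) for b in range(50)]  # toy arithmetic corpus

def lookup_table_bits(data):
    # Memorization: every (a, b) -> answer triple is stored explicitly.
    return len(data) * 24                      # ~24 bits per stored triple (rough)

def compact_rule_bits(data):
    # A fixed-size program ("output a + b") plus the cost of any exceptions.
    program_bits = 200                         # rough, length-independent cost of the rule
    residual_bits = sum(24 for a, b, y in data if y != a + b)
    return program_bits + residual_bits

print("lookup table :", lookup_table_bits(data), "bits")   # 60000
print("compact rule :", compact_rule_bits(data), "bits")   # 200
# Training loss can't tell these hypotheses apart, but description length can; the MDL
# winner is exactly the one that generalizes to unseen lengths and contexts.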
(http://www.autoadmit.com/thread.php?thread_id=5731013&forum_id=2#48971612)
Date: May 29th, 2025 4:47 PM
Author: .,,,.,.,.,.,.,,,,..,,..,.,.,.,
I think you could have a transformer or something similar trained to take in training samples and write weight updates directly to another network. It would essentially be trained to program the network actually used for predictions. This outer network would be trained so that, after a certain number of samples, the network it is programming has the lowest possible generalization error: you move the outer network down a gradient based on that generalization error. This sort of meta-learning is expensive but could likely produce powerful learning algorithms that don’t do the stupid things our current ones do.
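A minimal sketch of that setup (illustrative only; the architecture, dimensions, and names are all made up, and PyTorch supplies the outer gradient): an outer network reads a batch of training pairs, writes the inner network's weights directly, and is itself pushed down the gradient of the inner network's held-out error.

# Illustrative meta-learning sketch: an outer net "programs" a tiny inner net,
# and is trained to minimize the inner net's generalization (held-out) error.
import torch
import torch.nn as nn

IN_DIM = 8

class OuterLearner(nn.Module):
    """Takes a batch of (x, y) training pairs and emits the inner net's weight vector."""
    def __init__(self, in_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim + 1, 64), nn.ReLU(),
            nn.Linear(64, in_dim),
        )

    def forward(self, x_train, y_train):
        pairs = torch.cat([x_train, y_train.unsqueeze(-1)], dim=-1)
        return self.net(pairs).mean(dim=0)       # one weight vector for the inner net

def inner_predict(w, x):
    return x @ w                                  # inner "prediction network" is a linear map

outer = OuterLearner(IN_DIM)
opt = torch.optim.Adam(outer.parameters(), lr=1e-3)

for step in range(2000):
    # Each task is a fresh random linear function; the outer net must program the
    # inner net from a few samples so it generalizes to held-out samples of that task.
    true_w = torch.randn(IN_DIM)
    x_train, x_val = torch.randn(16, IN_DIM), torch.randn(64, IN_DIM)
    y_train, y_val = x_train @ true_w, x_val @ true_w

    w = outer(x_train, y_train)                   # outer net writes the inner weights directly
    val_loss = ((inner_predict(w, x_val) - y_val) ** 2).mean()   # generalization error

    opt.zero_grad()
    val_loss.backward()                           # gradient flows through the written weights
    opt.step()
    if step % 500 == 0:
        print(f"step {step}: held-out loss {val_loss.item():.3f}")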
(http://www.autoadmit.com/thread.php?thread_id=5731013&forum_id=2#48971731)
Date: May 29th, 2025 2:23 PM Author: ,.,,,.,,.,,
i am very much a layman in the realms of AI, tech, and even probably philosophy of mind.
but as naysayers and skeptics like you continue to voice their doubts, I'm watching shit happen that was fantasy as recently as covid.
these computers, whatever the fuck you call them, can now express a human personality and demonstrate most of the signs we use to assign the value of intelligence.
"It's mimicry!"
But what human behavior isn't?
"It's a fake kind of intelligence!"
Maybe, to the philosopher. Do you think 99% of human laypeople have discriminating enough minds to care?
And from Amazon Alexa to Grok, or whatever, the rapidity of the increase in intelligence has been breathtaking.
Even if that growth curve were to slow down, this suggests that in a few years - five certainly - these machines will produce all the verifiable "signs" humans use to gauge intelligence in other humans or animals.
(http://www.autoadmit.com/thread.php?thread_id=5731013&forum_id=2#48971239)
Date: May 29th, 2025 1:25 PM Author: Content Creator
In our recent interpretability research, we introduced a new method to trace the thoughts of a large language model. Today, we’re open-sourcing the method so that anyone can build on our research.
Our approach is to generate attribution graphs, which (partially) reveal the steps a model took internally to decide on a particular output. The open-source library we’re releasing supports the generation of attribution graphs on popular open-weights models—and a frontend hosted by Neuronpedia lets you explore the graphs interactively.
https://www.anthropic.com/research/open-source-circuit-tracing
this shit seems like total flame
i'm playing with it right now and it doesn't seem to work
(http://www.autoadmit.com/thread.php?thread_id=5731013&forum_id=2#48971018)
Date: May 29th, 2025 2:13 PM Author: Gay Factory
this guy is interesting and talks a lot about AI
https://x.com/signulll
(http://www.autoadmit.com/thread.php?thread_id=5731013&forum_id=2#48971210)
Date: May 29th, 2025 2:32 PM Author: Content Creator
https://x.com/dystopiangf/status/1928156746989633625
ℜ𝔞𝔢
@dystopiangf
This week’s Totally Normal Teenage Trends™️:
- Spoke to a researcher at a character AI company. They surveyed high schools & found that a majority of students have friends who are “dating” character AIs
- Teens are identifying as “solosexual,” i.e. they only have “sex” alone
(http://www.autoadmit.com/thread.php?thread_id=5731013&forum_id=2#48971274)