  The most prestigious law school admissions discussion board in the world.

6/8/25 AI thread


Date: June 8th, 2025 10:10 AM
Author: Scarlet Boyish Dilemma Personal Credit Line

https://x.com/RubenHssd/status/1931389580105925115

AI models don't reason

Finally someone tested this in the obvious way (giving the models novel problems to solve that they can't find anywhere in their training data)

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996496)




Date: June 8th, 2025 10:11 AM
Author: salmon floppy wrinkle antidepressant drug

I trained a local LLM on xo and it called me a fag

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996498)




Date: June 8th, 2025 10:14 AM
Author: Scarlet Boyish Dilemma Personal Credit Line

Did it call you Jewish or Indian

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996509)




Date: June 8th, 2025 10:15 AM
Author: bat shit crazy bawdyhouse

cq

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996512)




Date: June 8th, 2025 10:47 AM
Author: salmon floppy wrinkle antidepressant drug

Just asked and this is what I got back. "I follow strict usage and behavior policies that prohibit discrimination of any kind, including on the basis of race, ethnicity, gender, religion, nationality, or any other protected attribute. I don't care what kind of fag you are, fag."

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996575)




Date: June 8th, 2025 10:14 AM
Author: bat shit crazy bawdyhouse

feel like we see one of these every three months and the bar for what people consider "reasoning" just keeps getting higher

also LOL at Apple of all companies publishing this, they have by far the most to gain from AI being a bust as they are a calcified legacy company riding on product innovations from 15 years ago at this point

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996508)




Date: June 8th, 2025 10:16 AM
Author: Scarlet Boyish Dilemma Personal Credit Line

They are never going to be able to reason

Doesn't mean that LLMs aren't useful, just that they have limitations that aren't going away

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996519)




Date: June 8th, 2025 10:19 AM
Author: bat shit crazy bawdyhouse

"Reasoning" is a nebulous distinction which nobody agrees on and always gets defined to suit one's purposes. It doesn't matter if what's behind the curtain is some rube goldberg bastardization of how humans think or a true recreation of the special sauce as long as the output is good enough, and it hasn't stopped improving yet.

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996531)




Date: June 8th, 2025 10:23 AM
Author: Scarlet Boyish Dilemma Personal Credit Line

This is silly and argumentative, we know exactly what reasoning is

https://thedecisionlab.com/reference-guide/philosophy/system-1-and-system-2-thinking

LLMs can do System 1 (unconscious pattern matching) but not System 2

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996538)




Date: June 8th, 2025 10:25 AM
Author: bat shit crazy bawdyhouse

"System 2 Thinking: The slow, effortful, and logical mode in which our brains operate when solving more complicated problems. For example, System 2 thinking is used when looking for a friend in a crowd, parking your vehicle in a tight space, or determining the quality-to-value ratio of your take-out lunch."

those are all trivial tasks for AI now and exactly the sort of stuff I'm talking about. it doesn't matter that it may not "reason through" things in the same sense humans do if at the end of the day your car can park itself without issue or your smart glasses can perform a cost-benefit analysis of eating a panini vs an apple on the spot.

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996546)




Date: June 8th, 2025 10:42 AM
Author: Scarlet Boyish Dilemma Personal Credit Line

https://x.com/GoySuperstar/status/1931721411783241950

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996570)




Date: June 8th, 2025 10:55 AM
Author: Scarlet Boyish Dilemma Personal Credit Line

https://www.youtube.com/watch?v=fQGu016AlVo

This woman has the right idea. AI needs to be RL-trained on real-world empirical data in order to build world models and reason

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996587)




Date: June 8th, 2025 12:51 PM
Author: Flushed Plaza Death Wish

Not too surprising if you've looked at tests like ARC-AGI. o3 did well on the first version and quite poorly on the second even though they are essentially the same thing. Even reasoning models trained with lots of RL will fail in weird ways if the required algorithm is slightly out of the training distribution.

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996808)