  The most prestigious law school admissions discussion board in the world.

6/8/25 AI thread


Date: June 8th, 2025 10:10 AM
Author: Scarlet Boyish Dilemma Personal Credit Line

https://x.com/RubenHssd/status/1931389580105925115

AI models don't reason

Finally someone tested this in the obvious way (giving the models novel problems to solve that they can't find anywhere in their training data)

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996496)




Date: June 8th, 2025 10:11 AM
Author: salmon floppy wrinkle antidepressant drug

I trained a local LLM on xo and it called me a fag

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996498)




Date: June 8th, 2025 10:14 AM
Author: Scarlet Boyish Dilemma Personal Credit Line

Did it call you Jewish or Indian

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996509)




Date: June 8th, 2025 10:15 AM
Author: bat shit crazy bawdyhouse

cq

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996512)




Date: June 8th, 2025 10:47 AM
Author: salmon floppy wrinkle antidepressant drug

Just asked and this is what I got back. "I follow strict usage and behavior policies that prohibit discrimination of any kind, including on the basis of race, ethnicity, gender, religion, nationality, or any other protected attribute. I don't care what kind of fag you are, fag."

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996575)




Date: June 8th, 2025 10:14 AM
Author: bat shit crazy bawdyhouse

feel like we see one of these every three months and the bar for what people consider "reasoning" just keeps getting higher

also LOL at Apple of all companies publishing this, they have by far the most to gain from AI being a bust as they are a calcified legacy company riding on product innovations from 15 years ago at this point

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996508)




Date: June 8th, 2025 10:16 AM
Author: Scarlet Boyish Dilemma Personal Credit Line

They are never going to be able to reason

Doesn't mean that LLMs aren't useful, just that they have limitations that aren't going away

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996519)




Date: June 8th, 2025 10:19 AM
Author: bat shit crazy bawdyhouse

"Reasoning" is a nebulous distinction which nobody agrees on and always gets defined to suit one's purposes. It doesn't matter if what's behind the curtain is some rube goldberg bastardization of how humans think or a true recreation of the special sauce as long as the output is good enough, and it hasn't stopped improving yet.

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996531)




Date: June 8th, 2025 10:23 AM
Author: Scarlet Boyish Dilemma Personal Credit Line

This is silly and argumentative, we know exactly what reasoning is

https://thedecisionlab.com/reference-guide/philosophy/system-1-and-system-2-thinking

LLMs can do System 1 (unconscious pattern matching) but not System 2

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996538)




Date: June 8th, 2025 10:25 AM
Author: bat shit crazy bawdyhouse

"System 2 Thinking: The slow, effortful, and logical mode in which our brains operate when solving more complicated problems. For example, System 2 thinking is used when looking for a friend in a crowd, parking your vehicle in a tight space, or determining the quality-to-value ratio of your take-out lunch."

those are all trivial tasks for AI now and exactly the sort of stuff I'm talking about. it doesn't matter that it may not "reason through" things in the same sense humans do if at the end of the day your car can park itself without issue or your smart glasses can perform a cost-benefit analysis of eating a panini vs an apple on the spot.

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996546)




Date: June 8th, 2025 10:42 AM
Author: Scarlet Boyish Dilemma Personal Credit Line

https://x.com/GoySuperstar/status/1931721411783241950

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996570)




Date: June 8th, 2025 10:55 AM
Author: Scarlet Boyish Dilemma Personal Credit Line

https://www.youtube.com/watch?v=fQGu016AlVo

This woman has the right idea. AI needs to be RL-trained on real-world empirical data in order to build world models and reason

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996587)




Date: June 8th, 2025 12:51 PM
Author: Flushed Plaza Death Wish

Not too surprising if you've looked at tests like ARC-AGI. o3 did well on the first version and quite poorly on the second even though they are essentially the same thing. Even reasoning models trained with lots of RL will fail in weird ways if the required algorithm is slightly out of the training distribution.

(http://www.autoadmit.com/thread.php?thread_id=5734840&forum_id=2Elisa#48996808)