Hi Ben, thanks for your article, though I must admit I'm still hoping to hear good news eventually... I knew about the tutors study, but I didn't know it was BS as well - sigh...
I too am impatient to get that detailed Nigeria study - talk about an outlier!
Well, interestingly, I think by your taxonomy the Tutor Co-Pilot study would have fallen into the "good science" category. It's only after digging into the preregistration that things get murky.
Just based on what's been blogged about already, I suspect the Nigeria "study" is going into your rock bottom category, if results even get published.
Have you formed any theory as to why the universal bias so clearly tips the scale toward positive outcomes? Why isn't anyone fudging the numbers to say it sucks? They can't all be paid, otherwise where's my sweet corruption money?
Well, a couple of things. First and foremost, positive results are what get attention, so that's probably the main driver. There's also the fact that jobs in the AI industry are far more plentiful and high-paying than AI skepticism (take it from me), so that further skews things. Plus many journalists are going to be more excited to run positive AI stories than negative. Oh, and edu-philanthropy is gearing up to throw more money behind AI too.
Now it strikes me how many times I may have taken a study about AI and said, "this is a scientific study by researchers from MIT, it must be correct." Thanks for this article!
This pairs nicely with the Nick McGrievy piece you mentioned, which was published yesterday in Understanding AI. My thesis is that what's happening in research is similar to what's happening in teaching: AI is making longstanding problems visible because it shines a light on broken processes.
Now that there is greater attention on "gaming of results" or outright misconduct, what happens? I hope it is a genuine reckoning with broken peer review and broken models of classroom instruction. But if we're all too overwhelmed to change things on the individual level and foundations remain committed to automating and optimizing what is broken, then it becomes a question of how long the systems we have set up can last. Not long, I suspect.
Good question, Rob. This is me musing aloud, but I wonder how we might change the nature of what research looks like in order to make it more useful *and* auditable. For example, when I glanced at that ChatGPT meta-analysis, I knew the results had to be bullshit, but picking apart the complicated mathematical formulas they present seemed like more trouble than it was worth. Peer review should do more to catch this stuff, but as Paul Bruno noted in the thread I linked to, there's only so much any human can be expected to do when it comes to reviewing someone else's work. I don't know, something to keep thinking about.
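(For anyone curious what "picking apart the formulas" would even involve: the core arithmetic of a standard fixed-effect meta-analysis is genuinely short - here's a minimal sketch with made-up numbers, not taken from any real study. The hard part of auditing isn't these few lines of math; it's verifying that the effect sizes and variances fed into them were extracted honestly from the underlying studies.)

```python
# Illustrative sketch only - hypothetical numbers, not from any real study.
# Fixed-effect meta-analysis: pool per-study effect sizes using
# inverse-variance weights, so more precise studies count for more.

effects = [0.45, 0.30, 0.90, 0.15]    # hypothetical standardized mean differences
variances = [0.02, 0.05, 0.01, 0.04]  # hypothetical sampling variances

weights = [1 / v for v in variances]                              # inverse-variance weights
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
se = (1 / sum(weights)) ** 0.5                                    # SE of the pooled estimate

print(f"pooled effect = {pooled:.2f} +/- {1.96 * se:.2f} (95% CI)")
```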
Thanks for this thoughtful analysis of the AI-tutoring hype. Are there any verified successes for AI outside of military and surveillance, where we at least have the piling-up of bodies to assist in the count? Or perhaps the measure is job losses, especially in customer service, and the measurable increase in customer dissatisfaction?
I seem to have misplaced the memo. Please remind me, we need AI, why?
Hey Ben, you might appreciate my friend @Wess Trabelsi’s research into this topic. He’s got some great reviews and resources in here:
https://open.substack.com/pub/wesstrabelsi/p/the-good-the-bad-and-the-ugly-science?r=elugn&utm_medium=ios
Holy smokes, this is terrific Mike, thank you for sharing this! I am going to add another addendum to cite this.
Kudos for taking the time to _revise_ your post to include the Trabelsi data. It makes the Substack medium less ephemeral and more collegial as well.
Awesome! I thought so too.
Tagging @wess trabelsi to make sure he knows!
Thx Mike, I'm also glad to know of Ben!
I’m glad to see AI researchers embracing the caveat emptor model of education so readily. /s
Wild timing. The full Nigerian study just dropped -
https://documents1.worldbank.org/curated/en/099548105192529324/pdf/IDU-c09f40d8-9ff8-42dc-b315-591157499be7.pdf
It's almost bad enough to bring me out of retirement.
(Fires up the TLC on his CD player)
I ain't too proud to beg!
Sad day for science and education 😔