{"id":8498,"date":"2024-08-26T10:25:42","date_gmt":"2024-08-26T10:25:42","guid":{"rendered":"https:\/\/worlduniversitydirectory.com\/edu\/researchers-combat-ai-hallucinations-in-math\/"},"modified":"2024-08-26T10:28:46","modified_gmt":"2024-08-26T10:28:46","slug":"researchers-combat-ai-hallucinations-in-math","status":"publish","type":"post","link":"https:\/\/worlduniversitydirectory.com\/edu\/researchers-combat-ai-hallucinations-in-math\/","title":{"rendered":"Researchers Combat AI Hallucinations in Math"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div>\n<p><span style=\"font-weight: 400\">The Berkeley researchers took advantage of the fact that ChatGPT, like humans, is erratic. They asked ChatGPT to answer the same math problem 10 times in a row. I was surprised that a machine might answer the same question differently, but that&#8217;s what these large language models do. Often the step-by-step process and the answer were the same, but the exact wording differed. Sometimes the methods were bizarre and the results were dead wrong. (See an example in the illustration below.)<\/span><\/p>\n<p><span style=\"font-weight: 400\">Researchers grouped similar answers together. When they assessed the accuracy of the most common answer among the 10 solutions, ChatGPT was astonishingly good. For basic high-school algebra, AI\u2019s error rate fell from 25% to zero. For intermediate algebra, the error rate fell from 47% to 2%. 
For college algebra, it fell from 27% to 2%.\u00a0<\/span><\/p>\n<p><b>ChatGPT answered the same algebra question three different ways, but it landed on the right response seven out of 10 times in this example<\/b><\/p>\n<figure id=\"attachment_64533\" class=\"wp-caption aligncenter\" style=\"max-width: 1050px\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-64533\" src=\"https:\/\/cdn.kqed.org\/wp-content\/uploads\/sites\/23\/2024\/08\/pardosbhandari.png\" alt=\"\" width=\"1050\" height=\"1067\" srcset=\"https:\/\/cdn.kqed.org\/wp-content\/uploads\/sites\/23\/2024\/08\/pardosbhandari.png 1050w, https:\/\/cdn.kqed.org\/wp-content\/uploads\/sites\/23\/2024\/08\/pardosbhandari-800x813.png 800w, https:\/\/cdn.kqed.org\/wp-content\/uploads\/sites\/23\/2024\/08\/pardosbhandari-1020x1037.png 1020w, https:\/\/cdn.kqed.org\/wp-content\/uploads\/sites\/23\/2024\/08\/pardosbhandari-160x163.png 160w, https:\/\/cdn.kqed.org\/wp-content\/uploads\/sites\/23\/2024\/08\/pardosbhandari-768x780.png 768w\" sizes=\"(max-width: 1050px) 100vw, 1050px\"\/><figcaption class=\"wp-caption-text\">Source: Pardos and Bhandari, <a href=\"https:\/\/journals.plos.org\/plosone\/article?id=10.1371\/journal.pone.0304013\">\u201cChatGPT-generated help produces learning gains equivalent to human tutor-authored help on mathematics skills,\u201d<\/a> PLOS ONE, May 2024<\/figcaption><\/figure>\n<p><span style=\"font-weight: 400\">However, when the researchers applied this method, which they call \u201cself-consistency,\u201d to statistics, it didn&#8217;t work as well. ChatGPT\u2019s error rate fell from 29% to 13%, but still more than one out of 10 answers was wrong. 
I think that\u2019s too many errors for students who are learning math.<\/span><\/p>\n<p><span style=\"font-weight: 400\">The big question, of course, is whether these ChatGPT solutions help students learn math better than traditional teaching. In a second part of this study, researchers recruited 274 adults online to solve math problems and randomly assigned a third of them to see these ChatGPT solutions as a \u201chint\u201d if they needed one. (ChatGPT\u2019s wrong answers were removed first.) On a short test afterwards, these adults improved 17%, compared with learning gains of less than 12% for the adults who could see a different group of hints written by undergraduate math tutors. Those who weren\u2019t offered any hints scored about the same on a post-test as they did on a pre-test.<\/span><\/p>\n<p><span style=\"font-weight: 400\">These impressive learning results for ChatGPT prompted the study authors to boldly predict that \u201ccompletely autonomous generation\u201d of an effective computerized tutoring system is \u201caround the corner.\u201d In theory, ChatGPT could instantly digest a book chapter or a video lecture and then immediately turn around and tutor a student on it.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Before I embrace that optimism, I\u2019d like to see how much real students \u2013 not just adults recruited online \u2013 use these automated tutoring systems. Even in this study, where adults were paid to do math problems, 120 of the roughly 400 participants didn\u2019t complete the work and so their results had to be thrown out. 
For many kids, and especially students who are struggling in a subject, <\/span><a href=\"https:\/\/hechingerreport.org\/what-aspects-of-teaching-should-remain-human\/\"><span style=\"font-weight: 400\">learning from a computer just isn\u2019t engaging<\/span><\/a><span style=\"font-weight: 400\">.\u00a0<\/span><\/p>\n<p><i><span style=\"font-weight: 400\">This story about <\/span><\/i><a href=\"https:\/\/hechingerreport.org\/proof-points-combat-ai-hallucinations-math\/\"><i><span style=\"font-weight: 400\">AI hallucinations<\/span><\/i><\/a><i><span style=\"font-weight: 400\"> was written by Jill Barshay and produced by <\/span><\/i><a href=\"https:\/\/hechingerreport.org\/special-reports\/higher-education\/\"><i><span style=\"font-weight: 400\">The Hechinger Report<\/span><\/i><\/a><i><span style=\"font-weight: 400\">, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for <\/span><\/i><a href=\"https:\/\/hechingerreport.org\/proofpoints\/\"><i><span style=\"font-weight: 400\">Proof Points<\/span><\/i><\/a><i><span style=\"font-weight: 400\"> and other <\/span><\/i><a href=\"https:\/\/hechingerreport.org\/newsletters\/\"><i><span style=\"font-weight: 400\">Hechinger newsletters<\/span><\/i><\/a><i><span style=\"font-weight: 400\">.<\/span><\/i><\/p>\n<\/div>\n<p><script async defer crossorigin='anonymous' src=\"https:\/\/connect.facebook.net\/en_US\/sdk.js\"><\/script><br \/>\n<br \/><br \/>\n<br \/><a href=\"https:\/\/ww2.kqed.org\/mindshift\/2024\/08\/26\/researchers-combat-ai-hallucinations-in-math\/\">Source link <\/a><\/p>\n<footer class=\"rafi-content-footer\">\n    <h6> <em> <font color=\"blue\">\n  WUD Post<\/font><\/em><\/h6>\n  \n  <\/footer> ","protected":false},"excerpt":{"rendered":"<p>The Berkeley researchers took advantage of the fact that ChatGPT, like humans, is erratic. 
They asked ChatGPT to answer the&#8230;<\/p>\n","protected":false},"author":1,"featured_media":8499,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[2],"tags":[],"yst_prominent_words":[],"_links":{"self":[{"href":"https:\/\/worlduniversitydirectory.com\/edu\/wp-json\/wp\/v2\/posts\/8498"}],"collection":[{"href":"https:\/\/worlduniversitydirectory.com\/edu\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/worlduniversitydirectory.com\/edu\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/worlduniversitydirectory.com\/edu\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/worlduniversitydirectory.com\/edu\/wp-json\/wp\/v2\/comments?post=8498"}],"version-history":[{"count":1,"href":"https:\/\/worlduniversitydirectory.com\/edu\/wp-json\/wp\/v2\/posts\/8498\/revisions"}],"predecessor-version":[{"id":8500,"href":"https:\/\/worlduniversitydirectory.com\/edu\/wp-json\/wp\/v2\/posts\/8498\/revisions\/8500"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/worlduniversitydirectory.com\/edu\/wp-json\/wp\/v2\/media\/8499"}],"wp:attachment":[{"href":"https:\/\/worlduniversitydirectory.com\/edu\/wp-json\/wp\/v2\/media?parent=8498"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/worlduniversitydirectory.com\/edu\/wp-json\/wp\/v2\/categories?post=8498"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/worlduniversitydirectory.com\/edu\/wp-json\/wp\/v2\/tags?post=8498"},{"taxonomy":"yst_prominent_words","embeddable":true,"href":"https:\/\/worlduniversitydirectory.com\/edu\/wp-json\/wp\/v2\/yst_prominent_words?post=8498"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}