Reasons AI Detectors Are Failing in Education


Summary

AI detectors, intended to identify AI-generated writing, are struggling in educational settings due to inaccuracies, lack of transparency, and the complexity of determining the origin of content. These flaws have led to false accusations against students and growing concerns about fairness and reliability in academic evaluations.

  • Understand the limitations: AI detectors often flag human-written or historical texts due to their reliance on limited metrics like repetition or predictability, which do not fully capture authentic writing styles.
  • Clarify acceptable tools: Clearly distinguish between prohibited AI usage and allowable aids like Grammarly to prevent unfair accusations and confusion among students.
  • Focus on open dialogue: Encourage educators and institutions to discuss the limitations of AI detection tools openly and create transparent processes for resolving disputes.
Summarized by AI based on LinkedIn member posts
  • Jessica L. Parker, Ed.D.

    AI Curious | Founder | Educator | Speaker

    5,351 followers

    🚨 AI Writing Detectors

    I recently had a conversation with a doctoral student who was accused of using AI to write a research paper. Her supervisor ran the paper through Turnitin's AI detector and accused her of academic misconduct. The student was directed to correct the identified text and given a warning. She is adamant she didn't use AI to write the paper, but she did use Grammarly, which she has used for years and which the school does not forbid.

    There are several layers to this problem:
    1️⃣ The core issue: AI detectors are not 100% accurate or reliable. Don't just take my word for it - Google "Vanderbilt University: Guidance on AI Detection and Why We're Disabling Turnitin's AI Detector."
    2️⃣ Lack of student agency: Students often can't check their own work with the technology instructors use to detect it, leaving them vulnerable to surprise accusations.
    3️⃣ Power dynamics: Instructors can unilaterally run detectors and make accusations, creating an imbalance in student-instructor interactions.
    4️⃣ Ambiguity around AI tools: The line between prohibited "AI writing" and allowed AI-powered aids like Grammarly is blurring.
    5️⃣ Misconceptions about fixes: Simply rephrasing flagged text won't address the underlying issue, because the detection technology itself is flawed.

    ✅ One of my recommendations was for the student to ask the professor to run the check again and see whether the same score was produced and the same text highlighted as AI-generated. My hunch is that the score will change, which could spark a much-needed conversation about the limitations of these tools.

    How do AI detectors work? Turnitin's AI detector relies on measures like perplexity (how predictable the text is) and burstiness (how much sentence structure varies) to flag potential AI content. But it's a flawed approach:
    🚩 Generic, unsurprising sentences like "What surprising predictability!" get high AI scores due to low perplexity.
    🚩 Meanwhile, voice-driven human sentences like "This was written by me, a person" still get non-zero AI scores.
    🚩 The tool seems to equate stylistic uniformity (low burstiness) with AI, but human writing can also be repetitive - especially academic writing, which is often formulaic.
    🚩 Even clearly human sentences get flagged, underscoring the risk of false positives.

    Links to sources in comments. #AIEthics #AcademicIntegrity #EdTech #HigherEducation #Turnitin
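The two signals named above can be sketched in a few lines. This is a toy illustration, not Turnitin's actual model: a unigram "perplexity" proxy plus sentence-length variation as a burstiness proxy. The function names, the tiny corpus counts, and the example sentences are all invented for the sketch.

```python
import math
import statistics

def pseudo_perplexity(text, corpus_counts, corpus_total):
    """Average per-word surprisal (bits) under a toy unigram model.
    Low values mean predictable text, which naive detectors read as AI-like."""
    words = text.lower().split()
    vocab = len(corpus_counts)
    # Add-one smoothing so unseen words get a small but nonzero probability.
    surprisals = [
        -math.log2((corpus_counts.get(w, 0) + 1) / (corpus_total + vocab))
        for w in words
    ]
    return sum(surprisals) / len(surprisals)

def burstiness(text):
    """Population std. dev. of sentence lengths in words.
    Low values mean uniform sentences, the other naive AI-likeness signal."""
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths) if len(lengths) > 1 else 0.0

# Predictable wording scores low on both signals, so it gets flagged,
# even though plenty of careful human (especially academic) prose does too.
counts = {"the": 50, "results": 5, "show": 5}
print(pseudo_perplexity("the results show the results", counts, 60))
print(burstiness("We ran the test. We ran the test. We ran the test."))
```

The point of the sketch is that both signals reward exactly the traits formulaic academic writing has: common vocabulary and uniform sentence lengths.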

  • Christopher Penn

    Co-Founder and Chief Data Scientist at TrustInsights.ai, AI Expert

    45,398 followers

    AI "detectors" are a joke.

    Here's a screenshot of an AI detector (self-proclaimed "the most advanced AI detector on the market") saying that 97% of this document was generated by AI. 97%. That's an incredibly confident assessment. It's also completely wrong. The text? The US Declaration of Independence, written 246 years before ChatGPT launched.

    Why did this happen? Two reasons.

    First, AI detectors use a relatively small number of metrics, like perplexity and burstiness, to assess documents. Documents with little variation in vocabulary and relatively similar line lengths get flagged, and the Declaration of Independence meets both criteria.

    Second, AI detectors also use AI, typically smaller, less costly models. Those models are trained on the same data as their bigger cousins, which means they've seen documents like the Declaration of Independence as valid training data... which they then probably look for. It's the AI equivalent of sneaking a peek at the answers on the exam: they've seen this data before and they know it goes into AI models.

    The key takeaway: AI detectors are worthless. Show this example when someone loudly proclaims they've found AI-generated anything. If you're a parent challenging a school's use of these garbage tools, use this example to contest the school's incorrect assessment.

    Are there giveaways that something was generated by AI? Yes, but fewer and fewer every day as models advance. What's the solution if we want to know whether a piece of content was AI-generated? The onus is on the creator to show the lineage and provenance of the content - the content equivalent of a DOP certification.

    #AI #GenerativeAI #GenAI #ChatGPT #ArtificialIntelligence #LargeLanguageModels #MachineLearning #IntelligenceRevolution
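The first failure mode described above (low vocabulary variation plus uniform sentence lengths) can be mimicked with a naive heuristic. This is a hypothetical sketch: the `naive_ai_flags` function, its type-token-ratio signal, and both cutoff values are invented for illustration and do not come from any real detector.

```python
import statistics

def naive_ai_flags(text, ttr_cutoff=0.6, length_stdev_cutoff=4.0):
    """Flag text as 'AI-like' when vocabulary variation (type-token ratio)
    and sentence-length variation are both low. This is exactly the kind of
    shallow signal that misfires on formal, repetitive human prose, such as
    18th-century legal documents."""
    words = [w.strip(".,;:").lower() for w in text.split()]
    ttr = len(set(words)) / len(words)          # unique words / total words
    sentences = [s for s in text.split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    stdev = statistics.pstdev(lengths) if len(lengths) > 1 else 0.0
    return ttr < ttr_cutoff and stdev < length_stdev_cutoff

# Formulaic, repetitive phrasing trips the heuristic...
print(naive_ai_flags("We hold these truths. We hold these rights. We hold these laws."))
# ...while short, varied human writing does not.
print(naive_ai_flags("Hi. Honestly, my grandmother baked seventeen astonishing pies today. Wow."))
```

A human-written text full of parallel constructions and shared vocabulary, which describes a great deal of formal and legal writing, lands on the "AI" side of such thresholds.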

  • Luke Hobson, EdD

    Assistant Director of Instructional Design at MIT | Author | Podcaster | Instructor | Public Speaker

    32,593 followers

    Folks, a reminder: AI detectors are not accurate. I just stumbled across a professor's post on LinkedIn claiming his entire class was cheating with AI. His reasoning: the AI checkers all came back with high scores. These detectors are notorious for false positives. OpenAI took down its own AI checker two years ago because of its low accuracy. Turnitin's ships with a prominent warning that its model isn't always accurate. To give you an example, ZeroGPT flagged my article and the transcript of a YouTube video on ADDIE as 40% AI-generated. I even uploaded parts of my dissertation from 2019, and they were flagged. Right now, AI detectors are like polygraph tests: they sound scientific and precise, but their results are too unreliable to serve as conclusive evidence.
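The polygraph comparison has concrete arithmetic behind it. Here is a back-of-envelope sketch (all the numbers are assumptions chosen for illustration, not measured rates): even a detector with a seemingly small false-positive rate wrongly accuses a large share of flagged students when most submissions are human-written.

```python
def accusation_stats(n_essays, ai_fraction, false_positive_rate, true_positive_rate):
    """Expected flags for a class of n_essays, of which ai_fraction
    were actually AI-written. All inputs are hypothetical."""
    human = n_essays * (1 - ai_fraction)
    ai = n_essays * ai_fraction
    false_flags = human * false_positive_rate   # innocent students accused
    true_flags = ai * true_positive_rate        # actual AI use caught
    share_wrong = false_flags / (false_flags + true_flags)
    return false_flags, share_wrong

# Assumed: 1,000 essays, 10% AI-written, 2% false-positive rate, 90% detection rate.
false_flags, share_wrong = accusation_stats(1000, 0.10, 0.02, 0.90)
print(false_flags)            # expected number of innocent students flagged
print(round(share_wrong, 2))  # fraction of all accusations that are wrong
```

Under these assumed numbers, roughly one in six accusations targets a student who never used AI, which is why a flag alone cannot serve as conclusive evidence.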
