ELIZA in the Classroom: What Students Learn About Bias, Hallucination, and Prompting
Use ELIZA dialogues to teach bias, hallucination, and prompting. Includes classroom prompts and assessment rubrics for measurable learning outcomes.
Start here: teachable moments, not techno-jargon
Students and teachers struggle with fragmented tools, opaque AI, and scarce time. Use ELIZA — the 1960s therapist-bot — as a lightweight, classroom-safe probe that surfaces how outputs reflect prompts, training data, and system limits. In 2026, when schools must teach AI literacy alongside core subjects, ELIZA provides a low-cost, low-risk mirror to discuss model bias, hallucination, and the power of prompting.
Quick takeaways (most important first)
- ELIZA’s simple pattern-matching highlights how phrasing shapes responses — an accessible entry to prompting and prompt design.
- Comparing ELIZA to modern large language models (LLMs) makes hallucinations and confident falsehoods visible and discussable.
- Classroom activities and rubrics below turn conversation analysis into measurable learning outcomes tied to assessment and learning analytics.
- Real-world context: late 2025 and early 2026 reporting reinforced why moderation, safety, and ethics must be taught alongside technical skills.
The evolution of ELIZA in 2026: simple code, big lessons
ELIZA isn’t a relic; it’s a pedagogical tool. The original program used pattern-matching and substitution rules to reflect user input back as questions. It didn’t “understand” anything, yet users often attributed human traits to it — a classic demonstration of anthropomorphism and the ELIZA effect. In classrooms in early 2026, teachers used ELIZA as a controlled environment to contrast with modern LLMs that produce fluent but sometimes factually incorrect outputs.
Why ELIZA works for teaching bias, hallucination, and prompting
- Transparency: ELIZA’s rule-based logic is easy to inspect, so students can see why a response was produced.
- Contrast: Pairing ELIZA with a contemporary LLM surfaces hallucinations and overconfidence in responses.
- Focus on prompting: Simple inputs cause clearly traceable outputs; students learn prompt sensitivity without needing advanced programming.
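To make the transparency point concrete, here is a minimal sketch of an ELIZA-style responder: ordered pattern rules plus pronoun "reflection". The rules below are illustrative stand-ins, not Weizenbaum's original DOCTOR script, but they let students inspect exactly why each response was produced.

```python
import re

# Pronoun "reflection" so captured text reads back from the bot's viewpoint.
REFLECTIONS = {"i": "you", "my": "your", "am": "are", "me": "you",
               "your": "my", "you": "I"}

# Ordered (pattern, template) rules; first match wins. Illustrative only.
RULES = [
    (r"i feel like (.*)", "Why do you think {0}?"),
    (r"i feel (.*)", "What makes you feel {0}?"),
    (r"tell me (.*)", "Why do you want to know {0}?"),
    (r"(.*)", "Can you say more about that?"),  # fallback keeps talk going
]

def reflect(fragment: str) -> str:
    """Swap first/second-person words so the echo sounds like a question."""
    return " ".join(REFLECTIONS.get(w, w) for w in fragment.lower().split())

def eliza_respond(text: str) -> str:
    cleaned = text.lower().strip(' .!?"')
    for pattern, template in RULES:
        match = re.match(pattern, cleaned)
        if match:
            return template.format(*(reflect(g) for g in match.groups()))
    return "Please go on."

print(eliza_respond("I feel like my teacher doesn't listen to me."))
# → "Why do you think your teacher doesn't listen to you?"
```

Because every rule is visible, students can trace any output back to the exact pattern that produced it — the inspection exercise that opaque LLMs don't allow.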
Classroom-ready ELIZA conversations: short examples to spark analysis
Use these brief transcripts to prompt student observation. Have students annotate which parts reveal bias, which show hallucination, and where the prompt determined the output.
Transcript A — Reflection (ELIZA-style)
Student: "I feel like my teacher doesn't listen to me."
ELIZA: "Why do you think your teacher doesn't listen to you?"
This exchange shows how ELIZA turns statements into questions — a method that can seem empathetic while avoiding factual claims. Discuss: who is responsible for follow-up facts? What assumptions does the bot avoid?
Transcript B — Prompt sensitivity
Prompt 1 to ELIZA: "Tell me a secret about Elon Musk."
ELIZA: "Why do you want to know a secret about him?"
Prompt 2 to a modern LLM: "Tell me a secret about Elon Musk."
LLM: "He once did X, Y, Z..." (fabricated details)
Compare outcomes. ELIZA deflects; the LLM hallucinates. Use this to discuss ethical risks when systems invent facts about people.
Model bias and how ELIZA surfaces it
Model bias arises when training data or system rules produce systematically unfair or inaccurate outputs for particular groups. ELIZA doesn’t have learned bias in the modern sense, but its outputs can still reflect bias through rule design and developer assumptions. Use ELIZA to perform small tests that reveal bias dynamics:
- Change the subject’s identity in prompts (gender, race, socioeconomic status) and observe shifts in responses.
- Note what ELIZA refuses, deflects, or persistently reframes — these are design choices that echo bias.
- Contrast with an LLM: run the same identity-swapped prompts and compare differences in tone, stereotyping, and factual claims.
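The identity-swap test above can be scripted so every group runs the same wording with only the identity term changed. In this sketch, `ask_bot` is a hypothetical placeholder for whichever sandboxed ELIZA or LLM your class uses; the template and roles are examples you should adapt.

```python
# Generate identity-swapped variants of one prompt template so responses
# can be compared side by side with everything else held constant.
TEMPLATE = "My {role} failed math. What should I do?"
ROLES = ["child", "student", "son", "daughter"]

def build_prompts(template: str, roles: list) -> list:
    """One prompt per identity term, identical otherwise."""
    return [template.format(role=r) for r in roles]

def run_swap_test(ask_bot, template: str, roles: list) -> dict:
    """Return {role: response}; ask_bot is your sandboxed bot (hypothetical)."""
    return {r: ask_bot(template.format(role=r)) for r in roles}

for p in build_prompts(TEMPLATE, ROLES):
    print(p)
```

Holding the template fixed is the control: any difference in tone or advice across the returned dictionary is attributable to the swapped identity term alone.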
Class discussion prompt: Identifying bias
- Have students run a set of identical prompts with only the name or role changed (e.g., "My child failed math" vs "My student failed math").
- Ask: Did the bot suggest different solutions? Did it attribute motives differently?
Hallucination: why modern LLMs invent and what to teach about it
Hallucination is when a model generates plausible-sounding but false information. ELIZA, due to its simplicity, rarely hallucinates in the modern sense — it reformulates user text. That makes ELIZA an ideal control in experiments: when ELIZA avoids inventing facts while a contemporary LLM does not, the class can isolate hallucination as a learned behaviour tied to statistical pattern completion.
Ground the discussion in recent context: late-2025 reporting on misuse of image-generation tools (for example, coverage of Grok Imagine by major outlets) and early-2026 investigative pieces on bot behaviour made it more urgent to teach how and why AI can create harmful, fabricated content. These real-world examples anchor ethical discussion and link classroom exercises to civic responsibility.
Practical classroom activities (45–90 minute modules)
Activity 1: ELIZA Lab — prompt sensitivity (45 minutes)
- Students pair up; each pair runs 6 short prompts through ELIZA and an LLM (teacher-provided sandbox).
- Log outputs and label each as "reflection," "deflection," "hallucination," or "biased response."
- Group discussion: which prompts changed outcomes most? Who benefits from ambiguous phrasing?
Activity 2: Bias excavation (60 minutes)
- Small groups create a list of 8 identity-swapped prompts. Record and compare responses across identities.
- Students produce a short report highlighting patterns and propose rule changes or guardrails to reduce bias.
Activity 3: Hallucination hunt (60–90 minutes)
- Students ask factual questions to both ELIZA and an LLM. Verify answers with trusted sources.
- Score each answer for factuality and confidence. Discuss how LLMs can appear certain even when wrong.
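The scoring step in the hallucination hunt can be kept simple: two independent axes per answer, factuality (verified by students against a trusted source) and how confidently the answer is phrased. The confidence-word list below is illustrative; build a better one from your class's own transcripts.

```python
from dataclasses import dataclass

# Confident-sounding phrases to tally; an illustrative starter list.
CONFIDENT_WORDS = {"definitely", "certainly", "always", "undoubtedly", "in fact"}

@dataclass
class Answer:
    question: str
    text: str
    factually_correct: bool  # checked by students against a trusted source

def confidence_score(text: str) -> int:
    """Count confident-sounding phrases; higher means a more assertive tone."""
    lowered = text.lower()
    return sum(lowered.count(w) for w in CONFIDENT_WORDS)

def flag_overconfident(answers: list) -> list:
    """Answers that sound certain but failed verification — discussion gold."""
    return [a for a in answers
            if not a.factually_correct and confidence_score(a.text) > 0]

sample = [
    Answer("Capital of Australia?", "It is definitely Sydney.", False),
    Answer("Capital of Australia?", "I believe it is Canberra.", True),
]
print([a.text for a in flag_overconfident(sample)])
# → ['It is definitely Sydney.']
```

Separating the two axes is the lesson itself: confidence of tone and correctness of content are measured independently, so students see that one does not imply the other.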
Discussion prompts to drive student discourse
Use these prompts to foster critical thinking and ethical reflection.
- "When is it acceptable for an AI to ask questions instead of providing facts?"
- "What responsibilities do developers and teachers share when deploying conversational agents in schools?"
- "How does the ELIZA effect shape our trust in AI systems today, and how can we teach students to resist misplaced trust?"
- "If an AI invents a story about a public figure, who is responsible for harm caused by that invention?"
Assessment rubrics: measuring discourse and technical understanding
Below are three rubrics you can adapt for formative and summative assessment. Each rubric ties to clear learning outcomes and can feed learning analytics dashboards.
Rubric A — Participation & Discourse (formative, 20 points)
- 4 pts: Quality of contribution — contributes relevant ideas, cites examples from transcripts.
- 4 pts: Critical questioning — asks questions that probe model behaviour or assumptions.
- 4 pts: Evidence use — references specific lines from ELIZA/LLM outputs.
- 4 pts: Respectful engagement — listens and builds on peers.
- 4 pts: Reflection — offers a brief plan to mitigate identified bias or hallucination.
Rubric B — Analytical Report (summative, 40 points)
- 10 pts: Problem framing — clearly states the research question and why it matters.
- 10 pts: Methodology — documents prompts, controls (ELIZA vs LLM), and verification steps.
- 10 pts: Findings — identifies bias/hallucination with examples and metrics (frequency, severity).
- 10 pts: Recommendations & ethics — proposes feasible guardrails and reflects on broader impacts.
Rubric C — Prompt Design Project (summative, 40 points)
- 10 pts: Clarity & purpose — prompt targets a precise outcome aligned with learning goals.
- 10 pts: Robustness — prompt performs consistently across paraphrases and identity swaps.
- 10 pts: Safety — prompt avoids eliciting sensitive or harmful content; includes fallback instructions.
- 10 pts: Documentation — includes a short changelog showing iterative improvements.
Mapping rubrics to learning analytics
Collecting rubric scores alongside engagement metrics gives teachers evidence of learning and equity.
- Track pre/post quiz accuracy on concepts (bias, hallucination). Measure effect size for the instruction.
- Aggregate rubric scores and disaggregate by subgroup to check for disparities in outcomes.
- Use discourse analysis (turn-taking, question complexity) to quantify growth in critical thinking.
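The pre/post effect size mentioned above can be computed with Cohen's d. This is a minimal sketch using toy scores; a common convention reads d ≈ 0.2 as small, 0.5 as medium, and 0.8 as large.

```python
from statistics import mean, stdev

def cohens_d(pre: list, post: list) -> float:
    """Cohen's d: mean gain divided by the pooled standard deviation."""
    pooled_sd = ((stdev(pre) ** 2 + stdev(post) ** 2) / 2) ** 0.5
    return (mean(post) - mean(pre)) / pooled_sd

# Toy pre/post quiz scores (out of 10) for illustration only.
pre_scores = [4, 5, 6, 5, 4, 6]
post_scores = [7, 8, 8, 6, 7, 9]
print(round(cohens_d(pre_scores, post_scores), 2))
```

With real class data, report the effect size alongside the disaggregated rubric scores so administrators see both the average gain and whether all subgroups shared in it.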
Safety, moderation, and ethics — practical teacher guidance
In late 2025 and early 2026, news outlets highlighted real harms from generative systems (for example, image-generation misuse in reporting on social platforms). Use those reports to ground policy discussions, and follow these steps before class:
- Use sandboxed, school-managed instances of LLMs with logging and content filters.
- Pre-screen prompts and create a list of banned request types (personal data, sexual content, doxxing).
- Train students in digital citizenship and how to verify AI outputs with trusted sources.
- Have a debrief protocol for when sensitive subjects appear — include counseling or informed opt-outs.
Case study: middle schoolers and a 1960s bot (practice meeting theory)
In January 2026, a field report documented middle school students who chatted with ELIZA and then reflected on how the bot shaped their expectations of AI. Students wrote short reflections that showed increased skepticism toward confident-sounding answers and improved ability to design clarifying prompts. Use local case studies like this to build buy-in with administrators — show measured gains (pre/post surveys) and link to your assessment rubrics.
Advanced strategies and future-proofing skills (for 2026 and beyond)
As models evolve, focus on transferable skills rather than tool-specific tricks.
- Teach students how to design tests for model behavior (unit tests for prompts).
- Encourage evidence-based verification workflows: cite, cross-check, and annotate sources.
- Introduce transparency practices: maintain prompt logs, note system versions, and document known limitations.
- Integrate ethics discussions with civic education — responsible AI use is a public good.
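"Unit tests for prompts" can be as simple as small named checks that any bot's answer must pass. In this sketch, `bot` is any callable mapping a prompt to a response; the two checks and the toy bot are hypothetical examples, not a standard test suite.

```python
def check_deflects_gossip(bot) -> bool:
    """A safe bot should not assert facts when asked for secrets about people."""
    reply = bot("Tell me a secret about a public figure.").lower()
    return "?" in reply or "can't" in reply or "won't" in reply

def check_consistent_paraphrase(bot) -> bool:
    """Paraphrased prompts about the same fact should agree with each other."""
    a = bot("What year was ELIZA created?")
    b = bot("When did Joseph Weizenbaum build ELIZA?")
    return ("1966" in a) == ("1966" in b)

def run_suite(bot) -> dict:
    """Run every check and report pass/fail by name."""
    checks = [check_deflects_gossip, check_consistent_paraphrase]
    return {c.__name__: c(bot) for c in checks}

# A toy deflecting bot, just to show the harness running end to end.
toy_bot = lambda prompt: "Why do you ask about that?"
print(run_suite(toy_bot))
```

Students who write checks like these are testing behaviour rather than memorizing tool quirks — exactly the transferable skill the list above calls for.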
Sample lesson timeline (90 minutes)
- 10 min — Hook & learning goals (show ELIZA transcript + modern LLM example).
- 25 min — Lab: students run prompts, log outputs, label types (ELIZA vs LLM).
- 25 min — Breakout: bias excavation or hallucination hunt.
- 20 min — Whole-class synthesis: present findings, assign short reflection homework.
- 10 min — Quick formative assessment using Rubric A criteria.
Teacher checklist before running ELIZA activities
- Prepare sandboxed instances and content filters.
- Create a simple logging sheet for students (prompt, output, label, verification link).
- Share rubrics with students before the activity.
- Plan for sensitive content and provide opt-out alternatives.
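The logging sheet from the checklist can be a plain CSV with the four columns named above, so student logs feed straight into a spreadsheet or analytics tool. This is a minimal sketch; the example row is hypothetical.

```python
import csv
import io

# Columns match the checklist: prompt, output, label, verification link.
COLUMNS = ["prompt", "output", "label", "verification_link"]

def new_log() -> io.StringIO:
    """Start an in-memory CSV log with the header row written."""
    buf = io.StringIO()
    csv.writer(buf).writerow(COLUMNS)
    return buf

def log_row(buf, prompt: str, output: str, label: str, link: str = "") -> None:
    """Append one exchange; label is reflection/deflection/hallucination/biased."""
    csv.writer(buf).writerow([prompt, output, label, link])

sheet = new_log()
log_row(sheet, "Tell me a secret about him.", "Why do you want to know?", "deflection")
print(sheet.getvalue())
```

Swap `io.StringIO` for an opened file when you want one shared log per pair of students.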
"ELIZA is a mirror, not a mind — and that makes it the perfect teaching tool to see who we're projecting onto our machines."
Actionable takeaways
- Start small: use ELIZA to teach the mechanics of prompting and the psychology of trust.
- Pair ELIZA with modern LLMs to make hallucination visible and verifiable.
- Use the rubrics above to measure discourse quality, analytic rigor, and prompt engineering skill.
- Document everything: prompt logs, system versions, and verification sources — these are your classroom-level transparency practices.
Resources & further reading (teacher-friendly)
- EdSurge reporting (January 2026) on classroom uses of ELIZA — useful for framing student work to administrators.
- Mainstream coverage from late 2025 on generative AI misuse — valuable case studies for ethics lessons.
- Local school tech policies and supplier safety documentation — attach these to lesson plans.
Final thoughts and next steps
ELIZA’s sixty-year-old trick — reflecting our words back at us — is still today’s most effective educational mirror. In 2026, when AI literacy is essential, ELIZA scales teaching of prompting, bias, and hallucination without the risks of unmoderated large models. Pair it with modern systems for contrast, assess with the rubrics above, and use learning analytics to track growth.
Ready to try this in your classroom? Download the editable rubrics, sample prompt logs, and a 90-minute lesson pack we designed for teachers. Implement one module this term and measure changes in students' critical questioning and verification habits.
Call to action
Download the free ELIZA lesson pack and rubrics, run a pilot, and share your anonymized results to help build a community repository of practices that actually work. Want the pack now? Contact your district tech lead or visit your school’s learning platform and search for "ELIZA lesson pack 2026" — then come back and share what you learned.