Measuring the Wrong Thing
**I wrote this on 3.13.26 but am sending it while I take a break from work for a few days**
Here is the question I keep sitting with this week. Not from a research paper. From a conversation with myself on a long drive.
If your district ran an AI adoption report right now (logins, license usage, training completions, PD hours) and the numbers looked good, would that tell you anything useful? Would it tell you whether teaching got better? Whether students are thinking harder? Whether your teachers are more capable in their craft than they were a year ago?
Or would it just tell you how many people clicked?
We have spent two years optimizing for AI adoption (hopefully). Rollouts. Subscriptions. Professional development days with enthusiasm scores. What we have not figured out yet is how to measure the thing that actually matters: whether the presence of AI in our buildings is making us better at the work, or just faster at producing the appearance of it.
This issue is built around that question. The research is messier than the headlines. The teen data is more urgent than the edtech press is covering. The safety infrastructure doesn’t exist yet. And the philosophical case for why imagination, not AI fluency, is the actual skill we should be developing landed in my inbox from two directions this week, and I can’t stop thinking about it.
What would it look like if you stopped measuring what you’ve adopted and started measuring what has disappeared?
This issue challenges you to run that audit. Not to pull back, but to actually see what you have.
1) The skill we keep trying to test out of existence
Seth Godin published thirty-five words.
“We spend most of the time we’re in school extinguishing imagination. ‘Will this be on the test?’ is a much more common question than ‘What if?’ We’ve been trained to do tasks in a factory. Imagination is a skill and it takes effort. As tasks continue to be automated, the hard work of imagination is worth investing effort in.”
Read that against every number in this issue. More than half of students using AI for schoolwork. Teachers self-teaching at scale because institutions left the training gap wide open. AI models giving inadequate responses in roughly three-quarters of academic integrity scenarios. An evidence base that is still empty two years into widespread adoption.
What we have been building isn’t an AI strategy. It’s a fairy tale.
Hannah Shaller put language to this in her newsletter this week, in a piece that shares the sentiment without being about this topic at all. She is writing about life, not EdTech, but the frame is exactly right. She describes what she calls the “arrival fallacy”: the belief that once we reach a particular milestone, lasting stability will follow. The right tool. The right policy. The right rollout. The credits roll. The problem, as she writes it, is that real life “rarely offers permanence. Instead, it offers seasons.” The story we have been telling about AI in education is built on arrival logic. Roll it out. Announce it. Watch adoption numbers. Wait for outcomes data to confirm the story we already want to tell.
What Shaller nails is the distinction between a life, or a strategy, that is discovered and one that is constructed. “A meaningful life is not discovered. It is constructed.” She is talking about adulthood. I am talking about AI governance. The logic transfers exactly. We keep waiting for someone to figure this out for us first. A better bill. Clearer federal guidance. A vendor who handles the safety question. And while we wait, the gap between the story we’re telling and the thing actually happening in our buildings keeps widening.
Godin’s point is structural. Schools trained for task completion over generations produce teachers who ask “Will this be on the test?”, and we handed those same teachers AI tools designed to complete tasks faster. We didn’t close the imagination loop. We automated around it. The students using AI for all or most of their schoolwork are doing exactly what school trained them to do: find the fastest path to the finished product. The discomfort isn’t with the students. It’s with the system that decided the task mattered more than the thinking.
Shaller ends with something I want every district leader sitting with a half-finished AI policy to read slowly: “Stop waiting for the ending. There is no final chapter where life becomes permanently resolved. There is only the ongoing process of living.” There is no final policy that resolves AI in your district. There is only the ongoing work of navigating it with integrity. That’s not a failure of leadership. That’s the actual job.
Read: Imagination Is Work — Seth’s Blog
Read: The Space Between the Fairy Tale and the Life We Actually Live — Hannah Shaller
So What?
There is no final chapter where AI governance becomes permanently resolved. The work is the ongoing navigation, and the districts that understand that are already ahead.
Try This
Write one sentence that describes your district’s AI story the way you’d want to tell it in two years. Then write one sentence that describes the gap between that story and what’s actually true today. The space between those two sentences is your next governance priority.
2) The wrong metaphor is doing real damage
Jeremy Utley made a case this month: AI is not a pill. It is a skill.
When you treat AI as a pill, something you take once that either works or doesn’t, you build shame and exit into the adoption model. Teachers try it a few times. It doesn’t deliver. They conclude they’re “not AI people” and stop. That story gets told as resistance when it is actually a design failure. Research from BCG and Wharton found a 40% quality boost for AI users, but Brice Challamel, former Head of AI at Moderna, found that the inflection point is between occasional and daily users. Not between users and non-users. If you are measuring whether teachers “use AI,” you are measuring the wrong threshold.
Utley borrows from James Clear: 1% daily improvement compounds to 37x over a year. The skill progression is fear, familiar, fluent, fun. Most PD designs get people to familiar and stop. Fluent requires reps, not training days.
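The 37x figure is easy to check for yourself. Here is a minimal sketch, assuming "1% daily improvement" means multiplying by 1.01 every day for 365 days; the function name is made up for illustration.

```python
def compound_growth(daily_rate: float, days: int) -> float:
    """Return the total multiplier after compounding daily_rate once per day."""
    return (1 + daily_rate) ** days


# Assumption: 1% better each day, every day, for a full year.
yearly = compound_growth(0.01, 365)
print(f"1% a day for a year: about {yearly:.1f}x")  # about 37.8x
```

The same function shows why reps matter more than training days: one day of practice gets you 1.01x, and the payoff only shows up when the habit repeats.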
Read: It’s a Skill, Not a Pill — Jeremy Utley
So What?
The difference between a teacher who benefits from AI and one who doesn’t is daily practice, not a training day.
Try This
Use AI for one routine low-stakes task three times before Friday. Not a big task. A small one you do anyway. Notice what changes on rep three.
3) Your AI dashboard is lying to you
Another conversation worth sharing is between Jeremy Utley and Eric Porres at Logitech. The frame: real AI transformation shows up as deletion, not usage.
Here is the problem with every adoption dashboard. The engineer who runs 40 prompts a day shows up as a power user. The finance manager who eliminated three recurring reports entirely, the one who genuinely transformed her workflow, doesn’t show up at all. She stopped producing things. That’s not in the data. 88% of organizations report AI use. Only around 6% qualify as high performers by actual economic impact. The MIT Iceberg Index says AI could theoretically perform tasks comprising about 11.7% of the U.S. workforce. The gap between theoretical capacity and actual transformation is not a technology problem. It is a measurement problem.
Porres puts it directly: “AI adoption is not a training problem. It’s a deletion problem.” What has your adoption actually made disappear? What workflows no longer exist? What documents stopped being produced? What meetings got shorter? If you can’t name anything, if everything AI did was additive, layered on top of existing work rather than replacing it, you haven’t transformed. You’ve subscribed.
Read: Your AI Dashboard Is Lying to You — Jeremy Utley
Personal Tie-in
Earlier this month I shared with district leaders the idea of challenging their plans so they don’t end up with AI Junk Drawers. Reading this conversation further strengthens that notion as budget, time, and other constraints in K–12 education increase.
So What?
If nothing has disappeared from your workflow since you started using AI, you haven’t adopted it. You’ve decorated with it.
Try This
Write down two things your staff does manually that AI should be handling. Ask why it isn’t. That answer is your next governance move.
4) The Pew numbers nobody is framing right
Pew dropped their teen AI report in late February and the coverage landed where it usually does: on the emotional support number, the cheating number, the anxiety number. Understandable. But the data underneath those headlines is more useful than the headlines themselves.
More than half of U.S. teens say they’ve used chatbots to search for information (57%) or get help with schoolwork (54%). That is not fringe behavior. That is a majority of your student population using these tools for core academic tasks, mostly without your awareness and mostly without any school-provided structure for doing so. About one-in-ten teens report using a chatbot to help with all or most of their schoolwork. And a majority of teens, roughly 59%, think using AI to cheat is a regular occurrence at their school, including about a third who say it happens extremely or very often. Those are your students describing their own environment to a researcher. That is the room your academic integrity policies are walking into.
The number that should be sitting uncomfortably with every school counselor and administrator: 12% of U.S. teens use AI chatbots for emotional support or advice, with 16% using them for casual conversation. Those aren’t huge numbers. But mental health professionals are wary because general-purpose tools like ChatGPT, Claude, and Grok are not designed for such uses, and in the most extreme cases these chatbots can have life-threatening psychological effects. The question isn’t whether 12% is alarming. It’s whether your district has any structure at all for the students who are turning to a chatbot because they’re not comfortable turning to a person.
Browse: How Teens Use and View AI — Pew Research Center
Browse: How Teens and Young People Use AI Tools for Learning and Mental Health Support — EdWeek
So What?
Your students are already using AI for academic help, emotional support, and information. The question is whether your district has any structure around that reality, or is just hoping it isn’t happening.
Try This
Find out whether your school counseling staff has had any explicit conversation about AI chatbot use as a mental-health-adjacent behavior. If that conversation hasn’t happened, schedule it before the end of the year.
5) KORA: someone finally built the safety benchmark
Until a few weeks ago, there was no publicly available benchmark for measuring how safe AI models actually are for children. Capability benchmarks everywhere. Safety for kids? Nothing.
KORA changed that. Launched by Mathilde Collin, founder of the collaboration software company Front, it’s the first public benchmark specifically designed to test AI models against child safety risks. To build it, she convened more than 30 specialists from diverse fields, including child addiction, psychology and psychiatry, education, and child safety, to develop a risk taxonomy. Then she combined LLM and human evaluations until both aligned on what safe and unsafe actually look like in practice.
The findings are not reassuring. Models struggle most with risks related to cheating and academic dishonesty, with 76% of responses in those scenarios coming out as inadequate. Let that sit for a moment. The tools your district is either deploying or watching teachers deploy informally are failing three-quarters of the time when tested on educational integrity scenarios. That’s not a reason to panic. But it’s a reason to stop assuming vendor safety claims map to actual child-context performance.
The other finding is more interesting to me. There’s a strong correlation (r = 0.84) between models that avoid pretending to be human and overall emotional safety. Models that maintain clear boundaries rather than claiming feelings or human experiences tend to perform better across all safety categories. Again coming back to the idea of design from the last newsletter, the design choice that matters most for child safety isn’t content filtering. It’s whether the model is honest about what it is. That’s an insight that should inform how you talk to students, parents, and vendors about what “safe” AI actually means.
Collin’s framing for edtech companies is sharp: the right approach is to lead the ecosystem on child safety, own it as your identity, and watch it transform from compliance cost to premium pricing justification. The inverse is also true for districts: ask your vendors what their child safety benchmark scores are. If they don’t have an answer, you have your answer.
Browse: KORA Benchmark — korabench.ai
Read: Launching KORA, the First Public Benchmark for AI Child Safety — Edtech Partnerships
So What?
The first public benchmark for AI child safety just launched, and the results suggest the tools already in your buildings would fail it. That’s not a reason to remove them. It’s a reason to evaluate them honestly.
Try This
The next time a vendor demos an AI product for your district, ask one question: “How does this model perform on child safety benchmarks?” If they don’t reference KORA or any comparable evaluation, add it to the list of things to verify before signing.
6) The most intentional AI classroom might be the one that doesn’t use it
Cate Denial teaches history at Knox College. She bans AI in her courses. And then she spends one to two full class periods teaching students explicitly about AI: the ethics, the labor exploitation embedded in training data, the privacy costs, the environmental footprint, what “predicting the next token” actually means in practice.
Her consistent finding across two years of doing this: almost no students knew about any of these issues before she raised them. Think about what that means. Students who are using AI daily for homework help, for search, for emotional support have no organized understanding of what it is, how it was built, or who bears the costs of its existence. They’re fluent users of a technology they’ve never been taught to think about.
I want to be careful here. Denial’s choice to ban AI is her pedagogical call and it makes sense in her context. I’m not arguing that K–12 districts should ban AI. I’m arguing that her approach names something important: a thoughtful “no” grounded in explicit curriculum is more intellectually coherent than a reflexive “yes” with no framework. The question is not whether students should use AI. It’s whether they have any conceptual foundation for understanding what they’re using.
Read: How I’m Teaching About Generative AI — Cate Denial
So What?
Fluency without understanding isn’t education. It’s just a new kind of task completion.
Try This
Find one class where students use AI regularly. Ask the teacher: do students know what AI training data is, who produces it, what the environmental costs are? If not, that’s the next curriculum conversation.
7) The director seat is open
Two recent pieces on Claude Cowork are worth your time if you’ve been in the “I’ll figure this out eventually” lane on agentic AI.
Michael Crist published a walkthrough of Claude Cowork, which, if you have not used it yet, is a desktop app feature available on the $20/month Pro tier. The chat interface puts a partition between you and the model. You’re the curator: manually pulling files, copy-pasting context, managing what the AI can see. Cowork removes that wall. You describe an outcome. It reads your files, works across your system, and builds toward the result. Crist’s example: he had a 12-tab marketing spreadsheet and described the dashboard he wanted. Cowork read all the tabs, opened the live Google Sheet, and built color-coded KPI tracking with trend arrows. Real caveat: tutorials oversell the “walk away and come back to finished work” version. Real use requires a clear outcome description. You are the director. That role doesn’t disappear. But what it requires shifts from doing to describing, and for high-constraint workflows like scheduling, compliance reporting, or multi-audience communications, that shift is significant.
For school leaders specifically: the scheduling tool I’ve been building with Claude Code caught a 25-minute daily instructional deficit before any schedule was built and generated three specific resolution paths. That’s the kind of constraint-holding that matters: state instructional minute mandates, specialist service windows, MTSS tier requirements, staffing coverage, the actual math of whether your blocks fit inside your available day.
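To make “the actual math” concrete, here is a minimal sketch of that kind of check. It is not the scheduling tool described above. The block names, the minutes, the 330-minute mandate, and the 390-minute day are all invented for illustration, chosen so the example surfaces a 25-minute gap.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    minutes: int
    instructional: bool  # True if it counts toward mandated instructional minutes


def instructional_gap(blocks, required_minutes, day_minutes):
    """Return (overflow, deficit) in minutes for one school day.

    overflow: minutes by which the blocks exceed the available day
    deficit:  mandated instructional minutes the blocks fail to cover
    """
    total = sum(b.minutes for b in blocks)
    instructional = sum(b.minutes for b in blocks if b.instructional)
    overflow = max(0, total - day_minutes)
    deficit = max(0, required_minutes - instructional)
    return overflow, deficit


# Hypothetical day, invented for illustration only.
blocks = [
    Block("ELA", 90, True),
    Block("Math", 75, True),
    Block("Science/Social Studies", 60, True),
    Block("Specials", 45, True),
    Block("Intervention (MTSS)", 35, True),
    Block("Lunch and recess", 50, False),
    Block("Arrival, dismissal, transitions", 30, False),
]
overflow, deficit = instructional_gap(blocks, required_minutes=330, day_minutes=390)
print(f"Over the day by {overflow} min, short on instruction by {deficit} min")
```

With these made-up numbers the blocks fit inside the day but still fall 25 minutes short of the mandate, which is the kind of problem worth catching before anyone builds a schedule on top of it.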
Read: How I Built My Personal AI Assistant — Michael Crist
Personal Tie-in
I’ve been building with Claude Code for a few months and I still have moments where I can’t believe what it can hold simultaneously. The scheduling tool keeps proving that. More soon.
So What?
The difference between using AI and building with AI is the difference between borrowing someone else’s thinking and encoding your own.
Try This
Write down one workflow in your week that requires you to hold three or more constraints simultaneously, like scheduling, staffing, compliance, or policy. Describe the finished output in one clear paragraph. That’s your first Claude Cowork prompt.
ON MY RADAR
• Stefan Bauschard, 6 Critical AI & Education Articles — Block layoffs partially AI-caused per insider. Speech and debate gaining value because AI can’t argue for you. Build-with vs. use-it as the job security differentiator. Worth 10 minutes.
• Find Fantastic Books — Wonder Tools — Jeremy Caplan’s updated guide to free book discovery. Most Recommended Books is alone worth the click: look up anyone you admire, see exactly what they’ve read. Good for rebuilding the reading habit.
• Generation AI: What Kids and Families Think — Common Sense Media — Companion to the Pew data above. The 52/52 split (52% of parents call AI in schoolwork unethical, 52% of kids call it innovative) is the stat to bring to your next parent night. Surfaces the exact tension most districts are avoiding.
• AI Memory Portability tip: ChatGPT → Settings → Personalization → Memory → Manage → Export. Claude → Settings → Memory → Import/Export. Worth knowing before you need it.
• Canva AI Skills for Students — Free Course — Spend 20 minutes in it before you assign it to students. It’s better than you’d expect.
ANALOG CHALLENGE: First Rep
Utley’s bike-falling reframe has stayed with me. You don’t learn to ride a bike by studying bikes. You learn by falling. The first fall isn’t failure, it’s rep one.
Pick one thing you’ve been avoiding, with AI or without, because you’re afraid of not doing it well the first time. Write it on paper. Give yourself three attempts to do it badly this week. The only success condition is that you started. The brain that makes good governance decisions is the same brain you’re exercising when you let yourself be a beginner.
CLOSING REFLECTION
One ask before you go: what are you sitting with right now? What’s the question keeping you up, the thing a colleague said this week that you can’t shake, the policy decision you’re not sure about? Drop it in the comments or send me a direct message. I read everything. I’m building the next conversation from your responses, not just my own five ideas.
The fairy tale version of AI in education ends with a rollout. The real version is built in the space between what you said you’d do and what you can actually account for, season by season, not chapter by chapter.
What does your district’s AI story look like from inside that gap?
— A-A-Ron


