As humans, we often irrationally ascribe human-like behaviors to objects that have some, but not all, human characteristics (a tendency known as anthropomorphism), and we’re seeing this occur more and more with AI.
In some instances, anthropomorphism looks like saying ‘please’ and ‘thank you’ when interacting with a chatbot or praising generative AI when the output matches your expectations.
But etiquette aside, the real challenge arises when you see AI ‘reason’ through a simple task (like summarizing this article) and then expect it to perform just as well on an anthology of complex scientific articles. Or, when you see a model generate an answer about Microsoft’s recent earnings call and then expect it to perform market research once you provide it with the earnings transcripts of 10 other companies.
These seemingly similar tasks are actually very different for models because, as Cassie Kozyrkov puts it, “AI is as creative as a paintbrush.”
The biggest barrier to productivity with AI is humans’ ability to use it as a tool.
Anecdotally, we’ve already heard of clients who rolled out Microsoft Copilot licenses, and then scaled back the number of seats because individuals didn’t feel like it added value.
Chances are that those users’ expectations did not match the problems AI is actually well suited to solve. And of course, the polished demos look magical, but AI isn’t magic. I’m very familiar with the disappointment you feel the first time you realize, ‘Oh, AI isn’t good for that.’
But instead of throwing up your hands and quitting gen AI, you can work on building the right intuition to more effectively understand AI/ML and avoid the pitfalls of anthropomorphism.
Defining intelligence and reasoning for machine learning
We’ve always had a poor definition of intelligence. When a dog begs for treats, is that intelligent? What about when a monkey uses a tool? Is it intelligent that we intuitively know to move our hands away from heat? When computers do these same things, does that make them intelligent?
Until recently (all of 12 months ago), I was in the camp that refused to concede that large language models (LLMs) could ‘reason’.
However, in a recent discussion with a few trusted AI founders, we hypothesized a potential solution: a rubric to describe levels of reasoning.
Much like we have rubrics for reading comprehension or quantitative reasoning, what if we could introduce an AI equivalent? This could be a powerful tool used to communicate to stakeholders an expected level of ‘reasoning’ from an LLM-powered solution, along with examples of what is not realistic.
Humans form unrealistic expectations of AI
We tend to be more forgiving of human mistakes than of machine mistakes. In fact, self-driving cars are statistically safer than humans. Yet when accidents happen, there’s an uproar.
This exacerbates the disappointment when AI solutions fail to perform a task you might have expected a human to perform.
I hear a lot of anecdotal descriptions of AI solutions as a massive army of ‘interns.’ And yet, machines still fail in ways that humans don’t, while far surpassing them at other tasks.
Knowing this, it’s not surprising that we’re seeing fewer than 10% of organizations successfully developing and deploying gen AI projects. Other factors like misalignment with business values and unexpectedly costly data curation efforts are only compounding the challenges that businesses face with AI projects.
One of the keys to combating these challenges and unlocking project success is to equip AI users with better intuition on when and how to use AI.
Using AI training to build intuition
Training is the key to coping with the rapid evolution of AI and redefining our understanding of machine learning (ML) intelligence. AI training can sound pretty vague on its own, but I’ve found that separating it into three different buckets has been useful for most businesses.
- Safety: How to use AI safely and steer clear of new and AI-improved phishing scams.
- Literacy: Understanding what AI is, what to expect of it, and how it might break.
- Readiness: Knowing how to skillfully (and efficiently) leverage AI-powered tools to accomplish work at a higher quality.
Protecting your team with AI safety training is like arming a new cyclist with knee and elbow pads: It might prevent some scrapes but won’t prepare them for the challenges of intense mountain biking. Meanwhile, AI readiness training ensures your team uses AI and ML to their fullest potential.
The more you give your workforce the chance to safely interact with gen AI tools, the more they will build the right intuition for success.
We can only guess what capabilities will be available in the next 12 months, but being able to tie them back to the same rubric (reasoning levels) and knowing what to expect as a result can only better prepare your workforce to succeed.
Know when to say, ‘I don’t know,’ know when to ask for help and, most importantly, know when a problem is out of scope for a given AI tool.
Cal Al-Dhubaib is head of AI and data science at Further.