We were taken aback by the overwhelming response to our first AI blog post, Deus ex Machina. Wow. As we parsed the comments and website traffic analytics, we couldn’t figure out what was driving the outpouring. AI is a hot topic, but there have been lots of hot tech topics. The people who follow us are tech people, so…what’s all the angst? When even the non-tech people we know (relatives, friends, drinking buddies) began asking us questions about AI, we suspected there was more at work here.
It’s time for a rational conversation about AI. What it is. What it isn’t. And what it means to you and your organization.
What AI Is…and Isn’t
Following AI conversations online can be enough to induce PTSD. Some are constructive, many are sales pitches, but the loudest parts of the hype cycle boil down to one of two positions. AI is a benevolent, magic e-wizard machine that will unlock the mysteries of the universe and transform the planet for “the good of humanity.” Or AI is a maniacal hell-bot intent on annihilating the world and transforming the planet for the not-so-good of humanity.
Maybe the question to ask is “where’s the line between fantasy and reality when it comes to AI and what it delivers?”
First, AI isn’t a monumental force with a mind of its own. As we said in Deus ex Machina, AI is a category of technology with different types under it. Machine learning (ML) is a subset of AI, and it’s already driving operational decisions for many companies. Generative AI is what’s fueling the Fear, Uncertainty, and Doubt (FUD) hype cycle.
AIs are simply programs built on mathematical algorithms (algos) written by humans, and they perform functions in response to requests. ChatGPT, Dolly, and Bard are all AIs, but each has less brainpower than an earthworm. They can only function when fed, or “trained,” with data.
AIs do not reason, nor do they have context. Because computer models have no social norms, morals, or ethics, no one really knows what will happen when you apply them to the human condition. Pure black-and-white thinking doesn’t always work well in a shades-of-gray world. As a result, poorly implemented generative AI applied to human problems usually delivers unintended results. What you do with those results either further conditions the AI or cripples it. Either way, synthesized results keep feeding the AI new data that can push the algo in unforeseen directions.
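Here’s a minimal sketch of that feedback loop, using a toy statistical “model” rather than any real AI system; every number is invented for illustration. Fit a distribution to data, then repeatedly retrain on samples the previous fit generated:

```python
import random
import statistics

# Toy illustration of a training feedback loop: each generation "retrains"
# on synthetic data produced by the previous generation's model, so the
# estimated parameters wander away from the original data.
random.seed(42)
data = [random.gauss(100.0, 15.0) for _ in range(500)]  # the original "real" data

for generation in range(6):
    mu = statistics.mean(data)
    sigma = statistics.stdev(data)
    print(f"gen {generation}: mean={mu:6.1f}, spread={sigma:5.1f}")
    # The next generation never sees real data, only the model's own output.
    data = [random.gauss(mu, sigma) for _ in range(500)]
```

No one tampered with the algo; the drift comes entirely from the model consuming its own synthesized output.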
These facts point to one truth. Everything in AI depends on the data underlying the model. People assume they know their data—where it came from, what’s in it, and how an algo will see it. That’s where things start to go off the rails.
Well, That Escalated Quickly
Amazon found this out with an AI program intended to screen job applicants. The company wanted it to review hundreds of resumes and spit out the top candidates for hiring. The computer models were trained to vet applicants based on patterns detected in resumes submitted to Amazon over a 10-year period. That’s a fairly large data set. Amazon had a good idea of the types of candidates it was looking for, and it had hired tens of thousands of people, so it wasn’t new to the process. What could go wrong?
What went wrong was an unintentionally biased data set. When you’re screening resumes, you’re actually looking for the people behind them. Each resume represents a unique person, even though resumes are often written and formatted in similar ways. In Amazon’s case, the majority of its resume data set came from men, which simply reflected the fact that there were far more men than women in technology in 2014. The program diligently selected a pool of white males educated at large, well-known universities. The model “learned” that male candidates were preferable and downgraded resumes that included the word “women’s,” as in women’s sports or women’s college names.

Clearly, that didn’t fly at a modern, publicly traded corporation. Even after repeated changes attempting to make the program gender-neutral, it still delivered discriminatory results, and it was scrapped in 2017. Fixing the results would have required fixing the data set, and that’s the catch. You can’t change the data without losing what you were looking for in the first place: real people with real qualifications.
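Here’s a toy sketch of that failure mode. This is our illustration, not Amazon’s actual system, and the data and scoring rule are invented for the example: score a resume by how often each of its words co-occurred with a past “hire” decision. When the historical data skews male, any token correlated with women gets penalized automatically.

```python
from collections import defaultdict

# Hypothetical historical data: (resume text, hired?). The skew is the point:
# most hired resumes came from men, so tokens correlated with women rarely
# appear in the "hired" pile.
history = [
    ("software engineer chess club", True),
    ("software engineer rowing team", True),
    ("data analyst chess club", True),
    ("software engineer women's chess club", False),
    ("data analyst women's rowing team", False),
]

hired_count = defaultdict(int)
total_count = defaultdict(int)
for text, hired in history:
    for word in text.split():
        total_count[word] += 1
        hired_count[word] += hired  # True counts as 1

def score(resume: str) -> float:
    """Average historical hire rate of the resume's known words."""
    rates = [hired_count[w] / total_count[w]
             for w in resume.split() if w in total_count]
    return sum(rates) / len(rates) if rates else 0.0

# Identical qualifications; the only difference is the word "women's".
print(score("software engineer chess club"))          # ~0.67
print(score("software engineer women's chess club"))  # ~0.53
```

Note that no field says “gender”; the bias rides in on proxy words. That’s why attempts to neutralize specific terms keep failing: the skew lives in the data set itself, not in any single feature.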
Another, more comical example shows what happens when an algo leans into a specific characteristic or feature of its underlying data set. This year, the UK creative agency Private Island released an AI-generated video using a new model and an internal data set of millions of images and video clips. Synthetic Summer is a 30-second distillation of 20 minutes of video: it starts as a backyard barbecue party, turns strange at about the 10-second mark, and devolves into a fiery inferno with an uncertain ending, at least for the simulated partygoers. Cue the coroner.
Director Chris Boyle says, “At first glance, it passes for normal, but then when you look closer, you realize how much is wrong. That, I think, makes for an unsettling experience at an almost primal level as I think unconsciously a viewer picks up on it instinctively…you know it’s wrong, but maybe not why. It’s grasping for human, but not quite reaching, or overreaching, and visually encapsulates where we are right now with AI.”
If this is where we are with AI, where exactly is that?
On one hand, a lot of responsible work is being done leveraging data models and algorithms to solve real business problems. As we mentioned in Deus ex Machina, what’s touted as “AI” is often actually the rapid productization of ML technology to deliver data analytics and actionable predictions, improving the efficiency of everything from job applicant screening to fraud and risk management.
On the other hand, generative AI is fueling and inflating the hype cycle with examples that are by turns entertaining, freaky, or just plain distracting. Synthetic Summer is entertaining, if also a little disturbing. Getting married by ChatGPT is a distraction. Talking animatronic robots with creepy prosthetics, perfect makeup, and odd outfits holding a U.N. press conference and claiming to be better world leaders than humans? Hugely freaky, not to mention performative. Who’s really under the table pulling the levers? We weren’t sure which was weirder: the robots, or the overly earnest, bought-in Global Summit attendees taking it all seriously.
Either way, it’s all driven by data, and the underlying data is where we need to start. In our next post, we’ll dive into the provenance of the data in AI data sets, aka “Where’d This Sh** Come From?” AI models can only act on the data they receive. To achieve the goals of your AI algo, you have to know where the data originates, as well as its purpose, sensitivity level, structure, movement, and relationship to other data and users. That’s why data surveillance is the key to understanding what you have in your data set.
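To make that concrete, here’s a hypothetical sketch of what tracking those attributes might look like in code. Every name below is illustrative, not any real tool’s API; it’s just the paragraph above expressed as a record you’d attach to every data source before it enters a training set.

```python
from dataclasses import dataclass, field

# Hypothetical provenance record; each field mirrors a question from the
# paragraph above. Field names and values are illustrative only.
@dataclass
class DataProvenance:
    source: str                    # where the data originates
    purpose: str                   # why it was collected
    sensitivity: str               # e.g. "public", "internal", "pii"
    structure: str                 # e.g. "tabular", "free text", "video"
    movement: list[str] = field(default_factory=list)    # systems it has passed through
    related_to: list[str] = field(default_factory=list)  # linked data sets and users

resume_archive = DataProvenance(
    source="careers-portal exports, 2004-2014",
    purpose="historical hiring decisions",
    sensitivity="pii",
    structure="free text",
    movement=["applicant tracking system", "data lake", "training pipeline"],
    related_to=["hiring outcomes table", "recruiter accounts"],
)
print(resume_archive)
```

If you can’t fill in a record like this for a source, that’s a signal the data doesn’t belong in your model yet.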