BRITE Ideas
Don't lose the AI predictive forest for the genAI trees with Eric Siegel '98SEAS
On Episode 19 of the BRITE Ideas podcast, Matt speaks with Eric Siegel '93SEAS '95GSAS '95SEAS '98GSAS '98SEAS (ah, the multiple degrees that accumulate during a custom-designed PhD effort), former faculty at Columbia University, and author of Predictive Analytics and, his latest, The AI Playbook.
At the core of the conversation is Eric's reminder that traditional AI/ML efforts -- "predicting who will click, buy, lie or die," etc. -- are highly effective and still underutilized within organizations. The hype around generative AI is certainly warranted, but Eric is eager to help data scientists and managers learn a shared business framework and paradigm so that more established predictive AI models move beyond testing and into production. Common misunderstandings about the true predictive accuracy of AI are discussed, as well as the vital role of ethical reviews of model use. In particular, Eric highlights how important it is to avoid deploying models that create higher false positive rates for some groups and thus deny or penalize them without just cause.
[Eric Siegel] And that's sort of the irony: generative AI, at the same time as it's so seemingly human-like, actually offers less potential autonomy in comparison to predictive, simply because predictive takes on essentially less ambitious but potentially more impactful tasks. Its goal is to improve a large-scale process that's already been systematized.

[Intro narrator] Welcome to the BRITE Ideas Podcast, where we discuss how brands build relationships with consumers and society through innovation, technology, and marketing. BRITE Ideas is produced by the Center on Global Brand Leadership at Columbia Business School.

[Matthew Quint] I'm Matthew Quint, director of the Center on Global Brand Leadership.

[JP Kuehlwein] And I'm JP Kuehlwein, adjunct faculty here at the school and principal at Uberbrands Consulting.

BRITE Ideas is sponsored by Lexicon Branding, a specialized consulting firm that develops inspiring brand names and brand architectures for both the Fortune 500 and today's innovative startups, and Kogan Page, an independent award-winning publisher that delivers best practices and innovative thinking from global experts across every key business subject.

[Matthew Quint] On today's BRITE Ideas podcast, I'm hosting Eric Siegel, author of Predictive Analytics and his most recent book, The AI Playbook, which came out in the early part of 2024. He's also the founder of Machine Learning Week, and he has his PhD from Columbia University's School of Engineering and Applied Science. I really look forward to discussing the future of machine learning, and in particular the balance of applying AI with the need for organizational transformation. Welcome to the podcast today, Eric. Thanks for joining me.

[Eric Siegel] Thank you, Matt. It's great to be here.

[Matthew Quint] Excellent. I don't have my co-host JP with me today; he is unavailable, so we'll have a little one-on-one session. Why don't you kick it off with a bit about your background and the drivers that led to your latest book.

[Eric Siegel] I was finishing my PhD in '97 (I think it was awarded in '98) in the computer science department, and my focus was very much on machine learning. I was immediately on the faculty in the computer science department for a few years after that, teaching grad-level courses in AI and machine learning, and involved with some startups. But I've been an independent consultant for more than 20 years now, and I run the Machine Learning Week conference series, formerly Predictive Analytics World. And I'm excited to get this new book out there, because I think what's happened in the field is that the hype has gotten a little bit the better of it. When I was an early consultant and then writing the first book, it was like, "Hey, look, learning from data to predict, that's really valuable. People, pay attention, let's get excited about this." So it was a bit of a cheerleading, evangelizing thing. Now it's more like, let's pull in the reins, because the excitement right now about all things quote-unquote "AI," which is an amorphous term at best, is kind of overkill. There's a bit of mismanagement of expectations, which is to say hype, although at the same time there have been some really amazing, exciting advances, especially with generative AI. Most of my career focuses on predictive AI; we can discuss the difference. And the conference, Machine Learning Week, focuses more on predictive.
But we do have a new sister conference called the Generative AI Applications Summit, and the two are co-located the first week of June in Phoenix.

[Matthew Quint] Maybe this definitional element around machine learning, AI, et cetera, is a great place to start. I remember in the preface of the book you use the "a rose by any other name" quote in talking about realizing that "prediction" was the terminology that really drove business attention to what machine learning was capable of. So start us off with a little more about the lingo we'll use through the podcast, machine learning versus AI: both your perspective on this and where you see challenges in how business and society use these terms.

[Eric Siegel] Yeah. Well, when people use the word AI, the word "intelligence" tends to mislead; the term AI in and of itself tends to overhype. Whereas machine learning is well defined: it's learning from data to predict. And under the hood of both generative and predictive AI, you have that capacity being applied. In the case of predictive applications (predictive analytics, predictive AI), that's the main function. The value you get is the ability to predict who will click, buy, lie, or die; which car is going to break down; which satellite is going to run out of battery; and which is the best location to drill for oil. You're predicting the outcome or behavior that directly informs an operation. And this is the type of technology we turn to for improving, with the best of science and the best of math, our existing large-scale operations. Whereas generative AI is learning from data to predict what should be the next word while "I'm" writing, or how I should change this pixel while I'm rendering an image, where in that case "I" is the computer. The "generative" of generative AI doesn't refer to a particular technology or technical approach; it just refers to the way you're using the technology: to generate new content items. So writing, code, images, video, music, spoken word, whatever it is, right? And the ability to do that obviously has a great number of applications, so long as we understand that you need a human in the loop; we need to review or proofread every item that it generates. And I think that's a little bit of a missing link when people get so excited about generative and sort of expect the moon. There's this narrative in the public that we're headed actively towards what's called AGI, artificial general intelligence, which is a very simple concept: computers that can do anything a person can do. So let's just call it what it is: artificial people, which means you could literally use it to run a Fortune 500 company, onboard it just the same as a human employee, and let it rip. I think that's unrealistic. I don't think it's technically impossible, maybe some number of centuries or millennia in the future, but I argue that, despite how incredibly, seemingly human-like generative AI is, it does not represent a concrete step towards AGI.

[Matthew Quint] I think the hype of it being AGI has become more controlled once again, right?
I think when ChatGPT came out in November 2022, its unbelievable leap from where things had been even just a year before struck that chord of feeling like it was truly a thinking machine, along with some pieces around, you know, some of the modelers and engineers feeling intense power from the models they developed. But I think the hype has peeled back a little bit because of things like the failures that come.

[Eric Siegel] Yeah. So the question then is, what's the killer app? You know, the idea of a killer app: for the personal computer, the Apple II and what have you, it took off because the killer app eventually emerged, which in that case was the spreadsheet; VisiCalc was one of the first main ones. And then all of a sudden people needed to buy their own computer for their own desk. Now, there are some really great value-adds. For example, if you're a coder and you're advanced enough, you can use its first passes at creating certain segments, or even pages, of code that solve certain kinds of problems, then use that as a starting point: maybe this is helpful, maybe I just need to modify it a bit. And some coders say, well, that's the killer app. But I would argue that we're kind of waiting for Godot; we're waiting for the killer app, which is going to be something that is autonomous, that can do things people normally do without a human in the loop at each intervention. So I wish I could agree with you that the hype has been tempered. In certain pockets it has, but for the most part there's still the narrative that we're going to get a killer app in that sense, that we're headed towards AGI, and that the problems it has, where it sort of makes things up and people call it hallucinating, are just an issue that needs to be worked out. Whereas no, that's everything. The fact is, it's not surprising that it gets things wrong; it's surprising how often it actually gets things right, which is quite remarkable, but it's not dependable in that way. And yes, there are lots of ways to have it figure out how to make citations; there are a lot of improvements to be made, but none of those move definitively towards the ability for it to autonomously take over. I mean, we're not seeing corporations release chatbots run by generative AI fully autonomously, directly interacting with the consumer. Instead, you know, it might offer a paragraph of draft copy for a human customer service agent to consider pasting into the chat they're conducting. That's helpful; that can improve productivity greatly. But there's a big difference. I actually just published an article in Forbes called "Three Ways Predictive AI Delivers More Value Than Generative AI." And that's sort of the irony: generative, at the same time that it's so seemingly human-like, actually offers less potential autonomy in comparison to predictive, simply because predictive takes on essentially less ambitious but potentially more impactful tasks. Its goal is to improve a large-scale process that's already been systematized. That's what it means to be a mature organization: you've already got operations in targeting marketing, fraud detection, credit risk management, and supply chain that are already systematic.
What you do by learning from data is make those decisions more correctly, more often. You tip the odds in the numbers game that is business more effectively in your direction, and you can actually deploy those models fully autonomously because there's more room for error: the operation is already making errors in the first place. Now there will be fewer false-alarm holds on transactions due to fraud; that happens less often because the model's in place. And conversely, the other error, where a fraudulent transaction is allowed to go through: those errors will also continue, but they'll happen less often as we deploy a better machine learning model, a predictive model. That's what you get from machine learning. So it's kind of funny that the two areas both belong under the quote-unquote "AI" umbrella, because it's really apples and oranges. On one side, writing first drafts for a human to then consider and use: extremely helpful for many tasks, although how helpful really depends on trying it out, the context, and whichever language model or kind of generative model you're using. On the other side, much simpler, lightweight models for these predictive tasks can have a much bigger, broad-scale, systematic, rote impact over very large-scale operations that have already been systematized. Now, they do belong under the same umbrella, and they're both built at the core on machine learning. And technically it's not a competition between the two; it's not a zero-sum game. It's sort of like saying water parks and ski resorts don't compete, right? But it is a competition for eyeballs and time and media coverage and budgets. And in that respect, I think the world has it wrong. There's too much in the realm of expectation, hyped valuations, and investments in generative right now, as cool as it is. I mean, I was in the PhD program at Columbia for basically six years under Kathy McKeown, who was my advisor and was running the natural language processing research group. So I was part of that group for six years, and I came out of it thinking, well, it's all edge cases; we're never going to see it start to do things in a fluent way, the way humans do. I really did believe that. And when the IBM computer won the game Jeopardy!, I thought that was the coolest thing ever. In my first book, Predictive Analytics, I spent a chapter on it; I think it's amazing. And yet that doesn't mean it's a step towards recreating human capabilities in general, and it's not necessarily a step towards allowing computers to autonomously take over nearly as much human activity as the narrative would suggest.

[Matthew Quint] Traditional predictive machine learning is much more something that occurs behind the scenes and doesn't have the wow factor for the common man, because what it's executing isn't something you can just play around with. So I think that's exactly it: the hype has risen in large part because of the ability for anyone to play with the new generative AI technology that's out. I read Tom Davenport, who I suspect you know, his book The AI Advantage, back in 2018.
There's a quote I use sometimes when I'm speaking about this in class or other environments: at the time, in 2018, AI was just an extension of the data and analytics work companies were already doing, right? Crafting better predictive models around systems with the data that already existed within companies. So let's talk a little bit about this framework. You call it bizML. Dial into your bizML framework: how are these predictive analytics systems, which are still scaling in terms of their use across organizations, actually happening, and what are the steps to effectively craft a bizML effort?

[Eric Siegel] Yeah, everything you just said is spot on; it's a great way of putting it. There's something more intuitive, by definition, about dealing with human language, playing with a chatbot like ChatGPT or what have you. And yes, the corporate use of predictions on a per-case basis for targeting marketing, fraud detection, and all those large-scale operations is more behind the scenes, less visible, and less sexy. But it turns out that the unsexiest projects are actually the sexiest, because they're potentially the most valuable, and in many proven cases that really turns out to be true. So predictive is where the greater established track record and potential value tend to be. Every problem you're trying to solve is a case-by-case question of which particular technology is best suited for it, but more often than not, and definitely more often than the public narrative would indicate, predictive AI does trump, does eclipse, generative AI, if you had to compare. And I only say that because I think we need to temper the hype and make sure we're looking clearly. Predictive is what you referred to as classic; a lot of people call it that. It's older, but it's not old school. It's where most of the money still is: the vast majority of investments, proven returns, and opportunities. And at the same time, it's also where the opportunities remain, because it's still failing to a large degree. Companies are not mature. Other than big tech and a handful of leaders, new machine learning projects routinely fail to deploy. And one of the main reasons is what the titular playbook of my book, The AI Playbook, which I call bizML, means to address: the business world at large, especially outside of data scientists, hasn't come to realize that you really do need a very particular, specialized business process framework, a paradigm, in order to run machine learning projects, and I'm talking about predictive AI projects, successfully through, so that not only do you do good number crunching and create a viable predictive model, statistically speaking, but it actually gets deployed. It gets integrated, moved into production, operationalized. It makes a difference; it makes a change. The existing large-scale operations it's meant to improve actually do change by way of its predictions, in terms of targeting marketing, fraud detection, and all these kinds of things. And that last step is the big one. That's the whopper, and it's the one that tends not to get planned rigorously enough. We have to put our money where our mouth is and actually plan these projects.
Not so much as just technology projects where we're using the best technology, great, but as process change projects. It's an operations improvement project that happens to use machine learning. So there's a reframing needed, and an understanding that we need a clearly described business practice. I break it down into six steps in the book, corresponding to six main chapters. But the biggest takeaway, before I break it into those steps that culminate with deployment, is understanding that we need a deep collaboration between the quants and the business people, between the data scientists and their clients, the stakeholders who are in charge of whatever large-scale operation is meant to be improved with the predictions output by the machine learning model. That collaboration must be deep, end to end, across all six steps, so that we can actually plan for a feasible, meaningful deployment, and everyone's on the same page, agreeing on how operations are going to change according to the math, according to the predictions, the probabilities output by the model. Which means the business-side people need to ramp up on a certain semi-technical understanding, and basically that consists of, for any given project: what's predicted, how well, and what's done about it. That trio is what it really comes down to. So for example, I'm going to predict which customers will cancel in the next three months in order to target a retention offer, a discount I couldn't afford to give my entire customer base. If I target it, prediction is the only recourse for aiming that kind of offer effectively. Then "how well" is the metrics that describe how good the model is, its performance level, both in terms of pure predictive performance and in terms of the potential business metric improvements, the KPIs: the profit, the return on investment, the savings, whatever the project is meant to do. Everyone's got to get on the same page on those semi-technical details. This isn't the rocket science; this is how to use the rocket science, how to capitalize on it. And you need to get involved at that level of detail as a stakeholder so that you can participate in an informed way across the project.

[Matthew Quint] Yeah, and that's what I enjoyed about the book, being on the non-technical side but capable of understanding the basics of a technical discussion: the need to link those elements of what the model is predicting and what the business value is, and how to apply something that makes good use of the predictive model. What I liked very much was this: a prediction with any value, any better capability than quote-unquote gut, or than executing something with no attempt at predicting what's going to happen, just blasting out and seeing what lands, is valuable. It reminds me of being a fan of Richard Dawkins, who has a very evolutionary standpoint: if you only see darkness, just being able to see a little bit of light is an incredible advantage, one that eventually turns into this crazy mechanism that is the human eye, for example. Each step is a little bit of an improvement and can do wonders.
Talk a little bit about how you contemplate these small improvements that prediction can provide but that yield large value. Maybe give one example.

[Eric Siegel] Yeah, sure. I show arithmetic where, if you're doing direct marketing, mass marketing, you might have, let's say in the best-case scenario, a 1% response rate. So you're literally wrong 99 out of 100 times that you spend $2 to send a color brochure. If you can target that more effectively with prediction and, for example, find the 25% of customers who are three times more likely than average to respond, that's a significant lift; that's called a lift of three. It's one of the technical metrics. It doesn't directly tell you the business bottom line, but it's the kind of thing data scientists work with: the relative, pure predictive performance. You've identified a pocket three times more likely than average to respond to marketing, but 3% is still not a slam dunk. You still don't have high confidence that any individual will definitely buy; however, you're tipping the numbers game. If you use that to target your marketing more effectively to a group like that, then under the right circumstances the bottom-line profit of the marketing campaign could easily skyrocket by a factor of more than five. It just takes some back-of-the-napkin arithmetic to show that. So I call that the prediction effect: a little prediction goes a long way.
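Here's a minimal sketch of that back-of-the-napkin arithmetic, using the 1% response rate, $2 brochure cost, and lift of three over the top 25% from Eric's example; the list size and revenue per responder are hypothetical numbers chosen for illustration:

```python
# Back-of-the-napkin "prediction effect" arithmetic (illustrative numbers).
LIST_SIZE = 100_000          # hypothetical mailing list
COST_PER_BROCHURE = 2.00     # $2 per color brochure (from the conversation)
BASE_RESPONSE_RATE = 0.01    # 1% best-case response rate (from the conversation)
REVENUE_PER_RESPONDER = 220  # hypothetical revenue per sale

# Mass marketing: mail everyone.
mass_profit = LIST_SIZE * (BASE_RESPONSE_RATE * REVENUE_PER_RESPONDER - COST_PER_BROCHURE)

# Targeted marketing: mail only the top 25%, who respond at 3x the average (a lift of 3).
targeted_n = int(LIST_SIZE * 0.25)
targeted_rate = BASE_RESPONSE_RATE * 3  # 3% response rate in the targeted pocket
targeted_profit = targeted_n * (targeted_rate * REVENUE_PER_RESPONDER - COST_PER_BROCHURE)

print(f"Mass-marketing profit: ${mass_profit:,.0f}")   # $20,000
print(f"Targeted profit:       ${targeted_profit:,.0f}")  # $115,000
print(f"Profit multiplier:     {targeted_profit / mass_profit:.1f}x")  # ~5.8x
```

With these assumed numbers, targeting the top quarter of the list multiplies campaign profit by roughly 5.8, consistent with Eric's "factor of more than five."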
[Matthew Quint] You have a couple of these nice terms. Another one is the accuracy fallacy, which I think gets back to the hype issue: a lot of the language and dialogue is actually about the wrong things relative to the operational and business value that comes from effectively deploying a well-run model. So maybe you can dive into that: the press gets caught up in the accuracy of a model, but that's not necessarily what you care about in terms of its down-the-road impact on business decisions.

[Eric Siegel] Yeah. What I call the accuracy fallacy is that researchers publish articles in a way that ensnares the media by purporting quote-unquote "high accuracy." Now, accuracy is just how often you're correct, and it turns out that's usually not a pertinent metric. It's only a technical metric; it doesn't directly tell you the business value, and it also tends to be very misleading. Here are a few real headlines. Newsweek: "AI can tell if you're gay: it predicts sexuality from one photo with startling accuracy." The Spectator: "Linguistic analysis can accurately predict psychosis." The Daily Mail: "AI-powered scans can identify people at risk of a fatal heart attack almost a decade in advance with 90% accuracy." And one more example: The Next Web said "this scary AI has learned how to pick out criminals by their faces." The fact is, all of these projects were able to get lift, like I just mentioned; they were able to predict better than guessing, which is oftentimes very valuable. But the way these things are phrased, using the word accuracy, implies that they do a good job discerning positive and negative cases. And it's usually a binary classification: you're trying to predict yes versus no. Is a criminal, is not a criminal; will purchase our product, will not; is fraudulent, is not. It tends to be yes/no predictions, and the implication is that the model can discern between those two classes and usually be correct a great majority of the time for positive cases and for negative cases, which would be clairvoyance. We don't have clairvoyance, and we can't expect computers to have it either. But the way these things are presented, even casual technical readers who don't dig into the detail will potentially come away with that misleading narrative. It's totally a lie, and I call it the accuracy fallacy. When you're trying to predict outcomes and behaviors, such as many of these human behaviors, who will click, buy, lie, or die, or commit an act of fraud, what you can do is predict better than guessing, but you cannot get it to be discerning in the sense of being correct the vast majority of the time for both positive and negative cases. That's how I open the chapter on metrics: look, let's get real, people. If you look at the back-of-the-napkin arithmetic, you see: okay, yeah, it can tell who's gay two thirds of the time, but if you do that, it's actually going to be wrong half the time that it says somebody is gay. And the problem is, whatever you're trying to predict, there's typically a minority class. In that particular example, which came out of Stanford, predicting sexual orientation by photograph, the base data was 93% heterosexual and 7% gay. And that tends to be the case. Who's going to buy? Maybe that's 1%, or 0.1%. How many cases are fraudulent? That tends to be one in a thousand. The things that are most important, the things a business can leverage, are the things that happen less often: who's our very most valuable customer, that kind of thing. Because of that imbalance, it's very hard, just mathematically speaking, to discern instances of that minority class without lots of false positives, without incorrectly labeling a lot of the majority class as the minority class. It just turns out to be really hard. So it's really important to be realistic about how well we can actually predict, and to avoid this misleading stuff. It's the opening of that chapter; I've also got it as a Scientific American blog article, and both link to an article on my own blog where I list some 70 other examples where the headline is absolutely ridiculous if you take it at face value; if you dig into the research paper behind it, it's just not doing what the headline conveys has been done.
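A quick sketch of the base-rate arithmetic behind the accuracy fallacy: with a 7% minority class (the split Eric cites from the Stanford study) and a hypothetical classifier that is 90% correct on each class separately (these rates are illustrative, not the study's actual numbers), most positive flags are still wrong:

```python
# Why "high accuracy" headlines mislead on imbalanced data (illustrative rates).
base_rate = 0.07      # minority (positive) class prevalence, per the study's base data
sensitivity = 0.90    # hypothetical: fraction of true positives correctly flagged
specificity = 0.90    # hypothetical: fraction of true negatives correctly passed

true_pos = base_rate * sensitivity                 # correctly flagged positives
false_pos = (1 - base_rate) * (1 - specificity)    # majority-class cases wrongly flagged

precision = true_pos / (true_pos + false_pos)
accuracy = base_rate * sensitivity + (1 - base_rate) * specificity

print(f"Overall accuracy: {accuracy:.0%}")               # 90% -- sounds impressive
print(f"Precision of a positive flag: {precision:.0%}")  # ~40% -- wrong most of the time it flags
```

Even a model that is 90% correct on each class separately is wrong the majority of the time it flags someone as belonging to the rare class, which is exactly the gap between headline "accuracy" and real-world usefulness.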
[Matthew Quint] The Stanford one that you used as an example I thought was great: the model is great at predicting if it's given two faces and knows in advance that one of them is gay and one of them is not. But that's not the real world. The real world is, to your point, only 7%, or whatever small percentage of people, are actually gay. So show it two random faces where it doesn't know whether either is gay, and the computer is never going to have that level of accuracy.

[Eric Siegel] Yeah, whatever you're predicting, what you just described tends to be probably the most popular technical metric. I call it the pairing test. You give it a positive and a negative example; it's already preordained and known that one's positive and one's negative, which of course you couldn't normally do unless you'd already solved the problem. And then: how often does it correctly discern one from the other? That's a misleading metric depending on how you phrase and deliver it, especially when people incorrectly use the word accuracy to refer to it. For those technical listeners familiar with predictive modeling and machine learning, you'll know it as AUC: area under the receiver operating characteristic curve. What most people, even the most technical listeners, don't know is that the AUC metric is equivalent to the pairing test. So when people tell the pairing-test narrative, it tells a sweet story; they're alluding to that very popular technical metric, which is useful in a way, to show, hey, this thing's predicting better than guessing, we've gotten some value from the data, but it's very indirectly related to any potential business value, whatever operation you're trying to improve. Taking a bigger step back, what we're talking about is clickbait, right? The clickbait culture has infiltrated the best of running business with science. Let me say that again, because it's a really big deal: the clickbait culture has infiltrated the best of running business with science. I should have put that in the book. Truth is not necessarily the goal; clicks are the goal. Which, I know, you'd mentioned earlier in an email that you wanted to get into what it means when we use what I just referred to as the best of science for business, and when I say the best, I mean the most potent. What does it mean when we use that to optimize, say, social media eyeballs, if not at all costs, then with disregard for the side effects or any other goals? Broadly speaking, when you get into the ethical ramifications, the problem is when you're only pursuing one goal and you're not also quantifying other considerations and taking them into account. And that's when people start being concerned about social injustice that's perpetrated by predictive models.
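To make the equivalence Eric mentions concrete, here's a small sketch (with made-up scores and labels) that estimates the pairing test directly, the fraction of positive/negative pairs the model ranks correctly, and checks it against scikit-learn's AUC:

```python
# AUC equals the "pairing test": the probability a random positive case
# scores higher than a random negative case (ties count as half).
from itertools import product
from sklearn.metrics import roc_auc_score  # assumes scikit-learn is installed

# Made-up model scores and true labels, for illustration only.
y_true  = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
y_score = [0.9, 0.3, 0.6, 0.8, 0.4, 0.2, 0.5, 0.7, 0.1, 0.6]

pos = [s for s, y in zip(y_score, y_true) if y == 1]
neg = [s for s, y in zip(y_score, y_true) if y == 0]

# Pairing test: for every (positive, negative) pair, does the model rank them correctly?
wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in product(pos, neg))
pairing_test = wins / (len(pos) * len(neg))

print(f"Pairing test: {pairing_test:.3f}")
print(f"AUC:          {roc_auc_score(y_true, y_score):.3f}")  # same value
```

The two numbers match: AUC is the pairing test. That's why a strong AUC can coexist with the low real-world precision shown in the earlier base-rate sketch.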
[Matthew Quint] I come from a nuclear science policy background, Eric, prior to my time at Columbia, and in the energy field and other areas, what I learned was the term "externalities." I don't think business and society, governments, business, society, have done nearly enough to contemplate externalities, the spinoff effects of what you're doing. How do you work on tackling those challenges with AI projects, or providing recommendations in the field? Who do you think is doing it well? These spinoff implications of AI: how do you recommend companies think about them?

[Eric Siegel] Yeah, I mean, what you just described is: what does it mean to align incentives? And where should that governance come from? Does it come from the government? Is it internal governance, or industry governance? That's above my pay grade. It would be nice if we all held hands and had everyone's best interests in mind, and I think there are a lot of efforts headed in that direction, broadly speaking. You know, there are sort of two different ethical considerations that we've alluded to. One is that if you're just optimizing eyeballs, as you mentioned, it can be sneaky; the ethical problem there can kind of sneak up on you. We're just trying to get as many advertisements seen by as many eyeballs as possible, which means we want people to engage with content. And that sounds innocent enough, maybe, depending on your phraseology and tone of voice, right? But it's really very closely analogous to an online gambling site trying to make sure the customers upload their next ten bucks to lose, and in that sense retain the customer, where at least a certain segment of the population is particularly vulnerable to the highs. Just as it's so heady and intoxicating to buy into the AI hype and think these machines are just going to become all-powerful, right? Because once it's human-level, it's almost god-like, since it can self-improve. So it's a very heady narrative. Likewise, gambling is very heady. There's an analogy there, and there's an analogy between that and the addictiveness of social media. It's not something we can ignore. On the other hand, there are also the ethical considerations of consequential human decisions being driven by models: who gets approved for a loan, who gets housing, and even who goes free from incarceration when you're using a predictive model to drive parole and sentencing decisions, as in predictive policing applications. In that case, it's making consequential decisions. I've done a lot of writing about that latter case; I've published a dozen op-eds, in The San Francisco Chronicle, the Scientific American blog, and elsewhere, on the civil rights and social justice issues that come with the deployment of predictive models. If anyone wants to read those 12 op-eds, they're all at civilrightsdata.com, along with some videos that explain it a little better. And in that realm, I think you basically can quantify and address quote-unquote "bias," which would exist, for example, where there's a disparate false positive rate: in other words, wrong, costly decisions are made more often for one group than another. That's where it becomes what they call bias. And I think in most cases you can address that without much cost, without any significant cost to the overall primary business or organizational objective. That's my opinion, and that's the trend I think we're moving towards. Whereas this decision about, hey, should we make social media less addictive? Should we make online gambling less addictive?
Those questions, I'm not sure there's such potential for addressing those ethical concerns without costing the corporation on the bottom line.

[Matthew Quint] Right. And then that becomes something society takes a role in, in terms of government regulation. I don't think we'll dive into that here, but the wheels are already spinning, as we know, on how these kinds of models are going to be viewed from a regulatory standpoint: where there will be permissions, whether there will be exceptions, and what will need to be dialed down in certain ways for the betterment of society. It gets into another really important thing in the book that you talk about, for both the business side and, as you bring up a lot in the book, the data science side, which is the data. On the ethical side, a lot of the issues we're talking about, for example predictive models for parole decisions: the challenge is that human decisions are baked into the data being analyzed for predicting a likely recidivist. If, as you note in the book, we have more police in areas in which there are more African Americans, those residents are by nature going to get incarcerated again more often. So the data is based on human decision-making, which creates a kind of biased data set that goes in. Moving aside from the ethics, what that data is, is hugely important to creating the model, as important or more important, as you argue in the book, than the model itself. So talk through the importance of the data contemplation overall, and then the corollary, which is matching the data you build on to the data you expect to deploy on.

[Eric Siegel] Yeah, let me clarify a couple of things. I think people in general are familiar with: hey, look, there's systemic racism, and that's going to pan out in the data, so anything that learns from the data is going to somehow inherit it. That's true in ways people maybe haven't quite realized, and untrue in other ways. The way in which it's sort of untrue is that you don't typically simply try to get the computer to predict what the human decision-maker would do. You're not just trying to replicate what an HR director's hiring decisions would be, or what a parole board's decision would be. Instead, you're actually trying to predict the pertinent outcome or behavior. Once released, will this convict be prosecuted and convicted again? That's called recidivism. Once hired, will this prospective employee excel at the job, stick around for at least two years, whatever the outcome or behavior you care about as an organization. So in that way, the whole point is that there's very much potential for the computer to be less biased than the humans, because it's going to make those predictions based on the particulars of the individual. It's going to learn to do that as well as possible, without whatever preordained biases the humans making those decisions might have had. But it turns out there are a couple of ways in which historical data is, of course, heading things in a wrong direction, which, by the way, we can account for, if that's our goal, if we care about such things. One is the ground truth.
So for example, in the case of recidivism prediction, the problem is that what we really care about is whether they're going to commit a crime again; whether they get convicted again is only a proxy for that. And with certain subgroups being more heavily policed and investigated, they're going to get away with the crime less often, so it's going to artificially inflate those groups' representation across the quote-unquote positive examples from which the computer learns, the cases where they did recidivate. That's one problem. Even if you somehow eliminated or adjusted for that problem, it turns out that underprivileged groups who are higher risk for whatever outcomes drive these decisions, in predictive policing, loan applications, job hiring, and everything else, because those communities are in a cycle of disadvantage, may display negative outcomes more often, which turns out to lead directly to higher false positive rates. So even as the model predicts more negative outcomes for this population, it's also going to do something horrid: it's going to make those dreaded false positives, where somebody basically gets stuck in jail longer than they should, or gets declined a loan they would have paid back, whatever it is. And those costly individual errors are going to happen more often for people of underprivileged groups. There's an article in ProPublica called "Machine Bias," and that's what it's about: it shows that African American convicts are getting those false positive flags, the high-risk flags that directly inform parole and sentencing decisions, more often than white convicts.
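Here's a minimal sketch of the disparity Eric describes, computing group-wise false positive rates from hypothetical labels, risk flags, and group membership; this is the kind of audit the ProPublica analysis ran at scale:

```python
# Group-wise false positive rate audit (hypothetical toy data).
def false_positive_rate(y_true, y_flag):
    """Among actual negatives, how often were they wrongly flagged high-risk?"""
    negatives = [(t, f) for t, f in zip(y_true, y_flag) if t == 0]
    return sum(f for _, f in negatives) / len(negatives)

# Hypothetical records: true outcome (1 = recidivated), model flag (1 = high risk), group.
records = [
    (0, 1, "A"), (0, 0, "A"), (1, 1, "A"), (0, 1, "A"), (0, 0, "A"), (0, 1, "A"),
    (0, 0, "B"), (0, 0, "B"), (1, 1, "B"), (0, 1, "B"), (0, 0, "B"), (0, 0, "B"),
]

for group in ("A", "B"):
    y_true = [t for t, f, g in records if g == group]
    y_flag = [f for t, f, g in records if g == group]
    print(f"Group {group} false positive rate: {false_positive_rate(y_true, y_flag):.0%}")

# A disparate FPR means the costly wrong decisions (denied loans, longer incarceration)
# fall more heavily on one group -- the "bias" Eric says can be quantified and addressed.
```

In this toy data, group A's false positive rate is 60% against group B's 20%: the same overall model can distribute its mistakes very unevenly, which is precisely what a per-group audit surfaces.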
[Matthew Quint] I hate when it's called machine bias, to be honest, Eric, because I wish the dialogue were more what you said: when can the machines, which don't have an innate bias, be used to counteract the human biases that exist within the data because of prior human decisions? The machine is analyzing that data set with no bias of its own; it's just "this is the data I have, and I'm going to analyze based on the data." So I'm hoping at some point we'll get into how we can craft the models to help combat some of these flaws in the data. So let's talk a bit about the data. It's the question you ask of the data that matters, right? That comes up a lot. And then there's a corollary from reading the book, which is the question of what data is necessary to begin asking the right questions.

[Eric Siegel] Well, the most important thing is the thing you're trying to predict, the behavioral outcome. The data from which the system learns is a bunch of examples where you already know the positive or negative; it's labeled in that sense. Sometimes it's manually labeled, as in medical diagnosis from radiology imagery, or "is there a traffic light in this picture" to help train something for an autonomous vehicle. But for a lot of, maybe the majority of, the predictive use cases we've been discussing, history has spoken, so you don't need to manually label it. The data is experience: a long list of previous events, and that's the experience from which to learn. So it turns out that what's really important is that those labels, what data scientists call the dependent variable, are sound and not too problematic a proxy, as I mentioned with recidivism. Another problematic proxy, for example, is healthcare spend, where the question is something like how much healthcare these patients are going to need. There have been studies that used the amount spent on the patient, which is related to how much treatment they got, not necessarily how much treatment they needed. And if there's an underserved population, then once again you've got a proxy that's problematically off. But if you can get the labels correct, it turns out that the other parts of the data, the characteristics of the individuals fed into the model, the factors on which the predictions are based, have a lot of robustness to error. There can be a lot of noise; there can be incorrect values, as long as they tend to be consistent. We're really talking about predicting better than guessing, tipping the odds, and there's always going to be quote-unquote noise in the sense that we don't understand all the values or why they occur; the difference between that and the values being technically incorrect, quantitatively speaking, isn't that big. So the really important thing is to make sure the labels are correct. But even if you get rid of the proxy problem, even if the labels are fully correct and sound, there are still the kinds of societal issues I mentioned a moment ago. We need to be paying attention. One of the main things is the false positive rate: how often are damaging, incorrect decisions made? Humans are never going to predict perfectly, and neither will computers, but if a system is systematically making costly errors more for one group than another, that's a pretty big deal.

[Matthew Quint] With the data, what I was also trying to get at was making sure your independent variables are robust enough to actually analyze, to get at what I need to tweak to reach the binary yes that I want, or the binary no. So talk a little bit about that. I just blanked on the phrase you had in the book for it; is it model engineering?

[Eric Siegel] Data engineering. That's sort of the dirty secret of data science: in the life of a data scientist doing machine learning, most of the work is actually preparing the data, getting it into the correct form and format so that you can use machine learning software. The core rocket science, the fun part, using the machine learning software to produce a model, is very fast in comparison to what it usually takes, at least for a new project, to pull together the right data. And a big part of that is figuring out what the input variables are; you referred to them as independent variables, which is the technical name. And for the general question, especially from newcomers, "well, what data do I need?", it turns out that you need whatever data you can get. It's really just pragmatic: any and all data could potentially help. Data tends to be siloed, or expensive to get if it's not already in place.
Usually you start with whatever the lowest-hanging fruit is and try it out. There tend to be diminishing returns as you expand the data, but not entirely, because behavioral data tends to do better; behavior predicts future behavior better than demographics do, which also goes along with making the model less discriminatory: predict what you're going to do based on what you've done. But it really is an art, not a science. You never know whether other inputs could help a particular project predict better until you try, and it can be expensive or cumbersome to widen your data in that sense. So the answer to "what's the best data?" is: use what you've got, and get more if you can. That's really how it ends up working out.

[Matthew Quint] The training data versus the deployment data can be distinctive. Talk about where that can be an error: you've got a data set, you've worked to build your model, but then when you put it out in the real world, the actual data you're getting is different from the training data, so the model isn't going to work as effectively. How can those dangers pop up?

[Eric Siegel] Yeah, you never really know for sure how well it's going to predict in tomorrow's situations, that is to say on tomorrow's data, until you try it. But there is one best practice that tends to do a good job approximating how well it will do, which is that you always evaluate the model on held-aside test data. You quarantine a bunch of examples: you take the training data and pull out part of it, a random sample, and that's called the test data. It's quarantined in the sense that the learning process, the modeling software, doesn't get to cheat; it doesn't get to see those examples. So they serve as the basis for an objective estimation of how well the model will perform in general, over cases never before seen. And that's not just best practice; it's essentially the only practice. When you see people report a model's performance on the same data used to train it, that's totally cheating, and it's an egregious error, though you almost never see it these days. And one way or another, the model's performance definitely will degrade eventually. It may degrade next week; it may degrade next year. It depends on the context, the nature of the problem, and how much the world is changing. You always need to periodically update the model, refreshing it by training over more recent data.
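A minimal sketch of the held-aside test data practice Eric describes, using scikit-learn; the generated data set and the choice of model here are placeholders standing in for a real labeled history:

```python
# Evaluate on quarantined test data, never on the data used for training.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Placeholder data standing in for a real labeled history of outcomes.
X, y = make_classification(n_samples=5_000, n_features=20, weights=[0.93], random_state=0)

# Quarantine a random 25% as test data; the model never sees it during training.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1_000).fit(X_train, y_train)

# Reporting performance on the training data would be "totally cheating":
train_auc = roc_auc_score(y_train, model.predict_proba(X_train)[:, 1])
test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"AUC on training data (optimistic): {train_auc:.3f}")
print(f"AUC on held-aside test data:       {test_auc:.3f}")
```

The held-aside score is the honest estimate of performance on never-before-seen cases; as Eric notes, even that estimate decays as the world drifts, which is why models get periodically retrained on fresher data.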
[Matthew Quint] So I want to jump into one thing we always ask, which is always a little different. For someone like you, an author by nature, you have presented what we would call a BRITE idea: you've crafted a book around a concept you think is impactful to the industry you work in, or the world at large. But we still, even with authors, like to give you a chance to share something else you think of as a bright idea. What is not happening now? Where do you see the potential for a change, within your industry or with a problem you see in the world, that you wish an organization or a group would move forward on to help improve society?

[Eric Siegel] Well, I have one that I'm diehard passionate about, so much so that I've co-founded a startup company. Maybe I'm burying the lede here, but it's very early, so we're still in stealth in the sense that there's not much on the website. It's Gooder AI: make your AI more gooder. And the issue is something we started to allude to, which is that predictive AI projects tend to measure the performance of the model, always on the test set, which is a good thing, but only in technical terms: accuracy, precision, recall, the area-under-the-curve thing we discussed. And that's where they stop. What about the business value? It's almost as if nobody's asking how good the AI is: how much business value would it deliver, depending on how you deploy it? There's a difference between a technical metric, which is helpful (it tells you the model is better than guessing, the relative pure predictive performance compared to a baseline such as random guessing), and the absolute value of the model: how much value it delivers in terms of profit, top-line revenue, number of customers saved, number of dollars saved, any and all of the business metrics, the KPIs that pertain to the intended purpose of the model's deployment, the operationalization of those predictions. That seems like such a fundamental missing step. I've actually encapsulated that question, that problem, in a recent MIT Sloan Management Review article called "What Leaders Should Know About Measuring AI Project Value." It's adapted from the book, The AI Playbook; it's the one part of the metrics chapter that I'm focused on most heavily now and moving forward. Look, we've got data scientists who are only trained to report on these technical metrics; the software they use pretty much only does those metrics. And then you've got their stakeholders, the decision-makers, who need to see something that relates to the business goals and strategy, like profit, and it's not being done. Why isn't it being done? Part of the reason is that it's very easy and comfortable for the techie to report on the technical metrics, because you can do that in a vacuum; it's just the pure predictive performance. But you can't say, "Hey, look, this model's worth a million bucks." It doesn't have an absolute value; it depends on how you use it, that is to say, on the parameters of the deployment. You're going to use this model to triage, order, and prioritize customers or transactions, to contact, or audit, or whatever the operation you're changing is. So how far down the list are you going to go? Are you going to spend the money to market to the top 10% of customers most likely to buy, or the top 40%? Where do you draw that line? Are you going to automatically stop 0.1% of transactions as potentially fraudulent, or 0.2%? There's a huge difference, and banks are making those decisions all the time. That's an example of the kind of parameters that define the deployment's operationalization, the way in which you're going to act on the model. And when it comes to actually moving from those arcane technical metrics to the thing we all care about, the business metrics like profit, you need to be able to interact in a friendly way and try what-if scenarios: what if I use the model this way or that way, and what does that do across what may be a bunch of competing business metrics? So that's my thing: let's ask how good AI is.
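Here's a minimal sketch of that kind of what-if analysis, sweeping the deployment threshold (how far down the ranked list to market) and reporting the business metric at each setting; the scores are simulated and the campaign numbers are the same hypothetical ones used in the earlier lift sketch:

```python
# What-if analysis: business value as a function of the deployment threshold.
import numpy as np

rng = np.random.default_rng(0)
N = 100_000                   # hypothetical customer list
COST, REVENUE = 2.00, 220.00  # $2 per contact, hypothetical revenue per responder

# Simulated model scores and outcomes: higher-scoring customers respond more often.
scores = rng.random(N)
responded = rng.random(N) < (0.002 + 0.036 * scores ** 4)  # ~1% overall response rate

order = np.argsort(-scores)  # rank customers from highest to lowest score
for top_fraction in (0.10, 0.25, 0.40, 1.00):
    k = int(N * top_fraction)
    contacted = order[:k]
    profit = responded[contacted].sum() * REVENUE - k * COST
    print(f"Market to top {top_fraction:>4.0%}: profit ${profit:>10,.0f}")
```

In this simulation, profit peaks somewhere between the top 25% and top 40% of the list and collapses when everyone is contacted: the "how far down the list" parameter, not the AUC, is what turns a predictive model into a business result.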
[Matthew Quint] Thank you for your time. I'm going to leave you with one last quick question that we ask all our guests. Being a brand center, we always love to hear each guest talk about a favorite brand, whether related to your professional or personal life; it could be any category. The only caveat is that if it's an iconic brand like Disney or Nike, you need to be even more detailed about why you love it.

[Eric Siegel] Well, my subjective, honest, personal answer is Apple, because I grew up on it. I was 11 when I got my Apple II Plus. And listen, I think my iPhone is very Star Trek; I don't think that means other things in Star Trek are going to come true. And I love my MacBooks. I don't mean that I think everything Apple does is perfect, but these products are amazing, and it's just incredible to me that this is the same company from which I got my first computer when I was 11: an Apple II Plus with 48K of memory, 40 columns of only-uppercase text, and one 5.25-inch floppy disk drive, connected to our 16-inch color household television. So yeah, I've got an affinity there for sure.

[Matthew Quint] For many of our favorite brands, the "why" is nostalgia. Well, thank you again so much, Eric, for your time. It was really great to host you on the BRITE Ideas podcast today. I look forward to talking with you again, and wish you the best in your future AI efforts.

[Eric Siegel] Yeah, thanks so much for having me on the show, Matt. I really appreciate the platform and being able to talk to your audience.

[Outro narrator] Please subscribe to BRITE Ideas on your favorite podcast service. We'd like to thank once again our sponsors, Lexicon Branding and Kogan Page. For more information about the BRITE Ideas podcast and the Columbia Business School Brand Center, please visit briteideas.co.