Credit scoring giant Fair Isaac Corp (NYSE: FICO) and its artificial intelligence software are used by two-thirds of the world’s 100 biggest banks to help decide who gets to borrow money. When something goes wrong, it can cause a lot of trouble.
Early in the pandemic, that trouble nearly arrived. FICO told Reuters that the AI tools the Bozeman, Montana, company supplies to help banks spot credit and debit card fraud concluded that fraudsters must have kicked into high gear because online shopping had surged.
At a time when people were scrambling for toilet paper and other necessities, the AI software told banks to deny millions of legitimate purchases.
In the end, FICO says, few consumers actually faced denials. The company said a group of roughly 20 analysts around the world who constantly monitor its systems recommended temporary adjustments that avoided a block on spending. The team is automatically alerted to unusual buying activity that could confuse the AI, which 9,000 financial institutions use to detect fraud on 2 billion cards.
Such corporate teams are part of a nascent field known as machine learning operations, or MLOps. Separate surveys last year by FICO and the consulting firm McKinsey & Co found that most organizations do not monitor AI-based programs after launching them.
Scientists who run these systems say the problem is that AI can make a lot of mistakes when real-world conditions “drift” away from the examples used in training. In FICO’s case, the company said its software expected more in-person shopping than online shopping; when that ratio flipped, a larger share of transactions was flagged as suspicious.
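FICO has not detailed how its internal monitoring works, but a common way MLOps teams catch this kind of drift is to compare the live distribution of an input, such as the share of online versus in-person purchases, against the distribution seen during training. The Python sketch below is illustrative only, with invented data and cutoffs; it uses the population stability index, a standard drift metric, and is not anything FICO has described.

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare a live feature distribution against the training baseline.

    Returns the PSI; values above roughly 0.2 are often treated as
    significant drift. The cutoff here is illustrative, not FICO's.
    """
    edges = np.histogram_bin_edges(expected, bins=bins)
    exp_counts, _ = np.histogram(expected, bins=edges)
    act_counts, _ = np.histogram(actual, bins=edges)
    exp_pct = np.clip(exp_counts / exp_counts.sum(), 1e-6, None)
    act_pct = np.clip(act_counts / act_counts.sum(), 1e-6, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

# Hypothetical example: the share of online ("card not present") purchases
# is much higher in live traffic than it was in the training data.
rng = np.random.default_rng(0)
training_share = rng.beta(2, 5, size=10_000)  # mostly in-person shopping
live_share = rng.beta(5, 2, size=10_000)      # mostly online shopping

psi = population_stability_index(training_share, live_share)
if psi > 0.2:
    # In practice this would page a monitoring team rather than print.
    print(f"Drift alert: PSI={psi:.2f}")
```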
Seasonal variations, changes in data quality, or major events such as the pandemic can all trigger a run of bad AI predictions.
Imagine a system that told summer shoppers to buy swimsuits even though COVID lockdowns made sweatpants a better choice. Or a face-recognition system failing because people wore masks.
Aleksander Madry, who runs the Center for Deployable Machine Learning at the Massachusetts Institute of Technology, said the pandemic must have been a “wake-up call” for anyone not closely monitoring AI systems, because it triggered so many shifts in behavior.
Drift, he said, is a huge problem for organizations that use AI. “That’s what really keeps us from our dream of AI changing everything right now.”
The European Union plans to pass a new AI law, possibly as soon as next year, that would require some monitoring, adding urgency for users to tackle the problem sooner rather than later. In AI guidelines released this month, the White House also said systems should be monitored to make sure their performance does not fall below an acceptable level over time.
Failing to spot problems quickly can be costly. Unity Software Inc, whose ad software helps video games attract players, said in May it expected to lose $110 million in sales this year, or about 8% of its forecast total revenue, because customers pulled back after its AI tool for deciding whom to show ads to became less accurate. The company also said its AI system had learned from bad data.
San Francisco-based Unity declined to comment beyond its earnings-call remarks, in which executives said the company was deploying alerting and recovery tools to catch problems faster and acknowledged that expansion and new features had taken priority over monitoring.
In November 2021, real estate marketplace Zillow Group Inc (NASDAQ: ZG) announced a $304 million writedown on homes it had bought, guided by a price-forecasting algorithm, for more than they could be resold for. The Seattle company said the AI could not keep up with fast and unexpected market swings, and it exited the home-buying and -selling business.
NEW MARKET
AI can go awry in many ways. The best known is that training data skewed by race or other factors can produce unfairly biased predictions. Surveys and industry experts say many companies now vet data in advance to prevent this. By contrast, those sources say, few companies consider the danger of a model that performs well at first and then breaks down.
“It’s a very important issue,” said Sara Hooker, who runs the research lab Cohere For AI. “How do you keep models fresh when the world around them is changing?”
Over the past couple of years, a number of startups and big cloud computing companies have begun selling software to track AI performance, sound alarms and apply fixes. Global market researcher IDC estimates that spending on such AI-operations tools will grow from $408 million last year to at least $2 billion by 2026.
PitchBook, a Seattle company that keeps track of financings, says that venture capital investment in AI development and operations companies rose to nearly $13 billion last year and has already reached $6 billion this year.
Arize AI, which raised $38 million from investors last month, provides monitoring for customers including Uber (NYSE: UBER), Chick-fil-A and Procter & Gamble (NYSE: PG). Chief Product Officer Aparna Dhinakaran said that at a previous employer she struggled to spot quickly when AI predictions had gone bad, and friends elsewhere told her of their own delays.
“In the world we live in now, you don’t know there’s a problem until it affects your business two months later,” she said.
FRAUD SCORES
Some AI users have built their own monitoring capabilities, and FICO said that is what saved it at the start of the pandemic.
Alarms went off as more purchases moved online, into what the industry calls “card not present” transactions. Historically, a larger share of such spending has been fraudulent, and the jump pushed transactions higher on FICO’s 1-to-999 scale, where a higher score means a transaction is more likely to be fraud, said Scott Zoldi, FICO’s chief analytics officer.
Consumer habits were shifting too fast to rewrite the AI system, Zoldi said, so FICO advised its U.S. clients to review and reject only transactions scoring above 900, up from 850. That spared clients from reviewing 67% of the legitimate transactions sitting above the old threshold and let them focus on truly problematic cases.
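FICO’s scoring internals are proprietary, but the mechanics of the adjustment Zoldi describes can be shown with a simple, hypothetical thresholding sketch: raising the review cutoff from 850 to 900 shrinks the pool of flagged transactions while keeping the highest-risk ones. The transaction IDs and scores below are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    txn_id: str
    fraud_score: int  # 1-999; a higher score means more likely fraud

# Hypothetical scored transactions.
transactions = [
    Transaction("a1", 120),
    Transaction("b2", 860),  # flagged under the old cutoff, not the new one
    Transaction("c3", 905),
    Transaction("d4", 970),
]

OLD_THRESHOLD = 850
NEW_THRESHOLD = 900  # tighter, temporary cutoff during the drift period

def flag_for_review(txns, threshold):
    """Return transactions whose fraud score exceeds the review threshold."""
    return [t for t in txns if t.fraud_score > threshold]

print("old cutoff:", [t.txn_id for t in flag_for_review(transactions, OLD_THRESHOLD)])
print("new cutoff:", [t.txn_id for t in flag_for_review(transactions, NEW_THRESHOLD)])
# old cutoff: ['b2', 'c3', 'd4']
# new cutoff: ['c3', 'd4']
```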
Zoldi said that during the first six months of the pandemic, clients found 25% more fraud in the United States than would have been expected and 60% more fraud in the United Kingdom.
“You can’t be responsible with AI if you’re not keeping an eye on it,” he said.