Nobel Prize-winning economist Daniel Kahneman once remarked:
“By their very nature, heuristic shortcuts will produce biases, and that is true for both humans and artificial intelligence, but the heuristics of AI are not necessarily the human ones.” This is certainly the case when we talk about “shortcut learning”.
Despite careful testing on the data side, model bias can reveal itself more directly in what the computer vision system learns. This issue of a computer vision model using the wrong visual features for prediction is referred to as shortcut learning.
The black-box nature of many computer vision models makes such shortcuts difficult to find, so they often go unnoticed until a trained model fails to generalize to unfamiliar environments. In the paper Recognition in Terra Incognita, Caltech researchers showcase a classification model that does well at finding cows on an evaluation set but fails when asked to classify cows by the beach or in other unusual environments. For a computer vision model, visual features indicating grass and mountains may contribute to detecting a cow in the image, while beach or indoor features may weigh heavily against it. It is expected that the model uses such features, but their impact should be understood before deploying such models in production. A company that builds a cow detector without being aware of this would disappoint some coastal clients, creating reputational risk.
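One way to make the impact of context visible before deployment is to break evaluation accuracy down by environment. The snippet below is a minimal sketch, assuming every evaluation image carries an environment tag; the helper names (`accuracy_by_group`, `largest_gap`) and the toy results are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def accuracy_by_group(labels, predictions, groups):
    """Return {group: accuracy} for parallel sequences of labels, predictions and group tags."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y_true, y_pred, group in zip(labels, predictions, groups):
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {group: correct[group] / total[group] for group in total}


def largest_gap(per_group):
    """Difference between the best and the worst per-group accuracy."""
    values = list(per_group.values())
    return max(values) - min(values)


# Toy, made-up evaluation results: 1 = cow, 0 = no cow.
labels      = [1, 1, 1, 1, 0, 0, 1, 1, 1, 0]
predictions = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
environment = ["pasture"] * 6 + ["beach"] * 4  # assumed per-image tag

per_env = accuracy_by_group(labels, predictions, environment)
print(per_env)
print(largest_gap(per_env))
```

A large gap between environments does not prove that a shortcut exists, but it is a cheap signal that the model may be leaning on background features and that a closer look is warranted.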
In one paper, the authors show that models evaluated on face detection benchmarks achieve above-random performance even after the hair, face, and clothes of subjects are removed. This indicates that irrelevant background features are being used for prediction. Another piece of research identifies an initial list of such biases that can appear in practice for medical applications. Similar ablation experiments, where the parts of the image relevant for prediction are masked out, can be useful in identifying such shortcuts. Metadata can be a powerful tool to detect and test for some of these shortcuts as well. Statistical dependence between metadata dimensions and the performance of the model can surface concerning shortcuts: if the demographic of a patient is highly correlated with performance, then further investigation is needed!
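The masking ablation described above can be sketched in a few lines of NumPy. This is an illustrative example under stated assumptions, not the procedure used in the cited research: `mask_region`, `ablation_report`, and `shortcut_model` are hypothetical names, the relevant region is assumed to be a known bounding box, and the toy "model" deliberately looks only at a background pixel so that the shortcut shows up.

```python
import numpy as np


def mask_region(images, box, fill=0.0):
    """Return a copy of images (N, H, W) with the box (top, left, bottom, right) filled."""
    top, left, bottom, right = box
    masked = images.copy()
    masked[:, top:bottom, left:right] = fill
    return masked


def accuracy(predict_fn, images, labels):
    """Fraction of images for which predict_fn returns the correct label."""
    return float(np.mean(predict_fn(images) == labels))


def ablation_report(predict_fn, images, labels, box, num_classes):
    """Compare accuracy on the original images, on masked images, and at chance level."""
    return {
        "original": accuracy(predict_fn, images, labels),
        "masked": accuracy(predict_fn, mask_region(images, box), labels),
        "chance": 1.0 / num_classes,
    }


# Toy data: class 1 images are brighter everywhere, including the background.
rng = np.random.default_rng(0)
labels = np.array([0, 1] * 10)
images = rng.random((20, 8, 8)) * 0.1
images[labels == 1] += 0.8


def shortcut_model(batch):
    """Hypothetical classifier that only inspects a background corner pixel."""
    return (batch[:, 0, 0] > 0.5).astype(int)


object_box = (2, 2, 6, 6)  # assumed location of the region that should matter
report = ablation_report(shortcut_model, images, labels, object_box, num_classes=2)
print(report)
```

If the masked accuracy stays well above chance, as it does for this toy model, the prediction cannot depend on the masked region alone, which is the signature of a shortcut. In practice `predict_fn` would wrap a trained model and the box would come from annotations or segmentation masks.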
To summarize, shortcut learning happens when your computer vision system is looking at the wrong visual features to make predictions. Such shortcuts can be detected from image data alone, for instance, by checking whether performance stays reasonable even after masking out the regions of the image that matter for prediction. They can also be detected by referring back to your metadata: if there is a strong link between metadata parameters and the performance of the model, then it’s worth taking a closer look. Having practices in place during the machine learning model evaluation process to detect these shortcuts is key to a high-performing model.