You Need an AI Risk Scorecard; Here's How to Create One
The path to Responsible AI starts with Awareness
Every day, we interact with Artificial Intelligence. In most cases, multiple times a day. Sometimes we are aware that we are working with AI, and other times we aren't. But when was the last time you really thought about whether using that AI tool was actually worth it? Did you consider the harms it might cause (or has already caused), or whether using it aligned with your own personal ethics?
You've probably guessed that I evaluate AI for harm and ethical alignment more or less constantly. Most of the time, I ask myself a series of questions, dig around to find some information, and make a decision based on my gut.
But that's inefficient. I wasn't always considering the same things, and when I tried to justify why I wasn't using something, or why I was limiting my use, I would stumble through trying to retrace my logic. And I want to help others be more intentional in their use of Artificial Intelligence.
But I couldn't do that if I was going off of my gut all the time. I needed a way to articulate how I was using AI and why, and to help other people do the same. More people using AI with intention means more Responsible AI in the long run, which benefits everyone. So I came up with the AI Risk Scorecard.
What is an AI Risk Scorecard?
I took the idea from the security risk profile used in cybersecurity. Risk refers to how vulnerable you are to threat actors, data loss, or breaches: anything that can harm the company. Profile is a little bit confusing, so I wanted to make this more accessible by calling it a Scorecard.
Our AI Risk Scorecard contains a set of criteria that we want to evaluate the use of AI against. Our criteria need to be broader than just “harm caused to me”; we also want to look at harm potentially caused to others, and at how well the usage aligns with our own ethics and morals. Below, we’ll talk about some criteria you can use in creating your own personal scorecard.
Once we have our list of criteria, we need to do the evaluation. We’ll assign each criterion a score from 0 to 5: 0 means no harm, or a use that is perfectly aligned with our values; 5 means the potential or actual harm to us or to others is very high, or the use of AI is badly out of alignment with our values. This score is subjective, and unique to you. Someone else can review the same use of AI against the same criteria and come up with a different score.
In the end, we come up with a score for that particular use of AI that allows us to say, “I’m OK with this,” “I’m really not OK with this,” or “I’m really not OK with this, but I also don’t have a choice so at least I’m aware.”
What does an AI Risk Scorecard actually do for us?
It brings intentionality and awareness to our use of AI tools. Taking the time to think through how a use of AI might hurt you, how it might have hurt others, and how well it aligns with your own values makes you a more informed individual, which lets you make better decisions overall.
It puts structure around how you decide whether to use a tool and how you use it. You start evaluating AI tools using the same language and the same understanding, instead of treating every new use of AI with a bespoke set of criteria. It gets easier to understand what’s acceptable for you and what’s not, and it gives you the words to articulate why. It also helps frame how you want to use the AI tool.
It forces you to stop and consider. This might seem a little like the first point, but it’s important enough that it deserves its own callout. Humans tend to rush; we want smooth sailing and little friction. It’s just what we do. But there’s a lot of value in pausing, stopping, and taking a breath before taking an action. You’re more aware of what you are doing, you are surprised less often, and you have fewer things you regret.
What doesn’t an AI Risk Scorecard do for us?
An AI Risk Scorecard isn’t going to be the end-all, be-all, and it’s not going to account for everything that’s important when evaluating AI. It’s going to have gaps, and that’s OK and expected. It’s also not going to cover every interaction you have with AI. You are likely interacting with AI hundreds of times a day without realizing it. You can’t evaluate what you don’t know is there.
Finally, the AI Risk Scorecard is going to show you that some uses of AI carry an unacceptable level of risk, and that you have to keep using them anyway, whether it’s because of a lack of options, your employer, or any other reason. But at least with the scorecard you can define and articulate your objections, even if you can’t stop using the tool.
Building your AI Risk Scorecard
Now that we've discussed what our scorecard is and what it can do for us, let's talk about how to build one and use it yourself.
Step 1: Decide what matters to you
We need to determine our criteria: how do we actually want to evaluate AI? We want this scorecard to be easy to use, so we don't want 100 different criteria to evaluate. If you pick too many things to consider, it'll be too hard to use, and you won't do it.
While this is a methodology to put some lightweight structure around how you evaluate your use of AI, your scores are still subjective. You and I might rate the same criterion differently, based on our own ethics and values.
You should aim for 10 or fewer criteria. I break the criteria down into 3 categories, with some examples:
Harm to Me - Based on what the AI does and the information it has about me, how harmful could that be to me directly?
Local Privacy - If just the information I provided to the AI was leaked, could that hurt me somehow?
Global Privacy - If the information I provided to the AI was leaked and combined with other information online, could that hurt me?
Harm to others/society - By using this AI, am I increasing the risk of harming others around me, my community, or society overall?
Training the Model - A company like OpenAI relies on an army of workers paid minuscule wages and exposed to psychologically damaging content to train and moderate its models. Does this use of AI justify that human cost?
Reinforces Biases/Amplifies Harm - Does my using this model somehow reinforce biases or negative interactions with others, or does it amplify existing harm?
Environmental Impact - AI takes a lot of power and has a demonstrable carbon impact.
Ethical Alignment - Based on the answers in the other two categories, does this usage align with my values?
Sustainability - If your values include being environmentally responsible, how well does this use of AI align with those values?
Sociotechnical - If your values include being socially responsible, how well does this use align?
Personal Autonomy - If you value your privacy and your ability to control the use of your data, how well does this use align?
Use of AI - More than anything else, do I think that AI should be used in this case?
These are examples, drawn from the criteria I personally use to evaluate AI usage. You can start with these nine and then customize as you go.
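If it helps to see the scorecard written down, here is a minimal sketch in Python. The criterion names mirror the examples above; the dictionary layout and the placeholder zeros are just one way I might capture it, not a required format.

```python
# A minimal sketch of a personal AI Risk Scorecard as plain data.
# Scores run from 0 (no harm / fully aligned) to 5 (high harm / badly misaligned);
# the zeros below are placeholders for you to fill in.
scorecard = {
    "Harm to Me": {
        "Local Privacy": 0,
        "Global Privacy": 0,
    },
    "Harm to Others/Society": {
        "Training the Model": 0,
        "Reinforces Biases/Amplifies Harm": 0,
        "Environmental Impact": 0,
    },
    "Ethical Alignment": {
        "Sustainability": 0,
        "Sociotechnical": 0,
        "Personal Autonomy": 0,
        "Use of AI": 0,
    },
}
```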
What if I can't find info to evaluate?
This is where we are likely to run into the most trouble. There's no transparency requirement or obligation for companies, so we are at the mercy of what they willingly volunteer. Some regions, like the EU, have legislation in the works, but there are no obligations right now.
You might need to do some sleuthing to answer your questions. Google searches and a look through privacy policies and terms of use can help you evaluate most of the criteria above, and it's generally well worth the 10 minutes it takes, especially for high-risk use cases.
There are two ways to approach this if you can't find information. First, the lack of information is an answer in and of itself. If you can't find answers, that means the company either doesn't value that issue or is trying to hide it. If the risk of something is unknown, you rate it as riskier because you can't define it, and the unknown is always risky.
Second, you can make assumptions contextually. For example, if you are interacting with an LLM, you can always assume the sustainability impact is high, regardless of whether the company says so. Similarly, the human cost of an LLM also tends to be high. So if you are using ChatGPT, or any tool powered by ChatGPT, treat it as having a high impact.
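One way to make that rule concrete is to give any criterion you can't research a conservative default score instead of leaving it blank. The helper below is a hypothetical sketch of that idea; the default of 4 is my own assumption, not a fixed rule.

```python
from typing import Optional

# Hypothetical helper: when a criterion can't be researched, score it
# conservatively rather than skipping it. The default of 4 is an assumption;
# adjust it to match your own tolerance for the unknown.
UNKNOWN_DEFAULT = 4

def score_criterion(researched_score: Optional[int]) -> int:
    """Return the score you found, or a conservative default when it's unknown."""
    return UNKNOWN_DEFAULT if researched_score is None else researched_score

# Example: no data on environmental impact, so it gets the conservative default.
scorecard["Harm to Others/Society"]["Environmental Impact"] = score_criterion(None)
```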
Step 2: Running the Scorecard
Once you've evaluated all your criteria, you should come up with a score. If you follow my nine suggestions above, your score will be somewhere between 0 and 45. The higher the number, the higher your risk. For example, if I run through an AI Risk Scorecard for the use of ChatGPT, I come up with 27.
Since 27 is a little bit above my middle value, I put this as a "Moderate Risk." That means I use it sparingly. My biggest concern with ChatGPT right now is the environmental impact, because of how costly it is to train the model and to answer queries. Knowing that a ChatGPT query is several orders of magnitude more expensive than a Google search, I default to Google in most cases. When I get stuck, I might escalate to ChatGPT to get unstuck, and then pivot back to other, less expensive tools.
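If you keep your scores in something like the dictionary sketched earlier, totaling them and turning the total into a risk band takes only a few lines. The cutoffs below are illustrative guesses on my part, not fixed thresholds; set bands that match your own tolerance.

```python
def total_score(scorecard: dict) -> int:
    """Sum every criterion score across all categories."""
    return sum(score for category in scorecard.values() for score in category.values())

def risk_band(total: int) -> str:
    """Map a total (0-45 with nine criteria) to a coarse band; cutoffs are illustrative."""
    if total <= 15:
        return "Low Risk"
    if total <= 30:
        return "Moderate Risk"
    return "High Risk"

print(risk_band(27))  # with these example cutoffs, a 27 lands in "Moderate Risk"
```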
The Scorecard helps me 1) articulate my concerns with a given use of AI, and then 2) adjust my behavior so that I use the tools in ways that are better aligned with my values. It's a win-win.
When to Use the Scorecard
While it might seem like a really good thing to do all the time, the reality is that going through the process for a scorecard is burdensome by design. You want to slow down and consider the tool. But we interact with AI, both knowingly and unknowingly, many times throughout the day. If you stopped to run this scorecard for every instance of AI, you'd never get anything done. So you have to prioritize.
You should always do the Scorecard when interacting with things that are really important - anything to do with your health, finances, or employment. Those are high-stakes, high-risk areas, and using AI with them should always be done with consideration.
When you are sharing personal information with a company beyond the basics of name and email, it's worth running a Scorecard. Why? Because if your personal data were leaked, there's a high risk of harm to you. Threat actors can social-engineer a lot about you with very little information.
For everything else that's low-risk, you might run through your AI Risk Scorecard every so often, but not every time you encounter AI. For example, getting an automated support agent whenever you call in isn't high risk (even if it is exceptionally annoying).
There you have it: a quick-start guide to evaluating AI using an AI Risk Scorecard. It gives you a starting point for understanding the harms and ethical alignment of a use of AI, to help you determine whether to engage with it, and how. Hope you liked it.
Let me know what you think in the comments!