Algorithmic Observability Should Work Like Machine Readable Food Labels

I’ve been doing a lot of thinking about algorithmic transparency, as well as a more evolved version of it that I call algorithmic observability. Many algorithm developers feel their algorithms should remain black boxes, usually citing intellectual property concerns, although in reality the reasons vary. My stance is that algorithms should be open source, or at the very least have some mechanism for auditing, assessing, and verifying that they are doing what they promise, and that they aren’t doing harm behind the scenes.

This is a concept I know algorithm owners and creators will resist, but algorithmic observability should work like food labels, only in a machine readable way that allows the labels to be validated by other external (or internal) systems. Similar to food you buy in the store, you shouldn’t have to give away the whole recipe and secret sauce behind your algorithm, but the label should list all the relevant data points, inputs, outputs, and other “ingredients” or “nutrients” that go into the resulting algorithm. I’ve talked about algorithm attribution before, and I think there should be some sort of algorithmic observability manifest, which provides the “label” for an algorithm in a machine readable format. It should give all the relevant sources and attribution, as well as the inputs and outputs for an algorithm, with different schemas for different industries.
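To make the “label” idea a little more concrete, here is a minimal sketch in Python of what such a manifest might look like. No standard for this exists that I know of, so every field name here (claims, data_sources, attribution, and so on) and the example credit-scoring algorithm are assumptions made up for illustration, not a proposed schema.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class LabelField:
    """One declared input or output of the algorithm (hypothetical shape)."""
    name: str
    type: str
    description: str = ""


@dataclass
class ObservabilityManifest:
    """The machine readable "label" for an algorithm. All fields are illustrative."""
    name: str
    version: str
    industry: str             # a real scheme might select an industry-specific schema
    claims: List[str]         # what the owner says the algorithm does
    data_sources: List[str]   # the "ingredients" behind it
    attribution: List[str]    # prior work or libraries it builds on
    inputs: List[LabelField]
    outputs: List[LabelField]

    def to_json(self) -> str:
        # asdict() recurses into the nested LabelField dataclasses.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


REQUIRED_SECTIONS = (
    "name", "version", "industry", "claims",
    "data_sources", "attribution", "inputs", "outputs",
)


def missing_sections(manifest: dict) -> List[str]:
    """Return the label sections that are absent or empty in a parsed manifest."""
    return [key for key in REQUIRED_SECTIONS if not manifest.get(key)]


# A made-up label for a made-up credit-scoring algorithm.
label = ObservabilityManifest(
    name="example-credit-scorer",
    version="1.0.0",
    industry="finance",
    claims=["score does not depend on zip_code"],
    data_sources=["applicant-supplied income"],
    attribution=["in-house model"],
    inputs=[LabelField("income", "integer"), LabelField("zip_code", "string")],
    outputs=[LabelField("score", "integer", "0 to 100")],
)

print(label.to_json())
```

The point of the sketch is that the label says nothing about how the score is computed. It only declares what goes in, what comes out, where the data came from, and what is being promised, which is enough for another system to check whether a label is complete.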

In addition to there being an algorithmic observability “label” available for all algorithms, there should be live, or at least virtualized and sandboxed, instances of the algorithm for verifying and auditing what is provided on the label. As we saw with the Volkswagen emissions scandal, algorithm owners could still cheat, but this would be an important next step in helping us understand the performance, or lack of performance, of the algorithms we are depending on. I call this algorithmic observability, instead of algorithmic transparency, because each algorithm should be observable using its existing inputs and outputs (API), and not just be a “window” you can look through. It should be machine readable, and auditable by other systems in real time, and at scale. This goes beyond just being able to see into the black box, to being able to assess and audit what is occurring in real time.
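As a rough sketch of what auditing a sandboxed instance against its label could look like, the Python below treats the algorithm as a plain function standing in for a call to its API. The label layout, the invariant_to claim, and both scoring functions are hypothetical, invented only to show the idea of probing inputs and outputs without seeing the internals.

```python
from typing import Any, Callable, Dict, List

# Stand-in for a sandboxed instance: in practice this would be a call to the
# algorithm's API. Here it is just a function from inputs to outputs.
Algorithm = Callable[[Dict[str, Any]], Dict[str, Any]]


def audit(algorithm: Algorithm, label: Dict[str, Any],
          probes: List[Dict[str, Any]]) -> List[str]:
    """Probe the algorithm and report where its behavior departs from its label."""
    findings: List[str] = []
    declared_inputs = set(label["inputs"])
    declared_outputs = set(label["outputs"])

    for probe in probes:
        undeclared = set(probe) - declared_inputs
        if undeclared:
            findings.append(f"probe uses undeclared inputs: {sorted(undeclared)}")

        result = algorithm(dict(probe))
        if set(result) != declared_outputs:
            findings.append(
                f"output fields {sorted(result)} differ from label "
                f"{sorted(declared_outputs)}"
            )

        # For each attribute the label claims has no effect, vary only that
        # attribute and check that the output stays the same.
        for attr, values in label.get("invariant_to", {}).items():
            variants = [algorithm({**probe, attr: value}) for value in values]
            if any(variant != variants[0] for variant in variants):
                findings.append(f"output changes with '{attr}' despite label claim")

    return findings


def honest_scorer(applicant: Dict[str, Any]) -> Dict[str, Any]:
    return {"score": min(applicant["income"] // 1000, 100)}


def biased_scorer(applicant: Dict[str, Any]) -> Dict[str, Any]:
    # Quietly penalizes one zip code, contradicting the label below.
    penalty = 10 if applicant["zip_code"] == "00000" else 0
    return {"score": min(applicant["income"] // 1000, 100) - penalty}


label = {
    "inputs": ["income", "zip_code"],
    "outputs": ["score"],
    "invariant_to": {"zip_code": ["00000", "11111"]},
}
probes = [{"income": 52000, "zip_code": "11111"}]

print(audit(honest_scorer, label, probes))
print(audit(biased_scorer, label, probes))
```

An audit like this only checks the instance it is given. As with Volkswagen, an owner could hand auditors a sandbox that behaves differently from production, which is part of why live instances matter more than virtualized ones.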

Algorithmic observability regulations would work similarly to what we see with food and drugs, where if you make claims about your algorithms, they should have to stand up to scrutiny. That means there should be standardized algorithmic observability controls that let government regulators, industry analysts, and even the media step up and assess whether or not an algorithm lives up to the hype, or is doing some shady things behind the scenes. Ideally this would be something that technology companies would do on their own, but based upon my current understanding of the landscape, I’m guessing that is highly unlikely, and it will have to be mandated by government, incrementally, in a variety of critical industries. If algorithms are going to be impacting our financial markets, elections, legal systems, education, healthcare, and other essential industries, we are going to have to begin the hard work of establishing some sort of framework to ensure they are doing what is being sold, and not hurting or exploiting people along the way.