I read Alper’s book on conversational user interfaces over the weekend and was struck by this paragraph:
“The holy grail of a conversational system would be one that’s aware of itself — one that knows its own model and internal structure and allows you to change all of that by talking to it. Imagine being able to tell Siri to tone it down a bit with the jokes and that it would then actually do that.”
His point stuck with me because I think this is of particular importance to creative tools. These need to be flexible so that a variety of people can use them in different circumstances. This adaptability is what lends a tool depth.
The depth I am thinking of in creative tools is similar to the one in games, which appears to be derived from a kind of semi-orderedness. In short, you’re looking for a sweet spot between too simple and too complex.
And of course, you need good defaults.
Back to adaptation. This can happen in at least two ways on the interface level: modal or modeless. A simple example of the former would be to go into a preferences window to change the behaviour of your drawing package. Similarly, modeless adaptation happens when you rearrange some panels to better suit the task at hand.
Returning to Siri, the equivalence of modeless adaptation would be to tell her to tone it down when her sense of humor irks you.
For the modal solution, imagine a humor slider in a settings screen somewhere. This would be a terrible solution because it offers a poor mapping of a control to a personality trait. Can you pinpoint on a scale of 1 to 10 your preferred amount of humor in your hypothetical personal assistant? And anyway, doesn’t it depend on a lot of situational things such as your mood, the particular task you’re trying to complete and so on? In short, this requires something more situated and adaptive.
So just being able to tell Siri to tone it down would be the equivalent of rearranging your Photoshop palets. And in a next interaction Siri might carefully try some humor again to gauge your response. And if you encourage her, she might be more humorous again.
Enough about funny Siri for now because it’s a bit of a silly example.
Funny Siri, although she’s a bit of a Silly example, does illustrate another problem I am trying to wrap my head around. How does an intelligent tool for creativity communicate its internal state? Because it is probabilistic, it can’t be easily mapped to a graphic information display. And so our old way of manipulating state, and more specifically adapting a tool to our needs becomes very different too.
It seems to be best for an intelligent system to be open to suggestions from users about how to behave. Adapting an intelligent creative tool is less like rearranging your workspace and more like coordinating with a coworker.
My ideal is for this to be done in the same mode (and so using the same controls) as when doing the work itself. I expect this to allow for more fluid interactions, going back and forth between doing the work at hand, and meta-communication about how the system supports the work. I think if we look at how people collaborate this happens a lot, communication and meta-communication going on continuously in the same channels.
We don’t need a self-aware artificial intelligence to do this. We need to apply what computer scientists call supervised learning. The basic idea is to provide a system with example inputs and desired outputs, and let it infer the necessary rules from them. If the results are unsatisfactory, you simply continue training it until it performs well enough.
A super fun example of this approach is the Wekinator, a piece of machine learning software for creating musical instruments. Below is a video in which Wekinator’s creator Rebecca Fiebrink performs several demos.
Here we have an intelligent system learning from examples. A person manipulating data in stead of code to get to a particular desired behaviour. But what Wekinator lacks and what I expect will be required for this type of thing to really catch on is for the training to happen in the same mode or medium as the performance. The technology seems to be getting there, but there are many interaction design problems remaining to be solved.