Monday 17 July 2017

Dear Disney; Let Me Help You VR

Disney recently posted a video of some researchers getting people to catch real balls while in virtual reality (VR). It was a nice demo of some technology, and I don't actually want to be down on these researchers, but of course the psychology was lacking and there were some weird moments, which I thought I would note for posterity. Also, Disney researchers, if you're reading, call me :)

The purpose of the research (detailed on the web here, paper here) was to compare three different visualisations of a throw, to see whether people could be assisted to catch real balls while viewing a virtual environment. The system could display, in any combination:
  1. the real-time flight of the ball, as measured by motion capture;
  2. the predicted trajectory of the ball, computed (and continually recomputed) using the physics of projectile motion;
  3. the predicted catch location of the ball, computed (and continually recomputed) as the point on the predicted trajectory where the ball fell back to some unspecified constant height (a rough sketch of these predictions follows below).
They threw 20 tosses in each of the 7 combinations that showed something on the display: Ball, Trajectory, Target, Ball & Trajectory, Ball & Target, Trajectory & Target, and Ball & Trajectory & Target.
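For concreteness, here's a minimal sketch of what computations 2 and 3 could look like. This is my own reconstruction under simple assumptions - a drag-free projectile model, a made-up catch height, and invented function names - not code from the paper:

```python
import numpy as np

G = 9.81  # gravitational acceleration (m/s^2)

def predict_trajectory(pos, vel, dt=0.01, t_max=2.0):
    """Drag-free ballistic prediction from the latest motion-capture
    state (position in metres, velocity in m/s). Returns an array of
    predicted 3D positions, one per dt, over the next t_max seconds."""
    t = np.arange(0.0, t_max, dt)[:, None]       # column of future times
    accel = np.array([0.0, 0.0, -G])             # gravity acts on z
    return pos + vel * t + 0.5 * accel * t ** 2  # p(t) = p0 + v0*t + a*t^2/2

def predict_catch_point(pos, vel, catch_height=1.2):
    """Predicted catch location: the first point on the downward leg of
    the predicted trajectory that falls to catch_height (metres).
    catch_height is a guess; the paper only says 'constant height'."""
    traj = predict_trajectory(pos, vel)
    descending = traj[1:][np.diff(traj[:, 2]) < 0]   # downward leg only
    below = descending[descending[:, 2] <= catch_height]
    return below[0] if len(below) else traj[-1]

# Both predictions get recomputed every frame as new mocap samples arrive:
p0 = np.array([0.0, 0.0, 1.5])  # current ball position (m)
v0 = np.array([0.0, 4.0, 3.5])  # current ball velocity (m/s)
print(predict_catch_point(p0, v0))
```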

What they found was that 95% of the balls in the Ball-only condition were caught; the system worked well in the most natural, real-world-like situation. In addition, it was clear that people were tracking the ball in similar ways to the real world; they weren't watching their hands, and they brought their hands smoothly towards the catch location to intercept in a 'just-in-time' fashion.

Then their question was whether any of their 'assistive' visualisations would help. 

First problem: performance is already basically at ceiling, so they have evidence that people need no assistance from the virtual environment so long as the ball's flight is displayed correctly; and even if people did need help and could benefit from it, there is no headroom left in which to see the improvement. Never send an engineer to do a psychologist's job, namely designing a behavioural study.
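To put a number on the ceiling problem (my own back-of-the-envelope, using the 95% catch rate and 20 tosses per condition reported above):

```python
from scipy.stats import binom

# Baseline (Ball-only) catch rate and tosses per condition, from the paper
baseline_rate, n_tosses = 0.95, 20

# Even a *perfect* assisted block (20/20 catches) is entirely consistent
# with the unassisted baseline, which produces a perfect block over a
# third of the time all by itself.
print(binom.pmf(n_tosses, n_tosses, baseline_rate))  # ~0.358
```

With numbers like that, even a flawless assistive display has nowhere visible to go.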

Two catching strategies emerged. They didn't do any kinematics on the person, but if you watch the videos the difference to look for is a smooth, just-in-time reach (prospective online control) vs jerky straight-to-target-and-wait behaviour (more 'predictive' control, although the prediction is done by the VR system); a crude way to quantify the difference is sketched below.
  1. The just-in-time strategy occurred in the Ball, Trajectory, Ball & Trajectory, Ball & Target, and Ball & Trajectory & Target conditions.
  2. The straight-to-target strategy occurred only in the Target and Trajectory & Target conditions.
If the Ball was present, it supported prospective control and people used it. If the Trajectory was present by itself, it also supported prospective control and people used it. When all you could see was the Target, surprise surprise, people used it instead of tracking the ball, because they couldn't track what wasn't displayed. The one slightly surprising result is that in the Trajectory & Target condition people used the predictive target only; apparently the target was relatively better than the trajectory. This may have been because the trajectory prediction was noisy: while it improved over the flight of the ball, it was only accurate to within 2 cm with 350 ms to spare.
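Since no hand kinematics were reported, here is the kind of crude classifier I have in mind for separating the two strategies. Everything in it - the arrival radius, the dwell threshold, the function itself - is my own invention for illustration:

```python
import numpy as np

def classify_reach(hand_pos, ball_pos, catch_idx, frame_rate=120.0,
                   arrive_radius=0.05, dwell_threshold=0.3):
    """Label one trial from (frames x 3) hand and ball position arrays
    in metres. The hand 'arrives' when it first comes within
    arrive_radius of the eventual catch location; if it then dwells
    there longer than dwell_threshold seconds before the catch, call
    it straight-to-target, otherwise just-in-time."""
    catch_point = ball_pos[catch_idx]
    dist = np.linalg.norm(hand_pos - catch_point, axis=1)
    arrived = np.nonzero(dist <= arrive_radius)[0]
    if len(arrived) == 0:
        return 'no catch'
    dwell = (catch_idx - arrived[0]) / frame_rate
    return 'straight-to-target' if dwell > dwell_threshold else 'just-in-time'
```

The signature to look for is dwell time: a prospective catcher's hand arrives just before the ball does, while a target-pointer's hand arrives early and waits.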

Some Problems with the Framing
The authors think their additional sources of information are enhancements, because they've added some stuff. They don't notice that their baseline Ball-only data show the additions were not required, nor did they help. They squeeze an 'enhancement' out by claiming that
[in the target only condition] rather than having users predict the trajectory of the ball to make the catch, we have reduced cognitive burden and transformed the task to a simpler pointing task. Thus, this result shows that this task model for catching appears to be more efficient from a psychomotor perspective
Er, no. First, we know people don't predict trajectories. Note that all the manipulations were about providing predictive information, and unless it was the only source of information, people did not use it, because that is not how perception works. People can use weird information when it's all there is to go on, but they don't typically use it as a cue and then integrate it later; they just abandon it when something better comes along. Second, while they have clearly transformed the task, they have not necessarily made it simpler. They think they have because the person's hand gets to the target location faster. That's no index of interception stability, though; I bet if they poked this setup (perturbed the throw mid-flight, say) they'd find the prospective catching mode supports much more flexible and adaptive behaviour. So they have not made the task more efficient, just less ecological.

Some Quick Thoughts
The biggest thing they missed is the fact that people caught the ball 95% of the time when all the system did was display the real-time flight of the ball correctly. This is the case where they simply presented information about the task dynamics, instead of manipulating or augmenting it. To achieve their goal (integrating dynamic interactions with real objects into virtual experiences), all they need to do is preserve perceptual access to those task dynamics.

Now, this is clearly non-trivial - they had to use high-end motion capture, for example. But the whole reason perception and action basically work in the world is that a) there are behaviourally relevant properties of the world and b) those properties often create information about themselves. If you want to get that working in VR, you don't need to add in physics knowledge; you just need to provide the relevant information!

The fun begins when you try to VR physically impossible things. This is why researchers use VR in general - to break the ecological laws governing the creation of information and see what happens. For Disney, the question would be: can you create information about something that wasn't possible before, and what kind of behaviour can that information support? That would be a fascinating project to work on (*cough cough* :)
