Vision Pro, future in music?

for me, interacting with physical space/objects is not really about eye tracking/gaze; that's a separate tech.

mapping is done with the ‘spatial’ cameras, then, as you say, the gaze is just used as a ‘controller’.

the Quest already does physical object mapping, but it's used with conventional controllers rather than gaze (it lacks eye tracking). the Quest also has hand/gesture tracking.

(PSVR2 does not attempt any kind of AR, given Sony are, currently at least, only really interested in the VR space)

however, I should clarify…

Vision Pro is exceptional in having more processing power, more accurate tracking etc., better pass-through, and being wireless, whilst still standalone. ( * )
so, whilst not perfect, it is a leader in xR tech, and this means certain things are fluid enough to reach ‘high utility’ that are simply not viable on other headsets.

however, with a bit of experience in VR, what I've noticed is that, even on ‘lesser’ platforms, tech is frankly not the issue, even if you might want it to be better.
ok, Quest 2 was ‘fine’, but perhaps a bit limiting. however, the current generation of PCVR, PSVR2 and Quest 3, whilst not perfect, really do it all ‘well enough’ to see what's possible.
yet there is still something ‘missing’, and frankly, Vision Pro does not address this… it's just better at doing what's already possible.

the main issue with xR is finding use-cases that actually really benefit from it, to the point that the benefit overcomes the limitations (all headsets are cumbersome, to put it mildly).
so often, apps feel like tech demos, simply because they are solutions looking for problems to solve.

this is where VR wins big time: there is no doubting the immersive benefits of VR.
AR struggles a bit. even with Vision Pro and its good optics, you are not going to wear it, or carry it around, all day… so it's task-oriented.

and here’s the thing.
I'm happy to spend a few hours in VR watching a film, or playing games… it's fun.

there is absolutely no way on earth, I would want to spend hours in VR/AR on productivity tasks.
in this sense, most of the use-cases in Apple's trailer were absolute nonsense :wink:

sure, as a developer, I could have a ton of workspaces floating around me, without needing multiple monitors etc. - but it'd be dreadful to have a headset stuck to my face for hours on end ‘for work’, let alone the health issues of staring for hours at a screen that is 3cm from your eyes, with fans blowing (drying your eyes out) to keep it cool.
it's a nightmare, not a dream.

so yes, it's dreamlike… but still, in this form factor, I think our imagination is much stronger than the tech!


( * ) none of this is surprising. Apple have never been revolutionary in tech…
rather, they are fantastic at packaging up high tech into a user-friendly experience. they only enter a market when it gets to a certain point of ‘utility’.
(e.g. they were not the first in the tablet or smartphone space, or even MP3 players… and little they did there was revolutionary… except perhaps the App Store/iTunes Store, i.e. the user experience)

3 Likes

One interesting aspect will be latency. E.g. I looked a little at theremins at some point. Usually those are fully analog, so there is no perceivable latency. Then there is e.g. the Theremini, which does the sound generation digitally, and it is looked down on by “serious” theremin players, mainly because of the latency.
There are other digital theremins like the OpenTheremin or D-Lev, the former being based on an Arduino and the latter on an FPGA. And those seem to be fast enough.

It will be interesting to see how much latency one can squeeze out of the hand-tracking → synthesis → audio-out path.
But even if a sub-10 ms end-to-end loop isn't possible, one could perhaps still have fun with more ambient-like music.
At least VR headset systems are more likely to be optimized for low latency than e.g. most smartphones, as visual/audible latency makes you dizzy very fast. iOS is known for good latency. And I have high hopes for the Quest 3 specific Android distribution for said reasons.
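To make the sub-10 ms question concrete, here is a back-of-the-envelope budget in Python. Every number in it (a 90 Hz tracker, 2 ms of processing, a 128-frame buffer) is my own assumption, not a measured figure for any real headset:

```python
# Back-of-the-envelope latency budget for a hand-tracking -> synthesis ->
# audio-out path. All numbers are illustrative assumptions, not measurements.

def audio_buffer_latency_ms(frames: int, sample_rate: int) -> float:
    """Latency contributed by one audio buffer of `frames` samples."""
    return 1000.0 * frames / sample_rate

tracking_ms = 1000.0 / 90   # one frame of a hypothetical 90 Hz hand tracker
processing_ms = 2.0         # gesture -> synth parameter mapping (a guess)
buffer_ms = audio_buffer_latency_ms(128, 48_000)  # 128-frame buffer @ 48 kHz

total_ms = tracking_ms + processing_ms + buffer_ms
print(f"buffer: {buffer_ms:.2f} ms, total: {total_ms:.2f} ms")
```

With these guesses the audio buffer alone is under 3 ms, but a single tracking frame already pushes the total past 10 ms, which suggests the tracker's update rate (and any prediction) matters more than the audio path itself.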

1 Like

Yes, I absolutely agree with this, but in the sense that it is the main thing that is missing from our regular experience of interacting with the world (and sounds).

Gaze, on the other hand, as output/control is an addition to that: not necessarily connected to haptics or our regular muscle memory, but an addition to what we are already proficient at. This is what I find exciting about it, a (somewhat) new horizon if you will! :wink:

This would be in a form, though, where the technology of gaze as control is used in a way that extends beyond just pointing and snapping with fingers to interact with the mapped space and objects. I wish for it to be explored further.

Yeah, this is something that comes from the hopes of what this could become someday. I am also personally less interested in complete immersion than in new ways of interacting with what I am already immersed in all around.

(and lighter/no fans/screens close to eyes et cetera et cetera :slight_smile: )

I found a slight hint of this in the approach of the Vision Pro, and it was the only thing that really stood out to me, apart from it being a more powerful (and expensive) version of what is already there.

Since this is only one side of the device, and one that perhaps points more at something in the future than at what is right now, I don't want to drive the topic too much either.

This is something I also think will be fundamentally important to both adoption and application of whichever technology that will spread. I really like single player games, though. (And especially point-and-click.) :slight_smile:

1 Like

Geert noodling around with Animoog Galaxy :slight_smile:

1 Like

I agree that gaze will be huge, but not for expression. Flat out, the eye doesn’t travel smoothly enough.

But it's fantastic for ensuring that a hand is manipulating the controls you intended. Or, in a broader sense, for reconfiguring your gear without taking your hands off of it: e.g. switching modes and loading presets; essentially hands-free keyswitches.

(I think we need something like a quantized “slow blink” gesture to replace the “gaze and pinch” selectors, but that should be easy enough to implement)
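For illustration only, that "slow blink" idea could be sketched as a tiny detector that treats a deliberate, longer-than-reflex eye closure as a click. The thresholds below are guesses, not values from any SDK:

```python
# Toy sketch of a "slow blink" selector: a deliberate, longer-than-reflex eye
# closure counts as a click. All thresholds are assumptions, not SDK values.

REFLEX_MAX_S = 0.15     # ordinary blinks are shorter than this (assumption)
SLOW_BLINK_MAX_S = 0.8  # anything longer is probably just resting eyes

class SlowBlinkDetector:
    def __init__(self):
        self._closed_since = None  # timestamp when the eyes last closed

    def update(self, eyes_closed: bool, t: float) -> bool:
        """Feed per-frame eye state; returns True when a slow blink completes."""
        if eyes_closed:
            if self._closed_since is None:
                self._closed_since = t
            return False
        if self._closed_since is None:
            return False
        # eyes just opened: measure how long they were shut
        duration = t - self._closed_since
        self._closed_since = None
        return REFLEX_MAX_S < duration <= SLOW_BLINK_MAX_S

# 20 Hz frames: a quick reflex blink first, then a deliberate slow blink
det = SlowBlinkDetector()
events = [det.update(bool(c), i * 0.05)
          for i, c in enumerate([0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0])]
```

The key design point is the lower bound: filtering out sub-150 ms closures is what makes the gesture "quantized" rather than firing on every involuntary blink.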

Integration with something like Divisimate or Camelot Pro would make the use case dramatically obvious.

1 Like

honestly, I really don't like this pinch gesture Apple are using… I've some ideas about why they are using it.
it's not very immersive/gestural… it's way too much like ‘clicking a mouse button’, just with your fingers.

frankly, to me it feels very ‘low effort’ on Apple’s part.
i.e… how can we convert apps that use mouse pointers/buttons to xR with the least effort!

look at VR apps/games and you'll see that there are many years of experience with VR gestures… what works, what doesn't… and Apple have frankly ignored that experience.

I know… early days, but let's remember, Apple have been doing R&D in this area for a very long time, as we know from various rumours/leaks… and this is supposed to be a $4k premium experience :wink:

1 Like

Could it partly also be about general familiarity for a very broad audience, something close to the sense of clicking and tapping most people are very comfortable with? It looks very ‘effortless’ and ‘casual’ in the promotional material.

I am not sure why the first iPhones had a home button, but I imagine part of it was to not remove all the buttons at once and only present a blank surface.

Right now, I am borrowing an old iPhone with a home button in order to have an international phone number, and I have noticed I can nowadays barely use it anymore! :slight_smile:

2 Likes

One aspect is afaik also privacy. As far as I understood, eye-tracking information isn't continuously available to apps, only when the user does this pinch gesture. Also not sure whether you even get the full hand model or just one “click point” per hand, if the fingers are together. That would have an impact on possibilities (and is afaik different on Quest). But I haven't investigated too deeply yet, so perhaps more is possible?

2 Likes

I think there are a bunch of different reasons…
two design decisions I think are:

a) discreet
imagine being in a coffee shop using your VP; you already look like a bit of a ‘nerd’.
but now imagine waving your hands around in the air, touching (invisible) stuff.
you're going to look like a right weirdo :wink:

many tech companies wouldn’t care about this… but this is the kind of thing Apple are ‘concerned’ about.

b) space constraints
there are few use-cases the VP really works well for, but one I can really see…
imagine an exec who is constantly flying… and spending time in hotels. it's a fantastic media consumption device! this is also the kind of person for whom $4k is not an issue :wink:

but, even in business class, you don't have enough room to be stretching your arms out and waving them around.

so, Apple reverted to a mouse pointer, just with gaze and click.
this is why there are tracking sensors at the bottom… so it can accurately track when you do the ‘click’ in your lap.

also as I mentioned, it makes it much easier to convert apps, or have hybrid apps.

don't get me wrong, this is a typical Apple move… they'll always err on the side of practicality.


BUT… gaze tracking for selection, as @greaterthanzero mentioned, can be problematic, just due to the way our vision works, and how we use gaze (thanks to evolution).

from PSVR2, I can give examples, good and bad, that I've seen.

a) GOOD
gaze selection works fantastically for target selection in games, esp. ones that move…
it's super fast, as you naturally look at stuff that moves, and then ‘click’ your target.
it feels effortless…
this works because evolution has made us pay more attention to things that move… as they could be prey or predators.

play Synapse, it makes you feel super human… like you are selecting stuff directly with your brain.

another fantastic, technical use-case is dynamic foveated rendering (DFR), basically because it simulates how we naturally use gaze for focus.
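As a toy illustration of the DFR idea (the breakpoints below are made up, not any vendor's actual values), the renderer simply trades resolution for angular distance from the gaze point:

```python
# Toy dynamic foveated rendering schedule: full resolution at the gaze point,
# falling off with angular distance from it. Breakpoints are assumptions.

def shading_scale(eccentricity_deg: float) -> float:
    """Fraction of full resolution to render at a given angle from the gaze."""
    if eccentricity_deg < 5.0:
        return 1.0   # fovea: full detail
    if eccentricity_deg < 20.0:
        return 0.5   # near periphery: half resolution
    return 0.25      # far periphery: quarter resolution
```

Because only the foveal region gets full shading, most of the frame is rendered cheaply, which is why DFR is such a big performance win on eye-tracked headsets.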

b) BAD
menus… there are so many examples where menu selection is done using gaze, and it's really bad.
because often you are trying to focus on something else, but you have to look elsewhere to select.

recently, Journey to Foundation was released, which is an adventure game… when you talk to characters, you are given options on how to respond… you look at the right one, and select.

almost everyone has had a bad time with this due to incorrect selection.

(it's become so common an issue that gaze selection has recently been made optional in titles)

why? because we don't stare at things we want to action, so even when you see the option you want to select, you'll often casually glance at the other ones. you really have to force yourself to stare, and it feels unnatural.
(we are already hearing some VP users repeating this, and ‘hoping’ they get used to it)
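One common mitigation for exactly this glance problem is dwell-time selection with a short grace period, so quick looks at other options don't fire or reset anything. A minimal sketch, with assumed (untuned) timings:

```python
# Sketch of dwell-based gaze selection with a grace period, to reduce the
# accidental selections described above. Timings are assumptions, not tuned.

DWELL_S = 0.5   # must keep looking at an item this long to select it
GRACE_S = 0.15  # glances away shorter than this don't reset the dwell timer

class GazeMenu:
    def __init__(self):
        self._item = None         # item currently being dwelled on
        self._dwell_start = None  # when the dwell began
        self._last_seen = None    # last time the item was actually gazed at

    def update(self, item, t):
        """Feed the currently-gazed menu item (or None); returns a selection."""
        if item is not None and item == self._item:
            self._last_seen = t
            if t - self._dwell_start >= DWELL_S:
                self._dwell_start = t  # re-arm instead of firing every frame
                return item
            return None
        # forgive a brief glance away from the current item
        if (item is None and self._item is not None
                and t - self._last_seen <= GRACE_S):
            return None
        self._item, self._dwell_start, self._last_seen = item, t, t
        return None
```

The trade-off is fundamental to the complaint above: a long dwell avoids accidental picks but makes you "force yourself to stare", while a short one fires on casual glances.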

again, this is already known in the VR space, there have been eye trackers for a while.

but for sure, I get the compromises Apple are making… and, as I said, they are past masters at making tech usable and acceptable.

But I just hope this metaphor is used in restricted cases, and doesn't drive how all interaction is done… I think an over-reliance on gaze will be problematic, as has been seen before.

3 Likes