Hacker News
How does the Hololens 2 matter? (stevesspace.com)
203 points by xwipeoutx on Feb 26, 2019 | 99 comments



"The field of view (FoV) is the extent of the observable world that is seen at any given moment."

FOV is better characterized by a 2-dimensional measurement, like square degrees, or steradians, than a single 1-dimensional measurement, like diagonal angle. "2x improvement" sounds like a perfectly valid way to describe a 2.4x increase in FOV area to me.
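For anyone who wants to sanity-check area-style FOV comparisons, here is a minimal Python sketch. It assumes the FOV is a rectangular pyramid described by full horizontal and vertical angles, and uses the standard solid-angle formula for that shape. The function names and the example angles are made up for illustration and are not measured specs for any headset.

```python
import math

def rect_fov_solid_angle(h_deg, v_deg):
    """Solid angle in steradians of a rectangular FOV with full angles h_deg x v_deg.

    Rectangular-pyramid formula: omega = 4 * asin(sin(h/2) * sin(v/2)).
    """
    h = math.radians(h_deg)
    v = math.radians(v_deg)
    return 4.0 * math.asin(math.sin(h / 2.0) * math.sin(v / 2.0))

def steradians_to_square_degrees(sr):
    """Convert steradians to square degrees."""
    return sr * (180.0 / math.pi) ** 2

# Hypothetical example angles, NOT official specs for any device.
old = rect_fov_solid_angle(30, 17)
new = rect_fov_solid_angle(45, 30)
print(f"area ratio: {new / old:.2f}x")
print(f"fraction of the full sphere: {new / (4 * math.pi):.3f}")
```

Dividing 4π by the solid angle of one display gives a rough "how many units to tile a sphere" figure, though real displays obviously don't tile that way.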


The author discusses square degrees, but then makes the following statement:

"if we stack 19 Hololens 2 units perfectly, we have ourselves a fully Holographic sphere)...So don’t get hyped on that!"

...and it's ridiculous assessments like this (which have no bearing on a viable AR display) that cause me to stop reading.


Right?

> It’s not even double the perimeter


As someone doing AR research, I can definitely echo the importance of higher FOV for getting closer to the "holograms." I am currently doing some investigations involving tracked hand-held devices (like smartphones/tablets) with the AR HMD, and I've found that users tend to unnaturally hold the device really far out in an attempt to see any AR content that is placed relative to the tracked object. Hopefully even a modest improvement in FOV can overcome this subconscious tendency and open up a wider range of interactions.


That's interesting; something I never really paid attention to when I was working with a mobile device AR app company. Is this research somehow tied into the relationship between UI and imaging AR head mounted displays or are you focussing on tablet/phone AR?

I too feel FOV is a key attribute, but for me it's all about it being an important "immersion cue". Objects that we don't first (subconsciously) notice in our peripheral vision, we simply don't believe.

This is actually why I think Magic Leap made a smart decision by making their device so 'enclosed' (as opposed to MSHL that is extremely open and unoccluded), which effectively artificially narrows the peripheral vision. So, by simply blocking off the area of a display window through/from which your device cannot generate image content, they improve immersion. Cheating, but who cares, so long as effect is better than without.


>Is this research somehow tied into the relationship between UI and imaging AR head mounted displays or are you focussing on tablet/phone AR?

It's focused on having virtual/AR content that is rendered by an AR HMD like the HoloLens, but which is rigidly anchored to a handheld object. So, for example, having extra windows or panels on a phone that float beyond the rectangle that defines the physical smartphone shape.

>This is actually why I think Magic Leap made a smart decision by making their device so 'enclosed' (as opposed to MSHL that is extremely open and unoccluded), which effectively artificially narrows the peripheral vision.

I think this was an issue that the HoloLens 1 had/has in terms of the plastic "eyeglasses" region -- because it looks like it wraps around the full eye FOV when looking at the headset before putting it on, users are disappointed when they wear it and it's revealed just how (relatively) low the FOV is. Setting proper expectations is important for users.


The UI for Hololens 1 was terrible. Having to hold your gaze perfectly at tiny buttons and trying to get the Air Tap to work successfully was super annoying. You'd think they'd be able to project your fingers into 3d space from your approximate eye location.

Glad they've revisited those flaws!


To be fair, it was state of the art... five years ago. HoloLens 1 was an amazing feat of engineering. Yes, it was a terrible user experience, but we wouldn't have the latest hardware without it.

As for projecting the fingers, it is possible. It's just not a great experience. Nose-pointer is an (application-level) compromise. I've built apps in HL1 where direct hand gestures are essential. And it's definitely a master interface. It's not something you can easily train new users on.


No question it was cutting edge, the spatial mapping feature worked 99.999 percent of the time, something which other platforms haven't come close to achieving.


To be fair, using their clicker/remote rather than Air Tapping makes this much better.


Instead of getting rid of the keyboard, mouse and standard 2D display, why not start with augmenting the reality around those? Keep the keyboard, keep the mouse, keep the high-speed, high-definition display, there's nothing more efficient. Touchscreens don't even come close to matching that combo in any domain, let alone exceeding it. What chance do laggier, less touchable figments have? If you can't find something that augments that reality for the better, it's time to move on.


Ever watch a bartender ring up a complex order? That's a domain where touch exceeds keyboard / mouse hands down. The same goes for digital art, such as drawing (or even just whiteboarding an idea).

There is no one universally great input method.


> Ever watch a bartender ring up a complex order?

Too many times to count and I'd disagree it's better than a keyboard equivalent. It's easier for new/casual staff and can be easier for visual people but it's awful compared to someone with a bit of training for complex things. They seem to fail at making the complex things achievable for the less trained as well IME.

Plastering drink logos on big buttons is easy for some, but for non-visual people and those not familiar with the products it's harder/slower than an alphabetical list or typing the first 2 characters. Want to put something on my account? Good luck with the touch keyboard.

Compare those modern touch screen apps to something like a TUI that used to run in the local video shop 30 years ago and there is no comparison in speed and efficiency.


I like TUIs because their limits have made their creators strive for extremely efficient UX.


Or a sufficiently complex establishment with a sufficiently trained bartender could do better with keyboard macros. Sure it may be rare but the point is touch interfaces have a complexity limit whereas keyboards and grammars don't.


How can a touch interface have a complexity limit that a keyboard doesn't? You can literally emulate a regular keyboard on a touch screen.


The answer is agility of thought and execution, and that if you emulate a keyboard on a touch interface as its main interaction paradigm it's to all effects a keyboard, but with less tactile feedback thus harder to get agile with.


Hard to imagine a more confined problem domain.


Ever watch anyone using a FreeDOS based POS system?


Honestly, when I first heard about the Hololens project, I just wanted to be able to code on the ceiling while relaxing.


You should check out Virtual Desktop with the Oculus Go. It’s astonishing.

The Go’s screen is appreciably better than the Rift’s, and the network streaming utility that brings your Windows computer’s screen into the Go is excellent.

One thing that was interesting was the zen of using it instead of a monitor - it’s just you and your screen. Nothing else in visible space. With a pair of noise cancelling headphones, I’m convinced that this is the first glimmer of productivity in the commercial space for VR that has wide appeal.

I tried it for a fair amount of time and I’m convinced that when the Oculus Quest comes out, I’ll use that instead of a monitor for some of my work.


I was gifted an Oculus Go for Christmas and typically use it to play games and watch VR YouTube, etc. I've always thought it would be cool to do some of my programming work in a VR space. Just picked up this app to try it out after your review and it does seem pretty neat. What "environment" do you prefer to work in?


Do you feel side effects from over-stimulus of the brain? From my experience with VR, it seems like an hour is about all I can handle before my eyes start hurting. Whereas with AR/MR, I feel like the lack of visual input is better for nausea and strain.


Elevr did some research on VR/AR interaction design, and this is the office they came up with: http://elevr.com/the-office-of-the-future/


That looks incredibly dystopian.


More than just the goggles?


Avatars in VR meetings that match your RW posture


Perhaps this could work for some, but for front line workers (the target of Hololens) it absolutely would not. The entire current mission of Hololens is to empower front line workers that currently have access to no computers at all, people that work with their hands and can't have a smart phone, keyboard, monitor, etc right in front of them.


I see frontline workers use those things every time I interact with them, far more efficiently than anything real demonstrated in a Hololens video so far. If you're working with your hands, it's very likely you can't afford to have your FOV obstructed by much. For example, I'd freak out if I saw a surgeon wearing one. I'd much rather they be looking at a magnified view, fed to a high-speed 2D display than a laggy 3D approximation of reality.


If your idea is inherently visual or spatial, a keyboard is terrible and a mouse is clumsy. Try signing your name with a mouse. Try doing any of this with a keyboard: http://elevr.com/experimental-still-lifes-and-landscape-inte...


Try coding by voice.



Programming languages of today are built for text. I will not be surprised if a PL is designed for voice. It can be done.


Alexa, create a module for search with O(n log(n)). :)


Sounds like an XKCD joke, but something like pip install search-something will probably be trivial and common.


What's the point if you have a screen in front of you? Why wouldn't you just update the monitor?


There's definitely room for laggy, non-interactive 3D views in just about everything. Pick anywhere in space to display a clock/calendar you can glance at. Draw with the mouse and keyboard but see it rendered on your desk. Overlay keyboard shortcuts for the new program you're learning on your real keyboard. Show a battery level overlay on the cell phone on your desk. Overlay the title of the song currently coming out of the speakers you're looking at. Overlay the estimated calories of the meal you're eating. Animate the solution to the puzzle sitting on the floor.


Eh... the author doesn't know how to compare solid angles.

Anyway, I wonder what happens when VR manufacturers increase FOV to the point where the traditional perspective transform (homogeneous linear transform) becomes impractical due to distortion towards the edges (example: https://www.youtube.com/watch?v=ICalcusF_pg).

You need >120 FOV for an immersive experience, but current graphics pipelines are built on the assumption that straight lines in world space map to straight lines in screen space, so you can't do proper curvilinear wide-angle perspective with the existing triangle rasterizer architecture.


>"straight lines in world space map to straight lines in screen space"

Then you post-process the screen space, with something like a fish-eye shader to warp your image appropriately. Sure you'll lose some resolution near the borders but the human eye won't care because it's not in its high resolution area.
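To make the post-process idea concrete, here is a rough Python sketch of the per-pixel lookup such a warp would do. It assumes an equidistant fisheye output (radius proportional to the angle off the view axis) and a rectilinear source render (radius proportional to the tangent of that angle), both covering the same FOV below 180 degrees. The function name and normalized coordinates are my own choices for illustration; a real shader would do the same math per fragment.

```python
import math

def fisheye_to_rectilinear(u, v, fov_deg):
    """Map a fisheye output point to the point to sample in the rectilinear render.

    u, v are normalized so that radius 1 corresponds to half of fov_deg in both
    images. Assumes fov_deg < 180, since tan() blows up at 90 degrees off axis.
    """
    half_fov = math.radians(fov_deg) / 2.0
    r_out = math.hypot(u, v)
    if r_out == 0.0:
        return 0.0, 0.0
    theta = r_out * half_fov                        # angle off the view axis
    r_src = math.tan(theta) / math.tan(half_fov)    # radius in the source image
    scale = r_src / r_out
    return u * scale, v * scale

# Halfway to the edge of a 90 degree fisheye samples closer to the center
# of the rectilinear image than halfway.
print(fisheye_to_rectilinear(0.5, 0.0, 90.0))
```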


The problem is without eye tracking, nothing stops you from just looking at the borders and seeing artifacts.


When I was using VR a lot I just kind of stopped looking with my eyes outside of a small area around the middle and mostly used my neck. When looking towards the edge doesn't look great I think most people adapt and just start subconsciously working around the limitations. Maybe it partially comes from being a glasses wearer almost all my life (since around 7-8) so my vision is already basically useless for detail around the edges but it wasn't a huge jump for me.


In practice, people don’t make incredibly long saccades. Instead, there’s some coordination so that the neck or body movements do the bulk of the work.


Then you render with multiple cameras rotated around their centre, using the same techniques as cubemap rendering. With high field of view, rendering a single spherical view with homogeneous resolution would be inefficient for VR, not only due to the lack of hardware support (without noticeable artefacts or some subdivision scheme) and the created distortion, but also due to the foveated nature of human vision. You'd want a small view frustum rendered at a much higher pixel per angle blended with a lower resolution rendering of the entire field of view (optimally with a medium resolution view frustum surrounding the smaller one for a more progressive blending). See: https://www.youtube.com/watch?v=lNX0wCdD2LA

With current GPUs and game engines, you need multiple cameras to achieve such effect, but it has the advantage of drastically lowering the computing demands for high-resolution VR graphics (compare 8k per eye, vs just two blended 1080p views).
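A quick back-of-the-envelope check of that comparison, as a Python sketch. It assumes "8k" means 7680x4320 and "1080p" means 1920x1080, and it ignores the blending overhead and any medium-resolution ring.

```python
def pixels(width, height):
    """Number of pixels in one rendered view."""
    return width * height

full_8k = pixels(7680, 4320)          # assumed meaning of "8k per eye"
two_1080p = 2 * pixels(1920, 1080)    # small sharp view + wide low-res view

print(full_8k / two_1080p)            # 8.0
```

So under those assumptions the blended approach shades roughly an eighth of the pixels per eye.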

You could also ray trace a single view of non-homogeneous resolution that directly takes into account the distortion characteristics of the HMD lenses, but it would likely be a lot less efficient than traditional rasterization with multiple blended cameras.


> You need >120 FOV for an immersive experience

Where does this come from? Is this a reference to research? I find the 110-120 degree FOV is pretty immersive in current day VR.

Does it have to do with: "Objects that we don't first (subconsciously) notice in our peripheral vision, we simply don't believe." (From another thread.)


Human vision has 120 degree FOV when looking forward. However you can also move your eyes, so it has to be >120 if you don't want to look at the edges of the headset.


> but current graphics pipelines are built on the assumption that straight lines in world space map to straight lines in screen space, so you can't do proper curvilinear wide-angle perspective with the existing triangle rasterizer architecture.

Isn't that approximation still fine as long as a single triangle doesn't span more than, say, 5 degrees of your FOV? If you are trying to represent a real scene, your triangles will rarely get that big. If you really want to display huge triangle, then you can just sub-divide it into a thousand smaller triangles.
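A minimal Python sketch of that subdivision idea, assuming the eye sits at the origin and the triangle does not pass through it. It measures a triangle's angular size as the largest vertex-to-vertex angle and splits at edge midpoints until every piece is under the threshold. All names here are made up for illustration.

```python
import math

def angle_deg(a, b):
    """Angle in degrees between two points as seen from the eye at the origin."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def angular_span(tri):
    """Largest vertex-to-vertex angle of the triangle, in degrees."""
    a, b, c = tri
    return max(angle_deg(a, b), angle_deg(b, c), angle_deg(c, a))

def midpoint(a, b):
    return tuple((x + y) / 2.0 for x, y in zip(a, b))

def subdivide(tri, max_deg=5.0):
    """Split a triangle into 4 at edge midpoints until each piece spans <= max_deg."""
    if angular_span(tri) <= max_deg:
        return [tri]
    a, b, c = tri
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    pieces = []
    for child in ((a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)):
        pieces.extend(subdivide(child, max_deg))
    return pieces

# A triangle one unit in front of the eye that spans 90 degrees.
big = ((-1.0, 0.0, -1.0), (1.0, 0.0, -1.0), (0.0, 1.0, -1.0))
small = subdivide(big)
print(len(small), max(angular_span(t) for t in small))
```

The vertices of each small triangle would then be projected individually with whatever curvilinear mapping you want, so the straight-edge error is confined to a few degrees.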


You should still take it into account because when you move your head around it causes things to get warpy and it may be at best weird and at worst dizzying.


Check out 360 degree Panorama formats, especially "Cubic" [1]. A similar approach was used back in 1992, in CAVE [2], using 6 virtual cameras.

[1] https://wiki.panotools.org/Panorama_formats#Cubic [2] https://en.wikipedia.org/wiki/Cave_automatic_virtual_environ...
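For the cubic layout, the core lookup is just "which of the six 90 degree cameras sees this direction". A small Python sketch, with face labels that are my own convention and not taken from either link:

```python
def cube_face(direction):
    """Return which cube face a view direction (x, y, z) lands on.

    The face is chosen by the component with the largest magnitude.
    Labels like "+x" are an arbitrary convention for this sketch.
    A zero vector has no meaningful face and is not handled specially.
    """
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return "+x" if x > 0 else "-x"
    if ay >= az:
        return "+y" if y > 0 else "-y"
    return "+z" if z > 0 else "-z"

print(cube_face((0.0, 0.0, -1.0)))   # -z
print(cube_face((0.9, 0.1, 0.2)))    # +x
```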


Very happy to see the Hololens improving. I'm hoping this can bring a growth in labour jobs which require technical skill for individuals who do not have any high-level training (e.g. electricians or mechanics). From a safety perspective, there is a lot that can be learned in environments like power plants, oil refineries or factories.


Why would you assume that electricians and mechanics don't have any 'high-level training'? Is it just because they didn't get a BS from Stanford?

Tech schools are a thing - and many of them are very good. And apprenticeship has been around for millennia, and is a wonderful way for someone to learn a trade or craft.


I think the parent's point was that unskilled people could perform those tasks. For example, I could look at my breaker box with a hololens and it could shade in the deadly parts.


Further to your point, make the previously unemployable employable through augmented reality assistance.


And the convenience of AR over an annotated picture is what makes the former much safer than the latter?


I would think so, when you consider the lack of convenience involved in the following steps:

- Realize that you should try to research which parts of your breaker box are deadly in the first place

- Find the model number on the breaker box

- Google it, and sift through the results to find relevant information about which parts to touch/not touch

The mental loops you have to jump through would convince most people not to try at all, and just call up a professional. An annotated picture would be just as good, but having the hololens understand what you're looking at and the context of your situation (no doubt a difficult set of problems to solve) would make a huge difference.


Yes, those are all steps needed to get your AR glasses to superimpose an annotated picture on top of your breaker box; unless you just postulate the glasses come with such software preinstalled and the manual for the box doesn’t have annotated pictures.


I think this is why MS is starting with the corp/industrial space. The comments I see here (for better or worse) do demonstrate the kind of uphill battle MS would experience when dealing with consumer critique and expectations.

I would postulate _exactly_ that the glasses would come with software preinstalled. Similar to how YouTube videos supplement so many instruction manuals today. Even then it's not directed primarily at the person that wants to repair a breaker box but instead the technician that would _install_ it. Though once the form factor is as ubiquitous as cell phones I'd imagine every support service will have an offering. I bet every Verizon tech would love to see exactly what Grandma is looking at when asked to reset her router. Which _then_ brings us to the opportunity for first responders to assist one another remotely. It's a value multiplier when we reduce time and increase quality in the same swing.

Look at the story about Azure Kinect helping to reduce hospital bed falls from 11,0000 a year to 0. We'll see similar reductions in house fires due to rushed wiring jobs and in defects in nearly any manufacturing process.

Remember that this is 'augmented' reality. What I hope isn't lost on folks is that the thing we're augmenting is ourselves. Granting a super human level of awareness and cognitive ability.

tl;dr - This is Tony Stark level tech.


You sound pretty hyped about AR. The Hololens seems to have considerable shortcomings. But I agree that eventually, AR can do a lot of good. I believe that truly intelligent AR headsets have the ability to support e.g. certain cases of people with mental disorders enough in their daily lives to become much more independent and free. These devices operate in real world contexts and can record them, process them and act on them. For starters, imagine an assistant that remembers perfectly where you left your car keys or your glasses while you are still half asleep in the morning. In the long term, these devices can do much more. The flipside is that they need to have always-on cameras and maybe always-on microphones to be useful. This has ramifications of its own.


Well, in my experience I was speaking to engineering firms which explained that metrics on their machinery would be visible from hundreds of metres just by looking at the structure. This vastly improves safety as it allows workers to see if there are hazards/dangers.


Classism runs strong in the tech industry.


I really don't get the factory use case. What kind of productivity boost do people expect, that could make up for the cost of modeling, both up-front and continuous with every change to the physical setup? Make up for the cumulative cost of the extra delays incurred by adding an AR content maintenance team to the already long list of people you need to coordinate in the project plan when retooling for e.g. a new product? Making the few people left on a high tech high automation factory floor even more productive is an insubstantial gain compared to retooling latency. The former just happens to be easier to quantify, wages are neatly tallied in Excel or SAP, whereas missed opportunities don't appear in your imagination unless you actively think about them.

Maybe AR will be the missing building block that makes high tech manufacturing breach the complexity wall where things start getting easier, but without more compelling arguments for that side than I have seen so far I will continue to suspect that it will just be the opposite of keeping it simple. (This is a general pair of polar opposites I like to use as a mental model in programming: either KISS or climb the complexity/abstraction mountain until it gets easier again - the problem is that you don't know in advance how high the mountain will be, how low the trough on the other side will be, could be higher, could be lower than the default state of KISS and that it is easy to get lost on the way up)


My brother (a welder) had to constantly correct design flaws by the architects and engineers involved so education is something but not everything. This ranged from having to "fix" the specified measurements to ordering more material because the highly educated apparently faltered on that note.


I mean that in the sense of being able to contact a specialist at your office if there is something you are unfamiliar with. For example, if I am an astronaut and a part fails, I can have someone direct me who designed it. In a more simple capacity, a mechanic can have support from an expert on a specific car.


I wonder how text rendering performs. If it's good, a Hololens 2 IDE would be an improvement over a laptop for working while traveling.


Text readability in specific context was passable on the original so I think this should be pretty good! The problems are going to be more in that it’s not quite a desktop window manager yet. It’s not designed to be a screen replacement. That is coming in the next couple years, but the bumps you’ll experience to make it your environment involve implementing and establishing paradigms that don’t exist yet and are maybe available as a collection of weird bit rotting tech demos.

That said, you could probably wire it up to terminal and split out movable windows with a tmux wrapper. Impress your friends and strangers while you hack the Gibson, then take it off and use your laptop screen because you’re frustrated.


It always surprises me how much these technologies try to run before they can walk. Giving people a great desktop experience equivalent to multiple high-end monitors just seems like the obvious first baby step. Everything else seems like fluff if they can't get that to be desirable.


You would think that, but as someone who has been in the Natural User Interface R&D field for a decade, I’ll tell you that this is what walking looks like. Our tools as of this year are just on the edge of being able to deliver what a consumer expects. Adding a 3rd dimension makes everything harder and none of the existing assets, let alone functional principles from the 2D world, translate.

To get what you just called a baby step is an act of barn raising. By thousands of dedicated professionals, tinkerers, artists, scientists, and large corporations. If we could have willed the “baby step” into being before now we would have. Actually many people have, but then quickly notice what’s lacking and get back to raising the barn.

Here are a couple projects I pulled with a quick google. I have a bitbucket wiki with dozens of similar projects, some in AR, VR, projection mapped, single user parallax, with gesture tracking, with custom controllers, etc... This is definitely one of those problems where it's easy to imagine so people imagine it's easy. Maybe we need you to help! Get a spatial display of some sort and let's get to work!

https://github.com/SimulaVR/Simula

https://github.com/letoram/safespaces


Yes but I'm saying _don't_ add a 3rd dimension in any but the most basic sense. Just give me what is effectively a very large 2D workspace, as if I was staring at one or more nice big monitors. No fancy metaphors, no gestures - with Hololens I can still touch type and use my existing mouse.

I am a curmudgeonly luddite, obviously, but I would pay thousands of dollars just to have a multi monitor setup that required zero effort and worked on the move (to the extent that I'd be willing to go out with the equivalent of a Segway attached to my head).

If _that's_ still impossible (i.e. it's impossible to accurately and quickly orient the device in space, even in the controlled environment of my desk, or it's just not possible to display text that's nice to read on these devices) then I don't see the point of even attempting the more esoteric stuff as anything but pure research. Clearly several multi-billion dollar companies disagree, so I'm happy if all this comes to pass either way.


Making a display with that high a resolution that stays in the same place with normal head movements and has a 60 fps refresh rate and something like ClearType is far, far more difficult than what is being done now. This IS “obvious first baby steps”.


A great desktop experience is hard. VR lens blur. AR display dimness and fov. Cost of resolution. Fixed focus depth. Heavy thing on your head. Tethered.

A nice VR gaming experience is hard. Immersive means you can't see the real world, so your balance rides on rendering, requiring high fps, low latency, and thus lots of gpu. Means a high bar for avoiding "immersion breaking" hardware and software visual oddities.

The perceived near-term market for VR gaming is larger than that of VR or AR desktops, so that set of hard has gotten investment, and the other, not so much.

Market structure, especially patents, discourages commercial exploration of smaller markets. Say you want to build and sell your great system, and that it happens to need eye tracking. One eye tracking provider sells very expensive systems to industry and government, and your volumes are far too small to interest them in cheaper. Its competitors have been bought by bigcos, and no longer offer you product. Patents, and your small market, block new competitors. So eye tracking is not available to you or your envisioned market.

Community communication infrastructure is poor. Suppose all the pieces exist somewhere. People interested in desktop experiences, willing to throw a few thousand dollars at panels, tethered gpus, eye tracking, and so on, willing to be uncomfortable, to look weird, to tolerate a regression in display quality; a 2K HMD with nice lenses; a similar 4K panel; the electronics to swap the panels; some tracking solution; and so on... Even if all those pieces exist, the communication infrastructure doesn't exist to pull them all together. Neither as forums, nor as markets.

The poor communication degrades professional understanding of the problem space. People are unclear on the dependency chains behind conventional wisdom. So high resolution is said to require eye tracking or high-end gpus. VR is said to require high frame rates. And the implicit assumption of immersive gaming is forgotten. I've found it ironic to read such, while in VR, running on an old laptop's integrated graphics, at 30 fps. Wishing someone offered a 4K panel conversion kit, so a few more pixels escaped blurry lenses. A panel I could still run on integrated graphics, albeit newer. So part of the failing to pull together opportunities, is failing to even see them.

Perhaps in an alternate universe, all patents are FRAND, all markets have an innovative churn of small businesses, and online forums implement the many things we know how to do better, but don't. And there's been screen-comparable VR for years. But that's not where we're at.


That alternate universe really did almost exist. You can still see the remnants over at http://www.nuigroup.com/go/lite.

10 years ago there was a booming scene of open source natural user interface projects. People building huge multitouch interfaces, experimenting, releasing open SLAM tools, and exploring what you could do with DIY AR/VR/projection mapping/natural feature tracking/gesture interfaces. Post the success of the first Oculus dev release all of the forums went quiet, the git repos started being unmaintained and outright scrubbed. The main contributors to the community got scooped up by the motherships and any supporting technologies locked down in what felt like about six months to a year. Leap Motion was a standout company from that time. They have been selling the exact product they built then with almost no improvement until recently. Somehow they weathered the storm, didn't sell, and are doing some really neat stuff now. Structure IO took up the stewardship of OpenNI and if you look hard enough you can still find cross platform installers that have the banned original kinect tech that Apple bought and is extremely litigious about keeping off the internet.


Sigh. Yeah. Last year I was kludging some optical hand-and-stylus tracking above laptop, keyboard as multitouch surface, and head-tracked screen-and-HMD 3D desktop, sort of laptop-nextgen for my own use... but didn't find a community left to motivate sharing or demos. :/ Thanks for your comment.


A possible baby step is a household system of stationary cameras with a mobile Hololens. The stationary cameras observe and attempt to classify objects you hold and put away; you can also teach it objects you name. Kind of like a visual Mycroft/Alexa/Google Assistant/Siri. The Hololens guides you to the objects when you can't recall where you left them.

Elderly, memory-impaired (dementia, Alzheimer's, etc.), households with children would find immediate use with the assistance, even with not-great accuracy: any kind of reminder that scores greater than zero success at finding misplaced objects would be more welcome than the alternative of zero effective recall.

If only the battery life was better, then small business warehouses would find immediate use for these as merchandise locators, as a lot of them have very haphazard methods to look up merchandise locations.


That is doable right now. I am only sort of being glib here. I just want to outline what's involved in what people think is a simple task in a spatial system.

TLDR: As of right now, your wildest dreams are pretty possible. In the next 2 years, how we compute is going to get strange. Nothing is nearly as simple in an XR environment as a traditional computing one and currently most people don't really want what they'd ask for. Building a simple (useful) XR app makes launching a web product look like assembling a lego set vs. landing on the moon.

---------

Imagenet[0] works just fine for the object identification. You would probably want to use RGBd sensors like the kinect[1] or Intel Realsense[2] instead of regular cameras, but tracking like what the Vive[3] uses could also work. The thing you just proposed would involve a network of server processes handling the spatial data and feeding extracted relevant contextual information to a wireless headset at a pretty crazy rate. Just to give you an idea, a SLAM[4] point cloud from a stereo set of cameras or a cloud from a Kinect2 or Realsense produces a stream of data that is about 200mb a second. Google Draco[5] can compress that and help you stream that data at 1/8 the size without any tuning.
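To put rough numbers on that, here is a small Python sketch of the arithmetic. The 200 figure and the 1/8 ratio come straight from the paragraph above; reading the unit as megabytes per second is an assumption, as is treating the 4-sensor room mentioned further down as a simple multiple of one stream.

```python
RAW_RATE = 200.0       # per-sensor stream rate quoted above; unit assumed to be MB/s
DRACO_RATIO = 1 / 8    # untuned compression ratio quoted above

def compressed_rate(raw_rate, sensors=1, ratio=DRACO_RATIO):
    """Total stream rate after compression, in the same unit as raw_rate."""
    return raw_rate * ratio * sensors

print(compressed_rate(RAW_RATE))              # 25.0
print(compressed_rate(RAW_RATE, sensors=4))   # 100.0
```

Even compressed, four sensors land around 100 units per second under these assumptions, which is why the server side ends up doing the heavy lifting.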

Extracting skeletal information from that is really something that only Microsoft has reliably managed to deliver and it's at the core of the Kinect/Hololens products. NuiTrack[6] is the next best thing, but registering a human involves a T-Pose and gets tricky. Definitely you could roll something specific to the application, maybe just put a fiducial marker[7] or two on a person and extrapolate skeleton from knowing where it is on their shirt. You will also want to be streaming back the RGBd, IMU, hand and skeletal tracking from the headset back to the server. This could help inform and reduce the tracking requirements from the surrounding sensors.

Out of the box, you'd probably need a base i7, 64GB+ of RAM, and a couple GTX 1080s to power 4 sensors in one room. The task of syncing the cameras and orienting[8] them would be something you'd have to solve independently. After having all of that, you would have an amazing lab to reduce the problem further and maybe require less bandwidth, but very probably to get where you're going you'd need to scale that up by 2x for dev headroom and maybe run some sort of cluster operations[9] for management of your GPU processes and pipeline. Keeping everything applicable in memory for transport+processing would be desirable so you'd want to look at something like Apache Arrow[10]. At this point you are on the edge of what is possible at even the best labs at Google, Microsoft, or Apple. The arrow people will gladly welcome you as a contributor! Hope you like data science and vector math, because that's where you live now.

After getting all of this orchestrated, you now have to stream an appropriate networked "game" environment[11] to your application client on the Hololens, but congrats! You made a baby step! Battery life is still an issue, but Disney Research has demonstrated full room wireless power[12].

Now all you have to do is figure out all the UI/UX, gesture control, text entry/speech recognition, art assets, textures, models, particle effects, post processing pipelines, spatial audio systems[13], internal APIs, cloud service APIs, and application build pipeline. The Unity asset store has a ton of that stuff, so you don't have to get in the weeds making it yourself but you will probably have to do a big lift on getting your XR/Holodeck cluster processing pipeline to produce the things you want as instantiated game objects.

Once that's done, you literally have a reality warping metaverse overlay platform to help people find their car keys.

What's crazy is that you can probably have all of it for under $15,000 in gear. Getting it to work right is where the prizes are currently living and they are huge prizes.

[0] https://en.wikipedia.org/wiki/ImageNet

[1] https://azure.microsoft.com/en-us/services/kinect-dk/

[2] https://github.com/IntelRealSense/librealsense

[3] https://www.vive.com/us/vive-tracker/

[4] http://webdiis.unizar.es/~jcivera/papers/pire_etal_iros15.pd...

[5] https://github.com/google/draco

[6] https://nuitrack.com/

[7] https://www.youtube.com/watch?v=JzlsvFN_5HI (markers are on the boxes not the robot)

[8] https://realsense.intel.com/wp-content/uploads/sites/63/Mult...

[9] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus...

[10] https://arrow.apache.org/

[11] https://docs.unity3d.com/Manual/UNetClientServer.html

[12] https://www.youtube.com/watch?v=gn7T599QaN8

[13] https://en.wikipedia.org/wiki/Ambisonics


This sounds like something a warehouse could afford to buy, but for most people organizing their stuff properly is going to be the winner. (Even at a high salary like $300k a year, $15k is still 100+ hours of your time.)


Yeah, it's not a consumer product yet. Did you miss that it's a lab to build it and to make the most "basic" solution you have to have a functional holodeck for the price of a used sedan? That's pretty bonkers.


Yes, same goes for using the walls in the Mixed Reality house as windows. It's readable and workable, but 3 real-world HD monitors are 'better' for getting work done.


I'm speaking from personal experience with the V1 and other devices using waveguide displays: I felt it is not the best type of display for "heavy work" such as reading/coding. It is quite annoying to have everything look ghostly, even if the tracking is stable and the resolution is good.


I've never tried any of these devices, but the one thing that comes to mind is that I wonder if you might get eyestrain trying to use it for 8 hours a day. When working on my laptop, I'm constantly switching focus -- looking out the window, then back to the laptop, then at the wall, etc. I think that kind of thing might end up being more difficult.


You can still switch your focus with these because you still see the real world. However, it is true that the objects are rendered ("OpenGL-like") the same way no matter where they are located in the scene, so while it gives you the feeling that you can focus your eyes on an object (because of its 3D position and the tracking), the objects always appear "clear", which can be annoying. That said, using them for 8 hours must be hard: the virtual objects emit unreal lighting (different from and stronger than the light reflected by real surfaces).


My biggest question about it isn't FOV, resolution, etc., but how long it takes to make content for it. A fraction of my job involves AR/VR content creation and I don't have days to program in clever features. Everything needs to be drag and drop, 1-click solutions with a few hours of programming to tidy things up, since not only am I making content for VR/AR, but then I have to go sell its business use.


I saw a demo from a company working on this in 2018 at automatica in Munich. They built an editor for easy setup of training software using the Hololens. Looked really promising. Unfortunately I cannot remember the name.


I think it's like with VR, but an order of magnitude more expensive right now.

Hololens is the Vive/Rift of AR and Vuzix Blade is the Focus/Quest of AR.

No end user will buy a Hololens if it looks like that and costs thousands.

But something like Google Glass that is good-looking, with more power and good voice control, could really replace smartphones in the future.

Still the Hololens will sell to enterprise customers and AR play-rooms where you can use them for a small fee for an hour (like we have VR rooms right now).

So I don't expect much more from new Hololens generations than evolutionary improvements.


I could see smartwatches like Apple Watch integrating well with smart glasses, somehow, in the future.


> Hololens 2 now has a 52° diagonal FOV and a 3:2 aspect ratio - so 43° horizontally and 29° vertically. [...] but it IS more than 4x the area (525°² vs 2236°²)

It looks like the author got confused and multiplied the wrong numbers. 43 x 29 = 1247, so this is about a 2.4x increase in area.
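
For anyone who wants to check the arithmetic, a quick sketch. It treats the FOV as a flat rectangle in degrees, takes the 30° x 17.5° figure for the first Hololens from elsewhere in this thread, and the variable names are just mine.

```python
# Flat "square degrees" comparison of the two fields of view.
hl1_h, hl1_v = 30.0, 17.5   # Hololens 1, degrees (figure used elsewhere in this thread)
hl2_h, hl2_v = 43.0, 29.0   # Hololens 2, degrees (from the article quote)

hl1_area = hl1_h * hl1_v    # matches the article's 525 square degrees
hl2_area = hl2_h * hl2_v    # 1247 square degrees, not 2236

ratio = hl2_area / hl1_area
print(f"HL1: {hl1_area:.0f} deg^2, HL2: {hl2_area:.0f} deg^2, ratio: {ratio:.2f}x")
```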


The Pythagorean theorem only holds on a flat surface, not on the inside of a sphere. For small angles, it won't make much difference, but these angles are big enough to matter.

Using this solid angle calculator to compute the solid angle covered by a rectangle [1], I get that a 43° x 29° rectangle subtends 0.335 steradian (sr), while a 30° x 17.5° rectangle subtends 0.151 sr, making for a 2.2x increase in solid angle.
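
If you'd rather not rely on a web calculator, here is a minimal sketch using the textbook formula for a right rectangular pyramid, Ω = 4·arcsin(sin(a/2)·sin(b/2)), where a and b are the full apex angles. That is an assumption about what "rectangle" means here; the linked tool may define its geometry differently, so the output need not match the figures above exactly, though the ratio lands in the same ballpark (a bit over 2x).

```python
import math


def rect_solid_angle(h_deg: float, v_deg: float) -> float:
    """Solid angle (steradians) of a rectangular pyramid with full apex
    angles h_deg and v_deg, using 4*asin(sin(h/2)*sin(v/2))."""
    half_h = math.radians(h_deg) / 2
    half_v = math.radians(v_deg) / 2
    return 4 * math.asin(math.sin(half_h) * math.sin(half_v))


hl2_sr = rect_solid_angle(43.0, 29.0)
hl1_sr = rect_solid_angle(30.0, 17.5)
sr_ratio = hl2_sr / hl1_sr

print(f"HL2: {hl2_sr:.3f} sr, HL1: {hl1_sr:.3f} sr, ratio: {sr_ratio:.2f}x")
```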

However, the numbers 43° and 29° apparently come from applying Pythagoras to the 52° diagonal field-of-view (fov). That's also incorrect, and I haven't done the math to correct it. (As an extreme case, for example, a 180° diagonal fov gives 180° horizontal and 180° vertical fov, so Pythagoras clearly breaks down.)
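
One way to do that correction, as a sketch only: assume the image sits on a flat plane in front of the eye (a rectilinear projection, which may or may not match the real optics). Then the tangents of the half-angles obey Pythagoras, not the angles themselves. The function name is made up for illustration.

```python
import math


def fov_from_diagonal(diag_deg: float, aspect_w: float, aspect_h: float):
    """Split a diagonal FOV into horizontal and vertical FOV (degrees),
    assuming a flat image plane so that
    tan(d/2)^2 = tan(h/2)^2 + tan(v/2)^2 and tan(h/2)/tan(v/2) = w/h."""
    half_diag = math.tan(math.radians(diag_deg) / 2)
    norm = math.hypot(aspect_w, aspect_h)
    h_deg = 2 * math.degrees(math.atan(half_diag * aspect_w / norm))
    v_deg = 2 * math.degrees(math.atan(half_diag * aspect_h / norm))
    return h_deg, v_deg


h, v = fov_from_diagonal(52.0, 3, 2)
print(f"52 deg diagonal at 3:2 -> {h:.1f} deg x {v:.1f} deg")

# The extreme case from the comment: a 180 degree diagonal.
h180, v180 = fov_from_diagonal(180.0, 3, 2)
print(f"180 deg diagonal -> {h180:.1f} deg x {v180:.1f} deg")
```

Under that assumption the 52° diagonal comes out slightly wider and taller than 43° x 29°, and the 180° case behaves as described.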

[1] http://tpm.amc.anl.gov/NJZTools/XEDSSolidAngle.html


I left this comment in a prior HoloLens related thread on HN the other day, but I'd like to reiterate (so I'll just copy/paste it) so I can get some more opinions on the concept. I'll add that the functionality I describe below would have limited use cases without a GPS or a device that's really useful/wearable outdoors and available to the public. But I think this should be the long term vision for holographics, as opposed to "apps" that run individual experiences.

> We need a search engine for holographic layer services. It would be like Google, but for MR experiences. Holographic services would use a protocol that defines a geofence to be discovered by the layer search engine's crawler over the internet (this could just be a meta tag on a classic website). The HoloLens or whatever MR device would continuously ping the search engine with its location, and the results would be ordered based on their relevance (size of geofence and proximity are good indicators). The MR device would then show the most relevant available layer in the corner of the FOV. Selecting the layer would allow enabling it either once or always, and the device would then deliver the holographic layer over the internet. The holographic layer would behave like a web service worker (in fact, it could be a web service worker) and would augment a shared experience which contains other active holographic layers. For example, your Google Maps holographic layer could be providing you with a path to walk to the nearest Starbucks, and once you're outside Starbucks, the Starbucks layer is also activated, which allows you to place an order.

> This concept of activated layers, I think, is a great way to avoid a future where we're being bombarded with augmented signage and unwanted experiences. In fact, you could go further and enable blocking notifications about specific/certain types of available services. (ie. don't notify me about bars or fast food restaurants.)

I also think this could have applications within intranet environments in corporate/enterprise contexts: several teams could each develop their own layers used for different purposes. That would make something like this worthwhile to pursue today, seeing as HL2 is solely targeting those environments for now.
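
To make the ranking idea concrete, here is a rough sketch of what the search engine side might do. Everything in it is hypothetical: the `Layer` fields, circular geofences, and the "smaller geofence first, then nearest" ordering are just one reading of "size of geofence and proximity", not a spec.

```python
import math
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    category: str
    lat: float       # geofence center
    lon: float
    radius_m: float  # circular geofence, an assumption for simplicity


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters (haversine formula)."""
    r = 6_371_000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def rank_layers(layers, lat, lon, blocked_categories=frozenset()):
    """Return names of layers whose geofence contains the device,
    most specific (smallest geofence) first, then nearest center."""
    hits = []
    for layer in layers:
        if layer.category in blocked_categories:
            continue  # user opted out of this kind of service
        d = distance_m(lat, lon, layer.lat, layer.lon)
        if d <= layer.radius_m:
            hits.append((layer.radius_m, d, layer.name))
    return [name for _, _, name in sorted(hits)]


layers = [
    Layer("maps", "navigation", 0.0, 0.0, 50_000.0),
    Layer("coffee-shop", "cafe", 0.0001, 0.0, 30.0),
    Layer("pub", "bar", 0.0001, 0.0, 30.0),
    Layer("far-away", "cafe", 10.0, 10.0, 30.0),
]
ranked = rank_layers(layers, 0.0001, 0.0, blocked_categories={"bar"})
print(ranked)
```

The device would show the first result in the corner of the FOV; the blocked-category filter is the "don't notify me about bars" part.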


It doesn't matter if it is 'layers' or 'apps'; basically the problem begins when you have more than one layer/app at a time.

So I guess for a long time we will be stuck with one app at a time, and some kind of manual switching.

Also there is the privacy angle: I don't think anybody wants their location pinged all the time. Maybe some QR code/beacon solutions can help, but I am not very optimistic there either.


> Also there is the privacy angle: I don't think anybody wants their location pinged all the time. Maybe some QR code/beacon solutions can help, but I am not very optimistic there either.

Maybe you could have something like K-anonymity? You discretize your location into chunks of maybe 100x100 m, take the hash of the chunk ID, and request results for all chunks whose hash starts with the same 4 or 5 digits as your chunk's hash.

Would probably require some more thinking since a consecutive series of requests for adjacent chunks could be used to uncover which particular chunks you were interested in.
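
A minimal sketch of that idea, to show the moving parts. The grid math is a crude approximation (roughly 100 m cells), SHA-256 is an arbitrary choice of hash, and all function names are made up; it does nothing about the adjacent-request problem mentioned above.

```python
import hashlib
import math

CHUNK_M = 100.0                  # target cell size in meters
METERS_PER_DEG_LAT = 111_320.0   # rough average, good enough for a sketch


def chunk_id(lat: float, lon: float) -> str:
    """Map a position to a grid cell ID of roughly 100 m x 100 m."""
    row = math.floor(lat * METERS_PER_DEG_LAT / CHUNK_M)
    # Use the row's own latitude so every point in a row shares one column width.
    row_lat = row * CHUNK_M / METERS_PER_DEG_LAT
    meters_per_deg_lon = METERS_PER_DEG_LAT * math.cos(math.radians(row_lat))
    col = math.floor(lon * meters_per_deg_lon / CHUNK_M)
    return f"{row}:{col}"


def chunk_hash(cid: str) -> str:
    return hashlib.sha256(cid.encode("utf-8")).hexdigest()


def query_prefix(lat: float, lon: float, digits: int = 4) -> str:
    """What the device sends: only the first few hex digits of its cell's hash."""
    return chunk_hash(chunk_id(lat, lon))[:digits]


def server_lookup(prefix: str, index: dict) -> dict:
    """Server side: return every cell whose hash shares the prefix."""
    return {cid: v for cid, v in index.items() if chunk_hash(cid).startswith(prefix)}


def client_filter(results: dict, lat: float, lon: float) -> list:
    """Device side: keep only the results for the cell it is actually in."""
    return results.get(chunk_id(lat, lon), [])


here = (48.137, 11.575)  # arbitrary example coordinates
index = {chunk_id(*here): ["coffee layer"], "0:0": ["unrelated layer"]}
prefix = query_prefix(*here)
results = server_lookup(prefix, index)
print(prefix, client_filter(results, *here))
```

The shorter the prefix, the more cells hide yours, at the cost of more results to download and discard.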


If you own a cellphone you’re already carrying a device that pings your location all the time. I don’t think anybody actually cares enough to change what they’re familiar with, let alone some future potentially amazing technology.


While the face blocking of a Hololens would be a nonstarter for this role, as a high school level teacher I think an AR toolset for instructors could be extremely useful.

For instance, it’s always more efficient to intervene/support a student in the moment they are having trouble/ready for the next idea, but there’s a natural limit to how much information about a class full of students is available/understandable; better tools for allowing me to know how my students are engaging/progressing/struggling/succeeding while they do the work would be wonderful.


I've got a question about AR goggles. If someone knows, please tell me:

My main use case would be to free me from sitting by a screen. I want to be able to access mainly Slack, a browser and some IDEs so I can casually do some code review or chat to team members as I do some chores around the house. I'll need notifications as well. But mainly I want to be able to read stuff while being mobile.

Are there AR goggles that will let me do that?


Not yet, in the way you describe.


There are many enterprise applications for Mixed Reality, and this product is targeted at them (and the military, as MS employees tried to argue). Nevertheless, MWC19 was for China and Huawei to position themselves in front of EU telcos and bend the imagination with foldable phones; it was difficult to find the Hololens in between the 5G logos everywhere.


Don't know if it's possible with Hololens, but one thing I thought would be pretty neat is a flight sim with physical instruments you can touch. Render the world outside the plane in a gaming rig and project it in Hololens as a video stream where the windows would be.


If someone could figure out a way to bring more "real world" into people's lives... it would probably do better than more digital... people are already full of digital and want more real life.


I wish I could improve the FOV on my glasses.


Did Google Glass 2 matter? It was marketed for the enterprise/industrial use case too.

What makes them think there is a market for these beyond gaming and game-like scenarios such as the military?


The device is marketed for industry applications. The presentation never mentioned any other kind of application for it. The price point of $3,500 precludes gaming outright. The applications are in fields where contextual information in the real world can make a difference. Consumer AR is a few years off and might be something very different. The Hololens isn't that.


Does any of the Hololens technology trickle down into Windows Mixed Reality?



