
Mixed Reality Overview


Platform for Situated Intelligence supports the development of mixed reality applications, from processing the rich multimodal sensor streams to rendering output on the device. For now, only the HoloLens 2 device has been tested and is supported.

The overview below is structured in the following sections:

  1. Underlying Technologies - describes the underlying technologies that enable working with the HoloLens 2 device.
  2. Prerequisites - describes the set of prerequisites for building mixed reality applications with \psi.
  3. List of Components - describes the \psi components available for mixed reality.
  4. Capturing HoloLens Sensor Streams - describes tools for collecting data with the HoloLens 2.
  5. Coordinate Systems - explains the basis conventions for the coordinate systems used.
  6. Example App - briefly describes an example application.
  7. Visualization of Mixed Reality Data Types - explains how to visualize mixed reality data.

1. Underlying Technologies

Integration between \psi and the HoloLens 2 is enabled by Windows Runtime APIs, as well as HoloLens Research Mode and StereoKit.

HoloLens Research Mode

HoloLens Research Mode provides access to the visible light and depth cameras (including both depth and IR images), and IMU sensors. As its name suggests, this functionality is meant for developing research applications and prototypes not intended for deployment, and there are no assurances that Research Mode or equivalent functionality is going to be supported in future hardware or OS updates. The same holds true for applications developed with \psi using Research Mode. A GitHub repository of samples demonstrating how to use Research Mode (outside of \psi) can be found here.

HoloLens Research Mode functionality is provided for \psi applications via the HoloLens2ResearchMode project, which wraps selected HoloLens 2 Research Mode APIs in Windows Runtime classes. The generated Windows Runtime component may then be consumed by C# Universal Windows Platform (UWP) apps.

StereoKit

StereoKit is an open-source library for building mixed reality applications with .NET. It provides easy access to inputs like the user's head, hands, and eyes, and includes a large set of easy-to-use UI and rendering capabilities.
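To give a flavor of the library (independent of \psi), below is a minimal StereoKit program; this is just a sketch against the public StereoKit API:

using StereoKit;

class Program
{
    static void Main()
    {
        if (!SK.Initialize(new SKSettings { appName = "Hello" }))
        {
            return;
        }

        // render a small cube at the origin on every frame
        SK.Run(() => Mesh.Cube.Draw(Material.Default, Matrix.TS(Vec3.Zero, 0.1f)));
    }
}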


2. Prerequisites

In addition to the basic prerequisites for building the \psi codebase, you will need to follow these steps to install the required mixed reality build tools. In particular, make sure you have the following installed in Visual Studio:

  • Workload:
    • Universal Windows Platform (UWP) development
  • Individual components:
    • MSVC v142 - VS 2019 C++ ARM64 build tools (v14.27)
    • C++ Universal Windows Platform support for v142 build tools (ARM64)

NOTE: Visual Studio may at some point prompt you to also install the C++ (v142) Universal Windows Platform tools. Go ahead and install them when prompted.

Enable Developer Mode and the Device Portal

Follow these steps to enable Developer Mode and ensure you are able to use the Device Portal, setting yourself up for WiFi or USB access (USB is optional).

You can then try to connect to the HoloLens 2 device over WiFi by using its IP address, and access the device portal. You will have to set up a username/password by requesting a PIN, which will be displayed on the HoloLens 2.

Enable Research Mode

Research Mode is required for accessing data from the IMU sensors, visible light cameras, and depth cameras (including IR). Follow these steps to enable research mode on the device.


3. List of Components

The Microsoft.Psi.MixedReality and Microsoft.Psi.MixedReality.UniversalWindows projects (and corresponding NuGet packages) provide several components that expose the various HoloLens 2 sensors as \psi streams and enable rendering various holograms. These input sources and renderer components can then be used in any custom \psi pipeline making up an application that runs on the HoloLens 2 device itself. Note that Microsoft.Psi.MixedReality targets AnyCPU, while Microsoft.Psi.MixedReality.UniversalWindows targets Arm for UWP (\psi does not currently support Arm64 projects).

The set of components provided, grouped by the underlying technology each relies on, is listed below; a brief usage sketch follows the tables.

Using Windows Media Capture APIs (in Microsoft.Psi.MixedReality.MediaCapture namespace):

| Name | Description | Platform |
| --- | --- | --- |
| PhotoVideoCamera | Component that captures and streams RGB images from the device's PV (photo-video) camera. | UWP |
| Microphone | Component that captures and streams audio from the device's microphone, using Windows.Media.Capture APIs. | UWP |

Using Windows Perception APIs (in Microsoft.Psi.MixedReality namespace):

| Name | Description | Platform |
| --- | --- | --- |
| SceneUnderstanding | Component that captures and streams Scene Understanding information. | UWP |

Using OpenXR APIs (in Microsoft.Psi.MixedReality.OpenXR namespace):

| Name | Description | Platform |
| --- | --- | --- |
| HandsSensor | Component that tracks the user's hand poses. | AnyCPU |

Using WinRT APIs (in Microsoft.Psi.MixedReality.WinRT namespace):

| Name | Description | Platform |
| --- | --- | --- |
| GazeSensor | Component that tracks the user's head and gaze poses. | UWP |

Using Research Mode APIs (in Microsoft.Psi.MixedReality.ResearchMode namespace):

| Name | Description | Platform |
| --- | --- | --- |
| DepthCamera | Component that captures and streams depth images from the device's depth camera. | UWP |
| VisibleLightCamera | Component that captures and streams grayscale images from the device's visible light cameras. | UWP |
| Gyroscope | Component that captures and streams gyroscope data. | UWP |
| Accelerometer | Component that captures and streams accelerometer data. | UWP |
| Magnetometer | Component that captures and streams magnetometer data. | UWP |

Using StereoKit APIs (in Microsoft.Psi.MixedReality.StereoKit namespace):

| Name | Description | Platform |
| --- | --- | --- |
| Microphone | Component that captures and streams audio from the device's microphone (currently only supports 1-channel WAVE_FORMAT_IEEE_FLOAT at 48kHz). | AnyCPU |
| HeadSensor | Component that tracks the user's head pose. | AnyCPU |
| EyesSensor | Component that tracks the user's eye gaze. | AnyCPU |
| HandsSensor | Component that tracks the user's hand poses. | AnyCPU |
| SpatialSound | Component that renders audio at a particular 3D location. | AnyCPU |
| Handle | Component that implements a moveable StereoKit UI handle. | AnyCPU |
| Box3DRenderer | Component that renders a Box3D using StereoKit. | AnyCPU |
| EncodedImageRectangle3DRenderer | Component that renders an encoded image placed on a Rectangle3D using StereoKit. | AnyCPU |
| HandsRenderer | Component that controls how StereoKit renders the user's hands. | AnyCPU |
| Mesh3DRenderer | Component that renders a Mesh3D using StereoKit. | AnyCPU |
| MeshRenderer | Component that renders a single StereoKit mesh. | AnyCPU |
| Rectangle3DRenderer | Component that renders a Rectangle3D using StereoKit. | AnyCPU |
| TextRenderer | Component that renders text on a billboard using StereoKit. | AnyCPU |
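
As a rough illustration of how these components are used, the sketch below wires one of the Research Mode sensor components into a \psi pipeline running on the device and persists its output to a store; the DepthImage emitter name and the store location are assumptions for illustration:

using (var pipeline = Pipeline.Create())
{
    // create a store on the device to persist the captured streams (location assumed)
    var store = PsiStore.Create(pipeline, "HoloLensData", ApplicationData.Current.LocalFolder.Path);

    // instantiate a sensor component and persist its output (emitter name assumed)
    var depthCamera = new DepthCamera(pipeline);
    depthCamera.DepthImage.Write("DepthImage", store);

    pipeline.RunAsync();

    // ... keep the application alive while the pipeline runs ...
}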

4. Capturing HoloLens Sensor Streams

Some useful tools and example apps for capturing, remoting, and exporting HoloLens sensor data are provided in the HoloLensCapture folder. These include:

  • HoloLensCaptureApp - the capture app, which runs on the HoloLens 2 and remotes its sensor streams.
  • HoloLensCaptureServer - the server app, which runs on a separate machine and persists the received streams to a \psi store.
  • HoloLensCaptureExporter - a tool for exporting the captured \psi stores to other standard formats.

Data can be collected by running the capture app on the HoloLens device and remoting the streams of sensor data to the server app running on a different machine. Communication may be over WiFi or via USB tethering. Data stores written by the server may then be examined and analyzed in PsiStudio, or may be processed by other \psi applications. While \psi stores are optimized for performance and work well with PsiStudio and other \psi applications, you can also use the exporter tool to export to other standard formats.
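
For instance, once a capture has been written to a store, a separate \psi application can open it and process individual streams. Below is a minimal sketch; the store name, path, and stream name are assumptions for illustration:

using (var pipeline = Pipeline.Create())
{
    // open the store written by the capture server (name and path assumed)
    var store = PsiStore.Open(pipeline, "HoloLensCapture", @"C:\data\capture");

    // open one of the captured streams (stream name assumed) and inspect it
    var audio = store.OpenStream<AudioBuffer>("Audio");
    audio.Do((buffer, envelope) => Console.WriteLine($"{envelope.OriginatingTime}: {buffer.Length} bytes"));

    pipeline.Run();
}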


5. Coordinate Systems

HoloLens assumes a different basis for its coordinate systems than \psi does. In HoloLens, it is assumed that Forward=-Z, Right=X, and Up=Y:

 Y
 |   -Z
 |  /
 | /
 +-----> X

As a reminder, in \psi, the basis assumption (as inherited from MathNET) is that Forward=X, Left=Y, and Up=Z:

       Z
       |   X
       |  /
       | /
 Y <---+
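
Converting between the two conventions is a fixed change of basis. The provided components already handle this for you (see below), but as a sketch of the underlying math, a hypothetical helper mapping a HoloLens point into the \psi basis (using the MathNet.Spatial Point3D type) would be:

// \psi forward (X) = -HoloLens Z; \psi left (Y) = -HoloLens X; \psi up (Z) = HoloLens Y
static Point3D HoloLensBasisToPsiBasis(Point3D p) => new Point3D(-p.Z, -p.X, p.Y);

For example, a point one meter straight ahead, (0, 0, -1) in the HoloLens basis, maps to (1, 0, 0) in the \psi basis.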

In addition, StereoKit assumes by default a root coordinate system frame that is essentially the pose of the headset at the moment the application started running. However, it is useful to define a "world" coordinate system frame that is consistent across sessions of running your mixed reality application. To accomplish this, simply make sure to call

MixedReality.Initialize();

at the beginning of your application. This call generates and persists a world spatial anchor the first time the app is installed and run, or loads the existing world spatial anchor from memory if the app has been installed and run before (simply uninstall the app on the HoloLens if you wish to create a new world spatial anchor from scratch).

In the provided sensor components, all coordinate system poses are immediately transformed (before being published on streams) into the \psi basis in world coordinates (using the world spatial anchor). Similarly, the provided StereoKit rendering components automatically render in the correct coordinate system. By following these existing patterns when developing new apps and components, you will be able to reason fully in the familiar \psi basis without worrying about the HoloLens basis or how the device happened to be posed at startup.


6. Example App

Let's walk through a simple example of a mixed reality application using \psi. In this example, we'll render a spinning virtual marker in the user's environment. The user can also grab the marker and move it to a new location. The project containing this example will be made available shortly in the samples repository.

In Program.cs, create a main function with some boilerplate code at the top for initializing StereoKit and the world coordinate system:

static void Main()
{
    if (!SK.Initialize(
        new SKSettings
        {
            appName = "DemoApp",
            assetsFolder = "Assets",
        }))
    {
        throw new Exception("StereoKit failed to initialize.");
    }

    // Initialize MixedReality statics
    MixedReality.Initialize();

    ...
}

Then, once we have initialized a \psi pipeline (perhaps in response to a StereoKit UI event), we can instantiate and wire together components in the usual way, as sketched below.
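
For reference, the surrounding plumbing might look roughly as follows (a sketch; in the actual sample, where and when the pipeline is created and started may differ):

// create the pipeline that the component snippets below plug into
var pipeline = Pipeline.Create();

// ... instantiate and wire together components here ...

// start the pipeline running
pipeline.RunAsync();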

In this example, we will first instantiate a rendering component for the virtual marker (using a pre-loaded mesh file that is included in the sample). We want the marker to start out 1 meter forward and 30 centimeters below the world origin:

// Instantiate the marker renderer (starting pose of 1 meter forward, 30cm down).
var markerPose = CoordinateSystem.Translation(new Vector3D(1, 0, -0.3));
var markerMesh = MeshRenderer.MeshFromEmbeddedResource("HoloLensSamples.Assets.Marker.Marker.glb");
var marker = new MeshRenderer(pipeline, markerMesh, markerPose, new Vector3D(1, 1, 1), System.Drawing.Color.LightBlue);

Next we will instantiate a Handle component in the same position as the marker, and with bounds defined by the marker mesh. This handle can be grabbed by the user and moved around.

// handle to move marker
var handle = new Handle(
    pipeline,
    markerPose,
    new Vector3D(markerMesh.Bounds.dimensions.x, markerMesh.Bounds.dimensions.y, markerMesh.Bounds.dimensions.z));

Next we will use a Generator source component to generate a stream of rotations that will be used to slowly spin the marker.

// slowly spin the marker
var spin = Generators
    .Range(pipeline, 0, int.MaxValue, TimeSpan.FromMilliseconds(10))
    .Select(i => CoordinateSystem.Yaw(Angle.FromDegrees(i * 0.5)));

Finally, we will Join the spin rotation with the user-driven Handle movement, and pipe the final transformation to the marker mesh renderer's pose input receiver:

// combine spinning with user-driven movement
spin.Join(handle, RelativeTimeInterval.Infinite)
    .Select(m => m.Item1.TransformBy(m.Item2))
    .PipeTo(marker.Pose);

And that's it! This is a simple example, but by bringing to bear all of the capabilities and affordances of \psi, one can start to imagine many more interesting and complex applications that can be quickly developed for mixed reality scenarios. While this sample focused on driving the output/rendering with \psi, the infrastructure provides easy access to all of the sensor streams as well (see the list of source components above), which can be processed, combined, and manipulated using the full power of the components and operators available in the \psi framework. Have fun!


7. Visualization of Mixed Reality Data Types

Hand Visualization

The Microsoft.Psi.MixedReality.Visualization.Windows project provides visualizers for the Microsoft.Psi.MixedReality.OpenXR.Hand and Microsoft.Psi.MixedReality.StereoKit.Hand types. To make use of these visualizers, you'll have to add the Microsoft.Psi.MixedReality.Visualization.Windows.dll to the list of additional assemblies that are dynamically loaded by Platform for Situated Intelligence Studio. For more information on how to set this up, see this page on 3rd Party Visualizers.

A Note on Back-compatibility

Some mixed reality data classes were renamed and migrated to a new namespace in release 0.18 (see also the Release Notes). If you have \psi data stores containing streams collected with version 0.17, you may need to add corresponding type mappings to Platform for Situated Intelligence Studio in order to load and visualize these streams with version 0.18. You can do this by going to File > Edit Settings ..., and then editing the collection of Type Mappings. For instance, to visualize streams of the previous Microsoft.Psi.MixedReality.HandXR type, you'll need to add the following type mapping.

Microsoft.Psi.MixedReality.HandXR, Microsoft.Psi.MixedReality, Version=0.17.52.1, Culture=neutral, PublicKeyToken=null:Microsoft.Psi.MixedReality.OpenXR.Hand, Microsoft.Psi.MixedReality

This tells PsiStudio to map the previous Microsoft.Psi.MixedReality.HandXR type into the new corresponding type Microsoft.Psi.MixedReality.OpenXR.Hand.
