Depth Perception Using Intel RealSense

Depth Perception


Depth is a key prerequisite for many industrial tasks, such as perception, navigation, and planning. Human eyes can view the world in three dimensions, which enables us to perform a wide range of tasks. Similarly, giving depth perception to machines that already have computer vision opens up a broad range of applications in robotics, industrial automation, and various autonomous systems.

Essential specs for Depth Perception

Range & Accuracy

  • The range of the depth sensor is usually the most important specification, since it determines whether the component is usable for a given application.
  • It is also important to understand the sensor’s accuracy, as it helps in identifying objects rather than merely detecting them, and improves the overall functionality of the system.

Resolution & Field of view

  • Resolution paired with accuracy determines the overall precision of the system.
  • The field of view determines the scope of the sensor. A wide field of view lets the system take in more of the scene at once, but adds to the processing load. When only a limited area needs to be monitored, a sensor with a narrower field of view produces comparatively less data to process, which lightens the load on the processor.

Frame rate

  • For applications involving fast-moving objects, or use cases requiring continuous monitoring, many depth sensors support frame rates of up to 90 fps. As with the field of view, increasing the frame rate also increases the load on the processor.
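
To get a rough feel for how frame rate affects the processing load, the sketch below estimates the raw, uncompressed data rate of a single stream. The 848x480 resolution and the 2 bytes per pixel (a 16-bit depth format) are assumptions chosen for illustration, and `raw_stream_rate_mb_s` is a hypothetical helper, not part of any SDK.

```python
def raw_stream_rate_mb_s(width, height, fps, bytes_per_pixel=2):
    """Uncompressed data rate of one stream, in megabytes per second.

    bytes_per_pixel=2 assumes a 16-bit depth format (an assumption made
    for this illustration, not a figure from the article).
    """
    return width * height * bytes_per_pixel * fps / 1_000_000


if __name__ == "__main__":
    # Compare the same (assumed) resolution at three frame rates.
    for fps in (30, 60, 90):
        rate = raw_stream_rate_mb_s(848, 480, fps)
        print(f"848x480 @ {fps} fps: {rate:.1f} MB/s")
```

The rate grows linearly with the frame rate, so moving from 30 fps to 90 fps triples the amount of data the processor has to handle every second.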

Processing power

  • Sensors with a built-in processor are available on the market, but a built-in processor has its limitations: it offers only a standard capability, with no room for adjustment.
  • Such sensors are well suited to a fixed type of application, but are less usable when the same device must serve multiple applications. In the end it is the processor that does all the computation, so choosing one with slightly more capacity than the required specs is advisable.

RGB Camera

  • To present the output in a human-understandable format, we also need an RGB camera (a standard visible-light camera), so that objects can be identified both by computer vision and by the naked eye.

Pedestrian detection and distance estimation using intel depth sense technology – vision intelligence

Depth Field of View at Distance (Z)

Depth Field of View (Depth FOV) at any distance (Z) can be calculated using the equation:

Depth FOV = Depth Field of View
HFOV = Horizontal Field of View of Left Imager on Depth Module
B = Baseline
Z = Distance of Scene from Depth Module
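
The equation itself is not reproduced in the text above. To our understanding, the Intel RealSense D400 series datasheet gives it as Depth FOV = HFOV/2 + arctan(tan(HFOV/2) - B/Z), with B and Z in the same unit. The sketch below evaluates that form; treat the formula as an assumption to be checked against the datasheet. The 90 degree HFOV and 50 mm baseline are illustrative values, not the specs of a particular camera.

```python
import math


def depth_fov_deg(hfov_deg, baseline_mm, z_mm):
    """Depth FOV in degrees at distance z_mm.

    Implements Depth FOV = HFOV/2 + arctan(tan(HFOV/2) - B/Z), the form
    assumed in the lead-in above.
    """
    half = math.radians(hfov_deg) / 2
    return math.degrees(half + math.atan(math.tan(half) - baseline_mm / z_mm))


if __name__ == "__main__":
    # Illustrative values only: 90 degree HFOV, 50 mm baseline.
    for z_mm in (300, 1000, 5000):
        print(f"Z = {z_mm} mm: Depth FOV = {depth_fov_deg(90, 50, z_mm):.2f} deg")
```

Under this formula, the depth FOV approaches the imager's HFOV as the scene moves farther away, because the baseline offset B/Z becomes negligible.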

Depth start point (ground zero reference)
The depth start point, or ground zero reference, is the plane at which depth = 0. For Intel RealSense cameras, this point is referenced from the front of the camera cover glass.

Depth Camera Functions

The firmware contains the operating instructions. At runtime, the Vision Processor D4 loads the firmware and programs the component registers. If the Vision Processor D4 is configured for update or recovery, the unlocked R/W region of the firmware can be changed.

Initializing & setting up the device
For this build, we will use an Intel RealSense depth camera. After unboxing the camera, plug it into any PC, preferably one running a Windows operating system. Then download and install Intel’s Software Development Kit (SDK), which includes the following:

Intel® RealSense™ Viewer – This application can be used to view, record, and play back depth streams, set camera configurations, and adjust other controls.

Depth Quality Tool – This application can be used to test depth quality, including distance to plane accuracy, Z accuracy, the standard deviation of the Z accuracy, and fill rate.

Debug Tools – These command-line tools gather data and generate logs to assist in debugging the camera.

Code Examples – Examples to demonstrate the use of SDK to include D400 Series camera code snippets into applications.

Wrappers – Software wrappers supporting common programming languages and environments such as ROS, Python, MATLAB, node.js, LabVIEW, OpenCV, PCL, .NET, and more.

Programming the sensor to detect a person and estimate their distance

To begin with, we can use any code editor on a machine with Python 3.7 or above installed. The steps below work towards our target of detecting a person and measuring his/her distance from the camera.

Step 1: Import

First, we import the necessary libraries: pyrealsense2, NumPy, and cv2 (OpenCV).

Step 2: Configure depth & color streams

Then we can start streaming image data from both cameras (depth & RGB), specifying the frame rate and resolution for each stream.
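
A minimal sketch of this configuration with the pyrealsense2 wrapper might look as follows. The 640x480 resolution and 30 fps are assumptions, so pick a mode your camera supports, and `start_streams` is a hypothetical helper name. It needs a connected camera to run.

```python
# Resolution and frame rate are assumptions; pick a mode your camera supports.
STREAM_WIDTH, STREAM_HEIGHT, STREAM_FPS = 640, 480, 30


def start_streams(width=STREAM_WIDTH, height=STREAM_HEIGHT, fps=STREAM_FPS):
    """Start depth and color streaming and return the running pipeline."""
    # Imported inside the function so this file still loads on machines
    # where the RealSense SDK is not installed.
    import pyrealsense2 as rs

    pipeline = rs.pipeline()
    config = rs.config()
    # 16-bit depth values, and 8-bit BGR color (the channel order OpenCV uses).
    config.enable_stream(rs.stream.depth, width, height, rs.format.z16, fps)
    config.enable_stream(rs.stream.color, width, height, rs.format.bgr8, fps)
    pipeline.start(config)
    return pipeline
```

Requesting the color stream as `bgr8` means the frames can be handed to OpenCV without a channel conversion.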

RGB / BGR image frames & Depth point-cloud frames

Step 3: Defining the point of measurement

With Intel RealSense, we can measure the distance at any given pixel, so it is important to define the pixel at which the measurement is taken.
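
The article does not say which pixel is used. One simple choice, assumed here, is the center of a rectangle, either the whole frame or the bounding box of a detected object. `point_of_measurement` is a hypothetical helper written for this sketch.

```python
def point_of_measurement(box, frame_width, frame_height):
    """Return the (x, y) pixel at the center of box, clamped to the frame.

    box is (x1, y1, x2, y2) in pixels. Using the center is an assumption;
    any pixel inside the object could be chosen instead.
    """
    x1, y1, x2, y2 = box
    x = (x1 + x2) // 2
    y = (y1 + y2) // 2
    # Clamp so the point is always a valid pixel index.
    x = min(max(x, 0), frame_width - 1)
    y = min(max(y, 0), frame_height - 1)
    return x, y


if __name__ == "__main__":
    print(point_of_measurement((100, 50, 300, 450), 640, 480))
```

Clamping matters because a detector can return boxes that extend slightly past the image border.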

Step 4: Person detection

We have used MobileNetSSD as the model to detect persons because it is lightweight and widely compatible.
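
As a sketch of how this step could be done, the code below loads a Caffe MobileNetSSD model through OpenCV's dnn module and keeps only the person detections. Several things here are assumptions: that the model is the commonly distributed VOC-trained Caffe version (in which "person" is class 15), the preprocessing constants, the 0.5 confidence threshold, and the helper names. The model file paths are left to the caller.

```python
PERSON_CLASS_ID = 15  # "person" in the VOC-trained MobileNetSSD labels (assumed)


def load_detector(prototxt_path, model_path):
    """Load a Caffe MobileNetSSD model through OpenCV's dnn module."""
    import cv2  # imported here so the parsing helper below needs no OpenCV

    return cv2.dnn.readNetFromCaffe(prototxt_path, model_path)


def run_detector(net, color_image):
    """Run one forward pass on a BGR image and return the raw detections."""
    import cv2

    # Scale factor and mean are the values commonly used with this model.
    blob = cv2.dnn.blobFromImage(
        cv2.resize(color_image, (300, 300)), 0.007843, (300, 300), 127.5
    )
    net.setInput(blob)
    return net.forward()  # shape (1, 1, N, 7)


def person_boxes(detections, frame_width, frame_height, min_confidence=0.5):
    """Keep confident person detections as (x1, y1, x2, y2, confidence)."""
    boxes = []
    for det in detections[0][0]:
        class_id, confidence = int(det[1]), float(det[2])
        if class_id != PERSON_CLASS_ID or confidence < min_confidence:
            continue
        # Box corners are normalized to [0, 1]; scale them to pixels.
        x1 = int(det[3] * frame_width)
        y1 = int(det[4] * frame_height)
        x2 = int(det[5] * frame_width)
        y2 = int(det[6] * frame_height)
        boxes.append((x1, y1, x2, y2, confidence))
    return boxes
```

Keeping the parsing in `person_boxes` separate from the network call makes it easy to try with hand-written detection rows, without a model file or camera.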

Step 5: Extracting depth of the detected object

The point of measurement is applied to the detected object, and the depth value at that pixel is returned so that it can be shown on the output screen.
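
A minimal sketch of this step, assuming the depth frame comes from pyrealsense2, whose depth frames provide `get_distance(x, y)` returning the distance in meters. `distance_at` is a hypothetical wrapper, and treating a reading of 0 as "no depth data" is an assumption spelled out in the code.

```python
def distance_at(depth_frame, point):
    """Distance in meters at pixel point (x, y), or None if unavailable."""
    x, y = point
    distance_m = depth_frame.get_distance(x, y)
    # Assumption: a reading of 0 means the sensor has no depth for this pixel.
    return distance_m if distance_m > 0 else None
```

Returning `None` for missing depth keeps a hole in the depth map from being reported as an object at zero distance.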

detected object - Output Screen

Step 6: Displaying the depth values on the detected object

The depth value returned for the point of measurement is then drawn on the output screen, on top of the detected object.
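
One way to show the reading on screen is to draw the bounding box and a text label on the color image with OpenCV. The label wording, colors, and helper names below are assumptions made for this sketch.

```python
def distance_label(distance_m):
    """Text shown next to a detected person."""
    if distance_m is None:
        return "person: no depth"
    return f"person: {distance_m:.2f} m"


def annotate(color_image, box, distance_m):
    """Draw the box and distance label on a BGR image, in place."""
    import cv2  # imported here so distance_label can be used without OpenCV

    x1, y1, x2, y2 = box
    green = (0, 255, 0)
    cv2.rectangle(color_image, (x1, y1), (x2, y2), green, 2)
    # Keep the label inside the image when the box touches the top edge.
    origin = (x1, max(y1 - 10, 15))
    cv2.putText(color_image, distance_label(distance_m), origin,
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, green, 2)
    return color_image
```

The annotated image can then be displayed with `cv2.imshow` inside the streaming loop.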

Step 7: Stopping the pipeline

Finally, the pipeline is stopped to end the streaming.
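
A common way to make sure the pipeline is always stopped is to wrap the streaming loop in try/finally. The sketch below assumes a pyrealsense2 pipeline, which provides `wait_for_frames()` and `stop()`. `run_stream` and its `handle_frames` callback are hypothetical names.

```python
def run_stream(pipeline, handle_frames, max_frames=None):
    """Process framesets until handle_frames returns False, then stop.

    Returns the number of framesets processed.
    """
    processed = 0
    try:
        while max_frames is None or processed < max_frames:
            frames = pipeline.wait_for_frames()
            processed += 1
            if handle_frames(frames) is False:
                break
    finally:
        # Always release the camera, even if processing raised an error.
        pipeline.stop()
    return processed
```

Here `handle_frames` would perform the detection, depth lookup, and display steps for each frameset, and return `False` when the user asks to quit.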

OptiSol DataLabs - RGB Depth Output

Benefits of Depth Perception

Improved object recognition

With the help of depth sensing technology, Intel RealSense cameras can detect and recognize objects more accurately than traditional 2D cameras. This makes them ideal for applications such as object tracking, gesture recognition, and facial recognition.

More immersive gaming and virtual reality experiences

Intel RealSense cameras can capture the depth information of the surrounding environment, allowing for more realistic gaming and virtual reality experiences. This can enhance the user’s sense of presence and immersion in the virtual world.

Improved security and surveillance

Depth sensing cameras can be used for security and surveillance applications such as tracking the movement of people and objects, and detecting intruders. The accuracy of depth information can provide better detection and tracking capabilities compared to traditional 2D cameras.

Enhanced robotics and automation

Intel RealSense technology can be used to improve the accuracy and reliability of robots and automation systems. The depth information can be used to guide robots and other automated systems through complex environments, allowing them to navigate and interact with objects more efficiently.

Improved 3D scanning and printing

Intel RealSense cameras can be used to create accurate 3D scans of objects for 3D printing, modeling, and similar uses. The depth information yields more detailed and accurate models, which improves the quality of the resulting 3D prints.
