Bird's eye view for wheelchair

GitHub: https://github.com/Juhyung-L/bird_view

Background

For my university's capstone project, I was tasked with creating a device that helps our client visualize his surroundings while driving his power wheelchair. Our client suffered from paralysis below the neck, so his power wheelchair was his only means of mobility. He specifically needed a device that could help him see the area around the back half of his wheelchair.

Our team took inspiration from the bird's eye view that some car models provide. The working principle behind the bird's eye view is to use multiple wide-angle (fisheye) cameras attached to the sides of the car and computationally combine all video streams to provide a top-down view of the car.

Although the system did not look too difficult to implement at first, we quickly realized that building something durable and reliable enough for long-term use on a moving wheelchair was extremely difficult. We therefore divided the team into sub-teams. I led the software engineering sub-team, which wrote the code that processes the video frames and combines them into a single bird's eye view.

The purpose of this blog post is to explain the inner workings of the GUI application that my sub-team developed.

Implementation

Because our client could see in front of him, we decided that three cameras, mounted on the left, right, and back of the wheelchair, were sufficient.

Gazebo model used for simulation

When developing software for physical systems, it is often beneficial to recreate the system inside a simulation. We used Gazebo to create a simplified model of a wheelchair with 3 cameras attached to its sides. The model is a simple two-wheeled rectangular prism shown above. The small red cubes are the wide-angle cameras simulated using Gazebo's sensor plugin.

The image processing pipeline is undistortion --> projection --> combination. The raw image frames from each camera go through all three steps to produce the bird's eye view.

Undistortion

Wide-angle cameras capture a wider field of view, but at the cost of producing heavily distorted images.

Gazebo's camera plugin

Gazebo's wide-angle camera plugin (180 degree field of view)

(The two cameras are the same distance away from the object)

We used wide-angle cameras instead of pinhole cameras, which have much less distortion, because pinhole cameras do not provide the field of view required to make the bird's eye view.

Overlapping fields of view highlighted in red

The fields of view of the cameras need to overlap to produce a continuous view of the surroundings.

Hence, the first step in the image processing pipeline is undistorting the raw image frames. This is a fairly simple task using OpenCV's fisheye calibration module.

Using the cv::findChessboardCorners() function, you can find the corner points of a chessboard (a corner point is a point where two white tiles and two black tiles meet). How far these detected corner points deviate from a regular grid reveals the severity of the distortion. Passing them to the cv::fisheye::calibrate() function gives you the camera matrix along with the 4 distortion coefficients that describe the distortion mathematically (details about the math are at https://docs.opencv.org/4.x/db/d58/group__calib3d__fisheye.html).
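To make the role of the 4 coefficients concrete, here is a minimal stdlib-only sketch of the forward fisheye model described on that OpenCV documentation page, assuming zero skew. The names FisheyeIntrinsics, Pixel, and projectFisheye are hypothetical and are not taken from our repository or from OpenCV.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical container for a calibration result: focal lengths,
// principal point, and the 4 fisheye distortion coefficients k1..k4.
struct FisheyeIntrinsics {
    double fx, fy, cx, cy;
    std::array<double, 4> k;
};

struct Pixel {
    double u, v;
};

// Project a 3D point in the camera frame onto the image using the
// fisheye model: theta_d = theta * (1 + k1*theta^2 + k2*theta^4
//                                     + k3*theta^6 + k4*theta^8).
// Skew is assumed to be zero in this sketch.
Pixel projectFisheye(const FisheyeIntrinsics& cam, double x, double y, double z) {
    assert(z > 0.0);  // the point must be in front of the camera
    const double a = x / z;
    const double b = y / z;
    const double r = std::sqrt(a * a + b * b);
    if (r < 1e-12) {
        return {cam.cx, cam.cy};  // a point on the optical axis hits the principal point
    }
    const double theta = std::atan(r);  // angle between the ray and the optical axis
    const double t2 = theta * theta;
    const double thetaD =
        theta * (1.0 + t2 * (cam.k[0] + t2 * (cam.k[1] + t2 * (cam.k[2] + t2 * cam.k[3]))));
    const double scale = thetaD / r;
    return {cam.fx * scale * a + cam.cx, cam.fy * scale * b + cam.cy};
}
```

With all four coefficients set to zero, the image radius grows with the angle theta instead of tan(theta), which is why straight lines bend more and more toward the edges of the frame.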

Using the camera matrix and the 4 distortion coefficients (both estimated by cv::fisheye::calibrate()), cv::fisheye::initUndistortRectifyMap(), and cv::remap(), you can produce the undistorted image frames. Notice that the field of view decreased due to the undistortion process. However, it is still sufficient to produce the bird's eye view.
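Conceptually, the remap step is a per-pixel lookup: each output pixel at (x, y) takes its value from the source pixel at (mapX(x, y), mapY(x, y)). The sketch below illustrates this with nearest-neighbor sampling on a grayscale buffer, using only the standard library. GrayImage and remapNearest are hypothetical names for illustration. This is not OpenCV's implementation, which also offers interpolation and border modes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal grayscale image: row-major, one byte per pixel.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> data;

    GrayImage(int w, int h, std::uint8_t fill = 0)
        : width(w), height(h), data(static_cast<std::size_t>(w) * h, fill) {}

    std::uint8_t& at(int x, int y) { return data[static_cast<std::size_t>(y) * width + x]; }
    std::uint8_t at(int x, int y) const { return data[static_cast<std::size_t>(y) * width + x]; }
};

// For every destination pixel, look up which source pixel to copy.
// mapX and mapY are row-major arrays of size dstW * dstH holding source
// coordinates. Lookups that fall outside the source stay black (0).
GrayImage remapNearest(const GrayImage& src,
                       const std::vector<float>& mapX,
                       const std::vector<float>& mapY,
                       int dstW, int dstH) {
    assert(mapX.size() == static_cast<std::size_t>(dstW) * dstH);
    assert(mapY.size() == mapX.size());
    GrayImage dst(dstW, dstH, 0);
    for (int y = 0; y < dstH; ++y) {
        for (int x = 0; x < dstW; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * dstW + x;
            const long sx = std::lround(mapX[i]);  // nearest source column
            const long sy = std::lround(mapY[i]);  // nearest source row
            if (sx >= 0 && sx < src.width && sy >= 0 && sy < src.height) {
                dst.at(x, y) = src.at(static_cast<int>(sx), static_cast<int>(sy));
            }
        }
    }
    return dst;
}
```

Because the maps depend only on the calibration and not on the frame contents, they can be computed once and reused for every frame of the video stream.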

An important note about computing the distortion coefficients is that the chessboard must be shown to the camera in as many orientations as possible for accurate results, which is why I made a moving chessboard in Gazebo.

Projection

The next step in the image processing pipeline is projecting the image frames into a top-down view. This means transforming the perspective of the cameras from a side view into a top-down view. OpenCV provides the functions cv::getPerspectiveTransform() and cv::warpPerspective() for this specific task.

The cv::getPerspectiveTransform() function takes in 4 source points and 4 destination points to calculate the 3x3 matrix that describes the intended perspective transform. The 3x3 matrix can then be fed into the cv::warpPerspective() function to produce the image frame with transformed perspective.

Back camera view

Left camera view

Right camera view

Combination

With the projection matrices calculated, the final step is to combine all projected images into a single bird's eye view. This is a simple process: create a blank image large enough to fit all the projected images, then draw each projected image onto it.
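The combination step can be sketched as pasting each projected image onto a blank canvas at a fixed offset. The stdlib-only example below uses hypothetical names (GrayImage, pasteOnto) and makes one assumption that is not from our actual code: pixels with value 0 are treated as empty, so the black border left by a perspective warp does not overwrite a view that was drawn earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal grayscale image: row-major, one byte per pixel.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> data;

    GrayImage(int w, int h, std::uint8_t fill = 0)
        : width(w), height(h), data(static_cast<std::size_t>(w) * h, fill) {}

    std::uint8_t& at(int x, int y) { return data[static_cast<std::size_t>(y) * width + x]; }
    std::uint8_t at(int x, int y) const { return data[static_cast<std::size_t>(y) * width + x]; }
};

// Draw src onto canvas with its top-left corner at (offsetX, offsetY).
// Assumption for this sketch: a source pixel of 0 means "no data" and is
// skipped, so earlier views show through the empty parts of later ones.
void pasteOnto(GrayImage& canvas, const GrayImage& src, int offsetX, int offsetY) {
    assert(offsetX >= 0 && offsetY >= 0);
    assert(offsetX + src.width <= canvas.width);
    assert(offsetY + src.height <= canvas.height);
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            const std::uint8_t value = src.at(x, y);
            if (value != 0) {
                canvas.at(offsetX + x, offsetY + y) = value;
            }
        }
    }
}
```

In a three-camera layout, the canvas would be created once at the size of the final view, and the left, right, and back projections would each be pasted at the offset matching their side of the wheelchair.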
