As we all know, mobility is very difficult for the blind and the visually impaired. For example, when they want to go out to eat, drink, exercise, or get a massage, besides determining the location of the destination, choosing the lane to walk in is difficult, arduous, even impossible, not to mention that they cannot see obstacles (motorcycles, parked cars, trash cans, light poles, …) to avoid collisions. Even with a smart cane there are still many limitations: for example, they have to constantly probe the road without knowing what is in front of them, which can put them in danger.
Stemming from that practical need, our team proposed and built a product to help the blind and the visually impaired live more like sighted people: they can move comfortably to many places without fear of collision or danger, and they also know what is in front of them! That is why we named it "Virtual eyes for the blind".
The blind or visually impaired user says the desired destination (or presses one of the preset buttons on the glasses) => The system recognizes the voice => Finds the nearest matching location based on GPS and the Google Maps API (making sure the location actually exists and is not a virtual one) => After that, it finds the shortest route (based on the A* algorithm) => Voice guidance helps the user walk close to the roadside; in addition, if there is an obstacle ahead, the system recognizes it and gives a warning. The product combines artificial intelligence (AI) for obstacle detection, LIDAR to calculate the distance to obstacles, and 8D sound. It is built into a pair of glasses, so it is small, light, and easy to use.
1. Voice recognition
The system recognizes the voice of the blind (or visually impaired) user and transmits the data to the app for processing => from there, the next action is performed, for example guiding them to the gym. However, as you know, speech recognition is a very difficult topic that requires a large amount of data, because each person's voice sounds different, not to mention when they are sick or hoarse. This voice input can also be replaced with a number of buttons mapped to preset functions on the glasses.
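To make the idea concrete, here is a minimal sketch of the step that comes after speech-to-text: mapping the recognized transcript to one of the destinations the glasses support. The speech-to-text step itself would come from a cloud API or an on-device model; the destination list and function names below are hypothetical, just for illustration.

```python
# Hypothetical list of destinations the glasses would support.
DESTINATIONS = {"gym", "pharmacy", "supermarket", "bus stop"}

def extract_destination(transcript):
    """Return the first known destination mentioned in the transcript."""
    text = transcript.lower()
    # Check longer names first so "bus stop" wins over a shorter match.
    for place in sorted(DESTINATIONS, key=len, reverse=True):
        if place in text:
            return place
    return None  # fall back to the hardware buttons on the glasses

print(extract_destination("Please show me the way to the gym"))  # gym
```

If no destination is recognized, the system can ask the user to repeat or fall back to the preset buttons mentioned above.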
2. Google map, GPS navigation
We can use the Google Maps API and GPS to locate the user, then look up the coordinates of the place they need to go => find the shortest route there while ensuring safety (the A* algorithm can be used).
Link algorithm: https://quangnle.com/thuat-toan-a-thuat-toan-tim-duong/
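For readers who do not want to follow the link, here is a small self-contained sketch of A* on a toy grid (0 = walkable, 1 = blocked), using Manhattan distance as the heuristic. A real deployment would run on a road graph from the map data, not a grid.

```python
import heapq

def a_star(grid, start, goal):
    """A* shortest path on a 2D grid; returns the list of cells, or None."""
    rows, cols = len(grid), len(grid[0])

    def h(p):  # admissible heuristic: Manhattan distance to the goal
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_set = [(h(start), 0, start, [start])]  # (f = g + h, g, node, path)
    seen = set()
    while open_set:
        _, g, node, path = heapq.heappop(open_set)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                heapq.heappush(open_set, (g + 1 + h((nr, nc)), g + 1,
                                          (nr, nc), path + [(nr, nc)]))
    return None  # no route found

grid = [[0, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
path = a_star(grid, (0, 0), (0, 2))
print(len(path) - 1)  # number of steps around the blocked column: 6
```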
In addition to guiding them on the way, we also add a function to contact loved ones in case of an emergency.
3. Object detection, distance calculation, and collision alerts
a. Detect the curb
After finding the shortest path to the destination => we use OpenCV (specifically the Hough transform) to detect the curbs.
Link algorithm: https://towardsdatascience.com/lines-detection-with-hough-transform-84020b3b1549
=> Then the system gives warnings such as "go left" or "take a few steps to the right" if the blind (or visually impaired) user is walking in the middle of the road.
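In practice you would call `cv2.HoughLinesP` on an edge image, but the voting idea behind it fits in a few lines. The toy sketch below builds a tiny Hough accumulator by hand: every edge pixel votes for every candidate line (rho, theta) passing through it, and the accumulator peak is the dominant line (here, a vertical "curb" at x = 3).

```python
import numpy as np

def hough_peak(edge_points, thetas):
    """Return (rho, theta) of the strongest line among the given angles."""
    votes = {}
    for x, y in edge_points:
        for theta in thetas:
            # Line in normal form: rho = x*cos(theta) + y*sin(theta)
            rho = round(x * np.cos(theta) + y * np.sin(theta))
            votes[(rho, theta)] = votes.get((rho, theta), 0) + 1
    return max(votes, key=votes.get)

# Edge pixels of a vertical "curb" at x = 3 (theta = 0 gives rho = x).
points = [(3, y) for y in range(10)]
rho, theta = hough_peak(points, thetas=[0.0, np.pi / 4, np.pi / 2])
print(rho, theta)  # 3 0.0
```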
b. Object detection and distance calculation
- Object detection:
We will use YOLOv4 (You Only Look Once) to identify objects and obstacles such as parked cars, pedestrians, light poles, and trash cans. If an object class is missing, we have to collect more data and retrain. Why don't we use other models such as EfficientDet or RetinaNet, which have higher accuracy? Because besides accurate detection we also have to run in real time. I will not go into the model architecture; you can refer to the link here:
Link code: https://github.com/Tianxiaomo/pytorch-YOLOv4
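Whatever detector you pick, the raw output needs post-processing before it is usable: discard low-confidence boxes, then apply non-maximum suppression (NMS) so one obstacle does not trigger several alerts. Here is a minimal sketch of that step (the box format `(x1, y1, x2, y2, score)` and the thresholds are assumptions, not values from the YOLOv4 repo above):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2, score) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda t: (t[2] - t[0]) * (t[3] - t[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, score_thr=0.5, iou_thr=0.45):
    """Keep confident boxes, then greedily drop near-duplicates."""
    boxes = sorted((b for b in boxes if b[4] >= score_thr),
                   key=lambda b: b[4], reverse=True)
    kept = []
    for b in boxes:
        if all(iou(b, k) < iou_thr for k in kept):
            kept.append(b)
    return kept

dets = [(10, 10, 50, 50, 0.9),     # strong detection
        (12, 12, 48, 48, 0.8),     # near-duplicate of the first
        (100, 100, 140, 140, 0.7), # a second, separate obstacle
        (0, 0, 5, 5, 0.2)]         # below the confidence threshold
print(len(nms(dets)))  # 2
```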
- LIDAR-Camera Transformation
After using YOLOv4 to define the bounding boxes of obstructing objects, we use LIDAR to calculate the distance to those objects. LIDAR works on the echo principle, much like a bat's echolocation (a bat emits sound toward a wall and listens for the echo to sense the terrain => it can fly fast without ever hitting a wall), except that LIDAR sends out laser light pulses instead of sound. However, object detection is processed on 2D images while LIDAR returns 3D points, so we have to project the 3D points onto the 2D image.
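As a sketch of that projection, here is the pinhole camera model that maps a 3D point into pixel coordinates so it can be matched against a YOLO bounding box. The intrinsic matrix `K` below is a made-up example; the real one comes from camera calibration, and a rigid transform (R, t) would first move each point from the LIDAR frame into the camera frame (identity assumed here for brevity).

```python
import numpy as np

K = np.array([[500.0,   0.0, 320.0],   # fx,  0, cx
              [  0.0, 500.0, 240.0],   #  0, fy, cy
              [  0.0,   0.0,   1.0]])

def project(point_3d):
    """Project a camera-frame 3D point (X, Y, Z) to a pixel (u, v)."""
    uvw = K @ np.asarray(point_3d, dtype=float)
    return uvw[0] / uvw[2], uvw[1] / uvw[2]  # divide by depth Z

u, v = project([0.0, 0.0, 5.0])   # a point 5 m straight ahead
print(u, v)  # lands at the image centre: 320.0 240.0
```

Once a LIDAR point projects inside an object's bounding box, its range reading gives the distance to that object.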
- Object Locator
From the bounding box of an object we determine the center coordinates with the formula: x_c = x + w/2, y_c = y + h/2, where (x, y) is the top-left corner and w, h are the box's width and height.
- Kalman Filter
The Kalman filter is an algorithm that estimates unknown variables over time from a series of noisy measurements, combining a prediction model with each new measurement weighted by their statistical noise.
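Here is a minimal 1D sketch of that idea: smoothing a noisy distance reading (for example the LIDAR range to one obstacle) under a constant-value model. The process and measurement variances `q` and `r` are illustrative values, not tuned for real hardware.

```python
def kalman_1d(measurements, q=1e-3, r=0.1):
    """Filter a sequence of noisy scalar readings; returns the estimates."""
    x, p = measurements[0], 1.0        # initialise at the first reading
    estimates = []
    for z in measurements:
        p = p + q                      # predict: uncertainty grows
        k = p / (p + r)                # Kalman gain
        x = x + k * (z - x)            # update with the new measurement
        p = (1 - k) * p                # uncertainty shrinks after update
        estimates.append(x)
    return estimates

# Noisy readings around a true distance of 4.0 m.
est = kalman_1d([4.2, 3.9, 4.1, 3.8, 4.0])
print(round(est[-1], 2))
```

The filtered distance jitters far less than the raw readings, which matters because the warning sounds should not flicker on and off with every noisy sample.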
- Collision Detection and Alert
After identifying an object and calculating the distance to it, the system uses that distance to decide whether to output a danger warning sound.
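The alert logic can be as simple as mapping the estimated distance to a warning level spoken through the headphones. The thresholds (in metres) below are illustrative assumptions, not values from a real device:

```python
def alert_level(distance_m):
    """Map an obstacle distance (metres) to a spoken warning level."""
    if distance_m < 1.0:
        return "danger: stop"
    if distance_m < 3.0:
        return "warning: obstacle ahead"
    return "clear"

for d in (0.5, 2.0, 10.0):
    print(d, "->", alert_level(d))
```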
c. Collision warning
The system warns users through headphones with 8D sound => for a realistic, lively sense of where the obstacle is.
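A toy sketch of the directional part of that effect: constant-power stereo panning, where an obstacle on the far left sends all the warning-sound energy to the left ear. Real spatial ("8D") audio would use HRTFs and movement over time; this only shows the basic level-difference idea.

```python
import math

def pan_gains(angle_deg):
    """Left/right gains for an obstacle at angle_deg (-90 left .. +90 right)."""
    theta = (angle_deg + 90) / 180 * (math.pi / 2)  # map to 0 .. pi/2
    return math.cos(theta), math.sin(theta)          # (left, right)

left, right = pan_gains(-90)   # obstacle fully on the left
print(round(left, 2), round(right, 2))  # 1.0 0.0
```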
Link code reference:
- https://repositorio.comillas.edu/xmlui/bitstream/handle/11531/24628/TFG - Rioja García, Roberto.pdf?sequence=1
I have presented the ideas, techniques, and process for building the product "Virtual eyes for the blind". In this article I will not present the code; if you need it, please leave a comment below. In terms of cost, this project is very expensive (the cheapest LIDAR costs 1 million, plus the glasses, a server, …). Besides this project, you could also build one that helps the blind and the visually impaired catch a bus. Taking the bus is difficult for blind people because they cannot tell which bus is the one to take. You can use object detection to recognize the bus number, or OCR to identify the route => then play a sound telling the blind or visually impaired user which bus they need to take.
That's all from me! If you find it interesting, comment so that I have the motivation to write more articles.