Recent developments in training neural networks (deep learning), together with training platforms like Google’s TensorFlow and hardware acceleration toolkits from Intel (OpenVINO), Nvidia (TensorRT) and others, have empowered developers to train complex neural networks and optimize them to run on small edge devices such as smartphones and single-board computers. This has led to a profusion of initiatives to use such trained models in the domain of Health and Safety (HSE) in the workplace. Here is a summary of the initiatives that Optisol Datalabs has been working on over the past year to improve safety and avoid accidents.
Computer Vision Process Flow
A computer vision analytics model typically follows this process flow. The first step is to generate or collect training data. The training data can be generic images collected from Google searches or proprietary data collected on industrial premises. The training data is then labelled. This involves manually going through the images one by one and marking the objects to be detected (humans, trucks or other classes) with labels. The labelled dataset is then converted into the format that the computer vision model needs for training.
Training typically happens on a cloud server, such as those offered by Amazon Web Services (AWS). This is because training requires a lot of processing power from servers equipped with Graphics Processing Units (GPUs), which are very expensive to purchase but cheaper to rent. Using a powerful cloud server reduces the training time to hours instead of days. The training is monitored using a dashboard called TensorBoard, which provides real-time visualization of the model training parameters. The trained model is then downloaded to a local machine.
The next step is to optimize the trained model to run on specialized hardware. This hardware could be an Intel, Nvidia or Google Edge TPU processor that runs on a small single-board computer. The optimization is done using the packages provided by the corresponding hardware vendor. The optimized model can then be loaded onto an edge processor and deployed in hand-held, portable, low-power units that run the model and perform inference in real time. The output of the model can be used to send notifications or drive other processes.
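As an illustration of the format conversion step, here is a minimal sketch in Python. It assumes the labels were drawn as pixel-coordinate boxes and that the target is a YOLO-style text format with normalized center coordinates. Both are assumptions made for this example, since the required format depends on the model being trained, and the function name `to_yolo_line` is hypothetical.

```python
def to_yolo_line(class_id, box, image_width, image_height):
    """Convert a pixel box (xmin, ymin, xmax, ymax) into one YOLO-style label line.

    The output format is: class_id x_center y_center width height,
    with all four geometry values normalized to the range 0..1.
    """
    xmin, ymin, xmax, ymax = box
    # Reject empty boxes and boxes that fall outside the image.
    if not (0 <= xmin < xmax <= image_width and 0 <= ymin < ymax <= image_height):
        raise ValueError(f"box {box} is empty or lies outside the image")
    x_center = (xmin + xmax) / 2 / image_width
    y_center = (ymin + ymax) / 2 / image_height
    width = (xmax - xmin) / image_width
    height = (ymax - ymin) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Example: one labelled object (class 0) in a 1000x800 image.
print(to_yolo_line(0, (100, 200, 300, 600), 1000, 800))
```

Other model families expect different formats (for example, record files or XML annotations), so the same idea would be adapted to whatever the chosen training platform requires.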
The following are some of the initiatives in which we are training computer vision models to improve health and safety in the workplace.
1. PPE (Personal Protective Equipment) Compliance
Poor compliance with Personal Protective Equipment requirements is a big problem in industry. Employees don’t wear PPE either due to a lack of knowledge about safety or due to the perceived inconvenience of wearing it. This is a major concern for employers and Health and Safety managers. We have assembled a detailed dataset of workers wearing different PPE and have trained models that can detect whether or not an employee is wearing the required PPE. We have built a small edge device with an integrated camera that can be mounted on doors. It checks whether the person is wearing the required PPE and, based on that, unlocks the door for the person to enter the industrial premises. We are also working on integrating this PPE compliance model into a drone so that we get wider coverage than a fixed-position camera. The drone can monitor and report PPE violations as it flies around the industrial premises.
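The door-unlock decision described above can be sketched in a few lines of Python. This is a simplified illustration, not the deployed logic: the label names, the confidence threshold and the function names are all hypothetical, and the sketch assumes the detector reports `(label, confidence)` pairs for a single person in the frame.

```python
# Hypothetical PPE classes required at this entrance.
REQUIRED_PPE = {"helmet", "vest", "goggles"}
# Hypothetical threshold below which a detection is ignored.
MIN_CONFIDENCE = 0.6


def missing_ppe(detections, required=REQUIRED_PPE, min_confidence=MIN_CONFIDENCE):
    """Return the set of required PPE items that were not confidently detected.

    detections: iterable of (label, confidence) pairs from the detection model.
    """
    seen = {label for label, confidence in detections if confidence >= min_confidence}
    return set(required) - seen


def should_unlock(detections, required=REQUIRED_PPE, min_confidence=MIN_CONFIDENCE):
    """Unlock only when every required item is present."""
    return not missing_ppe(detections, required, min_confidence)


# Example: the vest is detected with too little confidence, goggles not at all.
frame = [("helmet", 0.92), ("vest", 0.35)]
print(should_unlock(frame), sorted(missing_ppe(frame)))
```

In practice the decision would likely be smoothed over several frames so that a single missed detection does not lock out a compliant worker.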
2. Monitoring of Hazardous Chemicals Storage and Transport
Hazardous chemical storage and transport is another scenario where we are training computer vision-based models to detect and report violations of the prescribed safety procedures. One of our clients came to us with a scenario in which trucks carrying sulfuric acid arrive at an unloading bay and the truck driver, after putting on all the necessary PPE, operates valves and hoses in sequence to unload the acid into storage containers. The environment is very harsh and corrosive, and the truck drivers need to be very careful and diligent in following the sequence of procedures to unload the cargo. We are building a computer vision analytics model that is trained to detect the back of the truck as it backs into the unloading bay, detect whether the driver is wearing the necessary PPE, and then monitor the sequence of valve openings and hose connections to validate that the entire unloading process is done according to the specifications. Any violations or deviations are identified, captured and reported for remedial measures. This monitoring is done automatically and unobtrusively, so that it doesn’t disturb the normal operations of the plant.
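Once the vision model has turned the video into a stream of detected events, checking the unloading sequence is an ordering problem. The following Python sketch shows one way this could be done. The step names and the expected order are hypothetical placeholders, not the client's actual procedure.

```python
# Hypothetical procedure: each step must occur once, in this order.
EXPECTED_STEPS = [
    "truck_docked",
    "ppe_verified",
    "hose_connected",
    "valve_opened",
    "valve_closed",
    "hose_disconnected",
]


def find_deviations(observed, expected=EXPECTED_STEPS):
    """Compare observed events against the expected order and list deviations."""
    deviations = []
    next_index = 0  # position of the step we expect to see next
    for event in observed:
        if event not in expected:
            deviations.append(f"unexpected event: {event}")
            continue
        position = expected.index(event)
        if position == next_index:
            next_index += 1
        elif position < next_index:
            deviations.append(f"late or repeated step: {event}")
        else:
            # One or more steps were skipped before this event.
            skipped = ", ".join(expected[next_index:position])
            deviations.append(f"{event} occurred before {skipped}")
            next_index = position + 1
    if next_index < len(expected):
        deviations.append("missing steps: " + ", ".join(expected[next_index:]))
    return deviations


# Example: the valve is opened before the hose is connected.
events = ["truck_docked", "ppe_verified", "valve_opened",
          "hose_connected", "valve_closed", "hose_disconnected"]
for deviation in find_deviations(events):
    print(deviation)
```

Each reported deviation could then be attached to the corresponding video frames and sent as a notification.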
3. Activity Recognition
Activity recognition is more complex than detecting PPE, trucks and the other objects mentioned in the two initiatives above. Activity recognition models need a sequence of frames (images) to determine what activity is being performed. We use pre-trained pose estimators like OpenPose, PoseNet and TF Pose Estimator to build 2D or 3D pose data for the humans in each frame. From this we train a machine learning classifier that takes a sequence of pose data as input and learns to classify which activity is being performed. Such activity recognition models can be used in the following ways:
- Sign Language Interpretation:
Activity recognition models can be trained as translators that understand and respond to sign language users by interpreting their gestures.
- Ergonomics Assessment:
Ergonomics assessment to prevent repetitive strain injuries is another area that is ripe for automation using computer vision-based activity recognition models. There are published protocols for assessing the injury risk due to bad posture and ergonomics. These protocols are based on measuring shoulder, elbow, wrist, neck and trunk angles to see how much rotation and extension are exerted on the body while performing repetitive tasks in the workplace. These exertions are converted into a risk score, and if the score is high, the necessary changes are made to the workstation and tasks to reduce it. The whole process is automated by calculating the body angles and injury risk scores from the pose data generated by the model.
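The body angles used in the ergonomics assessment can be derived from pose keypoints with basic vector geometry. The Python sketch below computes the angle at a joint from three 2D keypoints, for example the elbow angle from the shoulder, elbow and wrist. The function name and the example coordinates are hypothetical, and converting angles into a risk score would follow whichever published protocol is being applied, which is not shown here.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at point b between segments b->a and b->c.

    a, b, c are (x, y) keypoints, e.g. shoulder, elbow and wrist.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("keypoints must not coincide with the joint")
    cosine = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against floating-point values slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, cosine))
    return math.degrees(math.acos(cosine))


# Example with made-up pixel coordinates: upper arm vertical, forearm horizontal.
shoulder, elbow, wrist = (0, 0), (0, 1), (1, 1)
print(round(joint_angle(shoulder, elbow, wrist), 1))
```

Applying the same function to other keypoint triples (such as hip, shoulder and elbow) gives the remaining angles frame by frame.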