2021-05-20 | Omair Sarwar, PHD | Academic and professional communities around the world are researching different aspects of Artificial Intelligence (AI) and sharing their results (i.e., AI models), highlighting improved accuracies evaluated using public or private datasets. However, deploying such AI models to real-time safety-critical systems often faces a large set of challenges.
Real-time means that a late response of the system is considered a wrong response, and the required response time is short for most of our use cases. Safety-critical means that a failed system response would cause severe injuries (or even death) to human beings or huge damage to infrastructure. The main reason for the deployment challenges is that academic datasets are usually collected under controlled environments and do not cover all the scenarios that an AI model could encounter in real life. Moreover, such AI models usually have access to huge computational resources and are not subject to any real-time or safety-critical constraints.
Deploying AI models in real-time and safety-critical use cases is a challenging task. The main objective is to find an AI model with an optimal trade-off between accuracy, latency, and memory footprint, so that the optimized AI model can run on a standalone embedded system. We call this process AI model optimization for real-time safety-critical embedded systems. In this blog, we discuss different strategies for AI model optimization and highlight ME's expertise in this domain.
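The trade-off described above can be made concrete with a small sketch. It assumes that accuracy, latency, and memory footprint have already been measured for each candidate model on the target device; the `Candidate` class, the `select_model` helper, and all numbers are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Measured properties of one candidate model (all values hypothetical)."""
    name: str
    accuracy: float    # fraction of correct predictions, 0..1
    latency_ms: float  # inference time measured on the target device
    memory_mb: float   # memory footprint on the target device


def select_model(candidates: Sequence[Candidate],
                 max_latency_ms: float,
                 max_memory_mb: float) -> Optional[Candidate]:
    """Return the most accurate candidate that meets both budgets, or None."""
    feasible = [c for c in candidates
                if c.latency_ms <= max_latency_ms
                and c.memory_mb <= max_memory_mb]
    if not feasible:
        return None
    # Prefer higher accuracy; break ties in favour of lower latency.
    return max(feasible, key=lambda c: (c.accuracy, -c.latency_ms))


candidates = [
    Candidate("large", accuracy=0.95, latency_ms=80.0, memory_mb=200.0),
    Candidate("medium", accuracy=0.93, latency_ms=30.0, memory_mb=60.0),
    Candidate("small", accuracy=0.88, latency_ms=10.0, memory_mb=15.0),
]

best = select_model(candidates, max_latency_ms=40.0, max_memory_mb=64.0)
print(best.name if best else "no candidate meets the budgets")
```

Treating latency and memory as hard budgets and accuracy as the quantity to maximize is one possible reading of "optimal trade-off"; a real project may weight the three criteria differently.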
There are different challenges for a real-time safety-critical AI system, which need to be addressed during design, development and testing phases of an AI project:
Optimization of AI models is essential for real-time safety-critical systems because an optimized AI model will not only take less time for inference, but also occupy less memory with negligible (or no) effect on the accuracy.
There are two strategies to develop an optimized AI model for a given use case: (1) optimization-aware training and (2) post-training optimization.
The optimization-aware-training technique searches for an optimal model architecture, i.e., the AI model's width, depth, and number of channels in each layer, during the training process. Moreover, these strategies can also be constrained to finding the optimal weight quantization of the AI model.
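To give a feeling for the search space, the following sketch enumerates depth and width combinations of a plain fully connected network and keeps those whose parameter count fits a budget. It is not the training-time search itself: it only illustrates how width and depth drive model size. The function names and the parameter budget are hypothetical.

```python
from itertools import product
from typing import Iterable, List, Tuple


def mlp_param_count(n_inputs: int, n_outputs: int, depth: int, width: int) -> int:
    """Weights plus biases of an MLP with `depth` hidden layers of `width` units."""
    sizes = [n_inputs] + [width] * depth + [n_outputs]
    # Each fully connected layer has (inputs * outputs) weights and one bias per output.
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


def feasible_architectures(n_inputs: int, n_outputs: int,
                           depths: Iterable[int], widths: Iterable[int],
                           max_params: int) -> List[Tuple[int, int, int]]:
    """List (depth, width, parameter count) for all configurations within the budget."""
    result = []
    for depth, width in product(depths, widths):
        n_params = mlp_param_count(n_inputs, n_outputs, depth, width)
        if n_params <= max_params:
            result.append((depth, width, n_params))
    # Largest models first: these are the candidates worth training and evaluating.
    return sorted(result, key=lambda item: item[2], reverse=True)


options = feasible_architectures(n_inputs=16, n_outputs=4,
                                 depths=[1, 2], widths=[8, 16],
                                 max_params=300)
for depth, width, n_params in options:
    print(f"depth={depth} width={width} params={n_params}")
```

An actual optimization-aware-training method would additionally train and evaluate the candidates (or learn the architecture jointly with the weights) rather than only counting parameters.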
The post-training optimization strategies usually take an off-the-shelf AI model and first apply transfer learning to train it for the given use case, irrespective of whether the model is complex or not. After training, the AI model is optimized using different techniques, e.g., weight quantization, weight clustering, channel pruning, network scaling, and shunt connections. As an example, please see Figure 1 for post-training optimization.
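As an illustration of one of the listed techniques, the sketch below applies weight quantization to an already trained set of weights. It assumes symmetric per-tensor quantization to 8-bit integers, which is only one of several possible schemes and not necessarily the one used in our projects; the helper names and the weight values are made up.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of float weights to the int8 range."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale for the whole tensor; fall back to 1.0 if all weights are zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map int8 values back to approximate float weights."""
    return [q * scale for q in quantized]


weights = [0.50, -1.27, 0.003, 0.90]  # made-up trained weights
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, max_error <= scale / 2 + 1e-12)
```

Stored as 8-bit integers, the weights need a quarter of the memory of 32-bit floats, at the cost of the small rounding error measured above. Whether that error is acceptable has to be verified against the accuracy requirements of the use case.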
Optimization-aware-training techniques are complex and require a high level of expertise, while post-training optimization procedures are comparatively simple but may not give optimal results for the given use case. Once the AI model is optimized, it is deployed in the field and tested against the required acceptance criteria. AI model optimization is usually an iterative process, as it requires fine-tuning the hyperparameters for the given application.
ME has successfully completed several projects involving AI for real-time safety-critical applications and has filed several patents.
ME areas of expertise:
Most importantly, ME has expertise in developing optimized AI models, employing state-of-the-art techniques including both optimization-aware training and post-training optimization. For example, in one of our previous projects, we optimized AI models to make them 2-4 times faster without significant loss of accuracy and achieved the required system latency.
ME can develop customized AI models that are optimized and fine-tuned for the given use case.
Get in touch with us to discuss your project idea!