
Nota (486990.KQ) said Wednesday it has successfully optimized the vision-language-action (VLA) model "SmolVLA 0.45B" to run on Qualcomm's latest edge artificial intelligence (AI) device, the "Dragonwing IQ-9075."
VLA models are computationally intensive because they must simultaneously handle image recognition, language understanding and action generation. They consist of multiple stages: understanding scenes captured by a camera, interpreting human commands and then generating robot movements.
Nota left the front-end recognition and understanding stages unchanged and focused its optimization on the final stage, which generates the actual robot movements. To do so, the company applied real-time inference optimization that reduces repetitive calculations in the action-generation stage, along with neural processing unit (NPU)-based graph optimization that streamlines the flow of operations to suit the execution environment of Qualcomm's edge AI device.
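Nota has not disclosed how it removes the repeated calculations. Purely as an illustration of the general idea, the Python sketch below assumes an action head that refines its output over several iterations (an assumption, not something the company stated) and compares recomputing the scene-and-command context at every step with computing it once and reusing it. All function names and the toy arithmetic are hypothetical.

```python
# Toy illustration only: every function here is a hypothetical stand-in,
# not Nota's or SmolVLA's actual code.

def encode_observation(image, instruction):
    """Stand-in for the costly scene and command understanding stages."""
    return [float(len(image)), float(len(instruction))]


def refine_action(action, context):
    """Stand-in for one refinement step of the action head."""
    return [a + 0.1 * (c - a) for a, c in zip(action, context)]


def generate_action_naive(image, instruction, steps):
    """Recompute the context on every refinement step."""
    action, encode_calls = [0.0, 0.0], 0
    for _ in range(steps):
        context = encode_observation(image, instruction)  # repeated work
        encode_calls += 1
        action = refine_action(action, context)
    return action, encode_calls


def generate_action_cached(image, instruction, steps):
    """Compute the context once and reuse it across all steps."""
    context = encode_observation(image, instruction)  # done a single time
    encode_calls = 1
    action = [0.0, 0.0]
    for _ in range(steps):
        action = refine_action(action, context)
    return action, encode_calls


naive_action, naive_calls = generate_action_naive("frame", "pick up the cup", 10)
cached_action, cached_calls = generate_action_cached("frame", "pick up the cup", 10)
print(naive_calls, cached_calls)      # 10 1
print(naive_action == cached_action)  # True
```

Both versions produce the same action, but the cached version runs the expensive encoding step once instead of once per iteration, which is the kind of saving the description above points to.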
As a result, the processing time of the action head, the stage that generates robot movements, fell from 218 milliseconds (ms) to 31 ms, a decrease of approximately 85.8%, or a roughly sevenfold speedup. Total inference time was also shortened from 505 ms to 310 ms, a reduction of about 39%. The task success rate remained at a similar level, edging down to 85% from 86%.
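The reported figures can be checked with a few lines of Python. This is a minimal sketch that uses only the latencies quoted above. The helper names are illustrative, and the percentage for total inference time is derived here rather than taken from Nota's announcement.

```python
def percent_reduction(before_ms, after_ms):
    """Share of the original latency that was eliminated, in percent."""
    return (before_ms - after_ms) / before_ms * 100


def speedup(before_ms, after_ms):
    """How many times faster the optimized stage runs."""
    return before_ms / after_ms


# Action head: 218 ms -> 31 ms
print(round(percent_reduction(218, 31), 1))   # 85.8
print(round(speedup(218, 31), 2))             # 7.03

# Total inference: 505 ms -> 310 ms
print(round(percent_reduction(505, 310), 1))  # 38.6
print(round(speedup(505, 310), 2))            # 1.63
```

The action head saves 187 ms while total inference time falls by 195 ms, so nearly all of the end-to-end gain comes from the action-generation stage.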
"For physical AI to spread to industrial sites, AI must be able to quickly and reliably handle the process of seeing, understanding and translating real environments into actions on edge AI devices," Nota CEO Chae Myung-soo said. "This VLA optimization achievement is a meaningful case that shows Nota's AI optimization technology can be extended as a core foundational technology for the physical AI era."







