Twelve Labs CEO Eyes Physical AI Dominance Through Video Understanding Technology

By Kim Ye-sol
"Analyzing millions of hours of video in one second... Will seize Physical AI dominance through video understanding" - Seoul Economic Daily Finance News from South Korea

"Using Twelve Labs technology, we can analyze millions of hours of video in an instant. I believe the utility of video understanding technology will increase significantly in the era of physical artificial intelligence."

Jae-sung Lee, CEO of Twelve Labs, made these remarks in an interview with Seoul Economic Daily on the 19th at the company's office in Yongsan-gu, Seoul. Twelve Labs is a video understanding AI company whose models, built on technology that analyzes and comprehends video data, can be applied across a range of industries.

Lee co-founded Twelve Labs in Silicon Valley in 2021 with four colleagues. "While the AI industry was advancing rapidly in text and images, video technology lagged behind," he explained. "We determined that we could widen the technology gap in the video understanding AI market." Twelve Labs has secured more than $107 million in cumulative funding, including investment from Nvidia, on the strength of its video understanding AI capabilities.

The company's flagship products are "Marengo," which analyzes the scenes and sounds in a video to make its contents searchable, and "Pegasus," which converts the scenes analyzed by Marengo into text. Used together, the two models can analyze millions of hours of video in one second to find the scenes users want. "We designed the two models from the outset to complement each other," Lee said. "We plan to launch a video agent combining Marengo and Pegasus in the second quarter of this year."

Lee is particularly focused on the rapidly growing physical AI market. "The Marengo and Pegasus models can be connected to physical AI that performs actual actions," he emphasized. "We are preparing a roadmap to develop a structure that understands video and then trains physical AI to take corresponding actions." His assessment is that companies with video data analysis technology, such as Twelve Labs, will gain a competitive advantage as the physical AI market expands.

Twelve Labs' technology is currently used mainly in the entertainment and sports sectors, where global media, entertainment, and sports companies with vast video libraries are its main clients. "Companies are actively adopting AI technology because video content such as sports games and dramas often has a direct impact on revenue," Lee said.

The company also sees potential for expansion into the defense and security sectors. Video data is surging as drones are deployed on battlefields, among other developments, but the technology to analyze it remains limited. "Our goal is to advance our technology to the point where it can monitor video in real time," he said. "We plan to expand our business along two axes: commercial sectors centered on media, and defense and security."

AI-translated from Korean. Quotes from foreign sources are based on Korean-language reports and may not reflect exact original wording.