Staqu’s Jarvis processes data from thousands of devices parallelly – Times of India
There are over a billion CCTV cameras across the globe and many billions of other devices that sense our world. These can now serve as the eyes and ears of AI systems, like the one created by Gurgaon-based Staqu Technologies.
The company has unveiled the latest avatar of its AI platform Jarvis – a voice assistance system called Jarvis Help. Jarvis can be connected to any smart CCTV network or a device like a TV. It can then look, hear, interpret, and act with the intelligence of a 10-year-old. For example, if it detects any sign of distress, by analysing visual or audio cues, it can immediately notify authorities. In fact, you only have to wave at it to get its attention, says Staqu’s CEO and cofounder Atul Rai. It can also make logical inferences from situations – for example, the sound of breaking glass outside a store may be a sign of an ongoing robbery.
Jarvis can take inputs from drones, which could be handy in rescue operations. It can even do mundane tasks like alerting a user if a particular personality appears on TV.
Jarvis’s audio capabilities are what make the platform unique. “We have the world’s most accurate audio setup – an accuracy level of 98.7%, backed by the International Conference on Acoustics, Speech, and Signal Processing (ICASSP) and VoxCeleb (an audio-visual dataset consisting of short clips of human speech extracted from YouTube interviews),” says Rai.
To build Jarvis, Rai and his team had to first create a video management system that could aggregate data from heterogeneous devices like cameras, TVs and drones. AI models then had to be designed that could interpret this data, like transformer models used in ChatGPT. “All of these AI models run concurrently and talk to each other,” says Rai.
One of the biggest challenges was figuring out how to process data from thousands of devices in parallel. “If there are 30,000 cameras, then each camera will be sending 10 frames per second (FPS). That’s 3 lakh [300,000] frames per second. We had to build a frame processor to do it,” says Rai.
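To put Rai's figures in perspective, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration, not Staqu's frame processor: the function names are hypothetical, and the per-worker throughput of 500 frames per second is an assumed number chosen for the example, not one reported by the company.

```python
import math


def aggregate_fps(num_cameras: int, fps_per_camera: int) -> int:
    """Total frames arriving per second across all cameras."""
    return num_cameras * fps_per_camera


def workers_needed(total_fps: int, frames_per_worker_per_sec: int) -> int:
    """Minimum number of parallel workers to keep up with the incoming load."""
    return math.ceil(total_fps / frames_per_worker_per_sec)


# Figures quoted in the article: 30,000 cameras at 10 FPS each.
total = aggregate_fps(30_000, 10)
print(total)  # 300000, i.e. 3 lakh frames per second

# ASSUMPTION: each worker handles 500 frames per second (illustrative only).
print(workers_needed(total, 500))  # 600
```

Under that assumed throughput, keeping pace with the cameras would take about 600 workers running side by side, which gives a sense of why a purpose-built frame processor was needed.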
The company has patented some of its innovations. “We got a patent for reidentification and one for sketch-based identification. Both are part of our forensic module,” says Rai. Jarvis can reidentify a person, such as a thief, if any footage of the person is available. It can also identify people from sketches made by an artist.
Jarvis’s customers in India and overseas include WeWork, Metro Shoes, Crocs, and Tata Consumer Products. The police in several Indian states are also customers.