ROCKIT is a strategic roadmapping project for research and innovation in the area of natural conversational interaction. The primary scientific focus concerns interactive agents which are proactive, multimodal, social, and autonomous. A second focus concerns systems which can extract and exploit rich context and knowledge from heterogeneous data sources.
The main goal of ROCKIT is the development of a Research and Innovation Roadmap which integrates the vision and innovation agendas of organisations concerned with R&D and exploitation in the field across Europe, with broad coverage across sectors. A key goal is to bring together public sector research organisations with commercial organisations at all scales, with a particular focus on SMEs, which represent the majority of fragmented commercial activity in Europe.
A key aspect of ROCKIT will be to organise a European research and innovation community in the area of conversational interaction technologies, integrating a wide range of commercial organisations with application and use links to the area. ROCKIT will be structured around a set of sector-based clusters including mobile applications, healthcare, education, games, broadcast media, robotics, law enforcement, and security.
Research and Innovation Scenarios
As part of the strategic roadmapping action in the area of multimodal conversational interaction technologies, ROCKIT has arrived at a set of five target research and innovation scenarios, presented here. These scenarios represent a number of common themes arising from the workshops organised during the process: accessibility, multilinguality, the importance of design, privacy by design, systems covering human–human, human–machine, and human–environment interactions, robustness, security, potentially ephemeral interactions, and using the technology to enable fun.
Scientists at the University of East Anglia in Norwich, England, are working on the next stage of automated lip reading technology that could be used for deciphering speech from video surveillance footage. The visual speech recognition technology, created by Dr. Helen Bear and Professor Richard Harvey of UEA’s School of Computing Sciences, can be applied in “any place where the audio isn’t good enough to determine what people are saying,” says Dr. Bear.