OSD/Navy awards IAI a new contract to develop a Discovery and Information Retrieval System from Distributed Multi-INT Data Sources in a Cloud Environment

Real-time search of large distributed data stores would allow a Warfighter’s tactical question to be answered quickly. Data-intensive processing at this scale currently exceeds the capacity of a single machine and requires clusters. Handling massive data problems therefore means organizing computations across dozens, hundreds or even thousands of machines, which MapReduce makes possible. To address the challenges of using MapReduce to bring information to the Warfighter, IAI has been awarded a contract entitled “Discovery and Information Retrieval from Distributed Multi-INT Data Sources in a Cloud Environment.” IAI proposes a multi-layer intelligent workflow that begins when a Warfighter submits a question via a PDA, where it is combined with metadata. The question then invokes other services, implemented as MapReduce processes in a Hadoop-based ecosystem. Two components, the Translator and IAI’s SensorCube, translate the query and metadata into MapReduce jobs. The Translator uses a dictionary to convert questions into lexical frames, and SensorCube uses ontologies to convert the keywords into native cloud computing queries. Existing distributed data querying and massive data analytics tools such as Hadoop, MapReduce, Hive and Katta are used along with techniques from data management, cloud computing, data mining, data fusion and anomaly detection. The workflow ends back at the Warfighter’s PDA, where ranked, meaningful information is presented after a data fusion process. Improvements to the Hadoop kernel are also proposed, including handling of Map jobs that become bottlenecks, better support for iterative tasks, and tandem operation of parallel databases and Hadoop. The resulting framework focuses on accuracy and speed, can run on heterogeneous platforms, incorporates dynamic and evolving workflows, and leverages open-source software implementations.
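As a rough illustration of the MapReduce pattern the workflow relies on, the following minimal sketch runs a keyword query as a map phase, a shuffle and a reduce phase, then ranks the results. It is plain Python running in a single process, not the Hadoop API and not IAI's Translator or SensorCube; the records, field names and function names (`RECORDS`, `map_record`, `shuffle`, `reduce_scores`, `run_query`) are all hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical records standing in for reports held in distributed stores.
RECORDS = [
    {"id": "r1", "source": "SIGINT", "text": "convoy sighted near bridge"},
    {"id": "r2", "source": "IMINT", "text": "bridge intact, no vehicles visible"},
    {"id": "r3", "source": "HUMINT", "text": "market closed early"},
]


def map_record(record, keywords):
    """Map phase: emit (record id, 1) for each query keyword in the record."""
    tokens = set(record["text"].lower().replace(",", " ").split())
    for keyword in keywords:
        if keyword in tokens:
            yield record["id"], 1


def shuffle(pairs):
    """Shuffle phase: group emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_scores(key, values):
    """Reduce phase: sum the keyword hits for one record."""
    return key, sum(values)


def run_query(records, keywords):
    """Run map, shuffle and reduce in-process, then rank by score."""
    pairs = [pair for record in records for pair in map_record(record, keywords)]
    grouped = shuffle(pairs)
    scored = [reduce_scores(key, values) for key, values in grouped.items()]
    # Highest score first; ties broken by record id for a stable order.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    print(run_query(RECORDS, ["convoy", "bridge"]))
```

On a real cluster the map and reduce functions would run in parallel on many machines, and the framework would perform the shuffle; here the three phases are simply called in sequence to show the shape of the computation.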

