A consortium working on an EU-financed research project is developing a tool designed to reconstruct public events in 3D by drawing on multiple smartphone videos.

Reliving events in 3D with collaborative video

Devices such as Google Glass, which can receive and make sense of visual input, are helping 3D vision gain traction. The SceneNet project, which began last year, is part of this trend. At most music concerts these days you can see a multitude of mobile phones blinking in the crowd as spectators film the show to relive or share it later. Given the growing number of high-quality cameras out there, surely this mass of footage could be put to good use?

This was the thinking that led Chen and Nizan Sagiv to put forward the SceneNet* project, whose goal is to turn all the videos filmed by people at a given event into a 3D reconstruction. The project, which is scheduled to run until January 2016, has received €1.33 million from the European Commission through the FET (Future and Emerging Technologies) programme, part of the European Union's 7th Framework Programme for R&D. The consortium has five partners: the University of Bremen, Steinbeis Innovation and European Research Services, all in Germany; Switzerland's Federal Institute of Technology in Lausanne; and the coordinator, SagivTech, a company based in Ra'anana, Israel, owned by Chen and Nizan Sagiv, which provides image- and signal-processing algorithms and their implementation on parallel computing platforms.

Participatory technology

The first year of the project has seen the consortium develop the mobile infrastructure for the video feeds, a mechanism for tagging them, and their transmission to a cloud server. In parallel, the team has also developed basic tools for a human-computer interface that will let users view the 3D video from any theoretical vantage point 'in the arena' and edit the footage themselves. SceneNet involves several severe technological challenges: on-device pre-processing that demands immense computing power; efficient transmission of the video streams; accurate and fast methods for registration between the video streams; and the 3D reconstruction itself. Moreover, all of these tasks need to run at demanding near real-time rates. Because the technology rests on a participatory initiative, there is also the potential to build online communities that will want to share the content and relive the concert experience together.
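The project's actual registration methods are not described here, but one common first step when aligning independently filmed clips of the same event is temporal registration: estimating the time offset between two recordings, for instance by cross-correlating their audio tracks. The sketch below illustrates that idea in NumPy; the function name and the toy signals are ours, not SceneNet's.

```python
import numpy as np

def estimate_time_offset(audio_a, audio_b, sample_rate):
    """Estimate how many seconds stream B lags behind stream A by
    cross-correlating their (mono) audio tracks."""
    # Normalise both tracks so recording volume does not bias the peak.
    a = (audio_a - audio_a.mean()) / (audio_a.std() + 1e-9)
    b = (audio_b - audio_b.mean()) / (audio_b.std() + 1e-9)
    corr = np.correlate(a, b, mode="full")
    # np.correlate peaks at shift -d when b[n] == a[n - d], so negate.
    return (len(b) - 1 - int(np.argmax(corr))) / sample_rate

# Toy check: stream B is stream A delayed by 0.5 s at a 100 Hz "sample rate".
rate = 100
signal = np.random.default_rng(0).normal(size=1000)  # broadband, so the peak is unambiguous
delay = int(0.5 * rate)
stream_a, stream_b = signal[delay:], signal[:-delay]
print(estimate_time_offset(stream_a, stream_b, rate))  # 0.5
```

Once the streams share a common timeline, spatial registration (recovering each camera's pose for the 3D reconstruction) can proceed on synchronised frames.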

Towards new fields of application

The SceneNet consortium's work, especially the accelerated computer-vision algorithms being created for mobile devices, not only represents a real advance in image recording and 3D reconstruction, but also illustrates the great potential of the mobile-cloud model. At the moment the project is focusing particularly on music concerts and similar audience events, but the technology could also be used to recreate other types of events in 3D, such as breaking news or sports competitions. In addition, leading manufacturers of video game components are said to be following the project closely. However, SceneNet does raise a number of issues around privacy and intellectual property rights, and the consortium partners plan to study these aspects intensively over the next two years.