Glasgow, Milan and Delft turn time into vision

Author: EIS
Release Date: Aug 7, 2020


Researchers at Glasgow University, the Polytechnic University of Milan and Delft University of Technology have developed a new way of imaging – harnessing AI to turn time into visions of 3D space.

The discovery could help cars, mobile devices and health monitors develop 360-degree awareness.

Photos and videos are usually produced by capturing photons – the building blocks of light – with digital sensors.

For instance, digital cameras consist of millions of pixels that form images by detecting the intensity and colour of the light at each point in space. 3D images can then be generated either by positioning two or more cameras around the subject to photograph it from multiple angles, or by using streams of photons to scan the scene and reconstruct it in three dimensions. Either way, an image can only be built by gathering spatial information about the scene.

The researchers have found an entirely new way to make animated 3D images – by capturing temporal information about photons instead of their spatial coordinates.

Their process begins with a simple, inexpensive single-point detector tuned to act as a kind of stopwatch for photons.

Unlike cameras, which measure the spatial distribution of colour and intensity, the detector records only how long it takes the photons produced by a split-second pulse of laser light to bounce off each object in the scene and return to the sensor. The further away an object is, the longer each reflected photon takes to reach the sensor.

The information about the timings of each photon reflected in the scene – what the researchers call the temporal data – is collected in a very simple graph.
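
To make the idea concrete, here is a minimal Python sketch of how such a time-of-flight histogram might be built from raw photon arrival times. The function name, bin width and the simulated two-object scene are illustrative assumptions, not details taken from the study.

```python
import numpy as np

C = 3e8  # speed of light, m/s

def time_of_flight_histogram(arrival_times, bin_width=1e-10, n_bins=1000):
    """Bin raw photon arrival times (seconds after the laser pulse)
    into a 1D histogram -- the 'temporal data' described above."""
    edges = np.arange(n_bins + 1) * bin_width
    counts, _ = np.histogram(arrival_times, bins=edges)
    return counts, edges

# Hypothetical scene: two surfaces at 1.5 m and 4.0 m produce two peaks.
# Round-trip time = 2 * distance / c.
rng = np.random.default_rng(0)
arrivals = np.concatenate([
    rng.normal(2 * 1.5 / C, 5e-11, size=2000),  # near, bright object
    rng.normal(2 * 4.0 / C, 5e-11, size=800),   # farther, dimmer object
])
counts, edges = time_of_flight_histogram(arrivals)
print("two strongest bins:", np.argsort(counts)[-2:])
```

Each peak in such a histogram corresponds to a surface at a particular distance, since round-trip time grows with distance; the histogram alone, however, says nothing about where in the scene those surfaces sit, which is the gap the neural network fills.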

Those graphs are then turned into a 3D image with the help of a sophisticated neural network algorithm. The researchers ‘trained’ the algorithm by showing it thousands of different conventional photos of the team moving and carrying objects around the lab, alongside temporal data captured by the single-point detector at the same time.
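
The study's actual network architecture is not described here, so the following PyTorch sketch only illustrates the general idea: a small fully connected model that learns, from paired examples, to map a 1D temporal histogram to a coarse 2D image. The class name, layer sizes and the random placeholder data are all assumptions for illustration.

```python
import torch
import torch.nn as nn

class TemporalToImageNet(nn.Module):
    """Hypothetical stand-in for the researchers' network: maps a
    1D time-of-flight histogram to a coarse 2D image."""
    def __init__(self, n_bins=1000, img_size=64):
        super().__init__()
        self.img_size = img_size
        self.net = nn.Sequential(
            nn.Linear(n_bins, 512), nn.ReLU(),
            nn.Linear(512, 1024), nn.ReLU(),
            nn.Linear(1024, img_size * img_size),
        )

    def forward(self, hist):
        # (batch, n_bins) -> (batch, 1, img_size, img_size)
        return self.net(hist).view(-1, 1, self.img_size, self.img_size)

model = TemporalToImageNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

# Supervised training: histograms as input, paired camera frames as
# targets (random placeholders here, real frames in the experiment).
hists = torch.rand(16, 1000)
targets = torch.rand(16, 1, 64, 64)
for step in range(10):
    opt.zero_grad()
    loss = loss_fn(model(hists), targets)
    loss.backward()
    opt.step()
```

In this scheme the conventional photos are needed only during training; once trained, the model is run on the single-point detector's histograms alone.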

Eventually, the network learned enough about how the temporal data corresponded to the photos to create highly accurate images from the temporal data alone. In proof-of-principle experiments, the team constructed moving images at about 10 frames per second from the temporal data, although the hardware and algorithm used have the potential to produce thousands of images per second.

Currently, the neural network's ability to create images is limited to what it has been trained to pick out from the temporal data of scenes created by the researchers. However, with further training, and by using more advanced algorithms, it could learn to visualise a far wider range of scenes, widening its potential applications in real-world situations.