Recent rapid developments in atmospheric composition sensors have created major opportunities for science, in particular for Earth Observation (EO) technology that monitors our planet from space. The new generation of Copernicus satellites, e.g. Sentinel-5 Precursor, delivers large volumes of data that have to be processed under near-real-time requirements. Radiative transfer models (RTMs) are key components of atmospheric processors, i.e. software specifically designed for retrieving atmospheric constituents from remote sensing data. RTMs encode our understanding of the physics behind the measurement process and relate the optical parameters of the atmosphere to the signal measured by the satellite.
Today, the efficiency requirements on RTMs are higher than ever. The traditional approaches implemented in current atmospheric processors cannot cope with such large amounts of data; in fact, the amount of satellite data is growing faster than the available computational power. Remote sensing data is recognized as Big Data because it satisfies IBM’s 4V criterion: volume, velocity, variety, and veracity. New, efficient techniques therefore have to be developed for the next generation of atmospheric processors, whose design can be regarded as a disruptive development.
Artificial Intelligence (AI) is certainly one important way to drastically improve the performance of RTMs, thereby enabling faster exploration of remote sensing Big Data. There is a consensus in the scientific community that AI, although a powerful tool, should not be used as a black box but should rather become an integral part of RTMs. Such a synergy between RTMs and AI establishes a new field of radiative transfer, namely AI4RTM, which provides a new physical basis for the AI4EO concept.
Within AI4RTM, physics-based models are combined with the AI approach.
Machine learning handles classification, regression, dimensionality reduction, and feature extraction well, and each of these methods can be used to accelerate RTMs. In our talk, all of these concepts are outlined in the context of AI4RTM, as applied to both forward modeling and retrieval.
AI has become a key tool in hyper-spectral data processing. Hyper-spectral data usually exhibit strong interdependencies and correlations between different channels and spectral bands. Dimensionality reduction techniques exploit this redundancy and substantially reduce the number of RTM calls required to process the data (del Águila et al., 2019). Integrating methods such as principal component analysis (PCA) into RTMs yields PCA-based RTMs, which can mimic the impact of multiple scattering at low computational cost. We developed a novel hybrid PCA-based RTM. The simulations start with a simplified RTM (e.g. the two-stream model). By analyzing the set of hyper-spectral radiances, we identify a subset of wavelengths carrying the most relevant information and establish a projection of the initial data space onto a lower-dimensional subspace. In the reduced subspace, the optical properties are analyzed by means of AI, and a correction function for the simplified RTM is found. Such a hybrid implementation, in which the AI treatment surrounds the RTM, improves the performance of hyper-spectral forward modeling by several orders of magnitude. This approach has proved efficient for simulating the Hartley-Huggins band.
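The hybrid PCA-based scheme can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `simple_rtm` and `full_rtm` are toy stand-ins for a two-stream model and an expensive multiple-scattering model, the optical properties are random numbers rather than real atmospheric data, and a quadratic least-squares fit in the PCA scores stands in for the AI-based correction function. It is not the implementation used in the cited work.

```python
import numpy as np

def simple_rtm(tau, omega):
    """Toy stand-in for a fast simplified model (hypothetical, not a real two-stream code)."""
    t = tau.sum(axis=1)
    return omega * (1.0 - np.exp(-t))

def full_rtm(tau, omega):
    """Toy stand-in for an expensive multiple-scattering model (hypothetical)."""
    t = tau.sum(axis=1)
    return simple_rtm(tau, omega) * (1.0 + 0.5 * omega * t / (1.0 + t))

def poly2(s):
    """Quadratic design matrix in two standardized PCA scores."""
    s1, s2 = s[:, 0], s[:, 1]
    return np.column_stack([np.ones_like(s1), s1, s2, s1**2, s1 * s2, s2**2])

# Assumed optical properties per spectral point: layer optical depths and albedo.
rng = np.random.default_rng(0)
n_wl, n_layers, n_sub = 2000, 10, 30
profile = np.full(n_layers, 1.0 / n_layers)
tau = np.exp(rng.uniform(-2.0, 1.0, n_wl))[:, None] * profile
omega = rng.uniform(0.8, 1.0, n_wl)

# PCA of the standardized optical properties; keep two components.
X = np.column_stack([np.log(tau), omega])
Z = (X - X.mean(axis=0)) / X.std(axis=0)
_, _, vt = np.linalg.svd(Z, full_matrices=False)
scores = Z @ vt[:2].T
scores /= scores.std(axis=0)

# Pick a small subset of spectral points spread over the first PCA score.
order = np.argsort(scores[:, 0])
subset = order[np.linspace(0, n_wl - 1, n_sub).astype(int)]

# Simplified model everywhere, expensive model only on the subset.
i_simple = simple_rtm(tau, omega)
log_corr = np.log(full_rtm(tau[subset], omega[subset]) / i_simple[subset])

# Fit the correction function in the reduced space and apply it to all points.
coef, *_ = np.linalg.lstsq(poly2(scores[subset]), log_corr, rcond=None)
i_hybrid = i_simple * np.exp(poly2(scores) @ coef)

# Reference run over all points, used here only to check the sketch.
i_ref = full_rtm(tau, omega)
err_simple = float(np.mean(np.abs(i_simple / i_ref - 1.0)))
err_hybrid = float(np.mean(np.abs(i_hybrid / i_ref - 1.0)))
print(f"full-model calls: {n_sub} of {n_wl}")
print(f"mean relative error: simple={err_simple:.4f}, hybrid={err_hybrid:.4f}")
```

In this toy setting the expensive model is called at only 30 of 2000 spectral points; the saving in a real processor depends on how well the correction is captured in the reduced space.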
To accelerate RTM computations and improve the efficiency of retrieval algorithms, artificial neural networks are used to parametrize the physical processes in the model. In this context, we use feed-forward neural networks to learn the behavior of a radiative transfer forward model from a set of representative training data. The input vectors are selected by means of the smart sampling technique (Loyola et al., 2016) so as to optimally cover the whole input space, and the output vectors are the radiances returned by the radiative transfer forward model for each input vector. This kind of implementation has been used successfully as the forward model in the retrieval of cloud properties from different sensors, including the newest, TROPOMI (Loyola et al., 2018).
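A minimal sketch of such a neural-network emulator is given below. It rests on several assumptions: `toy_forward_model` is a hypothetical two-input stand-in for the RTM, a Halton low-discrepancy sequence is used as a simple space-filling sampler and is not claimed to reproduce the smart sampling technique, and the network is a small one-hidden-layer perceptron trained by plain full-batch gradient descent.

```python
import numpy as np

def halton(n, dim):
    """First n points of a Halton low-discrepancy sequence in [0, 1)^dim (dim <= 6)."""
    primes = [2, 3, 5, 7, 11, 13][:dim]
    pts = np.empty((n, dim))
    for j, base in enumerate(primes):
        for i in range(n):
            f, r, k = 1.0, 0.0, i + 1
            while k > 0:  # radical inverse of i + 1 in the given base
                f /= base
                r += f * (k % base)
                k //= base
            pts[i, j] = r
    return pts

def toy_forward_model(x):
    """Hypothetical stand-in for an expensive RTM: two normalized inputs, one radiance."""
    tau = 0.1 + 2.9 * x[:, 0]      # assumed optical thickness range
    omega = 0.8 + 0.2 * x[:, 1]    # assumed single-scattering albedo range
    return (omega * (1.0 - np.exp(-tau)))[:, None]

def train_mlp(x, y, hidden=16, lr=0.05, steps=3000, seed=0):
    """One-hidden-layer tanh network, full-batch gradient descent on 0.5 * MSE."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 1.0, (x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.1, (hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    n = x.shape[0]
    for _ in range(steps):
        h = np.tanh(x @ w1 + b1)
        err = h @ w2 + b2 - y
        grad_h = (err @ w2.T) * (1.0 - h**2)  # backpropagate through tanh
        w2 -= lr * (h.T @ err) / n
        b2 -= lr * err.mean(axis=0)
        w1 -= lr * (x.T @ grad_h) / n
        b1 -= lr * grad_h.mean(axis=0)
    return w1, b1, w2, b2

def predict(params, x):
    w1, b1, w2, b2 = params
    return np.tanh(x @ w1 + b1) @ w2 + b2

# Training set: space-filling inputs, outputs from the (toy) forward model.
x_train = halton(512, 2)
y_train = toy_forward_model(x_train)
params = train_mlp(x_train, y_train)

# Independent random test inputs to check the emulator.
x_test = np.random.default_rng(1).uniform(0.0, 1.0, (1000, 2))
y_test = toy_forward_model(x_test)
rmse = float(np.sqrt(np.mean((predict(params, x_test) - y_test) ** 2)))
baseline = float(np.std(y_test))  # error of always predicting the mean radiance
print(f"emulator RMSE={rmse:.4f}, predict-the-mean baseline={baseline:.4f}")
```

Once trained, the emulator replaces each forward-model call with two small matrix products, which is where the speed-up in a retrieval loop would come from.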
Regarding inverse problems, a special class of algorithms called Full-Physics Inverse Learning Machines (FPILM) (Efremenko et al., 2017; Xu et al., 2018) has been developed specifically for the retrieval of atmospheric properties from remote sensing measurements. Unlike conventional inversion approaches, the FPILM formulates, for instance, the ozone profile shape retrieval as a classification problem, which yields reliable accuracy and a significant speed-up. Its implementation consists of a training phase, in which an inverse function is obtained from synthetic measurements generated using an RTM, and an operational phase, in which the inverse function is applied to real measurements. In particular, the radiative transfer calculations are carried out only once and offline. Another advantage of FPILM is its simplicity: once the inverse function (with its coefficients) has been determined in the training phase, the atmospheric properties of interest can be retrieved very easily. Accordingly, FPILM is several orders of magnitude faster than conventional approaches based on Tikhonov regularization with online RTM simulations, without degrading the retrieval quality. The FPILM algorithms are used for the operational processing of GOME-2 and now S5P/TROPOMI data.
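The two-phase structure described above can be sketched as follows. This is a toy illustration under stated assumptions, not the operational FPILM: `toy_rtm` is a hypothetical forward model with a single scalar state parameter, the noise level is invented, and the inverse function is a linear least-squares regression on three principal-component scores of the synthetic spectra, used here in place of whatever learning machine an operational algorithm employs.

```python
import numpy as np

def toy_rtm(x, k):
    """Hypothetical forward model: transmittance-like spectra for states x."""
    return np.exp(-np.outer(x, k))

rng = np.random.default_rng(1)
k = np.linspace(0.5, 1.5, 80)   # assumed per-channel absorption strength
noise = 1e-3                    # assumed measurement noise level
n_train, n_meas = 2000, 500

# Training phase (offline): every forward-model call happens here.
x_train = rng.uniform(0.0, 1.0, n_train)
y_train = toy_rtm(x_train, k) + noise * rng.standard_normal((n_train, k.size))

# Compress the synthetic spectra with PCA.
y_mean = y_train.mean(axis=0)
_, _, vt = np.linalg.svd(y_train - y_mean, full_matrices=False)
basis = vt[:3].T                # shape (channels, 3)

# Fit the inverse function: PCA scores -> state parameter.
design = np.column_stack([np.ones(n_train), (y_train - y_mean) @ basis])
coef, *_ = np.linalg.lstsq(design, x_train, rcond=None)

def inverse_function(y):
    """Operational phase: project measurements and apply the stored coefficients."""
    s = (np.atleast_2d(y) - y_mean) @ basis
    return coef[0] + s @ coef[1:]

# Operational phase: simulated "measurements"; the retrieval itself
# needs only a projection and a dot product, no forward-model calls.
x_true = rng.uniform(0.0, 1.0, n_meas)
y_meas = toy_rtm(x_true, k) + noise * rng.standard_normal((n_meas, k.size))
x_ret = inverse_function(y_meas)

rmse = float(np.sqrt(np.mean((x_ret - x_true) ** 2)))
print(f"retrieval RMSE on the toy problem: {rmse:.4f}")
```

The only quantities carried from the training phase to the operational phase are `y_mean`, `basis`, and `coef`, which is what makes the operational step cheap.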
Compared with traditional retrieval techniques, our approaches exploit new computational paradigms and upgrade retrieval algorithms to a level at which they can be applied to the analysis of Big Data. Whatever the perspective on Big Data, there is always a risk of relying on AI alone and losing the ability to interpret the results from a physical point of view. In our case, although we are confronted with a kind of disruptive development based on AI, the physical component expressing our understanding of the physical processes in the atmosphere is preserved.
References
del Águila A., Efremenko D.S., Molina García V., Xu J. (2019): Analysis of Two Dimensionality Reduction Techniques for Fast Simulation of the Spectral Radiances in the Hartley-Huggins Band, Atmosphere, 10, 142.
Efremenko D.S., Loyola R. D.G., Hedelt P., Spurr R.J.D. (2017): Volcanic SO2 plume height retrieval from UV sensors using a full-physics inverse learning machine algorithm, International Journal of Remote Sensing, 38:sup1, 1.
Loyola D., Pedergnana M., Gimeno García S. (2016): Smart sampling and incremental function learning for very large high dimensional data, Neural Networks, 78, 75.
Loyola D.G., Gimeno García S., Lutz R., Argyrouli A., Romahn F., Spurr R.J.D., Pedergnana M., Doicu A., Molina García V., Schüssler O. (2018): The operational cloud retrieval algorithms from TROPOMI on board Sentinel-5 Precursor, Atmospheric Measurement Techniques, 11, 1, 409.
Xu J., Heue K.-P., Coldewey-Egbers M., Romahn F., Doicu A., Loyola D. (2018): Full-Physics Inverse Learning Machine for satellite remote sensing of ozone profile shapes and tropospheric columns, ISPRS Technical Commission III Symposium 2018, 7-10 May 2018, Beijing, China.