Multivariate Peaks-Over-Threshold (POT) Modelling of Non-stationary Air Pollution Concentration Data
Dr Leonid Bogachev (SoM), Dr Haibo Chen (ITS), Dr Georgios Aivaliotis (SoM), Prof. Jeanne Houwing-Duistermaat (SoM)
Project partner(s): Leeds City Council (LCC), the Environment Agency
Contact email: L.V.Bogachev@leeds.ac.uk
The peaks-over-threshold (POT) method is the preferred modern approach to analysing extreme values in a time series, because it makes better use of the available information than the classic block-maxima method (which uses only one maximum value in each block, e.g. a year). Moreover, in many applications the impact of extremes is realised through a few moderately large values rather than through a single highest maximum.
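As a rough illustration of the difference in data usage, the following minimal Python sketch extracts both block maxima and threshold excesses from the same series. The synthetic Gumbel data, the block length of 365 and the 98% quantile threshold are assumptions made purely for illustration, and the helper names are hypothetical.

```python
import numpy as np

def block_maxima(x, block_size):
    """Return the maximum of each complete block of length block_size."""
    n_blocks = len(x) // block_size
    return x[: n_blocks * block_size].reshape(n_blocks, block_size).max(axis=1)

def threshold_excesses(x, u):
    """Return the excesses x - u of all observations strictly above u."""
    return x[x > u] - u

# Hypothetical data: ten "years" of daily synthetic values (not real pollution data)
rng = np.random.default_rng(0)
x = rng.gumbel(size=10 * 365)

maxima = block_maxima(x, block_size=365)   # one value per block
u = np.quantile(x, 0.98)                   # illustrative threshold choice
excesses = threshold_excesses(x, u)        # every value above the threshold

print(len(maxima), "block maxima;", len(excesses), "threshold excesses")
```

With a threshold set at a high quantile, the POT sample typically contains several large values per block, whereas the block-maxima sample keeps exactly one.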
Threshold exceedances approximately follow a generalised Pareto distribution (GPD) with two parameters (scale and shape), which are constant if the data are stationary (i.e. the observed process is in statistical equilibrium). However, in many practical situations, including air pollution, the parameters of the system are likely to change significantly with time. Following Davison & Smith (1990), threshold exceedances in non-stationary data should be modelled by treating the GPD parameters as functions of (time-dependent) covariates (e.g. weather and traffic conditions for air pollutants). However, the Davison-Smith regression model is not threshold stable: the model parameters have to be re-estimated for every new threshold (which may need to vary with time). Recently, Gyarmati-Szabo, Bogachev and Chen (2017) proposed a novel model for non-stationary POT which is threshold stable. This has strong potential to dramatically improve the computational efficiency of the POT model, making it a versatile and powerful tool for dynamic estimation and prediction of extremes. In particular, this approach may serve as the basis for a semi- or fully automated computational tool designed for efficient on-line estimation and accurate prediction of future extreme events. Owing to threshold stability, such methods will work efficiently with variable threshold selection.
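To make the notion of threshold stability concrete, the sketch below checks it numerically in the simplest, stationary setting: if excesses over a threshold u follow a GPD with scale sigma_u and shape xi, then excesses over a higher threshold v follow a GPD with the same shape and scale sigma_u + xi (v - u). This is the standard stationary property only and is not the non-stationary model of Gyarmati-Szabo, Bogachev and Chen (2017); all parameter values and thresholds are hypothetical.

```python
import numpy as np
from scipy.stats import genpareto

# Hypothetical GPD parameters for excesses over the lower threshold u
sigma_u, xi = 1.5, 0.2
u, v = 40.0, 43.0                # hypothetical thresholds, with v > u

y = np.linspace(0.0, 20.0, 50)   # excesses over the higher threshold v

# Conditional survival function of the excess over v, derived from the GPD over u:
# P(X - v > y | X > v) = P(X - u > v - u + y) / P(X - u > v - u)
cond_sf = (genpareto.sf(v - u + y, c=xi, scale=sigma_u)
           / genpareto.sf(v - u, c=xi, scale=sigma_u))

# Threshold stability: same shape, scale shifted linearly in the threshold
sigma_v = sigma_u + xi * (v - u)
direct_sf = genpareto.sf(y, c=xi, scale=sigma_v)

max_abs_diff = np.max(np.abs(cond_sf - direct_sf))
print("largest discrepancy between the two survival functions:", max_abs_diff)
```

The two survival functions agree up to floating-point error, so in this setting a fit at one threshold determines the model at any higher threshold without re-estimation.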
The present project aims to develop a more general methodology for joint modelling of several observables, such as different air pollutants (e.g. NO2, NO, O3), which are highly correlated owing to complex photochemical reactions in the atmosphere in the presence of sunlight. The principal innovation to be achieved is the design of a suitable multivariate POT model for non-stationary data that preserves the property of threshold stability. Data analysis based on such a model will involve Markov chain Monte Carlo (MCMC) simulations to obtain posterior distributions of the model parameters. The project is also likely to involve the development of an efficient computer simulator for the class of generalised Pareto distributions.
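The two computational ingredients mentioned above can be sketched in the univariate, stationary case: a GPD simulator based on inversion of the distribution function, and a random-walk Metropolis sampler for the posterior of the scale and shape. This is a minimal sketch only, not the multivariate non-stationary methodology the project aims to develop; the function names, the normal priors, the step size and the sample sizes are all assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import genpareto

def sample_gpd(n, sigma, xi, rng):
    """Draw n GPD(sigma, xi) variates by inverting the distribution function."""
    u = rng.uniform(size=n)
    if xi == 0.0:
        return -sigma * np.log1p(-u)               # exponential limit as xi -> 0
    return sigma / xi * ((1.0 - u) ** (-xi) - 1.0)

def log_posterior(theta, y):
    """Log-posterior (up to a constant) of theta = (log sigma, xi)."""
    log_sigma, xi = theta
    # GPD log-likelihood; equals -inf if any excess lies outside the support
    log_lik = genpareto.logpdf(y, c=xi, scale=np.exp(log_sigma)).sum()
    # Illustrative priors (an assumption): log sigma ~ N(0, 10^2), xi ~ N(0, 1)
    log_prior = -0.5 * (log_sigma / 10.0) ** 2 - 0.5 * xi ** 2
    return log_lik + log_prior

def metropolis(y, n_iter=4000, step=0.08, seed=1):
    """Random-walk Metropolis chain for (log sigma, xi)."""
    rng = np.random.default_rng(seed)
    theta = np.array([np.log(y.mean()), 0.1])      # starting point inside the support
    lp = log_posterior(theta, y)
    chain = np.empty((n_iter, 2))
    for i in range(n_iter):
        proposal = theta + step * rng.standard_normal(2)
        lp_prop = log_posterior(proposal, y)
        # Accept with probability min(1, posterior ratio); -inf is always rejected
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = proposal, lp_prop
        chain[i] = theta
    return chain

# Hypothetical experiment: simulate excesses, then recover the parameters
rng = np.random.default_rng(0)
y = sample_gpd(500, sigma=2.0, xi=0.2, rng=rng)
chain = metropolis(y)
post = chain[1000:]                                # discard burn-in
sigma_mean = np.exp(post[:, 0]).mean()
xi_mean = post[:, 1].mean()
print("posterior means: sigma =", sigma_mean, ", xi =", xi_mean)
```

Sampling on the log scale for sigma keeps the scale positive without explicit constraints; a multivariate or covariate-dependent model would need a considerably richer likelihood and proposal scheme.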
This project will be supervised jointly by the Department of Statistics and the Institute for Transport Studies at Leeds. It also has strong potential to involve collaboration with external organisations, such as Leeds City Council (LCC) and the Environment Agency.
Related undergraduate subjects:
- Applied mathematics
- Environmental science