*Result*: Deep Learning Gaussian Processes for Complex Simulators
*Further Information*
*Computer-simulated experiments have gained popularity among science and engineering disciplines for the modeling and study of complex processes. Computational progress and expansions in data availability have resulted in extremely realistic but costly models possessing high-dimensional sets of inputs and outputs, stochasticity, and heteroskedastic behaviors, all of which impede routine optimization, uncertainty analysis, and prediction tasks. In the past several decades, a substantial body of work has been published that addresses a variety of sensitivity and calibration problems. However, most of these published works focus on deterministic problems, assuming that simulator outputs are fixed given the input parameters. Further, these techniques do not address the heteroskedastic and non-stationary nature of the problem. Algorithms that ignore these features may not provide adequate solutions. The goal of this work is to address this gap in the literature. Specifically, the primary contributions of this work include: (1) developing a surrogate model that combines a non-linear deterministic dimensionality reduction technique with a probabilistic Gaussian Process (GP) model; (2) developing an inference scheme that jointly estimates the parameters of those two components; (3) applying sequential experimental design techniques along with the proposed surrogate model to implement Bayesian optimization for an activity-based transportation simulator; (4) extending the coregionalization technique to handle functional, heteroskedastic computer model outputs; (5) developing an estimation procedure that jointly estimates the parameters of the heteroskedastic coregionalization transformation as well as the univariate Gaussian Processes; (6) introducing an automated attention mechanism to our calibration framework to identify the most important contributors and encourage concentrated improvement searches.
The first developed framework addresses optimization tasks for stochastic, heteroskedastic, and high-dimensional simulators. The outlined approach combines a deep learning dimensionality reduction model with a jointly estimated Gaussian Process model to approximate the simulator. The solution is applied to the calibration of an agent-based, activity-based (ABM) transportation simulator, which relies on statistical modeling of individual travelers' behavior to predict higher-order travel patterns in metropolitan areas. The developed solution finds a 97% improvement within just 1 iteration. The second model uses Deep Learning Gaussian Processes (DL-GP) for the analysis of computer models that produce heteroskedastic and high-dimensional output. Multi-output models have many areas of application, including socio-economic processes and agricultural, environmental, biological, engineering, and physics problems. A deterministic transformation of inputs is performed by deep learning, and predictions are calculated by traditional Gaussian Processes. The developed surrogate model is then applied to the prediction of an Ebola outbreak simulator. The results of the model demonstrate comparable improvement in 3 scenario predictions. The third model incorporates the multi-output surrogate into the thesis's first framework. The resulting Bayesian optimization algorithm is applied to an expanded 104-variable version of the first transportation calibration use case. The addition of the multivariate surrogate method finds better solutions in even shorter time frames than the first framework: a 14% improvement within the first iteration and a final improvement of 41% by the twelfth iteration. Although our work is focused on simulators, we believe similar approaches can be used to answer other decision-making questions, such as how to dynamically deploy a shared mobility fleet to relieve highly congested corridors or how to change the frequency of transit services.
The research also contributes more broadly to the challenges of optimal design of complex systems under uncertainty. Other potential applications include supply chains, computer networks, healthcare networks, and robotics.*
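
To make the surrogate structure described above more concrete, the following is a minimal sketch, not the thesis's implementation. It assumes a single fixed `tanh` layer as a stand-in for the deep learning dimensionality reduction and a standard RBF-kernel GP with constant (homoskedastic) noise on the reduced features. The names `feature_map`, `rbf_kernel`, `gp_posterior`, and `toy_simulator` are hypothetical, the weights are random rather than jointly estimated with the GP, and the data are synthetic.

```python
import numpy as np

def feature_map(X, W, b):
    """Deterministic non-linear reduction R^d -> R^k (one tanh layer; an assumption)."""
    return np.tanh(X @ W + b)

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_posterior(Z_train, y_train, Z_test, noise=1e-2):
    """Posterior mean and variance of a zero-mean GP on the reduced features."""
    K = rbf_kernel(Z_train, Z_train) + noise * np.eye(len(Z_train))
    K_s = rbf_kernel(Z_train, Z_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(Z_test, Z_test)) - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

rng = np.random.default_rng(0)
d, k, n = 20, 2, 40  # input dimension, reduced dimension, training runs (all illustrative)

# Fixed random weights; in the thesis these would be estimated jointly with the GP.
W = rng.normal(scale=1.0 / np.sqrt(d), size=(d, k))
b = np.zeros(k)

def toy_simulator(X):
    """Synthetic stochastic simulator that depends on X only through the features."""
    Z = feature_map(X, W, b)
    return np.sin(3.0 * Z[:, 0]) + Z[:, 1] ** 2 + 0.05 * rng.normal(size=len(X))

X_train = rng.normal(size=(n, d))
y_train = toy_simulator(X_train)
X_test = rng.normal(size=(5, d))

mean, var = gp_posterior(
    feature_map(X_train, W, b), y_train, feature_map(X_test, W, b)
)
print(mean, var)
```

In a Bayesian optimization loop, `mean` and `var` would feed an acquisition function that selects the next simulator run. Handling heteroskedastic noise and jointly fitting the reduction and GP parameters, as the abstract describes, are outside the scope of this sketch.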