Structure-Aware Machine Learning
Classical black-box optimization treats the objective as a completely opaque function and builds surrogate models — Gaussian processes, neural networks — purely from input-output data. Yet in practice, many systems carry partial structural knowledge: known decompositions, conditional independencies, input-output monotonicity, or heterogeneous response types across regions of the search space. Ignoring this information wastes expensive function evaluations.
We develop structure-aware surrogate models that incorporate this knowledge directly into the model architecture. The Conditional Gaussian Process Tree (CGPT) uses a tree structure to partition the input space and fit locally adapted GPs, enabling efficient gray-box optimization. ClassBO extends Bayesian optimization to heterogeneous functions (objectives that behave qualitatively differently across regions of the search space), using a classification layer to route queries to the appropriate surrogate. On the deep learning side, our structure-aware architectures for RNA sequence-structure prediction demonstrate how biological structural constraints can be embedded in neural network design.
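The partition-then-fit idea behind tree-structured surrogates can be illustrated with a minimal sketch. It is not the CGPT algorithm itself: the single fixed split at 0.5, the RBF length scale, the toy objective, and the names `LocalGP` and `PartitionedSurrogate` are all assumptions made for illustration, and only the GP posterior mean is computed.

```python
import math

def rbf(a, b, length=0.15):
    """Squared-exponential kernel on scalars (length scale is an assumption)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

class LocalGP:
    """GP posterior mean with a constant prior mean, fit to one leaf's data."""
    def __init__(self, xs, ys, noise=1e-6):
        self.xs = list(xs)
        self.mean = sum(ys) / len(ys)
        K = [[rbf(a, b) + (noise if i == j else 0.0)
              for j, b in enumerate(self.xs)]
             for i, a in enumerate(self.xs)]
        self.alpha = solve(K, [y - self.mean for y in ys])

    def predict(self, x):
        return self.mean + sum(a * rbf(x, xi)
                               for a, xi in zip(self.alpha, self.xs))

class PartitionedSurrogate:
    """Depth-one tree: one split, one local GP per leaf (hypothetical helper)."""
    def __init__(self, split, xs, ys):
        self.split = split
        left = [(x, y) for x, y in zip(xs, ys) if x < split]
        right = [(x, y) for x, y in zip(xs, ys) if x >= split]
        # Assumes both leaves contain at least one observation.
        self.leaves = (LocalGP(*zip(*left)), LocalGP(*zip(*right)))

    def predict(self, x):
        # Route the query to the leaf whose region contains it.
        return self.leaves[0 if x < self.split else 1].predict(x)

def toy_objective(x):
    """Assumed heterogeneous objective with a jump at x = 0.5."""
    return math.sin(6 * x) if x < 0.5 else 3.0 + x

xs = [i / 10 for i in range(11)]
ys = [toy_objective(x) for x in xs]
surrogate = PartitionedSurrogate(0.5, xs, ys)
print(surrogate.predict(0.45), surrogate.predict(0.55))
```

A single GP with a stationary kernel would tend to smooth across the jump at 0.5. Routing each query to its own leaf keeps the two regimes separate, which is the intuition this sketch is meant to convey.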
Representative Publications
- Jiang, M.M., Khandait, T., & Pedrielli, G. CGPT: A Conditional Gaussian Process Tree for Grey-Box Bayesian Optimization. WSC 2023.
- Malu, M., Pedrielli, G., Dasarathy, G., & Spanias, A. ClassBO: Bayesian Optimization for Heterogeneous Functions. LION 2024.
- Zhou, Y., Pedrielli, G., Zhang, F., & Wu, T. Predicting RNA sequence-structure likelihood via structure-aware deep learning. BMC Bioinformatics, 25(1), 316, 2024.
Structure-Aware Optimization
Beyond better surrogates, known problem structure can be exploited directly in the optimization algorithm — in how candidates are selected, how budgets are allocated, and how the search adapts over time. This is particularly impactful in high-dimensional settings, where standard Bayesian optimization degrades rapidly and problem decomposability or low effective dimensionality must be leveraged.
Our work on model aggregation addresses large-scale, high-dimensional optimization by combining multiple local models in a principled way, achieving state-of-the-art performance on problems with thousands of variables. We have also advanced Optimal Computing Budget Allocation (OCBA) methods for stochastic simulation optimization — developing theory and algorithms that optimally distribute a finite simulation budget across competing design alternatives. Applications span circuit design, manufacturing systems, and engineering design under uncertainty.
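As a concrete illustration of budget allocation, the sketch below implements the classical asymptotic OCBA allocation rule from the literature, not the newer methods referred to above. It assumes minimization, independent normally distributed simulation outputs, plug-in sample means and standard deviations, and distinct means; the function name `ocba_fractions` and the example numbers are hypothetical.

```python
import math

def ocba_fractions(means, stds):
    """Budget fractions under the classical OCBA rule (minimization assumed).

    For non-best designs i, j:  N_i / N_j = (s_i / d_i)^2 / (s_j / d_j)^2,
    where d_i is the gap between design i's mean and the best mean.
    For the best design b:      N_b = s_b * sqrt(sum_{i != b} N_i^2 / s_i^2).
    Requires strictly positive stds and a unique best mean.
    """
    k = len(means)
    best = min(range(k), key=lambda i: means[i])
    weights = [0.0] * k
    for i in range(k):
        if i != best:
            gap = means[i] - means[best]
            weights[i] = (stds[i] / gap) ** 2
    weights[best] = stds[best] * math.sqrt(
        sum((weights[i] / stds[i]) ** 2 for i in range(k) if i != best)
    )
    total = sum(weights)
    return [w / total for w in weights]

# Three designs with equal noise: a best design, a close competitor,
# and a clearly inferior design (illustrative values only).
fractions = ocba_fractions([1.0, 1.2, 2.0], [1.0, 1.0, 1.0])
print(fractions)
```

With equal noise levels, the rule gives the close competitor 25 times the budget of the clearly inferior design, since the allocation scales with the inverse squared gap to the best mean.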
Representative Publications
- Wang, H., Zhang, E., Ng, S.H., & Pedrielli, G. A model aggregation approach for high-dimensional large-scale optimization. European Journal of Operational Research, 329(3), 890–907, 2026.
- Malu, M., Dow, D., Sharma, P., et al. High dimensional Bayesian optimization for circuit design. Intelligent Decision Technologies, 19(3), 1271–1282, 2025.
- Pedrielli, G., Lee, L.H., & Chen, C.H. Stochastic Simulation Optimization with Optimal Computing Budget Allocation. Encyclopedia of Optimization, Springer, 2024.
Multi-Agent & Multi-Fidelity Approaches
Many real-world optimization problems have access to multiple sources of information at varying cost and accuracy — from cheap low-fidelity simulations to expensive physical experiments. Multi-fidelity methods intelligently allocate an evaluation budget across these sources, extracting the most information per dollar. Separately, many problems involve multiple interacting decision-makers, calling for game-theoretic or multi-agent formulations.
We have developed rollout-based policies for multi-agent Bayesian optimization, which parallelize the search across multiple simultaneous queries and come with theoretical backing. On the game-theoretic side, we have introduced Monte Carlo fictitious play methods for efficiently finding pure Nash equilibria in large identical-interest games. Our spatially informed rapid testing framework (SIRTEM/RTEM), originally developed during COVID-19, demonstrates multi-fidelity data collection at scale for epidemic modeling, a different but structurally related problem class.
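To make the game-theoretic idea concrete, here is a simplified sampled variant of fictitious play on a two-player identical-interest game. It is a sketch under stated assumptions rather than the algorithm from the cited work: each player draws a single past joint action, best-responds to the other player's part of it, and the loop stops once the latest joint action is a pure Nash equilibrium. The payoff matrix, the starting profile, and the function names are hypothetical.

```python
import random

def is_pure_nash(payoff, profile):
    """True if neither player can raise the shared payoff by deviating alone."""
    r, c = profile
    best_row_dev = max(payoff[i][c] for i in range(len(payoff)))
    best_col_dev = max(payoff[r][j] for j in range(len(payoff[0])))
    return payoff[r][c] >= best_row_dev and payoff[r][c] >= best_col_dev

def sampled_fictitious_play(payoff, start, max_iters=1000, seed=0):
    """Return a pure Nash equilibrium if one is reached, else None."""
    rng = random.Random(seed)
    history = [start]
    for _ in range(max_iters):
        if is_pure_nash(payoff, history[-1]):
            return history[-1]
        # Each player independently samples one past joint action and
        # best-responds to the opponent's action in that sample.
        _, col_seen = rng.choice(history)
        row_seen, _ = rng.choice(history)
        row = max(range(len(payoff)), key=lambda i: payoff[i][col_seen])
        col = max(range(len(payoff[0])), key=lambda j: payoff[row_seen][j])
        history.append((row, col))
    return None

# Both players receive payoff[row][col] (identical interests).
payoff = [[3, 0],
          [0, 2]]
result = sampled_fictitious_play(payoff, start=(0, 1))
print(result)
```

Sampling from the history rather than averaging over it keeps each best-response computation cheap, which is the property that matters when the number of players or actions is large.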
Representative Publications
- Nambiraja, S.S. & Pedrielli, G. Multi Agent Rollout for Bayesian Optimization. WSC 2024.
- Kiatsupaibul, S., Pedrielli, G., Ryan, C.T., Smith, R.L., & Zabinsky, Z.B. Monte Carlo fictitious play for finding pure Nash equilibria in identical interest games. INFORMS Journal on Optimization, 6(3–4), 155–172, 2024.
- Azad, F.T., Dodge, R.W., Varghese, A.M., et al. SIRTEM: Spatially informed rapid testing for epidemic modeling and response to COVID-19. ACM Transactions on Spatial Algorithms and Systems, 8(4), 2022.