Research

Publications

  1. Zhou, M., Yang, L., Tan, Vincent Y. F. & Zhang, M., “Age-Optimal Best Arm Identification,” IEEE/IFIP International Conference on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt), (In Press), 2026.

  2. Yang, L., Tan, Vincent Y. F. & Cheung, W. C., “Best Arm Identification with Possibly Biased Offline Data,” Uncertainty in Artificial Intelligence (UAI), 41, pp. 4715–4730, 2025.

    We study the best arm identification (BAI) problem with potentially biased offline data in the fixed confidence setting, which commonly arises in real-world scenarios such as clinical trials. We prove an impossibility result for adaptive algorithms without prior knowledge of the bias bound between online and offline distributions. To address this, we propose the LUCB-H algorithm, which introduces adaptive confidence bounds by incorporating an auxiliary bias correction to balance offline and online data within the LUCB framework. Theoretical analysis shows that LUCB-H matches the sample complexity of standard LUCB when offline data is misleading and significantly outperforms it when offline data is helpful. We also derive an instance-dependent lower bound that matches the upper bound of LUCB-H in certain scenarios. Numerical experiments further demonstrate the robustness and adaptability of LUCB-H in effectively incorporating offline data.
    @inproceedings{yang2025best,
      title={Best Arm Identification with Possibly Biased Offline Data},
      author={Yang, Le and Tan, Vincent Y. F. and Cheung, Wang Chi},
      booktitle={Conference on Uncertainty in Artificial Intelligence},
      pages={4715--4730},
      year={2025},
      organization={PMLR}
    }
    
  3. Yang, L., Gao, S., Li, C. & Wang, Y., “Stochastically Constrained Best Arm Identification with Thompson Sampling,” Automatica, 176, 112223, 2025.

    We consider the problem of best arm identification in the presence of stochastic constraints, where there is a finite number of arms associated with multiple performance measures. The goal is to identify the arm that optimizes the objective measure subject to constraints on the remaining measures. We explore the popular idea of Thompson sampling (TS) as a means of solving it. To the best of our knowledge, this is the first attempt to extend TS to this problem. We design a TS-based sampling algorithm, establish its asymptotic optimality in the rate of posterior convergence, and demonstrate its superior performance using numerical examples.
    
    @article{yang2025stochastically,
      title={Stochastically constrained best arm identification with Thompson sampling},
      author={Yang, Le and Gao, Siyang and Li, Cheng and Wang, Yi},
      journal={Automatica},
      volume={176},
      pages={112223},
      year={2025},
      publisher={Elsevier}
    }
    
    
  4. Yang, L., Gao, S. & Ho, C., “Improving the Knowledge Gradient Algorithm,” Advances in Neural Information Processing Systems (NeurIPS), 36, pp. 61747–61758, 2023.

    The knowledge gradient (KG) algorithm is a popular policy for the best arm identification (BAI) problem. It is built on the simple idea of always choosing the measurement that yields the greatest expected one-step improvement in the estimate of the best mean of the arms. In this research, we show that this policy has limitations that prevent the algorithm from being asymptotically optimal. We then provide a remedy that keeps the one-step look-ahead of KG but instead chooses the measurement that yields the greatest one-step improvement in the probability of selecting the best arm. The new policy is called improved knowledge gradient (iKG). iKG can be shown to be asymptotically optimal. In addition, we show that, compared to KG, iKG is easier to extend to variant problems of BAI, with ϵ-good arm identification and feasible arm identification as two examples. The superior performance of iKG on these problems is further demonstrated using numerical examples.
    
    @inproceedings{yang2023improving,
     author = {Yang, Le and Gao, Siyang and Ho, Chin Pang},
     booktitle = {Advances in Neural Information Processing Systems},
     editor = {A. Oh and T. Naumann and A. Globerson and K. Saenko and M. Hardt and S. Levine},
     pages = {61747--61758},
     publisher = {Curran Associates, Inc.},
     title = {Improving the Knowledge Gradient Algorithm},
     url = {https://proceedings.neurips.cc/paper_files/paper/2023/file/c272409133942e2f4b7631c8cb7e507e-Paper-Conference.pdf},
     volume = {36},
     year = {2023}
    }
    
    
  5. Yang, L., Zheng, Y. & Shi, J., “Risk-Sensitive Stochastic Control with Applications to An Optimal Investment Problem under Correlated Noises,” Chinese Control Conference (CCC), 38, pp. 1356–1363, 2019.

    This paper is concerned with a risk-sensitive stochastic control problem, motivated by an optimal investment problem under correlated noises in the financial market. A new stochastic maximum principle for this kind of problem is obtained first, where the adjoint equations and maximum condition depend heavily on the risk-sensitive parameter and the correlation coefficient. The theoretical result is then applied to the optimal investment problem with correlated noises, and the optimal investment strategy is obtained in state feedback form, under a critical condition satisfied by the risk-sensitive parameter and the correlation coefficient. Numerical simulations and figures are given to illustrate how the optimal solution changes with, and how sensitive it is to, the risk-sensitive parameter and the correlation coefficient.
    
    @inproceedings{yang2019risk,
      author={Yang, Le and Zheng, Yueyang and Shi, Jingtao},
      booktitle={2019 Chinese Control Conference (CCC)},
      pages={1356--1363},
      title={Risk-Sensitive Stochastic Control with Applications to An Optimal Investment Problem under Correlated Noises},
      year={2019},
      organization={IEEE}
    }
    
    
    

Academic Exchanges

University of Southern California (USC)
Visiting Scholar in the Department of Industrial and Systems Engineering, School of Engineering
Advisor: Prof. Sheldon Mark Ross
Research Interests: Multi-Armed Bandit, Online Learning
Feb 2024 – Jun 2024

Chinese Academy of Sciences
Research Assistant in the Academy of Mathematics and Systems Science
Supervisor: Prof. Dacheng Yao
Research Interests: Risk-Sensitive, Inventory Control
May 2019 – Aug 2019

Conference Presentations

  • 21 Jul 2025 – 25 Jul 2025: The 41st Conference on Uncertainty in Artificial Intelligence (Poster)

  • 10 Dec 2023 – 16 Dec 2023: The 37th Conference on Neural Information Processing Systems (Poster)

  • 27 Jul 2019 – 30 Jul 2019: The 38th Chinese Control Conference (Oral)

Academic Service (Reviewer)

  • Automatica

  • Journal of Simulation (JOS)

  • INFORMS Journal on Computing (JOC)

  • Advances in Neural Information Processing Systems (NeurIPS)

  • International Conference on Machine Learning (ICML)

  • Uncertainty in Artificial Intelligence (UAI)

  • IEEE Transactions on Automation Science and Engineering (TASE)