About the Research

Research Theme

Theory and applications of multi-armed bandits


Keywords

Algorithm, multi-armed bandit, machine learning, knowledge discovery and data mining

Research Outline

My research is in machine learning, in particular the theory and applications of multi-armed bandit algorithms, which are a kind of reinforcement learning.
Broadly speaking, machine learning can be divided into supervised learning, unsupervised learning, and reinforcement learning. In supervised learning, the task is to construct an algorithm that returns the correct output for a given input, based on annotated training data. In unsupervised learning, the task is to extract patterns from data without annotations.
In reinforcement learning, by contrast, the task is to learn by trial and error. In a given “environment”, the program selects an “action”; as a result of that action, a “reward” is given and the “environment” changes. In this setting, the goal is to maximize the cumulative sum of rewards or to learn the best behavior. A famous application of reinforcement learning to game AI is AlphaGo, which defeated a world champion Go player.
Using the multi-armed bandit framework, I am developing algorithms such as a method to reduce the cost and time required for drug screening experiments and a method to accelerate cancer diagnosis by microscopy.
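
To make the trial-and-error loop concrete, here is a minimal sketch of a multi-armed bandit in Python. It is an illustration only, not one of the algorithms developed in this research. It uses the textbook UCB1 rule on simulated arms that pay a reward of 1 with a fixed probability. The function name `ucb1`, the arm probabilities, the horizon, and the random seed are all assumptions chosen for the example. Unlike the general reinforcement learning setting, the environment here has no state that changes between rounds.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on simulated Bernoulli arms (hypothetical example data).

    Returns the number of pulls per arm and the total reward collected.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # how many times each arm has been pulled
    sums = [0.0] * n_arms   # total reward observed from each arm
    total_reward = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Pull every arm once so each has an initial estimate.
            arm = t - 1
        else:
            # Pick the arm with the largest optimistic estimate:
            # empirical mean plus an exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # Simulated environment: reward 1 with probability arm_means[arm].
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward

    return counts, total_reward


if __name__ == "__main__":
    pulls, total = ucb1([0.2, 0.5, 0.8], horizon=2000)
    print("pulls per arm:", pulls)
    print("total reward:", total)
```

In this sketch the exploration bonus shrinks as an arm is pulled more often, so the pulls gradually concentrate on the arm with the highest average reward. In an application such as drug screening, an arm would correspond to a candidate and a pull to one experiment, although the actual reward models used in practice may differ from this simulation.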

Representative Research Achievements

