About the Research
Information/knowledge extraction from research papers and its utilization
I aim to extract information about chemical reaction network graphs contained in collections of scientific papers and use it to analyze chemical reactions. Developing such tools and providing them to chemists will serve as a starting point for a growing framework of cooperation between data scientists and chemists. For chemists, this cooperation can yield valuable insights into the global chemical research landscape; for data scientists, it offers an inroad into understanding the research design process and the needs of specialists.
Natural language processing is the traditional tool for extracting this kind of information, but it is difficult to recruit experts for the necessary annotation step, which provides the labels that computers use to learn which parts of a text signify which concepts. Nevertheless, I hope that providing the above-mentioned tools and collaborating closely with chemists at ICReDD will give me an opportunity to establish a framework of mutual benefit. In addition, recently developed neural network-based tools promise to reduce the annotation burden by processing the large amounts of data available in the ever-growing literature and extracting characteristic chemical information automatically.
The Researcher’s Perspective
My PhD supervisor was Professor Hiroyuki Yoshikawa, an eminent figure in the field of General Design Theory. He always emphasized that scientific findings should benefit humankind, and he coined the term “modern evil.” Modern evils are particular design solutions that cause greater problems elsewhere, such as cheap, labor-saving plastic straws that are serious environmental pollutants. I would like to contribute to overcoming such modern evils by understanding the design process itself and then extending that understanding to the wider context.
Representative Research Achievements
- Construction of an In-House Paper/Figure Database System Using Portable Document Format Files
Masaharu Yoshioka and Shinjiro Hara, in Information Search, Integration, and Personalization: 10th International Workshop, ISIP 2018, Dimitris Kotzinos, Dominique Laurent, Nicolas Spyratos, Yuzuru Tanaka, and Rin-ichiro Taniguchi (eds), Springer-Verlag GmbH, CCIS 1040, 2019, 142-156
- Extraction of Chemical and Drug Named Entities by Ensemble Learning Using Chemical NER Tools Based on Different Extraction Guidelines
Thaer M. Dieb and Masaharu Yoshioka, Transactions on Machine Learning and Data Mining, 2015, Vol. 8, No. 2, 61-76
- Framework for Automatic Information Extraction from Research Papers on Nanocrystal Devices
Thaer M. Dieb, Masaharu Yoshioka, Shinjiroh Hara, and Marcus C. Newton, Beilstein Journal of Nanotechnology, 2015, Vol. 6, 1872-1882
- On a Combination of Probabilistic and Boolean IR Models for WWW Document Retrieval
Masaharu Yoshioka and Makoto Haraguchi, ACM Transactions on Asian Language Information Processing (TALIP), 2005, Vol. 4, No. 3, 340-356
- Physical Concept Ontology for the Knowledge Intensive Engineering Framework
Masaharu Yoshioka, Yasushi Umeda, Hideaki Takeda, Yoshiki Shimomura, Yutaka Nomaguchi, and Tetsuo Tomiyama, Advanced Engineering Informatics, 2004, Vol. 18, No. 2, 95-113