Congratulations to Wang Hui on her successful graduation!

Wang Hui graduated from Yanbian University and began her master’s studies in the School of Computer Engineering and Science at Shanghai University in September 2022. After joining the research group, she studied natural language processing and its related technologies and applications under the guidance of Professor Han Yuexing, and completed the following research:

  1. To exploit the potential of large language models for entity extraction from scientific literature, a context-consistent explicit entity annotation method with a two-stage training approach is proposed. The method addresses the mismatch between the generative output of large language models and the sequence-labelling nature of the named entity recognition task. Training is divided into two phases: supervised fine-tuning and direct preference optimisation. The supervised fine-tuning phase learns basic entity recognition ability from existing annotated data. The direct preference optimisation phase constructs negative samples in two ways: it expands and contracts entity boundaries, and it screens the inference results of the supervised fine-tuned model to create category-confusion samples. The preference difference between the resulting positive and negative sample pairs then constrains the model and strengthens its ability to correct erroneous predictions.

  2. General-purpose models recognise named entities with insufficient accuracy in highly specialised domains such as materials science and biomedicine, because these domains contain a large number of low-frequency technical terms. To address this problem, a semantic fusion method based on domain language models is proposed, which deepens the semantic understanding of scientific literature by fusing different domain language models with domain word-level vectors. The effectiveness of the method is verified experimentally on complex specialised texts in materials science and biomedicine. Finally, the method is applied to specific fields, and three high-hardness alloys are designed to show its practical value in scientific text mining and in assisting R&D decision-making.
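The boundary-based negative sampling described in item 1 can be illustrated with a minimal sketch. This is not the thesis implementation: the inline `<MAT>` tag format, the one-token perturbation step, the function names, and the prompt/chosen/rejected record layout (a common convention for preference data) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# (start, end_exclusive, label) over a token list; hypothetical representation.
Span = Tuple[int, int, str]


def perturb_boundaries(span: Span, n_tokens: int) -> List[Span]:
    """Return spans whose boundaries are expanded or contracted by one token."""
    start, end, label = span
    candidates = [
        (start - 1, end, label),  # expand to the left
        (start, end + 1, label),  # expand to the right
        (start + 1, end, label),  # contract from the left
        (start, end - 1, label),  # contract from the right
    ]
    # Keep only non-empty spans that stay inside the sentence.
    return [(s, e, l) for s, e, l in candidates if 0 <= s < e <= n_tokens]


def annotate(tokens: List[str], span: Span) -> str:
    """Render the sentence with the entity marked inline (assumed tag format)."""
    start, end, label = span
    marked = (
        tokens[:start]
        + [f"<{label}>"]
        + tokens[start:end]
        + [f"</{label}>"]
        + tokens[end:]
    )
    return " ".join(marked)


def build_preference_pairs(tokens: List[str], gold: Span) -> List[Dict[str, str]]:
    """Pair the gold annotation (chosen) with each boundary-perturbed one (rejected)."""
    prompt = " ".join(tokens)
    chosen = annotate(tokens, gold)
    return [
        {"prompt": prompt, "chosen": chosen, "rejected": annotate(tokens, neg)}
        for neg in perturb_boundaries(gold, len(tokens))
    ]


if __name__ == "__main__":
    tokens = "The Ti alloy shows high hardness".split()
    for pair in build_preference_pairs(tokens, (1, 3, "MAT")):
        print(pair["rejected"])
```

Each record pairs the correctly annotated sentence with a near-miss whose entity boundary is off by one token, which is the kind of contrast a preference-optimisation phase could learn from. Category-confusion negatives would additionally require the predictions of a fine-tuned model and are not covered here.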

After graduation, Wang Hui joined Vivo Mobile Communications. During her three years of postgraduate study at Shanghai University, Wang Hui studied hard and took part in research projects. She was able to analyse complex problems quickly and propose effective solutions, showing strong independent research ability and an innovative mindset. We hope that Wang Hui will stay true to her original aspirations, overcome obstacles, and keep moving forward.

Thesis: Research and Applications of Named Entity Recognition for Scientific Literature

Code: https://github.com/han-yuexing/2025-thesis-wh-code