Mike Zhang

PhD Fellow in NLP @NLPnorth, IT University of Copenhagen

& a Research Visitor @CIS, Ludwig Maximilian University of Munich

Hello there! My name is Mike Zhang. I'm a second-year PhD student at the IT University of Copenhagen (ITU) under the supervision of Prof. Barbara Plank. I am part of the NLPnorth research unit and am also indirectly affiliated with the CIS (Center for Information and Language Processing) at the Ludwig Maximilian University of Munich (LMU).

In Spring 2023, I will also be a Research Intern at WING (Web Information Retrieval & Natural Language Processing Group) at the National University of Singapore, advised by Prof. Min-Yen Kan. I will work on NLP and IR for job descriptions and related text sources.

My main focus is automated, high-quality Information Extraction from unstructured text for real-life use cases with societal impact. Specifically, I work on Skill Extraction for Job Market Analysis. My other interests include methods for obtaining more labeled training data and for making the most of models on tasks with limited data, including Active Learning, Weak Supervision, and Transfer Learning.

Publications




[7] Dennis Ulmer, Elisa Bassignana, Max Müller-Eberstein, Daniel Varab, Mike Zhang, Christian Hardmeier, and Barbara Plank. 2022. Experimental Standards for Deep Learning Research: A Natural Language Processing Perspective. To appear at the Machine Learning Evaluation Standards Workshop at ICLR 2022 (SMILES). Outstanding Paper Award. [Paper] [Repository]

[6] Mike Zhang, Kristian Nørgaard Jensen, Sif Dam Sonniks, and Barbara Plank. 2022. SkillSpan: Hard and Soft Skill Extraction from Job Postings. To appear at the 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). [Paper] [Code]

[5] Mike Zhang, Kristian Nørgaard Jensen, and Barbara Plank. 2022. Kompetencer: Fine-grained Skill Classification in Danish Job Postings via Distant Supervision and Transfer Learning. To appear at the 13th Edition of the Language Resources and Evaluation Conference (LREC). [Code]


[4] Mike Zhang and Barbara Plank. 2021. Cartography Active Learning. In Findings of the Association for Computational Linguistics: EMNLP 2021. [Paper] [Slides] [Code] [Video]

[3] Kristian Nørgaard Jensen, Mike Zhang, and Barbara Plank. 2021. De-identification of Privacy-related Entities in Job Postings. In Proceedings of the 23rd Nordic Conference of Computational Linguistics (NoDaLiDa). [Paper] [Slides] [Code] [Video]


[2] Mike Zhang and Antonio Toral. 2019. The Effect of Translationese in Machine Translation Test Sets. In Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers). [Paper] [Slides] [Code]

[1] Mike Zhang, Roy David, Leon Graumans, and Gerben Timmerman. 2019. Grunn2019 at SemEval-2019 Task 5: Shared Task on Multilingual Detection of Hate. In Proceedings of the 13th International Workshop on Semantic Evaluation. [Paper] [Code]

Teaching


IT University of Copenhagen

Spring 2021, 2022

BSSEYEP1KU, Introduction to NLP and Deep Learning (Senior TA, Lecturer)

Spring 2022

Master Thesis, Computer Science. (Supervision)

Fall 2021

Master Research Project, Computer Science. (Supervision)

Fall 2021

PhD Course, Communicating State-of-the-art NLP Research to a Broader Audience (Co-Organizer)

University of Groningen

Fall 2020

SOMINDW07, Machine Learning (Head TA)

Fall 2019

SOMINDW07, Machine Learning (Head TA)

LIX016M05, Learning from Data (Head TA)

Spring 2019

LIX017B05, Social Media (TA)


Academic Background


PhD Computer Science, IT University of Copenhagen


MA Information Science, University of Groningen


BSc Information Science, University of Groningen

Other (Professional) Experiences

01/2020 - 08/2020

Data Engineer, Dataprovider.com B.V.

09/2019 - 12/2019

Research Engineer Intern, Dataprovider.com B.V.

Full resume can be found here.




You can reach me at mikz(at)itu(dot)dk or message me on any other platform here.