Yucheng Wang, M.Sc., B.Eng.
GitHub: https://github.com/131250208 | E-mail: wangyucheng@iie.ac.cn
I am looking for a Ph.D. position! My research interests are NLP and knowledge graphs. If you have any helpful info, please contact me! Thank you very much!
Research Interests
Natural Language Understanding, Information Extraction, Knowledge Graph, Data Mining
Education
Master of Science in Engineering (GPA: 3.71/4), University of Chinese Academy of Sciences, 2017 - 2020
- Relevant courses: Statistical Machine Learning (90), Information Retrieval (94), Deep Learning and Security (92), AI Security (93)
- IT skills: PyTorch, TensorFlow 1.0, TensorFlow 2.0, scikit-learn, Scrapy
- 2 years of research experience in IoT Lab, Institute of Information Engineering, Chinese Academy of Sciences (IIE, CAS)
- 3 papers published (1 at a top conference, COLING 2020), 2 papers submitted, 3 papers in progress
Bachelor of Engineering in Software Engineering, Nanjing University, 2013 - 2017
- Relevant courses: Software Engineering and Computing II, III (85, 90), OS, Data Structure and Algorithm Analysis, Calculus, Linear Algebra
- IT skills: Java, C++, Python, Spring Boot, MySQL, Elasticsearch, NoSQL, JavaScript, HTML/CSS/XML
- Nanjing University 2014 People’s Scholarship and Merit Student
- 2 years of entrepreneurial experience: founded a software company and received RMB 1 million in funding
- Project listed in Nanjing High-tech Entrepreneurship Leading Talents Program (10% selected out of 1800 applicants)
- Funded by Nanjing University National Funding for Mass Entrepreneurship and Innovation
Publications
Yucheng Wang, Bowen Yu, Yueyang Zhang, Tingwen Liu, Hongsong Zhu and Limin Sun. “TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair Linking”, In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020).
Bowen Yu, Zhenyu Zhang, Jiawei Sheng, Tingwen Liu, Yubin Wang, Yucheng Wang and Bin Wang. “Semi-Open Information Extraction”, Submitted.
Yucheng Wang, Hongsong Zhu, Jinfa Wang, Jie Liu, Yong Wang, Limin Sun. “XLBoost-Geo: An IP Geolocation System Based on Extreme Landmark Boosting”, arXiv, Submitted.
Yucheng Wang, Xu Wang, Hongsong Zhu, Hai Zhao, Hong Li, and Limin Sun. "ONE-Geo: Client-Independent IP Geolocation Based on Owner Name Extraction", In Proceedings of the 14th International Conference on Wireless Algorithms, Systems, and Applications, pp. 346-357. Springer, Cham, 2019 (WASA 2019).
Xu Wang, Yucheng Wang, Xuan Feng, Hongsong Zhu, Limin Sun, and Yuchi Zou. "IoTTracker: An Enhanced Engine for Discovering Internet-of-Thing Devices", In Proceedings of the 20th International Symposium on "A World of Wireless, Mobile and Multimedia Networks", pp. 1-9. IEEE, 2019 (WoWMoM 2019).
Work Experience
- Software Engineer, Nanjing Weavi Information Technology Co., Ltd, 2014 – 2016
- Cofounder, Jiangsu Kongchanmingsha Information Technology Co., Ltd, 2016 – Present
- Research Assistant, Institute of Information Engineering, CAS, 2020 – Present
Research Experience
TPLinker for Joint Extraction, Dec. 2020, Barcelona, Spain (Online)
- To present a research paper on joint extraction at COLING 2020 (top conference, held once every 2 years)
Information Extraction by Token Pair Linking, Oct. 2020, Beijing, China
- Invited by the academic committee of IIE, CAS to give a talk about utilizing Token Pair Linking on information extraction tasks
- Covered joint extraction, nested named entity recognition, and open information extraction
Open Information Extraction, Oct. 2020 - Present, Beijing, China
- Extracting facts in the form of “subject, predicate, object” from text without pre-defined ontology schema
- Working on extracting discontinuous entities and predicates
- Implementing 2 ideas, one for a sequence labeling model and one for a seq-to-seq model, and writing 2 papers
- Submitted a paper about semi-open information extraction to a top conference, WWW.
Joint Extraction, Feb. 2020 – Jul. 2020, Guilin, China
- Designed a novel tagging schema and implemented a new model using PyTorch for single-stage joint extraction of entities and relations
- Achieved state-of-the-art (SOTA) performance on NYT and WebNLG
- One regular paper was accepted by a top international conference in NLP, COLING 2020 (32.9% accepted out of 1956 reviewed)
Nested Named Entity Recognition, Mar. 2020 – Jun. 2020, Guilin, China
- Designed a novel tagging schema and developed a new model using PyTorch for nested named entity recognition
- Completed the experiments and achieved SOTA performance on GENIA; currently writing the paper
ONE-Geo, Jun. 2019, Hawaii, USA
- Presented a research paper on IP geolocation at WASA 2019 (international conference)
IoTTracker, Jun. 2019, Washington DC, USA
- Presented a research paper on IoT device detection at WoWMoM 2019 (international conference)
IP geolocation, Sep. 2018 – Feb. 2020, Beijing, China
- Improved IP geolocation model by large-scale landmark mining based on NER and network measurement
- Achieved street-level precision: 441m median error distance (MED) on PlanetLab nodes, 2561m MED on RIPE-Atlas nodes
- One regular paper was accepted by an international conference, WASA 2019 (33.8% accepted), and an extended paper was submitted
IoT Device Tracker, Jun. 2018 – Jan. 2019, Beijing, China
- Identifying IoT devices on the Internet utilizing features extracted from the response data
- One regular paper was accepted by an international conference, WoWMoM 2019 (17.8% accepted)
Other Projects
Doraemon, Feb. 2018 – May 2018, Beijing, China
- A repository of crawlers to collect data using Scrapy and Beautiful Soup, only for research.
- Collected data from popular web applications, including Weibo microblog, QQ music lyrics, and NetEase cloud music
IR System for Financial News, Sep. 2017 – Jan. 2018, Beijing, China
- Developed an information retrieval (IR) system for news using Python
- The final project of a course - Information Retrieval, taught by the chief scientist of Xiaomi Inc., Dr. Bin Wang
Ancient Books IR System, Jan. 2016 – Dec. 2017, Nanjing, China
- Developed an IR system for ancient books in Java based on Spring Boot and Elasticsearch
- Benefits thousands of researchers who previously had to access the ancient books in person
- Received RMB 1 million in venture capital funding
- Project listed in Nanjing High-tech Entrepreneurship Leading Talents Program (10% selected out of 1800 applicants)
- Funded by Nanjing University National Funding for Mass Entrepreneurship and Innovation
Cao Cao – Gaming Application, Feb. 2014 – Jun. 2014, Nanjing, China
- Developed a desktop game inspired by Dots and Boxes (a pencil-and-paper game) using Java
- Won first prize in the EL 2014 game design competition hosted by Nanjing University