Language technologies for multilingual societies need to address not only the multitude of languages, but also several phenomena that emerge at the intersection of multiple languages, such as code-mixing and script mixing. Code-mixing refers to the mixing of two or more languages in a single piece of text, such as a document or a query. Script mixing, on the other hand, is the use of the script of one language to represent another language. These phenomena emerge due to cognitive and technological factors as well as several socio-cultural ones, and are therefore extremely heterogeneous in nature. How can we build IR systems that can handle code-mixed queries and/or documents that might be written in a mixture of scripts? In this talk, I will discuss these problems, their extent and nature, and several strategies to tackle them that our team at Microsoft Research Lab India has been developing for a decade now.
Dr. Monojit Choudhury is a Principal Researcher at Microsoft Research Lab India. His research interests cover several sub-areas of Artificial Intelligence, Linguistics and Cognitive Sciences. He is well known for his pioneering work in computational processing of code-mixed languages, which is one of his current research areas. He is also actively working on NLP for low-resource, minority and endangered languages, computational sociolinguistics and explainable NLP. He is a Professor of Practice at Plaksha University, and has held adjunct faculty positions at Ashoka University and IIT Kharagpur. Dr. Choudhury has served as an editor for reputed journals in NLP and has been an area chair and (senior) PC member of major NLP and AI conferences including AAAI, ACL, EMNLP, NAACL, IJCNLP, CoNLL, and CODS-COMAD. He is also the general chair of the Panini Linguistics Olympiad and the founding chair of the Asia-Pacific Linguistics Olympiad, which are, respectively, the Indian national and an international program for encouraging high school students to explore the field of linguistics and the language diversity of the world through puzzle solving. Dr. Choudhury earned his B.Tech and PhD degrees in computer science and engineering from the Indian Institute of Technology, Kharagpur.