About
Hello, my name is Nguyen Thanh Do (Nguyễn Thành Đô). I will be joining the College of Computing and Data Science (CCDS) at Nanyang Technological University (NTU) as a Ph.D. student in Spring 2026, under the advisement of Prof. Sean Du.
I received my Bachelor's degree in Information Technology from VNU - University of Engineering and Technology, where I worked closely with Prof. Phan Xuan Hieu and Dr. Vuong Thi Hai Yen. Prior to pursuing graduate studies, I spent three years conducting NLP research and developing applications at Viettel AI. During this time, I worked under the guidance of Dr. Bui Khac Hoai Nam, with whom I maintained productive research collaborations.
My current research interests focus on reliable foundation models and their applications, specifically mitigating malicious behavior and understanding hallucination in large language models and multimodal models. Besides research, I am passionate about low-level programming and eager to deepen my understanding of how ML systems work in practice.
Education
Ph.D. in Computer Science
Nanyang Technological University | Jan. 2026
B.Sc. in Information Technology
VNU - University of Engineering and Technology | 2019 – 2023
Publications
Work Experience
Research Assistant in Artificial Intelligence
Fulbright University Vietnam, HCMC | Sep. 2025 – Jan. 2026
Doing research at the intersection of Artificial Intelligence and Computational Biology, exploring how LLMs can help accelerate progress in biology research.
Artificial Intelligence Researcher
Viettel AI, Hanoi | Nov. 2022 – Sep. 2025
Conducting applied research in semantic search, mathematical reasoning with LLMs, model merging techniques, and agentic frameworks. Contributing significantly to the development of multiple Vietnamese language models (up to 70B parameters). Building retrieval-augmented generation (RAG) applications for in-house services, incorporating document search and answer generation/extraction.
Projects
This repo implements the forward pass of FlashAttention-2, which computes attention tile by tile and never explicitly materializes the full attention score matrix, enabling scaling to much longer sequence lengths.
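For intuition, here is a minimal NumPy sketch of the tiled forward pass with online softmax that FlashAttention-2 describes; the function name, block sizes, and the pure-NumPy setting are illustrative assumptions and not the repo's actual implementation.

```python
import numpy as np

def flash_attention_forward(Q, K, V, block_q=64, block_k=64):
    """Tiled attention forward pass using online softmax.

    Only (block_q x block_k) score tiles are ever formed, so the full
    N x N attention matrix is never materialized.
    """
    N, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    O = np.zeros((N, d))

    for qs in range(0, N, block_q):
        Qi = Q[qs:qs + block_q]                   # (Br, d) query tile
        m = np.full(Qi.shape[0], -np.inf)         # running row-wise max of scores
        l = np.zeros(Qi.shape[0])                 # running softmax denominator
        Oi = np.zeros_like(Qi)                    # unnormalized output accumulator

        for ks in range(0, N, block_k):
            Kj = K[ks:ks + block_k]               # (Bc, d) key tile
            Vj = V[ks:ks + block_k]               # (Bc, d) value tile

            S = (Qi @ Kj.T) * scale               # (Br, Bc) scores for this tile only
            m_new = np.maximum(m, S.max(axis=1))  # update running max
            P = np.exp(S - m_new[:, None])        # exponentiate against the new max
            alpha = np.exp(m - m_new)             # rescale contributions of earlier tiles
            l = alpha * l + P.sum(axis=1)
            Oi = alpha[:, None] * Oi + P @ Vj
            m = m_new

        # Normalize once per query tile (FlashAttention-2 defers this division).
        O[qs:qs + block_q] = Oi / l[:, None]
    return O

# Sanity check against a naive implementation that builds the full score matrix.
rng = np.random.default_rng(0)
N, d = 256, 32
Q, K, V = rng.standard_normal((3, N, d))
S = (Q @ K.T) / np.sqrt(d)
P = np.exp(S - S.max(axis=1, keepdims=True))
reference = (P / P.sum(axis=1, keepdims=True)) @ V
assert np.allclose(flash_attention_forward(Q, K, V), reference)
```

Because each query tile only keeps a running max, a running denominator, and an accumulator, memory scales with the tile sizes rather than with the sequence length squared.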