Mr Gengze Zhou

Higher Degree by Research Candidate

School of Computer Science and Information Technology

College of Engineering and Information Technology

My research is dedicated to creating explainable and embodied AI systems that can interact dynamically with both humans and their environments. I aim to build an autonomous agent that can understand, reason, and navigate the physical world, while seamlessly communicating with humans in natural language. By integrating machine learning with visual and linguistic applications, I strive to enhance the transparency and interpretability of AI decision-making, fostering more natural and effective human-AI interactions.

Some topics that I currently focus on:

Self-Explainable and Communicative Vision-and-Language Navigation (VLN) with Language Models: NavGPT, NavGPT-2
Sim2Real Transfer for VLN with Large Vision-Language Models: NaVid

Journals

Year	Citation
2023	Zhou, G., Hong, Y., & Wu, Q. (2023). NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models.

Conference Papers

Year	Citation
2025	Zhou, G., Hong, Y., Wang, Z., Wang, X. E., & Wu, Q. (2025). NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models. In Lecture Notes in Computer Science Vol. 15065 (pp. 260-278). Milan, Italy: Springer Nature Switzerland. (Scopus: 32; WoS: 16)
2024	Chen, Q., Pitawela, D., Zhao, C., Zhou, G., Chen, H. T., & Wu, Q. (2024). WebVLN: Vision-and-Language Navigation on Websites. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 38 (pp. 1165-1173). Online: Association for the Advancement of Artificial Intelligence (AAAI). (Scopus: 11; WoS: 4)
2024	Zhou, G., Hong, Y., & Wu, Q. (2024). NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 38 (pp. 7641-7649). Online: Association for the Advancement of Artificial Intelligence (AAAI). (Scopus: 128; WoS: 89)

Preprint

Year	Citation
2024	Zhou, G., Hong, Y., Wang, Z., Wang, X. E., & Wu, Q. (2024). NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models.
2024	Zhou, G., Hong, Y., Wang, Z., Zhao, C., Bansal, M., & Wu, Q. (2024). SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts.
2023	Chen, Q., Pitawela, D., Zhao, C., Zhou, G., Chen, H.-T., & Wu, Q. (2023). WebVLN: Vision-and-Language Navigation on Websites.

Email: gengze.zhou@adelaide.edu.au
