Gengze Zhou
Higher Degree by Research Candidate
PhD Candidate
My research is dedicated to creating explainable and embodied AI systems that can interact dynamically with both humans and their environments. I aim to build an autonomous agent that can understand, reason, and navigate the physical world, while seamlessly communicating with humans in natural language. By integrating machine learning with visual and linguistic applications, I strive to enhance the transparency and interpretability of AI decision-making, fostering more natural and effective human-AI interactions.
Some topics that I currently focus on:
| Year | Citation |
|---|---|
| 2023 | Zhou, G., Hong, Y., & Wu, Q. (2023). NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models. |
| Year | Citation |
|---|---|
| 2025 | Zhou, G., Hong, Y., Wang, Z., Wang, X. E., & Wu, Q. (2025). NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models. In Lecture Notes in Computer Science, Vol. 15065 (pp. 260-278). Milan, Italy: Springer Nature Switzerland. (Scopus: 17, WoS: 3) |
| 2024 | Chen, Q., Pitawela, D., Zhao, C., Zhou, G., Chen, H.-T., & Wu, Q. (2024). WebVLN: Vision-and-Language Navigation on Websites. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 38 (pp. 1165-1173). Online: Association for the Advancement of Artificial Intelligence (AAAI). (Scopus: 8, WoS: 2) |
| 2024 | Zhou, G., Hong, Y., & Wu, Q. (2024). NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 38 (pp. 7641-7649). Online: Association for the Advancement of Artificial Intelligence (AAAI). (Scopus: 92, WoS: 43) |
| Year | Citation |
|---|---|
| 2024 | Zhou, G., Hong, Y., Wang, Z., Wang, X. E., & Wu, Q. (2024). NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models. |
| 2024 | Zhou, G., Hong, Y., Wang, Z., Zhao, C., Bansal, M., & Wu, Q. (2024). SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts. |
| 2023 | Chen, Q., Pitawela, D., Zhao, C., Zhou, G., Chen, H.-T., & Wu, Q. (2023). WebVLN: Vision-and-Language Navigation on Websites. |