Dr Vu Minh Hieu Phan
Office of Engineering and Information Technology
College of Engineering and Information Technology
Eligible to supervise Masters and PhD - email supervisor to discuss availability.
Vu Minh Hieu Phan is a research fellow working on foundation models and multimodal learning for medical image analysis. His research interests include vision-language models, generative AI, and universal models that can understand multimodal inputs and perform diverse tasks. His research has been published at top-tier venues such as CVPR, ACL, EMNLP, IJCAI, WACV, MICCAI, IJCV, and Pattern Recognition. He serves as a regular reviewer for TPAMI, IJCV, TCSVT, NeurIPS, and CVPR.
My research involves developing deep learning models and large language models for multimodal learning, computer vision, natural language processing, and medical image analysis.
My research areas include:
- Multi-modal Large Language Models.
- Visual foundation models for zero-shot and few-shot learning in medical imaging.
- Vision-language modeling for medical image classification and segmentation.
- Generative models for image synthesis.
- Continual learning of semantic segmentation.
- Knowledge distillation for efficient deep learning.
| Date | Position | Institution name |
|---|---|---|
| 2022 - ongoing | Research Fellow | Australian Institute of Machine Learning |
| Date | Institution name | Country | Title |
|---|---|---|---|
| | University of Wollongong | Australia | Doctor of Philosophy |
| Year | Citation |
|---|---|
| 2025 | Ge, J., Zhang, B., Liu, A., Phan, V. M. H., Chen, Q., Shu, Y., & Zhao, Y. (2025). CIT: Rethinking class-incremental semantic segmentation with a Class Independent Transformation. Pattern Recognition, 167, 111707. |
| 2023 | Zhang, B., Liu, L., Phan, M. H., Tian, Z., Shen, C., & Liu, Y. (2023). SegViT v2: Exploring Efficient and Continual Semantic Segmentation with Plain Vision Transformers. International Journal of Computer Vision, 132(4), 1126-1147. Scopus30 WoS25 |
| 2022 | Phan, M. H., Phung, S. L., Luu, K., & Bouzerdoum, A. (2022). Efficient hyperspectral image segmentation for biosecurity scanning using knowledge distillation from multi-head teacher. Neurocomputing, 504, 189-203. Scopus12 WoS9 |
| 2021 | Phan, M. H., Nguyen, Q., Phung, S. L., Zhang, W. E., Vo, T. D., & Sheng, Q. Z. (2021). CompactNet: A Light-Weight Deep Learning Framework for Smart Intrusive Load Monitoring. IEEE Sensors Journal, 21(22), 25181-25189. Scopus7 WoS5 |
| Year | Citation |
|---|---|
| 2026 | Liu, Y., Verjans, J., Phan, V. M. H., & Liao, Z. (2026). CA-Seg: An Attribute-Based Medical Image Segmentation Framework for Unified Out-of-Distribution Medical Image Segmentation. In S. Ali, D. C. Hogg, & M. Peckham (Eds.), Lecture Notes in Computer Science (Vol. 15918 LNCS, pp. 17-31). SPRINGER INTERNATIONAL PUBLISHING AG. DOI |
| 2026 | Pham, C. M., Nguyen, P. L., Nguyen, T. T., Phan, V. M. H., & Nguyen, B. P. (2026). Unleashing SAM for Few-Shot Medical Image Segmentation with Dual-Encoder and Automated Prompting. In J. C. Gee, D. C. Alexander, J. Hong, J. E. Iglesias, C. H. Sudre, A. Venkataraman, . . . J. Park (Eds.), Lecture Notes in Computer Science (Vol. 15965 LNCS, pp. 670-680). SPRINGER INTERNATIONAL PUBLISHING AG. DOI |
| 2025 | Ge, J., Zhang, Z., Phan, V. M. H., Zhang, B., Liu, A., Zhao, Y., & Zhao, S. (2025). ESA: Annotation-Efficient Active Learning for Semantic Segmentation. In Communications in Computer and Information Science (Vol. 2574 CCIS, pp. 141-152). Springer Nature Singapore. DOI |
| Year | Citation |
|---|---|
| 2025 | Nguyen, D., Ho, M. K., Ta, H., Nguyen, T. T., Chen, Q., Rav, K., . . . Phan, V. M. H. (2025). Localizing Before Answering: A Benchmark for Grounded Medical Visual Question Answering. In J. Kwok (Ed.), IJCAI International Joint Conference on Artificial Intelligence (pp. 7670-7678). Montreal, Canada: ASSOC COMPUTATIONAL LINGUISTICS-ACL. DOI |
| 2025 | Qi, X., Zhang, Z., Handoko, A. B., Zheng, H., Chen, M., Huy, T. D., . . . To, M. S. (2025). ProjectedEx: Enhancing Generation in Explainable AI for Prostate Cancer. In Proceedings IEEE Symposium on Computer Based Medical Systems (pp. 623-629). Madrid, Spain: IEEE. DOI |
| 2025 | Huy, T. D., Tran, S. K., Nguyen, P., Tran, N. H., Sam, T. B., Hengel, A. V. D., . . . Phan, V. M. H. (2025). Interactive Medical Image Analysis with Concept-based Similarity Reasoning. In CVPR (pp. 30797-30806). IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR): Computer Vision Foundation / IEEE. |
| 2024 | Chowdhury, T. F., Liao, K., Phan, V. M. H., To, M. -S., Xie, Y., Hung, K., . . . Liao, Z. (2024). CAPE: CAM as a Probabilistic Ensemble for Enhanced DNN Interpretation. In CVPR (pp. 11072-11081). Seattle, WA, USA: IEEE. |
| 2024 | Phan, V. M. H., Xie, Y., Qi, Y., Liu, L., Liu, L., Zhang, B., . . . Verjans, J. W. (2024). Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024) (pp. 11492-11501). Seattle, WA, USA: Institute of Electrical and Electronics Engineers (IEEE). DOI Scopus19 WoS12 |
| 2024 | Yuan, J., Phan, M. H., Liu, L., & Liu, Y. (2024). FAKD: Feature Augmented Knowledge Distillation for Semantic Segmentation. In Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024 (pp. 584-594). Online: IEEE. DOI Scopus21 |
| 2024 | Liu, L., Wang, Z., Phan, M. H., Zhang, B., Ge, J., & Liu, Y. (2024). BPKD: Boundary Privileged Knowledge Distillation for Semantic Segmentation. In Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024 (pp. 1051-1061). Waikoloa, HI, USA: IEEE COMPUTER SOC. DOI Scopus22 WoS10 |
| 2024 | Phan, V. M. H., Xie, Y., Zhang, B., Qi, Y., Liao, Z., Perperidis, A., . . . To, M. -S. (2024). Structural Attention: Rethinking Transformer for Unpaired Medical Image Synthesis. In M. G. Linguraru, Q. Dou, A. Feragen, S. Giannarou, B. Glocker, K. Lekadir, & J. A. Schnabel (Eds.), MICCAI (7) Vol. 15007 (pp. 690-700). Marrakesh, Morocco: Springer. |
| 2024 | Chowdhury, T. F., Phan, V. M. H., Liao, K., To, M. -S., Xie, Y., Hengel, A. V. D., . . . Liao, Z. (2024). AdaCBM: An Adaptive Concept Bottleneck Model for Explainable and Accurate Diagnosis. In M. G. Linguraru, Q. Dou, A. Feragen, S. Giannarou, B. Glocker, K. Lekadir, & J. A. Schnabel (Eds.), MICCAI (10) Vol. 15010 (pp. 35-45). Marrakesh, Morocco: Springer. |
| 2024 | Phan, V. M. H., Xie, Y., Zhang, B., Qi, Y., Liao, Z., Perperidis, A., . . . To, M. -S. (2024). Structural Attention: Rethinking Transformer for Unpaired Medical Image Synthesis. In Lecture Notes in Computer Science Vol. 15007 (pp. 690-700). Marrakesh, Morocco: Springer. DOI Scopus19 WoS16 |
| 2024 | Nguyen, T. D., Huynh, T. T., Phan, M. H., Nguyen, Q. V. H., & Le Nguyen, P. (2024). CARER - ClinicAl Reasoning-Enhanced Representation for Temporal Health Risk Prediction. In EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (pp. 10392-10407). Miami, Florida: Association for Computational Linguistics. DOI Scopus2 |
| 2023 | Phan, V. M. H., Liao, Z., Verjans, J. W., & To, M. -S. (2023). Structure-Preserving Synthesis: MaskGAN for Unpaired MR-CT Translation. In Lecture Notes in Computer Science Vol. 14229 LNCS (pp. 56-65). Vancouver, BC, Canada: Springer Nature Switzerland. DOI Scopus17 WoS14 |
| 2022 | Phan, M. H., Ta, T. -A., Phung, S. L., Tran-Thanh, L., & Bouzerdoum, A. (2022). Class Similarity Weighted Knowledge Distillation for Continual Semantic Segmentation. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Vol. 2022-June (pp. 16845-16854). Online: IEEE. DOI Scopus61 WoS54 |
| 2020 | Nguyen, V. K., Sheng, Q. Z., Mahmood, A., Zhang, W. E., Phan, M. H., & Vo, T. D. (2020). Demo abstract: an internet of plants system for micro gardens. In Proceedings of the 19th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN 2020) (pp. 355-356). Online: IEEE. DOI Scopus6 WoS3 |
| 2020 | Nguyen, V. K., Phan, M. H., Zhang, W. E., Sheng, Q. Z., & Vo, T. D. (2020). A hybrid approach for intrusive appliance load monitoring in smart home. In Proceedings of the IEEE International Conference on Smart Internet of Things (SmartIoT 2020) (pp. 154-160). Online: IEEE. DOI Scopus3 |
| 2020 | Phan, M. H., Phung, S. L., & Bouzerdoum, A. (2020). Ordinal depth classification using region-based self-attention. In Proceedings of ICPR 2020 25th International Conference on Pattern Recognition (pp. 3620-3627). New York, NY, USA: IEEE. DOI Scopus2 WoS2 |
| 2020 | Phan, M. H., & Ogunbona, P. O. (2020). Modelling Context and Syntactical Features for Aspect-based Sentiment Analysis. In D. Jurafsky, J. Chai, N. Schluter, & J. Tetreault (Eds.), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 3211-3220). Stroudsburg PA, USA: Association for Computational Linguistics. DOI Scopus243 WoS167 |
| Year | Citation |
|---|---|
| 2025 | Huy, T. D., Tran, S. K., Nguyen, P., Tran, N. H., Sam, T. B., Hengel, A. V. D., . . . Phan, V. M. H. (2025). Interactive Medical Image Analysis with Concept-based Similarity Reasoning. |
| 2025 | Nguyen, D., Ho, M. K., Ta, H., Nguyen, T. T., Chen, Q., Rav, K., . . . Phan, V. M. H. (2025). Localizing Before Answering: A Hallucination Evaluation Benchmark for Grounded Medical Multimodal LLMs. |
| 2025 | Huy, T. D., Huynh, D. A., Xie, Y., Qi, Y., Chen, Q., Nguyen, P. L., . . . Phan, V. M. H. (2025). Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding. |
| 2024 | Phan, V. M. H., Xie, Y., Qi, Y., Liu, L., Liu, L., Zhang, B., . . . Verjans, J. W. (2024). Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework. |
| Date | Role | Research Topic | Program | Degree Type | Student Load | Student Name |
|---|---|---|---|---|---|---|
| 2024 | Co-Supervisor | Towards Explainable AI in Medical Imaging: Bridging the Gap between Deep Learning and Radiologist Trust with Human Interactions | Doctor of Philosophy | Doctorate | Full Time | Mr Huy Ta |
| 2023 | Co-Supervisor | Explainable and Semantically Meaningful Deep Learning Models for Medical Risk Prediction and Diagnostics | Doctor of Philosophy | Doctorate | Full Time | Mr Townim Faisal Chowdhury |