APrf Qi Wu

Associate Professor

School of Computer Science and Information Technology

College of Engineering and Information Technology

Eligible to supervise Masters and PhD, but is currently at capacity - email supervisor to discuss availability.

Dr Qi Wu is currently an Associate Professor at the University of Adelaide. He was previously an ARC Senior Research Associate in the Australian Centre for Robotic Vision (ACRV) at the University of Adelaide, Australia, and before that he worked as a Postdoctoral Researcher in the Australian Centre for Visual Technologies (ACVT). He received an MSc in Global Computing and Media Technology and a PhD in Computer Science from the University of Bath (United Kingdom) in 2011 and 2015, respectively. His research interests include cross-depictive style object modelling, object detection and Vision-to-Language problems. He is especially interested in Image Captioning and Visual Question Answering (VQA). In 2015, his image captioning model and VQA model achieved leading performance in the Microsoft COCO Image Captioning Challenge and the VQA Challenge. His work has been published in prestigious journals and conferences such as TPAMI, CVPR, ICCV and ECCV.

My research interests are mainly in computer vision and machine learning. My previous research projects include modeling visual objects regardless of depictive style and image understanding using contextual cues. I am currently leading a small team at the University of Adelaide researching Vision-and-Language.

I have been in the computer vision field for nearly 10 years and have a strong track record in it. Currently, I am working on vision-to-language problems, with particular expertise in image captioning and visual question answering (VQA). In 2015, my image captioning model and VQA model achieved leading performance in the Microsoft COCO Image Captioning Challenge and the VQA Challenge. I have published several papers in top journals such as IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), IEEE Signal Processing Magazine (SPM) and Computer Vision and Image Understanding (CVIU). I have also published several papers at top conferences, such as the International Joint Conference on Artificial Intelligence (IJCAI), AAAI, the Conference on Computer Vision and Pattern Recognition (CVPR) and the European Conference on Computer Vision (ECCV).

Date	Position	Institution name
2023 - ongoing	Associate Professor	University of Adelaide
2018 - 2022	Senior Lecturer	University of Adelaide, Adelaide
2017 - 2018	ARC Senior Research Associate	Australian Centre for Robotic Vision, University of Adelaide
2015 - 2017	Senior Research Associate	University of Adelaide
2014 - ongoing	Research Intern	Lenovo
2011 - 2015	PhD	University of Bath

Language	Competency
Chinese (Mandarin)	Can read, write, speak, understand spoken and peer review
English	Can read, write, speak, understand spoken and peer review

Date	Institution name	Country	Title
2011 - 2015	University of Bath	United Kingdom	PhD
2010 - 2011	University of Bath	United Kingdom	MSc
2006 - 2010	China Jiliang University	China	BSc

Year	Citation
2026	Zheng, S., Zhao, P., Huang, Q., Cai, Y., Cheng, H., & Wu, Q. (2026). Implement Referring Expression Comprehension by Extending Auto-focus Lens to Locked Vision Model. ACM Transactions on Multimedia Computing Communications and Applications, 22(2), 24 pages. DOI
2026	Mohammadi, B., Abbasnejad, E., Qi, Y., Wu, Q., Van Den Hengel, A., & Shi, J. Q. (2026). Parameter-efficient action planning with large language models for vision-and-language navigation. Pattern Recognition, 172, 11 pages. DOI Scopus1 WoS2
2026	He, K., Huang, Y., Jing, Y., Wu, Q., & Wang, L. (2026). Fine-Grained Alignment Supervision Matters in Vision-and-Language Navigation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1-16. DOI
2026	Suo, W., Ma, J., Sun, M., Zhang, H., Wang, P., Zhang, Y., & Wu, Q. (2026). Semi-Supervised VQA Multi-Modal Explanation via Self-Critical Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1-18. DOI
2025	Yang, D., Zhang, S., Xu, X., Wu, Q., Fan, W., Zhang, L., . . . Wang, F. (2025). Yield Estimation of Longline Aquaculture by the Shadows of Buoys Based on UAV Orthophoto Image. DRONES, 9(11), 21 pages. DOI
2025	Wang, C., Xie, Y., Chen, Q., Zhou, Y., & Wu, Q. (2025). A Comprehensive Analysis of Mamba for 3D Volumetric Medical Image Segmentation. Pattern Recognition, 173, 112701. DOI
2025	Tian, X., Yang, Y. L., & Wu, Q. (2025). Script-to-storyboard: A new contextual retrieval dataset and benchmark. Computational Visual Media, 11(1), 103-122. DOI Scopus1
2025	Li, L., Cong, G., Qi, Y., Zha, Z. J., Wu, Q., Sheng, Q. Z., . . . Yang, M. H. (2025). Dubbing Movies via Hierarchical Phoneme Modeling and Acoustic Diffusion Denoising. IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(11), 1-17. DOI Scopus3 WoS1
2025	Wen, Z., Tan, M., Wang, Y., Wu, Q., & Wu, Q. (2025). Enhanced Reasoning via Multimodal LLMs and Collaborative Inference. IEEE Transactions on Multimedia, 27, 1-14. DOI Scopus1
2025	Tan, M., Chen, Q., Huang, Z., Wu, Q., Li, Y., & Zhou, J. (2025). Auto-3D-house Design from Structured User Requirements. MACHINE INTELLIGENCE RESEARCH, 22(2), 18 pages. DOI
2025	Zhang, J., Chen, X., Yang, B., Guan, Q., Chen, Q., Chen, J., . . . Xia, Y. (2025). Advances in attention mechanisms for medical image segmentation. Computer Science Review, 56, 18 pages. DOI Scopus31 WoS23
2025	Yuan, Y., Sun, B., Zeng, J., Wu, Q., Liu, J., Jiang, D., & Qin, F. (2025). 6G Network Architecture: QoS Paradigms and Data Lifecycle Management for Next-Generation Networks. IEEE COMMUNICATIONS MAGAZINE, 63(8), 16-22. DOI
2024	Zhang, Y., Ma, Z., Li, J., Qiao, Y., Wang, Z., Chai, J., . . . Kordjamshidi, P. (2024). Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models. Transactions on Machine Learning Research, 2024. Scopus3
2024	Chen, Q., Zhao, R., Wang, S., Phan, V. M. H., Hengel, A. V. D., Verjans, J., . . . Wu, Q. (2024). A Survey of Medical Vision-and-Language Applications and Their Techniques. CoRR, abs/2411.12195.
2024	Sun, M., Suo, W., Wang, P., Niu, K., Liu, L., Lin, G., . . . Wu, Q. (2024). An Adaptive Correlation Filtering Method for Text-Based Person Search. International Journal of Computer Vision, 132(10), 4440-4455. DOI Scopus8 WoS8
2024	Xie, Y., Zhang, J., Xia, Y., & Wu, Q. (2024). UniMiSS+: Universal Medical Self-Supervised Learning From Cross-Dimensional Unpaired Data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(12), 10021-10035. DOI Scopus8 WoS5 Europe PMC1
2024	Xie, Y., Gu, L., Harada, T., Zhang, J., Xia, Y., & Wu, Q. (2024). Rethinking masked image modeling for medical image representation. Medical Image Analysis, 98, 103304. DOI Scopus15 WoS12 Europe PMC7
2024	Ding, N., Deng, C., Tan, M., Du, Q., Ge, Z., & Wu, Q. (2024). Image Captioning With Controllable and Adaptive Length Levels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 764(779), 1-16. DOI Scopus15 WoS12 Europe PMC1
2024	Gao, C., Liu, S., Chen, J., Wang, L., Wu, Q., Li, B., & Tian, Q. (2024). Room-Object Entity Prompting and Reasoning for Embodied Referring Expression. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(2), 994-1010. DOI Scopus18 WoS16 Europe PMC3
2024	Wen, Z., Niu, S., Li, G., Wu, Q., Tan, M., & Wu, Q. (2024). Test-Time Model Adaptation for Visual Question Answering with Debiased Self-Supervisions. IEEE Transactions on Multimedia, 26, 2137-2147. DOI Scopus11 WoS8
2023	Lin, Z., Zhang, D., Tao, Q., Shi, D., Haffari, G., Wu, Q., . . . Ge, Z. (2023). Medical visual question answering: A survey. Artificial Intelligence in Medicine, 143, 102611. DOI Scopus131 WoS86 Europe PMC28
2023	Zhou, G., Hong, Y., & Wu, Q. (2023). NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models.
2023	Wang, Z., Byrnes, O., Wang, H., Sun, R., Ma, C., Chen, H., . . . Xue, M. (2023). Data Hiding With Deep Learning: A Survey Unifying Digital Watermarking and Steganography. IEEE Transactions on Computational Social Systems, 10(6), 1-15. DOI Scopus79 WoS49
2023	Li, H., Huang, J., Jin, P., Song, G., Wu, Q., & Chen, J. (2023). Weakly-Supervised 3D Spatial Reasoning for Text-based Visual Question Answering. IEEE Transactions on Image Processing, 32, 3367-3382. DOI Scopus22 WoS18 Europe PMC1
2023	Tan, M., Wen, Z., Fang, L., & Wu, Q. (2023). Transformer-Based Relational Inference Network for Complex Visual Relational Reasoning. ACM Transactions on Multimedia Computing, Communications, and Applications, 20(1), 1-23. DOI Scopus7 WoS4
2023	Shi, X., Qiao, Y., Wu, Q., Liu, L., & Dayoub, F. (2023). Improving Online Source-free Domain Adaptation for Object Detection by Unsupervised Data Acquisition.
2023	He, M., Du, W., Wen, Z., Du, Q., Xie, Y., & Wu, Q. (2023). Multi-Granularity Aggregation Transformer for Joint Video-Audio-Text Representation Learning. IEEE Transactions on Circuits and Systems for Video Technology, 33(6), 2990-3002. DOI Scopus10 WoS10
2023	Qiao, Y., Qi, Y., Hong, Y., Yu, Z., Wang, P., & Wu, Q. (2023). HOP+: History-Enhanced and Order-Aware Pre-Training for Vision-and-Language Navigation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(7), 8524-8537. DOI Scopus55 WoS26 Europe PMC6
2023	Liu, D., Chen, Z., Huang, Z., Wu, Q., Song, Y., Yao, J., . . . Fang, G. (2023). In Situ Surface Modification Enables High Stability and Optoelectrical Performance for a Self-powered Photodetector. ADVANCED OPTICAL MATERIALS, 11(22), 10 pages. DOI WoS39
2022	Xun, L., Zhang, H., Yan, Q., Wu, Q., & Zhang, J. (2022). VISOR-NET: Visibility Estimation Based on Deep Ordinal Relative Learning under Discrete-Level Labels. SENSORS, 22(16), 20 pages. DOI WoS13
2022	Li, Y., Wu, Q., Lai, M., Zhao, J., Liu, Y., Fan, Y., . . . Liu, B. (2022). Influence of chemical disorder on mechanical and thermal properties of multi-component rare earth zirconate pyrochlores (nRE₁/ₙ)₂Zr₂O₇. JOURNAL OF APPLIED PHYSICS, 132(7), 11 pages. DOI WoS15
2022	Ji, G., Chen, C., Zhou, M., Wen, W., Wang, C., Tang, J., . . . Feng, Z. (2022). Post-COVID-19 fatigue among COVID-19 in patients discharged from hospital: A meta-analysis. JOURNAL OF INFECTION, 84(5), 731-733. DOI WoS6
2022	Wu, Y., Feng, T., Shen, Y., Fu, F., Meng, N., Li, X., . . . Wang, M. (2022). Total-body parametric imaging using the Patlak model: Feasibility of reduced scan time. MEDICAL PHYSICS, 49(7), 4529-4539. DOI WoS23
2022	Ling, L., Wu, Q., Huang, K., Wang, Y., & Wang, C. (2022). A Lightweight Bearing Fault Diagnosis Method Based on Multi-Channel Depthwise Separable Convolutional Neural Network. Electronics (Switzerland), 11(24), 21 pages. DOI Scopus18 WoS16
2022	Manchin, A., Sherrah, J., Wu, Q., & van den Hengel, A. (2022). Program Generation from Diverse Video Demonstrations. BMVC 2022 - 33rd British Machine Vision Conference Proceedings.
2022	Suo, W., Sun, M., Wang, P., Zhang, Y., & Wu, Q. (2022). Rethinking and Improving Feature Pyramids for One-stage Referring Expression Comprehension. IEEE Transactions on Image Processing, 32, 854-864. DOI Scopus16 WoS17 Europe PMC1
2022	Sun, M., Suo, W., Wang, P., Zhang, Y., & Wu, Q. (2022). A proposal-free one-stage framework for referring expression comprehension and generation via dense cross-attention. IEEE Transactions on Multimedia, 25, 2446-2458. DOI Scopus45 WoS40
2022	Deng, C., Wu, Q., Wu, Q., Hu, F., Lyu, F., & Tan, M. (2022). Visual Grounding Via Accumulated Attention. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(3), 1670-1684. DOI Scopus13 WoS13 Europe PMC1
2022	Parvaneh, A., Abbasnejad, E., Wu, Q., Shi, Q., & Van Den Hengel, A. (2022). Show, price and negotiate: a negotiator with online value look-ahead. IEEE Transactions on Multimedia, 24, 1426-1434. DOI Scopus2 WoS1
2022	Sun, Z., Liu, H., Wang, Q., Zhou, T., Wu, Q., & Tang, Z. (2022). Co-LDL: A Co-training-based Label Distribution Learning Method for Tackling Label Noise. IEEE Transactions on Multimedia, 24, 1093-1104. DOI Scopus42 WoS39
2021	Yu, J., Jiang, X., Qin, Z., Zhang, W., Hu, Y., & Wu, Q. (2021). Learning Dual Encoding Model for Adaptive Visual Understanding in Visual Dialogue. IEEE TRANSACTIONS ON IMAGE PROCESSING, 30, 220-233. DOI Scopus34 WoS29 Europe PMC5
2021	Wang, Y., Qi, Y., Yao, H., Gong, D., & Wu, Q. (2021). Image editing with varying intensities of processing. Computer Vision and Image Understanding, 211, 1-13. DOI Scopus4 WoS4
2021	Zhang, W., Ma, C., Wu, Q., & Yang, X. (2021). Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning. IEEE Transactions on Circuits and Systems for Video Technology, 31(9), 3469-3481. DOI Scopus54 WoS48
2021	Wang, H., Chen, H., Wu, Q., Ma, C., & Li, Y. (2021). Multi-Intersection Traffic Optimisation: A Benchmark Dataset and a Strong Baseline. IEEE Open Journal of Intelligent Transportation Systems, 3, 126-136. DOI Scopus16 WoS16
2021	Zhang, C., Wang, Q., Xie, G., Wu, Q., Shen, F., & Tang, Z. (2021). Robust Learning from Noisy Web Images via Data Purification for Fine-Grained Recognition. IEEE Transactions on Multimedia, 24, 1. DOI Scopus13 WoS13
2020	Gao, C., Zhu, Q., Wang, P., Li, H., Liu, Y., Van den Hengel, A., & Wu, Q. (2020). Structured Multimodal Attentions for TextVQA. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(8), 1. DOI Scopus36 WoS44 Europe PMC3
2020	Chen, Q., Wu, Q., Chen, J., Wu, Q., Van Den Hengel, A., & Tan, M. (2020). Scripted Video Generation with a Bottom-Up Generative Adversarial Network. IEEE Transactions on Image Processing, 29, 7454-7467. DOI Scopus33 WoS16
2020	Qiao, Y., Deng, C., & Wu, Q. (2020). Referring expression comprehension: a survey of methods and datasets. IEEE Transactions on Multimedia, 23, 4426-4440. DOI Scopus80 WoS68
2020	Yu, J., Zhang, W., Lu, Y., Qin, Z., Hu, Y., Tan, J., & Wu, Q. (2020). Reasoning on the Relation: Enhancing Visual Representation for Visual Question Answering and Cross-Modal Retrieval. IEEE Transactions on Multimedia, 22(12), 3196-3209. DOI Scopus96 WoS86
2020	Huang, Y., Wu, Q., Wang, W., & Wang, L. (2020). Image and Sentence Matching via Semantic Concepts and Order Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(3), 636-650. DOI Scopus39 WoS32 Europe PMC6
2020	Liu, X., Dai, P., Gu, T., Wu, Q., Wei, H., Liu, S., . . . Zhao, Q. (2020). Cyclometalated iridium(III) complexes containing an anthracene unit for sensing and imaging singlet oxygen in cellular mitochondria. JOURNAL OF INORGANIC BIOCHEMISTRY, 209, 10 pages. DOI WoS17
2020	Zhou, S., Wang, S., Wu, Q., Azim, R., & Li, W. (2020). Predicting potential miRNA-disease associations by combining gradient boosting decision tree with logistic regression. COMPUTATIONAL BIOLOGY AND CHEMISTRY, 85, 8 pages. DOI WoS90
2019	Yang, J., Wang, M., Zhang, Y., Jia, X., Chen, Y., Liu, T., . . . Xiao, H. (2019). Rapid Preparation of Oxidized Starch with High Carbonyl Contents Using NaBrO as Oxidizer. STARCH-STARKE, 71(9-10), 9 pages. DOI WoS10
2019	Xu, J. -L., Stutzki, J., Wu, Y., Guan, X., Wang, J. -J., Miller, M., . . . Wu, Q. (2019). Probing star formation and feedback using CCOSMA and archival data in the CFG028.68-0.28 quasi-sinusoidal filament. RESEARCH IN ASTRONOMY AND ASTROPHYSICS, 19(12), 13 pages. DOI WoS2
2019	Xiao, J., Ding, W., Peng, Y., Wu, Q., Chen, Z., Wang, Z., . . . Peng, T. (2019). UPGRADING IRON AND REMOVING PHOSPHORUS OF HIGH PHOSPHORUS OOLITIC IRON ORE BY SEGREGATION ROASTING WITH CALCIUM CHLORIDE AND CALCIUM HYPOCHLORITE. JOURNAL OF MINING AND METALLURGY SECTION B-METALLURGY, 55(3), 305-314. DOI WoS14
2019	Li, K. -P., Yuan, M., He, Z. -R., Wu, Q., Zhang, C. -M., Lei, Z. -L., . . . Guo, J. (2019). Omics Insights into Metabolic Stress and Resilience of Rats in Response to Short-term Fructose Overfeeding. MOLECULAR NUTRITION & FOOD RESEARCH, 63(23), 14 pages. DOI WoS12
2019	Tang, T., Duan, X., Zhou, Z., & Wu, Q. (2019). Scatter Correction Based on Beam Stop Array for Cone-Beam Micro-Computed Tomography. ACTA OPTICA SINICA, 39(8), 11 pages. DOI WoS1
2019	Liu, W., Li, Y., & Wu, Q. (2019). An Attribute-Based High-Level Image Representation for Scene Classification. IEEE Access, 7, 4629-4640. DOI Scopus5 WoS2
2019	Lyu, F., Wu, Q., Hu, F., Wu, Q., & Tan, M. (2019). Attend and Imagine: Multi-Label Image Classification with Visual Attention and Recurrent Neural Networks. IEEE Transactions on Multimedia, 21(8), 1971-1981. DOI Scopus68 WoS57
2019	Zhang, J., Wu, Q., Zhang, J., Shen, C., Lu, J., & Wu, Q. (2019). Heritage image annotation via collective knowledge. Pattern Recognition, 93, 204-214. DOI Scopus9 WoS8
2019	Zhang, J., Xie, Y., Wu, Q., & Xia, Y. (2019). Medical image classification using synergic deep learning. Medical Image Analysis, 54, 10-19. DOI Scopus371 WoS273 Europe PMC112
2018	Wu, Q., Shen, C., Wang, P., Dick, A., & van den Hengel, A. (2018). Image captioning and visual question answering based on attributes and external knowledge. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(6), 1367-1381. DOI Scopus356 WoS266 Europe PMC32
2018	Zhang, J., Wu, Q., Shen, C., Zhang, J., & Lu, J. (2018). Multilabel image classification with regional latent semantic dependencies. IEEE Transactions on Multimedia, 20(10), 2801-2813. DOI Scopus180 WoS95
2018	Hu, L., Zhu, Q., Wu, Q., Li, D., An, Z., & Xu, B. (2018). Natural Biomass-Derived Hierarchical Porous Carbon Synthesized by an in Situ Hard Template Coupled with NaOH Activation for Ultrahigh Rate Supercapacitors. ACS SUSTAINABLE CHEMISTRY & ENGINEERING, 6(11), 13949-13959. DOI WoS148
2018	Sun, P., Wu, Q., Sun, X., Miao, H., Deng, W., Zhang, W., . . . Huang, W. (2018). J-Aggregate squaraine nanoparticles with bright NIR-II fluorescence for imaging guided photothermal therapy. CHEMICAL COMMUNICATIONS, 54(95), 13395-13398. DOI WoS156
2018	Zhang, K. Y., Zhang, T., Wei, H., Wu, Q., Liu, S., Zhao, Q., & Huang, W. (2018). Phosphorescent iridium(III) complexes capable of imaging and distinguishing between exogenous and endogenous analytes in living cells. CHEMICAL SCIENCE, 9(36), 7236-7240. DOI WoS51
2018	Wu, Q., Ma, H., Ling, K., Gan, N., Cheng, Z., Gu, L., . . . Huang, W. (2018). Reversible Ultralong Organic Phosphorescence for Visual and Selective Chloroform Detection. ACS APPLIED MATERIALS & INTERFACES, 10(39), 33730-33736. DOI WoS84
2018	Lu, X., Yuan, P., Zhang, W., Wu, Q., Wang, X., Zhao, M., . . . Fan, Q. (2018). A highly water-soluble triblock conjugated polymer for in vivo NIR-II imaging and photothermal therapy of cancer. POLYMER CHEMISTRY, 9(22), 3118-3126. DOI WoS67
2018	Cai, S., Shi, H., Zhang, Z., Wang, X., Ma, H., Gan, N., . . . Huang, W. (2018). Hydrogen-Bonded Organic Aromatic Frameworks for Ultralong Phosphorescence by Intralayer π-π Interactions. ANGEWANDTE CHEMIE-INTERNATIONAL EDITION, 57(15), 4005-4009. DOI WoS241
2018	Li, S., Cheng, L., Wu, Q., Zhang, Q., Yang, J., & Liu, J. (2018). Mechanism of Aerobic Alcohol Oxidation Mediated by Water-Soluble Cu(II)-TEMPO Catalyst in Water: A Density Functional Theory Study. CHEMISTRYSELECT, 3(4), 1268-1274. DOI WoS2
2018	Sun, C., Ran, X., Wang, X., Cheng, Z., Wu, Q., Cai, S., . . . Huang, W. (2018). Twisted Molecular Structure on Tuning Ultralong Organic Phosphorescence. JOURNAL OF PHYSICAL CHEMISTRY LETTERS, 9(2), 335-339. DOI WoS84
2018	Cui, S., Wang, X., Zhang, X., Xia, W., Tang, X., Lin, B., . . . Shen, X. (2018). Preparation of magnetic MnFe₂O₄-Cellulose aerogel composite and its kinetics and thermodynamics of Cu(II) adsorption. CELLULOSE, 25(1), 735-751. DOI WoS61
2018	Gu, L., Shi, H., Miao, C., Wu, Q., Cheng, Z., Cai, S., . . . Huang, W. (2018). Prolonging the lifetime of ultralong organic phosphorescence through dihydrogen bonding. JOURNAL OF MATERIALS CHEMISTRY C, 6(2), 226-233. DOI WoS101
2018	Wu, Q., Li, Y., Wang, C., Zhang, J., Huang, M., Kim, J. K., & Wu, Y. (2018). 1,4-Refunctionalization of β-diketones to γ-keto nitriles via C-C single bond cleavage. ORGANIC CHEMISTRY FRONTIERS, 5(16), 2496-2500. DOI WoS18
2018	Bian, L., Shi, H., Wang, X., Ling, K., Ma, H., Li, M., . . . Huang, W. (2018). Simultaneously Enhancing Efficiency and Lifetime of Ultralong Organic Phosphorescence Materials by Molecular Self-Assembly. JOURNAL OF THE AMERICAN CHEMICAL SOCIETY, 140(34), 10734-10739. DOI WoS488
2018	Chen, H., Xu, J., Xiao, G., Wu, Q., & Zhang, S. (2018). Fast auto-clean CNN model for online prediction of food materials. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 117, 218-227. DOI WoS23
2018	Deng, W., Wu, Q., Sun, P., Yuan, P., Lu, X., Fan, Q., & Huang, W. (2018). Zwitterionic diketopyrrolopyrrole for fluorescence/photoacoustic imaging guided photodynamic/photothermal therapy. POLYMER CHEMISTRY, 9(20), 2805-2812. DOI WoS31
2018	Cai, S., Shi, H., Tian, D., Ma, H., Cheng, Z., Wu, Q., . . . Huang, W. (2018). Enhancing Ultralong Organic Phosphorescence by Effective π-Type Halogen Bonding. ADVANCED FUNCTIONAL MATERIALS, 28(9), 7 pages. DOI WoS305
2018	Cheng, Z., Shi, H., Ma, H., Bian, L., Wu, Q., Gu, L., . . . Huang, W. (2018). Ultralong Phosphorescence from Organic Ionic Crystals under Ambient Conditions. ANGEWANDTE CHEMIE-INTERNATIONAL EDITION, 57(3), 678-682. DOI WoS211
2017	Hu, L., Ma, L., Zhu, Q., Yu, L., Wu, Q., Hu, C., . . . Xu, B. (2017). Organic salt-derived nitrogen-rich, hierarchical porous carbon for ultrafast supercapacitors. NEW JOURNAL OF CHEMISTRY, 41(22), 13611-13618. DOI WoS12
2017	Li, S., Cheng, L., Wu, Q., Zhang, Q., Yang, J., & Liu, J. (2017). Mechanistic Insight into the 2° Alcohol Oxidation Mediated by an Efficient Cu(I)/L-Proline-TEMPO Catalyst-A Density Functional Theory Study. CATALYSTS, 7(9), 15 pages. DOI WoS4
2017	Teney, D., Wu, Q., & Van Den Hengel, A. (2017). Visual Question Answering: a tutorial. IEEE Signal Processing Magazine, 34(6), 63-75. DOI Scopus35 WoS23
2017	Zhuang, B., Wu, Q., Shen, C., Reid, I., & Hengel, A. V. D. (2017). Care about you: towards large-scale human-centric visual relationship detection.
2017	Wang, P., Wu, Q., Shen, C., Dick, A., & Van Den Hengel, A. (2017). FVQA: fact-based Visual Question Answering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(10), 2413-2427. DOI Scopus414 WoS340 Europe PMC36
2017	Wu, Q., Teney, D., Wang, P., Shen, C., Dick, A., & van den Hengel, A. (2017). Visual question answering: a survey of methods and datasets. Computer Vision and Image Understanding, 163, 21-40. DOI Scopus335 WoS258
2016	Wang, J., Zhang, F. F., Wei, B., Wu, Q., Cao, M. J., Bai, Y., & Yang, G. W. (2016). Counterion-Directed Assembly of Praseodymium(III) Compounds based on the Flexible Ligand 5-Aminotetrazole-1-propionic Acid (Hatzp). ZEITSCHRIFT FUR ANORGANISCHE UND ALLGEMEINE CHEMIE, 642(2), 169-173. DOI WoS7
2016	Shen, L., Cao, M. J., Zhang, F. F., Wu, Q., Zhao, L. Y., Lu, Y. M., . . . Zou, J. H. (2016). Three new manganese(II) coordination complexes based on tetrazole carboxylate ligands. TRANSITION METAL CHEMISTRY, 41(2), 125-131. DOI WoS27
2016	Sun, Y., Wang, X., Du, J., Chen, N., Yu, H., Wu, Q., & Meng, X. (2016). Amorphous Structure and Bonding Chemistry of Aluminium Antimonide(AlSb) Alloy for Phase-change Memory Device. CHEMICAL RESEARCH IN CHINESE UNIVERSITIES, 32(1), 76-81. DOI WoS5
2016	Shen, L., Min, Y. -T., Bai, X., Wang, J., Wu, Q., Yang, J., . . . Li, Q. -Y. (2016). Four Gadolinium Coordination Compounds Derived from Various Tetrazole-Containing Carboxylic Acids. ZEITSCHRIFT FUR ANORGANISCHE UND ALLGEMEINE CHEMIE, 642(19), 1112-1119. DOI WoS3
2016	Wu, J., Bai, Y., Lu, Y. M., Wang, J., Wu, Q., Yang, G. W., & Li, Q. Y. (2016). Substituted group-directed magnesium(II) coordination compounds based on the derivatives of tetrazole-2-acetic acid. JOURNAL OF THE IRANIAN CHEMICAL SOCIETY, 13(12), 2155-2162. DOI WoS3
2016	Shen, L., Bai, Y., Min, Y. -T., Jia, T. -T., Wu, Q., Wang, J., . . . Yang, G. -W. (2016). Coordination Architectures of energetic Cd (II) coordination polymers constructed by the bifunctional substituted-tetrazole-carboxylate ligands. JOURNAL OF SOLID STATE CHEMISTRY, 244, 129-139. DOI WoS15
2016	Zhang, J., Tang, Z., Giddings, R., Wu, Q., Wang, W., Cao, B., . . . Tang, J. M. (2016). Stage-Dependent DSP Operation Range Clipping-Induced Bit Resolution Reductions of Full Parallel 64-Point FFTs Incorporated in FPGA-Based Optical OFDM Receivers. JOURNAL OF LIGHTWAVE TECHNOLOGY, 34(16), 3752-3760. DOI WoS13
2016	Miao, L. -L., Guo, M. -Y., Wu, J., Lu, Y. -M., Wu, Q., Bai, Y., . . . Yang, G. -W. (2016). Counter anion and pH directed assembly of europium(III) compounds based on tetrazole containing carboxylic acids. INORGANICA CHIMICA ACTA, 450, 176-181. DOI WoS12
2016	Yang, G. W., Zhang, Y. T., Wu, Q., Cao, M. J., Wu, J., Yue, Q. Y., & Li, Q. Y. (2016). Nitrogen-rich 5-(4-pyridyl)tetrazole-2-acetic acid and its alkaline earth metal coordination polymers for potential energetic materials. INORGANICA CHIMICA ACTA, 450, 364-371. DOI WoS17
2016	Wang, C., Li, Y., Gong, M., Wu, Q., Zhang, J., Kim, J. K., . . . Wu, Y. (2016). Method for Direct Synthesis of α-Cyanomethyl-β-dicarbonyl Compounds with Acetonitrile and 1,3-Dicarbonyls. ORGANIC LETTERS, 18(17), 4151-4153. DOI WoS49
2016	Du, J., Wang, M., Chen, N., Xie, S., Yu, H., & Wu, Q. (2016). Instability Origin and Improvement Scheme of Facial Alq₃ for Blue OLED Application. CHEMICAL RESEARCH IN CHINESE UNIVERSITIES, 32(3), 423-427. DOI WoS3
2016	Tang, X. -L., Lin, B. -L., Cui, S., Zhang, X., Zhong, Y., Wu, Q., . . . Wang, T. -W. (2016). Paclitaxel modified Fe₃O₄ loaded albumin nanoparticles as drug delivery vehicles by self-assembly. RSC ADVANCES, 6(49), 43284-43292. DOI WoS13
2015	Wu, Q., Cao, M. J., Wei, B., Bai, Y., Tian, H., Wang, J., . . . Yang, G. W. (2015). pH dependent synthesis of structurally diverse praseodymium(III) coordination polymers based on isomeric ligands. INORGANIC CHEMISTRY COMMUNICATIONS, 62, 111-114. DOI WoS26
2015	Yang, G. W., Zhang, F. F., Wu, Q., Cao, M. J., Bai, Y., Li, Q. Y., . . . Zou, J. H. (2015). Substituted group directed assembly of energetic lead(II) compounds based on structure-relevant ligands. RSC ADVANCES, 5(103), 84439-84445. DOI WoS32
2015	Nie, Y., Speakman, J. R., Wu, Q., Zhang, C., Hu, Y., Xia, M., . . . Wei, F. (2015). Exceptionally low daily energy expenditure in the bamboo-eating giant panda. SCIENCE, 349(6244), 171-174. DOI WoS139
2015	Hall, P., Cai, H., Wu, Q., & Corradi, T. (2015). Cross-depiction problem: recognition and synthesis of photographs and artwork. Computational Visual Media, 1(2), 91-103. DOI Scopus37
2014	Wu, Q., & Xiao, H. (2014). Dynamic CGE Model and Simulation Analysis on the Impact of Citizenization of Rural Migrant Workers on the Labor and Capital Markets in China. DISCRETE DYNAMICS IN NATURE AND SOCIETY, 2014, 8 pages. DOI WoS3
2011	Fu, Z., Wu, Q., Gong, W., Shi, L., Li, W., & Dai, Z. (2011). Photoluminescence properties and analysis of spectral structure of R₂(MoO₄)₃: Eu³⁺ (R = La, Gd) phosphors. JOURNAL OF THE OPTICAL SOCIETY OF AMERICA B-OPTICAL PHYSICS, 28(4), 709-713. DOI WoS8
2011	Fu, Z., Gong, W., Li, H., Wu, Q., Li, W., Yang, H. K., & Jeong, J. H. (2011). Synthesis and spectral properties of nanocrystalline Eu³⁺-doped pyrochlore oxide M₂Sn₂O₇ (M = Gd and Y). CURRENT APPLIED PHYSICS, 11(3), 933-938. DOI WoS14
2011	Wu, Q., Li, H., Xia, W., Fu, X., Fu, Z., Zhou, S., . . . Jeong, J. H. (2011). Investigation of the Structure and Photoluminescence Properties of Ln³⁺(Eu³⁺, Dy³⁺, Sm³⁺) Ion-Doped NaY(MoO₄)₂. JOURNAL OF THE ELECTROCHEMICAL SOCIETY, 158(12), J387-J393. DOI WoS16
2006	Zhang, F., Wu, Q., Chen, Z. -C., Li, X., Jiang, X. -M., & Lin, X. -F. (2006). Bioactive galactose-branched polyelectrolyte multilayers and microcapsules: Self-assembly, characterization, and biospecific lectin adsorption. LANGMUIR, 22(20), 8458-8464. DOI WoS33
2006	Bi, J., Wu, Q., & Li, Z. (2006). On estimating clock skew for one-way measurements. COMPUTER COMMUNICATIONS, 29(8), 1213-1225. DOI WoS10

Year	Citation
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Visual Question Answering. Springer Nature Singapore. DOI
2020	Garg, S., Sünderhauf, N., Dayoub, F., Morrison, D., Cosgun, A., Carneiro, G., . . . Milford, M. (2020). Semantics for Robotic Mapping, Perception and Interaction: A Survey (Vol. 8). United States: Now Publishers. DOI

Year	Citation
2025	Shi, X., Qiao, Y., Wu, Q., Liu, L., & Dayoub, F. (2025). Improving Online Source-Free Domain Adaptation for Object Detection by Unsupervised Data Acquisition. In A. DelBue, C. Canton, J. Pont-Tuset, & T. Tommasi (Eds.), Lecture Notes in Computer Science (Vol. 15629 LNCS, pp. 195-205). SPRINGER INTERNATIONAL PUBLISHING AG. DOI Scopus2 WoS1
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Video Representation Learning. In Advances in Computer Vision and Pattern Recognition (pp. 111-117). Springer Nature Singapore. DOI
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Vision-and-Language Pretraining for VQA. In Advances in Computer Vision and Pattern Recognition (pp. 91-107). Springer Nature Singapore. DOI
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Text-Based VQA. In Advances in Computer Vision and Pattern Recognition (pp. 177-187). Springer Nature Singapore. DOI Scopus1
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Deep Learning Basics. In Advances in Computer Vision and Pattern Recognition (pp. 15-26). Springer Nature Singapore. DOI Scopus1
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Summary and Outlook. In Advances in Computer Vision and Pattern Recognition (pp. 233-236). Springer Nature Singapore. DOI
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Knowledge-Based VQA. In Advances in Computer Vision and Pattern Recognition (pp. 73-90). Springer Nature Singapore. DOI Scopus2
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Medical VQA. In Advances in Computer Vision and Pattern Recognition (pp. 165-176). Springer Nature Singapore. DOI Scopus10
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Question Answering (QA) Basics. In Advances in Computer Vision and Pattern Recognition (pp. 27-31). Springer Nature Singapore. DOI Scopus2
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Visual Dialogue. In Advances in Computer Vision and Pattern Recognition (pp. 199-218). Springer Nature Singapore. DOI
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Referring Expression Comprehension. In Advances in Computer Vision and Pattern Recognition (pp. 219-230). Springer Nature Singapore. DOI
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Classical Visual Question Answering. In Advances in Computer Vision and Pattern Recognition (pp. 35-72). Springer Nature Singapore. DOI
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Advanced Models for Video Question Answering. In Advances in Computer Vision and Pattern Recognition (pp. 135-143). Springer Nature Singapore. DOI
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Video Question Answering. In Advances in Computer Vision and Pattern Recognition (pp. 119-133). Springer Nature Singapore. DOI Scopus1
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Visual Question Generation. In Advances in Computer Vision and Pattern Recognition (pp. 189-197). Springer Nature Singapore. DOI
2022	Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Embodied VQA. In Advances in Computer Vision and Pattern Recognition (pp. 147-164). Springer Nature Singapore. DOI Scopus1
2015	Brown-Grant, R. (2015). Introduction. In R. BrownGrant, A. D. Hedeman, & B. Ribemont (Eds.), Advances in Computer Vision and Pattern Recognition (pp. 1-13). ROUTLEDGE. DOI Scopus1

Year	Citation
2026	Chen, X., Chen, Q., Phan, M. H., Wu, Q., Chen, J., & Xie, Y. (2026). Towards Generalizable Clinical Knowledge Discovery for Radiology Report Generation. In Lecture Notes in Computer Science Vol. 16241 LNCS (pp. 379-389). Springer Nature Switzerland. DOI
2025	Wang, C., Chen, Q., Xie, Y., & Wu, Q. (2025). Filling in the Missing Piece: Advancing Automated Radiology Report Generation with Clinical Insights. In 2025 International Conference on Digital Image Computing: Techniques and Applications (DICTA) (pp. 435-442). Adelaide, Australia: IEEE Computer Society. DOI
2025	Chen, F., Zhuang, B., & Wu, Q. (2025). Streaming Video Diffusion: Online Video Editing with Diffusion Models. In Proceedings of the International Conference on Digital Image Computing Techniques and Applications (pp. 90-98). United States: IEEE. DOI
2025	Wang, C., Chen, Q., To, M. -S., Kutaiba, N., Yoo, J. -G., Xie, Y., & Wu, Q. (2025). X-Gen: Enhancing Radiology Report Generation via LLM-Driven Data Augmentation and Decoupled Training. In Proceedings of the International Conference on Digital Image Computing Techniques and Applications (pp. 450-457). United States: IEEE. DOI
2025	Zhuang, J., Yu, J., Qu, X., Tang, Y., Gou, G., Xiong, G., & Wu, Q. (2025). Soft Multi-view Representation Learning for Disambiguating Text-Based Person Retrieval. In Lecture Notes in Computer Science Vol. 15686 LNCS (pp. 143-156). Springer Nature Singapore. DOI
2025	Wang, X., Zhuang, B., & Wu, Q. (2025). Are Large Vision Language Models Good Game Players? In 13th International Conference on Learning Representations, ICLR 2025 (pp. 24502-24539). Scopus2
2025	Hong, H., Qiao, Y., Wang, S., Liu, J., & Wu, Q. (2025). General Scene Adaptation for Vision-and-Language Navigation. In Proceedings of the 13th International Conference on Learning Representations (ICLR 2025) (pp. 4389-4416). Singapore: International Conference on Learning Representations (ICLR).
2025	Liu, Q., Zhang, S., Qiao, Y., Zhu, J., Li, X., Guo, L., . . . Liu, J. (2025). GroundingMate: Aiding Object Grounding for Goal-Oriented Vision-and-Language Navigation. In Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025 (pp. 1775-1784). Tucson, AZ, USA: IEEE. DOI
2025	Gai, K., Wang, D., Yu, J., Wang, M., Zhu, L., & Wu, Q. (2025). MFL-Owner: Ownership Protection for Multi-modal Federated Learning via Orthogonal Transform Watermark. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 39 (pp. 3049-3058). Philadelphia, USA: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus4 WoS1
2025	Zhu, J., Qiao, Y., Zhang, S., He, X., Wu, Q., & Liu, J. (2025). MiniVLN: Efficient Vision-and-Language Navigation by Progressive Knowledge Distillation. In C. Ott (Ed.), Proceedings IEEE International Conference on Robotics and Automation (pp. 97-103). Atlanta, GA: IEEE. DOI
2025	Li, Z., Zhou, G., Hong, H., Shao, Y., Lyu, W., Qiao, Y., & Wu, Q. (2025). Ground-Level Viewpoint Vision-and-Language Navigation in Continuous Environments. In C. Ott (Ed.), Proceedings IEEE International Conference on Robotics and Automation (pp. 5266-5273). Atlanta, GA: IEEE. DOI Scopus1
2025	Qiao, Y., Lyu, W., Wang, H., Wang, Z., Li, Z., Zhang, Y., . . . Wu, Q. (2025). Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs. In C. Ott (Ed.), Proceedings IEEE International Conference on Robotics and Automation (pp. 6710-6717). Atlanta, GA: IEEE. DOI Scopus4
2025	Tang, Y., Zhang, J., Qin, X., Yu, J., Gou, G., Gangxiong, G. X., . . . Wu, Q. (2025). Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 14400-14410). Nashville, TN: IEEE Computer Society. DOI Scopus2 WoS2
2025	Tang, Y., Yu, J., Gai, K., Zhuang, J., Xiong, G., Gou, G., & Wu, Q. (2025). Missing Target-Relevant Information Prediction with World Model for Accurate Zero-Shot Composed Image Retrieval. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 24785-24795). Nashville, TN: IEEE Computer Society. DOI Scopus3 WoS1
2025	Liu, S., Zhang, H., Qiao, Q., Wu, Q., & Wang, P. (2025). VLN-ChEnv: Vision-language Navigation in Changeable Environments. In MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia (pp. 3798-3807). Dublin, Ireland: Association for Computing Machinery (ACM). DOI
2025	Lei, L., Gai, K., Yu, J., Zhu, L., & Wu, Q. (2025). Secure and Efficient Watermarking for Latent Diffusion Models in Model Distribution Scenarios. In J. Kwok (Ed.), IJCAI International Joint Conference on Artificial Intelligence (pp. 7473-7481). Montreal, Canada: Association for Computational Linguistics (ACL). DOI
2025	Shi, X., Li, Z., Lyu, W., Xia, J., Dayoub, F., Qiao, Y., & Wu, Q. (2025). SmartWay: Enhanced Waypoint Prediction and Backtracking for Zero-Shot Vision-and-Language Navigation. In IEEE International Conference on Intelligent Robots and Systems (pp. 16923-16930). IEEE. DOI
2025	Xu, X., Zheng, D., Wu, Q., Hong, W., Cheng, X., & Yao, Y. (2025). A 95-110GHz Fully Metallic Four-Beam Passive Array Antenna with High Gain and High Efficiency. In 2025 19th European Conference on Antennas and Propagation (EuCAP) (pp. 3 pages). Stockholm, Sweden: IEEE.
2025	Chen, Y., Wu, Q., & Xie, Y. (2025). MoE-Enhanced-TTT: Advancing Medical Image Segmentation. In 2025 International Conference on Digital Image Computing: Techniques and Applications (DICTA) (pp. 122-129). Adelaide, Australia: IEEE Computer Society. DOI
2025	Zhou, G., Hong, Y., Wang, Z., Wang, X. E., & Wu, Q. (2025). NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models. In Lecture Notes in Computer Science Vol. 15065 LNCS (pp. 260-278). Milan, Italy: Springer Nature Switzerland. DOI Scopus31 WoS16
2025	Qiao, Y., Liu, Q., Liu, J., Liu, J., & Wu, Q. (2025). LLM as Copilot for Coarse-Grained Vision-and-Language Navigation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 15063 LNCS (pp. 459-476). Milan, Italy: Springer Science and Business Media Deutschland GmbH. DOI Scopus8 WoS2
2025	Chen, Q., Xie, Y., Wu, B., Chen, X., Ang, J., To, M. -S., . . . Wu, Q. (2025). Act Like a Radiologist: Radiology Report Generation Across Anatomical Regions. In Lecture Notes in Computer Science Vol. 15477 LNCS (pp. 36-52). Hanoi, Vietnam: Springer Nature Singapore. DOI
2024	Huang, Z., Chen, Q., Sung, L., Yang, Y., Wang, N., Wu, Q., & Tan, M. (2024). G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 10117-10126). Seattle, WA: IEEE Computer Society. DOI Scopus8 WoS2
2024	Qiao, Y., Yu, Z., Zhao, Z., Chen, S., Sun, M., Guo, L., . . . Liu, J. (2024). VL-Mamba: Exploring State Space Models for Multimodal Learning. In Proceedings of Machine Learning Research Vol. 262 (pp. 102-113). Vancouver, Canada: ML Research Press. Scopus7 WoS1
2024	Wu, Y., Xie, Y., Luo, X., Wu, Q., & Cai, J. (2024). Dataset, Challenge, and Evaluation for Tumor Segmentation Variability. In Proceedings of the 32nd ACM International Conference on Multimedia (pp. 11302-11303). Melbourne VIC Australia: ACM. DOI Scopus3 WoS2
2024	Qu, X., Yu, J., Gai, K., Zhuang, J., Tang, Y., Xiong, G., . . . Wu, Q. (2024). Visual-Semantic Decomposition and Partial Alignment for Document-based Zero-Shot Learning. In Proceedings of the 32nd ACM International Conference on Multimedia (pp. 4581-4590). Melbourne VIC Australia: ACM. DOI Scopus3
2024	Hong, H., Wang, S., Huang, Z., Wu, Q., & Liu, J. (2024). Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments. In Proceedings of the 32nd ACM International Conference on Multimedia (MM'24) (pp. 7639-7648). New York, NY, USA: Association for Computing Machinery (ACM). DOI Scopus2
2024	Li, Y., Yu, J., Gai, K., Liu, B., Xiong, G., & Wu, Q. (2024). T2VIndexer: A Generative Video Indexer for Efficient Text-Video Retrieval. In Proceedings of the 32nd ACM International Conference on Multimedia (pp. 3955-3963). Melbourne, Victoria, Australia: ACM. DOI WoS2
2024	Wang, X., Zhuang, B., & Wu, Q. (2024). ModaVerse: Efficiently Transforming Modalities with LLMs. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 26596-26606). Seattle, WA: IEEE Computer Society. DOI Scopus12 WoS4
2024	Lu, Z., Xie, Y., Zeng, Q., Lu, M., Wu, Q., & Xia, Y. (2024). Spot the Difference: Difference Visual Question Answering with Residual Alignment. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 15005 LNCS (pp. 649-658). Marrakesh: Springer Science and Business Media Deutschland GmbH. DOI Scopus5 WoS4
2024	Ye, Y., Xie, Y., Zhang, J., Chen, Z., Wu, Q., & Xia, Y. (2024). Continual Self-Supervised Learning: Towards Universal Multi-Modal Medical Data Representation Learning. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 11114-11124). Online: IEEE Computer Society. DOI Scopus33 WoS31
2024	Wang, Z., Li, J., Hong, Y., Wang, Y., Wu, Q., Bansal, M., . . . Qiao, Y. (2024). Scaling Data Generation in Vision-and-Language Navigation. In Proceedings of the IEEE International Conference on Computer Vision (pp. 11975-11986). Paris, France: IEEE. DOI Scopus60 WoS30
2024	Mohammadi, B., Hong, Y., Qi, Y., Wu, Q., Pan, S., & Shi, J. Q. (2024). Augmented Commonsense Knowledge for Remote Object Grounding. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 38 (pp. 4269-4277). Online: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus17 WoS15
2024	Chen, Q., Pitawela, D., Zhao, C., Zhou, G., Chen, H. T., & Wu, Q. (2024). WebVLN: Vision-and-Language Navigation on Websites. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 38 (pp. 1165-1173). Online: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus11 WoS4
2024	Zhou, G., Hong, Y., & Wu, Q. (2024). NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 38 (pp. 7641-7649). Online: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus125 WoS89
2024	Tang, Y., Yu, J., Gai, K., Zhuang, J., Xiong, G., Hu, Y., & Wu, Q. (2024). Context-I2W: Mapping Images to Context-Dependent Words for Accurate Zero-Shot Composed Image Retrieval. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 38 (pp. 5180-5188). Online: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus37 WoS28
2024	Phan, V. M. H., Xie, Y., Qi, Y., Liu, L., Liu, L., Zhang, B., . . . Verjans, J. W. (2024). Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024) (pp. 11492-11501). Seattle, WA, USA: Institute of Electrical and Electronics Engineers (IEEE). DOI Scopus21 WoS12
2024	Hong, H., Wang, S., Huang, Z., Wu, Q., & Liu, J. (2024). Why only text: empowering vision-and-language navigation with multi-modal prompts. In Proceedings of the 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024) (pp. 839-847). Jeju, Jeju Island, South Korea: International Joint Conferences on Artificial Intelligence Organisation. DOI Scopus3 WoS2
2024	Xie, Y., Chen, Q., Wang, S., To, M. S., Lee, I., Khoo, E. W., . . . Wu, Q. (2024). PairAug: What Can Augmented Image-Text Pairs Do for Radiology?. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 11652-11661). Seattle, Washington, USA: IEEE. DOI Scopus11 WoS6
2024	Wei, Y., Fu, S., Jiang, W., Zhang, Z., Zeng, Z., Wu, Q., . . . Zhang, Y. (2024). GITA: Graph to Visual and Textual Integration for Vision-Language Graph Reasoning. In Advances in Neural Information Processing Systems Vol. 37 (pp. 29 pages). Vancouver, Canada: Neural information processing systems foundation. Scopus11
2024	Chen, Q., Zhang, B., Wang, G., & Wu, Q. (2024). Weak-eval-Strong: Evaluating and Eliciting Lateral Thinking of LLMs with Situation Puzzles. In A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, & C. Zhang (Eds.), Advances in Neural Information Processing Systems Vol. 37 (pp. 24 pages). Vancouver, Canada: Neural Information Processing Systems (NIPS). Scopus2
2024	He, K., Chen, K., Bai, J., Huang, Y., Wu, Q., Xia, S. T., & Wang, L. (2024). Everyday Object Meets Vision-and-Language Navigation Agent via Backdoor. In Advances in Neural Information Processing Systems Vol. 37 (pp. 22 pages). Vancouver, Canada: Neural information processing systems foundation. Scopus2
2024	Wu, B., Xie, Y., Zhang, Z., Ge, J., Yaxley, K., Bahadir, S., . . . To, M. S. (2024). BHSD: A 3D Multi-class Brain Hemorrhage Segmentation Dataset. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 14348 LNCS (pp. 147-156). Online: Springer Nature Switzerland. DOI Scopus19 WoS1
2024	Yu, Z., Qiao, Y., Xie, Y., & Wu, Q. (2024). Multi-modal Adapter for Medical Vision-and-Language Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 14348 LNCS (pp. 393-402). Online: Springer Nature Switzerland. DOI Scopus4 WoS1
2024	Deng, C., Chen, D., & Wu, Q. (2024). Identity-Consistent Aggregation for Video Object Detection. In Proceedings of the IEEE International Conference on Computer Vision (pp. 13388-13398). Online: IEEE. DOI Scopus9 WoS10
2024	Qiao, Y., Yu, Z., & Wu, Q. (2024). VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation. In Proceedings of the IEEE International Conference on Computer Vision (pp. 15397-15406). Online: IEEE. DOI Scopus18 WoS13
2024	Liu, S., Zhang, H., Qi, Y., Wang, P., Zhang, Y., & Wu, Q. (2024). AerialVLN: Vision-and-Language Navigation for UAVs. In Proceedings of the IEEE International Conference on Computer Vision (pp. 15338-15348). Online: IEEE. DOI Scopus42 WoS31
2024	Tian, X., Yang, Y. L., & Wu, Q. (2024). ShapeScaffolder: Structure-Aware 3D Shape Generation from Text. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2715-2724). Paris, France: IEEE. DOI Scopus9 WoS9
2023	Yu, Z., Xie, Y., Xia, Y., & Wu, Q. (2023). PLMVQA: Applying Pseudo Labels for Medical Visual Question Answering with Limited Data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 14394 LNCS (pp. 357-367). Online: Springer Nature Switzerland. DOI Scopus1 WoS1
2023	Qiao, Y., Qi, Y., Yu, Z., Liu, J., & Wu, Q. (2023). March in Chat: Interactive Prompting for Remote Embodied Referring Expression. In Proceedings of the IEEE International Conference on Computer Vision (pp. 15712-15721). Paris, France: IEEE. DOI Scopus30 WoS29
2023	Deng, C., Chen, Q., Qin, P., Chen, D., & Wu, Q. (2023). Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval. In Proceedings of the IEEE International Conference on Computer Vision (pp. 15602-15612). Online: IEEE. DOI Scopus42 WoS32
2023	Suo, W., Sun, M., Liu, W., Gao, Y., Wang, P., Zhang, Y., & Wu, Q. (2023). S³C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Vol. 2023-June (pp. 2646-2656). Online: IEEE Computer Society. DOI Scopus11 WoS8
2023	Wen, Z., Wang, Y., Tan, M., Wu, Q., & Wu, Q. (2023). Digging out Discrimination Information from Generated Samples for Robust Visual Question Answering. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 6910-6928). Dubrovnik, Croatia and Online: Association for Computational Linguistics. DOI Scopus12 WoS5
2023	Wu, Q., Chao, W., Zhou, X., & Luo, Z. (2023). TP-Detector: Detecting Turning Points in the Engineering Process of Large-scale Projects. In EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings of the System Demonstrations (pp. 177-185). Singapore: Association for Computational Linguistics (ACL). DOI
2023	Rodriguez-Opazo, C., Marrese-Taylor, E., Fernando, B., Takamura, H., & Wu, Q. (2023). Memory-efficient Temporal Moment Localization in Long Videos. In A. Vlachos, & I. Augenstein (Eds.), 17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023 (pp. 1909-1924). Dubrovnik, Croatia: Association for Computational Linguistics (ACL). WoS3
2023	Gao, J., Blair, A., Wu, Q., & Pagnucco, M. (2023). LoRA: A Logical Reasoning Augmented Dataset for Visual Question Answering. In Advances in Neural Information Processing Systems Vol. 36 (pp. 13 pages). Online: Neural information processing systems foundation. Scopus6
2023	Rodriguez-Opazo, C., Marrese-Taylor, E., Fernando, B., Takamura, H., & Wu, Q. (2023). Memory-efficient Temporal Moment Localization in Long Videos. In EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference (pp. 1901-1916). Online: Association for Computational Linguistics (ACL). DOI Scopus5
2023	Chen, Q., Deng, C., & Wu, Q. (2023). Learning Distinct and Representative Modes for Image Captioning. In Advances in Neural Information Processing Systems Vol. 35 (pp. 14 pages). USA: Neural information processing systems foundation. Scopus21
2023	Huang, Y., Leung, C. H., Ma, S., Yuan, Z., Wu, Q., Wang, S., . . . Huang, Z. (2023). Towards Balanced Representation Learning for Credit Policy Evaluation. In Proceedings of the International Conference on Artificial Intelligence and Statistics Vol. 206 (pp. 3677-3692). Valencia, Spain (virtual event). Scopus5
2023	Zhao, C., Qi, Y., & Wu, Q. (2023). Mind the Gap: Improving Success Rate of Vision-and-Language Navigation by Revisiting Oracle Success Routes. In Proceedings of the 31st ACM International Conference on Multimedia (pp. 4349-4358). Ottawa ON Canada: ACM. DOI Scopus14 WoS10
2023	Cong, G., Li, L., Qi, Y., Zha, Z. J., Wu, Q., Wang, W., . . . Huang, Q. (2023). Learning to Dub Movies via Hierarchical Prosody Models. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2023-June (pp. 14687-14697). Online: IEEE. DOI Scopus29 WoS25
2023	Guan, Q., Xie, Y., Yang, B., Zhang, J., Liao, Z., Wu, Q., & Xia, Y. (2023). Unpaired Cross-Modal Interaction Learning for COVID-19 Segmentation on Limited CT Images. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 14222 (pp. 603-613). Vancouver, BC, Canada: Springer Nature Switzerland. DOI Scopus3 WoS2
2023	Xie, Y., Gu, L., Harada, T., Zhang, J., Xia, Y., & Wu, Q. (2023). MedIM: Boost Medical Image Representation via Radiology Report-Guided Masking. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 14220 (pp. 13-23). Vancouver, BC, Canada: Springer Nature Switzerland. DOI Scopus12 WoS12
2022	Tian, X., Yang, Y. L., & Wu, Q. (2022). Enhancing Person Synthesis in Complex Scenes via Intrinsic and Contextual Structure Modeling. In BMVC 2022 - 33rd British Machine Vision Conference Proceedings. Scopus1
2022	Jing, C., Jia, Y., Wu, Y., Li, C., & Wu, Q. (2022). Learning the Dynamics of Visual Relational Reasoning via Reinforced Path Routing. In Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022 Vol. 36 (pp. 1122-1130). Palo Alto, California USA: AAAI Press. DOI Scopus9 WoS4
2022	Kazemi Moghaddam, M., Abbasnejad, E., Wu, Q., Qinfeng Shi, J., & Van Den Hengel, A. (2022). ForeSI: Success-Aware Visual Navigation Agent. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2022) (pp. 3401-3410). Online: IEEE. DOI Scopus11 WoS9
2022	Qi, Y., Pan, Z., Hong, Y., Yang, M. H., Van Den Hengel, A., & Wu, Q. (2022). The Road to Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV 2021) (pp. 1635-1644). online: IEEE. DOI Scopus72 WoS38
2022	Suo, W., Sun, M., Niu, K., Gao, Y., Wang, P., Zhang, Y., & Wu, Q. (2022). A Simple and Robust Correlation Filtering Method for Text-Based Person Search. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 13695 LNCS (pp. 726-742). Online: Springer Nature Switzerland. DOI Scopus80 WoS66
2022	Gu, J., Stefani, E., Wu, Q., Thomason, J., & Wang, X. E. (2022). Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol. 1: Long Papers (pp. 7606-7623). Online: Association for Computational Linguistics (ACL). DOI Scopus76 WoS59
2022	Zhu, W., Qi, Y., Narayana, P., Sone, K., Basu, S., Wang, E. X., . . . Wang, W. Y. (2022). Diagnosing Vision-and-Language Navigation: What Really Matters. In NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference (pp. 5981-5993). Online: Association for Computational Linguistics (ACL). DOI Scopus26 WoS18
2022	Chen, Q., Tan, M., Qi, Y., Zhou, J., Li, Y., & Wu, Q. (2022). V2C: Visual Voice Cloning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022) Vol. 2022-June (pp. 21210-21219). Online: IEEE. DOI Scopus33 WoS19
2022	Jing, C., Jia, Y., Wu, Y., Liu, X., & Wu, Q. (2022). Maintaining Reasoning Consistency in Compositional Visual Question Answering. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2022-June (pp. 5089-5098). Online: IEEE. DOI Scopus30 WoS26
2022	Hong, Y., Wang, Z., Wu, Q., & Gould, S. (2022). Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2022-June (pp. 15418-15428). Online: IEEE. DOI Scopus76 WoS62
2022	Xie, Y., Zhang, J., Xia, Y., & Wu, Q. (2022). UniMiSS: Universal Medical Self-supervised Learning via Breaking Dimensionality Barrier. In Proceedings, Part XXI of the 17th European Conference on Computer Vision (ECCV 2022), as published in Lecture Notes in Computer Science Vol. 13681 LNCS (pp. 558-575). Online: Springer. DOI Scopus67 WoS61
2022	Ding, Y., Yu, J., Liu, B., Hu, Y., Cui, M., & Wu, Q. (2022). MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2022-June (pp. 5079-5088). Online: IEEE. DOI Scopus128 WoS113
2022	Qiao, Y., Qi, Y., Hong, Y., Yu, Z., Wang, P., & Wu, Q. (2022). HOP: History-and-Order Aware Pretraining for Vision-and-Language Navigation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2022-June (pp. 15397-15406). New Orleans, LA, USA: IEEE. DOI Scopus86 WoS60
2022	Chen, C., Hu, Z., Jin, S., Xiao, L., Hu, M., Wu, Q., . . . Zou, M. (2022). Classification of COVID-19 in CT Scans Using Image Smoothing and Improved Deep Residual Network. In Artificial Intelligence First CAAI International Conference, CICAI 2021, Hangzhou, China, June 5–6, 2021, Proceedings, Part I Vol. 13069 LNAI (pp. 89-100). Switzerland: Springer. DOI
2022	Cao, Y., Wu, Q., Zhang, B., Liu, Z., & Li, J. (2022). FSE-MV: Compressed Domain Video Information Assisted Hybrid Real-Time Vehicle Speed Estimation. In C. T. Calafate, X. Chen, & Y. Wu (Eds.), Mobile Networks and Management, MONAMI 2021 Vol. 418 (pp. 100-114). Online: Springer International Publishing AG. DOI
2021	Yao, Y., Chen, T., Xie, G. S., Zhang, C., Shen, F., Wu, Q., . . . Zhang, J. (2021). Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 2623-2632). online: IEEE. DOI Scopus218 WoS178
2021	Yao, Y., Sun, Z., Zhang, C., Shen, F., Wu, Q., Zhang, J., & Tang, Z. (2021). Jo-SRC: A Contrastive Approach for Combating Noisy Labels. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 5188-5197). online: IEEE. DOI Scopus164 WoS151
2021	Deng, C., Chen, S., Chen, D., He, Y., & Wu, Q. (2021). Sketch, ground, and refine: top-down dense video captioning. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2021) (pp. 234-243). online: IEEE. DOI Scopus72 WoS50
2021	Hong, Y., Wu, Q., Qi, Y., Rodriguez Opazo, C., & Gould, S. (2021). VLN↻BERT: A Recurrent Vision-and-Language BERT for Navigation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 1643-1653). online: IEEE. DOI Scopus260 WoS189
2021	Xu, G., Niu, S., Tan, M., Luo, Y., Du, Q., & Wu, Q. (2021). Towards Accurate Text-based Image Captioning with Content Diversity Exploration. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 12632-12641). online: IEEE. DOI Scopus67 WoS47
2021	Gao, C., Chen, J., Liu, S., Wang, L., Zhang, Q., & Wu, Q. (2021). Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 3063-3072). online: IEEE Computer Society. DOI Scopus84 WoS62
2021	Wu, Q., Wu, C. J., Zhu, Y., & Joo, J. (2021). Communicative Learning with Natural Gestures for Embodied Navigation Agents with Human-in-the-Scene. In 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 4095-4102). online: IEEE. DOI Scopus20 WoS8
2021	Yu, J., Chai, Y., Wang, Y., Hu, Y., & Wu, Q. (2021). CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation. In IJCAI International Joint Conference on Artificial Intelligence (pp. 1274-1280). online: International Joint Conferences on Artificial Intelligence. DOI Scopus67 WoS44
2021	Suo, W., Sun, M., Wang, P., & Wu, Q. (2021). Proposal-free One-stage Referring Expression via Grid-Word Cross-Attention. In IJCAI International Joint Conference on Artificial Intelligence (pp. 1032-1038). online: International Joint Conferences on Artificial Intelligence. DOI Scopus10 WoS7
2021	Gao, C., Zhu, Q., Wang, P., & Wu, Q. (2021). Chop Chop BERT: Visual Question Answering by Chopping VisualBERT's Heads. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21) (pp. 664-670). United States: International Joint Conferences on Artificial Intelligence. DOI Scopus2 WoS1
2021	An, D., Qi, Y., Huang, Y., Wu, Q., Wang, L., & Tan, T. (2021). Neighbor-view Enhanced Model for Vision and Language Navigation. In MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia (pp. 5101-5109). virtual online: ACM. DOI Scopus67 WoS52
2021	Qiao, Y., Chen, Q., Deng, C., Ding, N., Qi, Y., Tan, M., . . . Wu, Q. (2021). R-GAN: Exploring Human-like Way for Reasonable Text-to-Image Synthesis via Generative Adversarial Networks. In Proceedings of the 29th ACM International Conference on Multimedia (pp. 2085-2093). United States: Association for Computing Machinery. DOI Scopus17 WoS15
2021	Wen, Z., Xu, G., Tan, M., Wu, Q., & Wu, Q. (2021). Debiased Visual Question Answering from Feature and Sample Perspectives. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. S. Liang, & J. Wortman Vaughan (Eds.), Advances in Neural Information Processing Systems 34 Vol. 5 (pp. 3784-3796). Online: Neural Information Processing Systems Foundation, Inc (NeurIPS). Scopus79 WoS43
2021	He, K., Huang, Y., Wu, Q., Yang, J., An, D., Sima, S., & Wang, L. (2021). Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision. In Advances in Neural Information Processing Systems Vol. 2 (pp. 652-663). Online: Neural Information Processing Systems (NIPS). Scopus34 WoS68
2021	Kazemi Moghaddam, M., Wu, Q., Abbasnejad, E., & Shi, J. (2021). Optimistic Agent: Accurate Graph-Based Value Estimation for More Successful Visual Navigation. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV 2021) (pp. 3732-3741). online: IEEE. DOI Scopus16 WoS14
2021	Zheng, Y., Wen, Z., Tan, M., Zeng, R., Chen, Q., Wang, Y., & Wu, Q. (2021). Modular graph attention network for complex visual relational reasoning. In Proceedings of the 15th Asian Conference on Computer Vision (ACCV 2020), as published in Lecture Notes in Computer Science Vol. 12627 (pp. 137-153). Cham, Switzerland: Springer. DOI Scopus2
2021	Zhu, Q., Gao, C., Wang, P., & Wu, Q. (2021). Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps. In Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence and the Eleventh Symposium on Educational Advances in Artificial Intelligence Vol. 35 (pp. 3608-3615). Online: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus46 WoS37
2021	Wang, Z., Bao, R., Wu, Q., & Liu, S. (2021). Confidence-aware Non-repetitive Multimodal Transformers for TextCaps. In Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence and the Eleventh Symposium on Educational Advances in Artificial Intelligence Vol. 35 (pp. 2835-2843). Online: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus25 WoS17
2021	Liu, L., He, M., Xu, G., Tan, M., & Wu, Q. (2021). How to Train Your Agent to Read and Write. In Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence and the Eleventh Symposium on Educational Advances in Artificial Intelligence Vol. 35 (pp. 13397-13405). Online: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus3 WoS3
2021	Wu, Q., Qin, M., Song, J., & Liu, L. (2021). An improved method of low light image enhancement based on retinex. In 2021 6th International Conference on Image, Vision and Computing, ICIVC 2021 (pp. 233-241). online: IEEE. DOI Scopus13
2020	Hong, Y., Rodriguez Opazo, C., Wu, Q., & Gould, S. (2020). Sub-Instruction Aware Vision-and-Language Navigation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 3360-3376). virtual online: Association for Computational Linguistics. DOI Scopus46 WoS38
2020	Jiang, X., Yu, J., Qin, Z., Zhuang, Y., Zhang, X., Hu, Y., & Wu, Q. (2020). DualVD: An adaptive dual encoding model for deep visual understanding in visual dialogue. In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence Vol. 34 (pp. 11125-11132). online: AAAI. Scopus62 WoS45
2020	Jing, C., Wu, Y., Zhang, X., Jia, Y., & Wu, Q. (2020). Overcoming language priors in VQA via decomposed linguistic representations. In Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI-20) Vol. 34 (pp. 11181-11188). online: AAAI. DOI Scopus101 WoS68
2020	Zhang, C., Yao, Y., Shu, X., Li, Z., Tang, Z., & Wu, Q. (2020). Data-driven Meta-set Based Fine-Grained Visual Recognition. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 2372-2381). online: ACM. DOI Scopus23 WoS16
2020	Wang, P., Liu, D., Li, H., & Wu, Q. (2020). Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 28-36). online: ACM. DOI Scopus20 WoS17
2020	Jing, C., Wu, Y., Pei, M., Hu, Y., Jia, Y., & Wu, Q. (2020). Visual-Semantic Graph Matching for Visual Grounding. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 4041-4050). online: ACM. DOI Scopus33 WoS23
2020	Liu, F., Xu, G., Wu, Q., Du, Q., Jia, W., & Tan, M. (2020). Cascade Reasoning Network for Text-based Visual Question Answering. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 4060-4069). online: ACM. DOI Scopus58 WoS45
2020	Hong, Y., Rodriguez-Opazo, C., Qi, Y., Wu, Q., & Gould, S. (2020). Language and visual entity relationship graph for agent navigation. In Advances in Neural Information Processing Systems Vol. 2020-December (pp. 1-12). online: NIPS. Scopus95
2020	Liao, Z., Liu, L., Wu, Q., Teney, D., Shen, C., Van Den Hengel, A., & Verjans, J. (2020). Medical data inquiry using a question answering model. In Proceedings: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI 2020) Vol. 2020-April (pp. 1490-1493). online: IEEE. DOI Scopus9 WoS4
2020	Wang, H., Wu, Q., & Shen, C. (2020). Soft Expert Reward Learning for Vision-and-Language Navigation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 12354 LNCS (pp. 126-141). Switzerland: Springer Nature. DOI Scopus29 WoS23
2020	Tang, R., Ma, C., Zhang, W. E., Wu, Q., & Yang, X. (2020). Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 12364 LNCS (pp. 437-453). Switzerland: Springer International Publishing. DOI Scopus40 WoS34
2020	Deng, C., Ding, N., Tan, M., & Wu, Q. (2020). Length-Controllable Image Captioning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 12358 LNCS (pp. 712-729). Switzerland: Springer International Publishing. DOI Scopus51 WoS55
2020	Qi, Y., Pan, Z., Zhang, S., van den Hengel, A., & Wu, Q. (2020). Object-and-Action Aware Model for Visual Language Navigation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 12355 LNCS (pp. 303-317). Switzerland: Springer International Publishing. DOI Scopus81 WoS56
2020	Jiang, X., Yu, J., Sun, Y., Qin, Z., Zhu, Z., Hu, Y., & Wu, Q. (2020). DAM: Deliberation, abandon and memory networks for generating detailed and non-repetitive responses in visual dialogue. In IJCAI International Joint Conference on Artificial Intelligence Vol. 2021-January (pp. 687-693). online: AAAI Press. Scopus10 WoS4
2020	Zhu, Z., Yu, J., Wang, Y., Sun, Y., Hu, Y., & Wu, Q. (2020). Mucko: Multi-layer cross-modal knowledge reasoning for fact-based visual question answering. In IJCAI International Joint Conference on Artificial Intelligence Vol. 2021-January (pp. 1097-1103). online: AAAI Press. Scopus102 WoS109
2020	Chen, Z., Wang, P., Ma, L., Wong, K. Y. K., & Wu, Q. (2020). Cops-Ref: A New Dataset and Task on Compositional Referring Expression Comprehension. In Proceedings of the 2020 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 10083-10092). online: IEEE. DOI Scopus65 WoS23
2020	Qi, Y., Wu, Q., Anderson, P., Wang, X., Wang, W. Y., Shen, C., & Van Den Hengel, A. (2020). Reverie: Remote embodied visual referring expression in real indoor environments. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 9979-9988). online: IEEE. DOI Scopus311 WoS245
2020	Liao, Z., Wu, Q., Shen, C., Van Den Hengel, A., & Verjans, J. (2020). AIML at VQA-Med 2020: Knowledge inference via a skeleton-based sentence mapping approach for medical domain visual question answering. In L. Cappellato, C. Eickhoff, N. Ferro, & A. Névéol (Eds.), Proceedings of the 11th International Conference of the CLEF Initiative (CLEF 2020), as published in CEUR Workshop Proceedings Vol. 2696 (pp. 1-14). online: CEUR-WS. Scopus8
2020	Abbasnejad, M., Abbasnejad, I., Wu, Q., Shi, Q., & Van Den Hengel, A. (2020). Gold seeker: Information gain from policy distributions for goal-oriented vision-and-langauge reasoning. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 13447-13456). online: IEEE. DOI Scopus4 WoS1
2020	Chen, S., Jin, Q., Wang, P., & Wu, Q. (2020). Say as you wish: Fine-grained control of image caption generation with abstract scene graphs. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 9959-9968). online: IEEE. DOI Scopus243 WoS195
2020	Chen, S., Zhao, Y., Jin, Q., & Wu, Q. (2020). Fine-grained video-text retrieval with hierarchical graph reasoning. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 10635-10644). online: IEEE. DOI Scopus344 WoS174
2020	Chen, Q., Wu, Q., Tang, R., Wang, Y., Wang, S., & Tan, M. (2020). Intelligent home 3D: Automatic 3D-house design from linguistic descriptions only. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 12622-12631). online: IEEE. DOI Scopus43 WoS28
2019	Duan, X., Wu, Q., Gan, C., Zhang, Y., Huang, W., Van Den Hengel, A., & Zhu, W. (2019). Watch, reason and code: Learning to represent videos using program. In Proceedings of the 27th ACM International Conference on Multimedia (ACM Multimedia 2019), MM '19 (pp. 1543-1551). online: Association for Computing Machinery. DOI Scopus5 WoS1
2019	Abbasnejad, E., Wu, Q., Shi, Q., & Van Den Hengel, A. (2019). What's to know? uncertainty as a guide to asking goal-oriented questions. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2019-June (pp. 4150-4159). online: IEEE. DOI Scopus18 WoS10
2019	Zhang, J., Wu, Q., Zhang, J., Shen, C., & Lu, J. (2019). Mind your neighbours: Image annotation with metadata neighbourhood graph co-attention networks. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2019-June (pp. 2951-2959). online: IEEE. DOI Scopus22 WoS13
2019	Wang, P., Wu, Q., Cao, J., Shen, C., Gao, L., & Hengel, A. V. D. (2019). Neighbourhood watch: Referring expression comprehension via language-guided graph attention networks. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2019-June (pp. 1960-1968). online: IEEE. DOI Scopus284 WoS242
2018	Cao, J., Guo, Y., Wu, Q., Shen, C., Huang, J., & Tan, M. (2018). Adversarial learning with local coordinate coding. In 35th International Conference on Machine Learning, ICML 2018 Vol. 2 (pp. 1104-1117). online: PMLR. Scopus23 WoS11
2018	Zhuang, Z., Tan, M., Zhuang, B., Liu, J., Guo, Y., Wu, Q., . . . Zhu, J. (2018). Discrimination-aware Channel Pruning for Deep Neural Networks. In Advances in Neural Information Processing Systems Vol. 2018-December (pp. 875-886). online: NIPS. Scopus460 WoS275
2018	Zhang, J., Zhang, J., Wu, Q., Wu, Q., Xu, J., Lu, J., . . . Tang, Z. (2018). Historical image annotation by exploring the tag relevance. In Proceedings - 4th Asian Conference on Pattern Recognition, ACPR 2017 (pp. 646-651). Nanjing, PEOPLES R CHINA: IEEE. DOI Scopus1 WoS1
2018	Zhuang, B., Wu, Q., Shen, C., Reid, I., & Van Den Hengel, A. (2018). HCVRD: A benchmark for large-scale human-centered visual relationship detection. In 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 (pp. 7631-7638). New Orleans: Association for the Advancement of Artificial Intelligence. Scopus38 WoS30
2018	Zhang, J., Wu, Q., Zhang, J., Shen, C., & Lu, J. (2018). Kill two birds with one stone: Weakly-supervised neural network for image annotation and tag refinement. In 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 (pp. 7550-7557). New Orleans: ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. Scopus9 WoS6
2018	Wu, Q., Wang, P., Shen, C., Reid, I., & Hengel, A. (2018). Are you talking to me? Reasoned visual dialog generation through adversarial learning. In Proceedings: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018) (pp. 6106-6115). Salt Lake City, UT: IEEE. DOI Scopus113 WoS92
2018	Deng, C., Wu, Q., Wu, Q., Hu, F., Lyu, F., & Tan, M. (2018). Visual Grounding via Accumulated Attention. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 7746-7755). online: IEEE. DOI Scopus181 WoS139
2018	Anderson, P., Das, A., & Wu, Q. (2018). Connecting language and vision to actions. In ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference Tutorial Abstracts (pp. 10-14). Melbourne: Association for Computational Linguistics. DOI
2018	Huang, Y., Wu, Q., Song, C., & Wang, L. (2018). Learning Semantic Concepts and Order for Image and Sentence Matching. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 6163-6171). online: IEEE. DOI Scopus344 WoS277
2018	Ma, C., Shen, C., Dick, A., Wu, Q., Wang, P., Van Den Hengel, A., & Reid, I. (2018). Visual Question Answering with memory-augmented network. In Proceedings: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018) (pp. 6975-6984). Salt Lake City, Utah: IEEE. DOI Scopus103 WoS80
2018	Anderson, P., Wu, Q., Teney, D., Bruce, J., Johnson, M., Sünderhauf, N., . . . Hengel, A. V. D. (2018). Vision-and-language navigation: interpreting visually-grounded navigation instructions in real environments. In Proceedings: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018) Vol. abs/1711.07280 (pp. 3674-3683). Salt Lake City, UT: IEEE. DOI Scopus1118 WoS1339
2018	Zhang, J., Xie, Y., Wu, Q., & Xia, Y. (2018). Skin lesion classification in dermoscopy images using synergic deep learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 11071 LNCS (pp. 12-20). Switzerland: Springer. DOI Scopus47 WoS30
2018	Zhang, J., Wu, Q., Shen, C., Zhang, J., Lu, J., & van den Hengel, A. (2018). Goal-oriented visual question generation via intermediate rewards. In V. Ferrari, M. Hebert, C. Sminchisescu, & Y. Weiss (Eds.), Computer Vision - ECCV 2018: Proceedings, Part V, Lecture Notes in Computer Science Vol. 11209 (pp. 189-204). Munich: Springer. DOI Scopus13 WoS17
2018	Zhuang, B., Wu, Q., Shen, C., Reid, I., & van den Hengel, A. (2018). Parallel attention: a unified framework for visual object discovery through dialogs and queries. In Proceedings: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018) (pp. 4252-4261). Salt Lake City, UT: IEEE. DOI Scopus141 WoS104
2018	Wang, C., Zhao, R., Yang, X., & Wu, Q. (2018). Research of UAV Target Detection and Flight Control Based on Deep Learning. In 2018 International Conference on Artificial Intelligence and Big Data (ICAIBD) (pp. 170-174). online: IEEE. DOI WoS14
2018	Wu, Q., Wang, P., Liu, E., Fan, Y., Duan, D., Wang, Z., & Cai, S. (2018). Design and Implementation of Learning Management Platform for Aviation Flight Training Based on SCORM/AICC Standard-A Case Study of K Airline Company Flight Training Learning Platform. In ADVANCED SCIENCE LETTERS Vol. 24 (pp. 5194-5198). INDONESIA, Bandung: AMER SCIENTIFIC PUBLISHERS. DOI WoS1
2018	Cao, J., Guo, Y., Wu, Q., Shen, C., Huang, J., & Tan, M. (2018). Adversarial Learning with Local Coordinate Coding. In Proceedings of Machine Learning Research Vol. 80 (pp. 707-715). Scopus9
2017	Wang, Q., Chen, W., & Wu, Q. (2017). The research and application of an real-time embedded measurement and control system for the river discharge. In S. Li, Y. Dai, & Y. Cheng (Eds.), 2017 4TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE) (pp. 1295-1298). Changsha, PEOPLES R CHINA: IEEE. DOI
2017	Wang, P., Wu, Q., Shen, C., & van den Hengel, A. (2017). The VQA-machine: learning how to use existing vision algorithms to answer new questions. In Proceedings: 30th IEEE Conference on Computer Vision and Pattern Recognition Vol. 2017-January (pp. 3909-3918). Honolulu: IEEE. DOI Scopus72 WoS43
2017	Wang, P., Wu, Q., Shen, C., Dick, A., & Van Den Hengel, A. (2017). Explicit knowledge-based reasoning for visual question answering. In C. Sierra (Ed.), Proceedings of the twenty-sixth International Joint Conference on Artificial Intelligence Vol. 0 (pp. 1290-1296). online: IJCAI. DOI Scopus162 WoS107
2016	Wu, Q., Wang, P., Shen, C., Dick, A., & Van Den Hengel, A. (2016). Ask me anything: free-form visual question answering based on knowledge from external sources. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2016-December (pp. 4622-4630). Las Vegas, NV: IEEE. DOI Scopus326 WoS216
2016	Wu, Q., Shen, C., Liu, L., Dick, A., & Van Den Hengel, A. (2016). What value do explicit high level concepts have in vision to language problems?. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2016-December (pp. 203-212). Las Vegas, NV: IEEE. DOI Scopus432 WoS299
2016	Wu, Q., Wang, C., Li, A., & Huang, B. (2016). Integral sliding mode controller design for near space vehicle with input constraints. In 2016 IEEE CHINESE GUIDANCE, NAVIGATION AND CONTROL CONFERENCE (CGNCC) (pp. 187-191). PEOPLES R CHINA, Nanjing: IEEE. WoS2
2016	Gao, G., Yang, H., Wu, Q., Mao, S. -J., & Yin, W. -L. (2016). A Wideband and Low Cross Polarization Slot Antenna Based on Differential-Feed. In INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATION AND NETWORK ENGINEERING (WCNE 2016) (pp. 4 pages). PEOPLES R CHINA, Beijing: DESTECH PUBLICATIONS, INC.
2016	Gao, G., Yang, H., Jin, Z., & Wu, Q. (2016). A Broadband Dual-polarization Slot Antenna Based on Substrate-integrated Cavity. In 2016 PROGRESS IN ELECTROMAGNETICS RESEARCH SYMPOSIUM (PIERS) (pp. 1994-1998). PEOPLES R CHINA, Shanghai: IEEE.
2016	Wu, Q., Yang, H., Jin, Z., Gao, G., & Cao, D. (2016). A Design of Band-pass Filter with Steep Stopband Attenuation Based on Transmission Zeros. In 2016 PROGRESS IN ELECTROMAGNETICS RESEARCH SYMPOSIUM (PIERS) (pp. 3482-3486). PEOPLES R CHINA, Shanghai: IEEE.
2016	Wang, X., Wu, Q., & Yang, J. (2016). Extended PGA Processing of High Resolution Airborne SAR Imagery Reconstructed via Backprojection Algorithm. In 2016 CIE INTERNATIONAL CONFERENCE ON RADAR (RADAR) (pp. 3 pages). PEOPLES R CHINA, Guangzhou: IEEE.
2016	Wu, Q., Yang, H., Gao, G., Gu, L., & Zhao, F. (2016). A Design of High Gain Archimedean Spiral Antenna. In INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATION AND NETWORK ENGINEERING (WCNE 2016) (pp. 4 pages). PEOPLES R CHINA, Beijing: DESTECH PUBLICATIONS, INC.
2016	Tang, J., Guo, Y., Lai, X., Liu, Y., & Wu, Q. (2016). Study on the Correlation between Fe2+ and Peridot's Yellow Green Color and Quality Evaluation of Color Based on CIE1976 L*a*b* Uniform Color Space. In X. Xiao, & P. Han (Eds.), PROCEEDINGS OF THE 2016 5TH INTERNATIONAL CONFERENCE ON ENVIRONMENT, MATERIALS, CHEMISTRY AND POWER ELECTRONICS Vol. 84 (pp. 599-604). PEOPLES R CHINA, Zhengzhou: ATLANTIS PRESS. WoS2
2015	Wu, Q., Wu, Q., Zhao, S., Wei, M., & Wang, F. L. (2015). Knowledge Communication Analysis Based on Clustering and Association Rules Mining. In A. Liu, Y. Ishikawa, T. Qian, S. Nutanong, & M. A. Cheema (Eds.), DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2015 Vol. 9052 (pp. 66-75). VIETNAM, Hanoi: SPRINGER-VERLAG BERLIN. DOI
2015	Wu, Q., Vogt, A., Brüns, H. -D., Gronwald, F., & Schuster, C. (2015). Numerical and Experimental Evaluation of Electromagnetic Coupling between Radiating Antenna Structures inside a Computer Casing. In 2015 IEEE INTERNATIONAL SYMPOSIUM ON ELECTROMAGNETIC COMPATIBILITY (EMC) (pp. 328-333). GERMANY, Dresden: IEEE.
2015	Cai, H., Wu, Q., & Hall, P. (2015). Beyond Photo-Domain Object Recognition: Benchmarks for the Cross-Depiction Problem. In Proceedings of the IEEE International Conference on Computer Vision Vol. 2015-February (pp. 74-79). Santiago: IEEE. DOI Scopus3 WoS4
2015	Wu, Q., Chen, F. -C., & Huang, R. -Y. (2015). Detecting Temporal Community from Dynamic Heterogeneous Networks. In PROCEEDINGS OF 2015 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2015) (pp. 610-613). Harbin, PEOPLES R CHINA: IEEE.
2014	Wu, Q., Cai, H., & Hall, P. (2014). Learning graphs to model visual objects across different depictive styles. In D. Fleet, T. Pajdla, B. Schiele, & T. Tuytelaars (Eds.), Proceedings of the 13th European Conference on Computer Vision Vol. VII (pp. 313-328). Zurich, Switzerland: Springer. DOI Scopus17 WoS10
2013	Wu, Q., & Hall, P. (2013). Modelling visual objects Invariant to depictive style. In T. Burghardt, D. Damen, W. Mayol-Cuevas, & M. Mirmehdi (Eds.), Proceedings of the British Machine Vision Conference (pp. 23.1-23.12). Bristol, UK: BMVA Press. DOI Scopus5
2013	Hao, Y., Wu, Q., & Liu, B. (2013). Literature Review on the Impact of Income Distribution Gap on Consumer Demand. In G. Lee (Ed.), PSYCHOLOGY, MANAGEMENT AND SOCIAL SCIENCE Vol. 18 (pp. 65-70). PEOPLES R CHINA, Shenzhen: INFORMATION ENGINEERING RESEARCH INST, USA.
2012	Wu, Q., Fu, X., & Shen, X. (2012). Automatic micro-expression analysis. In INTERNATIONAL JOURNAL OF PSYCHOLOGY Vol. 47 (pp. 144-145). JOHN WILEY & SONS LTD.
2012	Wu, Q., & Hall, P. (2012). Prime shapes in natural images. In R. Bowden, J. Collomosse, & K. Mikolajczyk (Eds.), Proceedings of the British Machine Vision Conference (pp. 45-1-45-12). Surrey, UK: BMVA Press. DOI Scopus4 WoS2
2011	Hoffman, J., Wang, L. -M., Wu, Q., & Morton, K. (2011). Uptake of 2-deoxyglucose analogs by thrombotically activated cells. In JOURNAL OF NUCLEAR MEDICINE Vol. 52 (pp. 2 pages). SOC NUCLEAR MEDICINE INC.
2008	Liu, Y., Yin, Y., Teng, Z., Wu, Q., & Li, G. (2008). Activities prediction of drug molecules by using the optimal ensemble based on uniform design. In D. S. Huang, D. C. Wunsch, D. S. Levine, & K. H. Jo (Eds.), ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS, PROCEEDINGS Vol. 5226 (pp. 106-+). PEOPLES R CHINA, Shanghai: SPRINGER-VERLAG BERLIN. WoS1
2007	Wu, Q., Shao, T. -C., & Chen, T. (2007). Robust self-calibration from single image using RANSAC. In G. Bebis, R. Boyle, B. Parvin, D. Koracin, N. Paragios, S. M. Tanveer, . . . T. Malzbender (Eds.), ADVANCES IN VISUAL COMPUTING, PT I Vol. 4841 (pp. 230-+). NV, Lake Tahoe: SPRINGER-VERLAG BERLIN. WoS5
2006	Wu, Q., Song, M., Bu, J., & Chen, C. (2006). EigenExpress approach in recognition of facial expression using GPU. In T. S. Huang, N. Sebe, M. S. Lew, V. Pavlovic, M. Kolsch, A. Galata, & B. Kisacanin (Eds.), COMPUTER VISION IN HUMAN-COMPUTER INTERACTION Vol. 3979 (pp. 12-20). AUSTRIA, Graz: SPRINGER-VERLAG BERLIN. WoS1

Year	Citation
2024	Li, Y., Yu, J., Gai, K., Liu, B., Xiong, G., & Wu, Q. (2024). T2VIndexer: A Generative Video Indexer for Efficient Text-Video Retrieval. DOI Scopus3

Year	Citation
2024	Phan, V. M. H., Xie, Y., Qi, Y., Liu, L., Liu, L., Zhang, B., . . . Verjans, J. W. (2024). Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework.
2024	Zhou, G., Hong, Y., Wang, Z., Wang, X. E., & Wu, Q. (2024). NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models.
2024	Zhou, G., Hong, Y., Wang, Z., Zhao, C., Bansal, M., & Wu, Q. (2024). SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts.
2024	Wei, Y., Fu, S., Jiang, W., Zhang, Z., Zeng, Z., Wu, Q., . . . Zhang, Y. (2024). GITA: Graph to Visual and Textual Integration for Vision-Language Graph Reasoning.
2024	Chen, Q., Zhang, B., Wang, G., & Wu, Q. (2024). Weak-eval-Strong: Evaluating and Eliciting Lateral Thinking of LLMs with Situation Puzzles.
2024	Chen, Q., Zhao, R., Wang, S., Phan, V. M. H., Hengel, A. V. D., Verjans, J., . . . Wu, Q. (2024). A Survey of Medical Vision-and-Language Applications and Their Techniques.
2023	Chen, Q., Pitawela, D., Zhao, C., Zhou, G., Chen, H. -T., & Wu, Q. (2023). WebVLN: Vision-and-Language Navigation on Websites.
2021	Chen, Q., Li, Y., Qi, Y., Zhou, J., Tan, M., & Wu, Q. (2021). V2C: Visual Voice Cloning.
2021	Moghaddam, M. K., Abbasnejad, E., Wu, Q., Shi, J., & Hengel, A. V. D. (2021). Learning for Visual Navigation by Imagining the Success.
2019	Parvaneh, A., Abbasnejad, E., Wu, Q., & Shi, J. (2019). Show, Price and Negotiate: A Hierarchical Attention Recurrent Visual Negotiator.

MyIP-7370, CERA grants, Anton van den Hengel, Anthony Dick, Qi Wu, Answer Me Why: Explainability is Critical if We are to Trust Automated Decision Making, 98,000 AUD
MyIP-7370, CERA grants, Anton van den Hengel, Anthony Dick, Qi Wu, Robust long-term Autonomous Navigation, 98,000 AUD
Facebook’s Research and Academic Relations Program, Peter Anderson, Qi Wu, Damien Teney, Niko Sunderhauf, Stephen Gould, Anton van den Hengel, Treasure Hunt: Natural Language N

Computer Vision
Machine Learning
Algorithms and Data Structure Analysis
Research Methods
Advanced Topics in Computer Science

Date	Role	Research Topic	Program	Degree Type	Student Load	Student Name
2025	Principal Supervisor	CNN-TTT Fusion: Advancing 3D Medical Image Segmentation	Master of Philosophy	Master	Full Time	Mr Yuming Chen
2025	Principal Supervisor	Embodied Vision-and-Language Navigation: Deploy Vision-and-Language Navigation in Real-World via Knowledge Distillation from Large Foundation Models	Doctor of Philosophy	Doctorate	Full Time	Mr Zerui Li
2025	Principal Supervisor	Foundation Models for Embodied Navigation	Doctor of Philosophy	Doctorate	Full Time	Mr Xiangyu Shi
2025	Co-Supervisor	Multi-agent Vision-and-Language Navigation Based on Large Foundation Models	Doctor of Philosophy	Doctorate	Full Time	Mr Qunchao Jin
2025	Principal Supervisor	Towards Building Real-World Embodied Vision Language Navigation Agents	Doctor of Philosophy	Doctorate	Full Time	Mr Xunyi Zhao
2024	Principal Supervisor	Vision-language Pre-training in Medical Domain	Doctor of Philosophy	Doctorate	Full Time	Ms Sinuo Wang
2024	Principal Supervisor	Direct Fitting 3D Generative Models Using Volume Rendering	Master of Philosophy	Master	Full Time	Mr Jian Zhou
2024	Principal Supervisor	Parameter-efficient Tuning Large Vision-Language Models	Doctor of Philosophy	Doctorate	Full Time	Mr Shuai Fu
2023	Principal Supervisor	Vision-and-Language in the Wild	Doctor of Philosophy	Doctorate	Full Time	Mr Zheng Yu
2023	Principal Supervisor	Efficient Video Foundation Model	Doctor of Philosophy	Doctorate	Full Time	Mr Feng Chen
2022	Principal Supervisor	Vision-and-Language Methods in Clinical Applications	Doctor of Philosophy	Doctorate	Full Time	Mr Chaohan Wang
2022	Co-Supervisor	MUDE: Mixed-reality Unified Development Environment for Context-Aware AI Automation Tasks	Doctor of Philosophy	Doctorate	Full Time	Miss Xiaoyan Wei
2022	Principal Supervisor	Spatiotemporal Multimodal Learning in Embodied AI	Doctor of Philosophy	Doctorate	Full Time	Mr Gengze Zhou

Date	Role	Research Topic	Program	Degree Type	Student Load	Student Name
2022 - 2023	Principal Supervisor	Vision-and-Language Navigation in the Real-World	Master of Philosophy	Master	Full Time	Mr Chongyang Zhao
2021 - 2024	Principal Supervisor	Multi-modal Generation, Synergy and Evaluation	Doctor of Philosophy	Doctorate	Full Time	Mr Qi Chen
2021 - 2025	Co-Supervisor	Finding the Optimal Path in Real-World Environments Using Natural Language Instructions	Doctor of Philosophy	Doctorate	Full Time	Mr Bahram Mohammadi
2020 - 2023	Principal Supervisor	General Vision and Language Methods in Real Applications: A Focus on Vision-and-Language Navigation	Doctor of Philosophy	Doctorate	Full Time	Miss Yanyuan Qiao
2020 - 2024	Principal Supervisor	Language-based Visual Understanding	Doctor of Philosophy	Doctorate	Full Time	Mr Chaorui Deng
2019 - 2022	Co-Supervisor	Towards Optimistic, Imaginative, and Harmonious Reinforcement Learning in Single-Agent and Multi-Agent Environments	Doctor of Philosophy	Doctorate	Full Time	Mr Mahdi Kazemi Moghaddam
2018 - 2021	Co-Supervisor	Fully Convolutional Instance-level Visual Recognition	Doctor of Philosophy	Doctorate	Full Time	Mr Zhi Tian
2018 - 2022	Co-Supervisor	3D Scene Reconstruction from A Monocular Image	Doctor of Philosophy	Doctorate	Full Time	Mr Wei Yin
2018 - 2021	Co-Supervisor	Multi-modality Data Analysis Using Deep Reinforcement Learning	Doctor of Philosophy	Doctorate	Full Time	Mr Hu Wang
2018 - 2022	Co-Supervisor	Efficient Deep Networks for Image Matting	Doctor of Philosophy	Doctorate	Full Time	Ms Yutong Dai
2017 - 2018	Co-Supervisor	Text Detection and Recognition in Natural Scene Images	Doctor of Philosophy	Doctorate	Full Time	Mrs Hui Li

Position: Associate Professor
Email: qi.wu01@adelaide.edu.au
