APrf Qi Wu
Associate Professor
School of Computer Science and Information Technology
College of Engineering and Information Technology
Dr Qi Wu is currently an Associate Professor at the University of Adelaide. He was previously an ARC Senior Research Associate in the Australian Centre for Robotic Vision (ACRV) at the University of Adelaide, Australia, and before that he worked as a Postdoctoral Researcher in the Australian Centre for Visual Technologies (ACVT). He received an MSc in Global Computing and Media Technology and a PhD in Computer Science from the University of Bath (United Kingdom) in 2011 and 2015, respectively. His research interests include cross-depictive style object modelling, object detection and Vision-to-Language problems. He is especially interested in Image Captioning and Visual Question Answering. His image captioning model produced the best result in the Microsoft COCO Image Captioning Challenge in the last year and his VQA model is the current state-of-the-art in the area. His work has been published in prestigious journals and conferences such as TPAMI, CVPR, ICCV and ECCV.
My research interests are mainly in computer vision and machine learning. My previous research projects include modeling visual objects regardless of depictive style and image understanding using contextual cues. I am currently leading a small team at Adelaide researching Vision-and-Language.
I have been in the computer vision field for nearly 10 years and have a strong track record in it. Currently, I am working on the vision-to-language problem, with particular expertise in image captioning and visual question answering (VQA). In 2015, my image captioning and VQA models achieved leading performance in the Microsoft COCO Image Captioning Challenge and the VQA Challenge. I have published several papers in top journals such as IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), IEEE Signal Processing Magazine (SPM), and Computer Vision and Image Understanding (CVIU). I have also published several papers at top conferences such as the International Joint Conference on Artificial Intelligence (IJCAI), AAAI, the Conference on Computer Vision and Pattern Recognition (CVPR) and the European Conference on Computer Vision (ECCV).
| Date | Position | Institution name |
|---|---|---|
| 2023 - ongoing | Associate Professor | University of Adelaide |
| 2018 - 2022 | Senior Lecturer | University of Adelaide, Adelaide |
| 2017 - 2018 | ARC Senior Research Associate | Australian Centre for Robotic Vision, University of Adelaide |
| 2015 - 2017 | Senior Research Associate | University of Adelaide |
| 2014 - ongoing | Research Intern | Lenovo |
| 2011 - 2015 | PhD | University of Bath |
| Language | Competency |
|---|---|
| Chinese (Mandarin) | Can read, write, speak, understand spoken and peer review |
| English | Can read, write, speak, understand spoken and peer review |
| Date | Institution name | Country | Title |
|---|---|---|---|
| 2011 - 2015 | University of Bath | United Kingdom | PhD |
| 2010 - 2011 | University of Bath | United Kingdom | MSc |
| 2006 - 2010 | China Jiliang University | China | BSc |
| Year | Citation |
|---|---|
| 2026 | Wang, C., Xie, Y., Chen, Q., Zhou, Y., & Wu, Q. (2026). A comprehensive analysis of Mamba for 3D volumetric medical image segmentation. Pattern Recognition, 173, 11 pages. |
| 2026 | Mohammadi, B., Abbasnejad, E., Qi, Y., Wu, Q., Van Den Hengel, A., & Shi, J. Q. (2026). Parameter-efficient action planning with large language models for vision-and-language navigation. Pattern Recognition, 172, 11 pages. |
| 2025 | Yang, D., Zhang, S., Xu, X., Wu, Q., Fan, W., Zhang, L., . . . Wang, F. (2025). Yield Estimation of Longline Aquaculture by the Shadows of Buoys Based on UAV Orthophoto Image. DRONES, 9(11), 21 pages. |
| 2025 | Tian, X., Yang, Y. L., & Wu, Q. (2025). Script-to-storyboard: A new contextual retrieval dataset and benchmark. Computational Visual Media, 11(1), 103-122. Scopus1 |
| 2025 | Li, L., Cong, G., Qi, Y., Zha, Z. J., Wu, Q., Sheng, Q. Z., . . . Yang, M. H. (2025). Dubbing Movies via Hierarchical Phoneme Modeling and Acoustic Diffusion Denoising. IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(11), 1-17. |
| 2025 | Wen, Z., Tan, M., Wang, Y., Wu, Q., & Wu, Q. (2025). Enhanced Reasoning via Multimodal LLMs and Collaborative Inference. IEEE Transactions on Multimedia, 27, 1-14. |
| 2025 | Tan, M., Chen, Q., Huang, Z., Wu, Q., Li, Y., & Zhou, J. (2025). Auto-3D-house Design from Structured User Requirements. MACHINE INTELLIGENCE RESEARCH, 22(2), 18 pages. |
| 2025 | Zhang, J., Chen, X., Yang, B., Guan, Q., Chen, Q., Chen, J., . . . Xia, Y. (2025). Advances in attention mechanisms for medical image segmentation. Computer Science Review, 56, 18 pages. Scopus16 WoS12 |
| 2024 | Zhang, Y., Ma, Z., Li, J., Qiao, Y., Wang, Z., Chai, J., . . . Kordjamshidi, P. (2024). Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models. Transactions on Machine Learning Research, 2024. Scopus1 |
| 2024 | Chen, Q., Zhao, R., Wang, S., Phan, V. M. H., Hengel, A. V. D., Verjans, J., . . . Wu, Q. (2024). A Survey of Medical Vision-and-Language Applications and Their Techniques. CoRR, abs/2411.12195. |
| 2024 | Sun, M., Suo, W., Wang, P., Niu, K., Liu, L., Lin, G., . . . Wu, Q. (2024). An Adaptive Correlation Filtering Method for Text-Based Person Search. International Journal of Computer Vision, 132(10), 4440-4455. Scopus7 WoS5 |
| 2024 | Xie, Y., Zhang, J., Xia, Y., & Wu, Q. (2024). UniMiSS+: Universal Medical Self-Supervised Learning From Cross-Dimensional Unpaired Data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(12), 10021-10035. Scopus7 WoS5 |
| 2024 | Xie, Y., Gu, L., Harada, T., Zhang, J., Xia, Y., & Wu, Q. (2024). Rethinking masked image modeling for medical image representation. Medical Image Analysis, 98, 103304. Scopus11 WoS9 Europe PMC5 |
| 2024 | Ding, N., Deng, C., Tan, M., Du, Q., Ge, Z., & Wu, Q. (2024). Image Captioning With Controllable and Adaptive Length Levels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 764(779), 1-16. Scopus12 WoS10 Europe PMC1 |
| 2024 | Gao, C., Liu, S., Chen, J., Wang, L., Wu, Q., Li, B., & Tian, Q. (2024). Room-Object Entity Prompting and Reasoning for Embodied Referring Expression. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(2), 994-1010. Scopus14 WoS15 Europe PMC3 |
| 2024 | Wen, Z., Niu, S., Li, G., Wu, Q., Tan, M., & Wu, Q. (2024). Test-Time Model Adaptation for Visual Question Answering with Debiased Self-Supervisions. IEEE Transactions on Multimedia, 26, 2137-2147. Scopus7 WoS7 |
| 2023 | Lin, Z., Zhang, D., Tao, Q., Shi, D., Haffari, G., Wu, Q., . . . Ge, Z. (2023). Medical visual question answering: A survey. Artificial Intelligence in Medicine, 143, 102611. Scopus119 WoS73 Europe PMC22 |
| 2023 | Zhou, G., Hong, Y., & Wu, Q. (2023). NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models. |
| 2023 | Wang, Z., Byrnes, O., Wang, H., Sun, R., Ma, C., Chen, H., . . . Xue, M. (2023). Data Hiding With Deep Learning: A Survey Unifying Digital Watermarking and Steganography. IEEE Transactions on Computational Social Systems, 10(6), 1-15. Scopus66 WoS39 |
| 2023 | Li, H., Huang, J., Jin, P., Song, G., Wu, Q., & Chen, J. (2023). Weakly-Supervised 3D Spatial Reasoning for Text-based Visual Question Answering. IEEE Transactions on Image Processing, 32, 3367-3382. Scopus21 WoS15 Europe PMC1 |
| 2023 | Tan, M., Wen, Z., Fang, L., & Wu, Q. (2023). Transformer-Based Relational Inference Network for Complex Visual Relational Reasoning. ACM Transactions on Multimedia Computing, Communications, and Applications, 20(1), 1-23. Scopus6 WoS4 |
| 2023 | Shi, X., Qiao, Y., Wu, Q., Liu, L., & Dayoub, F. (2023). Improving Online Source-free Domain Adaptation for Object Detection by Unsupervised Data Acquisition. |
| 2023 | He, M., Du, W., Wen, Z., Du, Q., Xie, Y., & Wu, Q. (2023). Multi-Granularity Aggregation Transformer for Joint Video-Audio-Text Representation Learning. IEEE Transactions on Circuits and Systems for Video Technology, 33(6), 2990-3002. Scopus10 WoS8 |
| 2023 | Qiao, Y., Qi, Y., Hong, Y., Yu, Z., Wang, P., & Wu, Q. (2023). HOP+: History-Enhanced and Order-Aware Pre-Training for Vision-and-Language Navigation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(7), 8524-8537. Scopus52 WoS22 Europe PMC6 |
| 2023 | Liu, D., Chen, Z., Huang, Z., Wu, Q., Song, Y., Yao, J., . . . Fang, G. (2023). In Situ Surface Modification Enables High Stability and Optoelectrical Performance for a Self-powered Photodetector. ADVANCED OPTICAL MATERIALS, 11(22), 10 pages. WoS34 |
| 2022 | Xun, L., Zhang, H., Yan, Q., Wu, Q., & Zhang, J. (2022). VISOR-NET: Visibility Estimation Based on Deep Ordinal Relative Learning under Discrete-Level Labels. SENSORS, 22(16), 20 pages. WoS10 |
| 2022 | Li, Y., Wu, Q., Lai, M., Zhao, J., Liu, Y., Fan, Y., . . . Liu, B. (2022). Influence of chemical disorder on mechanical and thermal properties of multi-component rare earth zirconate pyrochlores (<i>n</i>RE<sub>1/<i>n</i></sub>)<sub>2</sub>Zr<sub>2</sub>O<sub>7</sub>. JOURNAL OF APPLIED PHYSICS, 132(7), 11 pages. WoS15 |
| 2022 | Ji, G., Chen, C., Zhou, M., Wen, W., Wang, C., Tang, J., . . . Feng, Z. (2022). Post-COVID-19 fatigue among COVID-19 in patients discharged from hospital: A meta-analysis. JOURNAL OF INFECTION, 84(5), 731-733. WoS6 |
| 2022 | Wu, Y., Feng, T., Shen, Y., Fu, F., Meng, N., Li, X., . . . Wang, M. (2022). Total-body parametric imaging using the Patlak model: Feasibility of reduced scan time. MEDICAL PHYSICS, 49(7), 4529-4539. WoS21 |
| 2022 | Ling, L., Wu, Q., Huang, K., Wang, Y., & Wang, C. (2022). A Lightweight Bearing Fault Diagnosis Method Based on Multi-Channel Depthwise Separable Convolutional Neural Network. Electronics (Switzerland), 11(24), 21 pages. Scopus17 WoS15 |
| 2022 | Manchin, A., Sherrah, J., Wu, Q., & van den Hengel, A. (2022). Program Generation from Diverse Video Demonstrations. BMVC 2022 - 33rd British Machine Vision Conference Proceedings. |
| 2022 | Suo, W., Sun, M., Wang, P., Zhang, Y., & Wu, Q. (2022). Rethinking and Improving Feature Pyramids for One-stage Referring Expression Comprehension. IEEE Transactions on Image Processing, 32, 854-864. Scopus15 WoS15 Europe PMC1 |
| 2022 | Sun, M., Suo, W., Wang, P., Zhang, Y., & Wu, Q. (2022). A proposal-free one-stage framework for referring expression comprehension and generation via dense cross-attention. IEEE Transactions on Multimedia, 25, 2446-2458. Scopus42 WoS35 |
| 2022 | Deng, C., Wu, Q., Wu, Q., Hu, F., Lyu, F., & Tan, M. (2022). Visual Grounding Via Accumulated Attention. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(3), 1670-1684. Scopus13 WoS10 Europe PMC1 |
| 2022 | Parvaneh, A., Abbasnejad, E., Wu, Q., Shi, Q., & Van Den Hengel, A. (2022). Show, price and negotiate: a negotiator with online value look-ahead. IEEE Transactions on Multimedia, 24, 1426-1434. Scopus2 WoS1 |
| 2022 | Sun, Z., Liu, H., Wang, Q., Zhou, T., Wu, Q., & Tang, Z. (2022). Co-LDL: A Co-training-based Label Distribution Learning Method for Tackling Label Noise. IEEE Transactions on Multimedia, 24, 1093-1104. Scopus40 WoS35 |
| 2021 | Yu, J., Jiang, X., Qin, Z., Zhang, W., Hu, Y., & Wu, Q. (2021). Learning Dual Encoding Model for Adaptive Visual Understanding in Visual Dialogue. IEEE TRANSACTIONS ON IMAGE PROCESSING, 30, 220-233. Scopus31 WoS27 Europe PMC3 |
| 2021 | Wang, Y., Qi, Y., Yao, H., Gong, D., & Wu, Q. (2021). Image editing with varying intensities of processing. Computer Vision and Image Understanding, 211, 1-13. Scopus4 WoS4 |
| 2021 | Zhang, W., Ma, C., Wu, Q., & Yang, X. (2021). Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning. IEEE Transactions on Circuits and Systems for Video Technology, 31(9), 3469-3481. Scopus53 WoS45 |
| 2021 | Wang, H., Chen, H., Wu, Q., Ma, C., & Li, Y. (2021). Multi-Intersection Traffic Optimisation: A Benchmark Dataset and a Strong Baseline. IEEE Open Journal of Intelligent Transportation Systems, 3, 126-136. Scopus15 WoS14 |
| 2021 | Zhang, C., Wang, Q., Xie, G., Wu, Q., Shen, F., & Tang, Z. (2021). Robust Learning from Noisy Web Images via Data Purification for Fine-Grained Recognition. IEEE Transactions on Multimedia, 24, 1. Scopus13 WoS12 |
| 2020 | Gao, C., Zhu, Q., Wang, P., Li, H., Liu, Y., Van den Hengel, A., & Wu, Q. (2020). Structured Multimodal Attentions for TextVQA. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(8), 1. Scopus35 WoS44 Europe PMC3 |
| 2020 | Chen, Q., Wu, Q., Chen, J., Wu, Q., Van Den Hengel, A., & Tan, M. (2020). Scripted Video Generation with a Bottom-Up Generative Adversarial Network. IEEE Transactions on Image Processing, 29, 7454-7467. Scopus32 WoS15 |
| 2020 | Qiao, Y., Deng, C., & Wu, Q. (2020). Referring expression comprehension: a survey of methods and datasets. IEEE Transactions on Multimedia, 23, 4426-4440. Scopus74 WoS59 |
| 2020 | Yu, J., Zhang, W., Lu, Y., Qin, Z., Hu, Y., Tan, J., & Wu, Q. (2020). Reasoning on the Relation: Enhancing Visual Representation for Visual Question Answering and Cross-Modal Retrieval. IEEE Transactions on Multimedia, 22(12), 3196-3209. Scopus94 WoS85 |
| 2020 | Huang, Y., Wu, Q., Wang, W., & Wang, L. (2020). Image and Sentence Matching via Semantic Concepts and Order Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(3), 636-650. Scopus39 WoS30 Europe PMC6 |
| 2020 | Liu, X., Dai, P., Gu, T., Wu, Q., Wei, H., Liu, S., . . . Zhao, Q. (2020). Cyclometalated iridium(III) complexes containing an anthracene unit for sensing and imaging singlet oxygen in cellular mitochondria. JOURNAL OF INORGANIC BIOCHEMISTRY, 209, 10 pages. WoS17 |
| 2020 | Zhou, S., Wang, S., Wu, Q., Azim, R., & Li, W. (2020). Predicting potential miRNA-disease associations by combining gradient boosting decision tree with logistic regression. COMPUTATIONAL BIOLOGY AND CHEMISTRY, 85, 8 pages. WoS86 |
| 2019 | Yang, J., Wang, M., Zhang, Y., Jia, X., Chen, Y., Liu, T., . . . Xiao, H. (2019). Rapid Preparation of Oxidized Starch with High Carbonyl Contents Using NaBrO as Oxidizer. STARCH-STARKE, 71(9-10), 9 pages. WoS8 |
| 2019 | Xu, J. -L., Stutzki, J., Wu, Y., Guan, X., Wang, J. -J., Miller, M., . . . Wu, Q. (2019). Probing star formation and feedback using CCOSMA and archival data in the CFG028.68-0.28 quasi-sinusoidal filament. RESEARCH IN ASTRONOMY AND ASTROPHYSICS, 19(12), 13 pages. WoS2 |
| 2019 | Xiao, J., Ding, W., Peng, Y., Wu, Q., Chen, Z., Wang, Z., . . . Peng, T. (2019). UPGRADING IRON AND REMOVING PHOSPHORUS OF HIGH PHOSPHORUS OOLITIC IRON ORE BY SEGREGATION ROASTING WITH CALCIUM CHLORIDE AND CALCIUM HYPOCHLORITE. JOURNAL OF MINING AND METALLURGY SECTION B-METALLURGY, 55(3), 305-314. WoS12 |
| 2019 | Li, K. -P., Yuan, M., He, Z. -R., Wu, Q., Zhang, C. -M., Lei, Z. -L., . . . Guo, J. (2019). Omics Insights into Metabolic Stress and Resilience of Rats in Response to Short-term Fructose Overfeeding. MOLECULAR NUTRITION & FOOD RESEARCH, 63(23), 14 pages. WoS12 |
| 2019 | Tang, T., Duan, X., Zhou, Z., & Wu, Q. (2019). Scatter Correction Based on Beam Stop Array for Cone-Beam Micro-Computed Tomography. ACTA OPTICA SINICA, 39(8), 11 pages. WoS1 |
| 2019 | Liu, W., Li, Y., & Wu, Q. (2019). An Attribute-Based High-Level Image Representation for Scene Classification. IEEE Access, 7, 4629-4640. Scopus5 WoS2 |
| 2019 | Lyu, F., Wu, Q., Hu, F., Wu, Q., & Tan, M. (2019). Attend and Imagine: Multi-Label Image Classification with Visual Attention and Recurrent Neural Networks. IEEE Transactions on Multimedia, 21(8), 1971-1981. Scopus67 WoS56 |
| 2019 | Zhang, J., Wu, Q., Zhang, J., Shen, C., Lu, J., & Wu, Q. (2019). Heritage image annotation via collective knowledge. Pattern Recognition, 93, 204-214. Scopus9 WoS8 |
| 2019 | Zhang, J., Xie, Y., Wu, Q., & Xia, Y. (2019). Medical image classification using synergic deep learning. Medical Image Analysis, 54, 10-19. Scopus361 WoS261 Europe PMC107 |
| 2018 | Wu, Q., Shen, C., Wang, P., Dick, A., & van den Hengel, A. (2018). Image captioning and visual question answering based on attributes and external knowledge. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(6), 1367-1381. Scopus352 WoS263 Europe PMC29 |
| 2018 | Zhang, J., Wu, Q., Shen, C., Zhang, J., & Lu, J. (2018). Multilabel image classification with regional latent semantic dependencies. IEEE Transactions on Multimedia, 20(10), 2801-2813. Scopus177 WoS92 |
| 2018 | Hu, L., Zhu, Q., Wu, Q., Li, D., An, Z., & Xu, B. (2018). Natural Biomass-Derived Hierarchical Porous Carbon Synthesized by an <i>in Situ</i> Hard Template Coupled with NaOH Activation for Ultrahigh Rate Supercapacitors. ACS SUSTAINABLE CHEMISTRY & ENGINEERING, 6(11), 13949-13959. WoS146 |
| 2018 | Sun, P., Wu, Q., Sun, X., Miao, H., Deng, W., Zhang, W., . . . Huang, W. (2018). J-Aggregate squaraine nanoparticles with bright NIR-II fluorescence for imaging guided photothermal therapy. CHEMICAL COMMUNICATIONS, 54(95), 13395-13398. WoS155 |
| 2018 | Zhang, K. Y., Zhang, T., Wei, H., Wu, Q., Liu, S., Zhao, Q., & Huang, W. (2018). Phosphorescent iridium(III) complexes capable of imaging and distinguishing between exogenous and endogenous analytes in living cells. CHEMICAL SCIENCE, 9(36), 7236-7240. WoS50 |
| 2018 | Wu, Q., Ma, H., Ling, K., Gan, N., Cheng, Z., Gu, L., . . . Huang, W. (2018). Reversible Ultralong Organic Phosphorescence for Visual and Selective Chloroform Detection. ACS APPLIED MATERIALS & INTERFACES, 10(39), 33730-33736. WoS83 |
| 2018 | Lu, X., Yuan, P., Zhang, W., Wu, Q., Wang, X., Zhao, M., . . . Fan, Q. (2018). A highly water-soluble triblock conjugated polymer for <i>in vivo</i> NIR-II imaging and photothermal therapy of cancer. POLYMER CHEMISTRY, 9(22), 3118-3126. WoS66 |
| 2018 | Cai, S., Shi, H., Zhang, Z., Wang, X., Ma, H., Gan, N., . . . Huang, W. (2018). Hydrogen-Bonded Organic Aromatic Frameworks for Ultralong Phosphorescence by Intralayer π-π Interactions. ANGEWANDTE CHEMIE-INTERNATIONAL EDITION, 57(15), 4005-4009. WoS236 |
| 2018 | Li, S., Cheng, L., Wu, Q., Zhang, Q., Yang, J., & Liu, J. (2018). Mechanism of Aerobic Alcohol Oxidation Mediated by Water-Soluble Cu<SUP>II</SUP>-TEMPO Catalyst in Water: A Density Functional Theory Study. CHEMISTRYSELECT, 3(4), 1268-1274. WoS2 |
| 2018 | Sun, C., Ran, X., Wang, X., Cheng, Z., Wu, Q., Cai, S., . . . Huang, W. (2018). Twisted Molecular Structure on Tuning Ultralong Organic Phosphorescence. JOURNAL OF PHYSICAL CHEMISTRY LETTERS, 9(2), 335-339. WoS80 |
| 2018 | Cui, S., Wang, X., Zhang, X., Xia, W., Tang, X., Lin, B., . . . Shen, X. (2018). Preparation of magnetic MnFe<sub>2</sub>O<sub>4</sub>-Cellulose aerogel composite and its kinetics and thermodynamics of Cu(II) adsorption. CELLULOSE, 25(1), 735-751. WoS60 |
| 2018 | Gu, L., Shi, H., Miao, C., Wu, Q., Cheng, Z., Cai, S., . . . Huang, W. (2018). Prolonging the lifetime of ultralong organic phosphorescence through dihydrogen bonding. JOURNAL OF MATERIALS CHEMISTRY C, 6(2), 226-233. WoS99 |
| 2018 | Wu, Q., Li, Y., Wang, C., Zhang, J., Huang, M., Kim, J. K., & Wu, Y. (2018). 1,4-Refunctionalization of β-diketones to γ-keto nitriles <i>via</i> C-C single bond cleavage. ORGANIC CHEMISTRY FRONTIERS, 5(16), 2496-2500. WoS18 |
| 2018 | Bian, L., Shi, H., Wang, X., Ling, K., Ma, H., Li, M., . . . Huang, W. (2018). Simultaneously Enhancing Efficiency and Lifetime of Ultralong Organic Phosphorescence Materials by Molecular Self-Assembly. JOURNAL OF THE AMERICAN CHEMICAL SOCIETY, 140(34), 10734-10739. WoS474 |
| 2018 | Chen, H., Xu, J., Xiao, G., Wu, Q., & Zhang, S. (2018). Fast auto-clean CNN model for online prediction of food materials. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 117, 218-227. WoS23 |
| 2018 | Deng, W., Wu, Q., Sun, P., Yuan, P., Lu, X., Fan, Q., & Huang, W. (2018). Zwitterionic diketopyrrolopyrrole for fluorescence/photoacoustic imaging guided photodynamic/photothermal therapy. POLYMER CHEMISTRY, 9(20), 2805-2812. WoS31 |
| 2018 | Cai, S., Shi, H., Tian, D., Ma, H., Cheng, Z., Wu, Q., . . . Huang, W. (2018). Enhancing Ultralong Organic Phosphorescence by Effective π-Type Halogen Bonding. ADVANCED FUNCTIONAL MATERIALS, 28(9), 7 pages. WoS298 |
| 2018 | Cheng, Z., Shi, H., Ma, H., Bian, L., Wu, Q., Gu, L., . . . Huang, W. (2018). Ultralong Phosphorescence from Organic Ionic Crystals under Ambient Conditions. ANGEWANDTE CHEMIE-INTERNATIONAL EDITION, 57(3), 678-682. WoS207 |
| 2017 | Hu, L., Ma, L., Zhu, Q., Yu, L., Wu, Q., Hu, C., . . . Xu, B. (2017). Organic salt-derived nitrogen-rich, hierarchical porous carbon for ultrafast supercapacitors. NEW JOURNAL OF CHEMISTRY, 41(22), 13611-13618. WoS12 |
| 2017 | Li, S., Cheng, L., Wu, Q., Zhang, Q., Yang, J., & Liu, J. (2017). Mechanistic Insight into the 2° Alcohol Oxidation Mediated by an Efficient Cu<SUP>I</SUP>/L-Proline-TEMPO Catalyst-A Density Functional Theory Study. CATALYSTS, 7(9), 15 pages. WoS3 |
| 2017 | Teney, D., Wu, Q., & Van Den Hengel, A. (2017). Visual Question Answering: a tutorial. IEEE Signal Processing Magazine, 34(6), 63-75. Scopus35 WoS23 |
| 2017 | Zhuang, B., Wu, Q., Shen, C., Reid, I., & Hengel, A. V. D. (2017). Care about you: towards large-scale human-centric visual relationship detection. |
| 2017 | Wang, P., Wu, Q., Shen, C., Dick, A., & Van Den Hengel, A. (2017). FVQA: fact-based Visual Question Answering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(10), 2413-2427. Scopus403 WoS321 Europe PMC33 |
| 2017 | Wu, Q., Teney, D., Wang, P., Shen, C., Dick, A., & van den Hengel, A. (2017). Visual question answering: a survey of methods and datasets. Computer Vision and Image Understanding, 163, 21-40. Scopus325 WoS244 |
| 2016 | Wang, J., Zhang, F. F., Wei, B., Wu, Q., Cao, M. J., Bai, Y., & Yang, G. W. (2016). Counterion-Directed Assembly of Praseodymium(III) Compounds based on the Flexible Ligand 5-Aminotetrazole-1-propionic Acid (Hatzp). ZEITSCHRIFT FUR ANORGANISCHE UND ALLGEMEINE CHEMIE, 642(2), 169-173. WoS7 |
| 2016 | Shen, L., Cao, M. J., Zhang, F. F., Wu, Q., Zhao, L. Y., Lu, Y. M., . . . Zou, J. H. (2016). Three new manganese(II) coordination complexes based on tetrazole carboxylate ligands. TRANSITION METAL CHEMISTRY, 41(2), 125-131. WoS27 |
| 2016 | Sun, Y., Wang, X., Du, J., Chen, N., Yu, H., Wu, Q., & Meng, X. (2016). Amorphous Structure and Bonding Chemistry of Aluminium Antimonide(AlSb) Alloy for Phase-change Memory Device. CHEMICAL RESEARCH IN CHINESE UNIVERSITIES, 32(1), 76-81. WoS5 |
| 2016 | Shen, L., Min, Y. -T., Bai, X., Wang, J., Wu, Q., Yang, J., . . . Li, Q. -Y. (2016). Four Gadolinium Coordination Compounds Derived from Various Tetrazole-Containing Carboxylic Acids. ZEITSCHRIFT FUR ANORGANISCHE UND ALLGEMEINE CHEMIE, 642(19), 1112-1119. WoS3 |
| 2016 | Wu, J., Bai, Y., Lu, Y. M., Wang, J., Wu, Q., Yang, G. W., & Li, Q. Y. (2016). Substituted group-directed magnesium(II) coordination compounds based on the derivatives of tetrazole-2-acetic acid. JOURNAL OF THE IRANIAN CHEMICAL SOCIETY, 13(12), 2155-2162. WoS3 |
| 2016 | Shen, L., Bai, Y., Min, Y. -T., Jia, T. -T., Wu, Q., Wang, J., . . . Yang, G. -W. (2016). Coordination Architectures of energetic Cd (II) coordination polymers constructed by the bifunctional substituted-tetrazole-carboxylate ligands. JOURNAL OF SOLID STATE CHEMISTRY, 244, 129-139. WoS15 |
| 2016 | Zhang, J., Tang, Z., Giddings, R., Wu, Q., Wang, W., Cao, B., . . . Tang, J. M. (2016). Stage-Dependent DSP Operation Range Clipping-Induced Bit Resolution Reductions of Full Parallel 64-Point FFTs Incorporated in FPGA-Based Optical OFDM Receivers. JOURNAL OF LIGHTWAVE TECHNOLOGY, 34(16), 3752-3760. WoS12 |
| 2016 | Miao, L. -L., Guo, M. -Y., Wu, J., Lu, Y. -M., Wu, Q., Bai, Y., . . . Yang, G. -W. (2016). Counter anion and pH directed assembly of europium(III) compounds based on tetrazole containing carboxylic acids. INORGANICA CHIMICA ACTA, 450, 176-181. WoS12 |
| 2016 | Yang, G. W., Zhang, Y. T., Wu, Q., Cao, M. J., Wu, J., Yue, Q. Y., & Li, Q. Y. (2016). Nitrogen-rich 5-(4-pyridyl)tetrazole-2-acetic acid and its alkaline earth metal coordination polymers for potential energetic materials. INORGANICA CHIMICA ACTA, 450, 364-371. WoS17 |
| 2016 | Wang, C., Li, Y., Gong, M., Wu, Q., Zhang, J., Kim, J. K., . . . Wu, Y. (2016). Method for Direct Synthesis of α-Cyanomethyl-β-dicarbonyl Compounds with Acetonitrile and 1,3-Dicarbonyls. ORGANIC LETTERS, 18(17), 4151-4153. WoS49 |
| 2016 | Du, J., Wang, M., Chen, N., Xie, S., Yu, H., & Wu, Q. (2016). Instability Origin and Improvement Scheme of Facial Alq<sub>3</sub> for Blue OLED Application. CHEMICAL RESEARCH IN CHINESE UNIVERSITIES, 32(3), 423-427. WoS2 |
| 2016 | Tang, X. -L., Lin, B. -L., Cui, S., Zhang, X., Zhong, Y., Wu, Q., . . . Wang, T. -W. (2016). Paclitaxel modified Fe<sub>3</sub>O<sub>4</sub> loaded albumin nanoparticles as drug delivery vehicles by self-assembly. RSC ADVANCES, 6(49), 43284-43292. WoS13 |
| 2015 | Wu, Q., Cao, M. J., Wei, B., Bai, Y., Tian, H., Wang, J., . . . Yang, G. W. (2015). pH dependent synthesis of structurally diverse praseodymium(III) coordination polymers based on isomeric ligands. INORGANIC CHEMISTRY COMMUNICATIONS, 62, 111-114. WoS26 |
| 2015 | Yang, G. W., Zhang, F. F., Wu, Q., Cao, M. J., Bai, Y., Li, Q. Y., . . . Zou, J. H. (2015). Substituted group directed assembly of energetic lead(II) compounds based on structure-relevant ligands. RSC ADVANCES, 5(103), 84439-84445. WoS32 |
| 2015 | Nie, Y., Speakman, J. R., Wu, Q., Zhang, C., Hu, Y., Xia, M., . . . Wei, F. (2015). Exceptionally low daily energy expenditure in the bamboo-eating giant panda. SCIENCE, 349(6244), 171-174. WoS134 |
| 2015 | Hall, P., Cai, H., Wu, Q., & Corradi, T. (2015). Cross-depiction problem: recognition and synthesis of photographs and artwork. Computational Visual Media, 1(2), 91-103. Scopus36 |
| 2014 | Wu, Q., & Xiao, H. (2014). Dynamic CGE Model and Simulation Analysis on the Impact of Citizenization of Rural Migrant Workers on the Labor and Capital Markets in China. DISCRETE DYNAMICS IN NATURE AND SOCIETY, 2014, 8 pages. WoS3 |
| 2011 | Fu, Z., Wu, Q., Gong, W., Shi, L., Li, W., & Dai, Z. (2011). Photoluminescence properties and analysis of spectral structure of R<sub>2</sub>(MoO<sub>4</sub>)<sub>3</sub>: Eu<SUP>3+</SUP> (R = La, Gd) phosphors. JOURNAL OF THE OPTICAL SOCIETY OF AMERICA B-OPTICAL PHYSICS, 28(4), 709-713. WoS8 |
| 2011 | Fu, Z., Gong, W., Li, H., Wu, Q., Li, W., Yang, H. K., & Jeong, J. H. (2011). Synthesis and spectral properties of nanocrystalline Eu<SUP>3+</SUP>-doped pyrochlore oxide M<sub>2</sub>Sn<sub>2</sub>O<sub>7</sub> (M = Gd and Y). CURRENT APPLIED PHYSICS, 11(3), 933-938. WoS14 |
| 2011 | Wu, Q., Li, H., Xia, W., Fu, X., Fu, Z., Zhou, S., . . . Jeong, J. H. (2011). Investigation of the Structure and Photoluminescence Properties of Ln<SUP>3+</SUP>(Eu<SUP>3+</SUP>, Dy<SUP>3+</SUP>, Sm<SUP>3+</SUP>) Ion-Doped NaY(MoO<sub>4</sub>)<sub>2</sub>. JOURNAL OF THE ELECTROCHEMICAL SOCIETY, 158(12), J387-J393. WoS16 |
| 2006 | Zhang, F., Wu, Q., Chen, Z. -C., Li, X., Jiang, X. -M., & Lin, X. -F. (2006). Bioactive galactose-branched polyelectrolyte multilayers and microcapsules: Self-assembly, characterization, and biospecific lectin adsorption. LANGMUIR, 22(20), 8458-8464. WoS33 |
| 2006 | Bi, J., Wu, Q., & Li, Z. (2006). On estimating clock skew for one-way measurements. COMPUTER COMMUNICATIONS, 29(8), 1213-1225. WoS10 |
| - | Zheng, S., Zhao, P., Huang, Q., Cai, Y., Cheng, H., & Wu, Q. (2025). Implement Referring Expression Comprehension by Extending Auto-focus Lens to Locked Vision Model. ACM Transactions on Multimedia Computing, Communications, and Applications. |
| Year | Citation |
|---|---|
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Visual Question Answering. Springer Nature Singapore. |
| 2020 | Garg, S., Sünderhauf, N., Dayoub, F., Morrison, D., Cosgun, A., Carneiro, G., . . . Milford, M. (2020). Semantics for Robotic Mapping, Perception and Interaction: A Survey (Vol. 8). United States: Now Publishers. |
| Year | Citation |
|---|---|
| 2025 | Shi, X., Qiao, Y., Wu, Q., Liu, L., & Dayoub, F. (2025). Improving Online Source-Free Domain Adaptation for Object Detection by Unsupervised Data Acquisition. In A. DelBue, C. Canton, J. Pont-Tuset, & T. Tommasi (Eds.), Lecture Notes in Computer Science (Vol. 15629 LNCS, pp. 195-205). SPRINGER INTERNATIONAL PUBLISHING AG. Scopus1 WoS1 |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Video Representation Learning. In Advances in Computer Vision and Pattern Recognition (pp. 111-117). Springer Nature Singapore. |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Vision-and-Language Pretraining for VQA. In Advances in Computer Vision and Pattern Recognition (pp. 91-107). Springer Nature Singapore. |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Text-Based VQA. In Advances in Computer Vision and Pattern Recognition (pp. 177-187). Springer Nature Singapore. Scopus1 |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Deep Learning Basics. In Advances in Computer Vision and Pattern Recognition (pp. 15-26). Springer Nature Singapore. Scopus1 |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Summary and Outlook. In Advances in Computer Vision and Pattern Recognition (pp. 233-236). Springer Nature Singapore. |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Knowledge-Based VQA. In Advances in Computer Vision and Pattern Recognition (pp. 73-90). Springer Nature Singapore. Scopus2 |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Medical VQA. In Advances in Computer Vision and Pattern Recognition (pp. 165-176). Springer Nature Singapore. Scopus9 |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Question Answering (QA) Basics. In Advances in Computer Vision and Pattern Recognition (pp. 27-31). Springer Nature Singapore. Scopus2 |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Visual Dialogue. In Advances in Computer Vision and Pattern Recognition (pp. 199-218). Springer Nature Singapore. |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Referring Expression Comprehension. In Advances in Computer Vision and Pattern Recognition (pp. 219-230). Springer Nature Singapore. |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Classical Visual Question Answering. In Advances in Computer Vision and Pattern Recognition (pp. 35-72). Springer Nature Singapore. |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Advanced Models for Video Question Answering. In Advances in Computer Vision and Pattern Recognition (pp. 135-143). Springer Nature Singapore. |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Video Question Answering. In Advances in Computer Vision and Pattern Recognition (pp. 119-133). Springer Nature Singapore. Scopus1 |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Visual Question Generation. In Advances in Computer Vision and Pattern Recognition (pp. 189-197). Springer Nature Singapore. |
| 2022 | Wu, Q., Wang, P., Wang, X., He, X., & Zhu, W. (2022). Embodied VQA. In Advances in Computer Vision and Pattern Recognition (pp. 147-164). Springer Nature Singapore. Scopus1 |
| 2015 | Brown-Grant, R. (2015). Introduction. In R. Brown-Grant, A. D. Hedeman, & B. Ribemont (Eds.), Advances in Computer Vision and Pattern Recognition (pp. 1-13). ROUTLEDGE. Scopus1 |
| Year | Citation |
|---|---|
| 2025 | Zhuang, J., Yu, J., Qu, X., Tang, Y., Gou, G., Xiong, G., & Wu, Q. (2025). Soft Multi-view Representation Learning for Disambiguating Text-Based Person Retrieval. In Lecture Notes in Computer Science Vol. 15686 LNCS (pp. 143-156). Springer Nature Singapore. |
| 2025 | Wang, X., Zhuang, B., & Wu, Q. (2025). Are Large Vision Language Models Good Game Players? In 13th International Conference on Learning Representations, ICLR 2025 (pp. 24502-24539). Scopus2 |
| 2025 | Hong, H., Qiao, Y., Wang, S., Liu, J., & Wu, Q. (2025). General Scene Adaptation for Vision-and-Language Navigation. In Proceedings of the 13th International Conference on Learning Representations (ICLR 2025) (pp. 4389-4416). Singapore: International Conference on Learning Representations (ICLR). |
| 2025 | Liu, Q., Zhang, S., Qiao, Y., Zhu, J., Li, X., Guo, L., . . . Liu, J. (2025). GroundingMate: Aiding Object Grounding for Goal-Oriented Vision-and-Language Navigation. In Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025 (pp. 1775-1784). Tucson, AZ, USA: IEEE. DOI |
| 2025 | Gai, K., Wang, D., Yu, J., Wang, M., Zhu, L., & Wu, Q. (2025). MFL-Owner: Ownership Protection for Multi-modal Federated Learning via Orthogonal Transform Watermark. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 39 (pp. 3049-3058). Philadelphia, USA: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus3 WoS1 |
| 2025 | Zhu, J., Qiao, Y., Zhang, S., He, X., Wu, Q., & Liu, J. (2025). MiniVLN: Efficient Vision-and-Language Navigation by Progressive Knowledge Distillation. In C. Ott (Ed.), Proceedings IEEE International Conference on Robotics and Automation (pp. 97-103). Atlanta, GA: IEEE. DOI |
| 2025 | Li, Z., Zhou, G., Hong, H., Shao, Y., Lyu, W., Qiao, Y., & Wu, Q. (2025). Ground-Level Viewpoint Vision-and-Language Navigation in Continuous Environments. In C. Ott (Ed.), Proceedings IEEE International Conference on Robotics and Automation (pp. 5266-5273). Atlanta, GA: IEEE. DOI |
| 2025 | Qiao, Y., Lyu, W., Wang, H., Wang, Z., Li, Z., Zhang, Y., . . . Wu, Q. (2025). Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs. In Proceedings IEEE International Conference on Robotics and Automation (pp. 6710-6717). IEEE. DOI Scopus2 |
| 2025 | Tang, Y., Zhang, J., Qin, X., Yu, J., Gou, G., Xiong, G., . . . Wu, Q. (2025). Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 14400-14410). IEEE. DOI |
| 2025 | Tang, Y., Yu, J., Gai, K., Zhuang, J., Xiong, G., Gou, G., & Wu, Q. (2025). Missing Target-Relevant Information Prediction with World Model for Accurate Zero-Shot Composed Image Retrieval. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 24785-24795). IEEE. DOI Scopus1 |
| 2025 | Liu, S., Zhang, H., Qiao, Q., Wu, Q., & Wang, P. (2025). VLN-ChEnv: Vision-language Navigation in Changeable Environments. In MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia (pp. 3798-3807). ACM. DOI |
| 2025 | Lei, L., Gai, K., Yu, J., Zhu, L., & Wu, Q. (2025). Secure and Efficient Watermarking for Latent Diffusion Models in Model Distribution Scenarios. In IJCAI International Joint Conference on Artificial Intelligence (pp. 7473-7481). International Joint Conferences on Artificial Intelligence Organization. DOI |
| 2025 | Zhou, G., Hong, Y., Wang, Z., Wang, X. E., & Wu, Q. (2025). NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models. In Lecture Notes in Computer Science Vol. 15065 LNCS (pp. 260-278). Milan, Italy: Springer Nature Switzerland. DOI Scopus22 WoS4 |
| 2025 | Qiao, Y., Liu, Q., Liu, J., Liu, J., & Wu, Q. (2025). LLM as Copilot for Coarse-Grained Vision-and-Language Navigation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 15063 LNCS (pp. 459-476). Milan, Italy: Springer Science and Business Media Deutschland GmbH. DOI Scopus5 WoS2 |
| 2025 | Chen, Q., Xie, Y., Wu, B., Chen, X., Ang, J., To, M. -S., . . . Wu, Q. (2025). Act Like a Radiologist: Radiology Report Generation Across Anatomical Regions. In Lecture Notes in Computer Science Vol. 15477 LNCS (pp. 36-52). Hanoi, Vietnam: Springer Nature Singapore. DOI |
| 2024 | Huang, Z., Chen, Q., Sung, L., Yang, Y., Wang, N., Wu, Q., & Tan, M. (2024). G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 10117-10126). Seattle, WA: IEEE Computer Society. DOI Scopus7 WoS2 |
| 2024 | Qiao, Y., Yu, Z., Zhao, Z., Chen, S., Sun, M., Guo, L., . . . Liu, J. (2024). VL-Mamba: Exploring State Space Models for Multimodal Learning. In Proceedings of Machine Learning Research Vol. 262 (pp. 102-113). Vancouver, Canada: ML Research Press. Scopus4 WoS1 |
| 2024 | Wu, Y., Xie, Y., Luo, X., Wu, Q., & Cai, J. (2024). Dataset, Challenge, and Evaluation for Tumor Segmentation Variability. In Proceedings of the 32nd ACM International Conference on Multimedia (pp. 11302-11303). Melbourne VIC Australia: ACM. DOI Scopus3 WoS2 |
| 2024 | Qu, X., Yu, J., Gai, K., Zhuang, J., Tang, Y., Xiong, G., . . . Wu, Q. (2024). Visual-Semantic Decomposition and Partial Alignment for Document-based Zero-Shot Learning. In Proceedings of the 32nd ACM International Conference on Multimedia (pp. 4581-4590). Melbourne VIC Australia: ACM. DOI Scopus2 |
| 2024 | Hong, H., Wang, S., Huang, Z., Wu, Q., & Liu, J. (2024). Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments. In Proceedings of the 32nd ACM International Conference on Multimedia (MM'24) (pp. 7639-7648). New York, NY, USA: Association for Computing Machinery (ACM). DOI |
| 2024 | Li, Y., Yu, J., Gai, K., Liu, B., Xiong, G., & Wu, Q. (2024). T2VIndexer: A Generative Video Indexer for Efficient Text-Video Retrieval. In Proceedings of the 32nd ACM International Conference on Multimedia (pp. 3955-3963). Melbourne, Victoria, Australia: ACM. DOI |
| 2024 | Wang, X., Zhuang, B., & Wu, Q. (2024). ModaVerse: Efficiently Transforming Modalities with LLMs. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 26596-26606). Seattle, WA: IEEE Computer Society. DOI Scopus10 WoS3 |
| 2024 | Lu, Z., Xie, Y., Zeng, Q., Lu, M., Wu, Q., & Xia, Y. (2024). Spot the Difference: Difference Visual Question Answering with Residual Alignment. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 15005 LNCS (pp. 649-658). Marrakesh: Springer Science and Business Media Deutschland GmbH. DOI Scopus3 WoS2 |
| 2024 | Ye, Y., Xie, Y., Zhang, J., Chen, Z., Wu, Q., & Xia, Y. (2024). Continual Self-Supervised Learning: Towards Universal Multi-Modal Medical Data Representation Learning. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 11114-11124). Online: IEEE Computer Society. DOI Scopus27 WoS24 |
| 2024 | Wang, Z., Li, J., Hong, Y., Wang, Y., Wu, Q., Bansal, M., . . . Qiao, Y. (2024). Scaling Data Generation in Vision-and-Language Navigation. In Proceedings of the IEEE International Conference on Computer Vision (pp. 11975-11986). Paris, France: IEEE. DOI Scopus53 WoS22 |
| 2024 | Mohammadi, B., Hong, Y., Qi, Y., Wu, Q., Pan, S., & Shi, J. Q. (2024). Augmented Commonsense Knowledge for Remote Object Grounding. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 38 (pp. 4269-4277). Online: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus17 WoS14 |
| 2024 | Chen, Q., Pitawela, D., Zhao, C., Zhou, G., Chen, H. T., & Wu, Q. (2024). WebVLN: Vision-and-Language Navigation on Websites. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 38 (pp. 1165-1173). Online: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus9 WoS2 |
| 2024 | Zhou, G., Hong, Y., & Wu, Q. (2024). NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 38 (pp. 7641-7649). Online: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus103 WoS49 |
| 2024 | Tang, Y., Yu, J., Gai, K., Zhuang, J., Xiong, G., Hu, Y., & Wu, Q. (2024). Context-I2W: Mapping Images to Context-Dependent Words for Accurate Zero-Shot Composed Image Retrieval. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 38 (pp. 5180-5188). Online: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus34 WoS20 |
| 2024 | Phan, V. M. H., Xie, Y., Qi, Y., Liu, L., Liu, L., Zhang, B., . . . Verjans, J. W. (2024). Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024) (pp. 11492-11501). Seattle, WA, USA: Institute of Electrical and Electronics Engineers (IEEE). DOI Scopus13 WoS9 |
| 2024 | Wang, X., Wu, Q., & Zhuang, B. (2024). ModaVerse: Efficiently Transforming Modalities with LLMs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 26606-26616). Online: IEEE. |
| 2024 | Hong, H., Wang, S., Huang, Z., Wu, Q., & Liu, J. (2024). Why only text: empowering vision-and-language navigation with multi-modal prompts. In Proceedings of the 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024) (pp. 839-847). Jeju Island, South Korea: International Joint Conferences on Artificial Intelligence Organisation. DOI Scopus3 WoS2 |
| 2024 | Xie, Y., Chen, Q., Wang, S., To, M. S., Lee, I., Khoo, E. W., . . . Wu, Q. (2024). PairAug: What Can Augmented Image-Text Pairs Do for Radiology? In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 11652-11661). Seattle, Washington, USA: IEEE. DOI Scopus7 WoS5 |
| 2024 | Wei, Y., Fu, S., Jiang, W., Zhang, Z., Zeng, Z., Wu, Q., . . . Zhang, Y. (2024). GITA: Graph to Visual and Textual Integration for Vision-Language Graph Reasoning. In Advances in Neural Information Processing Systems Vol. 37. Vancouver, Canada: Neural information processing systems foundation. Scopus9 |
| 2024 | Chen, Q., Zhang, B., Wang, G., & Wu, Q. (2024). Weak-eval-Strong: Evaluating and Eliciting Lateral Thinking of LLMs with Situation Puzzles. In Advances in Neural Information Processing Systems Vol. 37. Scopus1 |
| 2024 | He, K., Chen, K., Bai, J., Huang, Y., Wu, Q., Xia, S. T., & Wang, L. (2024). Everyday Object Meets Vision-and-Language Navigation Agent via Backdoor. In Advances in Neural Information Processing Systems Vol. 37. Vancouver, Canada: Neural information processing systems foundation. |
| 2024 | Wu, B., Xie, Y., Zhang, Z., Ge, J., Yaxley, K., Bahadir, S., . . . To, M. S. (2024). BHSD: A 3D Multi-class Brain Hemorrhage Segmentation Dataset. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 14348 LNCS (pp. 147-156). Online: Springer Nature Switzerland. DOI Scopus13 WoS1 |
| 2024 | Yu, Z., Qiao, Y., Xie, Y., & Wu, Q. (2024). Multi-modal Adapter for Medical Vision-and-Language Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 14348 LNCS (pp. 393-402). Online: Springer Nature Switzerland. DOI Scopus4 WoS1 |
| 2024 | Deng, C., Chen, D., & Wu, Q. (2024). Identity-Consistent Aggregation for Video Object Detection. In Proceedings of the IEEE International Conference on Computer Vision (pp. 13388-13398). Online: IEEE. DOI Scopus7 WoS7 |
| 2024 | Qiao, Y., Yu, Z., & Wu, Q. (2024). VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation. In Proceedings of the IEEE International Conference on Computer Vision (pp. 15397-15406). Online: IEEE. DOI Scopus16 WoS12 |
| 2024 | Liu, S., Zhang, H., Qi, Y., Wang, P., Zhang, Y., & Wu, Q. (2024). AerialVLN: Vision-and-Language Navigation for UAVs. In Proceedings of the IEEE International Conference on Computer Vision (pp. 15338-15348). Online: IEEE. DOI Scopus36 WoS21 |
| 2024 | Tian, X., Yang, Y. L., & Wu, Q. (2024). ShapeScaffolder: Structure-Aware 3D Shape Generation from Text. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2715-2724). Paris, France: IEEE. DOI Scopus9 WoS8 |
| 2023 | Yu, Z., Xie, Y., Xia, Y., & Wu, Q. (2023). PLMVQA: Applying Pseudo Labels for Medical Visual Question Answering with Limited Data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 14394 LNCS (pp. 357-367). Online: Springer Nature Switzerland. DOI Scopus1 WoS1 |
| 2023 | Qiao, Y., Qi, Y., Yu, Z., Liu, J., & Wu, Q. (2023). March in Chat: Interactive Prompting for Remote Embodied Referring Expression. In Proceedings of the IEEE International Conference on Computer Vision (pp. 15712-15721). Paris, France: IEEE. DOI Scopus27 WoS27 |
| 2023 | Deng, C., Chen, Q., Qin, P., Chen, D., & Wu, Q. (2023). Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval. In Proceedings of the IEEE International Conference on Computer Vision (pp. 15602-15612). Online: IEEE. DOI Scopus36 WoS26 |
| 2023 | Suo, W., Sun, M., Liu, W., Gao, Y., Wang, P., Zhang, Y., & Wu, Q. (2023). S³C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Vol. 2023-June (pp. 2646-2656). Online: IEEE Computer Society. DOI Scopus10 WoS5 |
| 2023 | Wen, Z., Wang, Y., Tan, M., Wu, Q., & Wu, Q. (2023). Digging out Discrimination Information from Generated Samples for Robust Visual Question Answering. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 6910-6928). Dubrovnik, Croatia and Online: Association for Computational Linguistics. DOI Scopus11 WoS4 |
| 2023 | Wu, Q., Chao, W., Zhou, X., & Luo, Z. (2023). TP-Detector: Detecting Turning Points in the Engineering Process of Large-scale Projects. In EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings of the System Demonstrations (pp. 177-185). Singapore: Association for Computational Linguistics (ACL). DOI |
| 2023 | Rodriguez-Opazo, C., Marrese-Taylor, E., Fernando, B., Takamura, H., & Wu, Q. (2023). Memory-efficient Temporal Moment Localization in Long Videos. In A. Vlachos, & I. Augenstein (Eds.), 17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023 (pp. 1909-1924). Dubrovnik, Croatia: Association for Computational Linguistics (ACL). WoS3 |
| 2023 | Gao, J., Blair, A., Wu, Q., & Pagnucco, M. (2023). LoRA: A Logical Reasoning Augmented Dataset for Visual Question Answering. In Advances in Neural Information Processing Systems Vol. 36 (13 pages). Online: Neural information processing systems foundation. Scopus4 |
| 2023 | Rodriguez-Opazo, C., Marrese-Taylor, E., Fernando, B., Takamura, H., & Wu, Q. (2023). Memory-efficient Temporal Moment Localization in Long Videos. In EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference (pp. 1901-1916). Online: Association for Computational Linguistics (ACL). Scopus5 |
| 2023 | Chen, Q., Deng, C., & Wu, Q. (2023). Learning Distinct and Representative Modes for Image Captioning. In Advances in Neural Information Processing Systems Vol. 35 (14 pages). USA: Neural information processing systems foundation. Scopus21 |
| 2023 | Huang, Y., Leung, C. H., Ma, S., Yuan, Z., Wu, Q., Wang, S., . . . Huang, Z. (2023). Towards Balanced Representation Learning for Credit Policy Evaluation. In Proceedings of the International Conference on Artificial Intelligence and Statistics Vol. 206 (pp. 3677-3692). Valencia, Spain (virtual event). Scopus4 |
| 2023 | Zhao, C., Qi, Y., & Wu, Q. (2023). Mind the Gap: Improving Success Rate of Vision-and-Language Navigation by Revisiting Oracle Success Routes. In Proceedings of the 31st ACM International Conference on Multimedia (pp. 4349-4358). Ottawa ON Canada: ACM. DOI Scopus12 WoS10 |
| 2023 | Cong, G., Li, L., Qi, Y., Zha, Z. J., Wu, Q., Wang, W., . . . Huang, Q. (2023). Learning to Dub Movies via Hierarchical Prosody Models. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2023-June (pp. 14687-14697). Online: IEEE. DOI Scopus24 WoS17 |
| 2023 | Guan, Q., Xie, Y., Yang, B., Zhang, J., Liao, Z., Wu, Q., & Xia, Y. (2023). Unpaired Cross-Modal Interaction Learning for COVID-19 Segmentation on Limited CT Images. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 14222 (pp. 603-613). Vancouver, BC, Canada: Springer Nature Switzerland. DOI Scopus3 WoS2 |
| 2023 | Xie, Y., Gu, L., Harada, T., Zhang, J., Xia, Y., & Wu, Q. (2023). MedIM: Boost Medical Image Representation via Radiology Report-Guided Masking. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 14220 (pp. 13-23). Vancouver, BC, Canada: Springer Nature Switzerland. DOI Scopus12 WoS10 |
| 2022 | Tian, X., Yang, Y. L., & Wu, Q. (2022). Enhancing Person Synthesis in Complex Scenes via Intrinsic and Contextual Structure Modeling. In BMVC 2022 - 33rd British Machine Vision Conference Proceedings. |
| 2022 | Jing, C., Jia, Y., Wu, Y., Li, C., & Wu, Q. (2022). Learning the Dynamics of Visual Relational Reasoning via Reinforced Path Routing. In Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022 Vol. 36 (pp. 1122-1130). Palo Alto, California USA: AAAI Press. DOI Scopus8 WoS4 |
| 2022 | Kazemi Moghaddam, M., Abbasnejad, E., Wu, Q., Qinfeng Shi, J., & Van Den Hengel, A. (2022). ForeSI: Success-Aware Visual Navigation Agent. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2022) (pp. 3401-3410). Online: IEEE. DOI Scopus10 WoS9 |
| 2022 | Qi, Y., Pan, Z., Hong, Y., Yang, M. H., Van Den Hengel, A., & Wu, Q. (2022). The Road to Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV 2021) (pp. 1635-1644). online: IEEE. DOI Scopus68 WoS35 |
| 2022 | Suo, W., Sun, M., Niu, K., Gao, Y., Wang, P., Zhang, Y., & Wu, Q. (2022). A Simple and Robust Correlation Filtering Method for Text-Based Person Search. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 13695 LNCS (pp. 726-742). Online: Springer Nature Switzerland. DOI Scopus74 WoS57 |
| 2022 | Gu, J., Stefani, E., Wu, Q., Thomason, J., & Wang, X. E. (2022). Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol. 1: Long Papers (pp. 7606-7623). Online: Association for Computational Linguistics (ACL). DOI Scopus72 WoS45 |
| 2022 | Zhu, W., Qi, Y., Narayana, P., Sone, K., Basu, S., Wang, E. X., . . . Wang, W. Y. (2022). Diagnosing Vision-and-Language Navigation: What Really Matters. In NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference (pp. 5981-5993). Online: Association for Computational Linguistics (ACL). DOI Scopus24 WoS17 |
| 2022 | Chen, Q., Tan, M., Qi, Y., Zhou, J., Li, Y., & Wu, Q. (2022). V2C: Visual Voice Cloning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022) Vol. 2022-June (pp. 21210-21219). Online: IEEE. DOI Scopus31 WoS13 |
| 2022 | Jing, C., Jia, Y., Wu, Y., Liu, X., & Wu, Q. (2022). Maintaining Reasoning Consistency in Compositional Visual Question Answering. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2022-June (pp. 5089-5098). Online: IEEE. DOI Scopus29 WoS25 |
| 2022 | Hong, Y., Wang, Z., Wu, Q., & Gould, S. (2022). Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2022-June (pp. 15418-15428). Online: IEEE. DOI Scopus69 WoS52 |
| 2022 | Xie, Y., Zhang, J., Xia, Y., & Wu, Q. (2022). UniMiSS: Universal Medical Self-supervised Learning via Breaking Dimensionality Barrier. In Proceedings, Part XXI of the 17th European Conference on Computer Vision (ECCV 2022), as published in Lecture Notes in Computer Science Vol. 13681 LNCS (pp. 558-575). Online: Springer. DOI Scopus64 WoS55 |
| 2022 | Ding, Y., Yu, J., Liu, B., Hu, Y., Cui, M., & Wu, Q. (2022). MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2022-June (pp. 5079-5088). Online: IEEE. DOI Scopus115 WoS99 |
| 2022 | Qiao, Y., Qi, Y., Hong, Y., Yu, Z., Wang, P., & Wu, Q. (2022). HOP: History-and-Order Aware Pretraining for Vision-and-Language Navigation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2022-June (pp. 15397-15406). New Orleans, LA, USA: IEEE. DOI Scopus80 WoS53 |
| 2022 | Chen, C., Hu, Z., Jin, S., Xiao, L., Hu, M., Wu, Q., . . . Zou, M. (2022). Classification of COVID-19 in CT Scans Using Image Smoothing and Improved Deep Residual Network. In Artificial Intelligence: First CAAI International Conference, CICAI 2021, Hangzhou, China, June 5–6, 2021, Proceedings, Part I Vol. 13069 LNAI (pp. 89-100). Switzerland: Springer. DOI |
| 2022 | Cao, Y., Wu, Q., Zhang, B., Liu, Z., & Li, J. (2022). FSE-MV: Compressed Domain Video Information Assisted Hybrid Real-Time Vehicle Speed Estimation. In C. T. Calafate, X. Chen, & Y. Wu (Eds.), Mobile Networks and Management, MONAMI 2021 Vol. 418 (pp. 100-114). Online: Springer International Publishing AG. DOI |
| 2021 | Yao, Y., Chen, T., Xie, G. S., Zhang, C., Shen, F., Wu, Q., . . . Zhang, J. (2021). Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 2623-2632). online: IEEE. DOI Scopus214 WoS171 |
| 2021 | Yao, Y., Sun, Z., Zhang, C., Shen, F., Wu, Q., Zhang, J., & Tang, Z. (2021). Jo-SRC: A Contrastive Approach for Combating Noisy Labels. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 5188-5197). online: IEEE. DOI Scopus158 WoS140 |
| 2021 | Deng, C., Chen, S., Chen, D., He, Y., & Wu, Q. (2021). Sketch, ground, and refine: top-down dense video captioning. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2021) (pp. 234-243). online: IEEE. DOI Scopus69 WoS44 |
| 2021 | Hong, Y., Wu, Q., Qi, Y., Rodriguez Opazo, C., & Gould, S. (2021). VLN↻BERT: A Recurrent Vision-and-Language BERT for Navigation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 1643-1653). online: IEEE. DOI Scopus244 WoS166 |
| 2021 | Xu, G., Niu, S., Tan, M., Luo, Y., Du, Q., & Wu, Q. (2021). Towards Accurate Text-based Image Captioning with Content Diversity Exploration. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 12632-12641). online: IEEE. DOI Scopus66 WoS45 |
| 2021 | Gao, C., Chen, J., Liu, S., Wang, L., Zhang, Q., & Wu, Q. (2021). Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 3063-3072). online: IEEE Computer Society. DOI Scopus80 WoS58 |
| 2021 | Wu, Q., Wu, C. J., Zhu, Y., & Joo, J. (2021). Communicative Learning with Natural Gestures for Embodied Navigation Agents with Human-in-the-Scene. In 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 4095-4102). online: IEEE. DOI Scopus19 WoS8 |
| 2021 | Yu, J., Chai, Y., Wang, Y., Hu, Y., & Wu, Q. (2021). CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation. In IJCAI International Joint Conference on Artificial Intelligence (pp. 1274-1280). online: International Joint Conferences on Artificial Intelligence. DOI Scopus67 WoS41 |
| 2021 | Suo, W., Sun, M., Wang, P., & Wu, Q. (2021). Proposal-free One-stage Referring Expression via Grid-Word Cross-Attention. In IJCAI International Joint Conference on Artificial Intelligence (pp. 1032-1038). online: International Joint Conferences on Artificial Intelligence. DOI Scopus10 WoS7 |
| 2021 | Gao, C., Zhu, Q., Wang, P., & Wu, Q. (2021). Chop Chop BERT: Visual Question Answering by Chopping VisualBERT's Heads. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21) (pp. 664-670). United States: International Joint Conferences on Artificial Intelligence. DOI Scopus2 WoS1 |
| 2021 | An, D., Qi, Y., Huang, Y., Wu, Q., Wang, L., & Tan, T. (2021). Neighbor-view Enhanced Model for Vision and Language Navigation. In MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia (pp. 5101-5109). virtual online: ACM. DOI Scopus64 WoS50 |
| 2021 | Qiao, Y., Chen, Q., Deng, C., Ding, N., Qi, Y., Tan, M., . . . Wu, Q. (2021). R-GAN: Exploring Human-like Way for Reasonable Text-to-Image Synthesis via Generative Adversarial Networks. In MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia (pp. 2085-2093). New York, NY, United States: Association for Computing Machinery. DOI Scopus17 WoS15 |
| 2021 | Wen, Z., Xu, G., Tan, M., Wu, Q., & Wu, Q. (2021). Debiased Visual Question Answering from Feature and Sample Perspectives. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. S. Liang, & J. Wortman Vaughan (Eds.), Advances in Neural Information Processing Systems 34 Vol. 5 (pp. 3784-3796). Online: Neural Information Processing Systems Foundation, Inc (NeurIPS). Scopus75 WoS41 |
| 2021 | He, K., Huang, Y., Wu, Q., Yang, J., An, D., Sima, S., & Wang, L. (2021). Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision. In Advances in Neural Information Processing Systems Vol. 2 (pp. 652-663). Online: Neural Information Processing Systems (NIPS). Scopus31 WoS59 |
| 2021 | Kazemi Moghaddam, M., Wu, Q., Abbasnejad, E., & Shi, J. (2021). Optimistic Agent: Accurate Graph-Based Value Estimation for More Successful Visual Navigation. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV 2021) (pp. 3732-3741). online: IEEE. DOI Scopus15 WoS13 |
| 2021 | Zheng, Y., Wen, Z., Tan, M., Zeng, R., Chen, Q., Wang, Y., & Wu, Q. (2021). Modular graph attention network for complex visual relational reasoning. In Proceedings of the 15th Asian Conference on Computer Vision (ACCV 2020), as published in Lecture Notes in Computer Science Vol. 12627 (pp. 137-153). Cham, Switzerland: Springer. DOI Scopus2 |
| 2021 | Zhu, Q., Gao, C., Wang, P., & Wu, Q. (2021). Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps. In Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence and the Eleventh Symposium on Educational Advances in Artificial Intelligence Vol. 35 (pp. 3608-3615). Online: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus45 WoS36 |
| 2021 | Wang, Z., Bao, R., Wu, Q., & Liu, S. (2021). Confidence-aware Non-repetitive Multimodal Transformers for TextCaps. In Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence and the Eleventh Symposium on Educational Advances in Artificial Intelligence Vol. 35 (pp. 2835-2843). Online: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus24 WoS17 |
| 2021 | Liu, L., He, M., Xu, G., Tan, M., & Wu, Q. (2021). How to Train Your Agent to Read and Write. In Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence and the Eleventh Symposium on Educational Advances in Artificial Intelligence Vol. 35 (pp. 13397-13405). Online: Association for the Advancement of Artificial Intelligence (AAAI). DOI Scopus3 WoS3 |
| 2021 | Wu, Q., Qin, M., Song, J., & Liu, L. (2021). An improved method of low light image enhancement based on Retinex. In 2021 6th International Conference on Image, Vision and Computing, ICIVC 2021 (pp. 233-241). online: IEEE. DOI Scopus13 |
| 2020 | Hong, Y., Rodriguez Opazo, C., Wu, Q., & Gould, S. (2020). Sub-Instruction Aware Vision-and-Language Navigation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 3360-3376). virtual online: Association for Computational Linguistics. DOI Scopus44 WoS34 |
| 2020 | Jiang, X., Yu, J., Qin, Z., Zhuang, Y., Zhang, X., Hu, Y., & Wu, Q. (2020). DualVD: An adaptive dual encoding model for deep visual understanding in visual dialogue. In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence Vol. 34 (pp. 11125-11132). online: AAAI. Scopus61 WoS44 |
| 2020 | Jing, C., Wu, Y., Zhang, X., Jia, Y., & Wu, Q. (2020). Overcoming language priors in VQA via decomposed linguistic representations. In Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI-20) Vol. 34 (pp. 11181-11188). online: AAAI. DOI Scopus101 WoS65 |
| 2020 | Zhang, C., Yao, Y., Shu, X., Li, Z., Tang, Z., & Wu, Q. (2020). Data-driven Meta-set Based Fine-Grained Visual Recognition. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 2372-2381). online: ACM. DOI Scopus22 WoS16 |
| 2020 | Wang, P., Liu, D., Li, H., & Wu, Q. (2020). Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 28-36). online: ACM. DOI Scopus20 WoS16 |
| 2020 | Jing, C., Wu, Y., Pei, M., Hu, Y., Jia, Y., & Wu, Q. (2020). Visual-Semantic Graph Matching for Visual Grounding. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 4041-4050). online: ACM. DOI Scopus33 WoS21 |
| 2020 | Liu, F., Xu, G., Wu, Q., Du, Q., Jia, W., & Tan, M. (2020). Cascade Reasoning Network for Text-based Visual Question Answering. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 4060-4069). online: ACM. DOI Scopus57 WoS44 |
| 2020 | Hong, Y., Rodriguez-Opazo, C., Qi, Y., Wu, Q., & Gould, S. (2020). Language and visual entity relationship graph for agent navigation. In Advances in Neural Information Processing Systems Vol. 2020-December (pp. 1-12). online: NIPS. Scopus94 |
| 2020 | Liao, Z., Liu, L., Wu, Q., Teney, D., Shen, C., Van Den Hengel, A., & Verjans, J. (2020). Medical data inquiry using a question answering model. In Proceedings: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI 2020) Vol. 2020-April (pp. 1490-1493). online: IEEE. DOI Scopus9 WoS4 |
| 2020 | Wang, H., Wu, Q., & Shen, C. (2020). Soft Expert Reward Learning for Vision-and-Language Navigation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 12354 LNCS (pp. 126-141). Switzerland: Springer Nature. DOI Scopus27 WoS20 |
| 2020 | Tang, R., Ma, C., Zhang, W. E., Wu, Q., & Yang, X. (2020). Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 12364 LNCS (pp. 437-453). Switzerland: Springer International Publishing. DOI Scopus39 WoS34 |
| 2020 | Deng, C., Ding, N., Tan, M., & Wu, Q. (2020). Length-Controllable Image Captioning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 12358 LNCS (pp. 712-729). Switzerland: Springer International Publishing. DOI Scopus49 WoS48 |
| 2020 | Qi, Y., Pan, Z., Zhang, S., van den Hengel, A., & Wu, Q. (2020). Object-and-Action Aware Model for Visual Language Navigation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 12355 LNCS (pp. 303-317). Switzerland: Springer International Publishing. DOI Scopus78 WoS49 |
| 2020 | Jiang, X., Yu, J., Sun, Y., Qin, Z., Zhu, Z., Hu, Y., & Wu, Q. (2020). DAM: Deliberation, abandon and memory networks for generating detailed and non-repetitive responses in visual dialogue. In IJCAI International Joint Conference on Artificial Intelligence Vol. 2021-January (pp. 687-693). online: AAAI Press. Scopus10 WoS4 |
| 2020 | Zhu, Z., Yu, J., Wang, Y., Sun, Y., Hu, Y., & Wu, Q. (2020). Mucko: Multi-layer cross-modal knowledge reasoning for fact-based visual question answering. In IJCAI International Joint Conference on Artificial Intelligence Vol. 2021-January (pp. 1097-1103). online: AAAI Press. Scopus97 WoS98 |
| 2020 | Chen, Z., Wang, P., Ma, L., Wong, K. Y. K., & Wu, Q. (2020). Cops-Ref: A New Dataset and Task on Compositional Referring Expression Comprehension. In Proceedings of the 2020 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 10083-10092). online: IEEE. DOI Scopus61 WoS21 |
| 2020 | Qi, Y., Wu, Q., Anderson, P., Wang, X., Wang, W. Y., Shen, C., & Van Den Hengel, A. (2020). Reverie: Remote embodied visual referring expression in real indoor environments. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 9979-9988). online: IEEE. DOI Scopus292 WoS214 |
| 2020 | Liao, Z., Wu, Q., Shen, C., Van Den Hengel, A., & Verjans, J. (2020). AIML at VQA-Med 2020: Knowledge inference via a skeleton-based sentence mapping approach for medical domain visual question answering. In L. Cappellato, C. Eickhoff, N. Ferro, & A. Névéol (Eds.), Proceedings of the 11th International Conference of the CLEF Initiative (CLEF 2020), as published in CEUR Workshop Proceedings Vol. 2696 (pp. 1-14). online: CEUR-WS. Scopus8 |
| 2020 | Abbasnejad, M., Abbasnejad, I., Wu, Q., Shi, Q., & Van Den Hengel, A. (2020). Gold seeker: Information gain from policy distributions for goal-oriented vision-and-langauge reasoning. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 13447-13456). online: IEEE. DOI Scopus4 WoS1 |
| 2020 | Chen, S., Jin, Q., Wang, P., & Wu, Q. (2020). Say as you wish: Fine-grained control of image caption generation with abstract scene graphs. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 9959-9968). online: IEEE. DOI Scopus236 WoS183 |
| 2020 | Chen, S., Zhao, Y., Jin, Q., & Wu, Q. (2020). Fine-grained video-text retrieval with hierarchical graph reasoning. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 10635-10644). online: IEEE. DOI Scopus336 WoS157 |
| 2020 | Chen, Q., Wu, Q., Tang, R., Wang, Y., Wang, S., & Tan, M. (2020). Intelligent home 3D: Automatic 3D-house design from linguistic descriptions only. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 12622-12631). online: IEEE. DOI Scopus41 WoS26 |
| 2019 | Duan, X., Wu, Q., Gan, C., Zhang, Y., Huang, W., Van Den Hengel, A., & Zhu, W. (2019). Watch, reason and code: Learning to represent videos using program. In Proceedings of the 27th ACM International Conference on Multimedia (ACM Multimedia 2019), MM '19 (pp. 1543-1551). online: Association for Computing Machinery. DOI Scopus5 WoS1 |
| 2019 | Abbasnejad, E., Wu, Q., Shi, Q., & Van Den Hengel, A. (2019). What's to know? uncertainty as a guide to asking goal-oriented questions. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2019-June (pp. 4150-4159). online: IEEE. DOI Scopus18 WoS10 |
| 2019 | Zhang, J., Wu, Q., Zhang, J., Shen, C., & Lu, J. (2019). Mind your neighbours: Image annotation with metadata neighbourhood graph co-attention networks. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2019-June (pp. 2951-2959). online: IEEE. DOI Scopus22 WoS13 |
| 2019 | Wang, P., Wu, Q., Cao, J., Shen, C., Gao, L., & Hengel, A. V. D. (2019). Neighbourhood watch: Referring expression comprehension via language-guided graph attention networks. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2019-June (pp. 1960-1968). online: IEEE. DOI Scopus277 WoS232 |
| 2018 | Cao, I., Guo, Y., Wu, Q., Shen, C., Huang, J., & Tan, M. (2018). Adversarial learning with local coordinate coding. In 35th International Conference on Machine Learning, ICML 2018 Vol. 2 (pp. 1104-1117). online: PMLR. Scopus23 WoS11 |
| 2018 | Zhuang, Z., Tan, M., Zhuang, B., Liu, J., Guo, Y., Wu, Q., . . . Zhu, J. (2018). Discrimination-aware Channel Pruning for Deep Neural Networks. In Advances in Neural Information Processing Systems Vol. 2018-December (pp. 875-886). online: NIPS. Scopus453 WoS267 |
| 2018 | Zhang, J., Zhang, J., Wu, Q., Wu, Q., Xu, J., Lu, J., . . . Tang, Z. (2018). Historical image annotation by exploring the tag relevance. In Proceedings - 4th Asian Conference on Pattern Recognition, ACPR 2017 (pp. 646-651). Nanjing, PEOPLES R CHINA: IEEE. DOI Scopus1 WoS1 |
| 2018 | Zhuang, B., Wu, Q., Shen, C., Reid, I., & Van Den Hengel, A. (2018). HCVRD: A benchmark for large-scale human-centered visual relationship detection. In 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 (pp. 7631-7638). New Orleans: Association for the Advancement of Artificial Intelligence. Scopus37 WoS29 |
| 2018 | Zhang, J., Wu, Q., Zhang, J., Shen, C., & Lu, J. (2018). Kill two birds with one stone: Weakly-supervised neural network for image annotation and tag refinement. In 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 (pp. 7550-7557). New Orleans: Association for the Advancement of Artificial Intelligence. Scopus9 WoS6 |
| 2018 | Wu, Q., Wang, P., Shen, C., Reid, I., & Hengel, A. (2018). Are you talking to me? Reasoned visual dialog generation through adversarial learning. In Proceedings: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018) (pp. 6106-6115). Salt Lake City, UT: IEEE. DOI Scopus110 WoS91 |
| 2018 | Deng, C., Wu, Q., Wu, Q., Hu, F., Lyu, F., & Tan, M. (2018). Visual Grounding via Accumulated Attention. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 7746-7755). online: IEEE. DOI Scopus176 WoS135 |
| 2018 | Anderson, P., Das, A., & Wu, Q. (2018). Connecting language and vision to actions. In ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference Tutorial Abstracts (pp. 10-14). Melbourne: Association for Computational Linguistics. DOI |
| 2018 | Huang, Y., Wu, Q., Song, C., & Wang, L. (2018). Learning Semantic Concepts and Order for Image and Sentence Matching. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 6163-6171). online: IEEE. DOI Scopus340 WoS274 |
| 2018 | Ma, C., Shen, C., Dick, A., Wu, Q., Wang, P., Van Den Hengel, A., & Reid, I. (2018). Visual Question Answering with memory-augmented network. In Proceedings: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018) (pp. 6975-6984). Salt Lake City, Utah: IEEE. DOI Scopus102 WoS79 |
| 2018 | Anderson, P., Wu, Q., Teney, D., Bruce, J., Johnson, M., Sünderhauf, N., . . . Hengel, A. V. D. (2018). Vision-and-language navigation: interpreting visually-grounded navigation instructions in real environments. In Proceedings: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018) Vol. abs/1711.07280 (pp. 3674-3683). Salt Lake City, UT: IEEE. DOI Scopus1073 WoS1257 |
| 2018 | Zhang, J., Xie, Y., Wu, Q., & Xia, Y. (2018). Skin lesion classification in dermoscopy images using synergic deep learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 11071 LNCS (pp. 12-20). Switzerland: Springer. DOI Scopus45 WoS29 |
| 2018 | Zhang, J., Wu, Q., Shen, C., Zhang, J., Lu, J., & van den Hengel, A. (2018). Goal-oriented visual question generation via intermediate rewards. In V. Ferrari, M. Hebert, C. Sminchisescu, & Y. Weiss (Eds.), Computer Vision - ECCV 2018: Proceedings, Part V, Lecture Notes in Computer Science Vol. 11209 (pp. 189-204). Munich: Springer. DOI Scopus13 WoS17 |
| 2018 | Zhuang, B., Wu, Q., Shen, C., Reid, I., & van den Hengel, A. (2018). Parallel attention: a unified framework for visual object discovery through dialogs and queries. In Proceedings: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018) (pp. 4252-4261). Salt Lake City, UT: IEEE. DOI Scopus138 WoS98 |
| 2018 | Wang, C., Zhao, R., Yang, X., & Wu, Q. (2018). Research of UAV Target Detection and Flight Control Based on Deep Learning. In 2018 International Conference on Artificial Intelligence and Big Data (ICAIBD) (pp. 170-174). online: IEEE. DOI WoS14 |
| 2018 | Wu, Q., Wang, P., Liu, E., Fan, Y., Duan, D., Wang, Z., & Cai, S. (2018). Design and Implementation of Learning Management Platform for Aviation Flight Training Based on SCORM/AICC Standard-A Case Study of K Airline Company Flight Training Learning Platform. In ADVANCED SCIENCE LETTERS Vol. 24 (pp. 5194-5198). INDONESIA, Bandung: AMER SCIENTIFIC PUBLISHERS. DOI WoS1 |
| 2017 | Wang, Q., Chen, W., & Wu, Q. (2017). The research and application of an real-time embedded measurement and control system for the river discharge. In S. Li, Y. Dai, & Y. Cheng (Eds.), 2017 4TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE) (pp. 1295-1298). Changsha, PEOPLES R CHINA: IEEE. DOI |
| 2017 | Wang, P., Wu, Q., Shen, C., & van den Hengel, A. (2017). The VQA-machine: learning how to use existing vision algorithms to answer new questions. In Proceedings: 30th IEEE Conference on Computer Vision and Pattern Recognition Vol. 2017-January (pp. 3909-3918). Honolulu: IEEE. DOI Scopus73 WoS42 |
| 2017 | Wang, P., Wu, Q., Shen, C., Dick, A., & Van Den Hengel, A. (2017). Explicit knowledge-based reasoning for visual question answering. In C. Sierra (Ed.), Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (pp. 1290-1296). online: IJCAI. DOI Scopus156 WoS100 |
| 2016 | Wu, Q., Wang, P., Shen, C., Dick, A., & Van Den Hengel, A. (2016). Ask me anything: free-form visual question answering based on knowledge from external sources. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2016-December (pp. 4622-4630). Las Vegas, NV: IEEE. DOI Scopus322 WoS212 |
| 2016 | Wu, Q., Shen, C., Liu, L., Dick, A., & Van Den Hengel, A. (2016). What value do explicit high level concepts have in vision to language problems? In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Vol. 2016-December (pp. 203-212). Las Vegas, NV: IEEE. DOI Scopus430 WoS298 |
| 2016 | Wu, Q., Wang, C., Li, A., & Huang, B. (2016). Integral sliding mode controller design for near space vehicle with input constraints. In 2016 IEEE CHINESE GUIDANCE, NAVIGATION AND CONTROL CONFERENCE (CGNCC) (pp. 187-191). PEOPLES R CHINA, Nanjing: IEEE. WoS2 |
| 2016 | Gao, G., Yang, H., Wu, Q., Mao, S. -J., & Yin, W. -L. (2016). A Wideband and Low Cross Polarization Slot Antenna Based on Differential-Feed. In INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATION AND NETWORK ENGINEERING (WCNE 2016) (4 pages). PEOPLES R CHINA, Beijing: DESTECH PUBLICATIONS, INC. |
| 2016 | Gao, G., Yang, H., Jin, Z., & Wu, Q. (2016). A Broadband Dual-polarization Slot Antenna Based on Substrate-integrated Cavity. In 2016 PROGRESS IN ELECTROMAGNETICS RESEARCH SYMPOSIUM (PIERS) (pp. 1994-1998). PEOPLES R CHINA, Shanghai: IEEE. |
| 2016 | Wu, Q., Yang, H., Jin, Z., Gao, G., & Cao, D. (2016). A Design of Band-pass Filter with Steep Stopband Attenuation Based on Transmission Zeros. In 2016 PROGRESS IN ELECTROMAGNETICS RESEARCH SYMPOSIUM (PIERS) (pp. 3482-3486). PEOPLES R CHINA, Shanghai: IEEE. |
| 2016 | Wang, X., Wu, Q., & Yang, J. (2016). Extended PGA Processing of High Resolution Airborne SAR Imagery Reconstructed via Backprojection Algorithm. In 2016 CIE INTERNATIONAL CONFERENCE ON RADAR (RADAR) (3 pages). PEOPLES R CHINA, Guangzhou: IEEE. |
| 2016 | Wu, Q., Yang, H., Gao, G., Gu, L., & Zhao, F. (2016). A Design of High Gain Archimedean Spiral Antenna. In INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATION AND NETWORK ENGINEERING (WCNE 2016) (4 pages). PEOPLES R CHINA, Beijing: DESTECH PUBLICATIONS, INC. |
| 2016 | Tang, J., Guo, Y., Lai, X., Liu, Y., & Wu, Q. (2016). Study on the Correlation between Fe²⁺ and Peridot's Yellow Green Color and Quality Evaluation of Color Based on CIE1976 L*a*b* Uniform Color Space. In X. Xiao, & P. Han (Eds.), PROCEEDINGS OF THE 2016 5TH INTERNATIONAL CONFERENCE ON ENVIRONMENT, MATERIALS, CHEMISTRY AND POWER ELECTRONICS Vol. 84 (pp. 599-604). PEOPLES R CHINA, Zhengzhou: ATLANTIS PRESS. WoS1 |
| 2015 | Wu, Q., Wu, Q., Zhao, S., Wei, M., & Wang, F. L. (2015). Knowledge Communication Analysis Based on Clustering and Association Rules Mining. In A. Liu, Y. Ishikawa, T. Qian, S. Nutanong, & M. A. Cheema (Eds.), DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2015 Vol. 9052 (pp. 66-75). VIETNAM, Hanoi: SPRINGER-VERLAG BERLIN. DOI |
| 2015 | Wu, Q., Vogt, A., Brüns, H. -D., Gronwald, F., & Schuster, C. (2015). Numerical and Experimental Evaluation of Electromagnetic Coupling between Radiating Antenna Structures inside a Computer Casing. In 2015 IEEE INTERNATIONAL SYMPOSIUM ON ELECTROMAGNETIC COMPATIBILITY (EMC) (pp. 328-333). GERMANY, Dresden: IEEE. |
| 2015 | Cai, H., Wu, Q., & Hall, P. (2015). Beyond Photo-Domain Object Recognition: Benchmarks for the Cross-Depiction Problem. In Proceedings of the IEEE International Conference on Computer Vision Vol. 2015-February (pp. 74-79). Santiago: IEEE. DOI Scopus3 WoS4 |
| 2015 | Wu, Q., Chen, F. -C., & Huang, R. -Y. (2015). Detecting Temporal Community from Dynamic Heterogeneous Networks. In PROCEEDINGS OF 2015 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2015) (pp. 610-613). Harbin, PEOPLES R CHINA: IEEE. |
| 2014 | Wu, Q., Cai, H., & Hall, P. (2014). Learning graphs to model visual objects across different depictive styles. In D. Fleet, T. Pajdla, B. Schiele, & T. Tuytelaars (Eds.), Proceedings of the 13th European Conference on Computer Vision Vol. VII (pp. 313-328). Zurich, Switzerland: Springer. DOI Scopus17 WoS10 |
| 2013 | Wu, Q., & Hall, P. (2013). Modelling visual objects invariant to depictive style. In T. Burghardt, D. Damen, W. Mayol-Cuevas, & M. Mirmehdi (Eds.), Proceedings of the British Machine Vision Conference (pp. 23.1-23.12). Bristol, UK: BMVA Press. DOI Scopus4 |
| 2013 | Hao, Y., Wu, Q., & Liu, B. (2013). Literature Review on the Impact of Income Distribution Gap on Consumer Demand. In G. Lee (Ed.), PSYCHOLOGY, MANAGEMENT AND SOCIAL SCIENCE Vol. 18 (pp. 65-70). PEOPLES R CHINA, Shenzhen: INFORMATION ENGINEERING RESEARCH INST, USA. |
| 2012 | Wu, Q., Fu, X., & Shen, X. (2012). Automatic micro-expression analysis. In INTERNATIONAL JOURNAL OF PSYCHOLOGY Vol. 47 (pp. 144-145). JOHN WILEY & SONS LTD. |
| 2012 | Wu, Q., & Hall, P. (2012). Prime shapes in natural images. In R. Bowden, J. Collomosse, & K. Mikolajczyk (Eds.), Proceedings of the British Machine Vision Conference (pp. 45.1-45.12). Surrey, UK: BMVA Press. DOI Scopus4 WoS2 |
| 2011 | Hoffman, J., Wang, L. -M., Wu, Q., & Morton, K. (2011). Uptake of 2-deoxyglucose analogs by thrombotically activated cells. In JOURNAL OF NUCLEAR MEDICINE Vol. 52 (2 pages). SOC NUCLEAR MEDICINE INC. |
| 2008 | Liu, Y., Yin, Y., Teng, Z., Wu, Q., & Li, G. (2008). Activities prediction of drug molecules by using the optimal ensemble based on uniform design. In D. S. Huang, D. C. Wunsch, D. S. Levine, & K. H. Jo (Eds.), ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS, PROCEEDINGS Vol. 5226 (pp. 106-+). PEOPLES R CHINA, Shanghai: SPRINGER-VERLAG BERLIN. WoS1 |
| 2007 | Wu, Q., Shao, T. -C., & Chen, T. (2007). Robust self-calibration from single image using RANSAC. In G. Bebis, R. Boyle, B. Parvin, D. Koracin, N. Paragios, S. M. Tanveer, . . . T. Malzbender (Eds.), ADVANCES IN VISUAL COMPUTING, PT I Vol. 4841 (pp. 230-+). NV, Lake Tahoe: SPRINGER-VERLAG BERLIN. WoS5 |
| 2006 | Wu, Q., Song, M., Bu, J., & Chen, C. (2006). EigenExpress approach in recognition of facial expression using GPU. In T. S. Huang, N. Sebe, M. S. Lew, V. Pavlovic, M. Kolsch, A. Galata, & B. Kisacanin (Eds.), COMPUTER VISION IN HUMAN-COMPUTER INTERACTION Vol. 3979 (pp. 12-20). AUSTRIA, Graz: SPRINGER-VERLAG BERLIN. WoS1 |
| Year | Citation |
|---|---|
| 2024 | Li, Y., Yu, J., Gai, K., Liu, B., Xiong, G., & Wu, Q. (2024). T2VIndexer: A Generative Video Indexer for Efficient Text-Video Retrieval. DOI Scopus2 |
| Year | Citation |
|---|---|
| 2024 | Zhou, G., Hong, Y., Wang, Z., Wang, X. E., & Wu, Q. (2024). NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models. |
| 2024 | Zhou, G., Hong, Y., Wang, Z., Zhao, C., Bansal, M., & Wu, Q. (2024). SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts. |
| 2024 | Wei, Y., Fu, S., Jiang, W., Zhang, Z., Zeng, Z., Wu, Q., . . . Zhang, Y. (2024). GITA: Graph to Visual and Textual Integration for Vision-Language Graph Reasoning. |
| 2024 | Chen, Q., Zhang, B., Wang, G., & Wu, Q. (2024). Weak-eval-Strong: Evaluating and Eliciting Lateral Thinking of LLMs with Situation Puzzles. |
| 2024 | Chen, Q., Zhao, R., Wang, S., Phan, V. M. H., Hengel, A. V. D., Verjans, J., . . . Wu, Q. (2024). A Survey of Medical Vision-and-Language Applications and Their Techniques. |
| 2024 | Phan, V. M. H., Xie, Y., Qi, Y., Liu, L., Liu, L., Zhang, B., . . . Verjans, J. W. (2024). Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework. |
| 2023 | Chen, Q., Pitawela, D., Zhao, C., Zhou, G., Chen, H. -T., & Wu, Q. (2023). WebVLN: Vision-and-Language Navigation on Websites. |
| 2021 | Chen, Q., Li, Y., Qi, Y., Zhou, J., Tan, M., & Wu, Q. (2021). V2C: Visual Voice Cloning. |
| 2021 | Moghaddam, M. K., Abbasnejad, E., Wu, Q., Shi, J., & Hengel, A. V. D. (2021). Learning for Visual Navigation by Imagining the Success. |
| 2019 | Parvaneh, A., Abbasnejad, E., Wu, Q., & Shi, J. (2019). Show, Price and Negotiate: A Hierarchical Attention Recurrent Visual Negotiator. |
- MyIP-7370, CERA grants, Anton van den Hengel, Anthony Dick, Qi Wu, Answer Me Why: Explainability is Critical if We are to Trust Automated Decision Making, 98,000 AUD
- MyIP-7370, CERA grants, Anton van den Hengel, Anthony Dick, Qi Wu, Robust long-term Autonomous Navigation, 98,000 AUD
- Facebook’s Research and Academic Relations Program, Peter Anderson, Qi Wu, Damien Teney, Niko Sunderhauf, Stephen Gould, Anton van den Hengel, Treasure Hunt: Natural Language N

- Computer Vision
- Machine Learning
- Algorithms and Data Structure Analysis
- Research Methods
- Advanced Topics in Computer Science
| Date | Role | Research Topic | Program | Degree Type | Student Load | Student Name |
|---|---|---|---|---|---|---|
| 2025 | Principal Supervisor | CNN-TTT Fusion: Advancing 3D Medical Image Segmentation | Master of Philosophy | Master | Full Time | Mr Yuming Chen |
| 2025 | Principal Supervisor | Embodied Vision-and-Language Navigation: Deploy Vision-and-Language Navigation in Real-World via Knowledge Distillation from Large Foundation Models | Doctor of Philosophy | Doctorate | Full Time | Mr Zerui Li |
| 2025 | Principal Supervisor | Foundation Models for Embodied Navigation | Doctor of Philosophy | Doctorate | Full Time | Mr Xiangyu Shi |
| 2025 | Co-Supervisor | Multi-agent Vision-and-Language Navigation Based on Large Foundation Models | Doctor of Philosophy | Doctorate | Full Time | Mr Qunchao Jin |
| 2025 | Principal Supervisor | Towards Building Real-World Embodied Vision Language Navigation Agents | Doctor of Philosophy | Doctorate | Full Time | Mr Xunyi Zhao |
| 2024 | Principal Supervisor | Vision-language Pre-training in Medical Domain | Doctor of Philosophy | Doctorate | Full Time | Ms Sinuo Wang |
| 2024 | Principal Supervisor | Direct Fitting 3D Generative Models Using Volume Rendering | Master of Philosophy | Master | Full Time | Mr Jian Zhou |
| 2024 | Principal Supervisor | Parameter-efficient Tuning Large Vision-Language Models | Doctor of Philosophy | Doctorate | Full Time | Mr Shuai Fu |
| 2023 | Principal Supervisor | Vision-and-Language in the Wild | Doctor of Philosophy | Doctorate | Full Time | Mr Zheng Yu |
| 2023 | Principal Supervisor | Efficient Video Foundation Model | Doctor of Philosophy | Doctorate | Full Time | Mr Feng Chen |
| 2022 | Principal Supervisor | Vision-and-Language Methods in Clinical Applications | Doctor of Philosophy | Doctorate | Full Time | Mr Chaohan Wang |
| 2022 | Co-Supervisor | MUDE: Mixed-reality Unified Development Environment for Context-Aware AI Automation Tasks | Doctor of Philosophy | Doctorate | Full Time | Miss Xiaoyan Wei |
| 2022 | Principal Supervisor | Spatiotemporal Multimodal Learning in Embodied AI | Doctor of Philosophy | Doctorate | Full Time | Mr Gengze Zhou |
| Date | Role | Research Topic | Program | Degree Type | Student Load | Student Name |
|---|---|---|---|---|---|---|
| 2022 - 2023 | Principal Supervisor | Vision-and-Language Navigation in the Real-World | Master of Philosophy | Master | Full Time | Mr Chongyang Zhao |
| 2021 - 2024 | Principal Supervisor | Multi-modal Generation, Synergy and Evaluation | Doctor of Philosophy | Doctorate | Full Time | Mr Qi Chen |
| 2021 - 2025 | Co-Supervisor | Finding the Optimal Path in Real-World Environments Using Natural Language Instructions | Doctor of Philosophy | Doctorate | Full Time | Mr Bahram Mohammadi |
| 2020 - 2023 | Principal Supervisor | General Vision and Language Methods in Real Applications: A Focus on Vision-and-Language Navigation | Doctor of Philosophy | Doctorate | Full Time | Miss Yanyuan Qiao |
| 2020 - 2024 | Principal Supervisor | Language-based Visual Understanding | Doctor of Philosophy | Doctorate | Full Time | Mr Chaorui Deng |
| 2019 - 2022 | Co-Supervisor | Towards Optimistic, Imaginative, and Harmonious Reinforcement Learning in Single-Agent and Multi-Agent Environments | Doctor of Philosophy | Doctorate | Full Time | Mr Mahdi Kazemi Moghaddam |
| 2018 - 2021 | Co-Supervisor | Fully Convolutional Instance-level Visual Recognition | Doctor of Philosophy | Doctorate | Full Time | Mr Zhi Tian |
| 2018 - 2022 | Co-Supervisor | 3D Scene Reconstruction from A Monocular Image | Doctor of Philosophy | Doctorate | Full Time | Mr Wei Yin |
| 2018 - 2021 | Co-Supervisor | Multi-modality Data Analysis Using Deep Reinforcement Learning | Doctor of Philosophy | Doctorate | Full Time | Mr Hu Wang |
| 2018 - 2022 | Co-Supervisor | Efficient Deep Networks for Image Matting | Doctor of Philosophy | Doctorate | Full Time | Ms Yutong Dai |
| 2017 - 2018 | Co-Supervisor | Text Detection and Recognition in Natural Scene Images | Doctor of Philosophy | Doctorate | Full Time | Mrs Hui Li |