Mr Yichao Cai
Higher Degree by Research Candidate
School of Computer Science and Information Technology
College of Engineering and Information Technology
I develop machine learning methods for learning meaningful and controllable representations of the world that are robust to noise, bias, and incomplete supervision. My work lies at the intersection of representation learning, multimodal modeling, and causality, guided by a central question:
"What makes a representation aligned with human conceptual understanding and amenable to control?"
I combine theoretical modeling with empirical analysis of vision–language systems to design interpretable, reliable representations that move beyond scale alone.
Current directions include:
- Causal and counterfactual modeling of multimodal signals
- Mitigating bias and misalignment in cross-modal training
- Language-guided identifiability and semantic control
My long-term goal is to enable trustworthy, human-aligned AI systems through principled representation learning.
| Language | Competency |
|---|---|
| Chinese (Mandarin) | Can read, write, speak, understand spoken language, and peer review |
| English | Can read, write, speak, understand spoken language, and peer review |
| Date | Institution name | Country | Title |
|---|---|---|---|
| 2016 - 2019 | Wuhan University of Technology | China | M.S. |
| 2012 - 2016 | Wuhan University of Technology | China | B.Eng. |
| Year | Citation |
|---|---|
| 2018 | Cai, Y., Li, D., Zhou, X., & Mou, X. (2018). Robust Drivable Road Region Detection for Fixed-Route Autonomous Vehicles Using Map-Fusion Images. Sensors, 18(12), 15 pages. |
| Year | Citation |
|---|---|
| 2024 | Cai, Y., Liu, Y., Zhang, Z., & Shi, J. Q. (2024). CLAP: Isolating Content from Style Through Contrastive Learning with Augmented Prompts. In Lecture Notes in Computer Science, Vol. 15079 (pp. 130-147). Milan, Italy: Springer Nature Switzerland. |
| Year | Citation |
|---|---|
| 2025 | Cai, Y., Liu, Y., Gao, E., Jiang, T., Zhang, Z., Hengel, A. V. D., & Shi, J. Q. (2025). On the Value of Cross-Modal Misalignment in Multimodal Representation Learning. |
- Head Tutor, Statistical Machine Learning (Semester 2, 2025) @ The University of Adelaide
- Teaching Assistant, Using Machine Learning Tools (Trimester 2, 2025) @ The University of Adelaide
- Teaching Assistant, Concepts in AI and ML (Trimester 1, 2025) @ The University of Adelaide