CVPR Workshop on Causal and Object-Centric Representations for Robotics

Call for Papers

We are soliciting two types of submissions, four-page papers and one-page abstracts, focusing on structured and causal representations and their applications in robotics. Specific topics of interest include:

Causal representation learning: how to learn representations with deep networks that conform to cause-and-effect transformations in the pixel space.
Object-centric learning: how to learn representations that are object-specific without requiring closed-world manual annotations.
Scaling structured representations: how to learn object-centric and causal representations on real-world image and video data such as MS COCO images or videos from YouTube.
Downstream applications of structured representations: how to use causal and object-centric representations for tasks such as reinforcement learning, planning, and decision-making.
Learning of interventions: how a robot algorithm can transform and control different components of the environment to achieve certain goals.
Causal Reinforcement Learning for Embodied AI: how to best learn RL policies to achieve goals if cause-and-effect relations are known.
Benchmarks that quantify the benefits of causal and object-centric representations (e.g., systematic generalization, OOD performance, and robustness with respect to interventions).
Relations to and possible synergies with foundation models.

Submission Policy:

We encourage two types of submissions:
1. Novel Papers: 4-page submissions presenting new perspectives or experimental results. Submissions may contain up to 4 pages of main content plus any number of pages for references and supplementary material.
2. Abstracts from previously published work: One-page abstracts of previously published work (e.g., from CVPR or other related conferences). The single page should summarize the paper, outline its relation to the workshop topic, and note where the paper was presented earlier.
We ask authors to use the supplementary material only for minor details that do not fit in the main paper. We reserve the right to desk reject papers that strongly violate this format (e.g., more than 4 pages of main content before references).
4-page submissions should be fully anonymized for double-blind review.
Papers should use this style template.
Accepted submissions will appear on the workshop website (non-archival).

Important Dates and Links

Submission site opens	April 01 '24 12:00 AM UTC
Submission site	Submission should be made on OpenReview.
Submission deadline (4-page submissions)	April 28 '24 12:00 PM UTC
Submission deadline (1-page abstracts)	May 10 '24 12:00 PM UTC
Decisions announced	~~April 30th~~ 3rd of May
Camera-ready due	~~April 30th~~ 3rd of May
