Referenced Papers (8)
Making sense of vision and touch: Learning multimodal representations for contact tasks
Lee
IEEE Int. Conf. Robot. Autom.
"This paper is cited as an example of learning multimodal representations from vision and touch for contact tasks in robotics."
SMELLNET: A large-scale dataset for real-world smell recognition
Dewei Feng, Carol Li, Wei Dai, Paul Pu Liang
ArXiv, 2025
"This paper provides a dataset for real-world smell recognition, cited in the context of AI for physical sensing, specifically AI for smell."
VideoPoet: A large language model for zero-shot video generation
D Kondratyuk, Lijun Yu, Xiuye Gu, José Lezama, Jonathan Huang, Rachel Hornung
ICML, 2023
"The speaker uses this model as an example of multimodal generative AI that can translate between modalities, such as text-to-video and video-to-audio."
Developing ICU clinical behavioral atlas using ambient intelligence and computer vision
Wei Dai, Ehsan Adeli, Zelun Luo, Dev Dash, S Lakshmikanth, Zane Durante
NEJM AI, 2025
"This work is cited as an example of applying AI for holistic health, specifically for understanding the physical health of patients in an ICU by analyzing their behavior."
OpenFace 3.0: A lightweight multitask system for comprehensive facial behavior analysis
Jiewen Hu, Leena Mathur, Paul Pu Liang, Louis-Philippe Morency
IEEE Int Conf Autom Face Gesture Recognit, 2025
"Cited in the context of holistic health, this work on facial behavior analysis is relevant to understanding social wellness through non-verbal cues."
Advancing Social Intelligence in AI: Technical Challenges and Open Questions
Mathur
EMNLP
"This paper highlights the challenges in developing AI with social intelligence, a key component of the speaker's discussion on holistic health."
WebArena: A realistic web environment for building autonomous agents
Shuyan Zhou, Frank F Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar
ICLR, 2023
"This paper is cited as an example of work on interactive AI agents that operate on the web, illustrating a shift from single-step prediction to multi-step action sequences."
VideoWebArena: Evaluating multimodal agents on video understanding web tasks
Jang
ICLR
"This citation showcases advancements in building and evaluating interactive, multimodal agents for web-based tasks."