Multimodal Methods for Video Data Analytics

Video data, comprising interdependent text, image, and audio modalities that collectively characterize the same source, offer a wealth of information for business researchers. The challenge, however, is how to comprehensively account for within- and between-modality interdependencies while highlighting the vital role of both verbal and nonverbal cues embedded in video data. This talk tackles that challenge by automating video data analytics with transformer-based deep learning models, multimodal data fusion, and explainable artificial intelligence (XAI) methods. Through empirical demonstrations that measure emotion dynamics in entrepreneur pitches and the trustworthiness of sellers in live streaming commerce on TikTok, we underline the crucial role of interpersonal interactions in the success of startups and microenterprises. By bridging business research with cutting-edge computational AI/ML techniques, we provide practitioners with actionable strategies for enhancing communication effectiveness and fostering trust-based business relationships. We also provide access to our data and algorithms to support research that leverages video datasets in other contexts.