Work2Vec: Measuring the Latent Structure of the Labor Market

Department of Decision Sciences and Managerial Economics

Job postings provide unique insights into the demand for skills, tasks, and occupations. Using the full text of millions of online job postings, we train a natural language processing (NLP) model with over 100 million parameters to predict each posting’s occupation label and salary. To derive additional insights from the model, we develop a method of injecting deliberately constructed text snippets reflecting occupational content into postings. We apply this text injection technique to estimate the returns to several information technology skills, including machine learning itself. We further extract measurements of the topology of the labor market, building a “jobspace” from the relationships learned in the text structure. Our measurements of the jobspace imply an expansion in the types of work available in the U.S. labor market from 2010 to 2019. We compare rates of change across occupations and find substantial heterogeneity across categories. We also demonstrate that this technique can be used to construct indices of occupational technology exposure, with an application to remote work. Finally, our analysis shows that data-driven hierarchical taxonomies can be constructed from job postings to augment existing occupational taxonomies such as the Standard Occupational Classification (SOC) system.