InnoVAE: Generative AI for Understanding Patents and Innovation

Department of Decision Sciences and Managerial Economics

A lack of interpretability limits the use of common unsupervised learning techniques (e.g., PCA, t-SNE) in contexts where they are meant to augment managerial decision-making. We develop a generative deep learning model based on a Variational AutoEncoder (“InnoVAE”) that converts unstructured patent text into an interpretable, spatial representation of innovation (“Innovation Space”). After validating the internal consistency of the model, we apply it to three decades of computing system patents. We show that our approach can be used to construct, at scale, economically interpretable measures that characterise a firm’s IP portfolio from the text of its patents: for example, whether a patent is a breakthrough innovation, the volume of intellectual property enclosed by a portfolio of patents, or the density of patents at a point in Innovation Space. For explaining innovation outcomes, these interpretable, engineered features have explanatory power that augments and often surpasses that of the structured patent variables that have informed the large and influential literature on patents and innovation. Our findings illustrate the potential of using generative methods on unstructured data to guide managerial decision-making.
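
To make two of the spatial measures named above concrete, the following is a minimal sketch in Python of "volume enclosed by a portfolio" and "density at a point". It is an illustration under stated assumptions, not the paper's construction: patents are assumed to be already embedded as points, the space is reduced to two dimensions so that volume becomes area, the enclosed region is taken to be the convex hull, and density is estimated with an isotropic Gaussian kernel. The function names, the `bandwidth` parameter, and the toy coordinates are all hypothetical.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Drop the last point of each chain: it repeats the start of the other.
    return lower[:-1] + upper[:-1]


def enclosed_area(points):
    """Shoelace area of the convex hull (2-D stand-in for portfolio 'volume')."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def density_at(point, points, bandwidth=1.0):
    """Isotropic 2-D Gaussian kernel density estimate at `point` (assumed kernel)."""
    h2 = bandwidth ** 2
    norm = 1.0 / (2.0 * math.pi * h2 * len(points))
    total = 0.0
    for x, y in points:
        d2 = (x - point[0]) ** 2 + (y - point[1]) ** 2
        total += math.exp(-d2 / (2.0 * h2))
    return norm * total


# Toy portfolio: hypothetical 2-D coordinates, not real patent embeddings.
portfolio = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0)]
area = enclosed_area(portfolio)                # area spanned by the portfolio
dense = density_at((1.0, 1.0), portfolio)      # crowded region of the space
sparse = density_at((5.0, 5.0), portfolio)     # far from any patent
```

In this toy setting, a portfolio that spans a larger hull covers more of the space, and a point with higher estimated density sits in a more crowded region. In a higher-dimensional latent space the same ideas would need a general-dimension hull routine and a bandwidth chosen for that space.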