LLM for Large-Scale Optimization Model Auto-Formulation: A Few-Shot Learning Approach

As modern business problems grow increasingly complex, large-scale optimization modeling has become a critical backbone of decision-making. However, traditional optimization modeling is labor-intensive, making it both costly and time-consuming. We address this challenge by proposing LEAN-LLM-OPT, a novel LightwEight few-shot leArNing framework for LLM-guided large-scale OPTimization model auto-formulation, which takes a query (a problem description and associated datasets) as input and orchestrates a team of LLM agents to output the optimization formulation. LEAN-LLM-OPT innovatively applies few-shot learning to demonstrate how customized retrieval-augmented generation (RAG) tools can enhance LLM agents in optimization modeling. Specifically, upon receiving a query, a problem classification agent first determines the type of the problem. Then, a few-shot example generation agent consolidates a set of references demonstrating how optimization models are built for problems of the same type. Finally, a model generation agent learns from these examples to extract relevant information from the input datasets using RAG tools and to generate the final optimization model. Extensive simulations demonstrate that LEAN-LLM-OPT achieves state-of-the-art modeling accuracy compared with existing methods, especially on large-scale optimization problems. We further validate its effectiveness through a Singapore Airlines flight scheduling use case. As an additional practical contribution, we introduce Large-Scale-OR and Air-NRM, the first set of large-scale optimization model formulation benchmarks based on real-world applications across different domains. Our results thus provide a resource-efficient solution for large-scale optimization model auto-formulation and offer evidence for LLMs as few-shot learners in this regime. A demo of LEAN-LLM-OPT is available at https://lean-opt-llm.streamlit.app.
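The three-stage agent pipeline described above (classification, few-shot example generation, model generation) can be sketched as follows. This is a minimal illustrative sketch only: the agent internals (LLM calls, RAG retrieval, the example library) are stubbed out, and all function names and data structures are our assumptions, not the authors' actual API.

```python
# Hypothetical sketch of the three-stage LEAN-LLM-OPT pipeline. All names and
# internals here are illustrative stubs, not the framework's real implementation.

def classify_problem(description: str) -> str:
    """Problem classification agent: map a query to a problem type (stubbed)."""
    if "flight" in description.lower():
        return "airline_scheduling"
    return "generic_lp"

def generate_few_shot_examples(problem_type: str) -> list:
    """Few-shot example generation agent: consolidate reference formulations
    of the same problem type to demonstrate RAG tool usage (stubbed library)."""
    library = {
        "airline_scheduling": ["example: fleet assignment formulation"],
        "generic_lp": ["example: production planning formulation"],
    }
    return library.get(problem_type, [])

def generate_model(description: str, datasets: dict, examples: list) -> str:
    """Model generation agent: learn from the examples, extract relevant data
    from the input datasets via RAG tools, and emit the final formulation
    (stubbed as a summary string)."""
    return (f"formulation for '{description}' using {len(datasets)} dataset(s), "
            f"guided by {len(examples)} few-shot example(s)")

def lean_llm_opt(description: str, datasets: dict) -> str:
    """Orchestrate the agent team: classify -> few-shot examples -> model."""
    problem_type = classify_problem(description)
    examples = generate_few_shot_examples(problem_type)
    return generate_model(description, datasets, examples)
```

In this sketch the orchestration is a fixed sequential hand-off between agents, mirroring the order described in the abstract; the actual framework's coordination logic may differ.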