
Talk at Aix-Marseille School of Economics

Prof. Ratkovic visited the group of Asst. Prof. Romain Ferrali at the Aix-Marseille School of Economics and gave a talk entitled “Large Language Models for Statistical Inference: Context Augmentation with Applications to the Two-Sample Problem, Regression, and Concordance”.

The abstract reads:
Text data pose significant challenges for statistical inference due to their high dimensionality and unstructured nature, and conventional methods based on fixed embeddings or topic models often oversimplify linguistic complexity without providing formal inferential guarantees. In this work, I integrate large language models (LLMs) with statistical inference by employing them to generate latent contexts that serve as auxiliary information. I introduce a clause function that quantifies the interaction between an observed text string and its LLM-generated latent contexts, and by averaging over these contexts, obtain string-level statistics. Under standard support and ignorability assumptions along with additional regularity conditions, I characterize the limiting distribution of these statistics, a result that extends to averages, regression coefficients, and robust rank- and quantile-based alternatives. Applications to a two-sample test and a regression model with text-based predictors and outcomes demonstrate the utility of this framework for rigorous inference on textual data.
