Devharish, N (2025) LARGE LANGUAGE MODELS THROUGH THE LENS OF CAUSALITY: INSIGHTS AND APPLICATIONS. Masters thesis, Indian Institute of Science Education and Research Kolkata.
Text (MS Dissertation of N Devharish (20MS042)): 20MS042_Thesis_file.pdf - Submitted Version, restricted to repository staff only (3MB)
Abstract
Large Language Models (LLMs) have rapidly become integral to high-stakes domains, from clinical diagnostics to financial risk assessment. As these systems gain influence, the demand for interpretability—particularly through a causal lens—has become critical. Existing causal discovery methods for LLMs often rely on pairwise or iterative strategies, which fragment systemic relationships and struggle with scalability and higher-order dependencies. In this work, we introduce a unified framework for comprehensive causal graph discovery, leveraging the strengths of LLMs both with and without node metadata. Our approach comprises: (1) a prompt-based method using in-context learning (ICL) for full-graph generation when metadata is available, and (2) a data-driven method, causal_llm, for scenarios lacking such metadata. Empirical evaluations across benchmark datasets, including Asia, Sachs, and DREAM, demonstrate superior edge accuracy—up to 40% improvement—alongside significant gains in inference speed and fairness-critical precision. Building on these insights, we propose SYNC (Synergizing Generative Knowledge with Data for Causal DAG Discovery), a hybrid framework that combines LLM-generated priors with domain-specific constraints to guide reinforcement learning agents. SYNC achieves notable improvements in accuracy (≈ 117%) and runtime efficiency (≈ 42%) over leading baselines like NOTEARS and RL-BIC, while maintaining robust performance across mixed data types and large-scale graphs (≥ 70 nodes). This work underscores the potential of LLMs as foundational tools for scalable, interpretable, and domain-adaptive causal discovery.
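The abstract reports edge-level accuracy for discovered causal DAGs but does not define the metric. As a minimal sketch, assuming edge precision and recall over directed edges plus an acyclicity check on the proposed graph, an evaluation of an LLM-generated edge list might look like the following. The node names, edge lists, and the helper names `is_dag` and `edge_scores` are hypothetical illustrations, not taken from the thesis.

```python
from collections import deque


def is_dag(nodes, edges):
    """Return True if the directed graph is acyclic (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for src, dst in edges:
        children[src].append(dst)
        indegree[dst] += 1
    # Start from nodes with no incoming edges and peel the graph layer by layer.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Every node is visited only when no cycle blocks the peeling.
    return visited == len(nodes)


def edge_scores(predicted, reference):
    """Directed-edge precision and recall (assumed metric definitions)."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy graph: the reference DAG and an LLM-proposed edge list.
nodes = ["A", "B", "C", "D"]
reference = [("A", "B"), ("B", "C"), ("A", "D")]
predicted = [("A", "B"), ("C", "B"), ("A", "D")]  # one edge reversed

if is_dag(nodes, predicted):
    precision, recall = edge_scores(predicted, reference)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this sketch a reversed edge counts as both a false positive and a false negative; other conventions, such as structural Hamming distance, score reversals differently.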
| Item Type: | Thesis (Masters) |
|---|---|
| Additional Information: | Supervisors: Dr. Kripabandhu Ghosh; Dr. Anirvan Chakraborty |
| Uncontrolled Keywords: | Large Language Models, Synergizing Generative Knowledge, Lens of Causality |
| Subjects: | Q Science > QA Mathematics |
| Divisions: | Department of Mathematics and Statistics |
| Depositing User: | IISER Kolkata Librarian |
| Date Deposited: | 07 Jan 2026 05:02 |
| Last Modified: | 07 Jan 2026 10:16 |
| URI: | http://eprints.iiserkol.ac.in/id/eprint/1979 |
