Unraveling the Influence of Training Data and Internal Structures in Large Language Models for Enhanced Explainability (Student Abstract)

Li, Lingfang; Sen, Procheta

doi:10.1609/AAAI.V39I28.35268

AAAI 2025 pp. 29407-29409

Abstract

Recent advances in deep learning have expanded the application of large language models (LLMs) across fields such as medicine, finance, and education. Understanding the mechanisms underlying these models is essential to mitigate issues like hallucinations and bias. This study provides deep learning practitioners with insights into how specific training data points and internal structures influence model behaviour. Using influence functions and mechanistic interpretability, we will analyze the impact of data on model predictions across various tasks. Preliminary findings indicate that semantic search, implemented with similarity-search libraries such as FAISS, enables efficient identification of influential training points in GPT-2 small. Future work will extend these methods to additional tasks and more complex models, with a focus on further elucidating LLM structures to improve interpretability.
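
The abstract does not spell out how semantic search is combined with influence estimation. One plausible reading is that nearest-neighbour retrieval narrows the training set to a small candidate pool before any influence scores are computed. The sketch below illustrates only that retrieval step, under stated assumptions: embeddings are precomputed, brute-force cosine similarity in plain Python stands in for a FAISS index, and the toy vectors and the names `cosine` and `top_k_candidates` are hypothetical rather than taken from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_candidates(query_vec, train_vecs, k=2):
    """Return indices of the k training vectors most similar to the query.

    Brute-force stand-in for an approximate nearest-neighbour index;
    ties are broken by the lower training index.
    """
    scored = [(cosine(query_vec, v), i) for i, v in enumerate(train_vecs)]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in scored[:k]]

# Toy embeddings (assumed): four training examples and one test query.
train_vecs = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query_vec = [1.0, 0.05, 0.0]

# Candidate pool that a later influence computation could be restricted to.
candidates = top_k_candidates(query_vec, train_vecs, k=2)
print(candidates)
```

With a real corpus the brute-force scan would be replaced by an indexed search, since the point of the retrieval step is to avoid scoring every training example against every query.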

Cite

Text

Li and Sen. "Unraveling the Influence of Training Data and Internal Structures in Large Language Models for Enhanced Explainability (Student Abstract)." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I28.35268

Markdown

[Li and Sen. "Unraveling the Influence of Training Data and Internal Structures in Large Language Models for Enhanced Explainability (Student Abstract)." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/li2025aaai-unraveling/) doi:10.1609/AAAI.V39I28.35268

BibTeX

@inproceedings{li2025aaai-unraveling,
  title     = {{Unraveling the Influence of Training Data and Internal Structures in Large Language Models for Enhanced Explainability (Student Abstract)}},
  author    = {Li, Lingfang and Sen, Procheta},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {29407--29409},
  doi       = {10.1609/AAAI.V39I28.35268},
  url       = {https://mlanthology.org/aaai/2025/li2025aaai-unraveling/}
}