Authors: Alex Fawzi, Stefano De Paoli
As researchers at Abertay University, we’ve long recognised the power of qualitative inquiry. Methods like Thematic Analysis (TA) allow us to uncover rich, nuanced patterns in human experience from interviews and texts. At the same time, we’re also keenly aware of one of its biggest challenges: the sheer amount of time it takes. The process of manually reading, re-reading, and coding data is meticulous and rewarding, but can also be incredibly time-consuming.
With this barrier in mind, we saw the rise of Large Language Models (LLMs) and wondered: could we leverage this technology to reduce the burden of manual analysis? Our goal was to create a tool that could make the entire qualitative research process more accessible, manageable, and efficient, without sacrificing methodological rigour.
That initial idea evolved into TALLMesh (Thematic Analysis with Large Language Models), a new tool we’re excited to share with the research community.

A look at the academic landscape shows why such a tool is needed. In major social science fields like political science, quantitative studies outnumber qualitative ones by more than 3 to 1 in some top journals. While there are many reasons for this, the intensive labour cost of qualitative work is undoubtedly a significant factor that can deter researchers, especially those on tight deadlines.
Beyond the time commitment, a new technical barrier has emerged. While LLMs are remarkably good at natural language processing (NLP) tasks like summarisation and classification of texts, using them effectively requires programming skills, a familiarity with APIs, and the ability to craft efficient prompts. This digital divide risks leaving many brilliant researchers in the social sciences and humanities unable to benefit from these technological advancements.
Our Solution: TALLMesh
We wanted to bridge that gap. We are Professor Stefano De Paoli and Alex Fawzi from Abertay University. Building on Stefano’s foundational research into applying LLMs to thematic analysis, and with vital support from The British Academy, Alex wrote the code that would become TALLMesh.
TALLMesh is a free, open-source application with a simple graphical interface that guides you through the core phases of thematic analysis. It was designed for researchers without a background in programming: you can go from raw interview transcripts to final themes without writing a single line of code. It uses Python and the popular Streamlit framework to connect to LLM providers like OpenAI and Microsoft Azure in a secure and robust environment.
The tool operationalises a multi-step process, allowing users to:
- Set up projects and upload interview transcripts.
- Perform initial coding, where the LLM helps extract codes, descriptions, and quotes.
- Run a code reduction process to intelligently merge duplicates into a clean “unique codebook”.
- Generate themes by sorting and aggregating the unique codes into meaningful patterns.
- Explore results through a variety of visualisations to better understand your data’s structure.
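
For readers who like to see the data flow, the steps above can be sketched in a few lines of Python. This is an illustration only, not TALLMesh's actual code: the `InitialCode` class, the `reduce_codes` function, and the example codes and quotes are all hypothetical. The merge below also uses a naive name match where TALLMesh relies on an LLM to decide which codes are duplicates.

```python
from dataclasses import dataclass


@dataclass
class InitialCode:
    """One code from initial coding (hypothetical structure, for illustration)."""
    name: str
    description: str
    quote: str
    source: str  # transcript the quote was taken from


def reduce_codes(codes):
    """Merge codes into a unique codebook.

    Stand-in for the LLM-based reduction step: two codes count as
    duplicates only if their names match after trimming and lowercasing.
    """
    codebook = {}
    for code in codes:
        key = code.name.strip().lower()
        entry = codebook.setdefault(
            key,
            {"name": code.name.strip(), "description": code.description, "quotes": []},
        )
        # Keep every supporting quote, together with the transcript it came from.
        entry["quotes"].append((code.source, code.quote))
    return list(codebook.values())


# Made-up example data standing in for LLM output on two transcripts.
initial_codes = [
    InitialCode("Time pressure", "Deadlines limit depth", "I never have enough time.", "interview_01"),
    InitialCode("time pressure ", "Rushed analysis", "We had two weeks to code everything.", "interview_02"),
    InitialCode("Tool anxiety", "Worry about using APIs", "I don't know how to program.", "interview_02"),
]

unique_codebook = reduce_codes(initial_codes)
print(len(initial_codes), "initial codes ->", len(unique_codebook), "unique codes")
```

Theme generation would then take `unique_codebook` as its input, grouping related codes under broader headings.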

How is TALLMesh Different?
From the beginning, our development was guided by a commitment to methodological transparency, flexibility, and user-focused design. While several excellent AI-assisted tools exist, our approach with TALLMesh is different in a few key ways:
- A Foundation in Methodological Research: It was important for us that TALLMesh wasn’t just another wrapper for an LLM. It’s the direct implementation of a repeatable workflow developed through dedicated academic research, adapting the widely recognised six-phase model of Thematic Analysis for the LLM era.
- A Built-in Metric for Validity: We included a unique “Saturation Metric” feature. This function calculates what our research calls ‘Inductive Thematic Saturation’ (ITS) by measuring the ratio of unique to total initial codes. It gives researchers a crucial, quantitative check on the quality of the automated coding process, helping to assess if the analysis is reaching a point of saturation.
- User-Centric Design and Flexibility: The idea for TALLMesh came in part from talking to other researchers. To ensure it was genuinely usable, we ran “think-aloud” user testing and used that direct feedback to refine everything from button labels to the overall workflow. We also included a Custom Prompt Editor page and adjustable model parameters, giving researchers the flexibility to move beyond the presets and tailor the analysis to their specific research questions.
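
As a rough illustration of the ratio behind the Saturation Metric, here is a minimal Python sketch. It assumes the metric is simply the number of unique codes divided by the total number of initial codes, as described above. The function name `its_ratio` and the example counts are hypothetical, and the calculation inside TALLMesh may differ in detail.

```python
def its_ratio(unique_codes: int, total_initial_codes: int) -> float:
    """Ratio of unique codes to total initial codes (assumed form of the metric)."""
    if total_initial_codes <= 0:
        raise ValueError("total_initial_codes must be positive")
    if not 0 <= unique_codes <= total_initial_codes:
        raise ValueError("unique_codes must be between 0 and total_initial_codes")
    return unique_codes / total_initial_codes


# Made-up counts: 180 initial codes across all interviews, 45 left after reduction.
ratio = its_ratio(45, 180)
print(f"ITS ratio: {ratio:.2f}")
```

A ratio close to 1 would mean that almost every initial code is new, while a lower ratio means that more codes recur across interviews.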

Figure 3: Initial coding page with LLM settings
Current Status and What’s Next
TALLMesh is currently available on GitHub as open-source software in beta. We’re thrilled that it has been selected as an experimental pilot for the newly funded Horizon project GRAPHIA, where it will be further tested and enhanced with input from the wider social sciences and humanities (SSH) community.
We are actively working to build a user community and have created a Discord server for discussions and support. However, we know there’s more to do. Our next steps are focused on creating more comprehensive documentation and video tutorials, incorporating our latest methodological enhancements into the app, and exploring the integration of open-source LLMs to provide a cost-effective alternative to commercial models.
TALLMesh represents a significant step toward democratising the use of advanced AI in qualitative research. We believe it can empower researchers by automating tedious tasks, freeing them to focus on what humans do best: higher-level interpretation and critical thinking. We invite you to explore the tool, provide feedback, and join us in shaping the future of qualitative analysis.
Find the Project on GitHub: https://github.com/sdptn/TALLMesh_multi_page
Join the Conversation on Discord: https://discord.gg/2TPQMmMx3a
For Further Reading
Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative research in psychology, 3(2), 77-101.
De Paoli, S. (2024). Performing an Inductive Thematic Analysis of Semi-Structured Interviews With a Large Language Model: An Exploration and Provocation on the Limits of the Approach. Social Science Computer Review, 42(4), 997-1019.
De Paoli, S., & Fawzi, A. (2025). TALLMesh: a simple application for performing Thematic Analysis with Large Language Models. arXiv preprint arXiv:2504.13892.
De Paoli, S., & Mathis, W. S. (2024). Reflections on inductive thematic saturation as a potential metric for measuring the validity of an inductive thematic analysis with LLMs. Quality & Quantity, 59, 683-709.
Jacobs, A. M. (2017). The quantitative-qualitative divide in political science. In The Oxford Handbook of Political Science. Oxford University Press.
This post is part of the GRAPHIA series, which presents use cases and pilots from the GRAPHIA project, in which OPERAS takes part.
