Harnessing ChatGPT for Thematic Analysis: Are We Ready?

ChatGPT (OpenAI) is an advanced natural language processing tool with growing applications across various disciplines in medical research. Thematic analysis, a qualitative research method to identify and interpret patterns in data, is one application that stands to benefit from this technology. This viewpoint explores the use of ChatGPT in three core phases of thematic analysis within a medical context: (1) direct coding of transcripts, (2) generating themes from a predefined list of codes, and (3) preprocessing quotes for manuscript inclusion. Additionally, we explore the potential of ChatGPT to generate interview transcripts, which may be used for training purposes. We assess the strengths and limitations of using ChatGPT in these roles, highlighting areas where human intervention remains necessary. Overall, we argue that ChatGPT can function as a valuable tool during analysis, enhancing the efficiency of the thematic analysis and offering additional insights into the qualitative data. While ChatGPT may not adequately capture the full context of each participant, it can serve as an additional member of the analysis team, contributing to researcher triangulation through knowledge building and sensemaking.


Introduction
Thematic analysis is a method to analyze qualitative data, commonly obtained through semistructured interviews or focus groups, with the aim of identifying and interpreting patterns of meaning or themes within the data [1]. As a method, thematic analysis is inherently flexible and dependent on the researcher's underlying philosophical assumptions [2]. For instance, positivist approaches may place greater emphasis on coding reliability, while interpretivist approaches may place more significance on reflexivity and the researcher's role (including subjectivity) in knowledge production [2]. Accordingly, thematic analysis may be well-suited to meet varying research needs and requirements [3]. While there are multiple methods for thematic analysis, Braun and Clarke's six phases of thematic analysis is one of the most widely used approaches (see Figure 1) [1]. Given the flexibility of thematic analysis, there is room for creativity when engaging with the data and exploring tools that may aid the researcher's analytic process. With the increasing adoption of natural language processing (NLP) in healthcare research, such as diagnostic evaluation of electronic health records and the prediction of clinical outcomes based on consultation notes [4][5][6], researchers have begun to explore whether there is space for Artificial Intelligence (AI) within the domain of qualitative research. To date, several AI-based tools, such as AILYZE and MonkeyLearn, are available to aid researchers in conducting thematic analysis [7,8]. For instance, AILYZE is able to summarize interview transcripts, provide suggestions for themes, and extract relevant quotes for each theme [7]. Nevertheless, full access to these tools often requires subscription payments.
In November 2022, OpenAI released version 3.5 of Chat Generative Pre-trained Transformer (ChatGPT-3.5), a large language model-based chatbot capable of performing a wide range of text-based tasks based on context and past conversations (e.g., summarizing research articles, answering domain-specific questions, generating outlines for manuscripts) [9]. ChatGPT-3.5 is the chatbot adaptation of GPT-3.5 and is specifically optimized for interactive conversations, though GPT-3.5 and ChatGPT-3.5 share the same foundational model. ChatGPT-3.5 is able to process a request and provide a response within a combined limit of 4,096 tokens (i.e., textual units, equivalent to approximately 3,000 words in English), typically within a few seconds and free of charge [9,10].
Due to its availability to the public and free-to-use model, there has been a proliferation of discussion in the scientific community about incorporating GPT and ChatGPT into various aspects of research, including literature review, data processing, and manuscript writing [11,12]. Qualitative studies have begun exploring the use of GPT and ChatGPT for conducting various aspects and types of qualitative analysis, from transcription cleaning to theme generation via thematic analysis [13][14][15][16][17][18][19]. Table 1 summarizes these studies by describing how GPT and ChatGPT were utilized in the analysis process, the main findings, and the challenges faced during the process. While all these studies are in preprint form and some are awaiting formal peer review, they provide an early glimpse into the feasibility of harnessing ChatGPT as an assistive tool when conducting qualitative analysis.
Whereas these previous papers have focused on the broader use of (Chat)GPT in thematic analysis, its integration into medical research has yet to be investigated. In this viewpoint, we therefore explore the utilization of ChatGPT for thematic analysis in the medical domain, while addressing the unique challenges that arise within a medical context. We begin by assessing the use of ChatGPT for generating codes based on an interview transcript, followed by extracting themes from a list of generated codes. Subsequently, we utilize ChatGPT for cleaning quotes for manuscript preparation. Finally, we use ChatGPT to generate interview transcripts, which may be used for various academic and educational purposes. For each application, we identify the areas where human intervention may still be required.
Table 1 (recovered entries):
GPT-3.5 Turbo was able to provide themes with synthetic descriptions. However, some inferred themes were not considered relevant by human researchers, and ChatGPT missed themes that were reported by human researchers.
• Interviews had to be divided into chunks due to token limit.
ChatGPT was able to summarize and balance opposing ideas but tended to express ideas using descriptive terms at a lower level of abstraction compared to human researchers.
• Hallucination (e.g., made-up information in summary of texts)
• Codes inadequately captured content of transcript
• Unproductive repetition of output
• Inappropriate use of terms (e.g., "we can form some qualitative analyses")
Tabone & de Winter (2023, preprint) [17]; Netherlands; GPT-3.5 Turbo and GPT-4-0613
Study utilized GPT to (i) conduct sentiment analysis, (ii) provide meta-summaries of interviews, and (iii) identify differences between two think-aloud transcripts.
Ratings (r=.98) and summaries generated with GPT-3.5 were strongly correlated or generally in line with those generated by human researchers. GPT-3.5 was also able to summarize descriptive differences between two transcripts.
• Prompt dependent (e.g., modified prompt increased correlation of ratings)
• Summary by GPT-4.0 was richer and touched on more facets than GPT-3.5 Turbo; however, some topics that were identified did not emerge in content analysis conducted by humans.
GPT-3
Study utilized GPT to conduct deductive coding using an expert-developed codebook.
• The model occasionally produced incorrect labels.
Note: GPT, Generative Pre-trained Transformer; UK, The United Kingdom; USA, The United States of America

Utilizing ChatGPT in Thematic Analysis
Given ChatGPT's ability to handle large textual data and provide sets of meaningful codes and themes, as demonstrated by De Paoli [13], ChatGPT has the potential to improve the efficiency of the thematic analysis process. As thematic analysis is typically conducted in a cyclical and iterative manner (e.g., data collection and data analysis should occur concurrently, with insights from the analysis informing subsequent rounds of data collection and vice versa) [21], being able to digest and process large amounts of information efficiently (e.g., by requesting a summary of an interview or generating an initial set of codes to help break down a transcript) can help researchers streamline this cyclical process.
Beyond that, efficiency during this process may translate into lower research costs due to fewer hours spent on analysis.
To illustrate the various ways ChatGPT can be employed for thematic analysis, we used ChatGPT-3.5, which is free of charge. We also used ChatGPT-4.0, which currently requires a fee, to see whether the results would improve when using a newer version of ChatGPT. Analyses were done on a transcript of the first episode of "Diabetes Discussion - A Diabetes UK Podcast", in which two guests share their experiences of living with diabetes [22].

Coding the transcript
Our starting point is to investigate the capability of ChatGPT to code transcripts directly. To this end, we made the following request in ChatGPT:
The following is a transcript of an interview for a scientific paper focused on experiences of living with diabetes. Label the text by codes as is done in thematic analysis. Give the codes in the following format: CODE - First words of the sentence(s) that was/were labeled with this code

[transcript]
A section of the coded transcript by ChatGPT-3.5 is displayed in Figure 2. ChatGPT-3.5 successfully identified multiple codes in the transcript within a single answer, and the corresponding codes match the textual content. However, as the transcript progresses (Supplementary Figure 1), the coding by ChatGPT-3.5 becomes less detailed. While the subsequent answer contains multiple topics, ChatGPT-3.5 only assigned a single code, resulting in a loss of information in the coded output. Additionally, the second interview question (see Supplementary Figure 1) is also coded, which is not a common practice in thematic analysis.
Using ChatGPT-4.0 for the same analysis results in a noticeable improvement in the output (Supplementary Figure 2). Not only does ChatGPT-4.0 ignore the transcript of the sections corresponding to the interviewer, but it also captures more details of the transcript. ChatGPT-4.0 assigned five different codes for the second question (as opposed to one by ChatGPT-3.5), resulting in a set of codes that give a more complete picture of the transcript. ChatGPT demonstrates a promising ability to code transcripts, but its performance depends on the GPT model utilized. Consequently, a human researcher is still necessary to review the codes generated and ensure that the codes appropriately capture all essential data.
Furthermore, while the codes generated by ChatGPT sufficiently describe important concepts within the transcript, initial codes generated by human researchers during data analysis may continue to evolve to become more specific or reworded to better capture the data as new transcripts are being analyzed. Given ChatGPT's limited context window, ChatGPT may not remember the full text of an interview transcript and/or the codes it previously generated, resulting in a loss of contextual understanding. For these reasons, a human researcher will be required to consolidate the codes generated to ensure that the codes adequately capture concepts or patterns of interest within the context of the whole dataset.
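To make the review step concrete, the coding request quoted above could be assembled programmatically and the reply parsed back into code and excerpt pairs for a human to inspect. The following Python sketch is illustrative only and is not the procedure used in this viewpoint: the function names are hypothetical, it assumes the reply follows the requested "CODE - First words" format with one code per line, and it does not call any ChatGPT interface.

```python
import re

# Wording of the request as quoted in the text; the transcript is appended below it.
PROMPT_TEMPLATE = (
    "The following is a transcript of an interview for a scientific paper "
    "focused on experiences of living with diabetes. Label the text by codes "
    "as is done in thematic analysis. Give the codes in the following format: "
    "CODE - First words of the sentence(s) that was/were labeled with this code"
    "\n\n{transcript}"
)

# Assumed reply layout: one "CODE - excerpt" pair per line (hypothetical).
LINE_PATTERN = re.compile(r"^\s*(?P<code>.+?)\s+-\s+(?P<excerpt>.+?)\s*$")


def build_coding_prompt(transcript: str) -> str:
    """Return the full request text for one transcript."""
    return PROMPT_TEMPLATE.format(transcript=transcript.strip())


def parse_coded_output(response: str) -> list[tuple[str, str]]:
    """Extract (code, excerpt) pairs; lines that do not match the format are skipped."""
    pairs = []
    for line in response.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            pairs.append((match.group("code"), match.group("excerpt").strip('"')))
    return pairs
```

Lines that do not match the format (for example, an introductory sentence in the reply) are silently dropped, so a reviewer should still read the raw reply alongside the parsed pairs.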

Extracting themes from codes
Another way that ChatGPT may be utilized is to extract the themes and subthemes from the generated codes. These codes may be obtained either from ChatGPT or from a human analyzer.
We used the following request:
The following codes were obtained via coding of a transcript. Please identify the overarching themes and subthemes as is done in thematic analysis. These themes should have as little overlap as possible, and will be used in a scientific paper focused on experiences of living with
The resulting themes are shown in Figure 3; the subthemes are tabulated in Supplementary Table 1. The themes and subthemes were derived from 81 codes obtained through coding of the Spotify transcript. Both ChatGPT and the human analyzer identified five unique themes, though their analyses had notable differences. For example, ChatGPT identified the "Diet and Nutrition Management" theme, a subject that the human analyst classified neither as a theme nor as a subtheme (Supplementary Table 1).
We conducted several rounds of analysis with ChatGPT-3.5 by resubmitting the same prompt, which led to interesting variations in the identified themes. For example, in the second analysis round, ChatGPT identified the "Pregnancy and Diabetes" theme (Supplementary Table 1), whereas pregnancy only emerged as a subtheme in ChatGPT's first analysis round. The human analyzer, in contrast, did not identify pregnancy as either a theme or a subtheme in their analysis.
It is impossible to state which of these themes more accurately reflects the essence of the interview, as there is no absolute truth in thematic analysis. Instead, we see the identification of different themes as an advantage. After all, a greater diversity in themes indicates that the codes were interpreted from different angles, thereby adding more layers to the overall analysis. In this framework, ChatGPT should be viewed as an additional team member when doing thematic analysis by offering fresh perspectives and proposing alternative interpretations of the identified codes.
Similar to the coding process, all themes and subthemes generated by ChatGPT should still be reviewed by human researchers to ensure that the themes and subthemes generated are aligned with the research question(s) and essential data have been appropriately captured by the themes.
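When themes from several analysis rounds (or from ChatGPT and a human analyzer) are placed side by side, a simple set comparison can show at a glance which themes are shared and which are unique to one round. The sketch below is a hypothetical illustration, not part of the analysis reported here: it matches themes by exact name after normalizing case and spacing, so themes that are worded differently but mean the same thing will still be listed as different and need human judgment.

```python
def normalize(theme: str) -> str:
    """Lowercase a theme name and collapse repeated whitespace."""
    return " ".join(theme.lower().split())


def compare_theme_sets(first: list[str], second: list[str]) -> dict:
    """Summarize the overlap between two lists of theme names.

    Returns the shared themes, the themes unique to each list, and the
    Jaccard index (size of the intersection divided by size of the union).
    """
    set_first = {normalize(t) for t in first}
    set_second = {normalize(t) for t in second}
    shared = set_first & set_second
    union = set_first | set_second
    return {
        "shared": sorted(shared),
        "only_first": sorted(set_first - set_second),
        "only_second": sorted(set_second - set_first),
        # Two empty lists are treated as identical.
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


# "Emotional Impact" is a made-up theme name used only for illustration.
round_one = ["Diet and Nutrition Management", "Emotional Impact"]
round_two = ["Pregnancy and Diabetes", "emotional  impact"]
summary = compare_theme_sets(round_one, round_two)
```

The themes listed under `only_first` and `only_second` are the ones worth discussing in the team, since they represent the alternative interpretations described above.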

Cleaning quotes
Another potential use of ChatGPT is to clean quotes from the interview transcript for manuscript preparation. We used the following request:
Clean the following transcript so it may be used as a direct quote for an academic paper: Omit all text that is not essential for the main message. Any altered or inserted words must be shown between square brackets, "[ ]", and omitted text must be replaced by three dots, "…". Include stop words, pauses, and words such as "I think..,", "uhh", and "I mean...", so it looks like an authentic transcript.
This request was made without any additional instructions, such as an example transcript or a list of themes or codes. A snapshot of the generated transcript is shown in Textbox 1. The resulting transcript matches the requested format and uses the stop words that were asked for, making it similar to an actual conversation between two people. However, some differences exist between the generated transcripts and those from real-life interviews. The ChatGPT-generated transcript has a very direct question-answer structure, with all answers being on-point and of similar length. In contrast, a real-life interview is often more organic. For example, the interviewee may not understand the question and give answers of various lengths, and the interviewer may go into more depth before moving to the next question.
Given these advantages, we envision several potential applications for the ChatGPT-generated transcripts. Firstly, these transcripts may be used as instruction material for students learning thematic analysis. A second approach worth investigating is to use the generated transcripts as a training set for NLP models, particularly in topic modeling. As real-life transcripts are often limited or hard to get, ChatGPT offers a practical way to expand the dataset, thereby exposing the model to a larger diversity of text.
In short, while ChatGPT-generated transcripts are not (yet) a perfect substitute for real ones, they offer a promising alternative for various academic and educational applications.
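Returning to the quote-cleaning request in this section, the rule that altered words appear in square brackets and omissions are marked by three dots can be checked mechanically. The caption of Figure 4 notes that some text was rephrased without proper use of square brackets, which is the kind of change this check is meant to surface. The sketch below is a hypothetical helper, not the authors' method: it removes bracketed insertions, splits the cleaned quote at each ellipsis, and reports any remaining fragment that cannot be found, in order, in the original passage. The example sentences are invented for illustration.

```python
import re


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())


def unmarked_changes(original: str, cleaned: str) -> list[str]:
    """Return fragments of the cleaned quote that do not occur verbatim in the original.

    Text inside [square brackets] is treated as a declared alteration and is
    skipped; "…" and "..." are treated as declared omissions.
    """
    source = _normalize(original)
    fragments = re.split(r"\[[^\]]*\]|…|\.\.\.", cleaned)
    problems = []
    position = 0  # fragments must appear in the same order as in the original
    for fragment in fragments:
        needle = _normalize(fragment).strip(" ,.;:")
        if not needle:
            continue
        found = source.find(needle, position)
        if found == -1:
            problems.append(needle)
        else:
            position = found + len(needle)
    return problems


# Invented example passage and two candidate cleaned quotes.
original_text = "Well, I mean, I was diagnosed about five years ago and, uh, it was a shock."
faithful = "I was diagnosed about five years ago … [and] it was a shock."
rephrased = "I was diagnosed five years ago … it was a shock."
```

An empty result does not prove the quote is faithful in meaning; it only indicates that every unbracketed fragment is a verbatim excerpt, so a human check of the selected passage is still needed.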

Challenges when using ChatGPT for thematic analysis
While ChatGPT is a promising assistive tool for thematic analysis, previous studies have identified challenges when working with ChatGPT (see Table 1) [13][14][15][16][17][18][19]. Major challenges relevant to thematic analysis include hallucination (i.e., responses produced by the system that are not justified by the data used), the output being prompt-dependent (e.g., prompts requesting the same output but phrased differently will lead to different outputs), and missing themes or codes previously reported by researchers [13]. Similarly, we encountered several challenges when utilizing ChatGPT to conduct thematic analysis. When working with patient data, the primary concern is data confidentiality. Inputs to ChatGPT may be used as training data to improve OpenAI's services, and network activities may be shared with third parties [23]. For this reason, uploading sensitive information, such as patient interview transcripts, to ChatGPT should be avoided. This precaution restricts the use of ChatGPT for coding (i.e., Phase 2 in Figure 1) unless the transcript holds no confidential information.
A more practical challenge is that ChatGPT limits the length of each prompt (for ChatGPT-3.5, the prompt and response together must fit within 4,096 tokens), which may prevent users from inputting full transcripts and very long lists of codes or quotes. One potential solution is to split the input. However, as discussed above, ChatGPT has a limited context window, so it may forget the earlier parts of the interview transcript and/or codes it previously generated. As a result, ChatGPT may not be able to adequately capture patterns of ideas or concepts within the context of the whole dataset. When coding, researchers consider existing knowledge (e.g., the research question or current information about the topic), knowledge obtained through data collection (e.g., interviews and field notes), and existing codes from previously analyzed transcripts. Without further information beyond the input, ChatGPT may adopt a narrower lens and generate results that are highly specific to a singular transcript. Accordingly, at this point in time, it is still essential for human researchers to collate codes generated for each transcript and review them within the context of the study.
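Splitting the input, as suggested above, can be done at speaker-turn boundaries so that no answer is cut in half. The sketch below is a rough illustration under stated assumptions: it estimates tokens from word counts using the ratio quoted earlier in this viewpoint (4,096 tokens to approximately 3,000 English words) rather than a real tokenizer, and the `reserved_tokens` value, which sets aside room for the instructions and the reply, is a hypothetical placeholder.

```python
import math

# Ratio taken from the text: 4,096 tokens correspond to roughly 3,000 English words.
WORDS_PER_TOKEN = 3000 / 4096


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a text from its word count."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


def split_transcript(turns: list[str], token_limit: int = 4096,
                     reserved_tokens: int = 1500) -> list[list[str]]:
    """Group consecutive speaker turns into chunks that fit the token budget.

    reserved_tokens is an assumed allowance for the instructions and the reply.
    A single turn longer than the budget is placed in a chunk of its own and
    would still need to be shortened by hand.
    """
    budget = token_limit - reserved_tokens
    if budget <= 0:
        raise ValueError("reserved_tokens must be smaller than token_limit")
    chunks, current, used = [], [], 0
    for turn in turns:
        cost = estimate_tokens(turn)
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(turn)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

Chunking addresses only the length limit. Because each chunk is coded without knowledge of the others, the codes from all chunks still have to be collated and reviewed by a human researcher, as described above.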
In the context of understanding text, ChatGPT, though advanced, may not capture every nuance that a human analyst would pick up [24]. In certain instances, ChatGPT may overlook underlying emotions or implicit themes that would otherwise be evident to human analyzers. It is thus important to review the output of ChatGPT to ensure that all essential aspects of the transcript are captured in its thematic analysis. Beyond that, we also found that ChatGPT sometimes excludes existing codes or introduces new codes when generating themes. Hence, it is advisable to double-check whether all codes have been correctly assigned to the identified themes and subthemes.
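The double-check described above can be partly automated by comparing the list of codes that was submitted with the codes that appear under the returned themes. The following sketch is a hypothetical helper (the function name, data layout, and example codes are assumptions, not taken from the study): it reports codes that were dropped, codes that were newly introduced, and codes assigned to more than one theme, the last being relevant because the request asks for themes with as little overlap as possible.

```python
from collections import Counter


def audit_code_assignment(original_codes: list[str],
                          themes: dict[str, list[str]]) -> dict[str, list[str]]:
    """Compare the submitted codes with the codes listed under each theme.

    themes maps a theme name to the codes assigned to it (assumed layout).
    """
    assigned = [code for codes in themes.values() for code in codes]
    counts = Counter(assigned)
    original = set(original_codes)
    return {
        # Codes that were submitted but appear under no theme.
        "missing": sorted(original - set(assigned)),
        # Codes that appear under a theme but were never submitted.
        "introduced": sorted(set(assigned) - original),
        # Codes listed under more than one theme (or twice under one theme).
        "duplicated": sorted(code for code, n in counts.items() if n > 1),
    }


# Invented codes and themes, used only to show the shape of the input.
example_codes = ["fear of needles", "checking blood sugar", "family support"]
example_themes = {
    "Daily management": ["checking blood sugar", "meal planning"],
    "Support": ["family support", "checking blood sugar"],
}
report = audit_code_assignment(example_codes, example_themes)
```

The comparison relies on exact string matches, so a code that ChatGPT reworded slightly will show up as both missing and introduced; such pairs are easy to spot and resolve by eye.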
Finally, ChatGPT may give different answers to the same questions, leading to nonreproducible results. Yet, in the context of thematic analysis, we do not see this variability as a drawback, as humans would also generate different results when doing thematic analysis. Instead, the different responses by ChatGPT may be seen as an opportunity because they may provide new insights that were not captured during the first round of thematic analysis.

Conclusions and Recommendations
ChatGPT has the potential to enrich the space of qualitative research. In our investigation, ChatGPT demonstrated its ability to code interview transcripts, generate themes from a list of codes, clean quotes for manuscript preparation, and generate unique transcripts for education and training purposes. Nevertheless, limitations such as the inability to manage multiple transcripts and not fully capturing nuanced data essential to the research question necessitate the involvement of human researchers to collate and review the output generated. At this stage, ChatGPT requires human-AI collaboration, where researchers have to remain in the loop to intervene when necessary [13,15]. We present Figure 5 to show the opportunities available for ChatGPT to assist in thematic analysis and areas where human involvement is still required.
Given the need for considerable interaction between ChatGPT and human researchers, it will be more valuable to recognize ChatGPT as an additional member of the analysis team, contributing to researcher triangulation by adding to knowledge building and sensemaking, rather than as a replacement for human researchers. However, with the ongoing progress in the field of natural language processing, the role of ChatGPT in qualitative research will evolve rapidly. This fast-paced development, in combination with the growing use of ChatGPT in research, necessitates further discussions regarding the use of ChatGPT in qualitative research. For example, how should ChatGPT's contribution be acknowledged, and what are the best practices regarding prompt formulation? Another important point of consideration is the confidentiality of the data, especially when working with patient data such as interview transcripts. The recent ChatGPT data breach in March 2023 should encourage researchers to remain mindful of the implications when working with AI tools that store data, regardless of purpose [25].
In summary, ChatGPT has the potential to function as a valuable tool during analysis, enhancing the efficiency of the thematic analysis and offering additional insights into the qualitative data. While the current viewpoint remains an exercise to investigate the potential feasibility of using ChatGPT for thematic analysis, findings from the investigation can serve as a starting point for future studies that intend to further push the boundaries of AI involvement within qualitative research.

Figure 2. Transcript coded by ChatGPT-3.5. A larger portion of the coded transcript is given in Supplementary Figure 1.

Figure 4. Direct quotes generated by ChatGPT-3.5 (left) and ChatGPT-4.0 (right). The yellow highlights indicate the text that has been included in the quote, whereas the red highlights indicate the text that has been rephrased by ChatGPT without proper use of square brackets.

Textbox 1. Snapshot of the transcript generated by ChatGPT-3.5. The full transcript is given in the Supplementary Information.
Despite these differences, the ChatGPT-generated transcripts have multiple advantages. By slightly adjusting the ChatGPT request, the generated transcripts may focus on any topic of interest or feature different respondent characteristics. This adaptability of ChatGPT allows for the quick generation of a large and diverse collection of transcripts. Furthermore, generating transcripts with ChatGPT eliminates privacy and confidentiality concerns, which are a common issue with actual interviews.

Figure 5. Opportunities for the utilization of ChatGPT during the thematic analysis process and where human intervention is needed.

Table 1. Summary of studies that utilized ChatGPT/GPT for different phase(s) of thematic analysis

Textbox 1 (content):
Patient 1: Sure, uhh, I was diagnosed about five years ago. It was, umm, I think it was in my early 40s. I have type 2 diabetes. Yeah, I mean, it was a bit of a shock, you know? I didn't really see it coming.
Patient 1: Oh, absolutely. Well, I remember feeling, you know, a mix of emotions. I mean, first there was this sense of disbelief, like, "Is this really happening to me?" And, uh, I guess there was some fear too, you know, about what it would mean for my future. I think I was also a bit confused about how it all worked, you know, managing blood sugar levels and all that.
Interviewer: 00:54 It's completely understandable to feel that way. Managing diabetes can be complex. How has your daily routine changed since your diagnosis?
Patient 1: 01:02 Oh, my daily routine has changed quite a bit, actually. I mean, now I have to, you know, check my blood sugar levels regularly, and, umm, I have to be more mindful of what I eat. So, I've had to make some adjustments to my diet, like cutting back on sugary stuff and, uhh, watching my portion sizes. And, uh, I've also started incorporating, you know, regular exercise into my routine. It's been a bit challenging to juggle everything, but I think I'm getting the hang of it. It's a bit frustrating, I mean, you know, I'm trying my best here.
Interviewer: It sounds like you've made some positive changes to manage your diabetes. Can you tell me about any specific challenges you've faced in your daily life as a result of living with diabetes?
Interviewer: I can see how those situations can be tough. How about the emotional aspect of living with diabetes? Have you experienced any emotional ups and downs related to your condition?
Patient 1: Oh, definitely. I mean, there have been times when I've felt a bit down about it. You know, I think it's natural to have those moments of, like, "Why me?" or feeling, you know, a bit overwhelmed. And then there are moments when I'm proud of myself, you know, when I've managed to keep my blood sugar levels in check or I've hit my exercise goals. It's a bit of a rollercoaster, I mean, emotionally.
Interviewer: Absolutely, it's important to acknowledge both the challenges and the victories. How has your support system been throughout this journey? Have you received support from family, friends, or healthcare professionals?
Patient 1: Yeah, I'm really grateful for the support I've received. Umm, my family has been, you know, quite understanding and they've tried to make, like, healthier meals when I'm around. And my friends, well, they've been pretty great too, you know, making sure there are options that I can eat if we go out. And my doctor, uh, they've been really helpful in explaining things to me and, you know, answering my questions. So, yeah, I think having a good support system makes a big difference.
Interviewer: It sounds like you have a strong network to lean on. Looking back, what advice would you give to someone who has recently been diagnosed with diabetes?
Patient 1: I would say, you know, take it one step at a time. It can be overwhelming at first, but you'll learn as you go. And, uh, don't be afraid to ask questions, whether it's from your doctor or, you know, support groups online. And, umm, also, I think it's important to be kind to yourself. There will be days when things don't go as planned, and that's okay. You're doing your best, and that's what matters.
Interviewer: Great advice. Finally, can you share any positive aspects that have come from your experience with diabetes? Have there been any unexpected silver linings?
Patient 1: Oh, yeah, definitely. I mean, I think, you know, having diabetes has made me more conscious of my overall health. I've, umm, I've learned a lot about nutrition and exercise that I might not have paid as much attention to before. And, you know, it's also brought me closer to some people in my life. Like, my friends and family, they've really shown their support and it's been, you know, heartwarming. So, yeah, I think there have been some positive aspects despite the challenges.
Interviewer: That's wonderful to hear. Thank you so much for sharing your experiences with us today. Your insights will certainly contribute to our understanding of living with diabetes.
Patient 1: You're welcome. I'm glad I could help. If my experiences can, you know, help someone else going through the same thing, then that's great. Uhh, thanks for having me.
Interviewer: Thank you again, and take care.