Systematic Reviews and Related Evidence Syntheses

Responsible Use & Potential Tools

Artificial Intelligence (AI) in Evidence Synthesis

This guide is based on ongoing testing of multiple AI and automation tools by experienced methodologists. AI can help increase efficiency and automate certain tasks in evidence synthesis — but it cannot replace human judgment or oversight. These findings align with the recent Artificial Intelligence (AI) Methods in Evidence Synthesis webinar series by Cochrane. We share this resource to support responsible exploration of AI tools and workflows. Use with care, transparency, and critical thinking.

Additional recommendations for responsible use:

  • Use AI tools to support, not replace, critical judgment and domain expertise.
  • Ensure outputs are transparent, reproducible, and aligned with evidence synthesis standards (available below in Explore More).
  • Validate AI-generated content before including it in your review.
  • Follow publisher and journal guidelines regarding AI use and disclosure.
  • Respect institutional data privacy and security requirements.
  • Clearly document how and where AI was used in your workflows.

Workflow Stages & Potential Tools

Plan

  • Frame research questions: Microsoft Copilot, Google Gemini
  • Summarize the literature: Google NotebookLM
  • Project/meeting notes: Microsoft Copilot

Identify

  • Generate search terms: Gemini, PubReMiner, Yale MeSH Analyzer
  • Citation searching: CitationChaser, Research Rabbit
  • Screening/deduplication: Covidence (uses active learning for study ranking and auto-deduplicates; see the sketch below)

Extract & Evaluate

  • Auto-extract from PDFs: ChatPDF
  • RCT evaluation: RobotReviewer
  • Bias-assessment visualization: robvis

Combine, Summarize & Share

  • Meta-analysis: Meta-mar
  • Writing assistants: Grammarly
  • Multi-function (search, summarize, report): Elicit, Consensus, Perplexity
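
The Covidence entry above notes that screening can be prioritized with active learning. As a rough illustration of the idea (not Covidence's actual implementation), the Python sketch below fits a simple text classifier on records already screened by human reviewers and ranks the remaining records so that likely includes are read first; the titles, labels, and model choice are all hypothetical.

    # A rough sketch of active-learning-style screening prioritization (hypothetical data).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    # Records already screened by human reviewers: 1 = include, 0 = exclude.
    screened_titles = [
        "Mindfulness-based stress reduction for nurses: a randomized trial",
        "Architectural history of hospital buildings",
    ]
    screened_labels = [1, 0]

    # Records not yet screened.
    unscreened_titles = [
        "Brief mindfulness training and burnout in physicians",
        "Hospital parking policy: an administrative review",
    ]

    vectorizer = TfidfVectorizer()
    model = LogisticRegression()
    model.fit(vectorizer.fit_transform(screened_titles), screened_labels)

    # Rank unscreened records by predicted probability of inclusion, so reviewers
    # see the most likely includes first; retrain as new decisions arrive.
    scores = model.predict_proba(vectorizer.transform(unscreened_titles))[:, 1]
    for score, title in sorted(zip(scores, unscreened_titles), reverse=True):
        print(f"{score:.2f}  {title}")

Human reviewers still make every include/exclude decision; the ranking only changes the order in which records are shown.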

Potential Issues with AI Tools

Large Language Models (LLMs) for Designing Searches

LLMs can help find synonyms, related organizations, conference names, and terms in other languages. However:

  • Results vary between prompts and platforms, reducing reproducibility. 
  • Risk of hallucinations (plausible but incorrect info) and factual errors. 
  • Subscription-based database content is often inaccessible, so key studies may be missed. 
  • Always verify outputs, and if in doubt, consult your librarian.

Multi-Function Products 

Some tools (e.g., Elicit) claim to handle the full review process. In practice, results can vary widely—even with identical prompts—and important studies may be missed (Bernard et al., 2025).

  • Tools like Consensus, Elicit, and Perplexity pull mainly from open-access sources (e.g., Semantic Scholar, the web).
  • Content from proprietary databases (e.g., Ovid, Web of Science) is often absent.

Example Prompts and Use Cases

Using LLMs to Frame a Research Question

LLMs can help researchers translate a broad interest into a structured research question using common frameworks like PICO (Population, Intervention, Comparator, Outcome), PEO (Population, Exposure, Outcome), or PCC (Population, Concept, Context).

Use Case: 

A public health researcher wants to study the impact of urban green spaces on mental health. They prompt an LLM with:

“Please help me frame a systematic review question on how green space exposure influences mental health outcomes, using PICO.”

The model suggests: 

  • Population: Adults living in urban environments.
  • Intervention/Exposure: Access to or time spent in green spaces.
  • Comparator: Adults with limited or no access to green spaces.
  • Outcome: Mental health outcomes such as depression, anxiety, or well-being.

This gives the researcher a structured starting point, but it still needs adaptation: refining the population, clarifying exposures, prioritizing outcomes, and aligning the question with the project scope.

Tip:

The prompting process is iterative. If the LLM’s response is vague or off-track, add more context (e.g., specify age groups or study designs) or rephrase the request.

Key point:

AI can accelerate idea generation, but the final framing requires human expertise for accuracy and reproducibility.
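
For teams that want the prompt and response captured verbatim (which supports the documentation recommendation above), the same request can be sent through an API rather than a chat window. The sketch below is a minimal example assuming the official OpenAI Python client and an API key in the environment; the model name is an assumption, and any chat-based LLM your institution licenses could be substituted. The output still needs the same human review described above.

    # A minimal sketch, assuming the OpenAI Python client and an OPENAI_API_KEY
    # in the environment; the model name is an assumption.
    from openai import OpenAI

    client = OpenAI()

    prompt = (
        "Please help me frame a systematic review question on how green space "
        "exposure influences mental health outcomes, using PICO. Focus on adults "
        "in urban environments and observational study designs."
    )

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed; use whatever model your institution provides
        messages=[{"role": "user", "content": prompt}],
    )

    # Save the prompt and response with your review documentation,
    # and adapt the suggested PICO elements before using them.
    print(response.choices[0].message.content)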

Use AI to Create a Table of Related Reviews

Objective: Use Google NotebookLM (or a GPT-based tool) to generate a comparative table summarizing key aspects of review articles.

Step-by-Step Instructions: 

1. Log in to Google NotebookLM using your institutional NetID and password.

2. Add content by uploading article PDFs.

3. Enter this prompt:

Create a table with the following columns, and one row per article:
- First author's last name and publication year
- Type of review
- Eligibility criteria
- Databases searched
- Years covered by the search

4. Review the AI-generated table for accuracy and missing data. Revise the prompt if needed to improve clarity or add context.

5. Click "Save to Note" of you want to keep the output in your Notebook.  

Use Gemini or Copilot to Generate Search Terms for a Research Question

Objective: Use a conversational AI (Gemini or Copilot Chat) to brainstorm search terms for a database search.

Step-by-Step Instructions:

1. Open your preferred AI tool (Gemini, Copilot, ChatGPT, etc.).

2. Enter your prompt. Here is an example:

I'm conducting a systematic review on the effectiveness of mindfulness interventions for reducing stress in healthcare workers. Please generate a list of relevant keywords I could use in a library database search. Organize them by concept (e.g., population, intervention, outcome).

3. Review the output. Look for:

  • Suggested synonyms and variant phrases.
  • Any incorrect, vague, or overly broad suggestions.
  • Any suggested MeSH terms, which should always be verified.

4. Refine or follow up with additional prompts like:

  • Include British and American spelling variations.
  • Turn this into a sample Boolean search string for PubMed.
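
For illustration only, a follow-up along those lines might produce something like the query below. It is a rough draft, not a validated strategy; field tags, MeSH terms, and synonyms should be checked with a librarian. The sketch also shows one way to check how many PubMed records a draft string retrieves, using NCBI's public E-utilities esearch endpoint from Python.

    # Illustrative draft only; verify tags, MeSH terms, and synonyms with a librarian.
    import requests

    query = (
        '(mindfulness[tiab] OR "mindfulness-based"[tiab] OR meditation[tiab]) '
        'AND (stress[tiab] OR burnout[tiab] OR "occupational stress"[tiab]) '
        'AND ("Health Personnel"[MeSH Terms] OR "healthcare workers"[tiab] '
        'OR nurses[tiab] OR physicians[tiab])'
    )

    # Ask PubMed's esearch endpoint how many records the draft string retrieves.
    resp = requests.get(
        "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
        params={"db": "pubmed", "term": query, "retmode": "json"},
        timeout=30,
    )
    print(resp.json()["esearchresult"]["count"])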

Use Yale MeSH Analyzer to Identify Controlled Vocabulary

Objective: Analyze MeSH terms across multiple relevant articles using the Yale MeSH Analyzer (an automation tool).

Step-by-Step Instructions:

1. Go to Yale MeSH Analyzer.

2. Enter your PubMed IDs into the input box. Here are sample IDs to experiment with:

35012345, 34789765, 34011229, 33567890

3. Click “Go.”

4. Examine the resulting table, which includes article titles, MeSH terms, and more.
    Note: To display abstracts, select that option on the search page.

5. Reflect:

  • Which MeSH terms are consistently used across your articles?
  • Are there any that surprise you or reveal new angles (e.g., population focus, intervention type)?

A similar activity can be done using PubReMiner, which generates frequency tables for keywords and MeSH terms.
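
If you prefer a scriptable cross-check of what the Yale MeSH Analyzer or PubReMiner display, the MeSH headings for a set of PMIDs can also be pulled directly from PubMed with NCBI's public E-utilities efetch endpoint. The Python sketch below reuses the sample IDs from step 2; it assumes the requests library is installed and that the IDs resolve to indexed records.

    # Pull MeSH headings for a few PMIDs straight from PubMed (E-utilities efetch).
    import requests
    import xml.etree.ElementTree as ET

    pmids = ["35012345", "34789765", "34011229", "33567890"]  # sample IDs from step 2

    resp = requests.get(
        "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi",
        params={"db": "pubmed", "id": ",".join(pmids), "retmode": "xml"},
        timeout=30,
    )
    root = ET.fromstring(resp.content)

    for article in root.findall(".//PubmedArticle"):
        pmid = article.findtext(".//PMID")
        title = article.findtext(".//ArticleTitle")
        mesh = [d.text for d in article.findall(".//MeshHeading/DescriptorName")]
        print(pmid, title)
        print("  MeSH:", "; ".join(mesh))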


Explore More