In The AI Toolkit for Librarians, I walk through four runs of the same deep-research prompt using ChatGPT 4o, Grok 3, Gemini 2.5 Pro, and Perplexity Pro. This page adds what the book cannot: the reports themselves, a quick comparison view, and ready-to-use materials for teaching and professional development (PD).

Open the Reports

Use the links below to view each full report and its trace or notes.

At-a-Glance Highlights

Rather than repeating numbers from the book, here are the standout strengths from each run.

  • ChatGPT 4o: Most comprehensive narrative and depth

  • Grok 3: Fastest turnaround and to-the-point summary

  • Gemini 2.5 Pro: Clean structure with helpful table formatting

  • Perplexity Pro: Strongest citation practice and source diversity, including video sources

How to Reproduce This Study

  1. Copy the original prompt from the book.

  2. Add two constraints before you run it: region and timeframe.

  3. Run the prompt in one tool, then another, with no further guidance.

  4. Export or copy the sources from each run so you can compare them.

  5. Record three notes: what it covered well, what it missed, and how usable the citations are.
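
If you export the source URLs from each run (step 4), a short script can show where two tools overlap. The following is only a minimal sketch in Python: the function names are hypothetical, the URLs are placeholders rather than sources from my reports, and comparing by host name is an assumption about what counts as "the same source."

```python
from urllib.parse import urlparse


def domains(urls):
    """Return the set of host names found in a list of source URLs."""
    hosts = set()
    for url in urls:
        host = urlparse(url).netloc.lower()
        # Treat www.example.org and example.org as the same source.
        if host.startswith("www."):
            host = host[4:]
        if host:
            hosts.add(host)
    return hosts


def compare_sources(run_a, run_b):
    """Group host names into shared, only in run A, and only in run B."""
    a, b = domains(run_a), domains(run_b)
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


# Placeholder URLs for illustration; paste in your own exported lists.
tool_a = ["https://www.example.org/report", "https://example.edu/study"]
tool_b = ["https://example.org/other-page", "https://example.com/news"]

result = compare_sources(tool_a, tool_b)
for group, hosts in result.items():
    print(group, hosts)
```

Host-level matching is deliberately coarse. It tells you whether two tools drew on the same publishers, not whether they cited the same pages.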

Evaluation Rubric

Use this quick rubric to compare any two reports. Score each item from 1 to 5.

  • Relevance and focus on information literacy in K-12, academic, public, and special libraries

  • Citation quality and diversity of reputable sources

  • Recency of evidence and stated timeframes

  • Transparency of methods, including any trace or notes

  • Practicality of strategies and examples for librarians

  • Balance across sectors and viewpoints

  • Clarity of structure and readability
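
If you want to tally the rubric instead of scoring on paper, something like the sketch below may help. It assumes the seven items above, shortened to labels of my choosing, and the 1 to 5 scale. The function names are hypothetical and the sample scores are invented for illustration; they are not my ratings of any report.

```python
# Short labels for the seven rubric items, in the order listed above.
RUBRIC = [
    "Relevance and focus",
    "Citation quality and diversity",
    "Recency of evidence",
    "Transparency of methods",
    "Practicality of strategies",
    "Balance across sectors",
    "Clarity of structure",
]


def total_score(scores):
    """Check that every item has a 1-5 score, then return the sum."""
    for item in RUBRIC:
        if item not in scores:
            raise ValueError(f"Missing score for: {item}")
        if not 1 <= scores[item] <= 5:
            raise ValueError(f"Score for {item} must be from 1 to 5")
    return sum(scores[item] for item in RUBRIC)


def compare_reports(scores_a, scores_b):
    """Return the per-item difference (report A minus report B)."""
    return {item: scores_a[item] - scores_b[item] for item in RUBRIC}


# Invented sample scores, for illustration only.
report_a = dict(zip(RUBRIC, [5, 3, 4, 2, 4, 3, 4]))
report_b = dict(zip(RUBRIC, [4, 5, 4, 4, 3, 4, 4]))

print("Report A total:", total_score(report_a), "out of", 5 * len(RUBRIC))
print("Report B total:", total_score(report_b), "out of", 5 * len(RUBRIC))
for item, diff in compare_reports(report_a, report_b).items():
    print(f"{item}: {diff:+d}")
```

The per-item differences are usually more useful for discussion than the totals, since two reports can land on similar totals for very different reasons.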

Classroom and PD Use

  • 10-Minute Pair Activity: Partners skim two reports, star three useful ideas, circle one weak spot, and share one “next action” for their library.

  • Source Check Exercise: Pick five citations from any report and label each as primary source, policy, scholarly, news, or vendor. Discuss what is missing.
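
For the Source Check Exercise, a group can tally its labels quickly to see which source types never appear. This is a minimal sketch using the five labels named above. The function names are hypothetical, and the sample labels are made up rather than taken from any report.

```python
from collections import Counter

# The five labels from the Source Check Exercise.
CATEGORIES = ["primary source", "policy", "scholarly", "news", "vendor"]


def tally_labels(labels):
    """Count how many citations received each label."""
    unknown = [label for label in labels if label not in CATEGORIES]
    if unknown:
        raise ValueError(f"Unrecognized labels: {unknown}")
    counts = Counter(labels)
    # Counter returns 0 for labels that never appear.
    return {category: counts[category] for category in CATEGORIES}


def missing_categories(labels):
    """List the source types that no citation was labeled with."""
    tally = tally_labels(labels)
    return [category for category in CATEGORIES if tally[category] == 0]


# Made-up labels for five citations, for illustration only.
sample = ["news", "scholarly", "news", "vendor", "scholarly"]

print(tally_labels(sample))
print("Missing:", missing_categories(sample))
```

The list of missing categories is a natural starting point for the "what is missing" discussion.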

Version Notes and Caveats

Tool behavior changes over time and by plan level. Results shown here reflect the versions and settings used during my tests. Always rerun with your own constraints and document what changed.