Recently, ProPublica revealed how it uses what it often calls “AI” in its investigative work. More precisely, ProPublica is deploying large language models (LLMs) to analyze data for its stories. This distinction between “AI” as a vague, catch-all term and the specific type of advanced pattern-finding model actually at play here matters immensely. In many contexts, people use “AI” to describe everything from machine learning classification algorithms to generative image models—and that often leads to confusion.
Although I’ve been deeply exploring large language models ever since the earliest publicly available tools were released—both as a user and on the development side—I only began articulating my stance in a public, structured way once I rebooted my blog in September 2024. By February 22, 2025, I had published a more comprehensive overview of how I view LLMs: as tools that can amplify meaningful analysis and streamline how we process, organize, and communicate complex information, provided they’re used responsibly and transparently. Below, I share my thoughts on ProPublica’s recent article and how it aligns with my perspective as a high-masking autistic thinker who relies on LLMs to refine communication and draw deeper insight from my personal troves of data.
1. Key takeaways from the ProPublica discussion
Responsibility and transparency
ProPublica acknowledges that using LLMs can be contentious. They’re clear on precisely how these tools fit into their process: summarizing or categorizing large data sets, never supplanting the core journalistic tasks of sourcing, interviewing, and fact-checking. As they put it, “Of course, members of our staff reviewed and confirmed every detail before we published our story … which remains a must-do even in the world of AI.” Their openness helps cut through the confusion that arises when “AI” is casually invoked without specificity.
Pattern recognition and time savings
ProPublica’s reporters used an LLM to scan thousands of grants flagged by Sen. Ted Cruz for supposedly containing “woke” themes. The system rapidly identified which terms triggered the “woke” label (e.g., “diversify” or “female”). While it sped up the review, they still manually verified everything. “The story was a great example of how artificial intelligence can help reporters analyze large volumes of data and try to identify patterns,” the ProPublica article notes.
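ProPublica has not published its exact pipeline, so the following is only a minimal sketch of the general pattern: loop over flagged grant descriptions, ask the model which terms likely triggered the flag, and keep a human in the loop for every result. The prompt wording, the model name, and the `grants` list are my own placeholders, not ProPublica's actual setup.

```python
# A minimal, hypothetical sketch of LLM-assisted batch review of grant descriptions.
# Prompt, model name, and data are placeholders -- not ProPublica's actual pipeline.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "You are helping review a federal grant description that was flagged as 'woke'.\n"
    "List the specific words or phrases in the text that most likely triggered the flag.\n"
    "If you are not sure, answer exactly 'unsure' -- do not guess.\n\n"
    "Grant description:\n{description}"
)

def flag_terms(description: str) -> str:
    """Ask the model which terms likely triggered the flag for one grant."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": PROMPT.format(description=description)}],
        temperature=0,  # keep output as repeatable as possible
    )
    return response.choices[0].message.content

# Hypothetical usage over a list of flagged grants.
grants = ["Funding to diversify the STEM pipeline for female undergraduates..."]
for grant in grants:
    print(flag_terms(grant))  # every output still needs human verification
```

The point of the sketch is the division of labor: the model does the fast first pass over thousands of records, while people decide what any of it means.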
Not a shortcut to truth
LLMs can generate convincing but inaccurate information (sometimes called “hallucinations”). ProPublica addressed this by structuring clear prompts and double-checking all outputs. By “making sure to tell the model not to guess if it wasn’t sure,” they mitigated the risk of the automated system inserting errors into the final story.
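Part of that double-checking can even be mechanical. One approach (my own illustration, not a description of ProPublica's workflow) is to reject any extracted term that does not literally appear in the source document before a human ever reviews it:

```python
# Illustrative only: a cheap guard that flags model output referencing terms
# that never appear in the source text, before human review even starts.
def terms_present_in_source(extracted_terms: list[str], source_text: str) -> list[str]:
    """Return only the extracted terms that literally occur in the source text."""
    lowered = source_text.lower()
    return [term for term in extracted_terms if term.lower() in lowered]

description = "Funding to diversify the STEM pipeline for female undergraduates."
model_output = ["diversify", "female", "equity"]  # pretend the model returned this

verified = terms_present_in_source(model_output, description)
print(verified)                            # ['diversify', 'female']
print(set(model_output) - set(verified))   # {'equity'}: unsupported, send to a human
```

A check like this never establishes truth on its own; it only narrows the pile of things a person still has to confirm.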
Broader investigative utility
Beyond the grants example, ProPublica has used LLM-based tools to sift through large volumes of audio/video from the 2022 school shooting in Uvalde, Texas, and to identify patterns among mental health professionals disciplined in Utah. In each case, the LLM aided human-driven investigations rather than replacing them. They emphasize, “We know full well that AI does not replicate the very time-intensive work we do … There’s a lot about AI that needs to be investigated, including the companies that market their products.”
2. Clarifying why definitions matter
People often treat “AI” as a sweeping term that implies human-level intellect or a monolithic technology. In reality:
- Machine Learning: Refers to algorithms that learn patterns from data, such as classification systems for images or text.
- Large Language Models (LLMs): A specific type of machine learning focused on understanding and generating text by detecting statistical patterns in enormous datasets of language.
- Other Specialized Models: Image generators, speech-to-text systems, recommendation engines, etc., all exist under the broader machine learning umbrella but solve very different problems.
When these distinctions aren’t made, the conversation becomes muddled. For instance, the public may conflate an LLM’s text analysis with doomsday visions of fully autonomous “artificial intelligence.” That confusion leads to poor decision-making and mistrust, eroding the legitimacy of people and organizations who use LLMs in genuinely beneficial ways.
3. My perspective as an autistic systematizer
I’ve often described LLMs as an “external cognitive exoskeleton” for me—amplifying strengths I already have as a pattern-oriented thinker. But there’s an equally important dimension: refining how I communicate. Many autistic individuals, myself included, struggle with turning elaborate or idiosyncratic thought processes into language that others can parse easily. LLMs provide a scaffolding that helps me:
- Shape complex or hyper-detailed ideas into a structured outline.
- Detect logical gaps or ambiguities that might derail the reader’s understanding.
- Polish overly technical or specialized language into something more widely accessible.
For those of us with communication challenges, LLMs can be a game-changer. They reduce the friction between having clear, internally consistent ideas and expressing them in a way that other people grasp immediately. If society reflexively dismissed these tools under a vague, negative label of “AI,” many individuals—particularly neurodiverse thinkers—would miss out on an indispensable resource.
4. Why ProPublica’s use of LLMs matters
ProPublica’s candor creates a mainstream reference point for how to integrate LLMs ethically. Many journalists and writers have been hesitant to admit they use advanced machine learning tools, fearing backlash about so-called “AI.” By being upfront, ProPublica demonstrates that with careful verification, transparency, and editorial oversight, LLMs can enhance journalism’s speed and depth while preserving its integrity.
From my vantage point, this is a validation of everything I’ve been saying and practicing. LLMs don’t make the final decisions—they are assistive tools. Much like aggregator software that helps journalists comb through digital archives, an LLM can amplify human skill but never replace it. Recognizing and naming it properly—rather than just stamping “AI” on everything—allows for more nuanced discussions about best practices and potential pitfalls.
5. The core principle: Tool neutrality, human responsibility
In a previous blog post, I discussed the neutrality of structured thinking. A technology’s moral weight hinges on the people who wield it. An advanced machine learning model can reveal groundbreaking insights or perpetuate bias, depending on how it’s trained, deployed, and overseen.
To responsibly use an LLM, it’s essential to:
- Acknowledge Limitations: The tool may produce inaccuracies without warning.
- Maintain Transparency: Disclose where and how it’s used.
- Prioritize Data Privacy: Ensure sensitive data is handled securely.
- Retain Human Control: Final editorial decisions belong to people—not algorithms.
6. My ongoing strategy
I’m continually refining how I deploy LLMs. One of my goals is to self-host a model for full data control, especially when working with personal or sensitive content. For now, I use available cloud-based services selectively and apply strong oversight.
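I have not settled on a self-hosting stack yet, but one common pattern is to run a local model behind an OpenAI-compatible endpoint (Ollama, llama.cpp's server, and vLLM all offer one), so the same client code works and nothing leaves the machine. A rough sketch, assuming an Ollama instance on its default port with a llama3 model already pulled:

```python
# Sketch of pointing standard client code at a locally hosted model.
# Assumes an Ollama server on localhost:11434 with the 'llama3' model pulled;
# the API key is a dummy value because the local server does not check it.
from openai import OpenAI

local_client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="not-needed-locally",
)

response = local_client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Summarize this note in two sentences: ..."}],
)
print(response.choices[0].message.content)
```

For personal or sensitive material, the appeal is obvious: the data stays on hardware I control, at the cost of weaker models and more maintenance.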
Why I’m optimistic
- Greater Acceptance: With respected institutions like ProPublica leading by example, we can move beyond all-or-nothing attitudes into constructive dialogue about LLM integration.
- Better Collaboration: As more organizations share their methods openly, shared guidelines can maximize benefits while minimizing ethical risks.
- Improved Accessibility: For cognitively diverse communities (including many autistic individuals), LLMs can reduce communication barriers, unlocking potential that benefits everyone.
Conclusion
ProPublica’s decision to clarify how it harnesses large language models for data analysis exemplifies the nuanced stance I’ve been advocating: LLMs are powerful pattern-finding systems whose impact—positive or negative—depends on the people designing, prompting, and verifying them. Their team’s open acknowledgment “that there’s a lot about AI that needs to be investigated” stands in contrast to the sweeping takes that either demonize or lionize “AI” indiscriminately.
For my part, I’ll continue documenting precisely how I use LLMs—both to leverage my autistic pattern-recognition abilities and to bridge communication gaps. My hope is that the journalism, writing, and research sectors will see that transparency, nuance, and correct definitions lead to more informed and responsible adoption of these technologies than any single-word label (“AI”) could ever accomplish.
Transparency Note: This post was structured and edited with the assistance of a Large Language Model (LLM). However, every idea, argument, and insight originates from my own thinking. The LLM is used solely to refine communication—never to generate artistic or literary works. (For more, see my Transparency Policy.)