
Building a Continuously Updating News Intelligence Pipeline

Author: Currents

Currents Team 06 Apr 2026 5 min read

We built a pipeline that converts continuous news ingestion into persistent dossiers, relationship tracking, and usable working memory for AI workflows.

Tags: ai agents, news intelligence, knowledge base, workflows

News delivery is often treated as the end state of a news system. Articles are collected, ranked, and returned, and everything after that is left to the user or to a separate application. This is sufficient for simple reading. However, it is less useful for AI workflows that need to retain context, compare developments over time, and reason repeatedly about the same entities or events.

Language models can already summarize documents and extract entities or relationships from text. What remains poorly specified is how a raw news stream becomes durable working memory. A feed can show what is new, but it usually cannot show what changed around an entity, which relationships remain active, or what context should persist.

Here, we built a continuously updating news intelligence pipeline on top of Currents to test whether ongoing ingestion, structured extraction, and persistent maintenance could produce a more useful state representation than a stream of articles alone. At the time of writing, the system is tracking 547 entity dossiers.

The problem

Raw articles are necessary, but they are not sufficient for persistent reasoning.

The same event often appears across multiple outlets with different framing. Entity naming varies across sources. Important developments arrive incrementally. In addition, the relationships that matter most are usually embedded in text rather than exposed as clean metadata.

As a result, a retrieval-first workflow tends to answer one question well:

What was published today?

Long-running AI systems usually need a different question:

What changed, for whom, and in relation to what?

That difference is operationally important. Without maintained state, the same context must be reconstructed repeatedly from raw articles.

Pipeline design

We organized the workflow into six stages:

Ingest → Extract → Compile → Relate → Index → Maintain
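The six stages can be sketched as an ordered chain of functions that each take and return a shared state. This is a minimal illustration only: the `State` dictionary, the `Stage` signature, and the `completed` bookkeeping key are assumptions made for the example, not the structure of the system described here.

```python
from typing import Callable

State = dict                      # hypothetical shared state passed between stages
Stage = Callable[[State], State]  # each stage reads the state and returns it updated

STAGE_ORDER = ("ingest", "extract", "compile", "relate", "index", "maintain")


def run_pipeline(stages: dict[str, Stage], state: State) -> State:
    """Run every stage once, in the fixed order above."""
    for name in STAGE_ORDER:
        state = stages[name](state)  # a missing stage raises KeyError
        state.setdefault("completed", []).append(name)
    return state
```

Keeping the order in one tuple makes it harder for a scheduler to run, say, indexing before compilation.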

Ingest

We pull fresh articles from Currents on a recurring schedule. This provides timely external input in a stable format.
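A recurring ingest pass mostly needs to avoid processing the same article twice. The sketch below assumes each article carries a stable `id` field. The `fetch` callable stands in for whatever client talks to Currents; it, the `Article` shape, and the in-memory `seen_ids` set are hypothetical names for illustration, not part of the Currents API.

```python
from typing import Callable, Iterable

Article = dict  # hypothetical shape: {"id": str, "title": str, ...}


def ingest_once(fetch: Callable[[], Iterable[Article]], seen_ids: set[str]) -> list[Article]:
    """Run one ingest pass and return only articles not seen on earlier passes."""
    fresh = []
    for article in fetch():
        article_id = article["id"]
        if article_id in seen_ids:
            continue  # already ingested on an earlier pass
        seen_ids.add(article_id)
        fresh.append(article)
    return fresh
```

A scheduler would call `ingest_once` on each tick and hand the returned list to extraction. In a real deployment `seen_ids` would need to live in durable storage.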

Extract

Each article is passed through a language model that identifies entities, events, relationships, and context. We do not force a rigid schema too early. Instead, the extraction layer is used to surface recurring structure before full normalization.
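One way to keep the schema loose is to ask for four top-level lists and accept whatever shape the items take. The sketch below assumes a `complete` callable that wraps some language model and returns its raw text reply. That callable, the prompt wording, and the key names are illustrative assumptions, not the prompts used in this pipeline.

```python
import json
from typing import Callable

EXTRACTION_KEYS = ("entities", "events", "relationships", "context")

PROMPT_TEMPLATE = (
    "Read the article below and return a JSON object with the keys "
    "entities, events, relationships, and context. "
    "Each value should be a list; items may be free-form strings or objects.\n\n"
    "ARTICLE:\n{text}"
)


def extract(text: str, complete: Callable[[str], str]) -> dict[str, list]:
    """Ask the model for loosely structured output and tolerate partial replies."""
    raw = complete(PROMPT_TEMPLATE.format(text=text))
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = {}  # a malformed reply yields an empty extraction, not a crash
    if not isinstance(parsed, dict):
        parsed = {}
    result = {}
    for key in EXTRACTION_KEYS:
        value = parsed.get(key, [])
        # Wrap stray scalars so downstream code can always iterate.
        result[key] = value if isinstance(value, list) else [value]
    return result
```

Only the top-level keys are enforced. Item-level normalization is deferred to later stages.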

Compile

Each entity is assigned a persistent dossier. This is the key design choice. Rather than treating each mention as disposable output, the system merges new evidence into an existing record whenever possible.

This changes the unit of memory from article to entity state.
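The merge step can be sketched as follows. The `Dossier` fields and the `merge_evidence` helper are hypothetical, and the overwrite rule for attributes is one simple policy among several. The aim is only to show new evidence being folded into an existing record instead of replacing it.

```python
from dataclasses import dataclass, field


@dataclass
class Dossier:
    name: str
    attributes: dict[str, str] = field(default_factory=dict)
    activity: list[str] = field(default_factory=list)
    sources: set[str] = field(default_factory=set)


def merge_evidence(dossiers: dict[str, Dossier], name: str,
                   attributes: dict[str, str], activity: list[str],
                   source_id: str) -> Dossier:
    """Fold one article's evidence into the entity's existing dossier."""
    dossier = dossiers.setdefault(name, Dossier(name=name))
    if source_id in dossier.sources:
        return dossier  # this article was already merged
    dossier.attributes.update(attributes)  # newer evidence overwrites older values
    for item in activity:
        if item not in dossier.activity:
            dossier.activity.append(item)
    dossier.sources.add(source_id)
    return dossier
```

Recording `sources` makes the merge idempotent, so re-processing an article does not duplicate activity.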

Relate

The system tracks relationships between entities as well as the entities themselves. This allows it to characterize dynamic interaction rather than isolated mention frequency.
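A minimal sketch of relationship tracking, assuming relationships are undirected and that strength can be approximated by co-occurrence count. The `RelationshipGraph` class and its methods are illustrative names. A real system would likely also store a relationship type.

```python
from collections import defaultdict
from datetime import date


class RelationshipGraph:
    """Tracks how often, and how recently, two entities were linked."""

    def __init__(self) -> None:
        self.counts: dict[tuple[str, str], int] = defaultdict(int)
        self.last_seen: dict[tuple[str, str], date] = {}

    @staticmethod
    def _key(a: str, b: str) -> tuple[str, str]:
        return (a, b) if a <= b else (b, a)  # undirected: argument order does not matter

    def observe(self, a: str, b: str, seen_on: date) -> None:
        key = self._key(a, b)
        self.counts[key] += 1
        previous = self.last_seen.get(key)
        if previous is None or seen_on > previous:
            self.last_seen[key] = seen_on

    def neighbors(self, entity: str) -> list[tuple[str, int]]:
        """Connected entities, strongest first, ties broken by name."""
        found = []
        for (a, b), count in self.counts.items():
            if entity == a:
                found.append((b, count))
            elif entity == b:
                found.append((a, count))
        return sorted(found, key=lambda pair: (-pair[1], pair[0]))
```

Storing `last_seen` alongside the count gives a later cleanup pass enough information to tell an active relationship from a stale one.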

Index

Compiled dossiers and relationships are written into a queryable index. This makes persistent context available to downstream workflows.
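As an illustration of a queryable index, the sketch below writes dossiers into an in-memory SQLite table using Python's standard `sqlite3` module. The table layout and the helper names `build_index`, `lookup`, and `names_by_type` are assumptions for the example. The storage backend used by the pipeline is not specified here.

```python
import json
import sqlite3
from typing import Optional


def build_index(dossiers: dict[str, dict]) -> sqlite3.Connection:
    """Write dossiers into an in-memory SQLite table keyed by entity name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dossiers (name TEXT PRIMARY KEY, type TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO dossiers (name, type, body) VALUES (?, ?, ?)",
        [(name, d.get("type", ""), json.dumps(d)) for name, d in dossiers.items()],
    )
    conn.commit()
    return conn


def lookup(conn: sqlite3.Connection, name: str) -> Optional[dict]:
    """Return the full dossier for one entity, or None if it is not indexed."""
    row = conn.execute("SELECT body FROM dossiers WHERE name = ?", (name,)).fetchone()
    return json.loads(row[0]) if row else None


def names_by_type(conn: sqlite3.Connection, entity_type: str) -> list[str]:
    """List entity names of a given type in alphabetical order."""
    rows = conn.execute(
        "SELECT name FROM dossiers WHERE type = ? ORDER BY name", (entity_type,)
    )
    return [row[0] for row in rows]
```

A downstream workflow can then fetch context by entity name without touching the raw articles.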

Maintain

A recurring maintenance pass handles pruning, deduplication, reconciliation, and health checks. This is less visible than extraction, but likely just as important. Without maintenance, duplicate entities accumulate, weak relationships persist, and stale records make queries less selective.
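Two of these maintenance operations can be sketched briefly. The normalization rule, the thresholds, and the function names below are all assumptions chosen for illustration. Real entity resolution usually needs more than string cleanup.

```python
from datetime import date, timedelta


def normalize(name: str) -> str:
    """Crude alias key: lowercase, drop periods, collapse whitespace."""
    return " ".join(name.lower().replace(".", "").split())


def merge_duplicates(dossiers: dict[str, list[str]]) -> dict[str, list[str]]:
    """Merge dossiers (name -> activity list) whose names normalize to the same key."""
    merged: dict[str, list[str]] = {}
    canonical: dict[str, str] = {}
    for name, activity in dossiers.items():
        target = canonical.setdefault(normalize(name), name)  # first spelling seen wins
        bucket = merged.setdefault(target, [])
        for item in activity:
            if item not in bucket:
                bucket.append(item)
    return merged


def prune_relationships(edges: dict[tuple[str, str], tuple[int, date]],
                        today: date, min_count: int = 2,
                        max_age_days: int = 90) -> dict[tuple[str, str], tuple[int, date]]:
    """Drop edges that are both weak (few observations) and stale (not seen recently)."""
    cutoff = today - timedelta(days=max_age_days)
    return {
        pair: (count, last)
        for pair, (count, last) in edges.items()
        if count >= min_count or last >= cutoff
    }
```

Pruning only edges that are both weak and stale is a conservative choice: a single recent mention survives until it either recurs or ages out.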

Why dossiers matter

The main conceptual shift was moving from snapshots to memory.

A traditional article-driven workflow can identify what was said recently about a topic. A dossier-based workflow can support a more useful class of questions:

What changed around this entity over the last 30 days?
Which relationships are now more prominent?
Which themes are stable, intensifying, or deteriorating?
What context should an agent retain when this topic reappears?
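The first two questions reduce to window queries over dated records. The sketch below assumes each dossier keeps dated activity entries and dated mentions of related entities. The function names and the 30-day default are illustrative, not a description of the production queries.

```python
from datetime import date, timedelta


def changes_since(events: list[tuple[date, str]], today: date, days: int = 30) -> list[str]:
    """Activity recorded in the last `days` days, oldest first."""
    cutoff = today - timedelta(days=days)
    recent = [(seen_on, text) for seen_on, text in events if cutoff < seen_on <= today]
    return [text for _, text in sorted(recent)]


def prominence_shift(mentions: list[tuple[date, str]], today: date,
                     days: int = 30) -> dict[str, int]:
    """Per related entity: mentions in the current window minus the previous window."""
    current_start = today - timedelta(days=days)
    previous_start = today - timedelta(days=2 * days)
    shift: dict[str, int] = {}
    for seen_on, other in mentions:
        if current_start < seen_on <= today:
            shift[other] = shift.get(other, 0) + 1
        elif previous_start < seen_on <= current_start:
            shift[other] = shift.get(other, 0) - 1
    return shift
```

A positive shift marks a relationship that became more prominent in the current window, and a negative one marks a relationship that faded.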

These questions are more consistent with how long-running AI systems operate. Such systems do not simply need recent input. They need updateable state.

A simplified example

A simplified dossier might contain:

Federal Reserve

Type: Central Bank
Jurisdiction: United States
Key figure: Jerome Powell

Recent activity

held rates steady in the latest meeting
continued balance sheet reduction
signaled that future cuts remain data-dependent

Connected entities

Jerome Powell
US Treasury
major equity indices
gold markets
other central banks

The exact representation can vary. However, the functional role is the same: the dossier is a cumulative record rather than a one-time summary.
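One possible representation of the example above is a small typed record that round-trips through JSON. The `EntityDossier` class and its field names are assumptions made for illustration. They mirror the example, not a published schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class EntityDossier:
    name: str
    type: str
    jurisdiction: str
    key_figure: str
    recent_activity: list[str] = field(default_factory=list)
    connected_entities: list[str] = field(default_factory=list)


fed = EntityDossier(
    name="Federal Reserve",
    type="Central Bank",
    jurisdiction="United States",
    key_figure="Jerome Powell",
    recent_activity=[
        "held rates steady in the latest meeting",
        "continued balance sheet reduction",
        "signaled that future cuts remain data-dependent",
    ],
    connected_entities=[
        "Jerome Powell",
        "US Treasury",
        "major equity indices",
        "gold markets",
        "other central banks",
    ],
)

# Round-trip through JSON so the record can be stored and reloaded unchanged.
serialized = json.dumps(asdict(fed), indent=2)
restored = EntityDossier(**json.loads(serialized))
```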

What changed after reading Karpathy

A later improvement followed from Andrej Karpathy’s note on using language models to build and maintain knowledge bases.

The useful insight was architectural. Raw material can remain in one layer, while the model incrementally compiles that material into a more structured knowledge layer.

We applied that logic in Hermes by formalizing a personal-wiki skill with three components:

links/ for raw source material
notes/ for observations and working notes
wiki/ for compiled, model-maintained knowledge

This separation reduced the tendency to treat every article or research session as disposable context. Useful outputs could instead be folded back into a longer-lived structure. We also maintained a master INDEX.md and periodic health checks for stale pages, contradictions, missing compilations, and broken cross-links.
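The layout and one of the health checks can be sketched with the standard library. The three directory names and INDEX.md come from the description above. The `[[Page]]` cross-link syntax, the one-file-per-page convention, and the function names are assumptions for the example.

```python
import re
from pathlib import Path

LAYERS = ("links", "notes", "wiki")
WIKI_LINK = re.compile(r"\[\[([^\]]+)\]\]")  # assumed [[Page]] cross-link syntax


def scaffold(root: Path) -> None:
    """Create the three layers if they do not exist yet."""
    for layer in LAYERS:
        (root / layer).mkdir(parents=True, exist_ok=True)


def rebuild_index(root: Path) -> Path:
    """Regenerate INDEX.md as a sorted list of compiled wiki pages."""
    pages = sorted(p.stem for p in (root / "wiki").glob("*.md"))
    index = root / "INDEX.md"
    index.write_text("\n".join(f"- [[{name}]]" for name in pages) + "\n", encoding="utf-8")
    return index


def broken_links(root: Path) -> dict[str, list[str]]:
    """Map each wiki page to the cross-links that point at no existing page."""
    wiki = root / "wiki"
    existing = {p.stem for p in wiki.glob("*.md")}
    report: dict[str, list[str]] = {}
    for page in sorted(wiki.glob("*.md")):
        targets = WIKI_LINK.findall(page.read_text(encoding="utf-8"))
        missing = [t for t in targets if t not in existing]
        if missing:
            report[page.stem] = missing
    return report
```

A non-empty report from `broken_links` is a natural trigger for a compilation pass, since it lists entities that are referenced but have no compiled page.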

Main observations

Several points became clear during implementation.

First, deduplication often matters more than model cleverness. If the entity layer is noisy, the intelligence layer is likely to become noisy as well.

Second, incremental updates appear more effective than repeated full rebuilds. Once state begins to accumulate, merging only what is new is cheaper and more stable.

Third, relationship extraction likely carries a large fraction of the long-term value, but it is also where drift appears fastest. It therefore needs explicit cleanup logic.

Finally, maintenance is not optional. We expect this approach to succeed only if pruning, merging, validation, and coverage checks are treated as first-class operations.

Where Currents fits

Currents provides the ingestion layer in this architecture.

That matters because it allows the rest of the system to focus on extraction, organization, memory maintenance, and queryability rather than on source collection and normalization.

You can start here:

Closing

The original question was simple: if an LLM is continuously exposed to fresh news, can it build a more useful representation of the world than a list of headlines?

Our results suggest that the answer may be yes, provided that ingestion is coupled with extraction, persistent compilation, relationship tracking, and ongoing maintenance.

The implication is practical. News delivery alone is not sufficient for working memory. However, continuous ingestion combined with persistent dossiers appears capable of supporting a more stable and updateable knowledge layer for AI workflows.