Event and Case Evolution Over Time

Dynamic Updates and Recalculation

Events and Cases are updated dynamically as new information becomes available. Rather than recomputing on every new article, recalculation is triggered only when the set of top 10 articles (ranked by SESAMm's Article Reliability Score) changes. This happens, for example, when a new article enters the top 10, or when an existing article is annotated and then excluded or re-included.
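The trigger described above can be sketched as a comparison of two sets of article identifiers. This is a minimal illustration, not the production logic: the `Article` class, its `reliability` and `excluded` fields, and the tie-break on `article_id` are all hypothetical names and choices introduced here, with `reliability` standing in for the Article Reliability Score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    """Hypothetical article record; field names are assumptions."""
    article_id: str
    reliability: float      # stand-in for the Article Reliability Score
    excluded: bool = False  # set when an annotation excludes the article


def top_article_ids(articles, k=10):
    """Return the IDs of the k highest-scoring, non-excluded articles."""
    eligible = [a for a in articles if not a.excluded]
    # Sort by descending score; break ties on ID so the result is deterministic.
    ranked = sorted(eligible, key=lambda a: (-a.reliability, a.article_id))
    return frozenset(a.article_id for a in ranked[:k])


def needs_recalculation(previous_top, articles, k=10):
    """Recalculate only if the top-k set differs from the previous one."""
    return top_article_ids(articles, k) != previous_top
```

Comparing sets rather than ranked lists means that a reordering inside the top 10 does not trigger a recalculation, which matches the wording "the set of top 10 articles" above. Whether rank changes should also count is left open here.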

The pipeline runs daily, so titles, summaries, and metadata reflect the most relevant information available as of the latest run.

Stability Over Time

A key quality of the methodology is that Cases and Events remain stable as new documents arrive. To measure this, we track daily changes in clustering using two complementary metrics:

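Adjusted Rand Index (ARI): A standard, chance-corrected measure of similarity between two clusterings. It equals 1.0 when the clusterings are identical and is close to 0 when they agree no more than random groupings would; it can be slightly negative when agreement is worse than chance. Each day, we compare the current grouping to the previous day's grouping to quantify how much the clustering has shifted.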

Percentage of static Cases/Events: The share of Cases or Events whose composition is completely unchanged compared to the previous day.
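As a rough sketch of how these two metrics could be computed for one day-over-day comparison, the code below takes two mappings from document ID to cluster label. Several points are assumptions rather than part of the described methodology: the function names, the dictionary representation, computing the ARI only over documents present on both days, and using the current day's clusters as the denominator for the static share.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """ARI between two labelings of the same items (pair-counting form)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same items")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0
    # Pairs grouped together in both labelings, in A only, and in B only.
    both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    in_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = in_a * in_b / total_pairs
    maximum = (in_a + in_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both labelings are trivially identical
    return (both - expected) / (maximum - expected)


def member_sets(assignment):
    """Turn {doc_id: label} into a set of frozensets of member doc IDs."""
    members = {}
    for doc_id, label in assignment.items():
        members.setdefault(label, set()).add(doc_id)
    return {frozenset(m) for m in members.values()}


def daily_stability(previous, current):
    """Return (ARI on shared documents, share of unchanged clusters)."""
    shared = sorted(previous.keys() & current.keys())
    ari = adjusted_rand_index([previous[d] for d in shared],
                              [current[d] for d in shared])
    today = member_sets(current)
    unchanged = today & member_sets(previous)
    static_share = len(unchanged) / len(today) if today else 1.0
    return ari, static_share
```

Under these assumptions, a new document that joins an existing cluster leaves the ARI at 1.0 (it is not in the shared set) but makes that cluster count as non-static, which is one reason the two metrics are complementary.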

Results

Evaluated on a representative example (Lactalis, 2019–2024), the clustering demonstrates strong stability over time:

  • Case-level stability: The ARI remains very close to 1.0 throughout the entire period, with only occasional short-lived dips. The percentage of static cases stays above 90% the vast majority of the time, meaning that on most days fewer than 10% of cases are affected by the arrival of new documents.
  • Event-level stability: Events are even more stable than cases. After an initial settling period at the start of the observation window, the percentage of static events remains near 100%. This is expected by design: once an Event is formed around a set of documents, newly arriving articles relate to recent developments and do not reach back to disrupt older Events.

This stability gives users a reliable and consistent view of ESG controversies over time, ensuring that historical Cases and Events are not constantly reshuffled as new information comes in.