Supervision#

The supervision mechanism enables data quality management when working with external data sources. As an analyst, you can correct erroneous values while the system maintains the original data and ensures that provider corrections automatically flow through.

The supervision workflow#

Consider a typical scenario: you receive daily market data from a provider. Sometimes this data contains errors that need immediate correction for your reports. The supervision system allows you to:

  1. Apply manual corrections to fix errors immediately

  2. Continue receiving updates from the provider

  3. Have provider corrections automatically replace your manual fixes

  4. Maintain a full audit trail of all changes

Making manual corrections#

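Manual corrections can be applied in three ways: from the Python API, from the web interface (tsview), or from the Excel client. With the Python API, pass manual=True to update: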

>>> import pandas as pd
>>> # tsa is assumed to be an existing, already-configured API object
>>> corrected_values = pd.Series([95.5, 96.0],
...                              index=[pd.Timestamp('2023-03-15'),
...                                     pd.Timestamp('2023-03-16')])
>>> tsa.update('market-prices', corrected_values, 'analyst@corp.com', manual=True)

The same manual corrections can be made through the web interface (tsview) or the Excel client. Both automatically use manual=True when you edit values.

Your corrections appear immediately in the data. When the provider later sends corrected data through the normal update process, their fixes automatically replace your manual corrections.
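The replacement behaviour described above can be pictured with a toy model. The sketch below is not the library's implementation. It assumes, hypothetically, that the provider's values and the visible values are kept side by side, and that a provider update only acts on values that differ from what the provider sent before. The class ToySupervisedSeries and its methods are invented for illustration.

```python
class ToySupervisedSeries:
    """Illustrative model only: plain dicts keyed by date string."""

    def __init__(self):
        self.upstream = {}  # last values received from the provider
        self.current = {}   # values visible to readers (provider + manual)

    def update(self, values, manual=False):
        if manual:
            # A manual correction changes only the visible values.
            self.current.update(values)
            return
        # Assumption: only values that differ from the provider's previous
        # delivery count as provider changes.
        changed = {
            date: value
            for date, value in values.items()
            if self.upstream.get(date) != value
        }
        self.upstream.update(changed)
        # Provider changes replace whatever is visible, manual fixes included.
        self.current.update(changed)

    def markers(self):
        # True where the visible value differs from the provider's value.
        return {
            date: self.upstream.get(date) != value
            for date, value in self.current.items()
        }


series = ToySupervisedSeries()
# Provider delivery in which the first value is wrong.
series.update({"2023-03-15": 9.55, "2023-03-16": 96.0})
# The analyst fixes the wrong value by hand.
series.update({"2023-03-15": 95.5}, manual=True)
after_manual = dict(series.current)
markers_after_manual = series.markers()
# The provider later sends its own correction.
series.update({"2023-03-15": 95.4, "2023-03-16": 96.0})
after_provider = dict(series.current)
markers_after_provider = series.markers()
```

In this model the manual value 95.5 is visible until the provider's corrected value 95.4 arrives, after which no point is flagged as manually edited.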

Viewing data provenance#

The .edited method shows you which values are original versus manually edited:

>>> series, markers = tsa.edited('market-prices')

The returned series contains the current values, while markers is True where data was manually edited and False for provider data.
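One common use of the two returned series is to list only the manually edited points. The sketch below builds stand-in series and markers objects with pandas instead of calling tsa.edited, so the values are made up for illustration. It assumes both series share the same index.

```python
import pandas as pd

index = pd.to_datetime(["2023-03-14", "2023-03-15", "2023-03-16"])
# Stand-ins for the two objects returned by tsa.edited(...); values are invented.
series = pd.Series([94.8, 95.5, 96.0], index=index)
markers = pd.Series([False, True, True], index=index)

# Boolean indexing keeps the points flagged as manually edited.
manual_points = series[markers]
# Inverting the mask keeps the provider's points.
provider_points = series[~markers]
```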

Understanding supervision status#

Every series has a supervision status in its metadata:

  • unsupervised: series updated only through normal updates (never with manual=True)

  • handcrafted: series created and maintained entirely with manual updates (always with manual=True, whether from Python, Excel client, or web UI)

  • supervised: series containing both provider updates and manual corrections

This helps you quickly identify which series have manual interventions.
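The three statuses follow from two yes/no questions: has the series ever received a normal update, and has it ever received a manual one? The function below is a hypothetical illustration of that rule, not the library's code. The name classify_status and its arguments are assumptions, and the status strings use the spelling returned by supervision_status.

```python
def classify_status(has_normal_updates, has_manual_updates):
    """Map the two kinds of update history to a status name (illustrative)."""
    if has_normal_updates and has_manual_updates:
        return "supervised"
    if has_manual_updates:
        return "handcrafted"
    if has_normal_updates:
        return "unsupervised"
    # A series with no updates at all is outside the scope of this sketch.
    raise ValueError("series has no updates")
```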

Supervision API Reference#

class mainsource(uri, namespace='tsh', tshclass=<class 'tshistory.tsio.timeseries'>, othersources=None)

API façade for the main source (talks directly to the storage)

The API documentation is carried by this object. The HTTP client provides exactly the same methods.

Parameters:
  • uri (str)

  • namespace (str)

  • tshclass (type)

edited(name, revision_date=None, from_value_date=None, to_value_date=None, inferred_freq=False, _keep_nans=False)

Returns the base series and a second boolean series whose entries indicate if an override has been made or not.

Parameters:
  • name (str)

  • revision_date (Timestamp | None)

  • from_value_date (Timestamp | None)

  • to_value_date (Timestamp | None)

  • inferred_freq (bool | None)

  • _keep_nans (bool)

Return type:

Tuple[Series, Series]

supervision_status(name)

Returns the supervision status of a series. Possible values are unsupervised, handcrafted and supervised.

Parameters:
  • name (str)

Return type:

str
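
As a usage sketch, the statuses can be used to list the series that carry manual interventions. The helper below relies only on supervision_status(name) as documented above. The helper name group_by_status, the FakeApi stand-in, and the series names other than 'market-prices' are hypothetical; in practice you would pass your real API object and your own series names.

```python
from collections import defaultdict


def group_by_status(api, names):
    """Group series names by the status reported by api.supervision_status."""
    groups = defaultdict(list)
    for name in names:
        groups[api.supervision_status(name)].append(name)
    return dict(groups)


class FakeApi:
    """Stand-in for a real API object, used only to make the sketch runnable."""

    def __init__(self, statuses):
        self._statuses = statuses

    def supervision_status(self, name):
        return self._statuses[name]


api = FakeApi({
    "market-prices": "supervised",
    "fx-rates": "unsupervised",
    "budget-targets": "handcrafted",
})
groups = group_by_status(api, ["market-prices", "fx-rates", "budget-targets"])
# Series whose values include at least some manual input.
needs_review = groups.get("supervised", []) + groups.get("handcrafted", [])
```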