Package overview

Purpose

The “timeseries refinery” is a powerful tool to build complete time series information systems.

It provides:

  • a compact, versioned time series storage system with a simple yet powerful API (native Python and REST)

  • an editing/supervision workflow for experts

  • computed versioned time series, defined with a simple yet expressive dedicated formula language (think of Excel formulas, but simpler and specialised for time series) and exposed through a simple API

  • a cache system for the computed series

  • a central repository of time series (stored or computed)

  • rich metadata support and querying/filtering over various kinds of metadata

  • support for timezone aware and naive time series

  • a distributed, data-mesh architecture that can connect to other refineries (for collaborative work) or to other time series silos

  • a task manager and a mini-framework to handle regular ingestion of time series from external or internal sources, as well as model runs

  • a tool to detect and watch series that lag behind (because of failing sources)

  • a powerful low-code declarative dashboard system (specially targeted at energy commodity markets)

  • a rich web user interface for exploring the catalog and for viewing and editing series, formulas, caches, tasks, watchers and dashboards

It can be used as a building block for machine learning, model optimization and validation, both for inputs and outputs.
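To make the idea of versioned storage from the feature list more concrete, here is a minimal, self-contained sketch of what "versioned" can mean for a time series. It is a toy in-memory model and not the refinery's storage engine or API: the `VersionedSeries` class and its `update` and `get` methods are hypothetical names chosen for illustration. Each update stores only the points that changed, together with an insertion date, so the series can be rebuilt as it was known at any past date.

```python
from datetime import datetime


class VersionedSeries:
    """Toy in-memory model of a versioned time series (illustration only)."""

    def __init__(self):
        # List of (insertion_date, {timestamp: value}) diffs.
        # Assumption: updates arrive in chronological insertion order.
        self._revisions = []

    def update(self, insertion_date, points):
        """Record a new revision holding only the points that changed."""
        current = self.get()
        diff = {ts: v for ts, v in points.items() if current.get(ts) != v}
        if diff:
            self._revisions.append((insertion_date, diff))
        return diff

    def get(self, revision_date=None):
        """Rebuild the series as known at revision_date (latest if None)."""
        state = {}
        for inserted_at, diff in self._revisions:
            if revision_date is not None and inserted_at > revision_date:
                break
            state.update(diff)
        return dict(sorted(state.items()))


series = VersionedSeries()
# First version: two hourly points, inserted on January 1st.
series.update(
    datetime(2024, 1, 1),
    {datetime(2024, 1, 1, 0): 1.0, datetime(2024, 1, 1, 1): 2.0},
)
# Second version: one corrected point and one new point, inserted a day later.
series.update(
    datetime(2024, 1, 2),
    {datetime(2024, 1, 1, 1): 2.5, datetime(2024, 1, 1, 2): 3.0},
)

latest = series.get()
as_of_first = series.get(revision_date=datetime(2024, 1, 1))
```

Here `latest` contains the corrected value and the new point, while `as_of_first` still shows the series exactly as it stood after the first insertion. Keeping every revision in this way is what makes an audit trail and "as of" queries possible.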

Usages

The time series repository can be used from the user interface. It has also been designed first and foremost around a data-analyst-friendly Python API (in addition to the standard REST API).

The refinery can be used:

  • as a standalone tool

  • as a framework to build your own applications on top

  • as a library that you can embed in your (Python) applications

It provides several extension points through a plugin system: the API can be extended, and the formula system can receive new custom operators and functions.
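As a rough illustration of what adding custom operators to a formula system can look like, the sketch below uses a small operator registry and a recursive evaluator. Everything in it is hypothetical and written for this example only: the `operator` decorator, the `OPERATORS` registry, the `evaluate` function and the nested-tuple expression format are assumptions, not the refinery's actual plugin mechanism or formula syntax.

```python
# Hypothetical registry mapping operator names to Python functions.
OPERATORS = {}


def operator(name):
    """Decorator that registers a function as a formula operator."""
    def register(func):
        OPERATORS[name] = func
        return func
    return register


@operator("add")
def add(a, b):
    # Pointwise sum over the timestamps present in both series.
    return {ts: a[ts] + b[ts] for ts in a if ts in b}


@operator("scale")
def scale(series, factor):
    # Multiply every value of the series by a constant factor.
    return {ts: value * factor for ts, value in series.items()}


def evaluate(expr, catalog):
    """Evaluate an expression against a catalog of stored series.

    An expression is a series name (str), a constant (number),
    or a tuple of the form (operator_name, *arguments).
    """
    if isinstance(expr, tuple):
        name, *args = expr
        return OPERATORS[name](*(evaluate(arg, catalog) for arg in args))
    if isinstance(expr, str):
        return catalog[expr]
    return expr


catalog = {
    "a": {"2024-01-01": 1.0, "2024-01-02": 2.0},
    "b": {"2024-01-01": 10.0, "2024-01-02": 20.0},
}
# Average of series "a" and "b", expressed as a nested formula.
result = evaluate(("scale", ("add", "a", "b"), 0.5), catalog)
```

In this toy design, a new operator only needs to be decorated to become available in formulas. The evaluator itself never changes, which is the general appeal of a registry-based extension point.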

It has been designed to cooperate nicely within any existing IT infrastructure addressing time series data needs.

Its primary purposes and design goals are:

  • to give maximum autonomy to non-programmers working with time series data (analysts, data scientists), so that they do not have to beg the IT department

  • to give very user-friendly APIs to manipulate data and computations

  • to provide an extensive audit trail from the acquired external data to the final dashboard product (it makes it easy to understand the provenance of all the data)

  • to foster good data governance and collaboration over the time series repository

  • to do all of the above while providing an efficient storage system and horizontal scalability

License

The Timeseries Refinery - Timeseries management tool
Copyright (C) 2024  Pythonian

This library is an open source software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
Lesser General Public License for more details.

To learn more about the LGPL license, see the full text of the GNU Lesser General Public License published by the Free Software Foundation.