Timeseries Store Usage

Starting with a fresh database

You need a PostgreSQL database. You can create one like this:

createdb mydb

Then, initialize the tshistory tables, like this:

tsh init-db postgresql://me:password@localhost/mydb

From there, you’re ready to go!

Creating a series

Here’s a simple example:

>>> import pandas as pd
>>> from tshistory.api import timeseries
>>>
>>> tsa = timeseries('postgresql://me:password@localhost/mydb')
>>>
>>> series = pd.Series([1, 2, 3],
...                    pd.date_range(start=pd.Timestamp(2017, 1, 1),
...                                  freq='D', periods=3))
# db insertion
>>> tsa.update('my_series', series, 'babar@pythonian.fr')
...
2017-01-01    1.0
2017-01-02    2.0
2017-01-03    3.0
Freq: D, Name: my_series, dtype: float64

# note how our integers got turned into floats
# (there are no provisions to handle integer series as of today)

# retrieval
>>> tsa.get('my_series')
...
2017-01-01    1.0
2017-01-02    2.0
2017-01-03    3.0
Name: my_series, dtype: float64

By convention, we name the time series API object tsa.

Updating a series

This is good. Now, let’s insert more:

>>> series = pd.Series([2, 7, 8, 9],
...                    pd.date_range(start=pd.Timestamp(2017, 1, 2),
...                                  freq='D', periods=4))
# db insertion
>>> tsa.update('my_series', series, 'babar@pythonian.fr')
...
2017-01-03    7.0
2017-01-04    8.0
2017-01-05    9.0
Name: my_series, dtype: float64

# you get back the *new information* you put inside
# and this is why the `2` doesn't appear (it was already put
# there in the first step)

# db retrieval
>>> tsa.get('my_series')
...
2017-01-01    1.0
2017-01-02    2.0
2017-01-03    7.0
2017-01-04    8.0
2017-01-05    9.0
Name: my_series, dtype: float64

Note that the third value was replaced and the last two values were appended. As noted, the point at 2017-01-02 carried no new information, so it was ignored.
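
The update rule described above can be modelled in a few lines. This is only an illustrative sketch, not tshistory's implementation: plain dicts stand in for pandas series so that it runs without a database, and `compute_diff` is a hypothetical helper name.

```python
def compute_diff(stored, incoming):
    """Return the points of `incoming` that are new or changed
    relative to `stored` (hypothetical helper, not a tshistory function)."""
    return {
        stamp: value
        for stamp, value in incoming.items()
        if stamp not in stored or stored[stamp] != value
    }


# State after the first insertion.
stored = {'2017-01-01': 1.0, '2017-01-02': 2.0, '2017-01-03': 3.0}

# Second insertion: one unchanged point, one changed point, two new points.
incoming = {'2017-01-02': 2.0, '2017-01-03': 7.0,
            '2017-01-04': 8.0, '2017-01-05': 9.0}

diff = compute_diff(stored, incoming)
stored.update(diff)  # apply the diff to obtain the new state

print(diff)
print(stored)
```

The diff holds the three points from 2017-01-03 to 2017-01-05, matching what `update` returned in the session above, and the resulting state matches the output of `get`.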

Retrieving history

We can access the whole history (or parts of it) in one call:

>>> history = tsa.history('my_series')
...
>>>
>>> for idate, series in history.items(): # it's a dict
...     print('insertion date:', idate)
...     print(series)
...
insertion date: 2018-09-26 17:10:36.988920+02:00
2017-01-01    1.0
2017-01-02    2.0
2017-01-03    3.0
Name: my_series, dtype: float64
insertion date: 2018-09-26 17:12:54.508252+02:00
2017-01-01    1.0
2017-01-02    2.0
2017-01-03    7.0
2017-01-04    8.0
2017-01-05    9.0
Name: my_series, dtype: float64

Note how this shows the full series state for each insertion date. Also, the insertion date is timezone-aware.
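
Timezone awareness matters when you compare insertion dates with your own timestamps. The sketch below uses only the Python standard library (it does not call tshistory) and reuses the first insertion date from the output above. It shows that aware datetimes compare by instant, and that ordering a naive datetime against an aware one raises an error.

```python
from datetime import datetime, timedelta, timezone

# Insertion date copied from the history output above (UTC+02:00).
idate = datetime(2018, 9, 26, 17, 10, 36, 988920,
                 tzinfo=timezone(timedelta(hours=2)))

# The same instant expressed in UTC.
idate_utc = idate.astimezone(timezone.utc)
print(idate_utc.isoformat())

# Aware datetimes compare by instant, not by wall-clock digits.
same_instant = idate == idate_utc
print(same_instant)

# Ordering a naive datetime against an aware one is an error.
naive = datetime(2018, 9, 26, 17, 11)
try:
    naive > idate
    comparison_failed = False
except TypeError as err:
    comparison_failed = True
    print('cannot compare:', err)
```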

Specific versions of a series can be retrieved individually using the get method as follows:

>>> tsa.get('my_series', revision_date=pd.Timestamp('2018-09-26 17:11+02:00'))
...
2017-01-01    1.0
2017-01-02    2.0
2017-01-03    3.0
Name: my_series, dtype: float64
>>>
>>> tsa.get('my_series', revision_date=pd.Timestamp('2018-09-26 17:14+02:00'))
...
2017-01-01    1.0
2017-01-02    2.0
2017-01-03    7.0
2017-01-04    8.0
2017-01-05    9.0
Name: my_series, dtype: float64
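
The two calls above suggest that `revision_date` selects the most recent version inserted no later than the given date. Here is a minimal sketch of that lookup, assuming this "at or before" rule. Plain dicts stand in for series, `version_at` is a hypothetical helper, and returning `None` when no version exists yet is a choice made for this sketch, not documented tshistory behaviour.

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone


def version_at(history, revision_date):
    """Return the latest state inserted at or before `revision_date`,
    or None if nothing had been inserted yet (assumption of this sketch)."""
    idates = sorted(history)
    pos = bisect_right(idates, revision_date)
    if pos == 0:
        return None
    return history[idates[pos - 1]]


tz = timezone(timedelta(hours=2))

# Insertion dates (truncated to the second) mapped to full series states.
history = {
    datetime(2018, 9, 26, 17, 10, 36, tzinfo=tz): {
        '2017-01-01': 1.0, '2017-01-02': 2.0, '2017-01-03': 3.0},
    datetime(2018, 9, 26, 17, 12, 54, tzinfo=tz): {
        '2017-01-01': 1.0, '2017-01-02': 2.0, '2017-01-03': 7.0,
        '2017-01-04': 8.0, '2017-01-05': 9.0},
}

first = version_at(history, datetime(2018, 9, 26, 17, 11, tzinfo=tz))
second = version_at(history, datetime(2018, 9, 26, 17, 14, tzinfo=tz))
before = version_at(history, datetime(2018, 9, 26, 17, 0, tzinfo=tz))

print(first)
print(second)
print(before)
```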

It is possible to retrieve only the differences between successive insertions:

>>> diffs = tsa.history('my_series', diffmode=True)
...
>>> for idate, series in diffs.items():
...     print('insertion date:', idate)
...     print(series)
...
insertion date: 2018-09-26 17:10:36.988920+02:00
2017-01-01    1.0
2017-01-02    2.0
2017-01-03    3.0
Name: my_series, dtype: float64
insertion date: 2018-09-26 17:12:54.508252+02:00
2017-01-03    7.0
2017-01-04    8.0
2017-01-05    9.0
Name: my_series, dtype: float64
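
The diff view and the full history view carry the same information: applying the diffs in insertion order rebuilds each full state. The following is a rough sketch of that relationship using plain dicts and short labels in place of real insertion dates. `replay` is a hypothetical helper, and point deletions are not modelled.

```python
def replay(diffs):
    """Rebuild the full state at each insertion date by applying
    the diffs in chronological order (hypothetical helper)."""
    state = {}
    states = {}
    for idate in sorted(diffs):
        # Later diffs override earlier values at the same timestamp.
        state = {**state, **diffs[idate]}
        states[idate] = dict(sorted(state.items()))
    return states


# Diffs as shown in the diffmode output, keyed by a short insertion label.
diffs = {
    '17:10:36': {'2017-01-01': 1.0, '2017-01-02': 2.0, '2017-01-03': 3.0},
    '17:12:54': {'2017-01-03': 7.0, '2017-01-04': 8.0, '2017-01-05': 9.0},
}

states = replay(diffs)
for idate, state in states.items():
    print('insertion date:', idate)
    print(state)
```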

You can attach metadata to a series and read it back:

>>> tsa.update_metadata('my_series', {'foo': 42})
>>> tsa.metadata('my_series')
{'foo': 42}