Timeseries Store Usage
======================

Starting with a fresh database
------------------------------

You need a PostgreSQL database. You can create one like this:

.. code:: shell

   createdb mydb

Then, initialize the ``tshistory`` tables, like this:

.. code:: shell

   tsh init-db postgresql://me:password@localhost/mydb

From there you're ready to go!

Creating a series
-----------------

Here's a simple example:

.. code:: python

   >>> import pandas as pd
   >>> from tshistory.api import timeseries
   >>>
   >>> tsa = timeseries('postgresql://me:password@localhost/mydb')
   >>>
   >>> series = pd.Series([1, 2, 3],
   ...                    pd.date_range(start=pd.Timestamp(2017, 1, 1),
   ...                                  freq='D', periods=3))
   # db insertion
   >>> tsa.update('my_series', series, 'babar@pythonian.fr')
   ...
   2017-01-01    1.0
   2017-01-02    2.0
   2017-01-03    3.0
   Freq: D, Name: my_series, dtype: float64

   # note how our integers got turned into floats
   # (there are no provisions to handle integer series as of today)

   # retrieval
   >>> tsa.get('my_series')
   ...
   2017-01-01    1.0
   2017-01-02    2.0
   2017-01-03    3.0
   Name: my_series, dtype: float64

By convention, we name the time series API object ``tsa``.

Updating a series
-----------------

This is good. Now, let's insert more:

.. code:: python

   >>> series = pd.Series([2, 7, 8, 9],
   ...                    pd.date_range(start=pd.Timestamp(2017, 1, 2),
   ...                                  freq='D', periods=4))
   # db insertion
   >>> tsa.update('my_series', series, 'babar@pythonian.fr')
   ...
   2017-01-03    7.0
   2017-01-04    8.0
   2017-01-05    9.0
   Name: my_series, dtype: float64

   # you get back the *new information* you put inside
   # and this is why the `2` doesn't appear (it was already put
   # there in the first step)

   # db retrieval
   >>> tsa.get('my_series')
   ...
   2017-01-01    1.0
   2017-01-02    2.0
   2017-01-03    7.0
   2017-01-04    8.0
   2017-01-05    9.0
   Name: my_series, dtype: float64

It is important to note that the third value was *replaced*, and the
last two values were simply *appended*.
As noted, the point at ``2017-01-02`` carried no new information, so it
was ignored.

Retrieving history
------------------

We can access the whole history (or parts of it) in one call:

.. code:: python

   >>> history = tsa.history('my_series')
   ...
   >>>
   >>> for idate, series in history.items():  # it's a dict
   ...     print('insertion date:', idate)
   ...     print(series)
   ...
   insertion date: 2018-09-26 17:10:36.988920+02:00
   2017-01-01    1.0
   2017-01-02    2.0
   2017-01-03    3.0
   Name: my_series, dtype: float64
   insertion date: 2018-09-26 17:12:54.508252+02:00
   2017-01-01    1.0
   2017-01-02    2.0
   2017-01-03    7.0
   2017-01-04    8.0
   2017-01-05    9.0
   Name: my_series, dtype: float64

Note how this shows the full series state for each insertion date. Also,
the insertion date is timezone aware.

Specific versions of a series can be retrieved individually using the
``get`` method with a ``revision_date``, which returns the series as it
stood at that date:

.. code:: python

   >>> tsa.get('my_series', revision_date=pd.Timestamp('2018-09-26 17:11+02:00'))
   ...
   2017-01-01    1.0
   2017-01-02    2.0
   2017-01-03    3.0
   Name: my_series, dtype: float64
   >>>
   >>> tsa.get('my_series', revision_date=pd.Timestamp('2018-09-26 17:14+02:00'))
   ...
   2017-01-01    1.0
   2017-01-02    2.0
   2017-01-03    7.0
   2017-01-04    8.0
   2017-01-05    9.0
   Name: my_series, dtype: float64

It is possible to retrieve only the differences between successive
insertions:

.. code:: python

   >>> diffs = tsa.history('my_series', diffmode=True)
   ...
   >>> for idate, series in diffs.items():
   ...     print('insertion date:', idate)
   ...     print(series)
   ...
   insertion date: 2018-09-26 17:10:36.988920+02:00
   2017-01-01    1.0
   2017-01-02    2.0
   2017-01-03    3.0
   Name: my_series, dtype: float64
   insertion date: 2018-09-26 17:12:54.508252+02:00
   2017-01-03    7.0
   2017-01-04    8.0
   2017-01-05    9.0
   Name: my_series, dtype: float64

You can attach metadata to a series and read it back:

.. code:: python

   >>> tsa.update_metadata('my_series', {'foo': 42})
   >>> tsa.metadata('my_series')
   {'foo': 42}
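
To make the update, revision and history semantics described above more
concrete, here is a minimal sketch of a toy in-memory model in plain
Python. It is not how ``tshistory`` is implemented: the ``ToyStore``
class, its dict-based storage and the explicit ``insertion_date``
argument are hypothetical names introduced for illustration only.

.. code:: python

   from datetime import datetime, timedelta, timezone


   class ToyStore:
       """Hypothetical in-memory stand-in (not tshistory's implementation)."""

       def __init__(self):
           # name -> list of (insertion_date, full_state), oldest first
           self._revisions = {}

       def update(self, name, points, insertion_date):
           """Store new or changed points and return only those points."""
           revs = self._revisions.setdefault(name, [])
           current = revs[-1][1] if revs else {}
           diff = {ts: v for ts, v in points.items() if current.get(ts) != v}
           if diff:
               revs.append((insertion_date, {**current, **diff}))
           return diff

       def get(self, name, revision_date=None):
           """Return the latest state, or the state as of revision_date."""
           state = {}
           for idate, snapshot in self._revisions.get(name, []):
               if revision_date is not None and idate > revision_date:
                   break
               state = snapshot
           return dict(sorted(state.items()))

       def history(self, name, diffmode=False):
           """Map each insertion date to the full state, or to the diff."""
           out, previous = {}, {}
           for idate, snapshot in self._revisions.get(name, []):
               if diffmode:
                   out[idate] = {ts: v for ts, v in snapshot.items()
                                 if previous.get(ts) != v}
               else:
                   out[idate] = dict(snapshot)
               previous = snapshot
           return out


   store = ToyStore()
   t0 = datetime(2018, 9, 26, 17, 10, tzinfo=timezone.utc)
   t1 = t0 + timedelta(minutes=2)

   first = store.update(
       'my_series',
       {'2017-01-01': 1.0, '2017-01-02': 2.0, '2017-01-03': 3.0}, t0)
   second = store.update(
       'my_series',
       {'2017-01-02': 2.0, '2017-01-03': 7.0,
        '2017-01-04': 8.0, '2017-01-05': 9.0}, t1)

   # only the replaced and appended points come back
   print(second)
   # the state as it stood right after the first insertion
   print(store.get('my_series', revision_date=t0))

In this sketch each revision keeps a full snapshot, which keeps ``get``
and ``history`` trivial at the cost of memory; a real store would more
likely persist only the differences.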