
Cachier: Persistent, stale-free local/cross-machine caching for Python functions

Persistent, stale-free, local and cross-machine caching for Python functions.

from cachier import cachier
import datetime

@cachier(stale_after=datetime.timedelta(days=3))
def foo(arg1, arg2):
    """foo now has a persistent cache, triggering recalculation for values stored more than 3 days."""
    return {'arg1': arg1, 'arg2': arg2}

Install cachier with:

pip install cachier

Features

  • Pure Python.
  • Compatible with Python 2.7+ and Python 3.5+.
  • Tested on Linux and OS X systems. Does not support Windows.
  • A simple interface.
  • Defining "shelf life" for cached values.
  • Local caching using pickle files.
  • Cross-machine caching using MongoDB.
  • Thread-safety.

Cachier is not:

  • Meant as a transient cache. Python's @lru_cache is better.
  • Especially fast. It is meant to replace function calls that take more than... a second, say (overhead is around 1 millisecond).

Future features

  • Windows support.
  • S3 core.
  • Multi-core caching.

The positional and keyword arguments to the wrapped function must be hashable (i.e. Python's immutable built-in objects, not mutable containers). Also note that instances of user-defined classes are hashable by default but all compare unequal (their hash value is derived from their id), so equal objects across different sessions will not yield identical keys.
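The hashability requirement can be checked with the standard library alone. The following is a minimal sketch; `Point` and `is_hashable` are hypothetical names used for illustration and are not part of cachier.

```python
class Point(object):
    """Hypothetical user-defined class relying on the default hash and equality."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def is_hashable(value):
    """Return True if value can be hashed, and so could be part of a cache key."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


# Immutable built-ins are hashable; mutable containers are not.
tuple_ok = is_hashable((1, 'a'))
list_ok = is_hashable([1, 'a'])
dict_ok = is_hashable({'a': 1})

# Instances are hashable by default, but two objects holding the same data
# still compare unequal, so they would not map to the same cache entry.
p1, p2 = Point(1, 2), Point(1, 2)
same_key = (p1 == p2)
```

Here `tuple_ok` is true, `list_ok` and `dict_ok` are false, and `same_key` is false even though both points hold the same coordinates.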

Setting up a Cache

You can add a default, pickle-based, persistent cache to your function - meaning it will last across different Python kernels calling the wrapped function - by decorating it with the cachier decorator (notice the ()!).

from cachier import cachier

@cachier()
def foo(arg1, arg2):
    """Your function now has a persistent cache mapped by argument values!"""
    return {'arg1': arg1, 'arg2': arg2}

Resetting a Cache

The Cachier wrapper adds a clear_cache() function to each wrapped function. To reset the cache of the wrapped function simply call this method:

foo.clear_cache()

Cache Shelf Life

Setting Shelf Life

You can set any duration as the shelf life of cached return values of a function by providing a corresponding timedelta object to the stale_after parameter:

import datetime

@cachier(stale_after=datetime.timedelta(weeks=2))
def bar(arg1, arg2):
    return {'arg1': arg1, 'arg2': arg2}

Now, when a cached value matching the given arguments is found, the time of its calculation is checked; if more than stale_after time has passed since then, the function will be run again for the same arguments and the new value will be cached and returned.
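The shelf-life check described above can be sketched with an in-memory dictionary. This is an illustration under simplifying assumptions (no persistence, positional arguments only), not cachier's actual implementation; `stale_aware_cache` and its `store` attribute are hypothetical names.

```python
import datetime
import functools


def stale_aware_cache(stale_after):
    """In-memory illustration of shelf life; NOT cachier's implementation."""
    def decorator(func):
        store = {}  # maps argument tuple -> (value, time of calculation)

        @functools.wraps(func)
        def wrapper(*args):
            now = datetime.datetime.now()
            if args in store:
                value, calculated_at = store[args]
                if now - calculated_at <= stale_after:
                    return value  # still fresh: reuse the cached value
            # Missing or stale: recalculate, cache and return the new value.
            value = func(*args)
            store[args] = (value, now)
            return value

        wrapper.store = store  # exposed only so the example can be inspected
        return wrapper
    return decorator


calls = []  # records every real execution of bar


@stale_aware_cache(stale_after=datetime.timedelta(weeks=2))
def bar(arg1, arg2):
    calls.append((arg1, arg2))
    return {'arg1': arg1, 'arg2': arg2}


bar(1, 2)  # calculated and cached
bar(1, 2)  # served from the cache; the function body does not run again

# Pretend the entry was calculated three weeks ago to force a recalculation.
value, _ = bar.store[(1, 2)]
three_weeks_ago = datetime.datetime.now() - datetime.timedelta(weeks=3)
bar.store[(1, 2)] = (value, three_weeks_ago)
bar(1, 2)  # stale entry: the function runs again
```

After these three calls the function body has run twice: once for the initial miss and once for the stale entry.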

This is useful for lengthy calculations that depend on a dynamic data source.

Fuzzy Shelf Life

Sometimes you may want your function to trigger a calculation when it encounters a stale result, but still not wait on it if it's not that critical. In that case you can set next_time to True to have your function trigger a recalculation in a separate thread, but return the currently cached stale value:

@cachier(next_time=True)

Further function calls made while the calculation is being performed will not trigger redundant calculations.
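The behaviour described above (return the stale value at once, recalculate in the background, and never start a second recalculation while one is running) can be sketched with the standard threading module. This is a toy, single-value illustration and not cachier's code; `StaleWhileRecalculating`, `slow_lookup` and the manually set `stale` flag are assumptions made for the example.

```python
import threading


class StaleWhileRecalculating(object):
    """Toy single-value cache illustrating next_time; NOT cachier's code."""

    def __init__(self, func):
        self.func = func
        self.value = None
        self.has_value = False
        self.stale = False      # set by the caller to simulate expiry
        self.worker = None      # the background thread, while one is running
        self.lock = threading.Lock()

    def _recalculate(self):
        new_value = self.func()
        with self.lock:
            self.value = new_value
            self.has_value = True
            self.stale = False
            self.worker = None

    def get(self):
        with self.lock:
            if self.has_value:
                if self.stale and self.worker is None:
                    # Start exactly one background recalculation.
                    self.worker = threading.Thread(target=self._recalculate)
                    self.worker.start()
                return self.value   # possibly stale, but returned at once
        # Nothing cached yet: calculate in the calling thread.
        self._recalculate()
        return self.value


release = threading.Event()
counter = {'calls': 0}


def slow_lookup():
    counter['calls'] += 1
    if counter['calls'] > 1:
        release.wait(timeout=5)  # simulate a long recalculation
    return counter['calls']


cache = StaleWhileRecalculating(slow_lookup)
first = cache.get()      # calculated synchronously
cache.stale = True       # pretend the shelf life has expired
second = cache.get()     # stale value returned, background thread started
third = cache.get()      # stale value again, no second thread started
worker = cache.worker
release.set()            # let the background calculation finish
worker.join()
fourth = cache.get()     # the freshly calculated value
```

The second and third calls both return the old value without blocking, and `slow_lookup` runs only twice in total.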

Per-function call arguments

Cachier also accepts several keyword arguments in the calls of the function it wraps, rather than in the decorator call, allowing you to modify its behaviour for a specific function call.

Ignore Cache

You can have cachier ignore any existing cache for a specific function call by passing ignore_cache=True to the function call. The cache will neither be checked nor updated with the new return value.

@cachier()
def sum(first_num, second_num):
    return first_num + second_num

def main():
    print(sum(5, 3, ignore_cache=True))

Overwrite Cache

You can have cachier overwrite an existing cache entry - if one exists - for a specific function call by passing overwrite_cache=True to the function call. The cache will not be checked, but will be updated with the new return value.
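The difference between ignore_cache and overwrite_cache can be made concrete with a small in-memory stand-in. This is a sketch of the described semantics only; `toy_cachier` is a hypothetical decorator and does not reflect how cachier is implemented.

```python
import functools


def toy_cachier(func):
    """In-memory illustration of per-call keyword arguments; NOT cachier itself."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Pop the control flags so they are not forwarded to the wrapped function.
        ignore_cache = kwargs.pop('ignore_cache', False)
        overwrite_cache = kwargs.pop('overwrite_cache', False)
        key = (args, tuple(sorted(kwargs.items())))
        if ignore_cache:
            # The cache is neither read nor written.
            return func(*args, **kwargs)
        if overwrite_cache or key not in store:
            # The cache is written without being read first.
            store[key] = func(*args, **kwargs)
        return store[key]

    return wrapper


calls = []  # records every real execution of add


@toy_cachier
def add(first_num, second_num):
    calls.append((first_num, second_num))
    return first_num + second_num


add(5, 3)                        # calculated and cached
add(5, 3)                        # served from the cache
add(5, 3, ignore_cache=True)     # calculated again, cache untouched
add(5, 3, overwrite_cache=True)  # calculated again, cache entry replaced
```

Of the four calls, three execute the function body; only the plain repeated call is answered from the cache.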

Verbose Cache Call

You can have cachier print out a detailed explanation of the logic of a specific call by passing verbose_cache=True to the function call. This can be useful if you are not sure why a certain function result is or is not returned.

Pickle Core

The default core for Cachier is pickle based, meaning each function will store its cache in a separate pickle file in the ~/.cachier directory. Naturally, this kind of cache is both machine-specific and user-specific.
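The idea of one pickle file per function can be sketched as follows. This is a simplified illustration, not cachier's pickle core: it uses a temporary directory in place of ~/.cachier, an assumed file naming scheme, and no file locking, so it is not safe for concurrent use.

```python
import os
import pickle
import tempfile


def load_cache(path):
    """Return the cache dict stored at path, or an empty dict if none exists."""
    if not os.path.exists(path):
        return {}
    with open(path, 'rb') as cache_file:
        return pickle.load(cache_file)


def cached_call(func, args, cache_dir):
    """Call func(*args), persisting results in one pickle file per function."""
    # Hypothetical naming scheme: <function name>.pkl inside cache_dir.
    path = os.path.join(cache_dir, func.__name__ + '.pkl')
    cache = load_cache(path)  # the file is re-read on every call
    if args not in cache:
        cache[args] = func(*args)
        with open(path, 'wb') as cache_file:
            pickle.dump(cache, cache_file)
    return cache[args]


def square(x):
    return x * x


cache_dir = tempfile.mkdtemp()  # stand-in for the ~/.cachier directory
result = cached_call(square, (4,), cache_dir)
on_disk = load_cache(os.path.join(cache_dir, 'square.pkl'))
```

Because the results live in a file rather than in process memory, a later Python session pointed at the same directory would find the entry for `(4,)` already present.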

You can slightly optimize pickle-based caching if you know your code will only be used in a single-threaded environment by setting:

@cachier(pickle_reload=False)

This will prevent reading the cache file on each cache read, speeding things up a bit, while also nullifying inter-thread functionality (the code is still thread safe, but different threads will have different versions of the cache at times, and will sometimes make unnecessary function calls).

MongoDB Core

You can set a MongoDB-based cache by assigning mongetter a callable that returns a pymongo.Collection object with writing permission (my_mongetter below is a placeholder name for your own callable):

@cachier(mongetter=my_mongetter)

This allows you to have a cross-machine, albeit slower, cache. This functionality requires the pymongo Python package to be installed.

The package author and current maintainer is Shay Palachy (shay.palachy@gmail.com); you are more than welcome to approach him for help. Contributions are very welcome.

Installing for development

Clone:

git clone git@github.com:shaypal5/cachier.git

Install in development mode with test dependencies:

cd cachier
pip install -e ".[test]"

Running the tests

To run the tests use:

python -m pytest --cov=cachier

Adding documentation

This project is documented using the numpy docstring conventions, which were chosen as they are perhaps the most widespread conventions that are both supported by common tools such as Sphinx and result in human-readable docstrings. When documenting code you add to this project, please follow these conventions.

Created by Shay Palachy (shay.palachy@gmail.com).

