
It tries to provide functionality similar to what relational databases offer through SQL,
but here everything is done with pure Python data structures and no daemon processes.
Basically, it offers adding, removing and selecting data. The data is handled as dictionaries because all of it is indexed by its keys.
The select operation extracts the dictionaries that match a given pattern.
Let's see an example:
>>> import multikey_dictionary
>>> md = multikey_dictionary.Multikey_Dictionary()
>>> d1 = {'name': 'Foo', 'age': 20, 'city': 'Paris'} # We use dictionaries to store the data
>>> d2 = {'name': 'Bar', 'age': 25, 'city': 'Paris'} # The keys are the names of the data fields
>>> d3 = {'name': 'Fubar', 'age': 18, 'city': 'Tokyo'}
>>> d4 = {'name': 'Bob', 'age': 20, 'city': 'Madrid'}
>>> md.add_all([d1, d2, d3, d4])
>>> md['name', 'Foo'] # Now we want to know who is called Foo
[{'name': 'Foo', 'age': 20, 'city': 'Paris'}]
>>> md['age', 20] # Who is 20?
[{'name': 'Foo', 'age': 20, 'city': 'Paris'}, {'name': 'Bob', 'age': 20, 'city': 'Madrid'}]
>>> md['age', 20, 'city', 'Paris'] # Who is 20 and lives in Paris?
[{'name': 'Foo', 'age': 20, 'city': 'Paris'}]
>>> md['city', 'Paris', lambda x: x['age'] > 20] # We can also use functions/lambdas to filter
[{'name': 'Bar', 'age': 25, 'city': 'Paris'}]
>>> md['name'] # Here we search for indexed names
['Foo', 'Bar', 'Fubar', 'Bob']
>>> md['age'] # Show the indexed ages
[18, 20, 25]
>>> md['city'] # Show the indexed cities
['Paris', 'Tokyo', 'Madrid']
>>> md['age', lambda x: x != 20] # We can also filter the indexed keys
[18, 25]
>>> md.select_all(sort_key = 'age') # We can get all the dictionaries sorted
[{'name': 'Fubar', 'age': 18, 'city': 'Tokyo'}, {'name': 'Foo', 'age': 20, 'city': 'Paris'},
{'name': 'Bob', 'age': 20, 'city': 'Madrid'}, {'name': 'Bar', 'age': 25, 'city': 'Paris'}]
Note that the pattern is a chain of subpatterns. Each subpattern selects some dictionaries, and the final result is the intersection (set semantics) of what all the subpatterns selected. Each subpattern may also have a function/lambda attached to filter its results. You can specify as many subpatterns as you want.
md['age', 20, 'city', 'Paris'] ==> intersection(md['age', 20], md['city', 'Paris'])
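The intersection semantics can be sketched with plain Python sets. This is only an illustration of the idea, not the module's implementation: `freeze`, `select` and `select_and` are hypothetical helpers, and the records are copied from the example above.

```python
# Illustrative sketch only: these helpers are hypothetical, not part of the module.
records = [
    {'name': 'Foo', 'age': 20, 'city': 'Paris'},
    {'name': 'Bar', 'age': 25, 'city': 'Paris'},
    {'name': 'Fubar', 'age': 18, 'city': 'Tokyo'},
    {'name': 'Bob', 'age': 20, 'city': 'Madrid'},
]

def freeze(d):
    """Turn a flat dictionary into a hashable tuple of sorted (key, value) pairs."""
    return tuple(sorted(d.items()))

def select(records, key, value):
    """One subpattern: the set of records whose `key` equals `value`."""
    return set(freeze(d) for d in records if d.get(key) == value)

def select_and(records, *pairs):
    """Chain of subpatterns: intersect the result set of every (key, value) pair."""
    result = None
    for key, value in pairs:
        matches = select(records, key, value)
        result = matches if result is None else result & matches
    # Convert the frozen tuples back into dictionaries.
    return [dict(t) for t in sorted(result or [])]

twenty_in_paris = select_and(records, ('age', 20), ('city', 'Paris'))
```

Here `twenty_in_paris` holds only the record for Foo, because Foo is the only record present in both the "age is 20" set and the "city is Paris" set.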
The select operation can optionally use a union at the end instead. Also, the pattern can be
specified either with the Python slice syntax or through a traditional method call.
The performance should be acceptable for medium-sized data sets. The implementation follows a simple design that does not treat performance as a main goal. A few hundred dictionaries should be handled without any trouble.
This module uses Python sets and dictionaries internally, so some limitations apply.
The contents of every dictionary added to a multikey dictionary must be hashable. A dictionary
cannot contain mutable containers such as other dictionaries or lists. You may convert your
containers to tuples or another immutable Python type to overcome this limitation.
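One possible way to do that conversion is sketched below. The helper names `to_hashable` and `prepare` are hypothetical and not part of the module; the sketch assumes that nested dictionary keys are mutually comparable (for example, all strings).

```python
# Illustrative sketch only: to_hashable and prepare are hypothetical helpers.
def to_hashable(value):
    """Recursively replace mutable containers with immutable equivalents."""
    if isinstance(value, dict):
        # A nested dictionary becomes a tuple of sorted (key, value) pairs.
        return tuple(sorted((k, to_hashable(v)) for k, v in value.items()))
    if isinstance(value, (list, tuple)):
        return tuple(to_hashable(v) for v in value)
    if isinstance(value, set):
        return frozenset(to_hashable(v) for v in value)
    return value

def prepare(record):
    """Return a copy of `record` whose values are all hashable."""
    return dict((k, to_hashable(v)) for k, v in record.items())

raw = {'name': 'Foo', 'tags': ['a', 'b'], 'address': {'city': 'Paris', 'zip': '75001'}}
clean = prepare(raw)
```

After this step every value in `clean` can be hashed, so the record should be safe to store in structures built on sets and dictionary keys.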
To modify a dictionary that is already inside a multikey dictionary, remove it first, then modify it, and finally add it again.
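The reason for this order can be seen with a toy index. `TinyIndex` below is a hypothetical stand-in written for this illustration; it is not the module's class and its method names are assumptions. It only shows how an index keyed by field values goes stale when a record is changed in place.

```python
# Illustrative sketch only: TinyIndex is a hypothetical toy, not the module's class.
class TinyIndex(object):
    def __init__(self):
        self._by_item = {}  # maps (key, value) -> list of records

    def add(self, record):
        for item in record.items():
            self._by_item.setdefault(item, []).append(record)

    def remove(self, record):
        for item in record.items():
            self._by_item[item].remove(record)

    def lookup(self, key, value):
        return list(self._by_item.get((key, value), []))

index = TinyIndex()
foo = {'name': 'Foo', 'age': 20}
index.add(foo)

# Wrong: modifying in place leaves the index pointing at the old value.
foo['age'] = 21
stale = index.lookup('age', 20)    # still finds foo under the old age
missing = index.lookup('age', 21)  # finds nothing under the new age

# Right: remove (with the indexed values), modify, then add again.
foo['age'] = 20
index.remove(foo)
foo['age'] = 21
index.add(foo)
fresh = index.lookup('age', 21)
```

After the remove, modify, add sequence the record is found under its new age and no longer under the old one.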
For serialization operations, use the standard Python library.
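As a minimal sketch, a plain list of dictionaries (such as the one returned by `select_all`) can be serialized with the standard `pickle` module. The sample records below are copied from the example above, and the restored list could then be passed to `add_all` on a new multikey dictionary.

```python
import pickle

# A plain list of dictionaries, like the one select_all returns in the example.
records = [
    {'name': 'Foo', 'age': 20, 'city': 'Paris'},
    {'name': 'Bar', 'age': 25, 'city': 'Paris'},
]

blob = pickle.dumps(records)   # serialize to a byte string
restored = pickle.loads(blob)  # rebuild an equal list of dictionaries
```

The byte string in `blob` can be written to a file opened in binary mode and read back later.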
The current version is 1.0. Tested with Python 2.5 and 2.6. Licensed under GPLv3.
Last modified date: Wed May 5 21:44:58 CEST 2010