Black Box Structure

The Translation to an Execution Tree

The Translator unit will translate the Query Tree into an Execution Tree. All Domains have been intersected at this point and every PredicateTree node of the QT has an Intersection object attached to it.

The nodes may be translated in a straightforward fashion:

U -> Bag/Set Union (depending on distinct flag)
I -> Bag/Set Intersection
E -> Bag/Set Difference
B -> Bag Query
A -> Assoc Query

The translation

P -> Scope Query

would be straightforward if we did not take any optimization into account. Effectively, in general we have

P -> Set Union of several Scope Queries

The following Scope Queries are generated from a Predicate Tree Leaf Node:

(At most) 4 SQs for each IndexISect of the Intersection object (f, fs, ff, p). As discussed above, the predicate strings will differ for each of these.
Each of these 4 SQs will be split into as many new SQs as there are partitions the Databases reside upon. This mapping is given by the Partition Map.
Further Splitting may be requested by giving the maximal number of containers queried by one single SQ.

Aspects of Optimization

The Predicates involved in the Query Tree may be optimized in order to have an Execution Tree that executes faster. There are several factors that might influence the speed of a query. We already optimize for

Type of nodes in the nodelist (full, partial)
Number of partitions involved in a query

We may invent additional optimizations considering

Number of threads that may be spawned
Number of bags that have to be stored
Complexity of a predicate

Threads Split a query into as many parts as may be optimal for multithreading.
Complexity To which extent does it matter to Objectivity if the predicate is complex or simple? Maybe there should be a possibility to split a complex predicate to test for this, the splitting can go as far as one query per node.

All these optimizations tend to produce more SQ leaves on the XT. The bounds for the size of the XT are set by considering Intermediate bags There should be as few intermediate bags as possible, especially if they have a large content. Memory Objectivity allocates for each thread a certain amount of memory when it initializes and it allocates a handle. There is an upper bound for the number of handles. Load balancing The I/O channels associated with each partition should have the same amount of load to manage.

Cost

There is an object indicating the cost to retrieve objects from a Domain. This information is be calculated from the constructed Execution Tree and the Query Tree it was constructed from. The number of SQ leaves on the XT, the number of different partitions involved, the number of Databases/Containers to be opened are returned.