The Translation to an Execution
Tree
The Translator unit will translate the Query Tree into an Execution Tree.
All Domains have been intersected at this point and every PredicateTree
node of the QT has an Intersection object attached to it.
The nodes may be translated in a straightforward fashion:
-
U -> Bag/Set Union (depending on distinct flag)
-
I -> Bag/Set Intersection
-
E -> Bag/Set Difference
-
B -> Bag Query
-
A -> Assoc Query
The translation
would be straightforward if we did not take any optimization into account.
Effectively, in general we have
-
P -> Set Union of several Scope Queries
The following Scope Queries are generated from a Predicate Tree Leaf Node:
-
(At most) 4 SQs for each IndexISect of the Intersection object (f, fs,
ff, p). As discussed above, the predicate strings will differ for each
of these.
-
Each of these 4 SQs will be split into as many new SQs as there are partitions
the Databases reside upon. This mapping is given by the Partition Map.
-
Further Splitting may be requested by giving the maximal number of containers
queried by one single SQ.
Aspects of Optimization
The Predicates involved in the Query Tree may be optimized in order to
have an Execution Tree that executes faster. There are several factors
that might influence the speed of a query. We already optimize for
-
Type of nodes in the nodelist (full, partial)
-
Number of partitions involved in a query
We may invent additional optimizations considering
-
Number of threads that may be spawned
-
Number of bags that have to be stored
-
Complexity of a predicate
Threads Split
a query into as many parts as may be optimal for multithreading.
Complexity To which extent does it matter to Objectivity
if the predicate is complex or simple? Maybe there should be a possibility
to split a complex predicate to test for this, the splitting can go as
far as one query per node.
All these optimizations tend to produce more SQ leaves on the XT. The bounds
for the size of the XT are set by considering
Intermediate bags There should be as few intermediate bags
as possible, especially if they have a large content.
Memory Objectivity allocates for each thread a certain amount of
memory when it initializes and it allocates a handle. There is an upper
bound for the number of handles.
Load balancing The I/O channels associated with each partition should
have the same amount of load to manage.
Cost
There is an object indicating the cost to retrieve objects from
a Domain. This information is be calculated from the constructed Execution
Tree and the Query Tree it was constructed from. The number of SQ leaves
on the XT, the number of different partitions involved, the number of Databases/Containers
to be opened are returned.