Distributed Joins, Modeling for Colocation

This post builds on my previous post, Distributed Joins, The Basics.  We will extend the example from a customer table and an orders table to a customer table, an orders table, and an order_items table.  Logically, the customer key is the primary key in customer and a foreign key in orders, and the order key is the primary key in orders and a foreign key in order_items.  We will assume all of these tables are too big to broadcast.

To execute a distributed join, customer and orders first need to be distributed on customer key.  Either they are both already distributed that way, or one or both are redistributed on the fly.  Then that custord result set and order_items need to be distributed on order key.  Since custord comes out of the first join distributed on customer key, it HAS TO redistribute to be distributed on order key, and order_items either redistributes to order key or is already physically distributed that way.
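To make the flow above concrete, here is a sketch of the three-way join as you would naturally write it.  Column names beyond the keys (item_key, quantity) are illustrative assumptions, not from the original example:

```sql
-- Joining on the logical keys only.  The custord result set of the first
-- join is distributed on customer_key, so it must be redistributed on
-- order_key before the second join can run.
SELECT c.customer_key,
       o.order_key,
       oi.item_key,      -- illustrative column
       oi.quantity       -- illustrative column
FROM customer c
JOIN orders o
  ON o.customer_key = c.customer_key
JOIN order_items oi
  ON oi.order_key = o.order_key;
```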

Looking at this model it seems impossible to avoid at least some on-the-fly redistribution of data, and possibly several redistributions.  But there is a simple way to avoid this.  Even though it is not logically needed, carry the customer key redundantly through to the order_items table.  Then we can distribute all three tables on customer key, and each snippet processor has a set of customers, all their orders, and all their order_items together on its dedicated disk.
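A sketch of that colocated model in Netezza DDL, assuming the table names from the post; the non-key columns are illustrative placeholders:

```sql
-- All three tables share a distribution on customer_key, so a customer,
-- its orders, and its order_items land on the same snippet processor.
CREATE TABLE customer (
    customer_key  BIGINT NOT NULL,
    customer_name VARCHAR(100)        -- illustrative column
) DISTRIBUTE ON (customer_key);

CREATE TABLE orders (
    order_key    BIGINT NOT NULL,
    customer_key BIGINT NOT NULL      -- FK to customer
) DISTRIBUTE ON (customer_key);

CREATE TABLE order_items (
    order_key    BIGINT NOT NULL,     -- FK to orders
    customer_key BIGINT NOT NULL,     -- carried through redundantly for colocation
    item_key     BIGINT NOT NULL      -- illustrative column
) DISTRIBUTE ON (customer_key);
```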

But there is a catch.  Join customer to orders on customer key, and then orders to order_items on order key, and it will redistribute on order key in the second step anyway, because the optimizer doesn't know that the order items belong to the customer and are already local to the snippet processor.  The solution is simple.  Join customer to orders on customer key, and then orders to order_items on order key (needed for the relation) AND customer key (needed for distribution).  The optimizer will figure out that everything is already together by customer key, that this key alone spreads things out well, and that nothing needs to move.  It will then do the join between customer and orders, and right in the same snippet, with the same distribution, do the join between the custord result set and order_items.  To take advantage of the distribution it is critical to always include the customer key in the join, even when just joining orders to order_items.
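The colocation-friendly version of the query then differs by just one extra join predicate (again, non-key columns are illustrative):

```sql
-- Including customer_key in the orders-to-order_items join lets the
-- optimizer see that both sides are already distributed on customer_key,
-- so the whole three-way join runs in one snippet with no redistribution.
SELECT c.customer_key,
       o.order_key,
       oi.item_key                        -- illustrative column
FROM customer c
JOIN orders o
  ON o.customer_key = c.customer_key
JOIN order_items oi
  ON  oi.order_key    = o.order_key       -- needed for the relation
  AND oi.customer_key = o.customer_key;   -- needed for distribution
```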

Even if the distribution is not on this carried-through key, having it in the model and using it in joins reduces the number of redistributions.  Let's say everything is distributed on random.  When we don't have the key carried through, it will redistribute both customer and orders on customer key, and then redistribute that custord result set on order key and also redistribute order_items on order key.  That's three tables and one result set redistributed.  If we carry the key, it will redistribute all three tables on customer key and then join them in one snippet.  Avoiding the redistribution of the result set makes it faster.

Carrying keys through your model, using them for common distribution, and making sure to always include those distribution keys in your joins can greatly improve performance by achieving good colocation through several joins.

3 comments on this post.
  1. Distributed Joins, Process Skew | Netezza Tips, Tricks, And Cool SQL:


  2. Ranga:

    I have a table that has 985 million rows and another table that has 53 thousand rows. I have joined them on the single distribution key along with other join criteria. When I check the query history, the smaller always has a “Phase 1 Pre-Broadcast”. I was expecting the colocation to kick in and avoid broadcast. I tried with:

    set enable_2phase_planner = 1;
    set enable_factrel_planner = true;
    set factrel_size_threshold = 1000;

    and

    set enable_2phase_planner = 1;
    set enable_factrel_planner = true;
    set factrel_size_threshold = 3000000;

    I could not get rid of the broadcast. Any idea what is going on?

    I am using version 7.0.4.4-P1

  3. NZGuy:

    In the world of Netezza, 53 thousand rows is considered very small, so a pre-broadcast is not surprising and probably fast.  I would have to see the other parts of the query and plan to know if there was some upstream or downstream reason for doing this.  Things like joins to other tables or aggregations may affect it.  In the snippet doing the join, is the distribution still on your distribution key, or had it been redistributed on the fly already for some other reason?
