Category Archives: sql

Left Semi Join on Hive

Instead of writing:

SELECT a.key, a.value
WHERE a.key in
 (SELECT b.key
  FROM B);

Let’s write:

SELECT a.key, a.val
FROM a LEFT SEMI JOIN b ON (a.key = b.key)

So LEFT SEMI JOIN is just allows to implement efficiently IN/EXISTS queries’ semantics in your queries.

Note that right hand side of query cannot be used in WHERE clauses – it should only be used in ON join condition.