Rowid generator provided in Hivemall

You can use the rowid() function to generate a unique row ID in Hivemall v0.2 or later.

select
  rowid() as rowid, -- returns ${task_id}-${sequence_number} as string
  *
from 
  xxx;
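
Because the IDs are produced while the query runs, a later call to rowid() may not assign the same values to the same rows, so it can help to store the IDs once and reuse the stored copy. A minimal sketch, assuming xxx is the source table from the example above and xxx_with_rowid is a hypothetical name for the materialized copy:

```sql
-- xxx_with_rowid is a hypothetical table name chosen for this sketch.
CREATE TABLE xxx_with_rowid
AS
SELECT
  rowid() AS rowid, -- ${task_id}-${sequence_number} as string
  *
FROM
  xxx;

-- Sanity check: the two counts should be equal if every rowid is unique.
SELECT
  count(1) AS total_rows,
  count(DISTINCT rowid) AS distinct_rowids
FROM
  xxx_with_rowid;
```

If the two counts differ, the IDs are not unique and should not be used as join keys.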

rownum() is also supported in Hivemall v0.5-rc.1 or later.

select
  rownum() as rowid, -- returns sprintf(`%d%04d`,sequence,taskId) as long
  *
from
  xxx;
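
The long value returned by rownum() is the sequence number followed by the task ID zero-padded to four digits. As an illustration only, the same format can be reproduced with Hive's built-in printf(); the values 12 (sequence) and 3 (task ID) are made up, and the snippet assumes a Hive version that allows SELECT without a FROM clause:

```sql
-- Illustration of the documented format, not a call into Hivemall itself.
-- 12 and 3 are made-up values for the sequence number and task ID.
SELECT
  cast(printf('%d%04d', 12, 3) AS bigint) AS example_rownum; -- '12' followed by '0003'
```

Since the task ID takes up the last four digits, IDs from different tasks do not collide as long as the task ID itself fits in four digits.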

Other Rowid generation schemes using SQL

One option is to generate a random UUID through Java reflection and strip the hyphens.

CREATE TABLE xxx
AS
SELECT 
  regexp_replace(reflect('java.util.UUID','randomUUID'), '-', '') as rowid,
  *
FROM
  ..;

Another option is to use row_number(). However, the query becomes very slow for large datasets because row ID generation runs on a single reducer.

CREATE TABLE xxx
AS
select 
  row_number() over () as rowid, 
  * 
from a9atest;
