SPLITTING COLUMN BY DELIMITER INTO UNIQUE ROWS IN HIVE -
i have dataset. please see below sample row:
94654 6802d326-9f9b-4fc8-b2dd-f878eade31f2 1460695483:440507;1460777656:440515;1460778054:440488;1460778157:440481,440600;
each column separated space(in total 3 columns). columns names id (int), unid (string), time_stamp (string).
i split dataset such each unique element such below rows:-
- 94654 6802d326-9f9b-4fc8-b2dd-f878eade31f2 1460695483:440507
- 94654 6802d326-9f9b-4fc8-b2dd-f878eade31f2 1460777656:440515
- 94654 6802d326-9f9b-4fc8-b2dd-f878eade31f2 1460778054:440488
- 94654 6802d326-9f9b-4fc8-b2dd-f878eade31f2 1460778157:440481
- 94654 6802d326-9f9b-4fc8-b2dd-f878eade31f2 1460778157:440600
each sub-point each row. have used following query giving me output above. have used following code , not working:-
select id, unid,time_date table lateral view explode (split (time_date,'\;')) time_date time_date;
output:- 94654 6802d326-9f9b-4fc8-b2dd-f878eade31f2 1460695483:440507;1460777656:440515;1460778054:440488;1460778157:440481,440600; (the following row repeated 5 times)
help appreciated! in advance :)
firstly, had replace semi-colons pipe. so:
create temporary table tbl (id int, unid string, time_stamp string); insert tbl values ( 94654, '6802d326-9f9b-4fc8-b2dd-f878eade31f2' , '1460695483:440507|1460777656:440515|1460778054:440488|1460778157:440481,440600'); select id, unid, time_stamp ( select id, unid, split(time_stamp,'\\|') ts tbl ) t lateral view explode(t.ts) bar time_stamp;
which give us:
94654 6802d326-9f9b-4fc8-b2dd-f878eade31f2 1460695483:440507 94654 6802d326-9f9b-4fc8-b2dd-f878eade31f2 1460777656:440515 94654 6802d326-9f9b-4fc8-b2dd-f878eade31f2 1460778054:440488 94654 6802d326-9f9b-4fc8-b2dd-f878eade31f2 1460778157:440481,440600
you have split , explode in separate steps. split in derived table, , explode/lateral view in outer query.
Comments
Post a Comment