hive中如何读取数组,如何获取Hive中的数组中的前n个元素

2023-05-16

I use split function to create an array in Hive, how can I get the first n elements from the array, and I want to go through the sub-array

code example

select col1 from table

where split(col2, ',')[0:5]

'[0:5]'looks likes python style, but it doesn't work here.

解决方案

This is a tricky one.

First grab the brickhouse jar from here

Then add it to Hive : add jar /path/to/jars/brickhouse-0.7.0-SNAPSHOT.jar;

Now create the two functions we will be usings :

CREATE TEMPORARY FUNCTION array_index AS 'brickhouse.udf.collect.ArrayIndexUDF';

CREATE TEMPORARY FUNCTION numeric_range AS 'brickhouse.udf.collect.NumericRange';

The query will be :

select a,

n as array_index,

array_index(split(a,','),n) as value_from_Array

from ( select "abc#1,def#2,hij#3" a from dual union all

select "abc#1,def#2,hij#3,zzz#4" a from dual) t1

lateral view numeric_range( length(a)-length(regexp_replace(a,',',''))+1 ) n1 as n

Explained :

select "abc#1,def#2,hij#3" a from dual union all

select "abc#1,def#2,hij#3,zzz#4" a from dual

Is just selecting some test data, in your case replace this with your table name.

lateral view numeric_range( length(a)-length(regexp_replace(a,',',''))+1 ) n1 as n

numeric_range is a UDTF that returns a table for a given range, in this case, i asked for a range between 0 (default) and the number of elements in string (calculated as the number of commas + 1)

This way, each row will be multiplied by the number of elements in the given column.

array_index(split(a,','),n)

This is exactly like using split(a,',')[n] but hive doesn't support it.

So we get the n-th element for each duplicated row of the initial string resulting in :

abc#1,def#2,hij#3,zzz#4 0 abc#1

abc#1,def#2,hij#3,zzz#4 1 def#2

abc#1,def#2,hij#3,zzz#4 2 hij#3

abc#1,def#2,hij#3,zzz#4 3 zzz#4

abc#1,def#2,hij#3 0 abc#1

abc#1,def#2,hij#3 1 def#2

abc#1,def#2,hij#3 2 hij#3

If you really want a specific number of elements (say 5) then just use :

lateral view numeric_range(5 ) n1 as n

本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)

hive中如何读取数组,如何获取Hive中的数组中的前n个元素 的相关文章

随机推荐