Feeds:
Posts
Comments

Posts Tagged ‘hadoop bigdata’

If one launches Hive using the “-S” or “–silent” option then Hive does not print progress information. However if you are already inside the Hive command line shell then you can control this behaviour by setting the value of silent.


set silent=on; -- make hive silent

set silent=false; -- make hive print progress information

 

Advertisements

Read Full Post »

window functions allow one to look at the previous values or next values of a column. for example if i want to subtract the previous row value from the current row value then window functions lag and lead can be used.

let us take up an example.

first create a text file containing numbers 1 through 9 with a single number on each line like so


1

2

3

.

.

9

call the file data.txt.

next we create a table in hive.


create table foo (a int);

next we load our data.txt file in to this created table using the following


load data local inpath 'data.txt' overwrite into table foo;

we want to access previous and next values over column ‘a’ note therefore the over clause in the following query


select lag(a, 1) over (order by a) as previous, a, lead(a, 1) over (order by a) as next from foo;

which outputs the following:


previous a next
NULL 1 2
1 2 3
2 3 4
3 4 5
4 5 6
5 6 7
6 7 8
7 8 9
8 9 NULL

note how the previous and next values are NULL at the edge cases. You could specify a value in such cases as in the following query which specifies 0.


select lag(a, 1, 0) over (order by a) as previous, a, lead(a, 1, 0) over (order by a) as next from foo;

which outputs the following:


previous a next
0 1 2
1 2 3
2 3 4
3 4 5
4 5 6
5 6 7
6 7 8
7 8 9
8 9 0

lag(a, 1) will fetch the previous value while lag(a, 2) will fetch the previous to previous value.

Read Full Post »

case matters.

had created table as such

create table foo(

a int,

b string

) stored as orc tblproperties(“orc.compress”=”snappy”);

but when i went to populate the table using an “insert overwrite table foo select * from …” statement then i faced an error.

turns out i should have used “SNAPPY” instead of “snappy”. case matters.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC

Read Full Post »