Array
-----
Array is collection of similar type of values.
we use datatype array to declare array in a hive table
syntax
------
colname array<type>
cat > courses.txt
1,satya,hadoop$c$cpp$java
2,aaaa,hadoop$c$cpp$java
3,bbbb,photo$animation
syntax
------
create table tablename(col1 type,col2 type,...)
row format delimited fields terminated by 'symbol'
collection items terminated by 'symbol'
ex:
hive> create table courses(id int,iname string,course array<string>)
> row format delimited fields terminated by ','
> collection items terminated by '$';
hive> load data local inpath 'courses.txt' into table courses;
hive> select *from courses;
1 satya ["hadoop","c","cpp","java"]
2 aaaa ["hadoop","c","cpp","java"]
3 bbbb ["photo","animation"]
Internal representation of array
---------------------------------
------
haoop|0
-----
c |1
-----
cpp |2
-----
java |3
-----
An array always starts with index 0 and ends with size-1
query1
------
Display id,courses from courses table
Display id,name,first 2 courses from array in a courses table
Display all courses whose id is 1
Display first course from courses table whose id is 1
Display 3,4 courses from courses table whose id is 3
hive> select id,course from courses;
OK
1 ["hadoop","c","cpp","java"]
2 ["hadoop","c","cpp","java"]
3 ["photo","animation"]
hive> select id,iname,course[0],course[1] from courses;
1 satya hadoop c
2 aaaa hadoop c
3 bbbb photo animation
hive>select course from courses where id=1;
["hadoop","c","cpp","java"]
hive> select course[0] from courses where id=1;
hadoop
hive> select course[2],course[3] from courses where id=3;
NULL NULL
Note:if we try to get values from an array whose index does not exist then hive returns NULL values if the index is not found.
Map
---
A Map is a collection used to represent data in the form of key and value pairs.
key=value
key must be unique,values can be duplcated.
we use key to perform operations on a map.
ex:
1=manohar
2=sanjay
syntax
------
key1=val1<symbol>key2=val2<symbol>key3=val3......
cat>items.txt
1,more,1=soap#2=sampoo#3=perfume
2,spencer,1=choclate#2=gum#3=lays
3,reliance,1=cola#2=maggy#3=colgate
How to declare map in a hive
----------------------------
we represent map in hive table using map datatype
syntxa
------
colname map<type1,type2>
while creating a table with map datatype we should specify 3 delimiters with
i)row format
ii)collection items
iii)Map keys
syntax
-----
create table tablename(colname type1,........)
row format delimited fields termianted by 'symbol'
collection items terminated by 'symbol'
map keys terminated by 'symbol';
ex:
hive> create table items(id int,name string,products map<int,string>)
> row format delimited fields terminated by ','
> collection items terminated by '#'
> map keys terminated by '=';
hive>load data local inpath 'items.txt' into table items;
hive>select *from items;
1 more {1:"soap",2:"sampoo",3:"perfume"}
2 spencer {1:"choclate",2:"gum",3:"lays"}
3 reliance {1:"cola",2:"maggy",3:"colgate"}
How to access data from a map collection
---------------------------------
colname[key]
display a map collection whose id is 3,2
display a map collection whose key is 1,2
display a map collection whose key is 1,2 where id is 1
display a map collection whose key is 2,3 where id is 1,3
display a map collection whose key is 3 where id is 1,2,3
hive> select *from items where id=2 or id=3;
hive> select products[1],products[2] from items;
hive> select products[1],products[2] from items where id=1;
hive> select products[1],products[2] from items where id=2 or id=3;
structure
---------
A structure is collection of different types of values.
we declare a strcuture in a hive table using the datatype struct.
syntax
------
colname struct<var1:type,var2:type,.....>
ex:
address struct<city:string,pin:int,stno:int>
How to represent struct in a dataset
-------------------------------------
emp.txt
-------
1,aaa,hyd$5002$43
2,bbb,bombay$5002$43
3,ccc,delhi$5002$43
Here we should we should use 2 delimiters with
i)row format
ii)collection items
table syntax
------------
create table tablename(colname type1,........)
row format delimited fields termianted by 'symbol'
collection items terminated by 'symbol';
ex:
hive> create table emp_struct(id int,name string,addr struct<city:string,pin:int,stno:int>)
row format delimited fields termianted by ','
collection items terminated by '$';
hive>load data local inpath 'emp.txt' into table emp_struct;
hive>select *from emp_struct;
How to access data from a struct
--------------------------------
syntax
------
colname.varnamecut
ex:
address.city
display city,pin from table emp_struct
display eno,name,stno from emp
display eno,name,stno from emp where id is 1,2
hive> select addr.city,addr.pin from emp_struct;
hive> select id,name,addr.stno from emp_struct;
hive> select id,name,addr.stno from emp_struct where id=1 or id=2;
Note:hive does not support union datatype.
-----
Array is collection of similar type of values.
we use datatype array to declare array in a hive table
syntax
------
colname array<type>
cat > courses.txt
1,satya,hadoop$c$cpp$java
2,aaaa,hadoop$c$cpp$java
3,bbbb,photo$animation
syntax
------
create table tablename(col1 type,col2 type,...)
row format delimited fields terminated by 'symbol'
collection items terminated by 'symbol'
ex:
hive> create table courses(id int,iname string,course array<string>)
> row format delimited fields terminated by ','
> collection items terminated by '$';
hive> load data local inpath 'courses.txt' into table courses;
hive> select *from courses;
1 satya ["hadoop","c","cpp","java"]
2 aaaa ["hadoop","c","cpp","java"]
3 bbbb ["photo","animation"]
Internal representation of array
---------------------------------
------
haoop|0
-----
c |1
-----
cpp |2
-----
java |3
-----
An array always starts with index 0 and ends with size-1
query1
------
Display id,courses from courses table
Display id,name,first 2 courses from array in a courses table
Display all courses whose id is 1
Display first course from courses table whose id is 1
Display 3,4 courses from courses table whose id is 3
hive> select id,course from courses;
OK
1 ["hadoop","c","cpp","java"]
2 ["hadoop","c","cpp","java"]
3 ["photo","animation"]
hive> select id,iname,course[0],course[1] from courses;
1 satya hadoop c
2 aaaa hadoop c
3 bbbb photo animation
hive>select course from courses where id=1;
["hadoop","c","cpp","java"]
hive> select course[0] from courses where id=1;
hadoop
hive> select course[2],course[3] from courses where id=3;
NULL NULL
Note:if we try to get values from an array whose index does not exist then hive returns NULL values if the index is not found.
Map
---
A Map is a collection used to represent data in the form of key and value pairs.
key=value
key must be unique,values can be duplcated.
we use key to perform operations on a map.
ex:
1=manohar
2=sanjay
syntax
------
key1=val1<symbol>key2=val2<symbol>key3=val3......
cat>items.txt
1,more,1=soap#2=sampoo#3=perfume
2,spencer,1=choclate#2=gum#3=lays
3,reliance,1=cola#2=maggy#3=colgate
How to declare map in a hive
----------------------------
we represent map in hive table using map datatype
syntxa
------
colname map<type1,type2>
while creating a table with map datatype we should specify 3 delimiters with
i)row format
ii)collection items
iii)Map keys
syntax
-----
create table tablename(colname type1,........)
row format delimited fields termianted by 'symbol'
collection items terminated by 'symbol'
map keys terminated by 'symbol';
ex:
hive> create table items(id int,name string,products map<int,string>)
> row format delimited fields terminated by ','
> collection items terminated by '#'
> map keys terminated by '=';
hive>load data local inpath 'items.txt' into table items;
hive>select *from items;
1 more {1:"soap",2:"sampoo",3:"perfume"}
2 spencer {1:"choclate",2:"gum",3:"lays"}
3 reliance {1:"cola",2:"maggy",3:"colgate"}
How to access data from a map collection
---------------------------------
colname[key]
display a map collection whose id is 3,2
display a map collection whose key is 1,2
display a map collection whose key is 1,2 where id is 1
display a map collection whose key is 2,3 where id is 1,3
display a map collection whose key is 3 where id is 1,2,3
hive> select *from items where id=2 or id=3;
hive> select products[1],products[2] from items;
hive> select products[1],products[2] from items where id=1;
hive> select products[1],products[2] from items where id=2 or id=3;
structure
---------
A structure is collection of different types of values.
we declare a strcuture in a hive table using the datatype struct.
syntax
------
colname struct<var1:type,var2:type,.....>
ex:
address struct<city:string,pin:int,stno:int>
How to represent struct in a dataset
-------------------------------------
emp.txt
-------
1,aaa,hyd$5002$43
2,bbb,bombay$5002$43
3,ccc,delhi$5002$43
Here we should we should use 2 delimiters with
i)row format
ii)collection items
table syntax
------------
create table tablename(colname type1,........)
row format delimited fields termianted by 'symbol'
collection items terminated by 'symbol';
ex:
hive> create table emp_struct(id int,name string,addr struct<city:string,pin:int,stno:int>)
row format delimited fields termianted by ','
collection items terminated by '$';
hive>load data local inpath 'emp.txt' into table emp_struct;
hive>select *from emp_struct;
How to access data from a struct
--------------------------------
syntax
------
colname.varnamecut
ex:
address.city
display city,pin from table emp_struct
display eno,name,stno from emp
display eno,name,stno from emp where id is 1,2
hive> select addr.city,addr.pin from emp_struct;
hive> select id,name,addr.stno from emp_struct;
hive> select id,name,addr.stno from emp_struct where id=1 or id=2;
Note:hive does not support union datatype.
No comments:
Post a Comment