Monday, June 12, 2017

Hive--Basic operations

Hive operations
---------------
[cloudera@quickstart ~]$ cat emp.txt
eno  name  sal    g dno
101,AAAAA,1000.00,m,11
102,BBBBB,2000.00,m,12
103,CCCCC,3000.00,f,13
104,DDDDD,4000.00,f,14
105,EEEEE,5000.00,m,15
106,FFFFF,6000.00,f,16
107,GGGGG,7000.00,m,17
108,HHHHH,8000.00,m,18
109,xxxxx,1000.00,f,11
110,yyyyy,2000.00,f,18
111,zzzzz,2000.00,f,15


hive> create table empdata(eno int,name string,sal double,gender string,dno int)
    > row format delimited fields terminated by ',';

hive>desc empdata;

hive>load data local inpath 'emp.txt'  into table empdata;

hive>select *from empdata;

Queries
-------
display all females
display all males
display all males whose dno=11,12
All females not in  dno 13,15
All rows of 11,14
Display all dno except dno 13,15.


hive> select *from empdata where gender='f';
hive> select *from empdata where gender='m';
hive> select *from empdata where (dno=11 or dno=12) and gender='m';
hive> select *from empdata where (dno!=13 and dno!=15) and  gender='f'
hive> select *from empdata where dno=11 or dno=14;
hive> select *from empdata where dno!=13 and dno!=14;


in operator
-----------
In is display rows within a given set.
hive> select *from empdata where dno in(11,12) and gender='m';
hive> select *from empdata where dno in(11,14);
hive> select *from empdata where dno not in(13,15) and gender='f';
hive> select *from empdata where dno not in(13,15);


between
-------
It is an operator used to display rows between given condition
display all employees whose sal is between 3000 and 7000

hive> select *from empdata where  sal>=3000  and sal<=7000;
hive> select *from empdata where  sal between   3000 and 7000;


Not between
-----------
display all employees whose sal is not between 3000 and 7000

hive> select *from empdata where  sal<3000 or sal>7000;
hive> select *from empdata where  sal not between   3000 and 7000;

displaying top rows
-------------------
limit
-----
display top n rows in a table.

syntax
------
hive>select *from tablename limit n;

ex:
hive>select *from  empdata limit  5;

sorting rows of a table
------------------------
order by
--------
we use order by clause to sort rows of a table.


syntax
------
select *from tablename order by colname asc;
or
select *from tablename order by colname desc;


hive> select *from empdata order by sal;
hive> select *from empdata order by sal desc;

Adding new column to a table
----------------------------
syntax
------
alter table tablename add columns(col1 type,col2 type,....);

ex:
hive> alter table empdata add columns(tax double);

hive> alter table empdata add columns(tax int);


updating a table
----------------
In hive we if we add a new col to a table or if we want to update a col of table we go for table to table copying.

hive> insert overwrite table empdata
    > select eno,name,sal,gender,dno,sal*0.2 from empdata;


like
----
create a table replica.


syntax
------
create table tablename like tablename2;

we copy data from table to table by using insert overwrite/insert into.

hive> create table staff like emp;

hive> insert overwrite table staff
    > select *from empdata;

hive> insert into table staff
    > select *from empdata;

Note1:insert overwrite ....;
     replace old records with new records.

Note2:insert into  ....;
      appends records to existing table.


deleting rows from a table
--------------------------
delete all female rows from a staff table but dont' use delete.

hive> insert overwrite table staff
    > select *from staff where gender='m';

No comments: