17.3. 管理 Hive

17.3. 管理 Hive
上一頁	第 17 章 Apache Hive	下一頁

如果你又任何一種關係型資料庫的使用經驗，那麼你將在這裡看到非常熟悉的操作。

17.3.1. 表管理

17.3.1.1. 創建表

CREATE TABLE member (name string, age int, sex int);

				
hive> CREATE TABLE member (name string, age int, sex int);
OK
Time taken: 0.687 seconds
hive>

基于現有的數據建表

				
hive> create table newtable as select * from oldtable;

17.3.1.2. 顯示表

				
hive> SHOW TABLES;
OK
test
Time taken: 0.041 seconds, Fetched: 1 row(s)
hive>

通配符匹配表名稱

show tables '*t*';

17.3.1.3. 刪除表

				
hive> DROP TABLE test;
OK
Time taken: 1.337 seconds
hive>

17.3.1.4. 查看表結構

				
hive> CREATE TABLE member (name string, age int, sex int);
OK
Time taken: 0.273 seconds

hive> desc member;
OK
name                	string              	                    
age                 	int                 	                    
sex                 	int                 	                    
Time taken: 0.035 seconds, Fetched: 3 row(s)
hive>

17.3.1.5. 為表增加欄位

增加一個欄位 phone 字元串類型

				
hive> ALTER TABLE member ADD COLUMNS (phone String);
OK
Time taken: 0.188 seconds
hive> desc member;
OK
name                	string              	                    
age                 	int                 	                    
sex                 	int                 	                    
phone               	string              	                    
Time taken: 0.033 seconds, Fetched: 4 row(s)

17.3.1.6. 修改表名稱

將 test 表重命名為 vipuser

				
hive> CREATE TABLE test (name string, age int, sex int);
OK
Time taken: 0.311 seconds
hive> ALTER TABLE test RENAME TO vipuser;
OK
Time taken: 0.115 seconds
hive> desc vipuser;
OK
name                	string              	                    
age                 	int                 	                    
sex                 	int                 	                    
Time taken: 0.032 seconds, Fetched: 3 row(s)
hive>

17.3.1.7. 使用已有表結構創建新表

僅僅創建表結構，不會複製數據過來。

				
hive> CREATE TABLE news_2017 LIKE news;
OK
Time taken: 0.311 seconds

17.3.2. 分區表

17.3.2.1. 創建分區表

				
hive> CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);

17.3.2.2. 顯示分區情況

				
hive> SHOW PARTITIONS passwd;
OK
computer=hadoop
computer=hbase
computer=hive
Time taken: 0.056 seconds, Fetched: 3 row(s)

17.3.2.3. 增加分區

				
hive> alter table member add partition (province='shenzhen');

17.3.2.4. 向分區表導入數據

				
hive> CREATE TABLE passwd (a string, b string, c string, d string, e string, f string) PARTITIONED BY (computer string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ':';
OK
Time taken: 0.323 seconds
hive> load data local inpath '/etc/passwd' overwrite into table passwd partition(computer="hive");
Loading data to table default.passwd partition (computer=hive)
OK
Time taken: 0.499 seconds
hive> select * from passwd;
OK
root	x	0	0	root	/root	hive
bin	x	1	1	bin	/bin	hive
daemon	x	2	2	daemon	/sbin	hive
adm	x	3	4	adm	/var/adm	hive
lp	x	4	7	lp	/var/spool/lpd	hive
sync	x	5	0	sync	/sbin	hive
shutdown	x	6	0	shutdown	/sbin	hive
halt	x	7	0	halt	/sbin	hive
mail	x	8	12	mail	/var/spool/mail	hive
operator	x	11	0	operator	/root	hive
games	x	12	100	games	/usr/games	hive
ftp	x	14	50	FTP User	/var/ftp	hive
nobody	x	99	99	Nobody	/	hive
dbus	x	81	81	System message bus	/	hive
polkitd	x	999	998	User for polkitd	/	hive
avahi	x	70	70	Avahi mDNS/DNS-SD Stack	/var/run/avahi-daemon	hive
avahi-autoipd	x	170	170	Avahi IPv4LL Stack	/var/lib/avahi-autoipd	hive
postfix	x	89	89		/var/spool/postfix	hive
sshd	x	74	74	Privilege-separated SSH	/var/empty/sshd	hive
ntp	x	38	38		/etc/ntp	hive
rpc	x	32	32	Rpcbind Daemon	/var/lib/rpcbind	hive
qemu	x	107	107	qemu user	/	hive
unbound	x	998	996	Unbound DNS resolver	/etc/unbound	hive
rpcuser	x	29	29	RPC Service User	/var/lib/nfs	hive
nfsnobody	x	65534	65534	Anonymous NFS User	/var/lib/nfs	hive
saslauth	x	997	76	"Saslauthd user"	/run/saslauthd	hive
radvd	x	75	75	radvd user	/	hive
nagios	x	1000	1000		/home/nagios	hive
apache	x	48	48	Apache	/usr/share/httpd	hive
exim	x	93	93		/var/spool/exim	hive
tss	x	59	59	Account used by the trousers package to sandbox the tcsd daemon	/dev/null	hive
git	x	996	994		/var/opt/gitlab	hive
gitlab-www	x	995	993		/var/opt/gitlab/nginx	hive
gitlab-redis	x	994	992		/var/opt/gitlab/redis	hive
gitlab-psql	x	993	991		/var/opt/gitlab/postgresql	hive
nginx	x	992	990	nginx user	/var/cache/nginx	hive
www	x	80	80	Web Application	/www	hive
mysql	x	27	27	MySQL Server	/var/lib/mysql	hive
redis	x	991	989	Redis Database Server	/var/lib/redis	hive
epmd	x	990	988	Erlang Port Mapper Daemon	/tmp	hive
rabbitmq	x	989	987	RabbitMQ messaging server	/var/lib/rabbitmq	hive
solr	x	1001	1001	Apache Solr	/srv/solr	hive
mongodb	x	184	986	MongoDB Database Server	/var/lib/mongodb	hive
test	x	1002	1002		/home/test	hive
sysaccount	x	988	985		/home/sysaccount	hive
systemd-bus-proxy	x	987	983	systemd Bus Proxy	/	hive
systemd-network	x	986	982	systemd Network Management	/	hive
elasticsearch	x	985	980	elasticsearch user	/home/elasticsearch	hive
zabbix	x	984	979	Zabbix Monitoring System	/var/lib/zabbix	hive
mysqlrouter	x	983	978	MySQL Router	/var/lib/mysqlrouter	hive
hadoop	x	1003	1003		/home/hadoop	hive
Time taken: 0.118 seconds, Fetched: 51 row(s)

hive> SHOW PARTITIONS passwd;
OK
computer=hive
Time taken: 0.058 seconds, Fetched: 1 row(s)

17.3.3. 視圖管理

17.3.3.1. 創建視圖

				
hive> CREATE VIEW v_test AS SELECT name,age FROM member where age>20;
hive> select * from v_test;

17.3.3.2. 刪除視圖

				
hive> drop view test;
OK
Time taken: 0.276 seconds

判斷視圖是否存在

				
hive> DROP VIEW IF EXISTS v_test;
OK
Time taken: 0.495 seconds

17.3.4. 數據管理

17.3.4.1. 從文本檔案導入數據

首先創建一個文本檔案，如下：

				
[root@localhost ~]# cat /tmp/hive.txt 
1	2	3
2	3	4
3	4	5
6	7	8

				
hive> CREATE TABLE test (a int, b int, c int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
OK
Time taken: 0.294 seconds
hive> LOAD DATA LOCAL INPATH '/tmp/hive.txt' OVERWRITE INTO TABLE test;
Loading data to table default.test
OK
Time taken: 0.541 seconds
hive> select * from test;
OK
1	2	3
2	3	4
3	4	5
6	7	8
Time taken: 0.952 seconds, Fetched: 5 row(s)

17.3.4.2. 從其他表查詢數據並創建新表

				
hive> CREATE TABLE mytable AS SELECT * FROM anytable;

17.3.4.3. 從其他表查詢數據然後插入指定表中

				
 INSERT OVERWRITE TABLE mytable SELECT * FROM other ;

17.3.4.4. 從現有表中查詢數據然後插入到新的分區表中

				
hive> insert into table table2 partition(created_date) select * from table1;
hive> insert into table newtable partition(type='1') select * from oldtable;

17.3.5. HDFS與本地檔案系統管理

17.3.5.1. HDFS 目錄遷移

				
[hadoop@localhost ~]$ hdfs dfs -ls /user/hive/warehouse
Found 3 items
drwxrwxr-x   - hadoop supergroup          0 2017-06-29 03:36 /user/hive/warehouse/member
drwxrwxr-x   - hadoop supergroup          0 2017-06-29 03:32 /user/hive/warehouse/test
drwxrwxr-x   - hadoop supergroup          0 2017-06-29 03:41 /user/hive/warehouse/vipuser

[hadoop@localhost ~]$ hdfs dfs -cp /user/hive/warehouse/vipuser /user/hive/warehouse/vipuser2

17.3.5.2. 導出表數據到本地檔案

				
hive> INSERT OVERWRITE LOCAL DIRECTORY '/tmp/test' SELECT * FROM test;

hive> INSERT OVERWRITE LOCAL DIRECTORY '/tmp/test' SELECT * FROM member;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20170629040540_ddeda146-efed-44c4-bb20-a6453c21cc8e
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1498716998098_0001, Tracking URL = http://localhost:8088/proxy/application_1498716998098_0001/
Kill Command = /srv/apache-hadoop/bin/hadoop job  -kill job_1498716998098_0001
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2017-06-29 04:05:49,221 Stage-1 map = 0%,  reduce = 0%
Ended Job = job_1498716998098_0001
Moving data to local directory /tmp/test
MapReduce Jobs Launched: 
Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 10.54 seconds

17.3.5.3. 導出到HDFS

				
hive> insert overwrite directory '/usr/tmp/test' select * from test;

上一頁	上一級	下一頁
17.2. beeline	起始頁	17.4. HiveQL - Hive查詢語言

17.3. 管理 Hive

17.3.1. 表管理

17.3.1.1. 創建表

17.3.1.2. 顯示表

17.3.1.3. 刪除表

17.3.1.4. 查看表結構

17.3.1.5. 為表增加欄位

17.3.1.6. 修改表名稱

17.3.1.7. 使用已有表結構創建新表

17.3.2. 分區表

17.3.2.1. 創建分區表

17.3.2.2. 顯示分區情況

17.3.2.3. 增加分區

17.3.2.4. 向分區表導入數據

17.3.3. 視圖管理

17.3.3.1. 創建視圖

17.3.3.2. 刪除視圖

17.3.4. 數據管理

17.3.4.1. 從文本檔案導入數據

17.3.4.2. 從其他表查詢數據並創建新表

17.3.4.3. 從其他表查詢數據然後插入指定表中

17.3.4.4. 從現有表中查詢數據然後插入到新的分區表中

17.3.5. HDFS與本地檔案系統管理

17.3.5.1. HDFS 目錄遷移

17.3.5.2. 導出表數據到本地檔案

17.3.5.3. 導出到HDFS

17.3.5.4.