While running a Hive SQL statement, I hit the following error:
2018-11-05 16:43:02,568 ERROR org.apache.hadoop.hive.ql.exec.Task: [HiveServer2-Background-Pool: Thread-30333]: Failed with exception Directory hdfs://master:8020/user/hive/bus_optimation_xm/g_operate_statistic/date_time=201806124/date_type=1440/date_flag=1/index_id=100201001 could not be cleaned up.
org.apache.hadoop.hive.ql.metadata.HiveException: Directory hdfs://master:8020/user/hive/bus_optimation_xm/g_operate_statistic/date_time=201806124/date_type=1440/date_flag=1/index_id=100201001 could not be cleaned up.
    at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2936)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1442)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:1636)
    at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:388)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1782)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1539)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1318)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1127)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:178)
    at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:72)
    at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:232)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:245)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
It was followed by this error:
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied by sticky bit setting: user=bus, inode=000000_0
    at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkStickyBit(DefaultAuthorizationProvider.java:388)
    at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:166)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6621)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4078)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4030)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4014)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:841)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:308)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:597)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
    at org.apache.hadoop.ipc.Client.call(Client.java:1471)
    at org.apache.hadoop.ipc.Client.call(Client.java:1408)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy14.delete(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:531)
    at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy15.delete(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2038)
    ... 30 more
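A directory that cannot be cleaned up usually points to a permissions problem, yet even after I brute-forced the permissions to 777, the partition directory still could not be deleted. Digging further into the log, I found this message: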
Permission denied by sticky bit setting: user=bus, inode=000000_0
This was the only permission-related hint in the error log, so I turned to Google and learned that the error is caused by the sticky bit. So what is the sticky bit?
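The sticky bit is different from suid and sgid. It occupies the execute position of the "others" permission triplet. If others also have execute permission (x) there, it is shown as a lowercase t; if they do not, it is shown as an uppercase T. The sticky bit only takes effect on directories: if a directory has the sticky bit set, a file inside it can be deleted or renamed only by the file's owner, the directory's owner, or root (in HDFS, the superuser). Other users cannot delete the file even if they have write permission on the directory. For example, /tmp has permissions drwxrwxrwt, so the files (or directories) in it can be deleted only by their owner or root. Everyone can drop temporary files there, but nobody else can delete yours. Note: suid and sgid are mainly used on files (on Linux, sgid on a directory additionally makes new files inherit the directory's group), while the sticky bit only takes effect on directories.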
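The t and T notation is easy to check on a local Linux filesystem. Below is a minimal sketch, assuming a POSIX shell with mktemp, ls, and cut available. The scratch directory and the mode_of helper are made up for illustration, and this exercises the local filesystem only, not HDFS.

```shell
#!/bin/sh
# Sketch: how the sticky bit shows up in a directory's mode string.
# Runs against a throwaway local directory, not HDFS.

demo_dir=$(mktemp -d)              # hypothetical scratch directory
trap 'rm -rf "$demo_dir"' EXIT     # clean up when the script exits

# Print only the 10-character mode string of a directory.
mode_of() {
    ls -ld "$1" | cut -c1-10
}

chmod 1777 "$demo_dir"             # sticky bit plus x for others
with_x=$(mode_of "$demo_dir")

chmod 1770 "$demo_dir"             # sticky bit, no x for others
without_x=$(mode_of "$demo_dir")

chmod -t "$demo_dir"               # clear only the sticky bit
cleared=$(mode_of "$demo_dir")

echo "1777     -> $with_x"
echo "1770     -> $without_x"
echo "after -t -> $cleared"
```

The three lines printed should show a trailing t, a trailing T, and a trailing - respectively, which matches the rule that lowercase means the execute bit is also set.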
Here is what it looks like once the sticky bit has been set:
-rwxrwxrwt 3 hdfs hive 636 2018-11-06 10:19 /user/hive/xxxx/000000_0
-rwxrwxrwt 3 hdfs hive 635 2018-11-06 10:19 /user/hive/xxxx/000001_0
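That explains it. Our Hive is sometimes accessed from the console and sometimes through HUE, and a quick count shows about 5 different users operating on Hive tables. So if a file was created by the hadoop user, then once the sticky bit is set, no other user can delete that file. According to the official CDH documentation, the sticky bit is set like this: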
hadoop fs -chmod -R 1777 /user/hive/xxxx/
So the fix is to remove the sticky bit:
hadoop fs -chmod -R -t /user/hive/xxxx/
In my tests so far, the only way to clear the sticky bit was the -t argument, which anyone familiar with the Linux chmod command will recognize. Running chmod 777 did not clear it.
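To confirm which entries still carry the sticky bit before and after running -t, the listing can be filtered on its mode column. The following is only a sketch: has_sticky_bit is a hypothetical helper, the path is the same placeholder used above, and it assumes that hadoop fs -ls prints the mode string as the first field of each line, as in the listing shown earlier.

```shell
#!/bin/sh
# Sketch: flag entries in a `hadoop fs -ls` listing that carry the sticky bit.

# Succeeds when the first field of a listing line is a mode string
# whose tenth character is t or T.
has_sticky_bit() {
    mode=${1%% *}                  # keep only the first space-separated field
    case "$mode" in
        [-dl]????????[tT]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Placeholder path taken from the commands above; replace with a real one.
target_dir=/user/hive/xxxx/

# Only query HDFS when the hadoop client is actually installed.
if command -v hadoop >/dev/null 2>&1; then
    hadoop fs -ls "$target_dir" | while IFS= read -r line; do
        if has_sticky_bit "$line"; then
            echo "sticky: $line"
        fi
    done
fi
```

Checking the first field rather than the tenth character of the whole line avoids misreading the "Found N items" header that the listing starts with.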
Reference: https://www.cloudera.com/documentation/enterprise/5-13-x/topics/cdh_sg_sticky_bit_set.html