Hadoop cluster fails to run the pi example

sudo -u hdfs hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 2 10
Number of Maps = 2
Samples per Map = 10
Wrote input for Map #0
Wrote input for Map #1
Starting Job
16/09/09 16:17:01 INFO client.RMProxy: Connecting to ResourceManager at liujunjie-1/10.144.67.160:8032
16/09/09 16:17:02 INFO input.FileInputFormat: Total input paths to process : 2
16/09/09 16:17:03 INFO mapreduce.JobSubmitter: number of splits:2
16/09/09 16:17:03 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1462370365985_0022
16/09/09 16:17:04 INFO impl.YarnClientImpl: Submitted application application_1462370365985_0022
16/09/09 16:17:04 INFO mapreduce.Job: The url to track the job: http://liujunjie-1:8088/proxy/ ... 0022/
16/09/09 16:17:04 INFO mapreduce.Job: Running job: job_1462370365985_0022
16/09/09 16:17:19 INFO mapreduce.Job: Job job_1462370365985_0022 running in uber mode : false
16/09/09 16:17:19 INFO mapreduce.Job: map 0% reduce 0%
16/09/09 16:17:29 INFO mapreduce.Job: map 100% reduce 0%
16/09/09 16:17:44 INFO mapreduce.Job: map 100% reduce 17%
16/09/09 16:17:51 INFO mapreduce.Job: map 100% reduce 100%
16/09/09 16:17:51 INFO mapreduce.Job: Task Id : attempt_1462370365985_0022_r_000000_0, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#3
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:358)
at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:280)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:308)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)

16/09/09 16:17:52 INFO mapreduce.Job: map 100% reduce 0%
16/09/09 16:18:09 INFO mapreduce.Job: map 100% reduce 17%
16/09/09 16:18:16 INFO mapreduce.Job: Task Id : attempt_1462370365985_0022_r_000000_1, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#2
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:358)
at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:280)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:308)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)

16/09/09 16:18:17 INFO mapreduce.Job: map 100% reduce 0%
16/09/09 16:18:32 INFO mapreduce.Job: map 100% reduce 17%
16/09/09 16:18:39 INFO mapreduce.Job: Task Id : attempt_1462370365985_0022_r_000000_2, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#10
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:358)
at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:280)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:308)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)

16/09/09 16:18:40 INFO mapreduce.Job: map 100% reduce 0%
16/09/09 16:18:55 INFO mapreduce.Job: map 100% reduce 17%
16/09/09 16:19:02 INFO mapreduce.Job: map 100% reduce 100%
16/09/09 16:19:02 INFO mapreduce.Job: Job job_1462370365985_0022 failed with state FAILED due to: Task failed task_1462370365985_0022_r_000000
Job failed as tasks failed. failedMaps:0 failedReduces:1

16/09/09 16:19:02 INFO mapreduce.Job: Counters: 37
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=223897
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=532
HDFS: Number of bytes written=0
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
Job Counters
Failed reduce tasks=4
Launched map tasks=2
Launched reduce tasks=4
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=64144
Total time spent by all reduces in occupied slots (ms)=332356
Total time spent by all map tasks (ms)=16036
Total time spent by all reduce tasks (ms)=83089
Total vcore-seconds taken by all map tasks=16036
Total vcore-seconds taken by all reduce tasks=83089
Total megabyte-seconds taken by all map tasks=3207200
Total megabyte-seconds taken by all reduce tasks=16617800
Map-Reduce Framework
Map input records=2
Map output records=4
Map output bytes=36
Map output materialized bytes=67
Input split bytes=296
Combine input records=0
Spilled Records=4
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=383
CPU time spent (ms)=1400
Physical memory (bytes) snapshot=278884352
Virtual memory (bytes) snapshot=1361793024
Total committed heap usage (bytes)=122560512
File Input Format Counters
Bytes Read=236
Job Finished in 120.999 seconds
java.io.FileNotFoundException: File does not exist: hdfs://liujunjie-1:8020/user/hdfs/QuasiMonteCarlo_1473409017518_1360010061/out/reduce-out
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1132)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1750)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1774)
at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
[root@liujunjie-1 hadoop-yarn]# yarn logs -applicationId application_1462370365985_0022
16/09/09 16:20:12 INFO client.RMProxy: Connecting to ResourceManager at liujunjie-1/10.144.67.160:8032
hdfs://liujunjie-1:8020/tmp/yarn-log/root/logs/application_1462370365985_0022does not exist.
Log aggregation has not completed or is not enabled.

fish - Hadooper

Because you submitted the job with sudo -u hdfs hadoop jar xxx, please also use sudo -u hdfs yarn logs xxx when you check the logs.
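For the job in this thread that would be (the application id comes from the submission log above):

sudo -u hdfs yarn logs -applicationId application_1462370365985_0022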

fish - Hadooper

From the look of it, MapReduce itself is executing fine; it's just that there is a problematic machine in the cluster. Stop the NodeManager on liujunjie-3 and resubmit the job, then see whether the error message goes away.
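A minimal sketch of that, assuming a CDH-style package install (suggested by the /usr/lib/hadoop-mapreduce path) where the NodeManager runs as a system service; the service name below is that packaging's default, not something confirmed in this thread:

# on liujunjie-3: stop the NodeManager service
sudo service hadoop-yarn-nodemanager stop

# from the client node: resubmit the example job
sudo -u hdfs hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 2 10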

fish - Hadooper

I restarted all the NodeManagers. Also, I noticed that on liujunjie-1 you started two fuse mounts and their state looks odd: one is mounted at /data/hadoop_fuse, and judging from the command line the other is mounted at ./hadoop_fuse (I'm not sure where you launched that from). Umount /data/hadoop_fuse and stop the two dfs_fuse Java processes. I can't yet tell whether those two processes affected the earlier MapReduce runs, but MapReduce jobs are executing normally now. The /data/hadoop_fuse directory is still not in a healthy state, though; if you want to use fuse, mount it somewhere else first.
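A rough sketch of the fuse cleanup described above; the /data/hadoop_fuse mount point and the dfs_fuse process name come from this thread, while the exact commands and the placeholder PID are assumptions about a generic fuse-dfs setup:

# on liujunjie-1: locate the fuse mounts and the dfs_fuse Java processes
mount | grep hadoop_fuse
ps -ef | grep dfs_fuse

# unmount the problematic mount point and stop the fuse processes
sudo umount /data/hadoop_fuse     # or: fusermount -u /data/hadoop_fuse
sudo kill <dfs_fuse-pid>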

唯思可达

The configuration
       <name>yarn.log-aggregation-enable</name>
       <value>true</value>
is set to true, but running yarn logs -applicationId application_1462370365985_0022 still fails with "Log aggregation has not completed or is not enabled."
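One way to double-check what the NodeManagers actually load is to read yarn-site.xml directly; the /etc/hadoop/conf path below is an assumption (the usual location for a packaged install). yarn.nodemanager.remote-app-log-dir should match the /tmp/yarn-log path that yarn logs is searching:

grep -A1 -E 'yarn.log-aggregation-enable|yarn.nodemanager.remote-app-log-dir' /etc/hadoop/conf/yarn-site.xml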

fish - Hadooper

Has this directory been created on HDFS: hdfs://liujunjie-1:8020/tmp/yarn-log?
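A quick check, run as the hdfs user so it does not trip over the HDFS permissions (running it as root is what produces the Permission denied entry in the listing below):

sudo -u hdfs hdfs dfs -ls -R /tmp/yarn-log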

唯思可达

The directory exists:
drwxrwxrwt   - yarn      supergroup          0 2016-09-09 15:02 /tmp/yarn-log
drwxrwx---   - hdfs      supergroup          0 2016-05-04 22:01 /tmp/yarn-log/hdfs
ls: Permission denied: user=root, access=READ_EXECUTE, inode="/tmp/yarn-log/hdfs":hdfs:supergroup:drwxrwx---
drwxrwx---   - root      supergroup          0 2016-09-09 15:02 /tmp/yarn-log/root
drwxrwx---   - root      supergroup          0 2016-09-09 16:21 /tmp/yarn-log/root/logs
drwxrwx---   - root      supergroup          0 2016-09-09 15:03 /tmp/yarn-log/root/logs/application_1462370365985_0017

唯思可达

Please help me take a look at what is causing this:
[root@liujunjie-1 hadoop-mapreduce]# sudo -u hdfs yarn logs -applicationId application_1462370365985_0022 16/09/09 16:45:22 INFO client.RMProxy: Connecting to ResourceManager at liujunjie-1/10.144.67.160:8032 16/09/09 16:45:26 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library 16/09/09 16:45:26 INFO compress.CodecPool: Got brand-new decompressor [.deflate] Container: container_1462370365985_0022_01_000005 on liujunjie-1_46171 ======================================================================== LogType:stderr Log Upload Time:Fri Sep 09 16:19:08 +0800 2016 LogLength:0 Log Contents: LogType:stdout Log Upload Time:Fri Sep 09 16:19:08 +0800 2016 LogLength:0 Log Contents: LogType:syslog Log Upload Time:Fri Sep 09 16:19:08 +0800 2016 LogLength:8118 Log Contents: 2016-09-09 16:17:57,694 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2016-09-09 16:17:58,047 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 2016-09-09 16:17:58,047 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system started 2016-09-09 16:17:58,066 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens: 2016-09-09 16:17:58,066 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1462370365985_0022, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@64375c2a) 2016-09-09 16:17:58,329 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now. 2016-09-09 16:17:59,179 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022 2016-09-09 16:18:01,114 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. 
Instead, use dfs.metrics.session-id 2016-09-09 16:18:01,957 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1 2016-09-09 16:18:02,044 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : [ ] 2016-09-09 16:18:02,223 INFO [main] org.apache.hadoop.mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@65306276 2016-09-09 16:18:02,306 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: MergerManager: memoryLimit=70968936, maxSingleShuffleLimit=17742234, mergeThreshold=46839500, ioSortFactor=64, memToMemMergeOutputsThreshold=64 2016-09-09 16:18:02,316 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1462370365985_0022_r_000000_1 Thread started: EventFetcher for fetching Map Completion Events 2016-09-09 16:18:02,336 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1462370365985_0022_r_000000_1: Got 2 new map-outputs 2016-09-09 16:18:02,352 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning liujunjie-3:13562 with 1 to fetcher#1 2016-09-09 16:18:02,353 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to liujunjie-3:13562 to fetcher#1 2016-09-09 16:18:02,353 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning liujunjie-4:13562 with 1 to fetcher#2 2016-09-09 16:18:02,353 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to liujunjie-4:13562 to fetcher#2 2016-09-09 16:18:02,560 WARN [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to liujunjie-3:13562 with 1 map outputs java.io.IOException: Got invalid response code 500 from http://liujunjie-3:13562/mapOu ... 
01_0: Internal Server Error         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:429)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.setupConnectionsWithRetry(Fetcher.java:392)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:292)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193) 2016-09-09 16:18:02,565 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.Fetcher: for url=13562/mapOutput?job=job_1462370365985_0022&reduce=0&map=attempt_1462370365985_0022_m_000000_0 sent hash and received reply 2016-09-09 16:18:02,585 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: liujunjie-3:13562 freed by fetcher#1 in 233ms 2016-09-09 16:18:02,684 INFO [fetcher#2] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-09-09 16:18:02,714 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#2 about to shuffle output of map attempt_1462370365985_0022_m_000000_0 decomp: 24 len: 34 to MEMORY 2016-09-09 16:18:02,721 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 24 bytes from map-output for attempt_1462370365985_0022_m_000000_0 2016-09-09 16:18:02,722 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 24, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->24 2016-09-09 16:18:02,728 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: liujunjie-4:13562 freed by fetcher#2 in 375ms 2016-09-09 16:18:15,582 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning liujunjie-3:13562 with 1 to fetcher#2 2016-09-09 16:18:15,582 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to liujunjie-3:13562 to fetcher#2 2016-09-09 16:18:15,587 WARN [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to liujunjie-3:13562 with 1 map outputs java.io.IOException: Got invalid response code 500 from http://liujunjie-3:13562/mapOu ... 01_0: Internal Server Error         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:429)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.setupConnectionsWithRetry(Fetcher.java:392)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:292)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193) 2016-09-09 16:18:15,588 FATAL [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Shuffle failed with too many fetch failures and insufficient progress! 
2016-09-09 16:18:15,588 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#2 2016-09-09 16:18:15,589 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#2         at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)         at java.security.AccessController.doPrivileged(Native Method)         at javax.security.auth.Subject.doAs(Subject.java:415)         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.         at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:358)         at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:280)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:308)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193) 2016-09-09 16:18:15,590 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: liujunjie-3:13562 freed by fetcher#2 in 8ms 2016-09-09 16:18:15,620 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task 2016-09-09 16:18:15,659 WARN [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Could not delete hdfs://liujunjie-1:8020/user/hdfs/QuasiMonteCarlo_1473409017518_1360010061/out/_temporary/1/_temporary/attempt_1462370365985_0022_r_000000_1 2016-09-09 16:18:15,667 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping ReduceTask metrics system... 2016-09-09 16:18:15,668 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system stopped. 2016-09-09 16:18:15,668 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system shutdown complete. Container: container_1462370365985_0022_01_000006 on liujunjie-2_35140 ======================================================================== LogType:stderr Log Upload Time:Fri Sep 09 16:19:08 +0800 2016 LogLength:0 Log Contents: LogType:stdout Log Upload Time:Fri Sep 09 16:19:08 +0800 2016 LogLength:0 Log Contents: LogType:syslog Log Upload Time:Fri Sep 09 16:19:08 +0800 2016 LogLength:8139 Log Contents: 2016-09-09 16:18:21,442 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2016-09-09 16:18:21,695 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 2016-09-09 16:18:21,696 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system started 2016-09-09 16:18:21,719 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens: 2016-09-09 16:18:21,719 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1462370365985_0022, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@64375c2a) 2016-09-09 16:18:21,966 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now. 
2016-09-09 16:18:22,580 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022 2016-09-09 16:18:24,174 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 2016-09-09 16:18:24,974 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1 2016-09-09 16:18:25,005 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : [ ] 2016-09-09 16:18:25,189 INFO [main] org.apache.hadoop.mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@6f22c544 2016-09-09 16:18:25,253 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: MergerManager: memoryLimit=70968936, maxSingleShuffleLimit=17742234, mergeThreshold=46839500, ioSortFactor=64, memToMemMergeOutputsThreshold=64 2016-09-09 16:18:25,260 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1462370365985_0022_r_000000_2 Thread started: EventFetcher for fetching Map Completion Events 2016-09-09 16:18:25,304 INFO [fetcher#9] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning liujunjie-3:13562 with 1 to fetcher#9 2016-09-09 16:18:25,305 INFO [fetcher#9] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to liujunjie-3:13562 to fetcher#9 2016-09-09 16:18:25,308 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning liujunjie-4:13562 with 1 to fetcher#10 2016-09-09 16:18:25,310 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to liujunjie-4:13562 to fetcher#10 2016-09-09 16:18:25,311 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1462370365985_0022_r_000000_2: Got 2 new map-outputs 2016-09-09 16:18:25,359 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.Fetcher: for url=13562/mapOutput?job=job_1462370365985_0022&reduce=0&map=attempt_1462370365985_0022_m_000000_0 sent hash and received reply 2016-09-09 16:18:25,367 WARN [fetcher#9] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to liujunjie-3:13562 with 1 map outputs java.io.IOException: Got invalid response code 500 from http://liujunjie-3:13562/mapOu ... 
01_0: Internal Server Error         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:429)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.setupConnectionsWithRetry(Fetcher.java:392)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:292)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193) 2016-09-09 16:18:25,392 INFO [fetcher#9] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: liujunjie-3:13562 freed by fetcher#9 in 87ms 2016-09-09 16:18:25,400 INFO [fetcher#10] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-09-09 16:18:25,421 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#10 about to shuffle output of map attempt_1462370365985_0022_m_000000_0 decomp: 24 len: 34 to MEMORY 2016-09-09 16:18:25,434 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 24 bytes from map-output for attempt_1462370365985_0022_m_000000_0 2016-09-09 16:18:25,438 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 24, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->24 2016-09-09 16:18:25,445 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: liujunjie-4:13562 freed by fetcher#10 in 135ms 2016-09-09 16:18:38,383 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning liujunjie-3:13562 with 1 to fetcher#10 2016-09-09 16:18:38,383 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to liujunjie-3:13562 to fetcher#10 2016-09-09 16:18:38,388 WARN [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to liujunjie-3:13562 with 1 map outputs java.io.IOException: Got invalid response code 500 from http://liujunjie-3:13562/mapOu ... 01_0: Internal Server Error         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:429)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.setupConnectionsWithRetry(Fetcher.java:392)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:292)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193) 2016-09-09 16:18:38,389 FATAL [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Shuffle failed with too many fetch failures and insufficient progress! 2016-09-09 16:18:38,389 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#10 2016-09-09 16:18:38,390 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#10         at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)         at java.security.AccessController.doPrivileged(Native Method)         at javax.security.auth.Subject.doAs(Subject.java:415)         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.       
  at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:358)         at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:280)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:308)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193) 2016-09-09 16:18:38,392 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: liujunjie-3:13562 freed by fetcher#10 in 9ms 2016-09-09 16:18:38,399 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task 2016-09-09 16:18:38,425 WARN [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Could not delete hdfs://liujunjie-1:8020/user/hdfs/QuasiMonteCarlo_1473409017518_1360010061/out/_temporary/1/_temporary/attempt_1462370365985_0022_r_000000_2 2016-09-09 16:18:38,430 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping ReduceTask metrics system... 2016-09-09 16:18:38,431 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system stopped. 2016-09-09 16:18:38,431 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system shutdown complete. Container: container_1462370365985_0022_01_000007 on liujunjie-3_53523 ======================================================================== LogType:stderr Log Upload Time:Fri Sep 09 16:19:09 +0800 2016 LogLength:0 Log Contents: LogType:stdout Log Upload Time:Fri Sep 09 16:19:09 +0800 2016 LogLength:0 Log Contents: LogType:syslog Log Upload Time:Fri Sep 09 16:19:09 +0800 2016 LogLength:8116 Log Contents: 2016-09-09 16:18:43,981 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2016-09-09 16:18:44,164 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 2016-09-09 16:18:44,164 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system started 2016-09-09 16:18:44,185 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens: 2016-09-09 16:18:44,186 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1462370365985_0022, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@64375c2a) 2016-09-09 16:18:44,400 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now. 2016-09-09 16:18:44,970 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022 2016-09-09 16:18:46,757 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. 
Instead, use dfs.metrics.session-id 2016-09-09 16:18:47,560 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1 2016-09-09 16:18:47,590 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : [ ] 2016-09-09 16:18:47,770 INFO [main] org.apache.hadoop.mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@55df60a 2016-09-09 16:18:47,817 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: MergerManager: memoryLimit=70968936, maxSingleShuffleLimit=17742234, mergeThreshold=46839500, ioSortFactor=64, memToMemMergeOutputsThreshold=64 2016-09-09 16:18:47,823 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1462370365985_0022_r_000000_3 Thread started: EventFetcher for fetching Map Completion Events 2016-09-09 16:18:47,854 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1462370365985_0022_r_000000_3: Got 2 new map-outputs 2016-09-09 16:18:47,854 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning liujunjie-3:13562 with 1 to fetcher#2 2016-09-09 16:18:47,854 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to liujunjie-3:13562 to fetcher#2 2016-09-09 16:18:47,855 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning liujunjie-4:13562 with 1 to fetcher#3 2016-09-09 16:18:47,855 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to liujunjie-4:13562 to fetcher#3 2016-09-09 16:18:47,910 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.Fetcher: for url=13562/mapOutput?job=job_1462370365985_0022&reduce=0&map=attempt_1462370365985_0022_m_000000_0 sent hash and received reply 2016-09-09 16:18:47,919 WARN [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to liujunjie-3:13562 with 1 map outputs java.io.IOException: Got invalid response code 500 from http://liujunjie-3:13562/mapOu ... 
01_0: Internal Server Error         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:429)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.setupConnectionsWithRetry(Fetcher.java:392)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:292)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193) 2016-09-09 16:18:47,939 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: liujunjie-3:13562 freed by fetcher#2 in 86ms 2016-09-09 16:18:47,951 INFO [fetcher#3] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-09-09 16:18:47,967 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#3 about to shuffle output of map attempt_1462370365985_0022_m_000000_0 decomp: 24 len: 34 to MEMORY 2016-09-09 16:18:47,973 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 24 bytes from map-output for attempt_1462370365985_0022_m_000000_0 2016-09-09 16:18:47,981 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 24, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->24 2016-09-09 16:18:47,983 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: liujunjie-4:13562 freed by fetcher#3 in 128ms 2016-09-09 16:19:00,935 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning liujunjie-3:13562 with 1 to fetcher#3 2016-09-09 16:19:00,935 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to liujunjie-3:13562 to fetcher#3 2016-09-09 16:19:00,940 WARN [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to liujunjie-3:13562 with 1 map outputs java.io.IOException: Got invalid response code 500 from http://liujunjie-3:13562/mapOu ... 01_0: Internal Server Error         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:429)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.setupConnectionsWithRetry(Fetcher.java:392)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:292)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193) 2016-09-09 16:19:00,940 FATAL [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Shuffle failed with too many fetch failures and insufficient progress! 2016-09-09 16:19:00,941 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#3 2016-09-09 16:19:00,942 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#3         at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)         at java.security.AccessController.doPrivileged(Native Method)         at javax.security.auth.Subject.doAs(Subject.java:415)         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.         
at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:358)         at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:280)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:308)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193) 2016-09-09 16:19:00,944 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: liujunjie-3:13562 freed by fetcher#3 in 9ms 2016-09-09 16:19:00,951 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task 2016-09-09 16:19:00,972 WARN [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Could not delete hdfs://liujunjie-1:8020/user/hdfs/QuasiMonteCarlo_1473409017518_1360010061/out/_temporary/1/_temporary/attempt_1462370365985_0022_r_000000_3 2016-09-09 16:19:00,977 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping ReduceTask metrics system... 2016-09-09 16:19:00,977 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system stopped. 2016-09-09 16:19:00,977 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system shutdown complete. Container: container_1462370365985_0022_01_000004 on liujunjie-3_53523 ======================================================================== LogType:stderr Log Upload Time:Fri Sep 09 16:19:09 +0800 2016 LogLength:0 Log Contents: LogType:stdout Log Upload Time:Fri Sep 09 16:19:09 +0800 2016 LogLength:0 Log Contents: LogType:syslog Log Upload Time:Fri Sep 09 16:19:09 +0800 2016 LogLength:8117 Log Contents: 2016-09-09 16:17:33,829 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2016-09-09 16:17:34,016 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 2016-09-09 16:17:34,017 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system started 2016-09-09 16:17:34,039 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens: 2016-09-09 16:17:34,039 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1462370365985_0022, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@64375c2a) 2016-09-09 16:17:34,269 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now. 2016-09-09 16:17:34,849 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022 2016-09-09 16:17:36,598 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. 
Instead, use dfs.metrics.session-id 2016-09-09 16:17:37,346 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1 2016-09-09 16:17:37,375 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : [ ] 2016-09-09 16:17:37,555 INFO [main] org.apache.hadoop.mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@55df60a 2016-09-09 16:17:37,609 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: MergerManager: memoryLimit=70968936, maxSingleShuffleLimit=17742234, mergeThreshold=46839500, ioSortFactor=64, memToMemMergeOutputsThreshold=64 2016-09-09 16:17:37,618 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1462370365985_0022_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events 2016-09-09 16:17:37,648 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1462370365985_0022_r_000000_0: Got 2 new map-outputs 2016-09-09 16:17:37,649 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning liujunjie-3:13562 with 1 to fetcher#2 2016-09-09 16:17:37,649 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to liujunjie-3:13562 to fetcher#2 2016-09-09 16:17:37,650 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning liujunjie-4:13562 with 1 to fetcher#3 2016-09-09 16:17:37,650 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to liujunjie-4:13562 to fetcher#3 2016-09-09 16:17:37,697 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.Fetcher: for url=13562/mapOutput?job=job_1462370365985_0022&reduce=0&map=attempt_1462370365985_0022_m_000000_0 sent hash and received reply 2016-09-09 16:17:37,703 WARN [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to liujunjie-3:13562 with 1 map outputs java.io.IOException: Got invalid response code 500 from http://liujunjie-3:13562/mapOu ... 
01_0: Internal Server Error         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:429)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.setupConnectionsWithRetry(Fetcher.java:392)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:292)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193) 2016-09-09 16:17:37,725 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: liujunjie-3:13562 freed by fetcher#2 in 77ms 2016-09-09 16:17:37,738 INFO [fetcher#3] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-09-09 16:17:37,753 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#3 about to shuffle output of map attempt_1462370365985_0022_m_000000_0 decomp: 24 len: 34 to MEMORY 2016-09-09 16:17:37,762 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 24 bytes from map-output for attempt_1462370365985_0022_m_000000_0 2016-09-09 16:17:37,767 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 24, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->24 2016-09-09 16:17:37,771 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: liujunjie-4:13562 freed by fetcher#3 in 121ms 2016-09-09 16:17:50,722 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning liujunjie-3:13562 with 1 to fetcher#3 2016-09-09 16:17:50,722 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to liujunjie-3:13562 to fetcher#3 2016-09-09 16:17:50,728 WARN [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to liujunjie-3:13562 with 1 map outputs java.io.IOException: Got invalid response code 500 from http://liujunjie-3:13562/mapOu ... 01_0: Internal Server Error         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:429)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.setupConnectionsWithRetry(Fetcher.java:392)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:292)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193) 2016-09-09 16:17:50,728 FATAL [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Shuffle failed with too many fetch failures and insufficient progress! 2016-09-09 16:17:50,729 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#3 2016-09-09 16:17:50,729 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#3         at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)         at java.security.AccessController.doPrivileged(Native Method)         at javax.security.auth.Subject.doAs(Subject.java:415)         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.         
at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:358)         at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:280)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:308)         at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193) 2016-09-09 16:17:50,731 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: liujunjie-3:13562 freed by fetcher#3 in 10ms 2016-09-09 16:17:50,740 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task 2016-09-09 16:17:50,771 WARN [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Could not delete hdfs://liujunjie-1:8020/user/hdfs/QuasiMonteCarlo_1473409017518_1360010061/out/_temporary/1/_temporary/attempt_1462370365985_0022_r_000000_0 2016-09-09 16:17:50,776 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping ReduceTask metrics system... 2016-09-09 16:17:50,776 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system stopped. 2016-09-09 16:17:50,776 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system shutdown complete. Container: container_1462370365985_0022_01_000002 on liujunjie-3_53523 ======================================================================== LogType:stderr Log Upload Time:Fri Sep 09 16:19:09 +0800 2016 LogLength:0 Log Contents: LogType:stdout Log Upload Time:Fri Sep 09 16:19:09 +0800 2016 LogLength:0 Log Contents: LogType:syslog Log Upload Time:Fri Sep 09 16:19:09 +0800 2016 LogLength:3512 Log Contents: 2016-09-09 16:17:23,249 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2016-09-09 16:17:23,432 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 2016-09-09 16:17:23,433 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started 2016-09-09 16:17:23,453 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens: 2016-09-09 16:17:23,453 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1462370365985_0022, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@64375c2a) 2016-09-09 16:17:23,676 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now. 2016-09-09 16:17:24,261 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022 2016-09-09 16:17:25,923 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. 
Instead, use dfs.metrics.session-id 2016-09-09 16:17:26,684 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1 2016-09-09 16:17:26,707 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : [ ] 2016-09-09 16:17:27,154 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: hdfs://liujunjie-1:8020/user/hdfs/QuasiMonteCarlo_1473409017518_1360010061/in/part1:0+118 2016-09-09 16:17:27,263 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR) 0 kvi 7864316(31457264) 2016-09-09 16:17:27,264 INFO [main] org.apache.hadoop.mapred.MapTask: mapreduce.task.io.sort.mb: 30 2016-09-09 16:17:27,264 INFO [main] org.apache.hadoop.mapred.MapTask: soft limit at 25165824 2016-09-09 16:17:27,264 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 31457280 2016-09-09 16:17:27,264 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 7864316; length = 1966080 2016-09-09 16:17:27,278 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 2016-09-09 16:17:27,380 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output 2016-09-09 16:17:27,380 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling map output 2016-09-09 16:17:27,380 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 18; bufvoid = 31457280 2016-09-09 16:17:27,381 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 7864316(31457264); kvend = 7864312(31457248); length = 5/1966080 2016-09-09 16:17:27,391 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.snappy] 2016-09-09 16:17:27,402 INFO [main] org.apache.hadoop.mapred.MapTask: Finished spill 0 2016-09-09 16:17:27,423 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1462370365985_0022_m_000001_0 is done. And is in the process of committing 2016-09-09 16:17:27,650 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1462370365985_0022_m_000001_0' done. 2016-09-09 16:17:27,651 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system... 2016-09-09 16:17:27,652 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped. 2016-09-09 16:17:27,652 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete. Container: container_1462370365985_0022_01_000003 on liujunjie-4_49754 ======================================================================== LogType:stderr Log Upload Time:Fri Sep 09 16:19:08 +0800 2016 LogLength:0 Log Contents: LogType:stdout Log Upload Time:Fri Sep 09 16:19:08 +0800 2016 LogLength:0 Log Contents: LogType:syslog Log Upload Time:Fri Sep 09 16:19:08 +0800 2016 LogLength:3512 Log Contents: 2016-09-09 16:17:23,455 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2016-09-09 16:17:23,738 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 
2016-09-09 16:17:23,738 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started 2016-09-09 16:17:23,762 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens: 2016-09-09 16:17:23,762 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1462370365985_0022, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@64375c2a) 2016-09-09 16:17:24,024 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now. 2016-09-09 16:17:24,698 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022 2016-09-09 16:17:26,305 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 2016-09-09 16:17:27,110 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1 2016-09-09 16:17:27,135 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : [ ] 2016-09-09 16:17:27,585 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: hdfs://liujunjie-1:8020/user/hdfs/QuasiMonteCarlo_1473409017518_1360010061/in/part0:0+118 2016-09-09 16:17:27,826 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR) 0 kvi 7864316(31457264) 2016-09-09 16:17:27,826 INFO [main] org.apache.hadoop.mapred.MapTask: mapreduce.task.io.sort.mb: 30 2016-09-09 16:17:27,826 INFO [main] org.apache.hadoop.mapred.MapTask: soft limit at 25165824 2016-09-09 16:17:27,826 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 31457280 2016-09-09 16:17:27,826 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 7864316; length = 1966080 2016-09-09 16:17:27,845 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 2016-09-09 16:17:27,905 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output 2016-09-09 16:17:27,905 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling map output 2016-09-09 16:17:27,905 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 18; bufvoid = 31457280 2016-09-09 16:17:27,905 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 7864316(31457264); kvend = 7864312(31457248); length = 5/1966080 2016-09-09 16:17:27,917 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.snappy] 2016-09-09 16:17:27,929 INFO [main] org.apache.hadoop.mapred.MapTask: Finished spill 0 2016-09-09 16:17:27,955 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1462370365985_0022_m_000000_0 is done. And is in the process of committing 2016-09-09 16:17:28,157 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1462370365985_0022_m_000000_0' done. 2016-09-09 16:17:28,159 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system... 2016-09-09 16:17:28,159 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped. 2016-09-09 16:17:28,160 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete. 
Container: container_1462370365985_0022_01_000001 on liujunjie-4_49754 ======================================================================== LogType:stderr Log Upload Time:Fri Sep 09 16:19:08 +0800 2016 LogLength:222 Log Contents: log4j:WARN No appenders could be found for logger (org.apache.hadoop.ipc.Server). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4 ... onfig for more info. LogType:stdout Log Upload Time:Fri Sep 09 16:19:08 +0800 2016 LogLength:0 Log Contents: LogType:syslog Log Upload Time:Fri Sep 09 16:19:08 +0800 2016 LogLength:90416 Log Contents: 2016-09-09 16:17:06,874 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1462370365985_0022_000001 2016-09-09 16:17:08,819 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2016-09-09 16:17:09,071 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens: 2016-09-09 16:17:09,071 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@7c747c96) 2016-09-09 16:17:09,338 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter. 2016-09-09 16:17:11,645 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null 2016-09-09 16:17:11,751 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1 2016-09-09 16:17:11,758 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter 2016-09-09 16:17:11,802 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler 2016-09-09 16:17:11,808 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher 2016-09-09 16:17:11,810 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher 2016-09-09 16:17:11,812 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher 2016-09-09 16:17:11,817 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler 2016-09-09 16:17:11,820 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher 2016-09-09 16:17:11,821 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter 2016-09-09 16:17:11,824 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: 
Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter 2016-09-09 16:17:11,906 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://liujunjie-1:8020] 2016-09-09 16:17:11,982 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://liujunjie-1:8020] 2016-09-09 16:17:12,030 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://liujunjie-1:8020] 2016-09-09 16:17:12,050 INFO [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Emitting job history data to the timeline server is not enabled 2016-09-09 16:17:12,137 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler 2016-09-09 16:17:12,663 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2016-09-09 16:17:12,784 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 2016-09-09 16:17:12,784 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started 2016-09-09 16:17:12,798 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1462370365985_0022 to jobTokenSecretManager 2016-09-09 16:17:13,130 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1462370365985_0022 because: not enabled; 2016-09-09 16:17:13,168 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1462370365985_0022 = 236. Number of splits = 2 2016-09-09 16:17:13,170 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1462370365985_0022 = 1 2016-09-09 16:17:13,170 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1462370365985_0022Job Transitioned from NEW to INITED 2016-09-09 16:17:13,172 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1462370365985_0022. 2016-09-09 16:17:13,309 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue 2016-09-09 16:17:13,328 INFO [Socket Reader #1 for port 49969] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for

唯思可达

Upvotes from:

Killing the nodemanager on liujunjie-3 did indeed make it work. How do I start the nodemanager on liujunjie-3 by itself? And what actually went wrong with the nodemanager on liujunjie-3? Or do I have to shut down and restart the whole cluster again?
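A single NodeManager can normally be restarted on its own; there is no need to bring the whole cluster down. A minimal sketch, assuming a packaged (CDH-style) install on liujunjie-3; on a plain Apache tarball install the yarn-daemon.sh form applies instead (service names and paths here are assumptions, adjust to your layout):

# packaged install, run on liujunjie-3 only
sudo service hadoop-yarn-nodemanager restart

# tarball install, run as the yarn user on liujunjie-3
$HADOOP_HOME/sbin/yarn-daemon.sh stop nodemanager
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager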

fish - Hadooper

Upvotes from:

You need to look at the nodemanager log on liujunjie-3 and see whether the NM service on that machine has run into some problem.
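For reference, on a packaged install the NodeManager log usually sits under /var/log/hadoop-yarn (the same directory the log-aggregation messages below refer to). Filtering for WARN/ERROR lines narrows it down quickly; the exact path is an assumption, adjust to your layout:

tail -n 500 /var/log/hadoop-yarn/yarn-yarn-nodemanager-liujunjie-3.log | grep -E 'WARN|ERROR'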

唯思可达

Upvotes from:

[root@liujunjie-3 hadoop-yarn]# tail -n 500 yarn-yarn-nodemanager-liujunjie-3.log 2016-09-09 15:22:39,288 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:       at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) 2016-09-09 15:22:39,288 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:       at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) 2016-09-09 15:22:39,288 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:       at java.util.concurrent.FutureTask.run(FutureTask.java:262) 2016-09-09 15:22:39,288 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 2016-09-09 15:22:39,288 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 2016-09-09 15:22:39,288 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:       at java.lang.Thread.run(Thread.java:745) 2016-09-09 15:22:39,289 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container exited with a non-zero exit code 1 2016-09-09 15:22:39,289 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0020_02_000001 transitioned from RUNNING to EXITED_WITH_FAILURE 2016-09-09 15:22:39,289 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1462370365985_0020_02_000001 2016-09-09 15:22:39,323 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /data/nm_local/usercache/root/appcache/application_1462370365985_0020/container_1462370365985_0020_02_000001 2016-09-09 15:22:39,327 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=root OPERATION=Container Finished - Failed       TARGET=ContainerImpl    RESULT=FAILURE  DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE        APPID=application_1462370365985_0020    CONTAINERID=container_1462370365985_0020_02_000001 2016-09-09 15:22:39,327 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0020_02_000001 transitioned from EXITED_WITH_FAILURE to DONE 2016-09-09 15:22:39,327 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Removing container_1462370365985_0020_02_000001 from application application_1462370365985_0020 2016-09-09 15:22:39,327 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Considering container container_1462370365985_0020_02_000001 for log-aggregation 2016-09-09 15:22:39,327 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1462370365985_0020 2016-09-09 15:22:40,584 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed completed containers from NM context: [container_1462370365985_0020_02_000001] 2016-09-09 15:22:40,585 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1462370365985_0020 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP 2016-09-09 15:22:40,586 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_STOP for appId application_1462370365985_0020 2016-09-09 15:22:40,586 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1462370365985_0020 transitioned from APPLICATION_RESOURCES_CLEANINGUP to FINISHED 2016-09-09 15:22:40,586 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Application just finished : application_1462370365985_0020 2016-09-09 15:22:40,586 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /data/nm_local/usercache/root/appcache/application_1462370365985_0020 2016-09-09 15:22:40,593 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping resource-monitoring for container_1462370365985_0020_02_000001 2016-09-09 15:22:40,599 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Uploading logs for container container_1462370365985_0020_02_000001. Current good log dirs are /var/log/hadoop-yarn 2016-09-09 15:22:40,600 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /var/log/hadoop-yarn/application_1462370365985_0020/container_1462370365985_0020_02_000001/stderr 2016-09-09 15:22:40,600 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /var/log/hadoop-yarn/application_1462370365985_0020/container_1462370365985_0020_02_000001/stdout 2016-09-09 15:22:40,601 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /var/log/hadoop-yarn/application_1462370365985_0020/container_1462370365985_0020_02_000001/syslog 2016-09-09 15:22:40,647 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /var/log/hadoop-yarn/application_1462370365985_0020 2016-09-09 15:48:37,374 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1462370365985_0021_000001 (auth:SIMPLE) 2016-09-09 15:48:37,380 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1462370365985_0021_01_000004 by user hdfs 2016-09-09 15:48:37,380 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Creating a new application reference for app application_1462370365985_0021 2016-09-09 15:48:37,381 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1462370365985_0021 transitioned from NEW to INITING 2016-09-09 15:48:37,381 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hdfs IP=10.144.65.182   OPERATION=Start Container Request        TARGET=ContainerManageImpl      RESULT=SUCCESS  APPID=application_1462370365985_0021        CONTAINERID=container_1462370365985_0021_01_000004 2016-09-09 15:48:37,387 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: rollingMonitorInterval is set as -1. The log rolling mornitoring interval is disabled. The logs will be aggregated after this application is finished. 
2016-09-09 15:48:37,394 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1462370365985_0021_01_000004 to application application_1462370365985_0021 2016-09-09 15:48:37,395 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1462370365985_0021 transitioned from INITING to RUNNING 2016-09-09 15:48:37,395 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0021_01_000004 transitioned from NEW to LOCALIZING 2016-09-09 15:48:37,395 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1462370365985_0021 2016-09-09 15:48:37,395 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_INIT for appId application_1462370365985_0021 2016-09-09 15:48:37,395 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got APPLICATION_INIT for service mapreduce_shuffle 2016-09-09 15:48:37,395 INFO org.apache.hadoop.mapred.ShuffleHandler: Added token for job_1462370365985_0021 2016-09-09 15:48:37,396 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://liujunjie-1:8020/user/hdfs/.staging/job_1462370365985_0021/job.jar transitioned from INIT to DOWNLOADING 2016-09-09 15:48:37,396 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://liujunjie-1:8020/user/hdfs/.staging/job_1462370365985_0021/job.xml transitioned from INIT to DOWNLOADING 2016-09-09 15:48:37,396 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1462370365985_0021_01_000004 2016-09-09 15:48:37,398 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file /data/nm_local/nmPrivate/container_1462370365985_0021_01_000004.tokens. 
Credentials list: 2016-09-09 15:48:37,398 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user hdfs 2016-09-09 15:48:37,410 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying from /data/nm_local/nmPrivate/container_1462370365985_0021_01_000004.tokens to /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0021/container_1462370365985_0021_01_000004.tokens 2016-09-09 15:48:37,410 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Localizer CWD set to /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0021 = file:/data/nm_local/usercache/hdfs/appcache/application_1462370365985_0021 2016-09-09 15:48:37,495 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://liujunjie-1:8020/user/hdfs/.staging/job_1462370365985_0021/job.jar(->/data/nm_local/usercache/hdfs/appcache/application_1462370365985_0021/filecache/10/job.jar) transitioned from DOWNLOADING to LOCALIZED 2016-09-09 15:48:37,532 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://liujunjie-1:8020/user/hdfs/.staging/job_1462370365985_0021/job.xml(->/data/nm_local/usercache/hdfs/appcache/application_1462370365985_0021/filecache/11/job.xml) transitioned from DOWNLOADING to LOCALIZED 2016-09-09 15:48:37,532 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0021_01_000004 transitioned from LOCALIZING to LOCALIZED 2016-09-09 15:48:37,578 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0021_01_000004 transitioned from LOCALIZED to RUNNING 2016-09-09 15:48:37,585 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [bash, /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0021/container_1462370365985_0021_01_000004/default_container_executor.sh] 2016-09-09 15:48:37,645 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Starting resource-monitoring for container_1462370365985_0021_01_000004 2016-09-09 15:48:37,686 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24681 for container-id container_1462370365985_0021_01_000004: 14.5 MB of 200 MB physical memory used; 568.1 MB of 420.0 MB virtual memory used 2016-09-09 15:48:40,721 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24681 for container-id container_1462370365985_0021_01_000004: 80.8 MB of 200 MB physical memory used; 618.6 MB of 420.0 MB virtual memory used 2016-09-09 15:48:43,772 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24681 for container-id container_1462370365985_0021_01_000004: 97.0 MB of 200 MB physical memory used; 642.3 MB of 420.0 MB virtual memory used 2016-09-09 15:48:46,791 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24681 for container-id container_1462370365985_0021_01_000004: 100.4 MB of 200 MB physical memory used; 660.4 MB of 420.0 MB virtual memory used 2016-09-09 15:48:49,807 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24681 for 
container-id container_1462370365985_0021_01_000004: 99.1 MB of 200 MB physical memory used; 660.4 MB of 420.0 MB virtual memory used 2016-09-09 15:48:52,828 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24681 for container-id container_1462370365985_0021_01_000004: 99.2 MB of 200 MB physical memory used; 660.4 MB of 420.0 MB virtual memory used 2016-09-09 15:48:55,854 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24681 for container-id container_1462370365985_0021_01_000004: 99.2 MB of 200 MB physical memory used; 660.4 MB of 420.0 MB virtual memory used 2016-09-09 15:48:57,518 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1462370365985_0021_000001 (auth:SIMPLE) 2016-09-09 15:48:57,523 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Stopping container with container Id: container_1462370365985_0021_01_000004 2016-09-09 15:48:57,523 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hdfs IP=10.144.65.182   OPERATION=Stop Container Request TARGET=ContainerManageImpl      RESULT=SUCCESS  APPID=application_1462370365985_0021CONTAINERID=container_1462370365985_0021_01_000004 2016-09-09 15:48:57,524 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0021_01_000004 transitioned from RUNNING to KILLING 2016-09-09 15:48:57,524 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1462370365985_0021_01_000004 2016-09-09 15:48:57,537 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1462370365985_0021_01_000004 is : 143 2016-09-09 15:48:57,567 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0021_01_000004 transitioned from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL 2016-09-09 15:48:57,567 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0021/container_1462370365985_0021_01_000004 2016-09-09 15:48:57,569 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hdfs OPERATION=Container Finished - Killed       TARGET=ContainerImpl    RESULT=SUCCESS  APPID=application_1462370365985_0021    CONTAINERID=container_1462370365985_0021_01_000004 2016-09-09 15:48:57,569 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0021_01_000004 transitioned from CONTAINER_CLEANEDUP_AFTER_KILL to DONE 2016-09-09 15:48:57,569 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Removing container_1462370365985_0021_01_000004 from application application_1462370365985_0021 2016-09-09 15:48:57,569 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Considering container container_1462370365985_0021_01_000004 for log-aggregation 2016-09-09 15:48:57,569 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1462370365985_0021 2016-09-09 15:48:58,855 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping resource-monitoring for 
container_1462370365985_0021_01_000004 2016-09-09 15:49:00,463 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1462370365985_0021_000001 (auth:SIMPLE) 2016-09-09 15:49:00,468 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1462370365985_0021_01_000005 by user hdfs 2016-09-09 15:49:00,468 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1462370365985_0021_01_000005 to application application_1462370365985_0021 2016-09-09 15:49:00,469 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0021_01_000005 transitioned from NEW to LOCALIZING 2016-09-09 15:49:00,469 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1462370365985_0021 2016-09-09 15:49:00,469 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_INIT for appId application_1462370365985_0021 2016-09-09 15:49:00,469 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got APPLICATION_INIT for service mapreduce_shuffle 2016-09-09 15:49:00,469 INFO org.apache.hadoop.mapred.ShuffleHandler: Added token for job_1462370365985_0021 2016-09-09 15:49:00,470 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0021_01_000005 transitioned from LOCALIZING to LOCALIZED 2016-09-09 15:49:00,470 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hdfs IP=10.144.65.182   OPERATION=Start Container Request        TARGET=ContainerManageImpl      RESULT=SUCCESS  APPID=application_1462370365985_0021        CONTAINERID=container_1462370365985_0021_01_000005 2016-09-09 15:49:00,505 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0021_01_000005 transitioned from LOCALIZED to RUNNING 2016-09-09 15:49:00,512 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [bash, /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0021/container_1462370365985_0021_01_000005/default_container_executor.sh] 2016-09-09 15:49:00,530 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed completed containers from NM context: [container_1462370365985_0021_01_000004] 2016-09-09 15:49:01,855 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Starting resource-monitoring for container_1462370365985_0021_01_000005 2016-09-09 15:49:01,888 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24740 for container-id container_1462370365985_0021_01_000005: 58.6 MB of 200 MB physical memory used; 613.4 MB of 420.0 MB virtual memory used 2016-09-09 15:49:04,943 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24740 for container-id container_1462370365985_0021_01_000005: 91.3 MB of 200 MB physical memory used; 641.4 MB of 420.0 MB virtual memory used 2016-09-09 15:49:07,959 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24740 for container-id container_1462370365985_0021_01_000005: 101.9 MB of 200 MB physical memory used; 660.5 MB of 420.0 MB virtual 
memory used 2016-09-09 15:49:10,980 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24740 for container-id container_1462370365985_0021_01_000005: 101.8 MB of 200 MB physical memory used; 660.5 MB of 420.0 MB virtual memory used 2016-09-09 15:49:13,996 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24740 for container-id container_1462370365985_0021_01_000005: 101.9 MB of 200 MB physical memory used; 660.5 MB of 420.0 MB virtual memory used 2016-09-09 15:49:17,016 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24740 for container-id container_1462370365985_0021_01_000005: 100.4 MB of 200 MB physical memory used; 660.5 MB of 420.0 MB virtual memory used 2016-09-09 15:49:20,034 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24740 for container-id container_1462370365985_0021_01_000005: 100.4 MB of 200 MB physical memory used; 660.5 MB of 420.0 MB virtual memory used 2016-09-09 15:49:20,130 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1462370365985_0021_000001 (auth:SIMPLE) 2016-09-09 15:49:20,135 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Stopping container with container Id: container_1462370365985_0021_01_000005 2016-09-09 15:49:20,135 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hdfs IP=10.144.65.182   OPERATION=Stop Container Request TARGET=ContainerManageImpl      RESULT=SUCCESS  APPID=application_1462370365985_0021CONTAINERID=container_1462370365985_0021_01_000005 2016-09-09 15:49:20,135 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0021_01_000005 transitioned from RUNNING to KILLING 2016-09-09 15:49:20,135 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1462370365985_0021_01_000005 2016-09-09 15:49:20,148 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1462370365985_0021_01_000005 is : 143 2016-09-09 15:49:20,173 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0021_01_000005 transitioned from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL 2016-09-09 15:49:20,174 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0021/container_1462370365985_0021_01_000005 2016-09-09 15:49:20,176 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hdfs OPERATION=Container Finished - Killed       TARGET=ContainerImpl    RESULT=SUCCESS  APPID=application_1462370365985_0021    CONTAINERID=container_1462370365985_0021_01_000005 2016-09-09 15:49:20,176 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0021_01_000005 transitioned from CONTAINER_CLEANEDUP_AFTER_KILL to DONE 2016-09-09 15:49:20,176 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Removing container_1462370365985_0021_01_000005 from application application_1462370365985_0021 2016-09-09 15:49:20,176 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Considering container container_1462370365985_0021_01_000005 for log-aggregation 2016-09-09 15:49:20,176 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1462370365985_0021 2016-09-09 15:49:23,035 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping resource-monitoring for container_1462370365985_0021_01_000005 2016-09-09 15:49:23,142 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed completed containers from NM context: [container_1462370365985_0021_01_000005] 2016-09-09 15:50:15,208 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1462370365985_0021 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP 2016-09-09 15:50:15,209 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_STOP for appId application_1462370365985_0021 2016-09-09 15:50:15,209 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1462370365985_0021 transitioned from APPLICATION_RESOURCES_CLEANINGUP to FINISHED 2016-09-09 15:50:15,209 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Application just finished : application_1462370365985_0021 2016-09-09 15:50:15,209 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0021 2016-09-09 15:50:15,230 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Uploading logs for container container_1462370365985_0021_01_000005. Current good log dirs are /var/log/hadoop-yarn 2016-09-09 15:50:15,232 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /var/log/hadoop-yarn/application_1462370365985_0021/container_1462370365985_0021_01_000005/syslog 2016-09-09 15:50:15,232 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /var/log/hadoop-yarn/application_1462370365985_0021/container_1462370365985_0021_01_000005/stderr 2016-09-09 15:50:15,233 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /var/log/hadoop-yarn/application_1462370365985_0021/container_1462370365985_0021_01_000005/stdout 2016-09-09 15:50:15,233 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Uploading logs for container container_1462370365985_0021_01_000004. 
Current good log dirs are /var/log/hadoop-yarn 2016-09-09 15:50:15,235 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /var/log/hadoop-yarn/application_1462370365985_0021/container_1462370365985_0021_01_000004/stdout 2016-09-09 15:50:15,235 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /var/log/hadoop-yarn/application_1462370365985_0021/container_1462370365985_0021_01_000004/syslog 2016-09-09 15:50:15,235 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /var/log/hadoop-yarn/application_1462370365985_0021/container_1462370365985_0021_01_000004/stderr 2016-09-09 15:50:15,285 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /var/log/hadoop-yarn/application_1462370365985_0021 2016-09-09 16:17:19,866 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1462370365985_0022_000001 (auth:SIMPLE) 2016-09-09 16:17:19,875 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1462370365985_0022_01_000002 by user hdfs 2016-09-09 16:17:19,875 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Creating a new application reference for app application_1462370365985_0022 2016-09-09 16:17:19,875 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1462370365985_0022 transitioned from NEW to INITING 2016-09-09 16:17:19,876 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hdfs IP=10.144.65.182   OPERATION=Start Container Request        TARGET=ContainerManageImpl      RESULT=SUCCESS  APPID=application_1462370365985_0022        CONTAINERID=container_1462370365985_0022_01_000002 2016-09-09 16:17:19,881 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: rollingMonitorInterval is set as -1. The log rolling mornitoring interval is disabled. The logs will be aggregated after this application is finished. 
2016-09-09 16:17:19,888 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1462370365985_0022_01_000002 to application application_1462370365985_0022 2016-09-09 16:17:19,888 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1462370365985_0022 transitioned from INITING to RUNNING 2016-09-09 16:17:19,889 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000002 transitioned from NEW to LOCALIZING 2016-09-09 16:17:19,889 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1462370365985_0022 2016-09-09 16:17:19,889 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_INIT for appId application_1462370365985_0022 2016-09-09 16:17:19,889 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got APPLICATION_INIT for service mapreduce_shuffle 2016-09-09 16:17:19,889 INFO org.apache.hadoop.mapred.ShuffleHandler: Added token for job_1462370365985_0022 2016-09-09 16:17:19,889 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://liujunjie-1:8020/user/hdfs/.staging/job_1462370365985_0022/job.jar transitioned from INIT to DOWNLOADING 2016-09-09 16:17:19,889 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://liujunjie-1:8020/user/hdfs/.staging/job_1462370365985_0022/job.xml transitioned from INIT to DOWNLOADING 2016-09-09 16:17:19,890 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1462370365985_0022_01_000002 2016-09-09 16:17:19,891 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file /data/nm_local/nmPrivate/container_1462370365985_0022_01_000002.tokens. 
Credentials list: 2016-09-09 16:17:19,892 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user hdfs 2016-09-09 16:17:19,894 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying from /data/nm_local/nmPrivate/container_1462370365985_0022_01_000002.tokens to /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022/container_1462370365985_0022_01_000002.tokens 2016-09-09 16:17:19,896 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Localizer CWD set to /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022 = file:/data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022 2016-09-09 16:17:19,960 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://liujunjie-1:8020/user/hdfs/.staging/job_1462370365985_0022/job.jar(->/data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022/filecache/10/job.jar) transitioned from DOWNLOADING to LOCALIZED 2016-09-09 16:17:19,982 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://liujunjie-1:8020/user/hdfs/.staging/job_1462370365985_0022/job.xml(->/data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022/filecache/11/job.xml) transitioned from DOWNLOADING to LOCALIZED 2016-09-09 16:17:19,983 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000002 transitioned from LOCALIZING to LOCALIZED 2016-09-09 16:17:20,016 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000002 transitioned from LOCALIZED to RUNNING 2016-09-09 16:17:20,023 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [bash, /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022/container_1462370365985_0022_01_000002/default_container_executor.sh] 2016-09-09 16:17:20,090 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Starting resource-monitoring for container_1462370365985_0022_01_000002 2016-09-09 16:17:20,121 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24911 for container-id container_1462370365985_0022_01_000002: 14.9 MB of 200 MB physical memory used; 568.1 MB of 420.0 MB virtual memory used 2016-09-09 16:17:23,196 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24911 for container-id container_1462370365985_0022_01_000002: 80.9 MB of 200 MB physical memory used; 618.8 MB of 420.0 MB virtual memory used 2016-09-09 16:17:26,234 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24911 for container-id container_1462370365985_0022_01_000002: 95.8 MB of 200 MB physical memory used; 642.4 MB of 420.0 MB virtual memory used 2016-09-09 16:17:27,677 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1462370365985_0022_000001 (auth:SIMPLE) 2016-09-09 16:17:27,683 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Stopping container with container Id: container_1462370365985_0022_01_000002 2016-09-09 16:17:27,683 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hdfs IP=10.144.65.182   
OPERATION=Stop Container Request TARGET=ContainerManageImpl      RESULT=SUCCESS  APPID=application_1462370365985_0022CONTAINERID=container_1462370365985_0022_01_000002 2016-09-09 16:17:27,684 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000002 transitioned from RUNNING to KILLING 2016-09-09 16:17:27,684 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1462370365985_0022_01_000002 2016-09-09 16:17:27,695 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1462370365985_0022_01_000002 is : 143 2016-09-09 16:17:27,722 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000002 transitioned from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL 2016-09-09 16:17:27,723 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022/container_1462370365985_0022_01_000002 2016-09-09 16:17:27,729 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hdfs OPERATION=Container Finished - Killed       TARGET=ContainerImpl    RESULT=SUCCESS  APPID=application_1462370365985_0022    CONTAINERID=container_1462370365985_0022_01_000002 2016-09-09 16:17:27,730 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000002 transitioned from CONTAINER_CLEANEDUP_AFTER_KILL to DONE 2016-09-09 16:17:27,730 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Removing container_1462370365985_0022_01_000002 from application application_1462370365985_0022 2016-09-09 16:17:27,730 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Considering container container_1462370365985_0022_01_000002 for log-aggregation 2016-09-09 16:17:27,730 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1462370365985_0022 2016-09-09 16:17:29,234 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping resource-monitoring for container_1462370365985_0022_01_000002 2016-09-09 16:17:30,591 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1462370365985_0022_000001 (auth:SIMPLE) 2016-09-09 16:17:30,598 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1462370365985_0022_01_000004 by user hdfs 2016-09-09 16:17:30,598 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1462370365985_0022_01_000004 to application application_1462370365985_0022 2016-09-09 16:17:30,598 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000004 transitioned from NEW to LOCALIZING 2016-09-09 16:17:30,598 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1462370365985_0022 2016-09-09 16:17:30,599 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_INIT for appId application_1462370365985_0022 2016-09-09 16:17:30,599 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got APPLICATION_INIT for service mapreduce_shuffle 2016-09-09 16:17:30,599 INFO org.apache.hadoop.mapred.ShuffleHandler: Added token for job_1462370365985_0022 2016-09-09 16:17:30,601 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hdfs IP=10.144.65.182   OPERATION=Start Container Request        TARGET=ContainerManageImpl      RESULT=SUCCESS  APPID=application_1462370365985_0022        CONTAINERID=container_1462370365985_0022_01_000004 2016-09-09 16:17:30,602 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000004 transitioned from LOCALIZING to LOCALIZED 2016-09-09 16:17:30,647 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000004 transitioned from LOCALIZED to RUNNING 2016-09-09 16:17:30,654 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [bash, /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022/container_1462370365985_0022_01_000004/default_container_executor.sh] 2016-09-09 16:17:30,690 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed completed containers from NM context: [container_1462370365985_0022_01_000002] 2016-09-09 16:17:32,235 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Starting resource-monitoring for container_1462370365985_0022_01_000004 2016-09-09 16:17:32,288 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24959 for container-id container_1462370365985_0022_01_000004: 60.9 MB of 200 MB physical memory used; 613.4 MB of 420.0 MB virtual memory used 2016-09-09 16:17:35,340 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24959 for container-id container_1462370365985_0022_01_000004: 91.7 MB of 200 MB physical memory used; 642.5 MB of 420.0 MB virtual memory used 2016-09-09 16:17:37,689 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils         at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.verifyRequest(ShuffleHandler.java:931)         at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:774)         at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)         at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148)         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)         at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)         at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)         at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)         at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)         at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)         at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)         at java.lang.Thread.run(Thread.java:745) 2016-09-09 16:17:37,689 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error [id: 0x2dd5f789, /10.144.31.84:33502 => /10.144.31.84:13562] EXCEPTION: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils 2016-09-09 16:17:38,358 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24959 for container-id container_1462370365985_0022_01_000004: 104.4 MB of 200 MB physical memory used; 660.5 MB of 420.0 MB virtual memory used 2016-09-09 16:17:41,373 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24959 for container-id container_1462370365985_0022_01_000004: 102.8 MB of 200 MB physical memory used; 660.5 MB of 420.0 MB virtual memory used 2016-09-09 16:17:44,388 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24959 for container-id container_1462370365985_0022_01_000004: 102.9 MB of 200 MB physical memory used; 660.5 MB of 420.0 MB virtual memory used 2016-09-09 16:17:47,415 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24959 for container-id container_1462370365985_0022_01_000004: 102.9 MB of 200 MB physical memory used; 660.5 MB of 420.0 MB virtual memory used 2016-09-09 16:17:50,430 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 24959 for container-id container_1462370365985_0022_01_000004: 102.9 MB of 200 MB physical memory used; 660.5 MB of 420.0 MB virtual memory used 2016-09-09 16:17:50,726 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils         at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.verifyRequest(ShuffleHandler.java:931)         at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:774)         at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)         at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148)         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)         at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)         at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)         at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)         at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)         at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)         at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)         at java.lang.Thread.run(Thread.java:745) 2016-09-09 16:17:50,727 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error [id: 0x28209d90, /10.144.31.84:33503 => /10.144.31.84:13562] EXCEPTION: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils 2016-09-09 16:17:50,792 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1462370365985_0022_000001 (auth:SIMPLE) 2016-09-09 16:17:50,797 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Stopping container with container Id: container_1462370365985_0022_01_000004 2016-09-09 16:17:50,797 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hdfs IP=10.144.65.182   OPERATION=Stop Container Request TARGET=ContainerManageImpl      RESULT=SUCCESS  APPID=application_1462370365985_0022CONTAINERID=container_1462370365985_0022_01_000004 2016-09-09 16:17:50,797 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000004 transitioned from RUNNING to KILLING 2016-09-09 16:17:50,797 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1462370365985_0022_01_000004 2016-09-09 16:17:50,809 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1462370365985_0022_01_000004 is : 143 2016-09-09 16:17:50,840 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000004 transitioned from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL 2016-09-09 16:17:50,842 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022/container_1462370365985_0022_01_000004 2016-09-09 16:17:50,843 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hdfs OPERATION=Container Finished - Killed       TARGET=ContainerImpl    RESULT=SUCCESS  APPID=application_1462370365985_0022    CONTAINERID=container_1462370365985_0022_01_000004 2016-09-09 16:17:50,843 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000004 transitioned from CONTAINER_CLEANEDUP_AFTER_KILL to DONE 2016-09-09 16:17:50,844 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Removing container_1462370365985_0022_01_000004 from application application_1462370365985_0022 2016-09-09 16:17:50,844 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Considering container container_1462370365985_0022_01_000004 for log-aggregation 2016-09-09 16:17:50,844 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1462370365985_0022 2016-09-09 16:17:53,431 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping 
resource-monitoring for container_1462370365985_0022_01_000004 2016-09-09 16:17:53,804 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed completed containers from NM context: [container_1462370365985_0022_01_000004] 2016-09-09 16:18:02,550 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils         at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.verifyRequest(ShuffleHandler.java:931)         at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:774)         at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)         at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148)         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)         at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)         at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)         at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)         at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)         at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)         at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)         at java.lang.Thread.run(Thread.java:745) 2016-09-09 16:18:02,550 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error [id: 0x6f015f87, /10.144.67.160:35874 => /10.144.31.84:13562] EXCEPTION: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils 2016-09-09 16:18:15,585 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils         at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.verifyRequest(ShuffleHandler.java:931)         at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:774)         at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)         at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148)         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)         at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)         at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)         at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)         at 
org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2016-09-09 16:18:15,586 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error [id: 0x3df07964, /10.144.67.160:35876 => /10.144.31.84:13562] EXCEPTION: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils
2016-09-09 16:18:25,350 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils
        at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.verifyRequest(ShuffleHandler.java:931)
        at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:774)
        at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
        at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2016-09-09 16:18:25,351 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error [id: 0x4326315f, /10.144.11.239:50452 => /10.144.31.84:13562] EXCEPTION: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils
2016-09-09 16:18:38,387 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils
        at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.verifyRequest(ShuffleHandler.java:931)
        at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:774)
        at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
        at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2016-09-09 16:18:38,387 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error [id: 0xefe46dd8, /10.144.11.239:50454 => /10.144.31.84:13562] EXCEPTION: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils
2016-09-09 16:18:40,847 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1462370365985_0022_000001 (auth:SIMPLE)
2016-09-09 16:18:40,854 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1462370365985_0022_01_000007 by user hdfs
2016-09-09 16:18:40,854 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1462370365985_0022_01_000007 to application application_1462370365985_0022
2016-09-09 16:18:40,854 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000007 transitioned from NEW to LOCALIZING
2016-09-09 16:18:40,854 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1462370365985_0022
2016-09-09 16:18:40,854 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_INIT for appId application_1462370365985_0022
2016-09-09 16:18:40,854 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got APPLICATION_INIT for service mapreduce_shuffle
2016-09-09 16:18:40,855 INFO org.apache.hadoop.mapred.ShuffleHandler: Added token for job_1462370365985_0022
2016-09-09 16:18:40,855 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000007 transitioned from LOCALIZING to LOCALIZED
2016-09-09 16:18:40,858 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hdfs IP=10.144.65.182 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1462370365985_0022 CONTAINERID=container_1462370365985_0022_01_000007
2016-09-09 16:18:40,883 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1462370365985_0022_01_000007 transitioned from LOCALIZED to RUNNING
2016-09-09 16:18:40,890 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [bash, /data/nm_local/usercache/hdfs/appcache/application_1462370365985_0022/container_1462370365985_0022_01_000007/default_container_executor.sh]
2016-09-09 16:18:41,433 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Starting resource-monitoring for container_1462370365985_0022_01_000007
2016-09-09 16:18:41,484 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 25019 for container-id container_1462370365985_0022_01_000007: 46.2 MB of 200 MB physical memory used; 584.0 MB of 420.0 MB virtual memory used
2016-09-09 16:18:44,512 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 25019 for container-id container_1462370365985_0022_01_000007: 84.4 MB of 200 MB physical memory used; 621.0 MB of 420.0 MB virtual memory used
2016-09-09 16:18:47,539 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 25019 for container-id container_1462370365985_0022_01_000007: 100.8 MB of 200 MB physical memory used; 643.4 MB of 420.0 MB virtual memory used
2016-09-09 16:18:47,898 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils
        at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.verifyRequest(ShuffleHandler.java:931)
        at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:774)
        at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
        at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2016-09-09 16:18:47,911 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error [id: 0xe2f5458e, /10.144.31.84:33507 => /10.144.31.84:13562] EXCEPTION: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils
2016-09-09 16:18:50,554 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 25019 for container-id container_1462370365985_0022_01_000007: 103.7 MB of 200 MB physical memory used; 660.5 MB of 420.0 MB virtual memory used
2016-09-09 16:18:53,586 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 25019 for container-id container_1462370365985_0022_01_000007: 101.4 MB of 200 MB physical memory used; 660.5 MB of 420.0 MB virtual memory used
2016-09-09 16:18:56,605 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 25019 for container-id container_1462370365985_0022_01_000007: 101.5 MB of 200 MB physical memory used; 660.5 MB of 420.0 MB virtual memory used
2016-09-09 16:18:59,619 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 25019 for container-id container_1462370365985_0022_01_000007: 101.5 MB of 200 MB physical memory used; 660.5 MB of 420.0 MB virtual memory used
2016-09-09 16:19:00,938 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils
        at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.verifyRequest(ShuffleHandler.java:931)
        at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:774)
        at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
        at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2016-09-09 16:19:00,939 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error [id: 0x0e57c63c, /10.144.31.
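
The stack traces above are from the NodeManager log: the mapreduce_shuffle auxiliary service, which runs inside the NodeManager process, throws NoClassDefFoundError for SecureShuffleUtils while verifying the reducer's fetch request, so fetches against that node keep failing. A minimal sketch to see how many NodeManagers hit this error (the log directory and file name pattern below are assumptions for a package-based install; adjust to the actual log location):

# Run on each NodeManager host; prints the number of missing-class errors per log file.
grep -c "NoClassDefFoundError: org/apache/hadoop/mapreduce/security/SecureShuffleUtils" \
    /var/log/hadoop-yarn/*nodemanager*.log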

唯思可达


I reran it and it failed again. I checked the logs with sudo -u hdfs yarn logs -applicationId application_1462370365985_0026 and found the same errors as before, except this time they are reported from liujunjie-1. How do I fix this, 冼老师?
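
A quick way to pull just those errors out of the aggregated logs for the new attempt (a sketch, assuming log aggregation is enabled; the application id is the one from this rerun):

# Keep only the shuffle-related lines plus one line of context on each side.
sudo -u hdfs yarn logs -applicationId application_1462370365985_0026 \
    | grep -B1 -A1 "Shuffle"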

fish - Hadooper


Which Hadoop did you install, CDH or Apache?

唯思可达


It's CDH. I reran it and it failed again. I checked the logs with sudo -u hdfs yarn logs -applicationId application_1462370365985_0026 and found the same errors as before, except this time they are reported from liujunjie-1. How do I fix this, 冼老师?

fish - Hadooper


What is mapreduce.application.classpath set to in mapred-site.xml?

唯思可达


    <name>mapreduce.application.classpath</name>
    <value>$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$MR2_CLASSPATH</value>
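
That value only works if $HADOOP_MAPRED_HOME and $MR2_CLASSPATH resolve to directories that actually contain the MapReduce jars on every node. A rough per-node sanity check (the /usr/lib/hadoop-mapreduce fallback below is an assumption based on the CDH package layout, and an interactive shell may not see the same environment as the daemons):

# Show what the two variables expand to and whether the client-core jar is where
# the classpath expects it.
echo "HADOOP_MAPRED_HOME=${HADOOP_MAPRED_HOME:-<unset>}"
echo "MR2_CLASSPATH=${MR2_CLASSPATH:-<unset>}"
ls "${HADOOP_MAPRED_HOME:-/usr/lib/hadoop-mapreduce}"/hadoop-mapreduce-client-core*.jar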

fish - Hadooper


Does the file /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-core.jar exist on the machine? What does jar -tf /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-core.jar | fgrep SecureShuffleUtils output?

唯思可达


[root@liujunjie-1 hadoop-mapreduce]# jar -tf hadoop-mapreduce-client-core.jar | fgrep SecureShuffleUtils
org/apache/hadoop/mapreduce/security/SecureShuffleUtils.class
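
Since the class is present in the jar on disk, a follow-up worth checking (a sketch, not a confirmed diagnosis) is whether the running NodeManager JVM, which hosts the mapreduce_shuffle service, actually has that jar or its directory on its classpath:

# Inspect the classpath argument of the running NodeManager process.
NM_PID=$(pgrep -f org.apache.hadoop.yarn.server.nodemanager.NodeManager | head -n1)
tr '\0' '\n' < /proc/"$NM_PID"/cmdline | tr ':' '\n' | grep hadoop-mapreduce
# The classpath often uses wildcard entries such as /usr/lib/hadoop-mapreduce/*,
# so match on the directory name rather than the exact jar file.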

fish - Hadooper


Let me log in to your machine and take a look. Is it all right if I modify configurations and restart services as needed?

唯思可达


That's fine. Just let me know what configuration you change.
