Hadoop HA + YARN cluster is configured, but running pi or wordcount hangs

After setting up the Hadoop HA + YARN cluster, running pi or wordcount hangs. I wanted to check the logs, but nothing gets printed in them. I'm quite stuck and would be very grateful for any help.
Below is what happens when running pi; jps shows all processes running normally.
# jps
20000 ResourceManager
8437 NameNode
21177 Jps
8313 JournalNode
# hadoop jar /root/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar pi 20 50
Number of Maps = 20
Samples per Map = 50
16/04/05 13:33:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Wrote input for Map #10
Wrote input for Map #11
Wrote input for Map #12
Wrote input for Map #13
Wrote input for Map #14
Wrote input for Map #15
Wrote input for Map #16
Wrote input for Map #17
Wrote input for Map #18
Wrote input for Map #19
Starting Job
16/04/05 13:33:31 INFO client.RMProxy: Connecting to ResourceManager at owenyang00/10.144.81.241:8032
16/04/05 13:33:32 INFO input.FileInputFormat: Total input paths to process : 20
16/04/05 13:33:32 INFO mapreduce.JobSubmitter: number of splits:20
16/04/05 13:33:33 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1459824772101_0002
16/04/05 13:33:33 INFO impl.YarnClientImpl: Submitted application application_1459824772101_0002
16/04/05 13:33:33 INFO mapreduce.Job: The url to track the job: http://owenyang00:8088/proxy/application_1459824772101_0002/
16/04/05 13:33:33 INFO mapreduce.Job: Running job: job_1459824772101_0002
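One quick way to see why a submission stalls at this point (a minimal sketch, assuming the yarn CLI from the same Hadoop installation is on the PATH) is to check whether the application is stuck in the ACCEPTED state, either via the tracking URL printed above or from the command line:
yarn application -list
A job that never leaves the ACCEPTED state usually means the cluster cannot allocate a container for its ApplicationMaster.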

wangxiaolei

Upvoted by: owenyang

Add this to mapred-site.xml:
{{{
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>200</value>
</property>
}}}
No need to restart the services; just run pi again and see.
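The likely reason this helps (my reading of the symptom, not stated explicitly in the reply): yarn.app.mapreduce.am.resource.mb defaults to 1536 MB, while yarn.nodemanager.resource.memory-mb is set to 1024 MB in the yarn-site.xml posted below, so the MRAppMaster container can never be placed on any node and the job sits at "Running job" forever. A hedged mapred-site.xml sketch for a cluster this small, where the map/reduce values are illustrative assumptions rather than taken from this thread:

<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>200</value>
  <!-- must fit inside yarn.nodemanager.resource.memory-mb (1024 MB on these nodes) -->
</property>
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>256</value>
  <!-- illustrative value; keep task containers small as well -->
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>256</value>
  <!-- illustrative value -->
</property>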

fish - Hadooper

Upvoted by: owenyang

In yarn-site.xml, set yarn.nodemanager.vmem-pmem-ratio to a larger value, for example 200:

<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>200</value>
</property>
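yarn.nodemanager.vmem-pmem-ratio is read by the NodeManagers, so the updated yarn-site.xml has to reach every NodeManager host and the NodeManagers normally need to be restarted before the new ratio applies. A hedged sketch, using the hostnames and paths that appear elsewhere in this thread:

# copy the updated config to the NodeManager hosts
for h in owenyang01 owenyang02 owenyang03; do
  scp /root/hadoop/etc/hadoop/yarn-site.xml $h:/root/hadoop/etc/hadoop/
done
# then restart the NodeManager on each of those hosts so it picks up the new value
/root/hadoop/sbin/yarn-daemon.sh stop nodemanager
/root/hadoop/sbin/yarn-daemon.sh start nodemanager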

wangxiaolei

Upvoted by: owenyang

How is it going now?

owenyang - 在我青年

Upvoted by: wangxiaolei

==========run==========
# hadoop jar /root/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar pi **100 100000000**

=========result=====
Job Finished in 613.7 seconds
Estimated value of Pi is **3.14159264920000000000**
===========================


 
Look for the ResourceManager log in the logs directory and paste the last part of it here.
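For reference, a minimal sketch of pulling out that log tail (the path assumes the default log directory under /root/hadoop/logs, which matches the container log paths shown later in this thread):

tail -n 100 /root/hadoop/logs/yarn-root-resourcemanager-owenyang00.log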

owenyang - 在我青年

In the log I found "Node not found resyncing":
2016-04-05 13:50:11,817 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@owenyang00:8088
2016-04-05 13:50:11,817 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app /cluster started at 8088
2016-04-05 13:50:12,894 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: Node not found resyncing owenyang03:57386
2016-04-05 13:50:12,897 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: Node not found resyncing owenyang01:42718
2016-04-05 13:50:12,903 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: Node not found resyncing owenyang02:50116
Below are the RM logs:
---------- yarn-root-resourcemanager-owenyang00.log--------------------------------------------------
2016-04-05 13:50:11,795 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
2016-04-05 13:50:11,796 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2016-04-05 13:50:11,800 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
2016-04-05 13:50:11,808 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2016-04-05 13:50:11,808 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2016-04-05 13:50:11,817 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@owenyang00:8088
2016-04-05 13:50:11,817 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app /cluster started at 8088
2016-04-05 13:50:12,894 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: Node not found resyncing owenyang03:57386
2016-04-05 13:50:12,897 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: Node not found resyncing owenyang01:42718
2016-04-05 13:50:12,903 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: Node not found resyncing owenyang02:50116
2016-04-05 13:50:12,949 INFO org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2016-04-05 13:50:13,012 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-04-05 13:50:13,013 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8033
2016-04-05 13:50:13,023 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.server.api.ResourceManagerAdministrationProtocolPB to the server
2016-04-05 13:50:13,024 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2016-04-05 13:50:13,024 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8033: starting
2016-04-05 13:50:13,960 INFO org.apache.hadoop.yarn.util.RackResolver: Resolved owenyang03 to /default-rack
2016-04-05 13:50:13,961 INFO org.apache.hadoop.yarn.util.RackResolver: Resolved owenyang02 to /default-rack
2016-04-05 13:50:13,975 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: NodeManager from node owenyang02(cmPort: 50116 httpPort: 8042) registered with capability: , assigned nodeId owenyang02:50116
2016-04-05 13:50:13,969 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: NodeManager from node owenyang03(cmPort: 57386 httpPort: 8042) registered with capability: , assigned nodeId owenyang03:57386
2016-04-05 13:50:13,969 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: owenyang03:57386 Node Transitioned from NEW to RUNNING
2016-04-05 13:50:13,964 INFO org.apache.hadoop.yarn.util.RackResolver: Resolved owenyang01 to /default-rack
2016-04-05 13:50:13,980 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: NodeManager from node owenyang01(cmPort: 42718 httpPort: 8042) registered with capability: , assigned nodeId owenyang01:42718
2016-04-05 13:50:13,984 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: owenyang02:50116 Node Transitioned from NEW to RUNNING
2016-04-05 13:50:13,984 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: owenyang01:42718 Node Transitioned from NEW to RUNNING
2016-04-05 13:50:13,991 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added node owenyang03:57386 cluster capacity:
2016-04-05 13:50:13,993 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added node owenyang02:50116 cluster capacity:
2016-04-05 13:50:14,004 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added node owenyang01:42718 cluster capacity:

 
Please paste the contents of yarn-site.xml and mapred-site.xml.
 

owenyang - 在我青年

yarn-site.xml
--------
<configuration>

  <property>
    <name>yarn.resourcemanager.address</name>
    <value>owenyang00:8032</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>owenyang00:8030</value>
  </property>

  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>owenyang00:8088</value>
  </property>

  <property>
    <name>yarn.resourcemanager.webapp.https.address</name>
    <value>owenyang00:8090</value>
  </property>

  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>owenyang00:8031</value>
  </property>

  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>owenyang00:8033</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>

  <property>
    <name>yarn.scheduler.fair.allocation.file</name>
    <value>${yarn.home.dir}/etc/hadoop/fairscheduler.xml</value>
  </property>

  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/home/owenyang/hadoop/yarn/local</value>
  </property>

  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>

  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/tmp/yarn-log</value>
  </property>

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>1024</value>
  </property>

  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>1</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

</configuration>
=============
--------mapred-site.xml-----------------------------------------------
<configuration>

  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>The runtime framework for executing MapReduce jobs.
    Can be one of local, classic or yarn.</description>
  </property>

  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>owenyang01:10020</value>
    <description>MapReduce JobHistory Server IPC host:port</description>
  </property>

  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>owenyang01:19888</value>
    <description>MapReduce JobHistory Server Web UI host:port</description>
  </property>

</configuration>
===========
 

owenyang - 在我青年

I added it, then scp'd it to the other three machines. I did not restart the NN, DN, RM, or NM, and ran pi directly.
[The key information is in the Diagnostics line: Container is running beyond virtual memory limits. Current usage: 95.4 MB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used. Killing container.]
=============full terminal output of the run======================
# hadoop jar /root/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar pi 20 50
Number of Maps  = 20
Samples per Map = 50
16/04/05 15:00:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Wrote input for Map #10
Wrote input for Map #11
Wrote input for Map #12
Wrote input for Map #13
Wrote input for Map #14
Wrote input for Map #15
Wrote input for Map #16
Wrote input for Map #17
Wrote input for Map #18
Wrote input for Map #19
Starting Job
16/04/05 15:00:15 INFO client.RMProxy: Connecting to ResourceManager at owenyang00/10.144.81.241:8032
16/04/05 15:00:16 INFO input.FileInputFormat: Total input paths to process : 20
16/04/05 15:00:16 INFO mapreduce.JobSubmitter: number of splits:20
16/04/05 15:00:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1459835410485_0002
16/04/05 15:00:17 INFO impl.YarnClientImpl: Submitted application application_1459835410485_0002
16/04/05 15:00:17 INFO mapreduce.Job: The url to track the job: http://owenyang00:8088/proxy/application_1459835410485_0002/
16/04/05 15:00:17 INFO mapreduce.Job: Running job: job_1459835410485_0002
16/04/05 15:00:35 INFO mapreduce.Job: Job job_1459835410485_0002 running in uber mode : false
16/04/05 15:00:35 INFO mapreduce.Job:  map 0% reduce 0%
16/04/05 15:00:35 INFO mapreduce.Job: Job job_1459835410485_0002 failed with state FAILED due to: Application application_1459835410485_0002 failed 2 times due to AM Container for appattempt_1459835410485_0002_000002 exited with  exitCode: -103
For more detailed output, check application tracking page:http://owenyang00:8088/proxy/application_1459835410485_0002/Then, click on links to logs of each attempt.
Diagnostics: Container is running beyond virtual memory limits. Current usage: 95.4 MB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1459835410485_0002_02_000001 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 7447 7439 7439 7439 (java) 441 16 2812837888 24125 /software/jdk1.8.0_77/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/root/hadoop/logs/userlogs/application_1459835410485_0002/container_1459835410485_0002_02_000001 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster
        |- 7439 7437 7439 7439 (bash) 0 0 108650496 295 /bin/bash -c /software/jdk1.8.0_77/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/root/hadoop/logs/userlogs/application_1459835410485_0002/container_1459835410485_0002_02_000001 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA  -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/root/hadoop/logs/userlogs/application_1459835410485_0002/container_1459835410485_0002_02_000001/stdout 2>/root/hadoop/logs/userlogs/application_1459835410485_0002/container_1459835410485_0002_02_000001/stderr

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt. Failing the application.
16/04/05 15:00:35 INFO mapreduce.Job: Counters: 0
Job Finished in 19.854 seconds
java.io.FileNotFoundException: File does not exist: hdfs://owenyang00:8020/user/root/QuasiMonteCarlo_1459839611523_1374794002/out/reduce-out
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
        at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1817)
        at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1841)
        at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
        at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
#

 
Run yarn logs -applicationId application_1459835410485_0002
and see what information it shows.
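A sketch of how to use it (this assumes the application has already finished; yarn logs reads the aggregated logs, and yarn.log-aggregation-enable is set to true in the yarn-site.xml above):

# recover the application id if it is no longer on screen
yarn application -list -appStates FAILED
# dump all aggregated container logs for that application
yarn logs -applicationId application_1459835410485_0002 | less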

owenyang - 在我青年

Thank you very much! The problem is solved!

Teacher: could you explain why pi fails to run with the basic HA + YARN configuration? Did Mr. Dong set the configuration up this way on purpose?
I can look up this configuration option and figure it out myself, but I hope you can also spend a few minutes on it in class. Many thanks!
====================================================


In yarn-site.xml, I set yarn.nodemanager.vmem-pmem-ratio to 200:

<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>200</value>
</property>
============ run output | SUCCESS!!! ==========
# scp /root/hadoop/etc/hadoop/yarn-site.xml owenyang02:/root/hadoop/etc/hadoop/
yarn-site.xml                                 100% 2148     2.1KB/s   00:00
# scp /root/hadoop/etc/hadoop/yarn-site.xml owenyang03:/root/hadoop/etc/hadoop/
yarn-site.xml                                 100% 2148     2.1KB/s   00:00
# vim /root/hadoop/etc/hadoop/yarn-site.xml
# hadoop jar /root/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar pi 20 50
Number of Maps  = 20
Samples per Map = 50
16/04/05 16:16:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Wrote input for Map #10
Wrote input for Map #11
Wrote input for Map #12
Wrote input for Map #13
Wrote input for Map #14
Wrote input for Map #15
Wrote input for Map #16
Wrote input for Map #17
Wrote input for Map #18
Wrote input for Map #19
Starting Job
16/04/05 16:16:58 INFO client.RMProxy: Connecting to ResourceManager at owenyang00/10.144.81.241:8032
16/04/05 16:17:00 INFO input.FileInputFormat: Total input paths to process : 20
16/04/05 16:17:00 INFO mapreduce.JobSubmitter: number of splits:20
16/04/05 16:17:00 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1459835410485_0003
16/04/05 16:17:01 INFO impl.YarnClientImpl: Submitted application application_1459835410485_0003
16/04/05 16:17:01 INFO mapreduce.Job: The url to track the job: http://owenyang00:8088/proxy/application_1459835410485_0003/
16/04/05 16:17:01 INFO mapreduce.Job: Running job: job_1459835410485_0003
16/04/05 16:17:24 INFO mapreduce.Job: Job job_1459835410485_0003 running in uber mode : false
16/04/05 16:17:24 INFO mapreduce.Job:  map 0% reduce 0%
16/04/05 16:17:31 INFO mapreduce.Job:  map 10% reduce 0%
16/04/05 16:17:37 INFO mapreduce.Job:  map 20% reduce 0%
16/04/05 16:17:43 INFO mapreduce.Job:  map 30% reduce 0%
16/04/05 16:17:49 INFO mapreduce.Job:  map 40% reduce 0%
16/04/05 16:17:56 INFO mapreduce.Job:  map 50% reduce 0%
16/04/05 16:18:02 INFO mapreduce.Job:  map 60% reduce 0%
16/04/05 16:18:08 INFO mapreduce.Job:  map 65% reduce 0%
16/04/05 16:18:13 INFO mapreduce.Job:  map 65% reduce 22%
16/04/05 16:18:14 INFO mapreduce.Job:  map 70% reduce 22%
16/04/05 16:18:16 INFO mapreduce.Job:  map 70% reduce 23%
16/04/05 16:18:20 INFO mapreduce.Job:  map 75% reduce 23%
16/04/05 16:18:22 INFO mapreduce.Job:  map 75% reduce 25%
16/04/05 16:18:26 INFO mapreduce.Job:  map 80% reduce 25%
16/04/05 16:18:28 INFO mapreduce.Job:  map 80% reduce 27%
16/04/05 16:18:32 INFO mapreduce.Job:  map 85% reduce 27%
16/04/05 16:18:34 INFO mapreduce.Job:  map 85% reduce 28%
16/04/05 16:18:37 INFO mapreduce.Job:  map 90% reduce 28%
16/04/05 16:18:40 INFO mapreduce.Job:  map 90% reduce 30%
16/04/05 16:18:43 INFO mapreduce.Job:  map 95% reduce 30%
16/04/05 16:18:46 INFO mapreduce.Job:  map 95% reduce 32%
16/04/05 16:18:49 INFO mapreduce.Job:  map 100% reduce 32%
16/04/05 16:18:51 INFO mapreduce.Job:  map 100% reduce 100%
16/04/05 16:18:51 INFO mapreduce.Job: Job job_1459835410485_0003 completed successfully
16/04/05 16:18:51 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=446
                FILE: Number of bytes written=2265202
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=5310
                HDFS: Number of bytes written=215
                HDFS: Number of read operations=83
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=3
        Job Counters
                Launched map tasks=20
                Launched reduce tasks=1
                Data-local map tasks=20
                Total time spent by all maps in occupied slots (ms)=84917
                Total time spent by all reduces in occupied slots (ms)=47593
                Total time spent by all map tasks (ms)=84917
                Total time spent by all reduce tasks (ms)=47593
                Total vcore-seconds taken by all map tasks=84917
                Total vcore-seconds taken by all reduce tasks=47593
                Total megabyte-seconds taken by all map tasks=86955008
                Total megabyte-seconds taken by all reduce tasks=48735232
        Map-Reduce Framework
                Map input records=20
                Map output records=40
                Map output bytes=360
                Map output materialized bytes=560
                Input split bytes=2950
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=560
                Reduce input records=40
                Reduce output records=0
                Spilled Records=80
                Shuffled Maps =20
                Failed Shuffles=0
                Merged Map outputs=20
                GC time elapsed (ms)=1704
                CPU time spent (ms)=9100
                Physical memory (bytes) snapshot=4066738176
                Virtual memory (bytes) snapshot=43205808128
                Total committed heap usage (bytes)=2737192960
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=2360
        File Output Format Counters
                Bytes Written=97
Job Finished in 113.221 seconds
Estimated value of Pi is 3.14800000000000000000
#

 

owenyang - 在我青年

Thank you very much! The problem is solved!

Teacher: could you explain why pi fails to run with the basic HA + YARN configuration? Was the configuration deliberately set up this way by Mr. Dong?
I can look up what this configuration option actually controls myself, but I hope you can also spend a few minutes in class going over it systematically, so the problem becomes a way into both the theory and the practice. Thank you!
===================================

owenyang - 在我青年

**Right after that, wordcount also ran successfully. Below are the full details from the terminal:**
=======creating input folder and adding related content=======
hadoop fs -mkdir input
hadoop fs -ls
hadoop fs -put file/file*.txt input
**=====HDFS input file contents=====
more file1.txt
hello hadoop

more file2.txt
hello world**
===========================
# **hadoop fs -ls**
16/04/05 18:59:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
**drwxr-xr-x   - root supergroup          0 2016-04-04 20:04 input**
# **hadoop jar /root/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar wordcount input output**
16/04/05 18:59:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/04/05 18:59:24 INFO client.RMProxy: Connecting to ResourceManager at owenyang00/10.144.81.241:8032
16/04/05 18:59:26 INFO input.FileInputFormat: Total input paths to process : 2
16/04/05 18:59:27 INFO mapreduce.JobSubmitter: number of splits:2
16/04/05 18:59:27 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1459835410485_0005
16/04/05 18:59:27 INFO impl.YarnClientImpl: Submitted application application_1459835410485_0005
16/04/05 18:59:27 INFO mapreduce.Job: The url to track the job: http://owenyang00:8088/proxy/application_1459835410485_0005/
16/04/05 18:59:27 INFO mapreduce.Job: Running job: job_1459835410485_0005
16/04/05 18:59:46 INFO mapreduce.Job: Job job_1459835410485_0005 running in uber mode : false
16/04/05 18:59:46 INFO mapreduce.Job:  map 0% reduce 0%
16/04/05 18:59:53 INFO mapreduce.Job:  map 100% reduce 0%
16/04/05 19:00:00 INFO mapreduce.Job:  map 100% reduce 100%
16/04/05 19:00:00 INFO mapreduce.Job: Job job_1459835410485_0005 completed successfully
16/04/05 19:00:00 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=55
                FILE: Number of bytes written=322603
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=251
                HDFS: Number of bytes written=25
                HDFS: Number of read operations=9
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=2
                Launched reduce tasks=1
                Data-local map tasks=2
                Total time spent by all maps in occupied slots (ms)=8748
                Total time spent by all reduces in occupied slots (ms)=4518
                Total time spent by all map tasks (ms)=8748
                Total time spent by all reduce tasks (ms)=4518
                Total vcore-seconds taken by all map tasks=8748
                Total vcore-seconds taken by all reduce tasks=4518
                Total megabyte-seconds taken by all map tasks=8957952
                Total megabyte-seconds taken by all reduce tasks=4626432
        Map-Reduce Framework
                Map input records=2
                Map output records=4
                Map output bytes=41
                Map output materialized bytes=61
                Input split bytes=226
                Combine input records=4
                Combine output records=4
                Reduce input groups=3
                Reduce shuffle bytes=61
                Reduce input records=4
                Reduce output records=3
                Spilled Records=8
                Shuffled Maps =2
                Failed Shuffles=0
                Merged Map outputs=2
                GC time elapsed (ms)=252
                CPU time spent (ms)=1530
                Physical memory (bytes) snapshot=491528192
                Virtual memory (bytes) snapshot=6179708928
                Total committed heap usage (bytes)=301146112
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=25
        File Output Format Counters
                Bytes Written=25
# **hadoop fs -ls**
16/04/05 19:00:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
**drwxr-xr-x   - root supergroup          0 2016-04-04 20:04 input
drwxr-xr-x   - root supergroup          0 2016-04-05 18:59 output**
# hadoop fs -ls output/
16/04/05 19:00:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
**-rw-r--r--   3 root supergroup          0 2016-04-05 18:59 output/_SUCCESS
-rw-r--r--   3 root supergroup         25 2016-04-05 18:59 output/part-r-00000**
# **hadoop fs -cat output/part-r-00000**
16/04/05 19:01:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
**hadoop  1
hello   2
world   1**
#
**PS: if you run wordcount again right away, it will fail, because the output directory already exists in HDFS. Delete it first with the command below:
hadoop fs -rm -r -f output
After that it runs normally again; alternatively, specify an output directory name other than output.

Done**
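To illustrate the two options above (a hedged sketch; output2 is just an example name):

hadoop fs -rm -r -f output
hadoop jar /root/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar wordcount input output2

Without one of these, the second run fails almost immediately with a FileAlreadyExistsException reporting that the output directory already exists.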
 

owenyang - 在我青年

========================
# hadoop jar /root/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar pi 20 **500**
**When the second argument was changed from 50 to 500, it failed with:**
Application application_1459835410485_0006 failed 2 times due to AM Container for appattempt_1459835410485_0006_000002 exited with exitCode: -103

For more detailed output, check application tracking page:http://owenyang00:8088/proxy/application_1459835410485_0006/Then, click on links to logs of each attempt.

Diagnostics: Container is running **beyond virtual memory limits. Current usage: 92.6 MB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used.** **Killing container.**

Dump of the process-tree for container_1459835410485_0006_02_000001 :

|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE

|- 9265 9263 9265 9265 (bash) 0 0 108650496 295 /bin/bash -c /software/jdk1.8.0_77/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/root/hadoop/logs/userlogs/application_1459835410485_0006/container_1459835410485_0006_02_000001 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/root/hadoop/logs/userlogs/application_1459835410485_0006/container_1459835410485_0006_02_000001/stdout 2>/root/hadoop/logs/userlogs/application_1459835410485_0006/container_1459835410485_0006_02_000001/stderr

|- 9273 9265 9265 9265 (java) 405 13 2809688064 23413 /software/jdk1.8.0_77/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/root/hadoop/logs/userlogs/application_1459835410485_0006/container_1459835410485_0006_02_000001 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster



Container killed on request. Exit code is 143

Container exited with a non-zero exit code 143

Failing this attempt. Failing the application.
=========================================================================
**What values of yarn.nodemanager.vmem-pmem-ratio and yarn.nodemanager.resource.memory-mb in yarn-site.xml are actually appropriate is something I still need to study.
See Mr. Dong's blog post:** http://dongxicheng.org/mapreduce-nextgen/hadoop-yarn-memory-cpu-scheduling/
**(1) yarn.nodemanager.resource.memory-mb

The total amount of physical memory on this node that YARN may use; the default is 8192 (MB). Note that if the node has less than 8 GB of memory you must lower this value yourself, since YARN does not detect the node's physical memory automatically.

(2) yarn.nodemanager.vmem-pmem-ratio

The maximum amount of virtual memory a task may use for each 1 MB of physical memory it uses; the default is 2.1.
==========================**
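A quick back-of-the-envelope check against the error above (my own arithmetic, based on the figures in the diagnostics): the AM container was allocated 1 GB of physical memory, so with the default ratio of 2.1 its virtual memory limit is 1 GB * 2.1 = 2.1 GB. The JDK 1.8 MRAppMaster process had reserved about 2.7 GB of virtual address space, as shown in the process-tree dump, which exceeds that limit, so the NodeManager killed the container. Raising the ratio to 200 lifts the limit to roughly 200 GB, which is why the job then runs to completion.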
Whether to launch a thread that checks how much virtual memory each task is using and kills the task outright if it exceeds its allocation; the default is true.
In yarn-site.xml, set it to false:
{{{
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
}}}

owenyang - 在我青年

**In yarn-site.xml, I added the property and set it to false:**

<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>

**Then I ran pi 20 500 again, and it succeeded!**
# **hadoop jar /root/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar pi 20 500**
Number of Maps  = 20
Samples per Map = 500
16/04/05 23:02:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Wrote input for Map #10
Wrote input for Map #11
Wrote input for Map #12
Wrote input for Map #13
Wrote input for Map #14
Wrote input for Map #15
Wrote input for Map #16
Wrote input for Map #17
Wrote input for Map #18
Wrote input for Map #19
Starting Job
16/04/05 23:02:30 INFO client.RMProxy: Connecting to ResourceManager at owenyang00/10.144.81.241:8032
16/04/05 23:02:31 INFO input.FileInputFormat: Total input paths to process : 20
16/04/05 23:02:31 INFO mapreduce.JobSubmitter: number of splits:20
16/04/05 23:02:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1459835410485_0008
16/04/05 23:02:32 INFO impl.YarnClientImpl: Submitted application application_1459835410485_0008
16/04/05 23:02:32 INFO mapreduce.Job: The url to track the job: http://owenyang00:8088/proxy/application_1459835410485_0008/
16/04/05 23:02:32 INFO mapreduce.Job: Running job: job_1459835410485_0008
16/04/05 23:02:46 INFO mapreduce.Job: Job job_1459835410485_0008 running in uber mode : false
16/04/05 23:02:46 INFO mapreduce.Job:  map 0% reduce 0%
16/04/05 23:02:53 INFO mapreduce.Job:  map 10% reduce 0%
16/04/05 23:02:58 INFO mapreduce.Job:  map 20% reduce 0%
16/04/05 23:03:04 INFO mapreduce.Job:  map 30% reduce 0%
16/04/05 23:03:10 INFO mapreduce.Job:  map 40% reduce 0%
16/04/05 23:03:16 INFO mapreduce.Job:  map 50% reduce 0%
16/04/05 23:03:22 INFO mapreduce.Job:  map 60% reduce 0%
16/04/05 23:03:28 INFO mapreduce.Job:  map 65% reduce 0%
16/04/05 23:03:34 INFO mapreduce.Job:  map 70% reduce 22%
16/04/05 23:03:37 INFO mapreduce.Job:  map 70% reduce 23%
16/04/05 23:03:40 INFO mapreduce.Job:  map 75% reduce 23%
16/04/05 23:03:43 INFO mapreduce.Job:  map 75% reduce 25%
16/04/05 23:03:46 INFO mapreduce.Job:  map 80% reduce 25%
16/04/05 23:03:49 INFO mapreduce.Job:  map 80% reduce 27%
16/04/05 23:03:52 INFO mapreduce.Job:  map 85% reduce 27%
16/04/05 23:03:55 INFO mapreduce.Job:  map 85% reduce 28%
16/04/05 23:03:58 INFO mapreduce.Job:  map 90% reduce 28%
16/04/05 23:04:01 INFO mapreduce.Job:  map 90% reduce 30%
16/04/05 23:04:05 INFO mapreduce.Job:  map 95% reduce 30%
16/04/05 23:04:08 INFO mapreduce.Job:  map 95% reduce 32%
16/04/05 23:04:11 INFO mapreduce.Job:  map 100% reduce 32%
16/04/05 23:04:12 INFO mapreduce.Job:  map 100% reduce 100%
16/04/05 23:04:13 INFO mapreduce.Job: Job job_1459835410485_0008 completed successfully
16/04/05 23:04:13 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=446
                FILE: Number of bytes written=2265097
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=5310
                HDFS: Number of bytes written=215
                HDFS: Number of read operations=83
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=3
        Job Counters
                Launched map tasks=20
                Launched reduce tasks=1
                Data-local map tasks=20
                Total time spent by all maps in occupied slots (ms)=83867
                Total time spent by all reduces in occupied slots (ms)=47586
                Total time spent by all map tasks (ms)=83867
                Total time spent by all reduce tasks (ms)=47586
                Total vcore-seconds taken by all map tasks=83867
                Total vcore-seconds taken by all reduce tasks=47586
                Total megabyte-seconds taken by all map tasks=85879808
                Total megabyte-seconds taken by all reduce tasks=48728064
        Map-Reduce Framework
                Map input records=20
                Map output records=40
                Map output bytes=360
                Map output materialized bytes=560
                Input split bytes=2950
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=560
                Reduce input records=40
                Reduce output records=0
                Spilled Records=80
                Shuffled Maps =20
                Failed Shuffles=0
                Merged Map outputs=20
                GC time elapsed (ms)=1725
                CPU time spent (ms)=9260
                Physical memory (bytes) snapshot=4063383552
                Virtual memory (bytes) snapshot=43210629120
                Total committed heap usage (bytes)=2737192960
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=2360
        File Output Format Counters
                Bytes Written=97
Job Finished in 103.255 seconds
**Estimated value of Pi is 3.14080000000000000000**
#

 
