Flume Advanced: Writing Log Files to HDFS

Configure Flume to pick up log files from a specified directory and write them into HDFS.
Steps
1: Create the target directory /flume/log in HDFS with the hadoop command
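For example (assuming the Hadoop client shown later at /hadoop/hadoop-1.1.1 is on the PATH), the directory can be created with:
[root@pg1 ~]# hadoop fs -mkdir /flume/log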
2: Copy hadoop-core-1.1.1.jar into flume/lib
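Assuming the install paths that appear in the startup output below (/hadoop/hadoop-1.1.1 and /flume/apache-flume-1.4.0-bin), the copy looks like:
[root@pg1 ~]# cp /hadoop/hadoop-1.1.1/hadoop-core-1.1.1.jar /flume/apache-flume-1.4.0-bin/lib/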
3: Create a configuration file hdfs.conf as follows
agent1.sources = spooldirSource
agent1.channels = memoryChannel
agent1.sinks = hdfsSink

agent1.sources.spooldirSource.type=spooldir
agent1.sources.spooldirSource.spoolDir=/tmp/flume
agent1.sources.spooldirSource.channels=memoryChannel

agent1.sinks.hdfsSink.type=hdfs
agent1.sinks.hdfsSink.hdfs.path=hdfs://pg2:9000/flume/log
agent1.sinks.hdfsSink.hdfs.filePrefix=log-
agent1.sinks.hdfsSink.channel=memoryChannel

agent1.channels.memoryChannel.type=memory
agent1.channels.memoryChannel.capacity=100
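The sink above relies on the HDFS sink defaults: SequenceFile output and a new file rolled every 30 seconds, 1024 bytes, or 10 events. If plain-text files and time-based rolling only are wanted, the following optional properties could be added (the values are illustrative and are not part of the setup used in this post):
agent1.sinks.hdfsSink.hdfs.fileType=DataStream
agent1.sinks.hdfsSink.hdfs.writeFormat=Text
agent1.sinks.hdfsSink.hdfs.rollInterval=60
agent1.sinks.hdfsSink.hdfs.rollSize=0
agent1.sinks.hdfsSink.hdfs.rollCount=0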
4: Start the agent
[root@pg1 apache-flume-1.4.0-bin]# ./bin/flume-ng agent -n agent1 -c conf -f conf/hdfs.conf
Info: Sourcing environment configuration script /flume/apache-flume-1.4.0-bin/conf/flume-env.sh
Info: Including Hadoop libraries found via (/hadoop/hadoop-1.1.1/bin/hadoop) for HDFS access
Info: Excluding /hadoop/hadoop-1.1.1/libexec/../lib/slf4j-api-1.4.3.jar from classpath
Info: Excluding /hadoop/hadoop-1.1.1/libexec/../lib/slf4j-log4j12-1.4.3.jar from classpath
Info: Including HBASE libraries found via (/hbase/hbase-0.94.4/bin/hbase) for HBASE access
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/flume/tools/GetJavaProperty
Caused by: java.lang.ClassNotFoundException: org.apache.flume.tools.GetJavaProperty
 at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.flume.tools.GetJavaProperty.  Program will exit.
Info: Excluding /hbase/hbase-0.94.4/lib/slf4j-api-1.4.3.jar from classpath
Info: Excluding /hbase/hbase-0.94.4/lib/slf4j-log4j12-1.4.3.jar from classpath
Info: Excluding /hadoop/hadoop-1.1.1/libexec/../lib/slf4j-api-1.4.3.jar from classpath
Info: Excluding /hadoop/hadoop-1.1.1/libexec/../lib/slf4j-log4j12-1.4.3.jar from classpath
 
The class reported in the error, org/apache/flume/tools/GetJavaProperty, is packaged in:
flume-ng-core-1.4.0.jar
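A quick sanity check (a hedged suggestion, not part of the original steps) is to confirm that this jar is actually present under Flume's lib directory, since the flume-ng launcher has to be able to find it on the classpath it builds:
[root@pg1 apache-flume-1.4.0-bin]# ls lib/flume-ng-core-1.4.0.jar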
5: Test
[root@pg1 conf]# cd /tmp/flume
[root@pg1 flume]# ls
[root@pg1 flume]# echo "Test hello flume write data to hdfs">test123.txt
[root@pg1 flume]# ls
test123.txt.COMPLETED 
[root@pg1 flume]#
The file has been renamed with the .COMPLETED suffix, which shows the spooling directory source has already processed it.
6: Check the Flume log under $FLUME_HOME/logs
2013 08:19:42,859 INFO  [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:145)  - Starting Channel memoryChannel
2013 08:19:42,916 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.register:110)  - Monitoried counter group for type: CHANNEL, name: memoryChannel, registered successfully.
2013 08:19:42,917 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:94)  - Component type: CHANNEL, name: memoryChannel started
2013 08:19:42,917 INFO  [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:173)  - Starting Sink hdfsSink
2013 08:19:42,918 INFO  [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:184)  - Starting Source spooldirSource
2013 08:19:42,921 INFO  [lifecycleSupervisor-1-1] (org.apache.flume.instrumentation.MonitoredCounterGroup.register:110)  - Monitoried counter group for type: SINK, name: hdfsSink, registered successfully.
2013 08:19:42,921 INFO  [lifecycleSupervisor-1-1] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:94)  - Component type: SINK, name: hdfsSink started
7: Check the file in HDFS
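To verify the result, list the target directory and dump the rolled file (the file name below is illustrative; the HDFS sink appends a timestamp to the configured log- prefix, and with the default SequenceFile format hadoop fs -text is more readable than -cat):
[root@pg1 flume]# hadoop fs -ls /flume/log
[root@pg1 flume]# hadoop fs -text /flume/log/log-*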
