[Apache Nutch Series] Nutch 2.0 Configuration and Installation Exceptions Roundup
Published: 2019-06-19


1. java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration

[plain]
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
        at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:108)
        at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
        at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
        at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
        at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
        at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:221)
        at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
        at org.apache.nutch.crawl.Crawler.run(Crawler.java:136)
        at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        ... 11 more
The official documentation says:

[plain]
N.B. It's possible to encounter the following exception: java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration; this is caused by the fact that sometimes the hbase TEST jar is deployed in the lib dir. To resolve this just copy the lib over from your installed HBase dir into the build lib dir. (This issue is currently in progress).
Solution:

Copy all the jars under $HBASE_HOME/lib into $NUTCH_HOME/runtime/local/lib, then run the crawl again.
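The copy step can also be scripted. The following is a minimal sketch, assuming the two lib directories are passed as command-line arguments; the class name `CopyHBaseJars` and the method `copyJars` are hypothetical helpers and not part of Nutch or HBase.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: copies every *.jar from one lib directory into another. */
final class CopyHBaseJars {

    /** Returns the number of jar files copied from sourceLib into targetLib. */
    static int copyJars(Path sourceLib, Path targetLib) throws IOException {
        Files.createDirectories(targetLib);
        int copied = 0;
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(sourceLib, "*.jar")) {
            for (Path jar : jars) {
                // Overwrite so that a stale jar with the same file name is replaced.
                Files.copy(jar, targetLib.resolve(jar.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: CopyHBaseJars <hbase-lib-dir> <nutch-lib-dir>");
            return;
        }
        int copied = copyJars(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Copied " + copied + " jars");
    }
}
```

It would be run as `java CopyHBaseJars $HBASE_HOME/lib $NUTCH_HOME/runtime/local/lib`. Existing jars with the same file name are overwritten, but jars of a different version under a different file name are left in place and may still need removing by hand.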

2. java.lang.NoSuchMethodError: org.apache.hadoop.hbase.HColumnDescriptor.setMaxVersions(I)V

HBase JIRA issue: HBASE-8273

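The problem was introduced by HBASE-5357, which changed HColumnDescriptor.setMaxVersions to return HColumnDescriptor instead of void and thereby changed the method's signature. Code compiled against the old version still looks for setMaxVersions(I)V at run time, that is, an int parameter and a void return, and no longer finds it.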
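To illustrate why a changed return type breaks already-compiled callers, here is a minimal sketch. `ColumnDescriptor` is a hypothetical stand-in for the newer HColumnDescriptor and is not the HBase class. The lookup uses `java.lang.invoke.MethodHandles`, which matches a method by name, parameter types and return type together, in the same way the JVM links a compiled call site.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

/** Shows that a method's return type is part of the signature the JVM links against. */
final class DescriptorDemo {

    /** Hypothetical stand-in for HColumnDescriptor after HBASE-5357. */
    static final class ColumnDescriptor {
        private int maxVersions;

        ColumnDescriptor setMaxVersions(int maxVersions) {
            this.maxVersions = maxVersions;
            return this; // the setter returns the descriptor rather than void
        }
    }

    /** True if ColumnDescriptor declares setMaxVersions(int) with exactly this return type. */
    static boolean hasSetter(Class<?> returnType) {
        MethodType type = MethodType.methodType(returnType, int.class);
        try {
            MethodHandles.lookup().findVirtual(ColumnDescriptor.class, "setMaxVersions", type);
            return true;
        } catch (NoSuchMethodException | IllegalAccessException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The descriptor old callers were compiled against: (I)V
        System.out.println(MethodType.methodType(void.class, int.class).toMethodDescriptorString());
        System.out.println("void setter found: " + hasSetter(void.class));
        System.out.println("ColumnDescriptor setter found: " + hasSetter(ColumnDescriptor.class));
    }
}
```

The void variant is not found even though a method with the same name and the same parameter list exists, which is the situation a gora-hbase jar built against the older HBase API runs into.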

Cloudera's documentation states:

[plain]
Column family manipulations are binary-incompatible between CDH4.2 and CDH4.0/CDH4.1
Because of HBASE-5357, code compiled against CDH4.0 and CDH4.1 will fail with java.lang.NoSuchMethodError: org.apache.hadoop.hbase.HColumnDescriptor.setMaxVersions(I)V, if used with the CDH4.2 libraries. The reason is that the setter methods in HColumnDescriptor were modified to return HColumnDescriptor instead of void, which changes their signature. Code that only does data manipulations, using the HTable class, will still work without recompilation.

Bug: HBASE-8273

Severity: Medium

Anticipated Resolution: None planned; use workaround.

Workaround: Code compiled against CDH4.0 and 4.1 that uses HColumnDescriptor must be recompiled against CDH4.2 in order to work with the CDH4.2 libraries. Code compiled against CDH4.0 and CDH4.1 running with those libraries does not have this problem.

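Cause: Hadoop and HBase both start without problems in my environment, so the problem lies in the gora-hbase plugin.

Solution:

Recompile the gora-hbase code that uses HColumnDescriptor against the HBase version you are running.

The specific classes that need recompiling will be listed later.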

3. java.lang.ClassNotFoundException: org.apache.gora.hbase.store.HBaseStore

[plain]
hadoop@nutch1:/data/projects/apache-nutch-2.2.1/runtime/local$ bin/nutch crawl urls/seed.txt -dir crawl -depth 3 -topN 5
Exception in thread "main" java.lang.ClassNotFoundException: org.apache.gora.hbase.store.HBaseStore
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:188)
        at org.apache.nutch.storage.StorageUtils.getDataStoreClass(StorageUtils.java:89)
        at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:73)
        at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:221)
        at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
        at org.apache.nutch.crawl.Crawler.run(Crawler.java:136)
        at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)
Solution:

Option 1: Download gora-0.3, build the gora-hbase module in that directory to produce gora-hbase.jar, and put the jar in $NUTCH_HOME/runtime/local/lib.

Option 2: Edit $NUTCH_HOME/ivy/ivy.xml

Uncomment <dependency org="org.apache.gora" name="gora-hbase" rev="0.3" conf="*->default" />, then rebuild. Ivy will then fetch the gora-hbase module for you.
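Before rerunning the crawl, it can help to confirm that the classes named in exceptions 1 and 3 are actually visible on the classpath. The following is a minimal sketch; `ClasspathCheck` and `missing` are hypothetical names, and only the two class names are taken from the stack traces above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic: reports which classes cannot be loaded from the classpath. */
final class ClasspathCheck {

    /** Returns the subset of classNames that cannot be found on the current classpath. */
    static List<String> missing(String... classNames) {
        List<String> notFound = new ArrayList<>();
        for (String name : classNames) {
            try {
                // initialize=false: locate the class without running its static initializers.
                Class.forName(name, false, ClasspathCheck.class.getClassLoader());
            } catch (ClassNotFoundException | LinkageError e) {
                notFound.add(name);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        List<String> notFound = missing(
                "org.apache.gora.hbase.store.HBaseStore",
                "org.apache.hadoop.hbase.HBaseConfiguration");
        System.out.println(notFound.isEmpty() ? "All classes found" : "Missing: " + notFound);
    }
}
```

Running it with the same jars Nutch uses, for example `java -cp "$NUTCH_HOME/runtime/local/lib/*:." ClasspathCheck`, should show whether gora-hbase and HBase are both visible. The exact classpath Nutch builds in bin/nutch may differ, so treat the result as a quick check only.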

4. java.lang.NullPointerException

[plain]
java.lang.NullPointerException
        at org.apache.avro.util.Utf8.<init>(Utf8.java:37)
        at org.apache.nutch.crawl.GeneratorReducer.setup(GeneratorReducer.java:100)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)

Line 100 of GeneratorReducer is:

batchId = new Utf8(conf.get(GeneratorJob.BATCH_ID));

It reads GeneratorJob.BATCH_ID, i.e. the generate.batch.id property, and that value is null here.
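The failure mode can be sketched without Hadoop on the classpath: a configuration lookup returns null for an unset key, and the null is only noticed once something dereferences it. In this sketch a plain `Map` stands in for the Hadoop configuration, and `BatchIdLookup` and `requireBatchId` are hypothetical names; only the key `generate.batch.id` comes from the text above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of reading generate.batch.id with an explicit null check. */
final class BatchIdLookup {
    static final String BATCH_ID = "generate.batch.id";

    /** Reads generate.batch.id from conf, failing with a descriptive message if it is unset. */
    static String requireBatchId(Map<String, String> conf) {
        String batchId = conf.get(BATCH_ID); // null when the key is absent
        if (batchId == null) {
            throw new IllegalStateException(BATCH_ID + " is not set in the job configuration");
        }
        return batchId;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        try {
            requireBatchId(conf);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        conf.put(BATCH_ID, "1371600000-42"); // made-up sample value
        System.out.println(requireBatchId(conf));
    }
}
```

A check like this does not fix the missing value, but it would turn the bare NullPointerException into a message that names the unset property.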

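Solution:
Option 1:
Add a generate.batch.id property with a non-empty value to nutch-site.xml. This is not ideal, though: in the source the batchId is generated from a random number, and other code may depend on that.
Option 2:
Modify the public Map<String,Object> run(Map<String,Object> args) method in GeneratorJob.

Add the following three lines: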

[java]
// generate batchId
int randomSeed = Math.abs(new Random().nextInt());
String batchId = (curTime / 1000) + "-" + randomSeed;
getConf().set(BATCH_ID, batchId);
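For reference, the batch id format produced by those lines can be tried in isolation. This is a minimal standalone sketch in which `BatchIds` and `newBatchId` are hypothetical names. It deliberately uses `nextInt(Integer.MAX_VALUE)` in place of `Math.abs(nextInt())`, a small variation on the snippet above, because `Math.abs(Integer.MIN_VALUE)` is still negative.

```java
import java.util.Random;

/** Hypothetical standalone version of the batch id generation shown above. */
final class BatchIds {

    /** Builds an id of the form "<epoch seconds>-<non-negative random int>". */
    static String newBatchId(long curTimeMillis, Random random) {
        // nextInt(bound) never returns a negative value, unlike Math.abs(nextInt()).
        int randomSeed = random.nextInt(Integer.MAX_VALUE);
        return (curTimeMillis / 1000) + "-" + randomSeed;
    }

    public static void main(String[] args) {
        System.out.println(newBatchId(System.currentTimeMillis(), new Random()));
    }
}
```

Inside GeneratorJob the resulting string would still be stored with `getConf().set(BATCH_ID, batchId)` as in the snippet above, so that GeneratorReducer can read it back.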

Reposted from: http://kyavx.baihongyu.com/
