Thursday, April 19, 2012

Hadoop Eclipse plugin for CDH3 u3

There's a general Hadoop Eclipse plugin, but it's built against Apache Hadoop and is incompatible with a CDH3 cluster.

On a machine with CDH3 installed, we can find some additional tools and libraries under the folder "/usr/lib/hadoop/contrib"; they were built from the source code under HADOOP_PKG/src/contrib. The Eclipse plugin source code is also there, it just isn't compiled by default.

Build your CDH3 Eclipse plugin yourself
(on 32-bit Ubuntu Desktop)
1, Download the Hadoop package from Cloudera

2, Install build tools:
a) Eclipse: I failed to compile with the Eclipse installed from the Ubuntu Software Center, so please download Eclipse Indigo 3.7.2 instead.
b) Maven: apt-get install maven2

3, compile Hadoop
The plugin build expects the Hadoop classes to be built first, so run ant in the package root:
  cd HADOOP_PKG
  ant compile

4, compile the Eclipse plugin
We need to point Ant at the Eclipse installation, because some jar files under ${eclipse.home}/plugins are used during compilation.
  cd HADOOP_PKG/src/contrib/eclipse-plugin
  ant -Declipse.home=XXXX -Dversion=0.20.2-cdh3u3 jar

If everything is OK, the final Eclipse plugin jar will be generated under HADOOP_PKG/build/contrib/eclipse-plugin/.
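Before installing it, you can sanity-check that the built jar has the layout an Eclipse plugin needs. A minimal sketch (assuming Python is available; the path and jar name below follow the -Dversion flag above and are examples, not fixed names):

```python
# Sketch: verify the built plugin jar looks sane before installing it.
import zipfile

def check_plugin_jar(jar_path):
    """Return (has_manifest, has_plugin_xml, has_classes) for an Eclipse plugin jar."""
    with zipfile.ZipFile(jar_path) as jar:
        names = jar.namelist()
        return (
            # Every Eclipse plugin jar carries a manifest and plugin.xml ...
            "META-INF/MANIFEST.MF" in names,
            "plugin.xml" in names,
            # ... and this particular build keeps its compiled classes under classes/.
            any(n.startswith("classes/") for n in names),
        )

# e.g. check_plugin_jar(
#     "build/contrib/eclipse-plugin/hadoop-eclipse-plugin-0.20.2-cdh3u3.jar")
```

All three values should come back True; a False usually means the build was interrupted or the jar was repacked incorrectly.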

Work with Eclipse
1, install the plugin
Put the plugin jar under <Eclipse_ROOT>/plugins and restart Eclipse.

2, basic configuration
Fill in your Hadoop installation directory under "Window" > "Preferences" > "Hadoop Map/Reduce".
Eclipse will load some Hadoop libraries from there when you write a Map/Reduce project.

3, add the CDH3 cluster information
In the "Map/Reduce Locations" view, choose "New Hadoop location..." and fill in the cluster information (for CDH3 the default NameNode port is 8020 and the JobTracker port is 8021).

4, a problem you WILL meet :(
After you add the M/R location, you can click "DFS Locations" in "Project Explorer" to browse HDFS. Unfortunately, you will see an error dialog about a connection error:

"An internal error occurred during: 'Connecting to DFS vm'."

That's because the Eclipse plugin cannot find the guava jar.
The fix: merge the guava jar into the Eclipse plugin jar.
Find "guava-r09-jarjar.jar" under HADOOP_PKG/lib and copy its contents (the "org/" folder) into <plugin_jar>/classes/.
You should then see two folders, "eclipse" & "thirdparty", under <plugin_jar>/classes/org/apache/hadoop/.
Put the fixed plugin jar back under <Eclipse_ROOT>/plugins. You should be able to browse HDFS from "DFS Locations" now.

* MAKE SURE YOU DO NOT DESTROY THE JAR STRUCTURE WHILE MERGING. If you're using an archive tool, such as "File Roller" on Ubuntu, just drag the "org" folder onto /classes/ in the GUI.
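If you'd rather not risk the jar layout in a GUI archive tool, the same merge can be scripted. A sketch (assuming Python; the file names in the usage comment are examples matching the paths above):

```python
# Sketch: append guava's org/ classes under classes/ inside the plugin jar,
# without disturbing the jar's existing structure.
import zipfile

def merge_guava_into_plugin(plugin_jar, guava_jar):
    """Copy every org/... entry of guava_jar into plugin_jar as classes/org/..."""
    with zipfile.ZipFile(guava_jar) as src, \
         zipfile.ZipFile(plugin_jar, "a") as dst:  # "a" appends in place
        existing = set(dst.namelist())
        for name in src.namelist():
            # Only real class/resource entries under org/, skip directory markers.
            if name.startswith("org/") and not name.endswith("/"):
                target = "classes/" + name
                if target not in existing:
                    dst.writestr(target, src.read(name))

# e.g. merge_guava_into_plugin(
#     "hadoop-eclipse-plugin-0.20.2-cdh3u3.jar",
#     "HADOOP_PKG/lib/guava-r09-jarjar.jar")
```

Because entries are only appended, the original plugin.xml, manifest, and classes stay exactly where they were.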

5, problems you might meet
a) No "Map/Reduce Project" choice in the "New Project" wizard.
  The plugin jar works only with the Eclipse version it was compiled against. If you copy the plugin jar to a different Eclipse version, it may not work.

