Friday 9 September 2016

Hadoop-eclipse-plugin installation


After setting up Hadoop on Windows, which we covered in an earlier post (Apache Hadoop Installation on Windows 64-bit platform), we now set up Eclipse on Windows to develop MapReduce programs for Apache Hadoop. Below are the steps to do so:

Install Eclipse on Windows (download from here)

The next step is to download the Hadoop Eclipse plugin. Many tutorials suggest building the plugin jar yourself with Ant or Maven. That method is quite messy, and you may break your Hadoop install while changing the various build.xml and build-properties.xml files. The easiest way is to download the prebuilt jar file directly from here.

Next, copy the jar file to the eclipse/plugins directory. That is c:/eclipse/plugins in my case.

Start Eclipse and go to "Window >> Open Perspective >> Other". In the perspectives window you should see "Map/Reduce"; select it and click "OK".

You will now see the "Map/Reduce" perspective icon at the top right corner of the main Eclipse window.


You will also notice a Map/Reduce Locations tab at the bottom. Go to that tab, right-click, and add a new location.

In the details form, give any Location Name. Under the Map/Reduce (V2) Master section, fill in Host with your name node's host name; in our case it is "master". The port is 9001. Under DFS Master, check the "Use M/R Master host" option and set the port to 9000. Now click Finish.
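
If you want to sanity-check these host and port values outside of Eclipse, a minimal sketch like the one below can help. It assumes the name node host "master" and DFS port 9000 entered above; adjust both to match your own core-site.xml.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DfsConnectionCheck {
    public static void main(String[] args) throws Exception {
        // Same host/port pair you entered under "DFS Master" (assumed values)
        FileSystem fs = FileSystem.get(new URI("hdfs://master:9000"), new Configuration());
        // List the HDFS root; if this prints without a connection
        // exception, the plugin location settings should work as well
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath());
        }
        fs.close();
    }
}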

You will also see DFS Locations in the Project Explorer. Expand it to see the location added in the previous step.

Right-click on the folder (0) node and select Create new directory to create the input and output folders, as sketched below.
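
The same folders can also be created programmatically with the HDFS FileSystem API. This is only a sketch under the same hdfs://master:9000 assumption; the paths /input and /output are placeholders mirroring the folders created above.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreateDfsFolders {
    public static void main(String[] args) throws Exception {
        // Assumed name node location; adjust to your setup
        FileSystem fs = FileSystem.get(new URI("hdfs://master:9000"), new Configuration());
        fs.mkdirs(new Path("/input"));   // will hold the sample text file
        fs.mkdirs(new Path("/output"));  // MapReduce results go here
        fs.close();
    }
}

Keep in mind that a MapReduce job normally refuses to run if its output directory already exists, so you may need to delete /output again before each job run.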

After creating the input and output folders, right-click on the DFS location and select Disconnect, then Refresh. You should now see both folders under the location.

Let's create a WordCountSample.txt file and upload it to DFS.
WordCountSample.txt
Hadoop is an Apache open source framework written in java
that allows distributed processing of large datasets across clusters of computers using simple programming models
A Hadoop frame-worked application works in an environment that 
provides distributed storage and computation across clusters of computers
Hadoop is designed to scale up from single server to thousands of machines
each offering local computation and storage
Right-click on the input folder and select Upload files to DFS....


Browse the Windows file system and select the sample file.
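
The upload can be scripted as well. In the sketch below, the local path C:/tmp/WordCountSample.txt is a hypothetical placeholder; point it at wherever you saved the file.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UploadSampleFile {
    public static void main(String[] args) throws Exception {
        // Assumed name node location; adjust to your setup
        FileSystem fs = FileSystem.get(new URI("hdfs://master:9000"), new Configuration());
        // Copy the local sample file into the /input folder created earlier
        fs.copyFromLocalFile(new Path("C:/tmp/WordCountSample.txt"),   // hypothetical local path
                             new Path("/input/WordCountSample.txt"));
        fs.close();
    }
}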


Right-click on the DFS location again and select Disconnect and Refresh. You should now see the file under the input folder.

Double-click on the file name to see its content (read-only).
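
Reading the file back in code is another quick way to confirm the upload worked. The sketch below again assumes the hdfs://master:9000 location and the /input path used above.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class PrintDfsFile {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new URI("hdfs://master:9000"), new Configuration());
        // Open the uploaded file and dump its contents to stdout
        try (FSDataInputStream in = fs.open(new Path("/input/WordCountSample.txt"))) {
            IOUtils.copyBytes(in, System.out, 4096, false);
        }
        fs.close();
    }
}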


In the next article (How to create a WordCount MapReduce with Maven and Eclipse) we will see how to run a MapReduce program from Eclipse and debug the code line by line.

6 comments:

  1. Hi, I have done the above as explained but am getting this error:

    error call from pc-004/192.168.1.10 to localhost 9000 failed on connection exception

    Please assist: where have I gone wrong, and how can I troubleshoot it?

    1. The same issue happened to me. It is because the port number in your setup might not be 9000 in mapred-site.xml. Check your setup; if it is different, use that port number, or else set the port to 9000 in that xml. That resolved my problem. Thanks

  2. Hi, this is very helpful for me!
    My problem is: I uploaded the txt file, but I find the uploaded file is empty. Why?

    1. Same here, any solution?

    2. I also found the same problem. Did you resolve it? Please share the solution!

  3. Hello, the above-mentioned link "Install Eclipse on Windows (download from here)" is not working, so I installed Eclipse Photon and then copied the plugin from the link above into the plugins directory, but it is not working: there is no Map/Reduce option in the perspective options.
