The process is straightforward. First, we need to download and install the following software:

Java

Download Java 1.8 from the Oracle website. Once installed, confirm that you're running the correct version from the command line using the 'java -version' command.

WinRAR

I downloaded and installed the 64-bit release of WinRAR from the official WinRAR site; it will later allow me to decompress Linux-style tar.gz packages on Windows.

Hadoop

The next step was to install a Hadoop distribution. I decided to download the most recent release, Hadoop 3.0.0-alpha2 (25 Jan, 2017), in binary form from an Apache download mirror. Once hadoop-3.0.0-alpha2.tar.gz (250 MB) was downloaded, I extracted it using WinRAR (installed in the previous step) into the C:\hadoop-3.0.0-alpha2 folder. Now that I had Hadoop downloaded, it was time to start a Hadoop cluster with a single node.

Setup Environment Variables

In Windows 10, I opened the System Properties window and clicked the Environment Variables button. There I created a new HADOOP_HOME variable and pointed it to the C:\hadoop-3.0.0-alpha2\bin folder on my PC. The next step was to add the Hadoop bin directory to the PATH variable: I selected PATH, pressed Edit, added 'C:\hadoop-3.0.0-alpha2\bin', and pressed OK.

Edit Hadoop Configuration

Next, I configured Hadoop to start on localhost, port 9000, by editing the C:\hadoop-3.0.0-alpha2\etc\hadoop\core-site.xml file. Then I went to the C:\hadoop-3.0.0-alpha2\etc\hadoop folder and renamed mapred-site.xml.template to mapred-site.xml. I then edited the mapred-site.xml file, adding the XML configuration that tells MapReduce to run on YARN.
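The original screenshots of these two files are not reproduced here, but for reference, the standard single-node settings they describe look like this (a sketch based on the stock Hadoop single-node setup; your values may differ). First, core-site.xml pointing the default filesystem at localhost:9000:

```xml
<configuration>
  <!-- HDFS NameNode endpoint for a single-node cluster -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```

And mapred-site.xml, telling MapReduce to run on YARN:

```xml
<configuration>
  <!-- Run MapReduce jobs on the YARN resource manager -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```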
This is what the yarn-site.xml file looked like once completed. Then I continued by editing hadoop-env.cmd at C:\hadoop-3.0.0-alpha2\etc\hadoop\hadoop-env.cmd. I changed the JAVA_HOME=%JAVA_HOME% line to point to my Java folder: C:\PROGRA~1\Java\JDK18~1.0_1. It's usually better to use Windows short names here, so I went to C:\Program Files\Java\jdk1.8.0_111, where my Java JDK is installed, converted the long path to its Windows short name, and added it to hadoop-env.cmd. Next, in C:\hadoop-3.0.0-alpha2\bin, using a Windows command prompt run as administrator, I ran the 'hdfs namenode -format' command. Then I finally started Hadoop: I opened a command prompt as admin in C:\hadoop-3.0.0-alpha2\sbin and ran start-dfs.cmd followed by start-yarn.cmd.

Open Hadoop GUI

Once all the above steps were completed, I opened a browser and navigated to the Hadoop web interface (the YARN ResourceManager UI listens on http://localhost:8088 by default).

Next Steps?
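For reference, the completed yarn-site.xml from this step usually contains the standard shuffle-service settings (again a sketch of the stock single-node configuration, not necessarily byte-for-byte what appeared in the screenshot):

```xml
<configuration>
  <!-- Enable the MapReduce shuffle auxiliary service on the NodeManager -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
```

And the hadoop-env.cmd change amounts to one line (the short path shown assumes JDK 1.8.0_111 in the default location; check yours with `dir /x "C:\Program Files\Java"`):

```
set JAVA_HOME=C:\PROGRA~1\Java\JDK18~1.0_1
```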
This section won’t go into details of setting up IntelliJ, etc. But just very briefly If you want to play with WordCount.java and Hadoop’s mapreduce algorithm, you can download it from, it’ll look like this: Then once you have the code working, you can use, same as I did, an online generator at to create couple of random words: After I did so, I’ve saved my words to words.txt, but to make it little more fun, I’ve replaced some of them with my last name, for a total of 96 unique words and 4 that are repeated last name. Running Wordlist against Hadoop’s MapReduce Once I ran my code, it executed and started processing the words.txt file that was prior to execution copied to input folder (which I created earlier together with the output folder for the outcome files).
The following was the result of Hadoop's processing job. I was also able to follow the job's progress in the browser.

RESULTS

Once the job was done, I looked in the output folder and, voilà, words.txt had been processed by Hadoop. Here is the result. It's of course not the most elaborate Hadoop word count, but I deliberately made all words unique except my last name, to verify that WordCount and Hadoop work as intended, and the result proved it: the output is sorted and the counts are correct.