In a previous post, I showed how to configure a single-node Hadoop instance on Windows 10. The steps are not too difficult to follow if you have a Java programming background. However, there is one step that is not very straightforward: the native Hadoop executable (winutils.exe) is not included in the official Hadoop distribution and needs to be downloaded separately or built locally. On Linux or UNIX, you usually don't need to do that, since the native libraries are pre-compiled and included in the binary distribution.

In August 2016, Microsoft published the initial release of the Windows Subsystem for Linux (WSL). In 2019, Microsoft announced WSL 2 with enhanced performance. With WSL, we can run Linux as a subsystem in Windows 10.
In this post, I am going to show you how to install Hadoop 3.2.0 in WSL.

Prerequisites

Enable WSL and then install one of the Linux distributions from the Microsoft Store. To be specific, enable WSL by running the following PowerShell command as Administrator (or enable it through the Control Panel):

Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux

Then install Ubuntu from the Microsoft Store. Once the download completes, click the Launch button to launch the application; it may take a few minutes to install. During the installation, you will be asked to input a username and password.
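With Ubuntu running, the Hadoop archive itself can be fetched and unpacked inside the WSL shell. The sketch below is an assumption about the layout: the mirror URL and target directory are my own choices, not from the original post; pick a mirror from the Apache downloads page.

```shell
# Sketch: fetch and unpack Hadoop 3.2.0 inside WSL (Ubuntu).
# The archive URL is an assumption based on the Apache archive layout.
HADOOP_VERSION=3.2.0
HADOOP_TARBALL="hadoop-${HADOOP_VERSION}.tar.gz"
HADOOP_URL="https://archive.apache.org/dist/hadoop/common/hadoop-${HADOOP_VERSION}/${HADOOP_TARBALL}"

# Network steps (uncomment to actually download and extract):
# wget "${HADOOP_URL}"
# tar -xzf "${HADOOP_TARBALL}" -C "${HOME}"

# Make the Hadoop commands available on PATH for the later steps.
export HADOOP_HOME="${HOME}/hadoop-${HADOOP_VERSION}"
export PATH="${PATH}:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin"
echo "${HADOOP_HOME}"
```

Adding the export lines to ~/.bashrc makes them persist across WSL sessions.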
Troubleshooting note (from the comments): have you tried the solution I mentioned in the post? I got the same issue when WSL was first installed, but after the following commands it worked:

sudo apt-get install ssh
sudo service ssh restart

Also make sure you stop and restart the Hadoop daemons afterwards. I'm no expert in networking, and since this is all local traffic I can't say for sure the following will help; I cannot reproduce the issue in my environment, so it is hard to say where it goes wrong in yours. There must be some other reason you cannot ssh localhost. For example, is port 22 used by another program? Can you also try the IPv4 address for localhost instead of the IPv6 one? And can you try adding a firewall rule to allow TCP traffic to SSH port 22? Protocol type: TCP. Local port: 22. Remote port: All ports. Scope: make sure all your local IP addresses are added. Profiles: Private (I chose this one since I only connect to my WSL instance from a private network). Make sure you can ssh localhost first; the web UIs won't start successfully until the SSH issue is resolved.
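The firewall rule described above can also be created from an elevated PowerShell prompt. This is a sketch of the equivalent rule; the display name is my own choice, not from the original post:

```powershell
# Allow inbound TCP on port 22 (SSH) for the Private profile only.
New-NetFirewallRule -DisplayName "WSL SSH (port 22)" `
    -Direction Inbound -Protocol TCP -LocalPort 22 `
    -Profile Private -Action Allow
```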
Today, I was working and found that there were some issues with the DSWB site: its Jupyter Notebook link was not working. I tried to work around this by creating a standalone-mode Spark setup on my home Windows 10 PC. This blog post summarizes the steps I performed. Please refer to the Wikipedia Apache Spark page to start learning about Spark.

Software version details:
OS: Microsoft Windows 10 Version 10.0.14393, 64-bit.
Java JDK version 1.8.0_101. Apache Spark version 2.0.2.
Scala version 2.12.0.

Install Java

Java is required for Spark; the official documentation clearly mentions: "It's easy to run locally on one machine — all you need is to have java installed on your system PATH, or the JAVA_HOME environment variable pointing to a Java installation". I used Java JDK version 1.8.0_101 for my setup.
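JAVA_HOME can be set persistently from PowerShell. The JDK path below matches the version mentioned above but is an assumption about your install location; adjust it to wherever the JDK actually lives:

```powershell
# Persist JAVA_HOME for the current user; adjust the path to your JDK install.
[Environment]::SetEnvironmentVariable("JAVA_HOME", "C:\Program Files\Java\jdk1.8.0_101", "User")
```

Open a new console afterwards so the variable is picked up, and verify with java -version.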
Scala

Apache Spark is written in the Scala programming language and needs it installed on the local PC. I downloaded the Scala 2.12.0 binaries MSI installer, followed the standard installation prompts, and installed Scala on the default path (C:\Program Files (x86)\scala).

Winutils

I referred to various sources and found that Spark can run locally but needs winutils.exe, which is a component of Hadoop. So what exactly is winutils, and why is it required? On further investigation, I found that, among other things, Spark uses Hadoop libraries that call UNIX commands such as chmod to create files and directories; winutils calls are also made to read and write files on Windows.
In summary, it is required for running shell commands on Windows. I am running 64-bit Windows 10 and downloaded winutils.exe from GitHub, then placed the winutils.exe file in a folder.

Spark

I downloaded the latest Spark release, 2.0.2 (Nov 14, 2016), from the official download site.
The downloaded file is compressed; I extracted the files to a folder on the D drive, as my C drive has limited space.
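Putting it together, a quick smoke test is to point HADOOP_HOME at the folder containing winutils.exe and launch spark-shell from the extracted Spark folder. The folder names below are assumptions based on the versions mentioned in this post, not the author's actual paths; winutils.exe is expected under %HADOOP_HOME%\bin:

```powershell
# Paths are assumptions; adjust to your own layout.
$env:HADOOP_HOME = "C:\hadoop"          # bin\ inside this folder holds winutils.exe
$env:Path += ";$env:HADOOP_HOME\bin"
cd D:\spark-2.0.2-bin-hadoop2.7
.\bin\spark-shell
```

If everything is in place, the Scala REPL prompt appears without the usual winutils-related errors in the startup log.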