Wednesday, September 6, 2017

How to Connect HDInsight using Putty

I have written many articles on HDInsight Service which is an Azure service that allows us to provision Apache Hadoop Cluster and use it for various type of big data related implementations. If you need to see all my articles written on HDInsight, please use the link: http://dinesql.blogspot.com/search?q=hdinsight.

Although I have mainly used Azure portal to demonstrate HDInsight (Ambari dashboard), many use a tool called PuTTY that is a free implementation of SSH and Telnet for Windows and Unix platform. It is a good tool to work with, specifically if you need to connect the cluster remotely and work with it for navigating nodes or doing some data work with some sub projects like Hive against the cluster. Remember, if you have configured your HDInsight cluster as a Linux-based cluster, you need a SSH client to remotely connect with it. Let's see the way of using it.

First of all you need to download PuTTY for using it as Windows machines do not have it as a built-in tool (Linux and Mac OS computers have a SSH client interface built-in). Here is the link for downloading PuTTY.


There is no installation with it. You can just use the putty.exe. However, you need a host name to connect. If you go to the portal and open your cluster blade, you can get the Host Name for connecting with a SSH Client by clicking Secure Shell (SSH) button.


Note that there are two, one for Windows and another for Linux, Unix and OS X users, make sure that you copy the Windows one.


When you open PuTTY, you should see an input box called Host Name for entering the copied host name. Enter the copied value and make sure that the Connection Type is selected as SSH.

All you have to do now is, click Open. This opens the standard window with standard font and font size. If you need a larger font, you can go to Window - Appearance  and change settings as you want.

When you open this first time, you will get a security warning as below;


This says that we do not have the key related to the host cached locally, hence it is not secure or not sure whether we connect to the right one. Since we know that we connect to the right host, we can simply click Yes and continue.

Once connected, you will be promoted for user id and password. Note that you need to use the one you configured as SSHUser and the Password when the cluster was created. It does not accept the HTTP User.


Enter the SSHUser and the password, you will be connected with the default node.


Everything is ready now. As you see, first part of the last line is the user connected and hn0 is one of the head nodes. Last part of it shows the cluster name but we see it partially as it is lengthy. Now I can navigate, make folders, move files or connect with Hive without any issue.





No comments: