Basic Filesystem Operations:
You can get detailed help on every command. Here $ stands for the shell prompt on CentOS or Ubuntu.
$hadoop fs -help
The above command prints detailed help for all the fs commands.
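In recent Hadoop releases you can also ask for help on a single command, e.g. ls (a minimal example, assuming a standard installation):
$hadoop fs -help ls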
To copy files from the local filesystem to HDFS:
$hadoop fs -copyFromLocal <local source path> <HDFS destination path>
e.g.
$hadoop fs -copyFromLocal /home/yanish/a.txt /tmp/b.txt
$hadoop fs -copyFromLocal /home/yanish/a.txt hdfs://localhost/tmp/b.txt
The above commands copy the file a.txt from the local filesystem into HDFS under the name b.txt; the second form simply spells out the full HDFS URI.
To copy a file from HDFS to the local filesystem:
$hadoop fs -copyToLocal /user/yanish/a.txt /usr/local/
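To quickly check that a copy worked, you can print the file's contents with cat (a small sketch reusing the example path above):
$hadoop fs -cat /user/yanish/a.txt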
The ls command is similar to the ls command in Linux:
$hadoop fs -ls /
The above command lists the files contained in the / (root) directory of HDFS.
$hadoop fs -ls /user
The above command lists the files contained in the /user directory of HDFS.
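To list a directory tree recursively, recent Hadoop versions accept an -R flag (older releases used -lsr instead):
$hadoop fs -ls -R /user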
To make a folder in HDFS:
$hadoop fs -mkdir <HDFS path>
e.g. $hadoop fs -mkdir /user/yanish
The above command creates the folder yanish in the /user directory.
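If the parent directories do not exist yet, Hadoop 2 and later can create them in one go with -p, analogous to mkdir -p on Linux (a sketch; /user/yanish/data is just an illustrative path):
$hadoop fs -mkdir -p /user/yanish/data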
There are also put and get commands, for putting files into HDFS and getting files out of HDFS; their syntax is similar to copyFromLocal and copyToLocal, as shown below.
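A minimal sketch of both, reusing the example file from above:
$hadoop fs -put /home/yanish/a.txt /tmp/a.txt
$hadoop fs -get /tmp/a.txt /home/yanish/a-copy.txt
put copies a local file into HDFS; get copies an HDFS file back to the local filesystem (a-copy.txt is just an illustrative name).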
Parallel Copying with distcp:
Hadoop comes with a useful program called distcp for copying large amounts of data to and from Hadoop filesystems in parallel.
$hadoop distcp hdfs://namenode1/foo hdfs://namenode2/foo
This will copy the /foo directory from the cluster at namenode1 to the cluster at namenode2.
By default, distcp will skip files that already exist in the destination, but they can be overwritten by supplying the -overwrite option. You can also update only the files that have changed using the -update option.
$hadoop distcp -update hdfs://namenode1/foo hdfs://namenode2/bar/foo
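For example, to force files that already exist in the destination to be replaced (same example paths as above):
$hadoop distcp -overwrite hdfs://namenode1/foo hdfs://namenode2/foo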
To delete the file files.txt from HDFS:
$hadoop fs -rmr /my/files.txt
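Note that -rmr is deprecated in newer Hadoop releases; the equivalents are -rm for a file and -rm -r for a directory (a sketch; /my/dir is a hypothetical path):
$hadoop fs -rm /my/files.txt
$hadoop fs -rm -r /my/dir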
To change permissions on a folder in HDFS:
$hadoop fs -chmod 755 /user/a
$hadoop fs -chmod -R 755 /user/a
The above commands set permission 755 on directory a. You can choose any mode you like, such as 700 or 533, according to the read, write, and execute permissions you need, and you can even add a sticky bit, as shown below. The second command changes the permissions of directory a as well as everything contained in it, since -R applies the change recursively.
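The sticky bit is set with a leading fourth octal digit, just as on Linux (a sketch; /user/shared is a hypothetical shared directory):
$hadoop fs -chmod 1777 /user/shared
With the sticky bit set, only a file's owner (or a superuser) can delete or move files inside that directory.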
You can change the ownership of files and directories with the chown command:
$hadoop fs -chown <user>:<group> <path>
$hadoop fs -chown yanish:hadoop /user/yanish
$hadoop fs -chown -R yanish:hadoop /user/yanish
The above command changes the ownership of the directory yanish to user yanish and group hadoop. The second command changes the ownership of the directory as well as the directories and files inside it.
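If you only need to change the group, there is also a chgrp command (a sketch reusing the example path above):
$hadoop fs -chgrp -R hadoop /user/yanish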
The easiest way to work with the filesystem is through Hue, which makes most operations very simple, such as copying to and from HDFS, changing ownership and permissions, and reading files.
A Hadoop training video will be uploaded to YouTube soon.