Big Data
Explain Hadoop Distributed File Systems (HDFS)
Difficulty: unrated
Source: bregman-arie/devops-exercises
by Arie Bregman
Answer
- Distributed file system providing high aggregate bandwidth across the cluster.
- For a user it looks like a regular file system structure but behind the scenes it's distributed across multiple machines in a cluster
- Typical file size is TB and it can scale and supports millions of files
- It's fault tolerant which means it provides automatic recovery from faults
- It's best suited for running long batch operations rather than live analysis