org.apache.hadoop.hdfs.server.namenode
Class HostFileManager

java.lang.Object
  extended by org.apache.hadoop.hdfs.server.namenode.HostFileManager

public class HostFileManager
extends Object

This class manages the include and exclude files for HDFS. These files control which DataNodes the NameNode expects to see in the cluster. Loosely speaking, the include file, if it exists and is not empty, is a list of everything we expect to see. The exclude file is a list of everything we want to ignore if we do see it.

Entries may or may not specify a port. If they don't, we consider them to apply to every DataNode on that host. For example, putting 192.168.0.100 in the exclude file blacklists both 192.168.0.100:5000 and 192.168.0.100:6000. This case comes up in unit tests.

When reading the hosts files, we try to find the IP address for each entry. This is important because it allows us to de-duplicate entries. If the user specifies a node as foo.bar.com in the include file, but as 192.168.0.100 in the exclude file, we need to realize that these are the same node. Resolving the IP address also allows us to give more information back to getDatanodeListForReport, which makes the web UI look nicer (among other things). See HDFS-3934 for more details.

DNS resolution can be slow. For this reason, we ONLY do it when (re)reading the hosts files. In all other cases, we rely on the cached values either in the DatanodeID objects or in HostFileManager#Entry. We also don't want to be holding locks while resolving. See HDFS-3990 for more discussion of DNS overheads.

Not all entries in the hosts files will have an associated IP address. Some entries may be "registration names." The "registration name" of a DataNode is either the actual hostname or an arbitrary string configured by dfs.datanode.hostname. It's possible to add registration names to the include or exclude files. If we can't find an IP address associated with a hosts file entry, we assume it's a registration name and act accordingly. The "registration name" feature is a little odd and it may be removed in the future (I hope?)
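The port rule described above can be illustrated with a small sketch. This is not the actual HostFileManager.EntrySet implementation; the class name HostEntryMatchSketch, its methods, and the convention that port 0 means "no port given" are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a hosts-file entry set; not the real HostFileManager.EntrySet. */
class HostEntryMatchSketch {
    // Entries are stored as "ip" (applies to any port) or "ip:port" (one specific DataNode).
    private final Set<String> entries = new HashSet<>();

    /** Adds an entry; in this sketch, port 0 stands for "no port given". */
    void add(String ip, int port) {
        entries.add(port == 0 ? ip : ip + ":" + port);
    }

    /** Returns true if a DataNode at ip:port is covered by some entry. */
    boolean matches(String ip, int port) {
        // A port-less entry covers every DataNode on that host;
        // otherwise the exact ip:port pair must have been listed.
        return entries.contains(ip) || entries.contains(ip + ":" + port);
    }
}
```

With 192.168.0.100 added without a port, both 192.168.0.100:5000 and 192.168.0.100:6000 match, which mirrors the exclude-file example in the description.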


Nested Class Summary
static class HostFileManager.Entry
           
static class HostFileManager.EntrySet
           
static class HostFileManager.MutableEntrySet
           
 
Constructor Summary
HostFileManager()
           
 
Method Summary
 HostFileManager.EntrySet getExcludes()
           
 HostFileManager.EntrySet getIncludes()
           
 boolean hasIncludes()
           
 boolean isExcluded(org.apache.hadoop.hdfs.protocol.DatanodeID dn)
           
 boolean isIncluded(org.apache.hadoop.hdfs.protocol.DatanodeID dn)
           
 void refresh(String includeFile, String excludeFile)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HostFileManager

public HostFileManager()
Method Detail

refresh

public void refresh(String includeFile,
                    String excludeFile)
             throws IOException
Throws:
IOException
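
The class description says that DNS resolution happens only while (re)reading the hosts files and that no lock should be held during it. One way to structure such a refresh is sketched below. It is a simplified assumption, not the real implementation: RefreshSketch is a hypothetical class, it takes already-read lines instead of file names, entries are plain strings without ports, and the resolver is an injected function that returns null when no IP address is found. The sketch also assumes that an empty include list accepts every node, as the description suggests.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of "resolve without the lock, then swap under the lock". */
class RefreshSketch {
    private Set<String> includes = Collections.emptySet();
    private Set<String> excludes = Collections.emptySet();
    // Assumed resolver: maps a hosts-file entry to an IP string, or null if unresolvable.
    private final UnaryOperator<String> resolver;

    RefreshSketch(UnaryOperator<String> resolver) {
        this.resolver = resolver;
    }

    void refresh(List<String> includeLines, List<String> excludeLines) {
        // Slow part: resolve every entry before taking the lock.
        Set<String> newIncludes = resolveAll(includeLines);
        Set<String> newExcludes = resolveAll(excludeLines);
        // Fast part: publish both sets together while holding the lock.
        synchronized (this) {
            includes = newIncludes;
            excludes = newExcludes;
        }
    }

    private Set<String> resolveAll(List<String> lines) {
        Set<String> out = new HashSet<>();
        for (String line : lines) {
            String entry = line.trim();
            if (entry.isEmpty()) {
                continue; // skip blank lines
            }
            String ip = resolver.apply(entry);
            // Unresolvable entries are kept verbatim, like a registration name.
            out.add(ip != null ? ip : entry);
        }
        return Collections.unmodifiableSet(out);
    }

    synchronized boolean hasIncludes() {
        return !includes.isEmpty();
    }

    synchronized boolean isIncluded(String ip) {
        // Assumption: an empty include list means every node is accepted.
        return includes.isEmpty() || includes.contains(ip);
    }

    synchronized boolean isExcluded(String ip) {
        return excludes.contains(ip);
    }
}
```

Because both entries are resolved to the same key, a node listed as foo.bar.com in one file and as 192.168.0.100 in the other is treated as the same node, and lookups after the refresh never touch DNS.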

isIncluded

public boolean isIncluded(org.apache.hadoop.hdfs.protocol.DatanodeID dn)

isExcluded

public boolean isExcluded(org.apache.hadoop.hdfs.protocol.DatanodeID dn)

hasIncludes

public boolean hasIncludes()

getIncludes

public HostFileManager.EntrySet getIncludes()
Returns:
the includes as an immutable set.

getExcludes

public HostFileManager.EntrySet getExcludes()
Returns:
the excludes as an immutable set.


Copyright © 2013 Apache Software Foundation. All Rights Reserved.