Class ApplicationSetup

java.lang.Object
org.terrier.utility.ApplicationSetup

public class ApplicationSetup
extends java.lang.Object

This class retrieves and provides access to all the constants and parameters for the system. When it is statically initialised, it loads the properties file specified by the system property terrier.setup. If this is not specified, the default is the value of the terrier.home system property with etc/terrier.properties appended.
e.g. java -Dterrier.home=$TERRIER_HOME -Dterrier.setup=$TERRIER_HOME/etc/terrier.properties TrecTerrier

System Properties used:

terrier.setup - Specifies where the terrier.properties file can be found.
terrier.home - Specifies where Terrier has been installed. Used if the terrier.properties file cannot be found, or if that file does not itself specify terrier.home.
NB: In the future, this may further default to $TERRIER_HOME from the environment.
file.separator - What separates directory names on this platform. Set automatically by Java.
line.separator - What separates lines in a file on this platform. Set automatically by Java.

In essence, for Terrier to function properly, you need to specify one of the following on the command line:

  • terrier.setup pointing to a terrier.properties file containing a terrier.home value.
  • OR terrier.home, in which case Terrier will use the properties file at etc/terrier.properties, if it finds one.
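
The lookup order described above can be sketched as follows. This is a minimal illustration, not Terrier's own code: the class SetupLocator and the method resolveSetupFile are hypothetical names, and the sketch assumes both values arrive as system properties.

```java
import java.util.Properties;

// Hypothetical helper illustrating the documented lookup order.
class SetupLocator {
    /**
     * Returns the properties file location implied by the given properties,
     * or null if neither terrier.setup nor terrier.home is set.
     */
    static String resolveSetupFile(Properties sys) {
        String setup = sys.getProperty("terrier.setup");
        if (setup != null) {
            return setup; // an explicit location takes precedence
        }
        String home = sys.getProperty("terrier.home");
        if (home == null) {
            return null; // nothing to derive a default from
        }
        String sep = sys.getProperty("file.separator", "/");
        return home + sep + "etc" + sep + "terrier.properties";
    }

    public static void main(String[] args) {
        // Run with -Dterrier.home=... and/or -Dterrier.setup=... to see the result.
        System.out.println(resolveSetupFile(System.getProperties()));
    }
}
```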

Any property defined in the properties file can be overridden as follows:

  • If the system property terrier.usecontext is equal to true, then a Context object is used to override the properties defined in the file.
  • If the system property terrier.usecontext is equal to false, then system properties are used to override the properties defined in the file.
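
The second case (terrier.usecontext equal to false) amounts to a two-level lookup. The sketch below is an assumption-laden illustration rather than the class's real implementation: OverrideSketch and lookup are hypothetical names, and the Context-based case is not modelled.

```java
import java.util.Properties;

// Hypothetical illustration of "system properties override the file".
class OverrideSketch {
    /** Returns the override value if present, else the file value, else the default. */
    static String lookup(Properties fromFile, Properties overrides,
                         String key, String defaultValue) {
        String value = overrides.getProperty(key);
        if (value == null) {
            value = fromFile.getProperty(key);
        }
        return value != null ? value : defaultValue;
    }

    public static void main(String[] args) {
        Properties fromFile = new Properties();
        fromFile.setProperty("trec.results", "results");
        // e.g. java -Dtrec.results=/tmp/out OverrideSketch
        System.out.println(lookup(fromFile, System.getProperties(), "trec.results", "results"));
    }
}
```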
Author:
Gianni Amati, Vassilis Plachouras, Ben He, Craig Macdonald
  • Nested Class Summary

    Nested Classes 
    Modifier and Type Class Description
    static interface  ApplicationSetup.TerrierApplicationPlugin
    Interface for plugins.
  • Field Summary

    Fields 
    Modifier and Type Field Description
    protected static java.util.Properties appProperties
    The properties object in which the properties from the file are read.
    static boolean BLOCK_INDEXING
    Specifies whether block information will be used for indexing.
    static int BLOCK_SIZE
    The size of a block of terms in a document.
    static java.lang.String COLLECTION_SPEC
    Deprecated.
    static java.lang.String DEFAULT_LOG4J_CONFIG
    Default log4j config Terrier loads if no TERRIER_ETC/terrier-log.xml file exists
    static int DOCS_CHECK_SINGLEPASS
    Number of documents between each memory check in the single pass inversion method.
    static java.lang.String EOL
    The new line character used by the operating system.
    static int EXPANSION_DOCUMENTS
    Deprecated.
    static int EXPANSION_TERMS
    Deprecated.
    static java.lang.String FILE_SEPARATOR
    The file separator used by the operating system.
    static boolean IGNORE_EMPTY_DOCUMENTS
    Ignore or not empty documents.
    protected static java.util.List<ApplicationSetup.TerrierApplicationPlugin> loadedPlugins
    list of loaded plugins
    static java.lang.String LOG4J_CONFIG
    The configuration file used by log4j
    static int MAX_BLOCKS
    The maximum number of blocks in a document.
    static int MAX_TERM_LENGTH
    The maximum size of a term.
    static int MEMORY_THRESHOLD_SINGLEPASS
    Memory threshold in the single pass inversion method.
    static java.lang.String TERRIER_ETC
    The directory under which the configuration files of Terrier are stored.
    static java.lang.String TERRIER_HOME
    The directory under which the application is installed.
    static java.lang.String TERRIER_INDEX_PATH
    The name of the directory where the inverted file and other data structures are stored.
    static java.lang.String TERRIER_INDEX_PREFIX
    The prefix of the data structures' filenames.
    static java.lang.String TERRIER_SHARE
    The name of the directory where installation independent read-only data is stored.
    static java.lang.String TERRIER_VAR
    The name of the directory where the data structures and the output of Terrier are stored.
    static java.lang.String TERRIER_VERSION
    Current Terrier version
    static java.lang.String TREC_RESULTS
    The name of the directory where the results are stored.
    static java.lang.String TREC_RESULTS_SUFFIX
    The suffix of the files, where the results are stored.
    protected static java.util.Properties UsedAppProperties  
  • Constructor Summary

    Constructors 
    Constructor Description
    ApplicationSetup()  
  • Method Summary

    Modifier and Type Method Description
    static void bootstrapInitialisation()
    Forces ApplicationSetup initialisation
    static void bootstrapInitialisation​(java.util.Properties properties)  
    static void clearAllProperties()
    Clears ApplicationSetup of all properties
    static boolean configure​(java.io.InputStream... propertiesStreams)
    Loads the common Terrier properties from the specified InputStream
    static java.lang.Class<?> getClass​(java.lang.String name)  
    static java.lang.Class<?> getClass​(java.lang.String name, boolean load)  
    static java.lang.ClassLoader getClassLoader()  
    static ApplicationSetup.TerrierApplicationPlugin getPlugin​(java.lang.String name)
    Return a loaded plugin by name.
    static java.util.Properties getProperties()
    Returns the underlying properties object for ApplicationSetup
    static java.lang.String getProperty​(java.lang.String propertyKey, java.lang.String defaultValue)
    Returns the value for the specified property, given a default value, in case the property was not defined during the initialization of the system.
    static java.util.Properties getUsedProperties()
    Returns a properties object detailing all the properties fetched during the lifetime of this class.
    static void loadCommonProperties()
    Loads the ApplicationSetup variables, e.g. ApplicationSetup.TERRIER_HOME
    static java.util.List<java.io.InputStream> loadResources​(java.lang.String name, java.lang.ClassLoader classLoader)  
    static java.lang.String makeAbsolute​(java.lang.String filename, java.lang.String DefaultPath)
    Checks whether the given filename is absolute and if not, it adds on the default path to make it absolute.
    static void setDefaultProperty​(java.lang.String propertyKey, java.lang.String defaultValue)
    Sets a property value only if it has not already been set
    static void setProperty​(java.lang.String propertyKey, java.lang.String value)
    Sets a value for the specified property.
    protected static void setupPlugins()
    Calls the initialise method of any plugins named in terrier.plugins

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • TERRIER_VERSION

      public static final java.lang.String TERRIER_VERSION
      Current Terrier version
      See Also:
      Constant Field Values
    • DEFAULT_LOG4J_CONFIG

      public static final java.lang.String DEFAULT_LOG4J_CONFIG
      Default log4j config Terrier loads if no TERRIER_ETC/terrier-log.xml file exists
      Since:
      1.1.0
      See Also:
      Constant Field Values
    • appProperties

      protected static final java.util.Properties appProperties
      The properties object in which the properties from the file are read.
    • UsedAppProperties

      protected static final java.util.Properties UsedAppProperties
    • FILE_SEPARATOR

      public static java.lang.String FILE_SEPARATOR
      The file separator used by the operating system. Defaults to the system property file.separator.
    • EOL

      public static java.lang.String EOL
      The new line character used by the operating system. Defaults to the system property line.separator.
    • TERRIER_HOME

      public static java.lang.String TERRIER_HOME
      The directory under which the application is installed. It corresponds to the property terrier.home and it should be set in the properties file, or as a property on the command line.
    • TERRIER_ETC

      public static java.lang.String TERRIER_ETC
      The directory under which the configuration files of Terrier are stored. The corresponding property is terrier.etc and it should be set in the properties file. If a relative path is given, TERRIER_HOME will be prefixed.
    • TERRIER_SHARE

      public static java.lang.String TERRIER_SHARE
      The name of the directory where installation independent read-only data is stored. Files like stopword lists, and example and testing data are examples. The corresponding property is terrier.share and its default value is share. If a relative path is given, then TERRIER_HOME will be prefixed.
    • TERRIER_VAR

      public static java.lang.String TERRIER_VAR
      The name of the directory where the data structures and the output of Terrier are stored. The corresponding property is terrier.var and its default value is var. If a relative path is given, TERRIER_HOME will be prefixed.
    • TERRIER_INDEX_PATH

      public static java.lang.String TERRIER_INDEX_PATH
      The name of the directory where the inverted file and other data structures are stored. The default value is index in the TERRIER_VAR directory, but it can be overridden with the property terrier.index.path. If a relative path is given, TERRIER_VAR will be prefixed.
    • COLLECTION_SPEC

      @Deprecated public static java.lang.String COLLECTION_SPEC
      Deprecated.
      The name of the file that contains the list of resources to be processed during indexing. The contents of this file are collection implementation dependent. For example, for a TREC collection, this file must contain the list of files to index. The corresponding property is collection.spec and by default its value is collection.spec. If a relative path is given, TERRIER_ETC will be prefixed.
    • TREC_RESULTS

      public static java.lang.String TREC_RESULTS
      The name of the directory where the results are stored. The corresponding property is trec.results and the default value is results. If a relative path is given, TERRIER_VAR will be prefixed.
    • TREC_RESULTS_SUFFIX

      public static java.lang.String TREC_RESULTS_SUFFIX
      The suffix of the files, where the results are stored. It corresponds to the property trec.results.suffix and the default value is .res.
    • MAX_TERM_LENGTH

      public static int MAX_TERM_LENGTH
      The maximum size of a term. It corresponds to the property max.term.length, and the default value is 20.
      Since:
      1.1.0
    • IGNORE_EMPTY_DOCUMENTS

      public static boolean IGNORE_EMPTY_DOCUMENTS
      Ignore or not empty documents. That is, if it is true, then a document that does not contain any terms will have a corresponding entry in the .docid file and the total number of documents in the statistics will be the total number of documents in the collection, even if some of them are empty. It corresponds to the property ignore.empty.documents and the default value is false.
    • TERRIER_INDEX_PREFIX

      public static java.lang.String TERRIER_INDEX_PREFIX
      The prefix of the data structures' filenames. It corresponds to the property terrier.index.prefix and the default value is data.
    • EXPANSION_TERMS

      @Deprecated public static int EXPANSION_TERMS
      Deprecated.
      The number of terms added to the original query. The corresponding property is expansion.terms and the default value is 10.
    • EXPANSION_DOCUMENTS

      @Deprecated public static int EXPANSION_DOCUMENTS
      Deprecated.
      The number of top ranked documents considered for expanding the query. The corresponding property is expansion.documents and the default value is 3.
    • BLOCK_SIZE

      public static int BLOCK_SIZE
      The size of a block of terms in a document. The corresponding property is blocks.size and the default value is 1.
    • MAX_BLOCKS

      public static int MAX_BLOCKS
      The maximum number of blocks in a document. The corresponding property is blocks.max and the default value is 100000.
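
      One plausible reading of how BLOCK_SIZE and MAX_BLOCKS interact is sketched below. The mapping from term position to block id, and in particular the capping of late terms into the final block, are assumptions for illustration; BlockIdSketch and blockId are hypothetical names.

```java
// Hypothetical illustration; the capping rule is an assumption, not documented behaviour.
class BlockIdSketch {
    /**
     * Maps a 0-based term position to a 0-based block id.
     * Positions beyond the last permitted block are folded into that last block.
     */
    static int blockId(int termPosition, int blockSize, int maxBlocks) {
        return Math.min(termPosition / blockSize, maxBlocks - 1);
    }

    public static void main(String[] args) {
        int blockSize = 1;       // documented default of blocks.size
        int maxBlocks = 100000;  // documented default of blocks.max
        for (int pos = 0; pos < 5; pos++) {
            System.out.println("position " + pos + " -> block " + blockId(pos, blockSize, maxBlocks));
        }
    }
}
```

      With the default block size of 1, each term position falls into its own block.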
    • BLOCK_INDEXING

      public static boolean BLOCK_INDEXING
      Specifies whether block information will be used for indexing. The corresponding property is block.indexing and the default value is false. The value of this property cannot be modified after the index of a collection has been built.
    • MEMORY_THRESHOLD_SINGLEPASS

      public static int MEMORY_THRESHOLD_SINGLEPASS
      Memory threshold in the single pass inversion method. If a memory check is below this value, the postings in memory are written to disk. Default is 50 000 000 (50MB) and 100 000 000 (100MB) for 32bit and 64bit JVMs respectively, and can be configured using the property memory.reserved.
    • DOCS_CHECK_SINGLEPASS

      public static int DOCS_CHECK_SINGLEPASS
      Number of documents between each memory check in the single pass inversion method. The default value is 20, and this can be configured using the property docs.check.
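
      The interplay of DOCS_CHECK_SINGLEPASS and MEMORY_THRESHOLD_SINGLEPASS can be pictured as a periodic check inside the indexing loop. The sketch below is illustrative only: MemoryCheckSketch, checkDue and shouldFlush are hypothetical names, and measuring free memory with Runtime.freeMemory() is an assumption about how the check might be made.

```java
// Hypothetical illustration of a periodic memory check during single-pass indexing.
class MemoryCheckSketch {
    /** True when a memory check is due after this many processed documents. */
    static boolean checkDue(int docsProcessed, int docsPerCheck) {
        return docsProcessed > 0 && docsProcessed % docsPerCheck == 0;
    }

    /** True when free memory has dropped below the reserved threshold. */
    static boolean shouldFlush(long freeBytes, long reservedBytes) {
        return freeBytes < reservedBytes;
    }

    public static void main(String[] args) {
        long reserved = 50_000_000L; // documented default for 32bit JVMs (memory.reserved)
        int docsPerCheck = 20;       // documented default (docs.check)
        for (int doc = 1; doc <= 100; doc++) {
            // ... a real indexer would process document number 'doc' here ...
            if (checkDue(doc, docsPerCheck)
                    && shouldFlush(Runtime.getRuntime().freeMemory(), reserved)) {
                System.out.println("would write postings to disk after document " + doc);
            }
        }
    }
}
```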
    • LOG4J_CONFIG

      public static java.lang.String LOG4J_CONFIG
      The configuration file used by log4j
    • loadedPlugins

      protected static java.util.List<ApplicationSetup.TerrierApplicationPlugin> loadedPlugins
      list of loaded plugins
  • Constructor Details

  • Method Details

    • bootstrapInitialisation

      public static void bootstrapInitialisation()
      Forces ApplicationSetup initialisation
    • loadResources

      public static java.util.List<java.io.InputStream> loadResources​(java.lang.String name, java.lang.ClassLoader classLoader) throws java.io.IOException
      Throws:
      java.io.IOException
    • getClass

      public static java.lang.Class<?> getClass​(java.lang.String name) throws java.lang.ClassNotFoundException
      Throws:
      java.lang.ClassNotFoundException
    • getClass

      public static java.lang.Class<?> getClass​(java.lang.String name, boolean load) throws java.lang.ClassNotFoundException
      Throws:
      java.lang.ClassNotFoundException
    • bootstrapInitialisation

      public static void bootstrapInitialisation​(java.util.Properties properties)
    • loadCommonProperties

      public static void loadCommonProperties()
      Loads the ApplicationSetup variables, e.g. ApplicationSetup.TERRIER_HOME
    • configure

      public static boolean configure​(java.io.InputStream... propertiesStreams) throws java.io.IOException
      Loads the common Terrier properties from the specified InputStream
      Throws:
      java.io.IOException
    • getProperty

      public static java.lang.String getProperty​(java.lang.String propertyKey, java.lang.String defaultValue)
      Returns the value for the specified property, falling back to the given default value if the property was not defined during the initialization of the system. The property values are read from the properties file. If the value of the property terrier.usecontext is true, then the properties file is overridden by the context. If the value of the property terrier.usecontext is false, then the properties file is overridden by system properties.
      Parameters:
      propertyKey - The property to be returned
      defaultValue - The default value used, in case it is not defined
      Returns:
      the value for the given property.
    • getUsedProperties

      public static java.util.Properties getUsedProperties()
      Returns a properties object detailing all the properties fetched during the lifetime of this class. It is of note that this is NOT the underlying appProperties table, as to update that would mean that properties fetched using their defaults, could not have different defaults in different places.
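
      The reason for keeping a separate table can be seen in a small model. This is a sketch under stated assumptions, not the actual implementation: UsedPropertiesSketch and its methods are hypothetical, and it only models a configured table plus a record of fetched values.

```java
import java.util.Properties;

// Hypothetical model: fetched values are recorded separately from configured ones.
class UsedPropertiesSketch {
    private final Properties app = new Properties();  // values as configured
    private final Properties used = new Properties(); // values as actually fetched

    void set(String key, String value) {
        app.setProperty(key, value);
    }

    String get(String key, String defaultValue) {
        String value = app.getProperty(key, defaultValue);
        if (value != null) {
            used.setProperty(key, value); // record the fetch, leave 'app' untouched
        }
        return value;
    }

    Properties getUsed() {
        return used;
    }

    boolean isConfigured(String key) {
        return app.containsKey(key);
    }

    public static void main(String[] args) {
        UsedPropertiesSketch s = new UsedPropertiesSketch();
        System.out.println(s.get("max.term.length", "20"));
        System.out.println(s.get("max.term.length", "30")); // a different default elsewhere
        System.out.println(s.getUsed());
    }
}
```

      Because a default is never written back into the configured table, two callers can fetch the same unset key with different defaults and each receive its own.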
    • getProperties

      public static java.util.Properties getProperties()
      Returns the underlying properties object for ApplicationSetup
    • setProperty

      public static void setProperty​(java.lang.String propertyKey, java.lang.String value)
      Sets a value for the specified property. The properties set with this method are not saved in the properties file.
      Parameters:
      propertyKey - the name of the property to set.
      value - the value of the property to set.
    • setDefaultProperty

      public static void setDefaultProperty​(java.lang.String propertyKey, java.lang.String defaultValue)
      Sets a property value only if it has not already been set
      Parameters:
      propertyKey - the name of the property to set.
      defaultValue - the value of the property to set.
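
      The "only if not already set" behaviour can be sketched against a plain java.util.Properties table. DefaultPropertySketch and setDefault are hypothetical names used for illustration; the real method operates on the class's own properties.

```java
import java.util.Properties;

// Hypothetical illustration of set-if-absent semantics.
class DefaultPropertySketch {
    /** Stores defaultValue under key unless the key already has a value. */
    static void setDefault(Properties props, String key, String defaultValue) {
        if (props.getProperty(key) == null) {
            props.setProperty(key, defaultValue);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("trec.results.suffix", ".run");
        setDefault(props, "trec.results.suffix", ".res"); // existing value is kept
        setDefault(props, "terrier.index.prefix", "data"); // absent key is filled in
        System.out.println(props);
    }
}
```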
    • setupPlugins

      protected static void setupPlugins()
      Calls the initialise method of any plugins named in terrier.plugins
    • getClassLoader

      public static java.lang.ClassLoader getClassLoader()
    • getPlugin

      public static ApplicationSetup.TerrierApplicationPlugin getPlugin​(java.lang.String name)
      Return a loaded plugin by name. Returns null if a plugin of that name has not been loaded
    • makeAbsolute

      public static java.lang.String makeAbsolute​(java.lang.String filename, java.lang.String DefaultPath)
      Checks whether the given filename is absolute and if not, it adds on the default path to make it absolute. If a URI scheme is present, the filename is assumed to be absolute
      Parameters:
      filename - String the filename to make absolute
      DefaultPath - String the prefix to add
      Returns:
      the absolute filename
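
      A minimal sketch of the described rule follows. It is not the library's code: MakeAbsoluteSketch is a hypothetical class, and both the scheme test (a name followed by "://") and the handling of a trailing separator on the default path are assumptions.

```java
import java.io.File;
import java.util.regex.Pattern;

// Hypothetical re-implementation of the documented rule, for illustration only.
class MakeAbsoluteSketch {
    // Assumed scheme test: a URI scheme name followed by "://".
    private static final Pattern SCHEME = Pattern.compile("^[A-Za-z][A-Za-z0-9+.-]*://.*");

    static String makeAbsolute(String filename, String defaultPath) {
        if (SCHEME.matcher(filename).matches()) {
            return filename; // a URI scheme is treated as already absolute
        }
        if (new File(filename).isAbsolute()) {
            return filename;
        }
        if (defaultPath.endsWith(File.separator)) {
            return defaultPath + filename;
        }
        return defaultPath + File.separator + filename;
    }

    public static void main(String[] args) {
        String home = new File("terrier").getAbsolutePath();
        String var = makeAbsolute("var", home);    // relative: home is prefixed
        String index = makeAbsolute("index", var); // relative: var is prefixed
        System.out.println(index);
        System.out.println(makeAbsolute("hdfs://namenode/index", var));
    }
}
```

      Chaining the calls mirrors how TERRIER_VAR is resolved against TERRIER_HOME and TERRIER_INDEX_PATH against TERRIER_VAR.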
    • clearAllProperties

      public static void clearAllProperties()
      Clears ApplicationSetup of all properties