gate.creole.ontology
Class OUtils

java.lang.Object
  extended by gate.creole.ontology.OUtils

public class OUtils
extends Object


Method Summary
static String toResourceName(String text)
          Converts a string to a form suitable for use as a resource name by the Ontology.createOURIForName(java.lang.String) method.
static String uriDecode(String uri)
          Convert a URI reference (URI or URI fragment), in US-ASCII, with escaped characters taken from UTF-8, to the corresponding Unicode string.
static String uriEncode(String uriRef)
          Convert a Unicode string (which is assumed to represent a URI or URI fragment) to an RFC 2396-compliant URI reference by first converting it to bytes in UTF-8 and then encoding the resulting bytes as specified by the RFC.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

toResourceName

public static String toResourceName(String text)
Converts a string to a form suitable for use as a resource name by the Ontology.createOURIForName(java.lang.String) method. This is not a reversible encoding, but is intended to produce "readable" resource URIs in an ontology from English source strings. The process is:
  1. Replace any slashes, colons, apostrophes and other non-URI-legal punctuation characters, and any Unicode symbol characters, with a space; that is, any punctuation except ; ? @ & = + $ , - _ . ! ~ * ( )
  2. Convert any runs of whitespace characters to a single underscore
  3. Pass the result through uriEncode(java.lang.String).
For example, this would convert "John Smith" to "John_Smith", "Allen & Heath" to "Allen_&_Heath", "N/A" to "N_A", "32 °F" to "32_F", etc.
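
The first two steps can be sketched in plain Java as follows. This is an illustrative re-implementation written from the description above, not the GATE source: the class name ResourceNameSketch and the method normalize are hypothetical, "punctuation" and "symbol" are assumed to mean the Unicode P* and S* general categories, and step 3 (the uriEncode call) is left out.

```java
class ResourceNameSketch {
    // Punctuation that the description lists as exceptions in step 1.
    private static final String KEPT = ";?@&=+$,-_.!~*()";

    // Assumption: "punctuation" and "symbol" mean the Unicode P* and S* categories.
    static boolean isPunctuationOrSymbol(int cp) {
        switch (Character.getType(cp)) {
            case Character.CONNECTOR_PUNCTUATION:
            case Character.DASH_PUNCTUATION:
            case Character.START_PUNCTUATION:
            case Character.END_PUNCTUATION:
            case Character.INITIAL_QUOTE_PUNCTUATION:
            case Character.FINAL_QUOTE_PUNCTUATION:
            case Character.OTHER_PUNCTUATION:
            case Character.MATH_SYMBOL:
            case Character.CURRENCY_SYMBOL:
            case Character.MODIFIER_SYMBOL:
            case Character.OTHER_SYMBOL:
                return true;
            default:
                return false;
        }
    }

    // Hypothetical helper covering steps 1 and 2 only.
    static String normalize(String text) {
        StringBuilder sb = new StringBuilder();
        // Step 1: replace punctuation and symbols outside the kept set with a space.
        text.codePoints().forEach(cp -> {
            if (KEPT.indexOf(cp) < 0 && isPunctuationOrSymbol(cp)) {
                sb.append(' ');
            } else {
                sb.appendCodePoint(cp);
            }
        });
        // Step 2: collapse each run of whitespace into a single underscore.
        return sb.toString().replaceAll("\\s+", "_");
    }

    public static void main(String[] args) {
        System.out.println(normalize("John Smith"));
        System.out.println(normalize("Allen & Heath"));
        System.out.println(normalize("N/A"));
        System.out.println(normalize("32 \u00B0F"));
    }
}
```

For the four documented inputs, every character left after step 2 is already permitted in a URI reference, so the omitted encoding step would not change them.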


uriEncode

public static String uriEncode(String uriRef)
Convert a Unicode string (which is assumed to represent a URI or URI fragment) to an RFC 2396-compliant URI reference by first converting it to bytes in UTF-8 and then encoding the resulting bytes as specified by the RFC. ASCII letters, numbers and the other characters that are permitted in URI references are left unchanged, existing %NN escape sequences are left unchanged, and any other characters are %-escaped as appropriate. In particular any % characters in the original string that are not part of a %NN escape sequence will themselves be encoded as %25.

Parameters:
uriRef - The Unicode string to encode, assumed to represent a URI or URI fragment.
Returns:
The corresponding RFC 2396-compliant URI reference, in US-ASCII.
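
The encoding rules described above can be sketched roughly as follows. This is not the GATE implementation: the class UriEncodeSketch and its encode method are hypothetical, and the set of characters left unescaped is an assumption taken from the RFC 2396 "reserved" and "unreserved" productions plus '#'.

```java
import java.nio.charset.StandardCharsets;

class UriEncodeSketch {
    // Assumption: RFC 2396 "mark" and "reserved" characters, plus '#', stay unescaped.
    private static final String ALLOWED = "-_.!~*'()" + ";/?:@&=+$," + "#";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    private static boolean isAsciiAlnum(int cp) {
        return (cp >= '0' && cp <= '9') || (cp >= 'a' && cp <= 'z') || (cp >= 'A' && cp <= 'Z');
    }

    static String encode(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            if (isAsciiAlnum(cp) || (cp < 128 && ALLOWED.indexOf(cp) >= 0)) {
                // Permitted character: copy unchanged.
                out.append((char) cp);
            } else if (cp == '%' && i + 2 < s.length()
                    && isHex(s.charAt(i + 1)) && isHex(s.charAt(i + 2))) {
                // Existing %NN escape: copy all three characters unchanged.
                out.append(s, i, i + 3);
                i += 3;
                continue;
            } else {
                // Anything else, including a stray '%': escape each UTF-8 byte.
                byte[] bytes = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                for (byte b : bytes) {
                    out.append('%').append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
                }
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("caf\u00E9"));  // non-ASCII character
        System.out.println(encode("100%"));       // stray percent sign
        System.out.println(encode("a%20b"));      // existing escape
    }
}
```

In this sketch a stray % falls through to the byte-escaping branch and so comes out as %25, while a well-formed %NN sequence is copied as-is, which is what makes the operation safe to apply to already-encoded input.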

uriDecode

public static String uriDecode(String uri)
Convert a URI reference (URI or URI fragment), in US-ASCII, with escaped characters taken from UTF-8, to the corresponding Unicode string. On ill-formed input the result is undefined: in particular, if the unescaped bytes do not form valid UTF-8, some string is still returned, but its content is unspecified.

Parameters:
uri - The URI reference, in the characters permitted by RFC 2396 plus '#'.
Returns:
The corresponding Unicode String.
Throws:
IllegalArgumentException - If a % hex sequence is ill-formed.
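
A minimal sketch of the decoding direction, under the same caveats: UriDecodeSketch and decode are hypothetical names, and this is an illustration of the documented behaviour rather than the GATE code. Runs of %NN escapes are collected as bytes and interpreted as UTF-8; a malformed escape raises IllegalArgumentException. Passing non-escaped characters through untouched is an assumption of the sketch.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class UriDecodeSketch {
    // Returns the value of a hex digit, or -1 if c is not one.
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Decodes the pending escaped bytes as UTF-8 and appends them to out.
    private static void flush(ByteArrayOutputStream run, StringBuilder out) {
        if (run.size() > 0) {
            out.append(new String(run.toByteArray(), StandardCharsets.UTF_8));
            run.reset();
        }
    }

    static String decode(String uri) {
        StringBuilder out = new StringBuilder();
        ByteArrayOutputStream run = new ByteArrayOutputStream();
        int i = 0;
        while (i < uri.length()) {
            char c = uri.charAt(i);
            if (c == '%') {
                if (i + 2 >= uri.length()) {
                    throw new IllegalArgumentException("Truncated escape at index " + i);
                }
                int hi = hexValue(uri.charAt(i + 1));
                int lo = hexValue(uri.charAt(i + 2));
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Bad hex escape at index " + i);
                }
                run.write((hi << 4) | lo);
                i += 3;
            } else {
                // Non-escaped character: emit pending bytes, then copy it unchanged.
                flush(run, out);
                out.append(c);
                i++;
            }
        }
        flush(run, out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("a%20b"));
        System.out.println(decode("John_Smith"));
    }
}
```

Unlike java.net.URLDecoder, this sketch does not turn '+' into a space, since that convention belongs to HTML form encoding rather than to URI references.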