public class PyUnicode extends PyString implements java.lang.Iterable<java.lang.Integer>
Nested classes inherited: PySequence.DefaultIndexDelegate, PyObject.ConversionException

| Modifier and Type | Field and Description |
|---|---|
| static PyType | TYPE |
Fields inherited: delegator, attributes, gcMonitorGlobal, objtype

| Constructor and Description |
|---|
| PyUnicode() |
| PyUnicode(char c) |
| PyUnicode(java.util.Collection<java.lang.Integer> ucs4) |
| PyUnicode(int codepoint) |
| PyUnicode(int[] codepoints) |
| PyUnicode(java.util.Iterator<java.lang.Integer> iter) |
| PyUnicode(PyString pystring) |
| PyUnicode(PyType subtype, PyString pystring) |
| PyUnicode(PyType subtype, java.lang.String string) |
| PyUnicode(java.lang.String string): Construct a PyUnicode interpreting the Java String argument as UTF-16. |
| PyUnicode(java.lang.String string, boolean isBasic): Construct a PyUnicode interpreting the Java String argument as UTF-16. |
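
Several constructors above accept code points (an int[], a Collection or an Iterator of Integer) while others accept a Java String interpreted as UTF-16. The following is a minimal JDK-only sketch of how code points map onto UTF-16 chars. The class and method names (CodePointsToUtf16, fromCodePoints) are hypothetical, and nothing here is taken from the Jython implementation.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical helper, not part of Jython: shows the code point to UTF-16 mapping.
final class CodePointsToUtf16 {

    /** Encode an array of code points as a UTF-16 Java String. */
    static String fromCodePoints(int[] codepoints) {
        // The JDK constructor throws IllegalArgumentException for an invalid code point.
        return new String(codepoints, 0, codepoints.length);
    }

    /** Encode code points drawn one at a time from an iterator. */
    static String fromCodePoints(Iterator<Integer> iter) {
        StringBuilder sb = new StringBuilder();
        while (iter.hasNext()) {
            // Appends one char for a BMP code point, a surrogate pair otherwise.
            sb.appendCodePoint(iter.next());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+10002 lies outside the BMP, so it occupies two chars.
        String s = fromCodePoints(new int[] {0x41, 0x10002, 0x42});
        System.out.println(s.length());                      // 4
        System.out.println(s.codePointCount(0, s.length())); // 3
        String t = fromCodePoints(Arrays.asList(0x41, 0x10002, 0x42).iterator());
        System.out.println(s.equals(t));                     // true
    }
}
```

The difference between the char length (4) and the code point count (3) is why the rest of this class distinguishes code point indices from UTF-16 indices.
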
| Modifier and Type | Method and Description |
|---|---|
| PyObject | __add__(PyObject other): Equivalent to the standard Python __add__ method. |
| PyComplex | __complex__(): Equivalent to the standard Python __complex__ method. |
| boolean | __contains__(PyObject o): Equivalent to the standard Python __contains__ method. |
| PyObject | __eq__(PyObject other): Equivalent to the standard Python __eq__ method. |
| PyObject | __format__(PyObject formatSpec) |
| PyObject | __ge__(PyObject other): Equivalent to the standard Python __ge__ method. |
| PyObject | __gt__(PyObject other): Equivalent to the standard Python __gt__ method. |
| PyObject | __le__(PyObject other): Equivalent to the standard Python __le__ method. |
| int | __len__(): Equivalent to the standard Python __len__ method. |
| PyObject | __lt__(PyObject other): Equivalent to the standard Python __lt__ method. |
| PyObject | __mod__(PyObject other): Equivalent to the standard Python __mod__ method. |
| PyObject | __ne__(PyObject other): Equivalent to the standard Python __ne__ method. |
| PyString | __repr__(): Equivalent to the standard Python __repr__ method. |
| PyString | __str__(): Equivalent to the standard Python __str__ method. |
| PyUnicode | __unicode__() |
| protected int | _findLeft(int right): Helper for strip, lstrip implementation, when stripping whitespace. |
| protected int | _findRight(): Helper for strip, rstrip implementation, when stripping whitespace. |
| double | atof(): Convert this PyString to a floating-point value according to Python rules. |
| int | atoi(int base) |
| PyLong | atol(int base) |
| static java.lang.String | checkEncoding(java.lang.String s) |
| PyString | createInstance(java.lang.String string): Create an instance of the same type as this object, from the Java String given as argument. |
| protected PyString | createInstance(java.lang.String string, boolean isBasic): Create an instance of the same type as this object, from the Java String given as argument. |
| boolean | endswith(PyObject suffix, PyObject start, PyObject end): Equivalent to the Python unicode.endswith method, testing whether a string ends with a specified suffix, where a sub-range is specified by [start:end]. |
| static PyUnicode | from(char c): Return a not-necessarily new PyUnicode from a Java char. |
| static PyUnicode | fromCodepoint(int codepoint): Return a not-necessarily new PyUnicode from a Java code point. |
| static PyUnicode | fromInterned(java.lang.String s): Returns a PyUnicode from an already interned String. |
| static PyUnicode | fromString(java.lang.String s, boolean isBasic): Return a not-necessarily new PyUnicode from a Java String. |
| protected PyString | fromSubstring(int begin, int end): Return a new object of the same type as this one equal to the slice [begin:end]. |
| PyBuffer | getBuffer(int flags): PyUnicode implements the interface BufferProtocol technically by inheritance from PyString, but does not provide a buffer (in CPython). |
| int | getCodePointCount() |
| int | getInt(int i) |
| protected PyObject | getslice(int start, int stop, int step): Returns a range of elements from the sequence. |
| boolean | isBasicPlane(): Determine whether the string consists entirely of basic-plane characters. |
| java.util.Iterator<java.lang.Integer> | iterator() |
| PyString | join(PyObject seq) |
| java.util.Iterator<java.lang.Integer> | newSubsequenceIterator(): Get an iterator over the code point sequence. |
| java.util.Iterator<java.lang.Integer> | newSubsequenceIterator(int start, int stop, int step): Get an iterator over a slice of the code point sequence. |
| PyTuple | partition(PyObject sep): Equivalent to Python str.partition(), splits the PyString at the first occurrence of sepObj returning a PyTuple containing the part before the separator, the separator itself, and the part after the separator. |
| protected PyObject | pyget(int i): Returns the element of the sequence at the given index. |
| PyTuple | rpartition(PyObject sep): Equivalent to Python str.rpartition(), splits the PyString at the last occurrence of sepObj returning a PyTuple containing the part before the separator, the separator itself, and the part after the separator. |
| protected PyList | rsplitfields(int maxsplit): Helper function for .rsplit, in str and (when overridden) in unicode, splitting on white space and returning a list of the separated parts. |
| protected PyList | splitfields(int maxsplit): Helper function for .split, in str and (when overridden) in unicode, splitting on white space and returning a list of the separated parts. |
| boolean | startswith(PyObject prefix, PyObject start, PyObject end): Equivalent to the Python unicode.startswith method, testing whether a string starts with a specified prefix, where a sub-range is specified by [start:end]. |
| java.lang.String | substring(int start, int end): Return a substring of this object as a Java String. |
| int[] | toCodePoints() |
| protected int[] | translateIndices(PyObject start, PyObject end): Many of the string methods deal with slices specified using Python slice semantics: endpoints, which are PyObjects, may be null or None (meaning default to one end or the other) or may be negative (meaning "from the end"). |
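
The slice handling summarised for translateIndices in the last row of the table above can be sketched with plain JDK calls. This is an illustration under stated assumptions, not Jython's code: the class SliceIndices and its translate method are hypothetical, a null Integer stands in for a missing (null or None) endpoint, and the result layout (clamped UTF-16 offsets first, then the resolved code point indices) follows the description of the returned array given in the method details.

```java
// Hypothetical sketch, not Jython's implementation of translateIndices.
final class SliceIndices {

    /**
     * Resolve Python-style slice endpoints given as code point indices.
     * Returns {utf16Start, utf16End, start, end}: the last two hold the
     * endpoints after resolving null and negative values, the first two hold
     * the range clamped to 0 <= start <= end <= N as UTF-16 offsets.
     */
    static int[] translate(String s, Integer start, Integer end) {
        int n = s.codePointCount(0, s.length()); // N, counted in code points
        int iStart = (start == null) ? 0 : start;
        int iEnd = (end == null) ? n : end;
        if (iStart < 0) {
            iStart += n; // negative means "from the end"
        }
        if (iEnd < 0) {
            iEnd += n;
        }
        // Reduce to the nearest points satisfying 0 <= cStart <= cEnd <= n.
        int cEnd = Math.max(0, Math.min(iEnd, n));
        int cStart = Math.max(0, Math.min(iStart, cEnd));
        // Convert the clamped code point indices to UTF-16 offsets.
        int u0 = s.offsetByCodePoints(0, cStart);
        int u1 = s.offsetByCodePoints(u0, cEnd - cStart);
        return new int[] {u0, u1, iStart, iEnd};
    }

    public static void main(String[] args) {
        // Two dots, two supplementary characters, three dots: 7 code points, 9 chars.
        String s = "..\uD800\uDC02\uD800\uDC03...";
        System.out.println(java.util.Arrays.toString(translate(s, 2, 4)));
        System.out.println(java.util.Arrays.toString(translate(s, -3, null)));
    }
}
```

Keeping both the clamped and the unclamped values lets a caller detect a slice that lay partly outside the string while still having safe indices for the underlying String.
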
Methods inherited from class PyString:
__cmp__, __float__, __getnewargs__, __int__, __invert__, __long__, __mul__, __neg__, __pos__, __rmul__, __tojava__, _count, _find, _lstrip, _lstrip, _replace, _rfind, _rsplit, _rstrip, _rstrip, _split, _strip, _strip, asDouble, asInt, asLong, asName, asString, asString, asU16BytesOrError, atoi, atol, buildFormattedString, capitalize, center, charAt, checkIndex, count, count, count, count, count, count, decode_UnicodeEscape, decode, decode, decode, encode_UnicodeEscape, encode, encode, encode, endswith, endswith, expandtabs, expandtabs, find, find, find, find, find, find, getString, hashCode, index, index, index, index, index, index, internedString, isalnum, isalpha, isdecimal, isdigit, islower, isnumeric, isspace, istitle, isunicode, isupper, length, ljust, ljust, lower, lstrip, lstrip, lstrip, repeat, replace, replace, rfind, rfind, rfind, rfind, rfind, rfind, rindex, rindex, rindex, rindex, rindex, rindex, rjust, rsplit, rsplit, rsplit, rsplit, rsplit, rstrip, rstrip, rstrip, split, split, split, split, split, splitlines, splitlines, startswith, startswith, str___mod__, strip, strip, strip, subSequence, swapcase, title, toBytes, toString, translate, translate, translate, translate, unsupportedopMessage, upper, zfill

Methods inherited from class PySequence:
__delitem__, __delslice__, __finditem__, __finditem__, __getitem__, __getslice__, __iter__, __nonzero__, __setitem__, __setitem__, __setslice__, boundToSequence, cmp, del, delRange, delslice, fastSequence, isMappingType, isNumberType, isSequenceType, isSubType, pyset, runsupportedopMessage, setslice, sliceLength

Methods inherited from class PyObject:
__abs__, __and__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __coerce__, __coerce_ex__, __delattr__, __delattr__, __delete__, __delitem__, __delslice__, __dir__, __div__, __divmod__, __ensure_finalizer__, __findattr__, __findattr__, __findattr_ex__, __finditem__, __floordiv__, __get__, __getattr__, __getattr__, __getitem__, __getslice__, __hash__, __hex__, __iadd__, __iand__, __idiv__, __idivmod__, __ifloordiv__, __ilshift__, __imod__, __imul__, __index__, __ior__, __ipow__, __irshift__, __isub__, __iternext__, __itruediv__, __ixor__, __lshift__, __not__, __oct__, __or__, __pow__, __pow__, __radd__, __rand__, __rawdir__, __rdiv__, __rdivmod__, __reduce__, __reduce_ex__, __reduce_ex__, __rfloordiv__, __rlshift__, __rmod__, __ror__, __rpow__, __rrshift__, __rshift__, __rsub__, __rtruediv__, __rxor__, __set__, __setattr__, __setattr__, __setitem__, __setslice__, __sub__, __truediv__, __trunc__, __xor__, _add, _and, _callextra, _cmp, _div, _divmod, _doget, _doget, _doset, _eq, _floordiv, _ge, _gt, _iadd, _iand, _idiv, _idivmod, _ifloordiv, _ilshift, _imod, _imul, _in, _ior, _ipow, _irshift, _is, _isnot, _isub, _itruediv, _ixor, _jcall, _jcallexc, _jthrow, _le, _lshift, _lt, _mod, _mul, _ne, _notin, _or, _pow, _rshift, _sub, _truediv, _unsupportedop, _xor, adaptToCoerceTuple, asIndex, asIndex, asInt, asIterable, asLong, asName, asStringOrNull, asStringOrNull, bit_length, conjugate, delDict, delType, dispatch__init__, equals, fastGetClass, fastGetDict, finalize, getDict, getJavaProxy, getType, impAttr, implementsDescrDelete, implementsDescrGet, implementsDescrSet, invoke, invoke, invoke, invoke, invoke, invoke, isCallable, isDataDescr, isIndex, isInteger, mergeClassDict, mergeDictAttr, mergeListAttr, noAttributeError, object___subclasshook__, readonlyAttributeError, setDict, setType

public static final PyType TYPE

public PyUnicode()

public PyUnicode(java.lang.String string)
Construct a PyUnicode interpreting the Java String argument as UTF-16.
Parameters:
string - UTF-16 string encoding the characters (as Java).

public PyUnicode(java.lang.String string, boolean isBasic)
Construct a PyUnicode interpreting the Java String argument as UTF-16.
Parameters:
string - UTF-16 string encoding the characters (as Java).
isBasic - true if it is known that only BMP characters are present.

public PyUnicode(PyType subtype, java.lang.String string)

public PyUnicode(PyString pystring)

public PyUnicode(char c)

public PyUnicode(int codepoint)

public PyUnicode(int[] codepoints)

public PyUnicode(java.util.Iterator<java.lang.Integer> iter)

public PyUnicode(java.util.Collection<java.lang.Integer> ucs4)

public int[] toCodePoints()
Overrides: toCodePoints in class PyString

public PyBuffer getBuffer(int flags) throws java.lang.ClassCastException
PyUnicode implements the interface BufferProtocol technically by inheritance from PyString, but does not provide a buffer (in CPython). We therefore arrange that all calls to getBuffer raise an error.
Specified by: getBuffer in interface BufferProtocol
Overrides: getBuffer in class PyString
Parameters:
flags - consumer requirements
Throws:
java.lang.ClassCastException - when the object only formally implements BufferProtocol

protected int[] translateIndices(PyObject start, PyObject end)
Many of the string methods deal with slices specified using Python slice semantics: endpoints, which are PyObjects, may be null or None (meaning default to one end or the other) or may be negative (meaning "from the end"). Meanwhile, the implementation methods need integer indices, both within the array, and 0<=start<=end<=N, where N is the length of the array.

This method first translates the Python slice endpoints start and end according to the slice semantics for null and negative values, and stores these in elements [2] and [3] of the result. Then, since the end points of the range may lie outside this sequence's bounds (in either direction) it reduces them to the nearest points satisfying 0<=start<=end<=N, and stores these in elements [0] and [1] of the result.

In the PyUnicode version, the arguments are code point indices, such as are received from the Python caller, while the first two elements of the returned array have been translated to UTF-16 indices in the implementation string.
Overrides: translateIndices in class PyString
Parameters:
start - Python start of slice
end - Python end of slice

public java.lang.String substring(int start, int end)
Return a substring of this object as a Java String. The indices are code point indices, not UTF-16 (char) indices. For example:
PyUnicode u = new PyUnicode("..𐀂𐀃...");
// (Python) u = u'..\U00010002\U00010003...'
String s = u.substring(2, 4); // = "𐀂𐀃" (Java)
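
For comparison, the same result can be reproduced on a plain Java String with JDK methods only. This is a sketch of the indexing idea, not the PyUnicode implementation; the class CodePointSubstring and its method are hypothetical names introduced for illustration.

```java
// Hypothetical sketch: code point indexed substring on a plain Java String.
final class CodePointSubstring {

    /** Return the code points in [start, end) of s as a Java String. */
    static String substring(String s, int start, int end) {
        // Translate code point indices into UTF-16 (char) offsets.
        int from = s.offsetByCodePoints(0, start);
        int to = s.offsetByCodePoints(from, end - start);
        return s.substring(from, to);
    }

    public static void main(String[] args) {
        String s = "..\uD800\uDC02\uD800\uDC03...";
        // Code point indices 2..4 cover both supplementary characters (4 chars).
        System.out.println(substring(s, 2, 4).length()); // 4
        // The same numbers used as char indices cover only the first one (2 chars).
        System.out.println(s.substring(2, 4).length());  // 2
    }
}
```
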

public static PyUnicode fromInterned(java.lang.String s)
Returns a PyUnicode from an already interned String.

public static PyUnicode fromString(java.lang.String s, boolean isBasic)
Return a not-necessarily new PyUnicode from a Java String.
Parameters:
s - UTF-16 string encoding the characters (as Java).
isBasic - true if it is known that only BMP characters are present.
Returns: PyUnicode

public static PyUnicode from(char c)
Return a not-necessarily new PyUnicode from a Java char. Some low index chars (ASCII) return a re-used PyUnicode. This method does not assume the character is basic-plane.
Parameters:
c - to convert to a PyUnicode.
Returns: PyUnicode

public static PyUnicode fromCodepoint(int codepoint)
Return a not-necessarily new PyUnicode from a Java code point.
Parameters:
codepoint - of the single character required
Returns: PyUnicode for the character

public boolean isBasicPlane()
Determine whether the string consists entirely of basic-plane characters. For a PyString, of course, it is always true, but this is useful in cases where either a PyString or a PyUnicode is acceptable.
Overrides: isBasicPlane in class PyString

public int getCodePointCount()

public static java.lang.String checkEncoding(java.lang.String s)
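
What isBasicPlane and getCodePointCount report can be approximated for a plain Java String as follows. This is a sketch only: the class BasicPlaneCheck is hypothetical, and Jython may well determine the answer differently, for example by using the isBasic hint accepted by the constructors.

```java
// Hypothetical sketch: basic-plane test and code point count on a Java String.
final class BasicPlaneCheck {

    /** True if no char of s is a surrogate, so every char is one BMP character. */
    static boolean isBasicPlane(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isSurrogate(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /** Number of Unicode code points in s (a surrogate pair counts once). */
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String basic = "abc";
        String mixed = "a\uD800\uDC02";
        System.out.println(isBasicPlane(basic) + " " + codePointCount(basic)); // true 3
        System.out.println(isBasicPlane(mixed) + " " + codePointCount(mixed)); // false 2
    }
}
```

When the test returns true, code point indices and char indices coincide, so no index translation is needed.
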
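
public PyString createInstance(java.lang.String string)
Description copied from class: PyString
Create an instance of the same type as this object, from the Java String given as argument.
Overrides: createInstance in class PyString
Parameters:
string - to wrap

protected PyString createInstance(java.lang.String string, boolean isBasic)
Description copied from class: PyString
Create an instance of the same type as this object, from the Java String given as argument.
Overrides: createInstance in class PyString
Parameters:
string - UTF-16 string encoding the characters (as Java).
isBasic - true if it is known that only BMP characters are present.

public PyObject __mod__(PyObject other)
Description copied from class: PyObject
Equivalent to the standard Python __mod__ method.

public PyUnicode __unicode__()
Overrides: __unicode__ in class PyString

public PyString __str__()
Description copied from class: PyObject
Equivalent to the standard Python __str__ method. The default implementation (in PyObject) calls PyObject.__repr__(), making it unnecessary to override __str__ in sub-classes of PyObject where both forms are the same. A common choice is to provide the same implementation to __str__ and toString, for consistency in the printed form of objects between Python and Java.

public int __len__()
Description copied from class: PyObject
Equivalent to the standard Python __len__ method.

public PyString __repr__()
Description copied from class: PyObject
Equivalent to the standard Python __repr__ method. Each sub-class of PyObject is likely to re-define this method to provide for its own reproduction.

protected PyObject getslice(int start, int stop, int step)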
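Description copied from class: PySequence
Returns a range of elements from the sequence.

public PyObject __eq__(PyObject other)
Description copied from class: PyObject
Equivalent to the standard Python __eq__ method.

public PyObject __ne__(PyObject other)
Description copied from class: PyObject
Equivalent to the standard Python __ne__ method.

public PyObject __lt__(PyObject other)
Description copied from class: PyObject
Equivalent to the standard Python __lt__ method.

public PyObject __le__(PyObject other)
Description copied from class: PyObject
Equivalent to the standard Python __le__ method.

public PyObject __gt__(PyObject other)
Description copied from class: PyObject
Equivalent to the standard Python __gt__ method.

public PyObject __ge__(PyObject other)
Description copied from class: PyObject
Equivalent to the standard Python __ge__ method.

protected PyObject pyget(int i)
Description copied from class: PySequence
Returns the element of the sequence at the given index. See PySequence.__getitem__(org.python.core.PyObject). It is guaranteed by PySequence that when it calls pyget(int) the index is within the bounds of the array. Any other clients must make the same guarantee.

public java.util.Iterator<java.lang.Integer> newSubsequenceIterator()
Get an iterator over the code point sequence.

public java.util.Iterator<java.lang.Integer> newSubsequenceIterator(int start, int stop, int step)
Get an iterator over a slice of the code point sequence.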
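
Iterating over a slice of the code point sequence, as the three-argument newSubsequenceIterator does, can be illustrated with the JDK alone. The sketch below is deliberately simple and eager (it copies the code points into an array first); the class CodePointSlice is hypothetical, it assumes a positive step, and it makes no claim about how Jython's iterator is built.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: iterate over code points start, start+step, ... below stop.
final class CodePointSlice {

    static Iterator<Integer> sliceIterator(String s, int start, int stop, int step) {
        if (step <= 0) {
            throw new IllegalArgumentException("this sketch handles positive steps only");
        }
        int[] cps = s.codePoints().toArray(); // one int per code point
        int last = Math.min(stop, cps.length);
        List<Integer> out = new ArrayList<>();
        for (int i = Math.max(start, 0); i < last; i += step) {
            out.add(cps[i]);
        }
        return out.iterator();
    }

    public static void main(String[] args) {
        // Code points: a, U+10002, b, U+10003, c
        String s = "a\uD800\uDC02b\uD800\uDC03c";
        Iterator<Integer> it = sliceIterator(s, 0, 5, 2);
        while (it.hasNext()) {
            System.out.println(Integer.toHexString(it.next())); // 61, 62, 63
        }
    }
}
```
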

public boolean __contains__(PyObject o)
Description copied from class: PyObject
Equivalent to the standard Python __contains__ method.
Overrides: __contains__ in class PyString
Parameters:
o - the element to search for in this container.

protected int _findLeft(int right)
Helper for strip, lstrip implementation, when stripping whitespace.

protected int _findRight()
Helper for strip, rstrip implementation, when stripping whitespace.
Overrides: _findRight in class PyString

public PyTuple partition(PyObject sep)
Description copied from class: PyString
Equivalent to Python str.partition(), splits the PyString at the first occurrence of sepObj returning a PyTuple containing the part before the separator, the separator itself, and the part after the separator.
Overrides: partition in class PyString
Parameters:
sep - str, unicode or object implementing BufferProtocol

public PyTuple rpartition(PyObject sep)
Description copied from class: PyString
Equivalent to Python str.rpartition(), splits the PyString at the last occurrence of sepObj returning a PyTuple containing the part before the separator, the separator itself, and the part after the separator.
Overrides: rpartition in class PyString
Parameters:
sep - str, unicode or object implementing BufferProtocol

protected PyList splitfields(int maxsplit)
Helper function for .split, in str and (when overridden) in unicode, splitting on white space and returning a list of the separated parts. If there are more than maxsplit feasible splits the last element of the list is the remainder of the original (this) string. The split sections will be PyUnicode and use the Python unicode definition of "space".
Overrides: splitfields in class PyString
Parameters:
maxsplit - limit on the number of splits (if >=0)
Returns: PyList of split sections

protected PyList rsplitfields(int maxsplit)
Helper function for .rsplit, in str and (when overridden) in unicode, splitting on white space and returning a list of the separated parts. If there are more than maxsplit feasible splits the first element of the list is the remainder of the original (this) string. The split sections will be PyUnicode and use the Python unicode definition of "space".
Overrides: rsplitfields in class PyString
Parameters:
maxsplit - limit on the number of splits (if >=0)
Returns: PyList of split sections

protected PyString fromSubstring(int begin, int end)
Description copied from class: PyString
Return a new object of the same type as this one equal to the slice [begin:end]. (Python end-relative indexes etc. are not supported.) Subclasses (fromSubstring(int, int)) override this to return their own type.
Overrides: fromSubstring in class PyString
Parameters:
begin - first included character.
end - first excluded character.

public boolean startswith(PyObject prefix, PyObject start, PyObject end)
Equivalent to the Python unicode.startswith method, testing whether a string starts with a specified prefix, where a sub-range is specified by [start:end]. Arguments start and end are interpreted as in slice notation, with null or Py.None representing "missing". prefix can also be a tuple of prefixes to look for.
Overrides: startswith in class PyString
Parameters:
prefix - string to check for (or a PyTuple of them).
start - start of slice.
end - end of slice.
Returns: true if this string slice starts with a specified prefix, otherwise false.

public boolean endswith(PyObject suffix, PyObject start, PyObject end)
Equivalent to the Python unicode.endswith method, testing whether a string ends with a specified suffix, where a sub-range is specified by [start:end]. Arguments start and end are interpreted as in slice notation, with null or Py.None representing "missing". suffix can also be a tuple of suffixes to look for.

public PyObject __format__(PyObject formatSpec)
Overrides: __format__ in class PyString

public java.util.Iterator<java.lang.Integer> iterator()
Specified by: iterator in interface java.lang.Iterable<java.lang.Integer>

public PyComplex __complex__()
Description copied from class: PyObject
Equivalent to the standard Python __complex__ method.
Overrides: __complex__ in class PyString
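
The three-part results of partition and rpartition described earlier can be illustrated on plain Java Strings. This sketch mirrors the documented behaviour of Python's str.partition and str.rpartition, including what Python returns when the separator is absent; the class Partition and the use of a String[] in place of a PyTuple are assumptions made for the example, and it is not Jython's code.

```java
// Hypothetical sketch of Python-style partition and rpartition on Java Strings.
final class Partition {

    /** Split at the first occurrence of sep: {before, sep, after}. */
    static String[] partition(String s, String sep) {
        if (sep.isEmpty()) {
            // Python raises ValueError for an empty separator.
            throw new IllegalArgumentException("empty separator");
        }
        int i = s.indexOf(sep);
        if (i < 0) {
            return new String[] {s, "", ""}; // separator absent: whole string first
        }
        return new String[] {s.substring(0, i), sep, s.substring(i + sep.length())};
    }

    /** Split at the last occurrence of sep: {before, sep, after}. */
    static String[] rpartition(String s, String sep) {
        if (sep.isEmpty()) {
            throw new IllegalArgumentException("empty separator");
        }
        int i = s.lastIndexOf(sep);
        if (i < 0) {
            return new String[] {"", "", s}; // separator absent: whole string last
        }
        return new String[] {s.substring(0, i), sep, s.substring(i + sep.length())};
    }

    public static void main(String[] args) {
        System.out.println(String.join("|", partition("a=b=c", "=")));  // a|=|b=c
        System.out.println(String.join("|", rpartition("a=b=c", "="))); // a=b|=|c
    }
}
```
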