5.7. String Functions and Operators
String Operators
The || operator performs concatenation.
String Functions
Note
These functions assume that the input strings contain valid UTF-8 encoded Unicode code points. There are no explicit checks for valid UTF-8, and the functions may return incorrect results on invalid UTF-8. Invalid UTF-8 data can be corrected with from_utf8().
Additionally, the functions operate on Unicode code points and not user visible characters (or grapheme clusters). Some languages combine multiple code points into a single user-perceived character, the basic unit of a writing system for a language, but the functions will treat each code point as a separate unit.
The lower() and upper() functions do not perform locale-sensitive, context-sensitive, or one-to-many mappings required for some languages. Specifically, this will return incorrect results for Lithuanian, Turkish and Azeri.
- chr(n) → varchar
Returns the Unicode code point n as a single character string.
- concat(string1, ..., stringN) → varchar
Returns the concatenation of string1, string2, ..., stringN. This function provides the same functionality as the SQL-standard concatenation operator (||).
- length(string) → bigint
Returns the length of string in characters.
- lower(string) → varchar
Converts string to lowercase.
- lpad(string, size, padstring) → varchar
Left pads string to size characters with padstring. If size is less than the length of string, the result is truncated to size characters. size must not be negative and padstring must be non-empty.
- ltrim(string) → varchar
Removes leading whitespace from string.
- replace(string, search) → varchar
Removes all instances of search from string.
- replace(string, search, replace) → varchar
Replaces all instances of search with replace in string.
- reverse(string) → varchar
Returns string with the characters in reverse order.
- rpad(string, size, padstring) → varchar
Right pads string to size characters with padstring. If size is less than the length of string, the result is truncated to size characters. size must not be negative and padstring must be non-empty.
- rtrim(string) → varchar
Removes trailing whitespace from string.
- split(string, delimiter) → array<varchar>
Splits string on delimiter and returns an array.
- split(string, delimiter, limit) → array<varchar>
Splits string on delimiter and returns an array of size at most limit. The last element in the array always contain everything left in the string. limit must be a positive number.
- split_part(string, delimiter, index) → varchar
Splits string on delimiter and returns the field index. Field indexes start with 1. If the index is larger than than the number of fields, then null is returned.
- strpos(string, substring) → bigint
Returns the starting position of the first instance of substring in string. Positions start with 1. If not found, 0 is returned.
- position(substring IN string) → bigint
Returns the starting position of the first instance of substring in string. Positions start with 1. If not found, 0 is returned.
- substr(string, start) → varchar
Returns the rest of string from the starting position start. Positions start with 1. A negative starting position is interpreted as being relative to the end of the string.
- substr(string, start, length) → varchar
Returns a substring from string of length length from the starting position start. Positions start with 1. A negative starting position is interpreted as being relative to the end of the string.
- trim(string) → varchar
Removes leading and trailing whitespace from string.
- upper(string) → varchar
Converts string to uppercase.
Unicode Functions
- normalize(string) → varchar
Transforms string with NFC normalization form.
- normalize(string, form) → varchar
Transforms string with the specified normalization form. form must be be one of the following keywords:
Form Description NFD Canonical Decomposition NFC Canonical Decomposition, followed by Canonical Composition NFKD Compatibility Decomposition NFKC Compatibility Decomposition, followed by Canonical Composition Note
This SQL-standard function has special syntax and requires specifying form as a keyword, not as a string.
- to_utf8(string) → varbinary
Encodes string into a UTF-8 varbinary representation.
- from_utf8(binary) → varchar
Decodes a UTF-8 encoded string from binary. Invalid UTF-8 sequences are replaced with the Unicode replacement character U+FFFD.
- from_utf8(binary, replace) → varchar
Decodes a UTF-8 encoded string from binary. Invalid UTF-8 sequences are replaced with replace. The replacement string replace must either be a single character or empty (in which case invalid characters are removed).