STRING
A byte sequence of zero or more characters.
Arbitrary bytes may be encoded by %XX
(case insensitive) hex escape sequences.
For example:
set resp.http.X-s = "";
set resp.http.X-s = "こんにちは 世界"; # UTF-8 string literal
set resp.http.X-s = "%F0%9F%8C%AE"; # UTF-8 encoding "by hand"
A string may contain the null byte, %00, which has the effect of terminating the string at that point:
set resp.http.X-s = "x%00y"; /* equivalent to "x" */
String literals must be valid UTF-8 sequences.
Literals may be given in several formats, each of which offers different features:
Double-quoted strings with percent escapes of various kinds: "...%xx..."
Long strings: {"..."}. Here, the opening and closing quotes consist of two characters, where the brace is part of the quote. Thus, a single " character may appear in the body of the string and will not close it early. Long strings do not support percent escapes.
Long strings may have heredoc delimiters: {xyz"..."xyz}. These delimiters are arbitrary identifiers, selected such that the closing quote sequence does not appear in the body of the string. This is useful for writing JSON content, where "} may be present in the body of the string.
LF is a convenience for a single newline character.
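As an illustration of the literal formats above, the following fragment is a minimal sketch rather than a definitive usage. The header names X-quoted and X-json and the delimiter json are arbitrary choices made for this example, and the statements are assumed to sit inside a subroutine where resp.http is writable (such as vcl_deliver).

```vcl
# Double-quoted string: percent escapes are interpreted, so %09 is a tab.
set resp.http.X-s = "a%09b";

# Long string: a lone " in the body does not end the literal.
set resp.http.X-quoted = {"She said "hello" and left"};

# Heredoc-delimited long string: the body contains "} which would
# close a plain long string early, so a delimiter (here: json) is used.
set resp.http.X-json = {json"{"greeting": {"text": "hi"}}"json};
```

The delimiter only needs to be chosen so that the closing sequence ("json} in this sketch) never occurs in the body.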
Percent escapes come in several varieties:
%XX (exactly two hex digits) for a byte. For example, %09 specifies the ASCII horizontal tab character. A UTF-8 sequence can be encoded using several such bytes. For example, %f0%9f%90%8b gives the UTF-8 encoding of the Unicode code point U+1F40B.
%uXXXX (exactly four hex digits) for a single Unicode code point.
%u{...} (one to six hex digits, but not to exceed U+10FFFF)
The %u forms produce UTF-8 encoding as if they had been written out as a sequence of single-byte values.
All the percent encoding sequences are case insensitive.
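To make the relationship between the escape forms concrete, the sketch below writes the same characters in more than one way. The header names are hypothetical, and the statements are assumed to sit inside a subroutine where resp.http is writable. The byte values follow from standard UTF-8 encoding (U+00E9 encodes as C3 A9, and U+1F40B as F0 9F 90 8B).

```vcl
# U+00E9: the four-digit %u form, then the same two UTF-8 bytes by hand.
set resp.http.X-e1 = "caf%u00E9";
set resp.http.X-e2 = "caf%C3%A9";

# U+1F40B needs more than four hex digits, so the braced form is used,
# followed by the equivalent four single-byte escapes.
set resp.http.X-w1 = "%u{1F40B}";
set resp.http.X-w2 = "%f0%9f%90%8b";

# Escapes are case insensitive: both literals hold the byte 0x2F ("/").
set resp.http.X-c1 = "%2F";
set resp.http.X-c2 = "%2f";
```

Each pair of headers should end up holding identical bytes, since the %u forms expand to the same UTF-8 sequence as the hand-written single-byte escapes.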
A string may be not set (as opposed to being set to the empty string), in which case it is considered to have no value. Various functions and operators treat unset strings differently; some render them as the empty string, and some as "(null)". This handling is a property of the function (or operator), rather than a property of the STRING type.
A not set value is converted to an empty string when assigned to a STRING variable, and the empty string always compares true in conditions. No such conversion occurs when comparing a string with a regular expression, and matching a not set value with ~ will always evaluate to false.
This behavior can be seen in the following situations:
declare local var.s STRING;
if (var.s ~ ".") { } # not reached
if (var.s !~ ".") { } # reached
if (var.s ~ "null") { } # not reached
if (var.s ~ "^$") { } # not reached