Method MIME.tokenize()

Method tokenize

array(string|int) tokenize(string header, int|void flags)


A structured header field, as specified by RFC 0822, is constructed from a sequence of lexical elements.

Parameter header

The header value to parse.

Parameter flags

An optional set of flags. Currently only one flag is defined:


Keep backslash-escapes in quoted-strings.

The lexical elements parsed are:

individual special characters





This function will analyze a string containing the header value, and produce an array containing the lexical elements.

Individual special characters will be returned as characters (i.e. ints).

Quoted-strings, domain-literals and atoms will be decoded and returned as strings.

Comments are not returned in the array at all.


As domain-literals are returned as strings, there is no way to tell the domain-literal [] from the quoted-string "[]". Hopefully this won't cause any problems. Domain-literals are used seldom, if at all, anyway...

The set of special-characters is the one specified in RFC 1521 (i.e. "<", ">", "@", ",", ";", ":", "\", "/", "?", "="), and not the set specified in RFC 0822.

See also

MIME.quote(), tokenize_labled(), decode_words_tokenized_remapped().