Syntax of the Content-Type Header Field
During an incident this morning I got a bit curious about how the Content-Type and Cache-Control fields in the HTTP header are allowed to be formatted. We noticed a space between the colon and the type/subtype, whereas the grammar in RFC 2045 shows no space between the ‘:’ and the type. However, the RFC does not explicitly say that a space is forbidden, and it uses a space in some of its own examples:
“Note that the value of a quoted string parameter does not include the quotes. That is, the quotation marks in a quoted-string are not a part of the value of the parameter, but are merely used to delimit that parameter value. In addition, comments are allowed in accordance with RFC 822 rules for structured header fields. Thus the following two forms:
Content-type: text/plain; charset=us-ascii (Plain text)
Content-type: text/plain; charset="us-ascii"
are completely equivalent.”
5.1. Syntax of the Content-Type Header Field
In the Augmented BNF notation of RFC 822, a Content-Type header field value is defined as follows:
content := "Content-Type" ":" type "/" subtype
           *(";" parameter)
           ; Matching of media type and subtype
           ; is ALWAYS case-insensitive.
type := discrete-type / composite-type
discrete-type := "text" / "image" / "audio" / "video" / "application" / extension-token
composite-type := "message" / "multipart" / extension-token
extension-token := ietf-token / x-token
ietf-token := <An extension token defined by a standards-track RFC and registered with IANA.>
x-token := <The two characters "X-" or "x-" followed, with no intervening white space, by any token>
subtype := extension-token / iana-token
iana-token := <A publicly-defined extension token. Tokens of this form must be registered with IANA as specified in RFC 2048.>
parameter := attribute "=" value
attribute := token
             ; Matching of attributes
             ; is ALWAYS case-insensitive.
value := token / quoted-string
token := 1*<any (US-ASCII) CHAR except SPACE, CTLs, or tspecials>

Note that the definition of “tspecials” is the same as the RFC 822 definition of “specials” with the addition of the three characters “/”, “?”, and “=”, and the removal of “.”.
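To make the token rule a bit more concrete, here is a minimal sketch of a checker in Python. The function name `is_token` is my own, and the list of tspecials is the one given in RFC 2045 itself rather than anything quoted above, so treat it as an illustration and not a reference implementation.

```python
# tspecials as listed in RFC 2045: ( ) < > @ , ; : \ " / [ ] ? =
TSPECIALS = set('()<>@,;:\\"/[]?=')


def is_token(text: str) -> bool:
    """Return True if text is one or more US-ASCII characters,
    none of which is SPACE, a control character, or a tspecial."""
    if not text:
        return False  # the grammar says 1*, so empty is not a token
    # 0x21..0x7E is printable US-ASCII without SPACE and without CTLs
    return all(0x21 <= ord(ch) <= 0x7E and ch not in TSPECIALS for ch in text)


print(is_token("us-ascii"))    # a plain charset name
print(is_token("text/plain"))  # contains "/", which is a tspecial
print(is_token("a.b"))         # "." was removed from the specials list
```

Anything that fails this check has to be written as a quoted-string if it is used as a parameter value.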
Note also that a subtype specification is MANDATORY — it may not be omitted from a Content-Type header field. As such, there are no default subtypes.
The type, subtype, and parameter names are not case sensitive. For example, TEXT, Text, and TeXt are all equivalent top-level media types. Parameter values are normally case sensitive, but sometimes are interpreted in a case-insensitive fashion, depending on the intended use. (For example, multipart boundaries are case-sensitive, but the “access-type” parameter for message/External-body is not case-sensitive.)
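A small sketch of what that means when comparing values in code. The helper name `same_media_type` and the sample boundary strings are made up for this example.

```python
def same_media_type(a: str, b: str) -> bool:
    """Compare two "type/subtype" strings, ignoring case."""
    return a.strip().lower() == b.strip().lower()


# Type and subtype: case does not matter.
print(same_media_type("TEXT/Plain", "text/plain"))

# A multipart boundary is a parameter value and is case-sensitive,
# so it has to be compared exactly as written.
print("AaB03x" == "aab03x")
```

Parameter values need a per-parameter decision, so a generic comparison should probably leave them untouched unless it knows the parameter in question.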
Note that the value of a quoted string parameter does not include the quotes. That is, the quotation marks in a quoted-string are not a part of the value of the parameter, but are merely used to delimit that parameter value. In addition, comments are allowed in accordance with RFC 822 rules for structured header fields. Thus the following two forms
Content-type: text/plain; charset=us-ascii (Plain text)
Content-type: text/plain; charset="us-ascii"
are completely equivalent.
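The equivalence of those two forms can be sketched with a small hand-written parser. This is only an illustration under a few assumptions of mine: the names `strip_comments` and `parse_content_type` are hypothetical, whitespace around the colon and the semicolons is simply tolerated, and header folding is not handled.

```python
import re

# ";" attribute "=" (quoted-string or bare value), with optional whitespace
PARAM_RE = re.compile(r'\s*;\s*([^\s=;]+)\s*=\s*("(?:[^"\\]|\\.)*"|[^\s;]+)')


def strip_comments(value: str) -> str:
    """Drop (comments), keeping parentheses that sit inside quoted strings."""
    out, depth, in_quotes, escaped = [], 0, False, False
    for ch in value:
        if escaped:                      # character after a backslash
            escaped = False
            if depth == 0:
                out.append(ch)
            continue
        if ch == "\\":
            escaped = True
            if depth == 0:
                out.append(ch)
            continue
        if ch == '"' and depth == 0:
            in_quotes = not in_quotes
        elif ch == "(" and not in_quotes:
            depth += 1                   # comments may nest
            continue
        elif ch == ")" and depth > 0:
            depth -= 1
            continue
        if depth == 0:
            out.append(ch)
    return "".join(out)


def parse_content_type(line: str):
    """Return (type, subtype, params) for one Content-Type header line."""
    name, _, value = line.partition(":")
    if name.strip().lower() != "content-type":
        raise ValueError("not a Content-Type field")
    value = strip_comments(value).strip()
    media_type, sep, rest = value.partition(";")
    type_, _, subtype = media_type.strip().partition("/")
    if not type_ or not subtype:
        raise ValueError("subtype is mandatory")
    params = {}
    for match in PARAM_RE.finditer(sep + rest):
        attr, val = match.group(1).lower(), match.group(2)
        if val.startswith('"'):
            # the quotes delimit the value and are not part of it
            val = re.sub(r"\\(.)", r"\1", val[1:-1])
        params[attr] = val
    return type_.lower(), subtype.lower(), params


a = parse_content_type("Content-type: text/plain; charset=us-ascii (Plain text)")
b = parse_content_type('Content-type: text/plain; charset="us-ascii"')
print(a == b)
```

With this sketch, a field written without a space after the colon, such as `Content-Type:text/html`, parses the same way as one with a space. For HTTP specifically, the message syntax in RFC 7230 (section 3.2) allows optional whitespace after the colon, which fits the spaced examples in the RFC text above.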
Beyond this syntax, the only syntactic constraint on the definition of subtype names is the desire that their uses must not conflict. That is, it would be undesirable to have two different communities using “Content-Type: application/foobar” to mean two different things. The process of defining new media subtypes, then, is not intended to be a mechanism for imposing restrictions, but simply a mechanism for publicizing their definition and usage. There are, therefore, two acceptable mechanisms for defining new media subtypes:
(1) Private values (starting with "X-") may be defined bilaterally between two cooperating agents without outside registration or standardization. Such values cannot be registered or standardized.
(2) New standard values should be registered with IANA as described in RFC 2048.
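As a rough illustration of the first mechanism, a private value can be recognised by its prefix. The helper name `is_private_value` is hypothetical, and the character class is my own spelling-out of the token rule (printable US-ASCII minus SPACE and tspecials).

```python
import re

# "X-" or "x-" followed, with no intervening white space, by any token
X_TOKEN_RE = re.compile(r"[Xx]-[!#$%&'*+\-.0-9A-Z^_`a-z{|}~]+")


def is_private_value(name: str) -> bool:
    """Return True if name has the x-token form."""
    return X_TOKEN_RE.fullmatch(name) is not None


print(is_private_value("x-foobar"))  # private form
print(is_private_value("foobar"))    # would need IANA registration
```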
The second document in this set, RFC 2046, defines the initial set of media types for MIME.
