RFC 9110 & RFC 9112
HTTP/1.1 is defined across two RFCs:
- RFC 9110, which defines the semantics of HTTP, shared by all versions of the protocol
- RFC 9112, which defines the syntax and message format of HTTP/1.1
The RFCs use several syntax elements to define message structure. Here are the most commonly used syntax elements:
| Element | Name | Description |
|---|---|---|
| CRLF | Carriage Return Line Feed | Line terminator for HTTP messages |
| CR | Carriage Return | Carriage return character |
| LF | Line Feed | Line feed character |
| SP | Space | Space character |
| HTAB | Horizontal tabulation | Horizontal tabulation character |
| OWS | Optional Whitespace | Optional whitespace (zero or more SP/HTAB) |
| RWS | Required Whitespace | Required whitespace (one or more SP/HTAB) |
Message
Message format
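An HTTP/1.1 message consists of a start-line followed by a CRLF, zero or more header field lines (each terminated by a CRLF), an empty line marking the end of the header section, and an optional message body: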
HTTP-message = start-line CRLF
*( field-line CRLF )
CRLF
[ message-body ]
A message can be either a request from client to server or a response from server to client. Syntactically, the two types of messages differ only in the start-line, which is either a request-line (for requests) or a status-line (for responses):
start-line = request-line / status-line
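As a minimal sketch of the grammar above, the following Python function splits a raw message into its start-line, field lines, and body. The name `split_message` is illustrative and not part of any specification, and the sketch assumes the whole header section has already been received.

```python
def split_message(raw: bytes):
    """Split a raw HTTP/1.1 message into (start_line, field_lines, body)."""
    # The header section ends at the first empty line, i.e. CRLF CRLF.
    head, separator, body = raw.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("incomplete message: no empty line after the header section")
    lines = head.split(b"\r\n")
    # The first line is the start-line, the others are field lines.
    return lines[0], lines[1:], body
```

For instance, `split_message(b"GET / HTTP/1.1\r\nHost: a\r\n\r\n")` yields the start-line, one field line, and an empty body.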
Field Lines
Each field line (also known as a header) consists of a case-insensitive field name followed by a colon (":"), optional leading whitespace, the field line value, and optional trailing whitespace.
field-line = field-name ":" OWS field-value OWS
No whitespace is allowed between the field name and colon. A server MUST reject, with a response status code of 400 (Bad Request), any received request message that contains whitespace between a header field name and colon.
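The rule above can be sketched as a small parser. This is only one possible approach: the function name `parse_field_line` is hypothetical, and the sketch relies on the fact that RFC 9110 defines a field name as a token, so any whitespace before the colon makes the name invalid.

```python
import string

# Characters allowed in a token (RFC 9110): letters, digits and a few symbols.
TCHAR = set("!#$%&'*+-.^_`|~" + string.digits + string.ascii_letters)


def parse_field_line(line: str):
    """Return (lowercased field name, field value) or raise ValueError."""
    name, colon, value = line.partition(":")
    # A space or tab before the colon ends up in `name` and is rejected here.
    if not colon or not name or any(c not in TCHAR for c in name):
        raise ValueError("400 Bad Request")
    # Field names are case-insensitive; OWS around the value is not part of it.
    return name.lower(), value.strip(" \t")
```

Lowercasing the name is a design choice that makes later lookups (for example of `host`) independent of how the client spelled it.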
Request
Request Line
A request-line begins with a method token, followed by a single space (SP), the request-target, and another single space (SP), and ends with the protocol version.
request-line = method SP request-target SP HTTP-version
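A minimal sketch of this rule, assuming the line has already been stripped of its CRLF. The name `parse_request_line` is illustrative; the sketch is strict about single spaces, which is one reasonable reading of the grammar.

```python
def parse_request_line(line: str):
    """Split a request-line into (method, request_target, http_version)."""
    parts = line.split(" ")
    # Exactly two single spaces are expected: an extra or doubled space
    # produces more than three parts or an empty part.
    if len(parts) != 3 or "" in parts:
        raise ValueError("400 Bad Request")
    method, target, version = parts
    return method, target, version
```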
Method
The method token indicates the request method to be performed on the target resource. The request method is case-sensitive.
method = token
RFC 9110 defines a number of standardized methods that are commonly used in HTTP. Here are the only methods you need to implement for HTTPd:
| Method Name | Description |
|---|---|
| GET | Transfer a current representation of the target resource. |
| HEAD | Same as GET, but do not transfer the response content. |
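Because the method is case-sensitive, a plain set lookup is enough to recognize the two methods above. The names `SUPPORTED_METHODS` and `is_supported_method` are hypothetical, and which status code to send for other methods is left to the caller.

```python
# The two methods listed in the table above.
SUPPORTED_METHODS = {"GET", "HEAD"}


def is_supported_method(method: str) -> bool:
    """Case-sensitive check: "get" is not the same method as "GET"."""
    return method in SUPPORTED_METHODS
```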
Request Target
The request-target identifies the target resource upon which to apply the request. The client derives a request-target from its desired target URI. There are four distinct formats for the request-target, depending on both the method being requested and whether the request is to a proxy.
request-target = origin-form
/ absolute-form
/ authority-form
/ asterisk-form
Only origin-form and absolute-form are necessary for you to handle in the HTTPd project, and you should focus mainly on origin-form.
No whitespace is allowed in the request-target.
origin-form
The most common form of request-target is the "origin-form".
origin-form = absolute-path [ "?" query ]
When making a request directly to an origin server, a client MUST send only the absolute path and query components of the target URI as the request-target. If the target URI's path component is empty, the client MUST send "/" as the path within the origin-form of request-target. A Host header field is also sent.
For example, a client wishing to retrieve a representation of the resource identified as http://www.example.org/where?q=now would send the lines:
GET /where?q=now HTTP/1.1
Host: www.example.org
absolute-form
When making a request to a proxy (other than a CONNECT or server-wide OPTIONS request), a client MUST send the target URI in "absolute-form" as the request-target. An example absolute-form of request-line would be:
GET http://www.example.org/pub/WWW/TheProject.html HTTP/1.1
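The two forms you need to handle, origin-form and absolute-form, can be told apart by their first character. The sketch below is one possible way to do it with `urlsplit` from the Python standard library; the function name `parse_request_target` and its return shape are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def parse_request_target(target: str):
    """Return (authority, path, query); authority is None for origin-form."""
    if target.startswith("/"):
        # origin-form: absolute-path [ "?" query ]
        path, _, query = target.partition("?")
        return None, path, query
    parts = urlsplit(target)
    if parts.scheme and parts.netloc:
        # absolute-form: an empty path is treated as "/"
        return parts.netloc, parts.path or "/", parts.query
    # authority-form and asterisk-form are not handled in this sketch
    raise ValueError("400 Bad Request")
```

With the examples above, `/where?q=now` gives the path `/where` and the query `q=now`, while the absolute-form target gives the authority `www.example.org` and the path `/pub/WWW/TheProject.html`.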
HTTP Version
HTTP uses a "<major>.<minor>" numbering scheme to indicate versions of the protocol. The version of an HTTP/1.x message is indicated by an HTTP-version field in the start-line. HTTP-version is case-sensitive.
HTTP-version = HTTP-name "/" DIGIT "." DIGIT
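A regular expression maps directly onto this grammar. The sketch below, with the hypothetical name `parse_http_version`, only validates the syntax and extracts the two digits; deciding which versions are supported (and answering 505 otherwise) is left to the caller.

```python
import re

# "HTTP" must be uppercase, and each version number is a single ASCII digit.
HTTP_VERSION_RE = re.compile(r"HTTP/([0-9])\.([0-9])")


def parse_http_version(version: str):
    """Return (major, minor) or raise ValueError on a malformed version."""
    match = HTTP_VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError("400 Bad Request")
    return int(match.group(1)), int(match.group(2))
```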
Request Headers
HTTP header fields, also known as HTTP headers, are components of the header section of request and response messages.
| Header Name | Description |
|---|---|
| Host | The domain name of the server and optionally the port number |
Response
Status Line
The first line of a response message is the status-line, consisting of the protocol version, a space (SP), the status code, another space (SP), and an optional textual phrase describing the status code.
status-line = HTTP-version SP status-code SP [ reason-phrase ]
The reason-phrase element exists for the sole purpose of providing a textual description associated with the numeric status code.
Status Code
The status code of a response is a three-digit integer code that describes the result of the request and the semantics of the response. The first digit of the status code defines the class of response. There are five values for the first digit:
- 1xx (Informational): The request was received, continuing process
- 2xx (Successful): The request was successfully received, understood, and accepted
- 3xx (Redirection): Further action needs to be taken in order to complete the request
- 4xx (Client Error): The request contains bad syntax or cannot be fulfilled
- 5xx (Server Error): The server failed to fulfill an apparently valid request
Here are the only status codes you need to implement for HTTPd:
| Status Code | Reason Phrase | Description |
|---|---|---|
| 200 | OK | The request has succeeded. |
| 400 | Bad Request | The server could not understand the request due to something that is perceived to be a client error. |
| 403 | Forbidden | The server understood the request but refuses to authorize it. |
| 404 | Not Found | The server cannot find the requested resource. |
| 405 | Method Not Allowed | The request method is known by the server but is not supported by the target resource. |
| 505 | HTTP Version Not Supported | The server does not support the HTTP protocol version used in the request. |
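The table above fits naturally in a dictionary. The following sketch builds a status-line from a status code and derives its class from the first digit; `REASON_PHRASES`, `status_line`, and `status_class` are illustrative names, and the sketch assumes the server always answers with HTTP/1.1.

```python
# Status codes from the table above, with their reason phrases.
REASON_PHRASES = {
    200: "OK",
    400: "Bad Request",
    403: "Forbidden",
    404: "Not Found",
    405: "Method Not Allowed",
    505: "HTTP Version Not Supported",
}


def status_line(code: int) -> str:
    """Build the status-line (without the trailing CRLF) for a known code."""
    return f"HTTP/1.1 {code} {REASON_PHRASES[code]}"


def status_class(code: int) -> int:
    """Return the first digit of the status code: 2 for 2xx, 4 for 4xx, etc."""
    return code // 100
```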
Response Headers
HTTP header fields, also known as HTTP headers, are components of the header section of request and response messages.
| Header Name | Description |
|---|---|
| Server | Contains information about the software used by the origin server to handle the request. |
| Date | The date and time that the message was sent. |
| Content-Length | The size of the response body in bytes. |
| Connection | Control options for the current connection (e.g., keep-alive, close). |
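As a sketch of how these headers fit together, the function below assembles a complete response from a status-line and a body. `build_response` is a hypothetical helper; the Server header is left out for brevity, and the connection is assumed to be closed after each response. The date is produced by `formatdate` from the Python standard library, which emits the HTTP date format when `usegmt=True`.

```python
from email.utils import formatdate
from typing import Optional


def build_response(status_line: str, body: bytes, now: Optional[float] = None) -> bytes:
    """Serialize a response; `now` is a Unix timestamp (current time if None)."""
    headers = [
        ("Date", formatdate(now, usegmt=True)),
        ("Content-Length", str(len(body))),  # size of the body in bytes
        ("Connection", "close"),
    ]
    head = status_line + "\r\n"
    head += "".join(f"{name}: {value}\r\n" for name, value in headers)
    head += "\r\n"  # empty line ending the header section
    return head.encode("ascii") + body
```

Content-Length is computed from the encoded body, not from a character count, which matters as soon as the body contains non-ASCII text.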
Message Body
The message body (if any) of an HTTP/1.1 message is used to carry content for the request or response. The presence of a message body in a request is signaled by a Content-Length or Transfer-Encoding header field. The presence of a message body in a response depends on both the request method to which it is responding and the response status code.
Content-Length
The Content-Length header field indicates the size of the message body, as a decimal number of bytes, sent to the recipient.
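When reading a Content-Length value, it is worth validating it before converting it. The sketch below, with the illustrative name `parse_content_length`, accepts only ASCII digits; answering 400 on a malformed value is an assumption of this sketch.

```python
def parse_content_length(value: str) -> int:
    """Return the body size in bytes, or raise ValueError if malformed."""
    # int() alone would also accept "+13", " 13" or "1_3", which are not
    # valid here, so the digits are checked explicitly first.
    if not value or any(c not in "0123456789" for c in value):
        raise ValueError("400 Bad Request")
    return int(value)
```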
Example of a request and a response
Here is an example of a simple HTTP request and the corresponding response. The server is running on localhost at port 8080.
42sh$ tree
.
├── request.txt
└── root_dir
    └── index.html
2 directories, 2 files
42sh$ cat request.txt
GET / HTTP/1.1
HOST: localhost:8080
conTENT-LENGTH: 0
42sh$ cat root_dir/index.html
Hello World!
42sh$ cat request.txt | nc localhost 8080
HTTP/1.1 200 OK
Date: Fri, 02 May 2003 14:15:13 GMT
Content-Length: 13
Connection: close

Hello World!