The HTTP Protocol

Before getting into web application vulnerabilities, it is important to understand HTTP (Hypertext Transfer Protocol). Some basic facts about HTTP:

HTTP/1.1 was defined in RFCs 7230-7235, which have since been superseded by RFCs 9110-9112
In most cases, HTTP is a stateless protocol: each request is handled on its own, and the communication logic does not rely on a persistent connection
An HTTP transaction consists of a single request from client to server, followed by a single response from server to client
By contrast, a stateful protocol requires the server to maintain its connection to the client throughout the transmission of successive commands until the interaction is terminated
A sequence of transmitted and executed commands is called a session

HTTP proxies operate between the client and server. They can make requests to web servers on behalf of clients, enable HTTP transfers across firewalls, and perform other roles such as NAT and HTTP filtering.

HTTP is an application-level protocol in the TCP/IP suite. It uses TCP as its transport layer protocol.

An HTTP request typically has the following structure:

The Method:
- GET: retrieve information from server
- HEAD: same as GET, but the server returns only the HTTP headers, with no body
- POST: send data to server
- TRACE: message loopback test along path to target
- PUT: upload a representation of the resource at the specified URI
- DELETE: delete resource
- OPTIONS: return methods supported by server
- CONNECT: convert request to transparent TCP/IP tunnel
The URI and path-to-resource field: path portion of the requested URL
Request version-number: specifies version of HTTP used by client
User-Agent: identifies the client software used to access the server, e.g. Chrome, Firefox
Other header fields such as Accept and Accept-Language may also appear
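
To make the request structure above concrete, here is a minimal sketch in Python that assembles a request as text and then splits its first line back into method, path and version. The helper names `build_request` and `parse_request_line`, the `ExampleClient/1.0` user agent and the target path are assumptions made for illustration only; nothing is sent over the network.

```python
def build_request(method, path, host, user_agent="ExampleClient/1.0"):
    """Assemble a minimal HTTP/1.1 request as text (hypothetical helper)."""
    lines = [
        f"{method} {path} HTTP/1.1",   # method, path-to-resource, version
        f"Host: {host}",               # HTTP/1.1 requests carry a Host header
        f"User-Agent: {user_agent}",   # identifies the client software
        "Accept: */*",                 # one of the optional header fields
        "",                            # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)


def parse_request_line(raw):
    """Split the first line of a request into (method, target, version)."""
    request_line = raw.split("\r\n", 1)[0]
    method, target, version = request_line.split(" ", 2)
    return method, target, version


raw = build_request("GET", "/dir/test?name=contrxl", "example.com")
print(raw)
print(parse_request_line(raw))
```

Header lines are separated by CRLF, and an empty line marks the end of the headers; a request with a body (such as a POST) would place it after that empty line.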

Once the request is sent, the server responds with a three-digit status code alongside a human-readable explanation of that code:

100 range code: informational message
200 range code: successful transaction message
300 range code: HTTP redirection message
400 range code: client-side error message
500 range code: server-side error message
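
The ranges above can be sketched as a simple lookup on the first digit of the status code. This is an illustrative example only: `STATUS_CLASSES` and `describe_status` are hypothetical names, and the human-readable text is taken from Python's standard `http.HTTPStatus` enum where the code is known to it.

```python
from http import HTTPStatus

# First digit of the status code -> class of response
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code):
    """Return (class, reason phrase) for a three-digit status code."""
    status_class = STATUS_CLASSES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Code is not one of the standard codes known to the enum
        phrase = "(no standard reason phrase)"
    return status_class, phrase


for code in (200, 301, 404, 500):
    print(code, describe_status(code))
```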

HTTP URL Structure

It is important to understand the component parts of a URL. Consider the following example:

https://example.com:1234/dir/test;id=1?name=contrxl&admin=true

The Scheme: this defines the protocol to be used. In an HTTP URL it is followed by a colon and two forward slashes. The scheme here is https.
The Host: the IP address or DNS name of the web server to access. This follows the scheme. The host here is example.com.
The Port: this is optional and denotes the port number on which the target server listens. The port here is 1234.
The Path: the path from the root directory of the server to the resource you wish to access. Servers can use aliasing to identify documents, gateways and services. The path here is /dir/test.
The Path-Segment-Params: optional name/value pairs, typically preceded by a semicolon and immediately following the path segment they apply to. Here, the path segment parameter is id=1.
The Query-String: an optional portion of the URL containing name/value pairs which represent dynamic parameters associated with your request. The query string is typically preceded by a question mark. Here, the query string is name=contrxl&admin=true.
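
As a quick check of the breakdown above, the example URL can be split with Python's standard `urllib.parse` module. This is a sketch for illustration; the `components` dictionary is just a convenient way of showing the result. Note that `urlparse` only separates path parameters that are attached to the last path segment, which is the case here.

```python
from urllib.parse import urlparse, parse_qs

url = "https://example.com:1234/dir/test;id=1?name=contrxl&admin=true"
parts = urlparse(url)

components = {
    "scheme": parts.scheme,          # protocol to use
    "host": parts.hostname,          # DNS name or IP address
    "port": parts.port,              # optional port number
    "path": parts.path,              # path to the resource
    "params": parts.params,          # path segment parameters (after ';')
    "query": parse_qs(parts.query),  # query string as name -> list of values
}

for name, value in components.items():
    print(f"{name}: {value}")
```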

Web Sessions

A web session is a sequence of HTTP request and response transactions associated with the same user. A session typically passes through several phases: pre-authentication, authentication, session management, access control, and session finalisation. Sessions can also be used to track anonymous users; for example, an application can remember a user's language preference each time they visit the site.

Authenticated sessions allow the app to identify the user on subsequent requests and apply the relevant access controls. Once an authenticated session is established, the session ID/token becomes temporarily equivalent to the strongest authentication method used by the application. If the default session ID name is not changed, it can be used to fingerprint common development frameworks:

PHPSESSID (PHP)
JSESSIONID (J2EE)
CFID / CFTOKEN (ColdFusion)
ASP.NET_SessionId (ASP.NET)
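
The fingerprinting idea can be illustrated with a small lookup over the cookie names listed above. This is only a sketch: `guess_framework` is a hypothetical helper, the input is assumed to be the value of a `Cookie` request header, and a match is a hint rather than proof, since the default names can be changed.

```python
# Default session cookie names and the framework they usually indicate
DEFAULT_SESSION_COOKIES = {
    "PHPSESSID": "PHP",
    "JSESSIONID": "J2EE",
    "CFID": "ColdFusion",
    "CFTOKEN": "ColdFusion",
    "ASP.NET_SessionId": "ASP.NET",
}


def guess_framework(cookie_header):
    """Return frameworks hinted at by the cookie names in a Cookie header value."""
    hints = []
    for pair in cookie_header.split(";"):
        name = pair.split("=", 1)[0].strip()
        framework = DEFAULT_SESSION_COOKIES.get(name)
        if framework and framework not in hints:
            hints.append(framework)
    return hints


print(guess_framework("theme=dark; PHPSESSID=abc123"))
print(guess_framework("CFID=1; CFTOKEN=2"))
```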

The session ID should be at least 128 bits long to resist brute-force attacks, and it should be unique and unpredictable. It should also be kept out of the URL at all times, since URLs are easily exposed (for example in logs, browser history and Referer headers) and can be manipulated.
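
One way to meet the length and unpredictability requirements is to draw the session ID from a cryptographically secure random source. The sketch below uses Python's standard `secrets` module; `SESSION_ID_BYTES` and `new_session_id` are hypothetical names chosen for this example, and a real framework's built-in session handling would normally take care of this.

```python
import secrets

SESSION_ID_BYTES = 16  # 16 bytes = 128 bits


def new_session_id():
    """Return a random 128-bit session ID as 32 hexadecimal characters."""
    return secrets.token_hex(SESSION_ID_BYTES)


session_id = new_session_id()
print(session_id, len(session_id) * 4, "bits")
```

Uniqueness still has to be enforced by the application, for example by checking a new ID against the session store, although collisions between random 128-bit values are extremely unlikely.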

All web sessions should be carried over encrypted channels (HTTPS) for their entire duration. This ensures the session ID is only ever exchanged over an encrypted connection.

Session management based on cookies uses two types of cookie: non-persistent (session) cookies and persistent cookies. Any cookie with the Max-Age or Expires attribute is a persistent cookie and is stored on disk by the browser. Most modern apps use non-persistent cookies, which are erased once the session expires or the browser instance is closed.
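
The difference between the two cookie types comes down to whether the Set-Cookie header carries Max-Age or Expires. The following sketch builds both variants with Python's standard `http.cookies` module. The cookie name `sid`, the value `abc123` and the helper `session_cookie_header` are assumptions for illustration, and the Secure and HttpOnly flags are added as commonly recommended hardening rather than something the cookie type requires.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id, max_age=None):
    """Build a Set-Cookie header value for a hypothetical 'sid' cookie."""
    cookie = SimpleCookie()
    cookie["sid"] = session_id
    cookie["sid"]["path"] = "/"
    cookie["sid"]["secure"] = True     # only send over encrypted channels
    cookie["sid"]["httponly"] = True   # hide the cookie from page scripts
    if max_age is not None:
        # Max-Age (or Expires) is what makes the cookie persistent
        cookie["sid"]["max-age"] = max_age
    return cookie["sid"].OutputString()


non_persistent = session_cookie_header("abc123")
persistent = session_cookie_header("abc123", max_age=3600)

print("Set-Cookie:", non_persistent)
print("Set-Cookie:", persistent)
```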
