Most web applications need to handle user sessions at some point. A common use-case is to remember an authenticated user across requests. Since HTTP is a stateless protocol, the only way for servers to know that the current request is related to a previous request by the same user is to associate them with some common identifier provided by the browser containing information about the user.
A popular and simple way that browser clients provide this identifier is to do it completely through cookies – this is also known as client side session management.
Here’s a typical scenario involving cookie sessions:
- User submits login credentials for web app
- Server authenticates the user credentials and asks the browser to set a cookie containing information (such as user ID) that allows the server to re-authenticate the user on subsequent requests
- After the cookie is set, the browser forwards cookies to the app as long as it’s active and not expired.
- The app reads the cookie and the user ID that the cookie contains and then performs a user lookup using the ID. If the user exists, they’re authenticated and the application can respond accordingly.
In most production systems, the cookie contents are typically masked and cryptographically signed so that they can’t tampered with. This is important because if cookie contents can be easily modified, any authenticated user can impersonate any other user just by changing the contents of the cookie like
Drawbacks of client side management
While a fully client side approach that’s easy to set up – and very secure if done right – there’s some drawbacks as well.
- If your server keys are exposed, the contents of any cookie generated by the key can be tampered with. This can be bad if the cookies contain personally identifiable information (PII) or worse, user credentials.
- You are reliant on the browser to expire user sessions since the entirety of session state lives in the browser. To get rid of a cookie, you need to make sure to set cookie expiry headers and wait for the browser to expire them. If cookies are being cryptographically signed, you can technically rotate your key to invalidate all sessions… but that’s a nuclear option.
The one-sided nature of an applications control over session life and potential for leaking sensitive information into the client leads many applications to adopt a server side session management – a different, more operationally expensive approach that comes with a different set of trade-offs.
Server Side Management
Server side session management (SSM) is a bit of a misnomer because session related data is not completely server side. Remember, HTTP is a stateless protocol – the only way for a remote server to tell that requests are related is from information passed along by the client.
Instead of storing the actual contents of a session like user id (encrypted or not, doesn’t matter) directly in the cookie, it stores a reference to the data in the form of a session ID – also known as a session token. When the client makes a request to the web application, it forwards this cookie containing the session token and the application uses it to look up the actual contents.
Okay, so where is the actual content stored?
The two primary locations for session contents are either on same application servers receiving requests or in a database running on another server.
Storing session data on application servers
The easiest option is on the application server itself, either on disk or in memory. On a single server setup, sessions will go down with the server if in memory but can persist across restarts if persisted to disk.
Most high-traffic production apps are clustered so there are groups of app servers acting as single system. In a clustered environment behind an load balancer, you can’t store session data on specific instances without using sticky sessions (route requests from same client to the same server). Otherwise, requests are likely going to end up on different servers each time and create a bad user experience; in one request, a user may be identified as being logged in but then in another subsequent request they’re suddenly logged out.
I would avoid sticky sessions because it requires you to have long-lived servers. If you’re doing blue-green deployments and are frequently tearing down old servers and deploying new ones, you’re constantly going to be purging session data. This fundamentally limits your ability to horizontally scale your web instances.
Instead of storing session data on app servers directly, I recommend storing session data in a separate datastore on an external, shared server.
Storing session data on databases in external servers
With an external storage setup, there’s two primary options as far as databases go – either in-memory or on-disk. We’re also going to incur an additional network call for every request.
If we’re using an in-memory, distributed cache store like Redis or Memcached:
- Storage limit is limited by RAM, but with option of horizontally scaling out by adding more instances to a cluster. Ideal for short-lived, volatile (can be evicted at anytime without serious impact on application) sessions.
- Fast reads and writes – a fetch from RAM is orders of magnitudes faster than fetch from disk.
- The instance needs to be properly secured and fire-walled, otherwise you risk compromising potentially sensitive user data.
- It’s not as easy to setup as a cookie only session because you need to create and manage additional infrastructure.
- Automatic expiry management and cleanup is available in stores like Redis.
If we’re using an on-disk store such as PostgreSQL or MySQL:
- The storage limit per instance is much, much larger. This is ideal for large, persistent session data. At the time of writing you can get a 64 terabyte server from AWS.
- Reads and writes will be slower compared to in-memory stores.
- Expiry will be more manual. Unlike in memory databases, you cant just restart the instance if you want to wipe all sessions at once. You’ll likely need to run background jobs to expire session data.
While there’s no one size fits all solution depending on your use case, here’s some general guidelines:
- Avoid storing PII or any sensitive information as session content – this applies regardless of whether you use client or server side management. When encryption keys are compromised, you should assume exposure of user data so it’s best to be sure you’re not storing anything sensitive.
- Keep session state to a minimum – if it seems like your per-session data storage requirements are high, maybe it shouldn’t be treated as session data? It may make sense to persist alongside your primary application state.
- Lean on trusted, open-source libraries to ensure security best practices (rails by default encrypts session tokens) instead of rolling your own home-grown session management solution. Most popular web frameworks like rails have many battle-tested solutions around session management such as devise.
- Avoid sticky sessions. They’re unreliable and the drawbacks are usually not what you’re willing to sign up for. For clustered services that use server side sessions, just go with a distributed cache.
- Know how to securely manage your instance and understand its failure modes and replication behavior, especially if you’re dealing with high traffic applications.
- Avoid re-using your primary application database as a session database because they can experience very different traffic patterns. For example, with certain session configurations, Rails apps will create a new session for every new user that visits the application – this can put undue write load on your application server.