Permanent production access is a slow-burning risk. JIT is the answer.

Most teams know that permanent production privileges are a problem. They leave them in place anyway because the alternative, a manual access process, is too slow, too clunky, and breaks down precisely when something urgent is happening. Just-in-Time (JIT) access does not solve this by adding more bureaucracy. It solves it by treating temporary access as a platform capability.

Production access has an uncomfortable status in many organisations. Everyone knows permanent privileges are not good practice. They stick around because nobody actively revoked them, because access was granted "just temporarily" and that temporary period expired months ago, or simply because revoking access costs more effort than leaving it in place. The result is an audit trail full of gaps, an access matrix nobody fully understands, and a production environment that is technically reachable by more people than necessary.

Just-in-Time access does not solve this by stacking a new forms process on top of existing workflows. It flips the assumption: access is the exception, not the default. Whoever needs access requests it explicitly. That request is technically validated, approved by the right person, automatically provisioned, and revoked at a predetermined time, without anyone needing to remember.

Why permanent access is quieter and more dangerous than it looks

The problem with permanent production privileges is not that they get abused. The problem is that you cannot tell whether they are being abused. An engineer who received access six months ago for an incident has probably not used those privileges since. But they are still there. And if they exist, they can be used by that engineer, by someone with access to their account, or through a key that ended up in a config file somewhere along the way.

The risks accumulate in silence. Not as one big incident, but as a slowly growing attack surface that only becomes visible when it is already too late. A well-designed JIT approach makes that pattern structurally impossible: without an active request, there is no access. Full stop.

The request layer: structure without bureaucracy

A JIT flow starts with a request that is structured enough to process automatically, but low-friction enough to reflect how people actually work. That sounds straightforward, but this is where things most often break down in practice. Teams build a form, the form gets too elaborate, engineers skip it or work around the system, and you are back to square one.

The request needs to fit how the team already works. That could be an internal portal, a ticketing integration, a CLI or a Slack workflow. The interface matters less than what it captures: who is requesting, for which scope, for how long, and for what purpose. Those four elements are not optional; they are the input for everything that follows.

  • Scope and duration are required fields, not optional metadata; they directly drive provisioning and revocation
  • The reason for access must be useful for the approver, not just for the audit trail after the fact
  • Free text is fine for context, but the technical core must be structured and machine-readable
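To make the four required elements concrete, here is a minimal sketch of a structured request in Python. The class name `AccessRequest`, the field names, the scope format and the `MAX_DURATION` limit are all assumptions for illustration; the point is only that scope and duration are validated fields rather than free text.

```python
from dataclasses import dataclass
from datetime import timedelta

# Hypothetical policy limit; pick a value that fits your environment.
MAX_DURATION = timedelta(hours=8)


@dataclass(frozen=True)
class AccessRequest:
    requester: str        # who is requesting
    scope: str            # which resource or role, e.g. "prod-db:read" (made-up format)
    duration: timedelta   # how long access should last
    reason: str           # purpose, written for the approver
    context: str = ""     # optional free text, not used by automation

    def __post_init__(self) -> None:
        # Reject requests that are missing any of the required text fields.
        for name in ("requester", "scope", "reason"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} is required")
        # Duration drives revocation, so it must be positive and bounded.
        if self.duration <= timedelta(0):
            raise ValueError("duration must be positive")
        if self.duration > MAX_DURATION:
            raise ValueError("duration exceeds the allowed maximum")
```

Whether the request arrives through a portal, a CLI or a chat workflow, it can be normalised into a record like this before anything else happens.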

What the flow looks like in practice

The diagram below shows what a mature JIT model looks like when abstracted from specific tooling. The exact choice of form, CI system or approval platform matters less than the coherence between the layers. Request, approval, provisioning, scoped access, audit logging and timed revocation need to work as one system. Once a single link is loose, you lose the guarantee the model is supposed to provide.

Figure: From request to automatic cleanup, a generic JIT flow that integrates with existing processes without losing the technical layer.

Temporary production access should not be an ad-hoc emergency process. It should be a reliable, standard function of your platform.

Approvals that actually mean something

An approval step that amounts to a thumbs-up in Slack is not a control. It is theatre. A real approval flow is technically enforceable: provisioning only starts once the approval has been registered by the system, by someone demonstrably authorised for the requested scope.

Two rules are non-negotiable here. The requester cannot approve their own request, even when that would be faster. And the decision itself must be part of the same audit trail as the rest of the flow, not floating somewhere in a chat history or email archive that will be gone in two years. An approval you cannot reconstruct has no value when an audit arrives.
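Both rules can be enforced in code rather than by convention. The sketch below assumes a hypothetical `APPROVERS_BY_SCOPE` mapping and a simple list standing in for the audit trail; a real system would back both with its identity provider and a durable log.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical mapping of scopes to the people allowed to approve them.
APPROVERS_BY_SCOPE = {
    "prod-db:read": {"bob", "carol"},
}


@dataclass(frozen=True)
class Approval:
    request_id: str
    requester: str
    approver: str
    scope: str
    decided_at: datetime


def approve(request_id, requester, approver, scope, audit_log):
    """Record an approval, or raise PermissionError if it is not allowed."""
    # Rule 1: no self-approval, even when that would be faster.
    if approver == requester:
        raise PermissionError("requester cannot approve their own request")
    # The approver must be authorised for exactly the requested scope.
    if approver not in APPROVERS_BY_SCOPE.get(scope, set()):
        raise PermissionError("approver is not authorised for this scope")
    # Rule 2: the decision lands in the same audit trail as the rest of the flow.
    approval = Approval(request_id, requester, approver, scope,
                        datetime.now(timezone.utc))
    audit_log.append(approval)
    return approval
```

Provisioning would then be triggered only by the existence of such a record, never by a message in a chat channel.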

Provisioning through existing automation paths

After an approved request, there should be no manual step. Provisioning belongs in the same automation paths already used for infrastructure and platform changes, whether that is a CI/CD system, an internal developer platform, a workflow runner, or a combination of orchestration and infrastructure-as-code.

In practice that usually means temporarily activating several layers at once: a controlled entry point into production, temporary IAM permissions, database access through a named account, cluster access, or network reachability into a protected environment. The exact combination depends on your stack, but the principle stays the same everywhere: grant only what is needed for the task and bind everything to the same expiry.

  • Use existing deployment and automation mechanisms, so that provisioning is reproducible, demonstrable, and never manual
  • Scope access to the task: no broad permissions when a narrower alternative is sufficient
  • Tie identity, permissions and any temporary access paths to the same access window
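One way to guarantee a shared access window is to compute the expiry once and stamp it on every layer. The following is a sketch under that assumption; the `Grant` type, the `plan_grants` helper and the layer names are hypothetical and stand in for whatever your automation actually provisions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    layer: str            # e.g. "iam", "database", "network" (illustrative names)
    principal: str
    scope: str
    expires_at: datetime


def plan_grants(principal, scope, duration, layers, now=None):
    """Build one grant per layer, all bound to the same expiry."""
    now = now or datetime.now(timezone.utc)
    expires_at = now + duration  # computed once, shared by every layer
    return [Grant(layer, principal, scope, expires_at) for layer in layers]
```

Because no layer calculates its own deadline, there is no way for database access to outlive the IAM permission or the network path that goes with it.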

Credentials and access paths should not travel informally

A common failure mode is that teams provision temporary permissions correctly, but then distribute the associated credentials or connection details informally. At that point you may have JIT on paper, but not a genuinely controlled operating model. Credentials, tokens or connection strings should be issued per user, per session or per request, and only readable by the identity they are bound to.

The same applies to access paths. If a bastion, jump host, session manager or temporary network segment is part of the design, that path also needs to sit inside the controlled flow. Not "here is the access", but "here is the temporarily activated route through which that access can be used safely and traceably".
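As an illustration of a credential that is issued per request and bound to one identity, consider the sketch below. The in-memory `_issued` dictionary is an assumption standing in for a real secrets backend, and the function names are made up; only the token hash is stored, and the token is valid for exactly one identity until its expiry.

```python
import hashlib
import secrets
from datetime import datetime, timedelta, timezone

# In-memory stand-in for a secrets backend (an assumption for this sketch).
# Keyed by token hash so the plaintext token is never stored.
_issued = {}


def issue_credential(identity, expires_at):
    """Create a random token bound to one identity and one expiry."""
    token = secrets.token_urlsafe(32)
    digest = hashlib.sha256(token.encode()).hexdigest()
    _issued[digest] = (identity, expires_at)
    return token


def is_valid(token, identity, now=None):
    """Accept the token only for the bound identity and before expiry."""
    now = now or datetime.now(timezone.utc)
    record = _issued.get(hashlib.sha256(token.encode()).hexdigest())
    if record is None:
        return False
    bound_identity, expires_at = record
    return bound_identity == identity and now < expires_at
```

A token forwarded to a colleague fails the identity check, which is the property that informal sharing quietly removes.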

The strongest point in a JIT model is not granting access. It is that revocation is guaranteed to happen, regardless of whether anyone remembers.

Automatic revocation: the property that makes the model complete

Revocation is where most JIT implementations fall short. Not because it is technically hard, but because it gets treated as an afterthought. Access is granted, and cleanup "will be handled later". Later does not come, or comes too late, and you gradually rebuild the same backlog of privileges that should have been gone months ago.

In a well-designed model, revocation is not a separate step. It is a built-in property of the system. At the moment provisioning happens, cleanup is already scheduled. A scheduler or expiry mechanism performs cleanup at a fixed time: temporary compute removed, permissions revoked, session paths closed, audit state updated. Nobody needs to remember, because the system handles it.
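The idea that cleanup is scheduled at the moment of provisioning can be sketched with a small in-memory store. `GrantStore` and its methods are hypothetical; in practice the expiry queue would be a durable scheduler, and `revoke_due` would call the real teardown for each layer.

```python
import heapq
from datetime import datetime, timedelta, timezone


class GrantStore:
    def __init__(self):
        self.active = {}      # grant_id -> expires_at
        self._expiries = []   # min-heap of (expires_at, grant_id)

    def provision(self, grant_id, expires_at):
        # Granting access and scheduling its cleanup happen in the same call,
        # so there is no grant without a pending revocation.
        self.active[grant_id] = expires_at
        heapq.heappush(self._expiries, (expires_at, grant_id))

    def revoke_due(self, now):
        """Revoke every grant whose expiry has passed; return their ids."""
        revoked = []
        while self._expiries and self._expiries[0][0] <= now:
            _, grant_id = heapq.heappop(self._expiries)
            if self.active.pop(grant_id, None) is not None:
                revoked.append(grant_id)
        return revoked
```

A periodic job calling `revoke_due` with the current time is enough to drive this; nobody has to remember which grants exist.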

An audit trail that stays useful later

"Access granted" is not an audit trail. A useful log tells the complete story: who requested, for what scope, for how long, with whose approval, which resources were activated, when revocation happened, and whether that cleanup completed successfully. Only then can you demonstrate in an audit not just that control existed, but how it worked technically.

That information is valuable in audits, but equally so during incident response. If something goes wrong in production, you want to see within a minute which temporary access paths were active at that moment, who requested them, and whether they were correctly revoked. A good log is not a compliance exercise. It is operational readiness.
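A record that captures the whole story also makes the incident-response question easy to answer. The sketch below assumes a hypothetical `AuditRecord` shape and treats a grant as closed only when revocation was recorded and cleanup succeeded, which is a deliberately conservative choice.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class AuditRecord:
    requester: str
    scope: str
    approver: str
    resources: list                        # which resources were activated
    granted_at: datetime
    expires_at: datetime
    revoked_at: Optional[datetime] = None
    cleanup_ok: Optional[bool] = None      # None until cleanup has run


def active_at(records, moment):
    """Return records whose access may have been live at the given moment."""
    live = []
    for r in records:
        # Only a recorded, successful revocation counts as closed.
        closed = r.revoked_at is not None and r.cleanup_ok is True
        if r.granted_at <= moment and (not closed or moment < r.revoked_at):
            live.append(r)
    return live
```

Note that a grant whose cleanup failed keeps showing up as potentially active, which is exactly what you want to see during an incident.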

Why teams find this faster than the classic approach

The counterintuitive thing about a well-designed JIT model is that it feels faster in day-to-day practice than the alternative. Not because there is less control, but because the control sits inside the flow rather than around it. No loose processes to route around, no ad-hoc exceptions that multiply. Whoever needs access requests it, and the flow handles the rest.

For security, compliance and audit teams, it means production access is finally explainable. For the engineers using it, it means the path to production is clear and does not depend on finding the right person at the right time. That is the value of governance that is technically built in instead of manually enforced.

What you get

JIT production access is at its best when temporary access is implemented as a standard platform function: clear to request, technically enforceable, integrated with existing processes, automatically cleaned up, and logged thoroughly enough to fully reconstruct later. That gives you more than reduced risk: it gives you a platform that keeps behaving like a mature system under pressure, including when something actually goes wrong.