
Passwordless Auth to Azure Key Vault using External Secrets Operator and Workload Identity

I want to fetch my secrets from Azure KV and I don't want to use any password for it. Let's see how this can be implemented.

This is yet another blog post (YABP) about ESO and Azure Workload Identity.

Why Passwordless Auth:

It is a common practice to use some sort of "master password" (SPN clientId, clientSecret, etc.) to access secret vaults (in this case, Azure KV), but that master password becomes a headache to manage (rotation, leak prevention, etc.). So, passwordless auth to AKV is ideal.

Why ESO:

This is discussed and addressed in the conclusion section.


Workload Identity (Passwordless Auth):

Let's start backwards (just for a change). I will try to explain how the passwordless auth works. This will make more sense when you read through the detailed implementation section.

Here's a sequence diagram to explain it:


There's no magic here. This is a well-documented process by Microsoft here.


The below diagram (directly copied from the official documentation) explains the handshaking process further:



Here's what I understood:

- OCP Cluster: Generates a ServiceAccount token (JWT) signed with its private key.
- ESO: Sends this JWT to Azure Entra ID, requesting an access token.
- Azure Entra ID: Reads the issuer ("iss") field in the JWT (https://xxxxxxxxxx.z13.web.core.windows.net).
- Azure Entra ID: Calls that URL (publicly available; hosted on Azure Blob Storage) to fetch the discovery document and the jwks.json.
- Issuer URL (hosted on AZ Storage): Returns the public keys.
- Azure Entra ID: Verifies the signature of the token from the OCP cluster.
- Azure Entra ID: Sends back an Azure access token to the OCP cluster.
- ESO: Calls Azure Key Vault with that token to fetch the needed secrets (as defined in the ExternalSecret CRD).
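
If you want to inspect the raw material of this handshake yourself, here's a minimal sketch (assuming the eso-az-sa ServiceAccount created later in this post already exists) that mints a short-lived token and decodes its payload:


# mint a short-lived token for the ESO service account
TOKEN=$(oc create token eso-az-sa -n external-secrets \
  --audience api://AzureADTokenExchange --duration=10m)

# decode the JWT payload (base64url -> base64, pad, decode) and print the key claims
PAYLOAD=$(echo "$TOKEN" | cut -d '.' -f2 | tr '_-' '/+')
case $(( ${#PAYLOAD} % 4 )) in 2) PAYLOAD="$PAYLOAD==";; 3) PAYLOAD="$PAYLOAD=";; esac
echo "$PAYLOAD" | base64 -d | jq '{iss, sub, aud}'


The iss, sub and aud printed here are exactly what Azure Entra ID verifies in the steps above.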

Detailed Implementation:

Here's a solution overview diagram:


Description of the components in this diagram:

The Azure Components:

1. Key Vault: 

It provides a secure, centralized repository for managing cryptographic keys, certificates, and secrets (such as passwords, connection strings, and API keys). This is where I will manage secrets/credentials such as DB creds, registry creds/dockerconfigjson, certs, system creds, etc.



az group create --name homelab --location eastus

AZ_KV_NAME=homelab-kv
az keyvault create --name $AZ_KV_NAME --resource-group "homelab" --location "eastus"




# Add current logged in user as officer to create secrets

# for service principal

USER_ID=$(az account show --query user.name -o tsv)
ASSIGNEE_OID=$(az ad sp show --id $USER_ID --query id -o tsv)

# or for real user
# ASSIGNEE_OID=$(az ad signed-in-user show --query id -o tsv)

SUBSCRIPTION=$(az account show --query id -o tsv)

az role assignment create --role "Key Vault Secrets Officer" \
  --assignee $ASSIGNEE_OID \
  --scope "/subscriptions/$SUBSCRIPTION/resourcegroups/homelab/providers/Microsoft.KeyVault/vaults/$AZ_KV_NAME"


# add docker registry creds
az keyvault secret set --vault-name $AZ_KV_NAME \
  --name "mirror-registry-cred" \
  --value "$DOCKER_CONFIG_JSON"

# DB CRED: Create the secret "app-pgsql" with a JSON-formatted value
# (this is the name the ExternalSecret references later)
az keyvault secret set --vault-name $AZ_KV_NAME \
  --name "app-pgsql" \
  --value '{"username":"superman", "password":"louiselane"}'


2. Issuer URL:

In order to perform authentication, the Microsoft Entra ID global service needs to download the cluster's public keys to verify the tokens OCP sends.

The networking condition here is that the cluster sits inside the homelab network: there is no ingress to the cluster from the internet, and the cluster can only egress over a 200 kbps connection.

To overcome this networking constraint, we can host the cluster's public keys (jwks.json) and metadata (openid-configuration) in a publicly accessible location. I chose Azure Blob Storage to host these.

Below is how I achieved it:

2a. Extract the OIDC details from OCP:

These are our discovery documents for the cluster.


oc get --raw /openid/v1/jwks > jwks.json


oc get --raw /.well-known/openid-configuration > openid-configuration.json #---> Need to change this later.


2b. Create Storage to host JWKS and OpenID Config:


export AZURE_STORAGE_ACCOUNT="xxxxxxxxxxxx"
az storage account create --resource-group "homelab" --name $AZURE_STORAGE_ACCOUNT
az storage container create --name '$web'

az storage blob service-properties update \
--account-name $AZURE_STORAGE_ACCOUNT \
--static-website --index-document index.html


2c. Update the openid-configuration



nano openid-configuration.json

# before

{
  "issuer": "https://kubernetes.default.svc",
  "jwks_uri": "https://api.homelab.local:6443/openid/v1/jwks",
  "response_types_supported": [
    "id_token"
  ],
  "subject_types_supported": [
    "public"
  ],
  "id_token_signing_alg_values_supported": [
    "RS256"
  ]
}


# after (nano modify)

{
  "issuer": "https://$AZURE_STORAGE_ACCOUNT.z13.web.core.windows.net/",
  "jwks_uri": "https://AZURE_STORAGE_ACCOUNT.z13.web.core.windows.net/openid/v1/jwks",
  "response_types_supported": [
    "id_token"
  ],
  "subject_types_supported": [
    "public"
  ],
  "id_token_signing_alg_values_supported": [
    "RS256"
  ]
}
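

If you'd rather not hand-edit the file, the same change can be scripted with jq (a sketch, assuming the raw document from step 2a was saved as openid-configuration.json):

BASE="https://${AZURE_STORAGE_ACCOUNT}.z13.web.core.windows.net"
jq --arg base "$BASE" \
  '.issuer = $base + "/" | .jwks_uri = $base + "/openid/v1/jwks"' \
  openid-configuration.json > openid-configuration.tmp \
  && mv openid-configuration.tmp openid-configuration.json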



2d. Upload the Discovery Documents to the AZ Blob Storage:



az storage blob upload --account-name $AZURE_STORAGE_ACCOUNT \
--container-name '$web' --file openid-configuration.json \
--name .well-known/openid-configuration --overwrite

az storage blob upload --account-name ${AZURE_STORAGE_ACCOUNT} \
--container-name '$web' --file jwks.json --name openid/v1/jwks --overwrite

# (optional)
az storage account update --name ${AZURE_STORAGE_ACCOUNT} \
  --resource-group "homelab" \
  --bypass AzureServices

# test
curl -s "https://${AZURE_STORAGE_ACCOUNT}.z13.web.core.windows.net/.well-known/openid-configuration" | jq .

curl -s "https://${AZURE_STORAGE_ACCOUNT}.z13.web.core.windows.net/openid/v1/jwks" | jq .



This is our publicly available OIDC issuer URL.

Note: I understand this looks suspiciously like a security issue, since it is hosted and publicly available. But it may not be. Here's how I rationalised it:

  • The OIDC issuer URL cannot be behind a private endpoint because Microsoft Entra ID (the service verifying your token) is a global, multi-tenant cloud service. It lives outside your private virtual network. When it needs to verify the cluster's token, it acts as a client on the public internet to fetch the jwks.json.
  • This isn't a security risk because, in the world of OIDC, the discovery document and JWKS are designed to be public. Even for major providers like Google or Microsoft, these files are intentionally open to the world. The files (jwks, openid-configuration) contain public keys and metadata only. An attacker can download the public key, but it cannot be used to generate a valid token; to forge a token they would need the private key, which only resides in the cluster (in the bound-service-account-signing-key secret).
  • As a simple analogy, the JWKS is like a physical signature on public display: anyone can look at it (to verify), but no one can use it to sign anything.


2e. Configure OpenShift to Use the OIDC Issuer:



oc patch authentication cluster --type=merge \
-p "{\"spec\":{\"serviceAccountIssuer\":\"https://${AZURE_STORAGE_ACCOUNT}.z13.web.core.windows.net/\"}}"



This is what makes the OCP API server issue signed JWTs with https://xxxxxxxxxx.z13.web.core.windows.net as the iss claim, and that's how Azure Entra ID knows which issuer URL to call to get the discovery documents for verification.
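
To confirm the patch took effect, check the cluster object and make sure the issuer resolves publicly (Entra ID fetches it over the internet). Note that already-running pods keep their old tokens until they are refreshed:


oc get authentication cluster -o jsonpath='{.spec.serviceAccountIssuer}{"\n"}'

curl -s "https://${AZURE_STORAGE_ACCOUNT}.z13.web.core.windows.net/.well-known/openid-configuration" | jq -r '.issuer'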

3. Managed Identity:

- The Managed Identity "owns" the permission. In this case the permission is "Key Vault Secrets User". It is well documented here.

- When ESO tries to talk to Key Vault, it isn't logging in as the "ESO pod"; it is logging in as homelab-eso-identity: "I am a Kubernetes pod, and I want to act as the Managed Identity homelab-eso-identity." Azure Entra ID now has a specific "account number" to look up. It goes directly to that identity, checks its Federated Credentials, and sees if they match the OCP token.


az identity create --name "homelab-eso-identity" --resource-group "homelab" 
# note the clientId for future use in the K8s SA. Also note the PrincipalID.

PRINCIPAL_ID=xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx
RESOURCE_GROUP=homelab
AZURE_KV_NAME=homelab-kv
az role assignment create --role "Key Vault Secrets User" --assignee $PRINCIPAL_ID \
--scope /subscriptions/$SUBSCRIPTION/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.KeyVault/vaults/$AZURE_KV_NAME
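
A quick sanity check that the identity exists and the role assignment landed (the clientId printed here is what the K8s ServiceAccount annotation will need later):

az identity show --name "homelab-eso-identity" --resource-group $RESOURCE_GROUP \
  --query "{clientId:clientId, principalId:principalId}" -o json

az role assignment list --assignee $PRINCIPAL_ID \
  --query "[].roleDefinitionName" -o tsv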



4. Federated Credential:


IDENTITY_NAME=homelab-eso-identity
RESOURCE_GROUP=homelab
az identity federated-credential create --name "ocp-eso-fed" --identity-name $IDENTITY_NAME \
  --resource-group $RESOURCE_GROUP \
  --issuer "https://${AZURE_STORAGE_ACCOUNT}.z13.web.core.windows.net/" \
  --subject "system:serviceaccount:external-secrets:eso-az-sa"

If the Managed Identity is the Lock and the Kubernetes token is the Key, the Federated Credential is the Ruleset that determines whether that specific key is allowed to turn that specific lock. Without a Federated Credential, Azure has no way of knowing whether a token it receives is legitimate.
When your OCP cluster (the ESO pod) sends a token, Azure asks three verification questions. The Federated Credential provides the pre-approved answers for the Managed Identity:

- Question 1: Who issued this? (Issuer): Azure checks: does the token's iss field match the issuer URL?

- Question 2: Who is the specific user? (Subject): Azure checks: does the token's sub field match system:serviceaccount:external-secrets:eso-az-sa?

- Question 3: Who is the intended audience? (Audience): Azure checks: does the token's aud field match api://AzureADTokenExchange? (This is the reserved default audience when the federated credential is created in Azure.)

If all three match perfectly, Azure accepts that the pod is who it says it is.
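
You can read those three pre-approved answers straight off the federated credential (field names as returned by the CLI):

az identity federated-credential show --name "ocp-eso-fed" \
  --identity-name $IDENTITY_NAME --resource-group $RESOURCE_GROUP \
  --query "{issuer:issuer, subject:subject, audiences:audiences}" -o json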

The OpenShift Components:

1. ESO (aka External Secrets Operator):

I deployed the Red Hat published version of this on my OpenShift 4.20 cluster. Here's a screenshot of the operator:



ESO can be deployed on OpenShift either via the OpenShift UI or declaratively via manifests. I used manifests:



---
apiVersion: v1
kind: Namespace
metadata:
  name: external-secrets-operator
  labels:
    openshift.io/cluster-monitoring: "true"
  annotations:
    workload.openshift.io/allowed: management

---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-external-secrets-operator
  namespace: external-secrets-operator
spec:
  targetNamespaces: []

---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-external-secrets-operator
  namespace: external-secrets-operator
spec:
  channel: stable-v1
  name: openshift-external-secrets-operator
  source: default-catalog-source
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic
  startingCSV: external-secrets-operator.v1.0.0

---
apiVersion: operator.openshift.io/v1alpha1
kind: ExternalSecretsConfig
metadata:
  labels:
    app: external-secrets-operator
    app.kubernetes.io/name: cluster
  name: cluster
spec:
  controllerConfig:
    networkPolicies:
    - componentName: ExternalSecretsCoreController
      egress:
      - {}
      name: allow-external-secrets-egress


2. Service Account: 

The service account here is a regular K8s SA. The one special thing about it is an annotation carrying the ClientID of the Azure Managed Identity (created earlier).


---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: eso-az-sa
  namespace: external-secrets
  annotations:
    azure.workload.identity/client-id: xxxxx # AZURE_IDENTITY_CLIENT_ID

# The annotation tells Azure which identity this account "becomes."


3. CRDs: 

The diagram shows 2 of ESO's custom resources.

3a. NS or Cluster scoped SecretStore: 

The SecretStore is a namespace-scoped CRD used by ESO to define how to authenticate and connect to an external secret provider (in this case, AKV). When the same needs to be defined/declared for the entire cluster, it is simply called a ClusterSecretStore (in this example I used a ClusterSecretStore). Below is what my ClusterSecretStore looks like:


---
apiVersion: external-secrets.io/v1
kind: ClusterSecretStore # Use Cluster level to save bandwidth/repetition
metadata:
  name: azure-kv-css
spec:
  provider:
    azurekv:
      authType: WorkloadIdentity
      tenantId: xxxx-xxx-xxxx
      environmentType: PublicCloud
      vaultUrl: "https://homelab-kv.vault.azure.net"
      serviceAccountRef:
        name: eso-az-sa
        namespace: external-secrets


3b. ExternalSecret:

Defines the desired state of a Kubernetes Secret: which external secret to fetch, how to transform it, and where to store it. It is a namespace-scoped object. When an ExternalSecret is deployed in a namespace, a paired K8s Secret object holding the secrets from the external provider (in this case, AKV) is created and managed by ESO.

There's nothing special about these K8s Secrets. A Pod or a Deployment uses them the same way it would any other K8s Secret. The only special thing is that they are managed by ESO, meaning that when a secret is rotated/changed/deleted in AKV, these K8s Secrets are rotated/changed/deleted too (on the next refresh).

Sample ExternalSecret for "imagePullSecret"


---
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: es-mirror-registry-creds
  namespace: group1
spec:
  refreshInterval: "4h" 
  secretStoreRef:
    name: azure-kv-css # Using the ClusterSecretStore you created earlier
    kind: ClusterSecretStore
  target:
    name: mirror-registry-creds # The name you'll use in 'imagePullSecrets'
    creationPolicy: Owner
    template:
      type: kubernetes.io/dockerconfigjson # Forces the correct K8s secret type
      data:
        .dockerconfigjson: "{{ `{{ .dockercfgjson | toString }}` }}"
  data:
  - secretKey: dockercfgjson
    remoteRef:
      key: mirror-registry-cred # The name of the secret in Azure


Sample ExternalSecret for "DB Credential"


---
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: es-pgsql-creds
  namespace: group1
spec:
  refreshInterval: "1h" # Crucial for your 200kbps link
  secretStoreRef:
    name: azure-kv-css
    kind: ClusterSecretStore
  target:
    name: pgsql-creds
  data:
  - secretKey: username
    remoteRef:
      key: app-pgsql
      property: username # Pulls "superman" from the JSON
  - secretKey: password
    remoteRef:
      key: app-pgsql
      property: password # Pulls "louiselane" from the JSON
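
After the ExternalSecrets are applied, ESO reports sync status on the objects themselves, and the paired K8s Secrets should appear (names per the samples above):

oc get externalsecret -n group1
oc get secret mirror-registry-creds pgsql-creds -n group1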



4. Workload:

It is a simple, in-pattern K8s Deployment.

Here's a Deployment.yaml file


---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dbreaderapp
  # This will be deployed into 'group1' via your AppSet
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dbreaderapp
  template:
    metadata:
      labels:
        app: dbreaderapp
    spec:
      imagePullSecrets:
        - name: mirror-registry-creds
      containers:
      - name: secret-reader
        image: mirror-registry.alishomelab.local:8443/docker/dbreader:v1
        imagePullPolicy: Always
        env:
          - name: USERNAME
            valueFrom:
              secretKeyRef:
                name: pgsql-creds
                key: username # Assumes your secret has a 'username' key
          - name: PASSWORD
            valueFrom:
              secretKeyRef:
                name: pgsql-creds
                key: password # Assumes your secret has a 'password' key



A DevOps process can either "kubectl create" them (the pair of Deployment.yaml and ExternalSecret.yaml), or a GitOps process can apply them in the cluster namespaces. This is great for DevOps or app deployment processes because it is almost on par with any traditional K8s app deployment (instead of kind: Secret you deploy kind: ExternalSecret).
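
For example (hypothetical filenames):

oc apply -n group1 -f externalsecret-pgsql.yaml -f deployment.yaml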



Conclusion:

I think I will conclude this post with an ESO vs Vault Injection (sidecar pattern) comparison.

ESO is an excellent tool for fetching secrets/credentials from a centrally managed key vault (eg: AKV) without disrupting any application deployment and delivery pattern. From an app dev and DevOps perspective it looks like in-pattern secret consumption from K8s. Consuming secrets via Vault Injection, on the other hand, may require custom or non-K8s-native implementations (eg: for imagePullSecret).

ESO is also a great fit when the network connection (to the central vault) is intermittent. With Vault Injection, every pod consuming at least one secret from Vault must call Vault; the cluster will certainly become chatty.

ESO does not mutate the workloads consuming secrets, which keeps it lightweight. Vault Injection adds memory consumption to every pod using it because of the injected sidecar.

Vault Injection is probably more security-friendly than ESO, since ESO creates a K8s Secret object holding the synced secret, which is only base64 encoded. That is not exactly leak-preventive.

In short: ESO is great for secret management; Vault Injection is good for securing as well as managing secrets. But that statement must be taken with a grain of salt.


That's it.

  

Photo: generated by Nano Banana
