Managing edge devices has long been a complex process: traditional IT ops tools fall short in distributed, low-connectivity environments with huge numbers of devices. Red Hat Edge Manager (open source project: FlightControl, GA'd by Red Hat in late Jan 2026) addresses these challenges by providing streamlined management of edge devices and applications through a declarative approach. Now, there's a fair bit to unpack here. But for simplicity, this is how I am going to map those 3 things here:
- Management of edge devices: I am mapping this to LCM (including upgrades, patches, etc.) of the underlying OS (in this case RHEL of the bootc flavour, or at least UBI-based RHEL).
- Managing applications: mapping this to deploying applications and LCM of the application stack on the OS.
- Declarative approach: this one is super interesting. To me this is very K8s-y, but in the world of edge devices running Linux (RHEL, as of today).
And then this thing also has MCP: that is my next probe. It is still fairly new, but from what I have found so far it certainly packs a lot of potential. Keep an eye here for my take on this (coming soon...)
The Setup:
Before we get onto its use cases (the 3 points mentioned above), let's quickly look at "the setup".
I have an OpenShift Hub cluster (OCP 4.20.15) running RHACM 2.15. I manage my edge servers (SNOs) from this RHACM. I deployed Red Hat Edge Manager on this Hub Cluster and exposed its UI, integrated into the Fleet Management UI (that comes with the RHACM UI).
Note: The RHEM (Red Hat Edge Manager) helm chart is only available from OCP 4.19 onwards. It also requires RHACM 2.15 for UI integration.
Below is the diagram to visualise it.
Demo
RHEM (+bootc) or AAP (+bootc)
To be honest, I am not sure if there's a generic "one over the other" answer here. It may depend on factors like operational process, team size, team skills, number of devices, upgrade cycles, release processes, etc. Here's a diagram to visualise the differences:
- RHEM + bootc is more like continuous delivery whereas AAP + bootc is more like releases in cycles.
- RHEM + bootc introduces separation of concerns by design, where OS releases, config changes and application releases are independent, whereas AAP + bootc needs to package applications, configurations etc. into the bootc container image and release them as part of the OS release.
- Continuous config drift management is baked in by design in RHEM, whereas with AAP it needs to be implemented. But bootc makes this technical difference fairly low-impact.
- When bootc is used, their differences become more of an architectural principle (of course technical differences exist, but both may perform the same). This is because bootc packages everything anyway, so separation of concerns becomes a design choice rather than a technical limitation.
Ok, with that out of the way, let's get into the technical bits and bobs.
RHEM Setup:
Deploying RHEM onto the Hub Cluster and exposing its UI in the Fleet Management interface was easy enough. The official documentation is pretty good (for the most part; I will list a few gotchas in this post). I followed it to deploy RHEM 1.0 on my OCP 4.20.
RHEM deployment on OCP:
Step1:
Create NS and extract the cert to be used.
kubectl create ns redhat-rhem
kubectl get configmap default-ingress-cert \
  -n openshift-config-managed \
  -o jsonpath='{.data.ca-bundle\.crt}' > /tmp/ingress-ca.crt
Step2:
Select the project and install RHEM by browsing Ecosystem > Software Catalog. Select the right operator (at the time of this post, only helm was available). Just before you deploy, go to "YAML view" and edit the YAML to add the ingress-ca.crt extracted previously. Finally, click Deploy. You can check the pods coming up in the created namespace (eg: redhat-rhem). Here's mine:
## most of it is default
global:
  auth:
    k8s:
      apiUrl: 'https://kubernetes.default.svc'
      createAdminUser: true
    oidc:
      organizationAssignment:
        organizationName: default
        type: static
      usernameClaim:
        - preferred_username
      roleAssignment:
        claimPath:
          - groups
        type: dynamic
      clientId: flightctl-client
    openshift:
      clusterControlPlaneUrl: 'https://kubernetes.default.svc'
      createAdminUser: true
    caCert: |
      -----BEGIN CERTIFICATE-----
      <extracted cert content>
      -----END CERTIFICATE-----
    insecureSkipTlsVerify: false
  gateway:
    ports:
      http: 80
      tls: 443
  enableMulticlusterExtensions: auto
  enableOpenShiftExtensions: auto
  exposeServicesMethod: auto
  generateCertificates: auto
  imagePullPolicy: IfNotPresent
  multiclusterEngineNamespace: multicluster-engine
db:
  builtin:
    image:
      image: registry.redhat.io/rhel9/postgresql-16
      tag: 9.6-1752571367
    storage:
      size: 60Gi
    resources:
      requests:
        cpu: 512m
        memory: 512Mi
    maxConnections: 200
  external:
    port: 5432
  name: flightctl
  type: builtin
kv:
  image:
    image: registry.redhat.io/rhel9/redis-7
    tag: 9.6-1752567986
  loglevel: warning
  maxmemory: 1gb
  maxmemoryPolicy: allkeys-lru
alertmanager:
  image:
    image: registry.redhat.io/rhacm2/prometheus-alertmanager-rhel9
    tag: v2.13.2
  enabled: true
api:
  image:
    image: registry.redhat.io/rhem/flightctl-api-rhel9
  rateLimit:
    trustedProxies:
      - 10.0.0.0/8
      - 172.16.0.0/12
      - 192.168.0.0/16
    authRequests: 20
    authWindow: 1h
    enabled: true
    requests: 300
    window: 1m
cliArtifacts:
  image:
    image: registry.redhat.io/rhem/flightctl-cli-artifacts-rhel9
  enabled: true
worker:
  image:
    image: registry.redhat.io/rhem/flightctl-worker-rhel9
  clusterLevelSecretAccess: false
periodic:
  image:
    image: registry.redhat.io/rhem/flightctl-periodic-rhel9
  consumers: 5
alertExporter:
  image:
    image: registry.redhat.io/rhem/flightctl-alert-exporter-rhel9
  enabled: true
alertmanagerProxy:
  image:
    image: registry.redhat.io/rhem/flightctl-alertmanager-proxy-rhel9
  enabled: true
telemetryGateway:
  image:
    image: registry.redhat.io/rhem/flightctl-telemetry-gateway-rhel9
ui:
  image:
    image: registry.redhat.io/rhem/flightctl-ui-rhel9
    pluginImage: registry.redhat.io/rhem/flightctl-ui-ocp-rhel9
  auth:
    insecureSkipTlsVerify: false
  enabled: true
clusterCli:
  image:
    image: registry.redhat.io/openshift4/ose-cli-rhel9
    tag: v4.20.0
dbSetup:
  image:
    image: registry.redhat.io/rhem/flightctl-db-setup-rhel9
  wait:
    sleep: 2
    timeout: 60
  migration:
    activeDeadlineSeconds: 0
    backoffLimit: 2147483647
  upgradeHooks:
    scaleDown:
      deployments:
        - flightctl-periodic
        - flightctl-worker
      condition: chart
      timeoutSeconds: 120
  databaseMigrationDryRun: true
Step3:
In my case (RHEM version 1.0), I had to deploy the RHEM console plugin manually (this may not be required in future versions; the helm deployment should be able to auto-detect RHACM presence and integrate the console plugin accordingly).
cat <<EOF | kubectl apply -f -
apiVersion: console.openshift.io/v1
kind: ConsolePlugin
metadata:
  name: flightctl-plugin
spec:
  displayName: 'Red Hat Edge Manager'
  backend:
    type: Service
    service:
      name: flightctl-ui
      namespace: redhat-rhem
      port: 8080
      basePath: '/'
  proxy:
    - alias: api-proxy
      authorization: UserToken
      endpoint:
        type: Service
        service:
          name: flightctl-ui # ---> pay attention here: point to the UI instead of the API.
          namespace: redhat-rhem
          port: 8080 # ---> pay attention here
EOF
kubectl patch cm flightctl-ui -n redhat-rhem --type=merge -p '{
  "data": {
    "FLIGHTCTL_SERVER": "/api/proxy/plugin/flightctl-plugin/api-proxy/",
    "FLIGHTCTL_SERVER_EXTERNAL": "/api/proxy/plugin/flightctl-plugin/api-proxy/"
  }
}'
kubectl patch console.operator.openshift.io cluster --type=json \
  -p='[{"op": "add", "path": "/spec/plugins/-", "value": "flightctl-plugin"}]'
Step4:
Create RHEM organization.
kubectl create ns rhem-default-org
oc label namespace rhem-default-org io.flightctl/instance=redhat-rhem
Note: from documentation, it is said that:
- Namespace-to-Organization Mapping: Red Hat Edge Manager uses a 1:1 mapping between OpenShift namespaces and Organizations.
- Automatic Discovery: The act of labeling a namespace with io.flightctl/instance=<helm_release-name> triggers the automatic discovery and initialization of that namespace as a Red Hat Edge Manager Organization. Here helm-release-name=redhat-rhem
Step5:
Configure authorization for this org:
oc adm policy add-role-to-user view admin -n rhem-default-org
oc adm policy add-role-to-user flightctl-admin-redhat-rhem admin -n rhem-default-org
That's it. RHEM should be up and running and accessible via the UI and API. See the screenshots below:
RHEM integration with RHEL:
In order to integrate the device with RHEM (ie: bring the device under RHEM's management) we need to do 2 things on the RHEL side. In my case, since I was using bootc, I added the code below to my bootc Dockerfile:
Install flightctl-agent
I added the below line in my Dockerfile:
RUN dnf -y install dnf-plugins-core && dnf clean all
RUN dnf config-manager --add-repo https://rpm.flightctl.io/flightctl-epel.repo
RUN dnf -y install flightctl-agent && dnf clean all
Add flightctl config to the OS
This is so that it knows where to report to and where to get the desired states from.
In order to do this I first needed to install the flightctl CLI on a local machine. In my case, I spun it up in Docker (Dockerfile). Then I executed the commands below:
flightctl login https://api.redhat-rhem.apps.myhub.xyz -k --token=$(oc whoami -t)
flightctl certificate request \
  --signer=enrollment \
  --expiration=365d \
  --output=embedded > config.yaml
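For reference, the generated config.yaml is roughly of this shape. This is only an illustrative sketch from my notes, not an authoritative schema; the real file generated by the CLI embeds the actual enrollment endpoint and base64 certificate/key material:

```yaml
# Illustrative sketch only -- the real file generated by
# `flightctl certificate request --output=embedded` contains
# actual endpoints and base64-encoded certificate data.
enrollment-service:
  service:
    server: https://agent-api.redhat-rhem.apps.myhub.xyz:443
    certificate-authority-data: <base64 CA bundle>
  authentication:
    client-certificate-data: <base64 enrollment certificate>
    client-key-data: <base64 enrollment key>
```

The point is simply that the agent gets both the "where to call home" endpoint and the credentials to enroll with, in one file.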
Then I used this config.yaml file like below in my Dockerfile:
Note: Since I am using BootC, managing the config.yaml is a critical part of the device identity lifecycle. The way the "bind" is done determines how the device first communicates with the management server.
RUN mkdir -p /etc/flightctl/
ADD config.yaml /etc/flightctl/config.yaml
- Early Binding (Baked into the Image): This is what I have done.
- Late Binding (Injected at Provisioning): In this scenario, the OS image is "generic," and the config.yaml is provided to the VM at the very first boot. This is done using cloud-init (NoCloud/ConfigDrive) or Ignition to write the file to /etc/flightctl/config.yaml during the hardware provisioning phase.
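As a sketch of the late-binding flavour, a minimal cloud-init NoCloud user-data that writes the agent config at first boot could look like the following. The write_files module is standard cloud-init; the content placeholder stands in for the config.yaml generated earlier:

```yaml
#cloud-config
# Sketch: late binding -- inject the flightctl agent config at first boot
# instead of baking it into the image.
write_files:
  - path: /etc/flightctl/config.yaml
    owner: root:root
    permissions: '0600'
    encoding: b64
    content: <base64-encoded config.yaml generated by the flightctl CLI>
```

With this approach the same generic image can be bound to different RHEM instances (or organizations) at provisioning time.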
That's it for the setup of RHEM.
Summarising what is done so far:
- deployed/enabled RHEM on OCP
- exposed its UI integrated with RHACM.
- established a pattern for early binding, registering a bootc RHEL 9.7 device (eg: a kiosk simulation) by adding the flightctl config.yaml to the bootc definition (ie: the Dockerfile).
Setup declarative desired state:
Below is a visualisation of how this works:
- I created a Git repo to store the declarative state definition files.
- I created a Repository object in RHEM containing a reference to the Git repo and authentication details (no, K8s Git auth creds won't work here, at least for now).
- I created a ResourceSync object that contains references such as which Git repo, which path, which branch, etc. This is the object that syncs the source from the Git repo into RHEM. This is almost GitOps-type behaviour, but it is limited to the Fleet object (at least for now).
- The Fleet object is the definition of the declarative state of the devices. It lives in the Git repo and gets created, updated, deleted, etc. in RHEM.
- When a device/OS (containing the config.yaml from the RHEM server) starts, it registers itself with RHEM. The system is represented as a Device object. These Device objects need to be approved. The devices can also be labeled to match a Fleet's selector.
Note: All these objects can be created without any Git repo or config-as-code, via the RHEM UI. But config-as-code is my preferred method.
Repository OBJ:
apiVersion: flightctl.io/v1beta1
kind: Repository
metadata:
  name: rhem-default
spec:
  url: https://github.com/xxxxxx/rhem-repo.git
  type: git
  httpConfig:
    username: xxxxxxxx
    password: xxxxxxxxx
  skipServerVerification: true
ResourceSync OBJ:
apiVersion: flightctl.io/v1beta1
kind: ResourceSync
metadata:
  name: fleet
spec:
  repository: rhem-default
  path: /rhem/fleets/default-fleet.yaml
  targetRevision: main
Note: Although these objects look like K8s objects, (I learned the hard way that) the RHEM deployment does not create CRDs for them (at least as of the time this post was written). Meaning kubectl create/apply won't work; it needs to be done using the flightctl CLI.
If these become CRDs then we could apply them using GitOps or RHACM policies.
flightctl apply -f repository.yaml
flightctl apply -f resourcesync.yaml
Fleet OBJ:
Now the cool part begins. The Fleet object contains the declarative definition for the devices that will be or are under its management. It specifies:
- OS image
- OS configs
- Applications
The above are great for:
- mitigating config drift
- lowering operating cost
apiVersion: flightctl.io/v1beta1
kind: Fleet
metadata:
  name: default
spec:
  selector:
    matchLabels:
      fleet: default
  template:
    metadata:
      labels:
        fleet: default
    spec:
      os:
        image: quay.io/alitestseverything/my-rhel9-bootc:kioskv2
      applications:
        - appType: container
          envVars: {}
          image: quay.io/alitestseverything/product-api:v1
          name: product-api
          ports:
            - '3000:3000'
      config:
        - name: "silence-audit-logs"
          inline:
            - path: "/etc/sysctl.d/20-quiet-printk.conf"
              content: "kernel.printk = 3 4 1 3"
              mode: 0644
        # STEP 1: Force the creation of the writable directory via tmpfiles
        # - name: prep-writable-storage
        #   inline:
        #     - path: "/etc/tmpfiles.d/kiosk.conf"
        #       content: |
        #         d /var/www 0755 root root -
        #         d /var/www/html 0755 root root -
        #         d /etc/flightctl/hooks.d/afterupdating 0755 root root -
        #       mode: 0644
        - name: prep-writable-storage
          inline:
            - path: "/etc/tmpfiles.d/kiosk.conf"
              content: |
                d /etc/flightctl/hooks.d/afterupdating 0755 root root -
              mode: 0644
        - name: kiosk-refresh-hook
          inline:
            - path: /etc/flightctl/hooks.d/afterupdating/10-refresh-browser.yaml
              content: |
                - run: /usr/local/bin/kiosk-refresh.sh
                  if:
                    - path: /var/lib/kiosk/html
                      op: [created, updated, deleted]
        # - name: gdm-restart-hook
        #   inline:
        #     - path: /etc/flightctl/hooks.d/afterupdating/11-restart-gdm.yaml
        #       content: |-
        #         - run: /usr/bin/systemctl restart gdm
        #           timeout: 10s
        #           if:
        #             - path: /var/lib/kiosk/html
        #               op: [created, updated, deleted]
        - name: motd-update
          inline:
            - path: "/etc/motd"
              content: "This system is managed by flightctl."
              mode: 0644
        - name: kiosk-frontend
          configType: GitConfigProviderSpec
          gitRef:
            repository: rhem-default
            targetRevision: main
            path: /rhem/kiosk
      resources:
        - monitorType: CPU
          alertRules:
            - severity: Warning
              duration: 10m
              percentage: 70
              description: 'CPU load is above 50% for more than 10 minutes'
          samplingInterval: 30s
      systemd:
        matchPatterns:
          - chronyd.service
Let's re-read the Fleet object definition. There's a lot happening here.
In order to make sense of this Fleet object and its importance, we will need to take a step back to the OS SOE; in my case it was a bootc RHEL 9.7 image defined using this Dockerfile. The way I have set up the SOE image release is:
- My bootc image contains a lean OS layer with essential, generic dependencies (eg: Chromium for the kiosk display).
- It contains dependencies for the apps (eg: podman installed).
- It has an initial placeholder for the front-end app (this was really just a placeholder; there was no real need for it).
- It does not have any apps.
- It does not have edge site specific configuration.
Then, once the device starts, registers itself for the first time with RHEM, and is approved and labeled with the fleet's label, the device inherits the desired state definition from the Fleet object. In this case, here is what happens:
- An application named product-api is deployed (container image quay.io/alitestseverything/product-api:v1).
- A front-end application (really a set of files) maintained in the Git dir /rhem/kiosk is also deployed. (Under the hood, the front-end connects to the back-end app, product-api, to provide the application functionality, and the browser displays the front-end app.)
- I have added 2 sample configurations at the OS layer, called "motd-update" and "silence-audit-logs". Imagine these are site-specific configs (eg: different printers, different networks, wifi settings, etc.).
- I have a RHEM-specific hook implementation called "kiosk-refresh-hook" to refresh the browser after the front-end app is applied.
- There is also a templating feature (eg: {{ .metadata.labels.region }}) to make these configurations a bit more dynamic. But it seems limited, at least in version 1.0. For example: I wanted to point to a local mirror registry dynamically (a common edge deployment pattern is to cache content closer to the site to cater for intermittent connectivity), but found that templating the registry URL is not possible as of yet.
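To illustrate the templating idea, label-based templating in a Fleet's config section might look like the following sketch. This is not a tested configuration: the "region" label and the sites/ directory layout are my own assumptions, and whether a given field supports templating needs checking against the docs.

```yaml
# Sketch: use a device label to select a per-region config directory.
# Assumes devices carry a "region" label (eg: region=apac).
config:
  - name: site-config
    configType: GitConfigProviderSpec
    gitRef:
      repository: rhem-default
      targetRevision: main
      path: "/rhem/sites/{{ .metadata.labels.region }}"
```

This would let one Fleet serve many sites, with each device pulling only its own region's configuration.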
The main things I will highlight here, in terms of what was in the RHEL OS before versus what was applied via the Fleet object, are:
- Application layer is deliberately segregated from OS layer. This is standard DevOps.
- Site specific configs are segregated keeping the OS lean and generic.
- By segregating these functionalities we keep the separation of concerns among teams intact, reduce inter-team dependency and keep the release processes independent.
- RHEM's Fleet object also provides a dashboard view of the "state of devices" across the fleet. This is like looking at the herd of cattle holistically (instead of at each animal individually).
In my opinion, these are super powerful things to have when managing devices in bulk, and they are the advantages of RHEM over traditional automated processes.
Note: spec.config[name: kiosk-frontend].path is a tricky one. The way this "path" works is that the flightctl agent on the OS/device maps the full path from there. For example, my GitHub dir structure looks like this:
rh-gitops.git
└── rhem
    └── kiosk
        └── var
            └── lib
                └── kiosk
                    └── html
                        ├── index.html, main.js, main.css
                        └── images
                            └── health.png, logo.png, etc.
The flightctl agent maps this and deploys to the local device (the one it is installed on; ie: the kiosk machine) like this: because I set path=/rhem/kiosk in the spec.config[name: kiosk-frontend].path field, everything under /rhem/kiosk in the repo is written relative to the device's root filesystem. So index.html, main.js, main.css and images/* end up in the /var/lib/kiosk/html/ dir. (Read this a few times to understand how the path field works.)
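A tiny shell illustration of that prefix-stripping behaviour (my own sketch of the mapping logic, not the agent's actual code):

```shell
# Sketch: how a repo file path maps to a device path when
# gitRef.path=/rhem/kiosk -- the configured path prefix is stripped
# and the remainder is applied relative to the device's root filesystem.
git_ref_path="/rhem/kiosk"
repo_file="/rhem/kiosk/var/lib/kiosk/html/index.html"

# Strip the gitRef.path prefix from the repo path.
device_path="${repo_file#"$git_ref_path"}"
echo "$device_path"   # /var/lib/kiosk/html/index.html
```

In other words, the part of the repo tree below gitRef.path *is* the device filesystem layout.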
Device OBJ:
Because I added the config.yaml file as part of the bootc image build (in the Dockerfile), when the VM/device starts it auto-registers itself with the RHEM server, but as "pending approval". This shows up as a Device. Here we can do either of 2 things:
- We can simply approve it (and give it a friendly name) and make changes in the UI or via the flightctl CLI.
- We can approve it (and give it a friendly name) and, during approval, assign labels matching a fleet definition. In my case, I labeled it fleet=default, matching the Fleet object created earlier. This makes the device inherit the configs defined in the Fleet object, and that's how things are done in mass/bulk (20K, 100K devices). This is really powerful when an ops team wants to manage tens of thousands of devices across different sites (could be at the edge, could be not so at the edge).
Here's a screenshot of a device in the Edge Manager UI with pending approval:
Upon approval and labeling to add it to a Fleet object, the device should pull the configs and images (apps and OS) as per the fleet definition.
It takes a few minutes to complete the "update" process; meaning pulling the configs and applying them to the device, pulling images from the image registry and running them as containers, and executing the hooks.
Conclusion:
RHEM at its 1.0 GA in Jan 2026 (although the open source FlightControl project has been around for a few years prior, and RHEM had been in tech preview before GA) comes packed with very cool features and functionality, and it certainly feels "made for edge". I also think it might be useful for fleet management even when the devices are not edge devices. It does have some limitations (eg: no templating for the container registry, no K8s CRDs for pure GitOps, no OSes beyond RHEL, etc.) but I am hopeful those may be on the roadmap.
I cannot form an opinion here on whether to use automation or RHEM for fleet management, because bootc narrows the gap. RHEM is a new tool and introduces changes to the ops process, and there may already be existing automated processes (using pipelines, AAP, etc.). But it certainly makes a very strong case for fleet management.