
Managing devices using Edge Manager

Managing edge devices has long been a complex process: traditional IT ops tools fall short in distributed, low-connectivity environments when managing huge quantities of devices. Red Hat Edge Manager (based on the open source Flight Control project, GA'd by Red Hat in late January 2026) solves these challenges by providing streamlined management of edge devices and applications through a declarative approach. Now, there's a fair bit to unpack here, but for simplicity this is how I am going to map those three things:

  • Management of edge devices: I am mapping this to LCM (including upgrade, patch, etc.) of the underlying OS (in this case RHEL of the bootc flavour, or at least UBI-based RHEL).
  • Managing applications: mapping this to deploying applications and LCM of the application stack on the OS.
  • Declarative approach: this one is super interesting. To me this is very K8s-y, but in the world of edge devices running Linux (RHEL, as of today).
And then this thing also has MCP: this is my next probe. It is still fairly new, but from what I have found so far, it certainly packs a lot of potential. Keep an eye here for my take on this (coming soon...).

The Setup:

Before we get onto its use cases (the three points mentioned above), let's quickly look at "the setup".

I have an OpenShift Hub cluster (OCP 4.20.15) running RHACM 2.15, and I manage my edge servers (SNOs) from this RHACM. I deployed Red Hat Edge Manager on this Hub cluster and exposed its UI, integrated into the Fleet Management UI that comes with RHACM.

Note: The RHEM (Red Hat Edge Manager) helm chart is only available from OCP 4.19 onwards. It also requires RHACM 2.15 for UI integration.

Below is a diagram to visualise it.


Demo

RHEM (+bootc) or AAP (+bootc)

To be honest, I am not sure if there's a generic "one over the other" answer here. It may depend on factors like operational process, team size, team skills, number of devices, upgrade cycle, release processes, etc. Here's a diagram to visualise the differences:


In my opinion:
  • RHEM + bootc is more like continuous delivery, whereas AAP + bootc is more like releases in cycles.
  • RHEM + bootc introduces separation of concerns by design, where OS releases, config changes and application releases are independent, whereas AAP + bootc needs to package applications, configurations, etc. into the bootc container image and release them as part of the OS release.
  • Continuous config drift management is baked in by design in RHEM, whereas with AAP it needs to be implemented. That said, bootc makes this technical difference much less impactful.
  • When bootc is used, the difference becomes more of an architectural principle (of course technical differences exist, but both may perform the same). This is because bootc packages everything anyway, so separation of concerns becomes a design choice rather than a technical limitation.

Ok, with that out of the way, let's get into the technical bits and bobs.

RHEM Setup:

Deploying RHEM onto the Hub cluster and exposing its UI in the Fleet Management interface was easy enough. The official documentation is pretty good (for the most part; I will list a few gotchas in this post). I followed it to deploy RHEM 1.0 on my OCP 4.20.

RHEM deployment on OCP:


Step 1: 
Create the namespace and extract the cert to be used.

kubectl create ns redhat-rhem

kubectl get configmap default-ingress-cert \
    -n openshift-config-managed \
    -o jsonpath='{.data.ca-bundle\.crt}' > /tmp/ingress-ca.crt


Step 2:
Select the project and install RHEM by browsing Ecosystem > Software Catalog. Select the right offering (at the time of this post, only the helm chart was available). Just before you deploy, go to "YAML view" and edit the YAML to add the ingress-ca.crt extracted previously. Finally, click Deploy. You can check the pods coming up in the created namespace (eg: redhat-rhem). Here's mine:


## most of it is default


global:
  auth:
    k8s:
      apiUrl: 'https://kubernetes.default.svc'
      createAdminUser: true
    oidc:
      organizationAssignment:
        organizationName: default
        type: static
      usernameClaim:
        - preferred_username
      roleAssignment:
        claimPath:
          - groups
        type: dynamic
      clientId: flightctl-client
    openshift:
      clusterControlPlaneUrl: 'https://kubernetes.default.svc'
      createAdminUser: true
    caCert: |
      -----BEGIN CERTIFICATE-----
      extracted cert content
      -----END CERTIFICATE-----    
    insecureSkipTlsVerify: false
  gateway:
    ports:
      http: 80
      tls: 443
  enableMulticlusterExtensions: auto
  enableOpenShiftExtensions: auto
  exposeServicesMethod: auto
  generateCertificates: auto
  imagePullPolicy: IfNotPresent
  multiclusterEngineNamespace: multicluster-engine
db:
  builtin:
    image:
      image: registry.redhat.io/rhel9/postgresql-16
      tag: 9.6-1752571367
    storage:
      size: 60Gi
    resources:
      requests:
        cpu: 512m
        memory: 512Mi
    maxConnections: 200
  external:
    port: 5432
  name: flightctl
  type: builtin
kv:
  image:
    image: registry.redhat.io/rhel9/redis-7
    tag: 9.6-1752567986
  loglevel: warning
  maxmemory: 1gb
  maxmemoryPolicy: allkeys-lru
alertmanager:
  image:
    image: registry.redhat.io/rhacm2/prometheus-alertmanager-rhel9
    tag: v2.13.2
  enabled: true
api:
  image:
    image: registry.redhat.io/rhem/flightctl-api-rhel9
  rateLimit:
    trustedProxies:
      - 10.0.0.0/8
      - 172.16.0.0/12
      - 192.168.0.0/16
    authRequests: 20
    authWindow: 1h
    enabled: true
    requests: 300
    window: 1m
cliArtifacts:
  image:
    image: registry.redhat.io/rhem/flightctl-cli-artifacts-rhel9
  enabled: true
worker:
  image:
    image: registry.redhat.io/rhem/flightctl-worker-rhel9
  clusterLevelSecretAccess: false
periodic:
  image:
    image: registry.redhat.io/rhem/flightctl-periodic-rhel9
  consumers: 5
alertExporter:
  image:
    image: registry.redhat.io/rhem/flightctl-alert-exporter-rhel9
  enabled: true
alertmanagerProxy:
  image:
    image: registry.redhat.io/rhem/flightctl-alertmanager-proxy-rhel9
  enabled: true
telemetryGateway:
  image:
    image: registry.redhat.io/rhem/flightctl-telemetry-gateway-rhel9
ui:
  image:
    image: registry.redhat.io/rhem/flightctl-ui-rhel9
    pluginImage: registry.redhat.io/rhem/flightctl-ui-ocp-rhel9
  auth:
    insecureSkipTlsVerify: false
  enabled: true
clusterCli:
  image:
    image: registry.redhat.io/openshift4/ose-cli-rhel9
    tag: v4.20.0
dbSetup:
  image:
    image: registry.redhat.io/rhem/flightctl-db-setup-rhel9
  wait:
    sleep: 2
    timeout: 60
  migration:
    activeDeadlineSeconds: 0
    backoffLimit: 2147483647
upgradeHooks:
  scaleDown:
    deployments:
      - flightctl-periodic
      - flightctl-worker
    condition: chart
    timeoutSeconds: 120
  databaseMigrationDryRun: true
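If you prefer the CLI over the console, the same values file can be fed to a helm install. This is only a sketch: the OCI chart reference below comes from the upstream Flight Control project docs, and the Red Hat-supported RHEM chart delivered through the OCP software catalog may be published at a different location, so verify the chart source for your environment.

```shell
# Hedged sketch: install/upgrade via the helm CLI instead of the console.
# Chart reference is the upstream Flight Control OCI chart (assumption --
# the Red Hat-supported RHEM chart may live elsewhere).
helm upgrade --install flightctl \
  oci://quay.io/flightctl/charts/flightctl \
  --namespace redhat-rhem --create-namespace \
  --values values.yaml    # the values shown above, saved locally

# Watch the pods come up
kubectl get pods -n redhat-rhem -w
```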


Step 3:
In my case (RHEM version 1.0), I had to deploy the RHEM console plugin manually (this may not be required in future versions; the helm deployment should be able to auto-detect RHACM presence and integrate the console plugin accordingly).

cat <<EOF | kubectl apply -f -
apiVersion: console.openshift.io/v1
kind: ConsolePlugin
metadata:
  name: flightctl-plugin
spec:
  displayName: 'Red Hat Edge Manager'
  backend:
    type: Service
    service:
      name: flightctl-ui
      namespace: redhat-rhem
      port: 8080
      basePath: '/'
  proxy:
    - alias: api-proxy
      authorization: UserToken
      endpoint:
        type: Service
        service:
          name: flightctl-ui        #---> pay attention here. point to UI instead of API.
          namespace: redhat-rhem
          port: 8080                #---> pay attention here
EOF

kubectl patch cm flightctl-ui -n redhat-rhem --type=merge -p '{
  "data": {
    "FLIGHTCTL_SERVER": "/api/proxy/plugin/flightctl-plugin/api-proxy/",
    "FLIGHTCTL_SERVER_EXTERNAL": "/api/proxy/plugin/flightctl-plugin/api-proxy/"
  }
}'

kubectl patch console.operator.openshift.io cluster \
--type=json -p='[{"op": "add", "path": "/spec/plugins/-", "value": "flightctl-plugin"}]'


Step 4:
Create the RHEM organization.

kubectl create ns rhem-default-org
oc label namespace rhem-default-org io.flightctl/instance=redhat-rhem


Note: the documentation says:
  • Namespace-to-Organization Mapping: Red Hat Edge Manager uses a 1:1 mapping between OpenShift namespaces and Organizations.
  • Automatic Discovery: labeling a namespace with io.flightctl/instance=<helm-release-name> triggers the automatic discovery and initialization of that namespace as a Red Hat Edge Manager Organization. Here, helm-release-name=redhat-rhem.
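Taken together, these two points mean that adding another organization is just a matter of creating and labeling another namespace. A minimal sketch (the namespace name rhem-apac-org is hypothetical):

```shell
# Hypothetical second organization: per the 1:1 mapping, a new labeled
# namespace is discovered and initialized as another RHEM Organization.
kubectl create ns rhem-apac-org
kubectl label ns rhem-apac-org io.flightctl/instance=redhat-rhem
```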
Step 5:
Configure authorization for this org:

oc adm policy add-role-to-user view admin -n rhem-default-org
oc adm policy add-role-to-user flightctl-admin-redhat-rhem admin -n rhem-default-org
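To sanity-check that the bindings took effect, oc can impersonate the user and list the role bindings (standard oc commands; the user name "admin" matches the commands above, so adjust to yours):

```shell
# Verify the RBAC for the org namespace
oc auth can-i get pods -n rhem-default-org --as=admin   # expect: yes (via "view")
oc get rolebindings -n rhem-default-org
```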



That's it. RHEM should be up and running and accessible via UI and API. See the screenshots below:




RHEM integration with RHEL:

In order to integrate a device with RHEM (ie: bring the device under RHEM's management), we need to do two things on the RHEL side. In my case, since I was using bootc, I added the code below to my bootc Dockerfile:

Install flightctl-agent

I added the below lines to my Dockerfile:
RUN dnf -y install dnf-plugins-core && dnf clean all
RUN dnf config-manager --add-repo https://rpm.flightctl.io/flightctl-epel.repo
RUN dnf -y install flightctl-agent && dnf clean all

Add the flightctl config to the OS 

This is so that the agent knows where to report to and where to get the desired state from.
To do this, I first needed to install the flightctl CLI on my local machine. In my case, I spun up a container for it (Dockerfile). Then I executed the commands below:

flightctl login https://api.redhat-rhem.apps.myhub.xyz -k --token=$(oc whoami -t)

flightctl certificate request \
--signer=enrollment \
--expiration=365d --output=embedded > config.yaml

Then I used this config.yaml file like below in my Dockerfile:
RUN mkdir -p /etc/flightctl/
ADD config.yaml /etc/flightctl/config.yaml
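Putting the two pieces together, the relevant portion of the bootc Containerfile looks roughly like this. The base image tag and the systemctl enable line are assumptions on my part (the upstream Flight Control docs enable the agent unit this way); adjust to your image:

```dockerfile
# Sketch: bootc Containerfile fragment combining agent install + early binding.
# Base image tag is an assumption -- match it to your RHEL release.
FROM registry.redhat.io/rhel9/rhel-bootc:9.7

# Install the flightctl agent from the upstream repo
RUN dnf -y install dnf-plugins-core && dnf clean all
RUN dnf config-manager --add-repo https://rpm.flightctl.io/flightctl-epel.repo
RUN dnf -y install flightctl-agent && dnf clean all

# Early binding: bake the enrollment config into the image
RUN mkdir -p /etc/flightctl/
ADD config.yaml /etc/flightctl/config.yaml

# Make sure the agent starts at boot (per upstream docs)
RUN systemctl enable flightctl-agent.service
```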

Note: Since I am using BootC, managing the config.yaml is a critical part of the device identity lifecycle. The way the "bind" is done determines how the device first communicates with the management server. 
  • Early Binding (Baked into the Image): This is what I have done. 
  • Late Binding (Injected at Provisioning): In this scenario, the OS image is "generic," and the config.yaml is provided to the VM at the very first boot. This is done using cloud-init (NoCloud/ConfigDrive) or Ignition to write the file to /etc/flightctl/config.yaml during the hardware provisioning phase.
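For the late-binding route, a minimal cloud-init user-data sketch might look like the following (NoCloud datasource assumed; the write_files keys are standard cloud-init):

```yaml
#cloud-config
# Late-binding sketch: write the enrollment config at first boot.
write_files:
  - path: /etc/flightctl/config.yaml
    owner: root:root
    permissions: '0600'
    # paste the contents of the config.yaml generated with
    # "flightctl certificate request" here, indented under content
    content: |
      # ...enrollment endpoint, certs, etc...
```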
That's it for the setup of RHEM.
Summarising what has been done so far:
  • deployed/enabled RHEM on OCP 
  • exposed its UI integrated with RHACM 
  • established an early-binding pattern for registering a bootc RHEL 9.7 device (eg: a kiosk simulation) by adding the flightctl config.yaml in the bootc definition (ie: Dockerfile)

Setup declarative desired state:

Below is a visualisation of how this works:


Describing the diagram:
  1. I created a Git repo to store the declarative state definition files.
  2. I created a Repository object in RHEM containing a reference to the Git repo and authentication details (no, k8s git auth creds won't work here, at least for now).
  3. I created a ResourceSync object that contains references such as which Git repo, which path, which branch, etc. This is the object that syncs the source code from the Git repo into RHEM. It is almost GitOps-like behaviour, but it is limited to the Fleet object (at least for now).
  4. The Fleet object is the declarative state definition of the devices. It lives in the Git repo and gets created, updated, deleted, etc. in RHEM.
  5. When a device/OS (containing the config.yaml from the RHEM server) starts, it registers itself with RHEM. The system is represented as a Device object. These Device objects need to be approved. The devices can also be tagged with labels matching a Fleet's selector.
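The fleet-to-device matching in step 5 works like Kubernetes label selectors. Here is a hypothetical Python sketch (not the actual RHEM code) of that matching logic; the device names and labels are made up:

```python
# Toy illustration of Fleet spec.selector.matchLabels picking up Device
# objects by label, in the same spirit as Kubernetes label selectors.

def matches_fleet(device_labels: dict, match_labels: dict) -> bool:
    """A device matches when every selector label is present with the same value."""
    return all(device_labels.get(k) == v for k, v in match_labels.items())

fleet_selector = {"fleet": "default"}

devices = [
    {"name": "kiosk-01", "labels": {"fleet": "default", "region": "east"}},
    {"name": "kiosk-02", "labels": {"region": "west"}},  # not yet tagged
]

matched = [d["name"] for d in devices if matches_fleet(d["labels"], fleet_selector)]
print(matched)  # ['kiosk-01']
```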

Note: all these objects can be created via the RHEM UI without any Git repo or config-as-code. But config-as-code is my preferred method.

Repository OBJ:

apiVersion: flightctl.io/v1beta1
kind: Repository
metadata:
  name: rhem-default
spec:
  url: https://github.com/xxxxxx/rhem-repo.git
  type: git
  httpConfig:
    username: xxxxxxxx
    password: xxxxxxxxx
    skipServerVerification: true


ResourceSync OBJ:

apiVersion: flightctl.io/v1beta1
kind: ResourceSync
metadata:
  name: fleet
spec:
  repository: rhem-default
  path: /rhem/fleets/default-fleet.yaml
  targetRevision: main

Note: although these objects look like K8s objects, (I learned the hard way that) the RHEM deployment does not create CRDs for them (at least, as of the time this post is written). Meaning, kubectl create/apply won't work; it needs to be done using the flightctl CLI.
If these become CRDs, then we could apply them using GitOps or RHACM policies.

flightctl apply -f repository.yaml
flightctl apply -f resourcesync.yaml

Fleet OBJ:

Now the cool part begins. The Fleet object contains the declarative definition for the devices that will be or are under its management. It specifies:
  • OS image
  • OS configs
  • Applications
The above are great for:
  • mitigating config drift
  • lowering operating cost


apiVersion: flightctl.io/v1beta1
kind: Fleet
metadata:
  name: default
spec:
  selector:
    matchLabels:
      fleet: default
  template:
    metadata:
      labels:
        fleet: default
    spec:
      os:
        image: quay.io/alitestseverything/my-rhel9-bootc:kioskv2
      applications:
        - appType: container
          envVars: {}
          image: quay.io/alitestseverything/product-api:v1
          name: product-api
          ports:
            - '3000:3000'
      config:
        - name: "silence-audit-logs"
          inline:
            - path: "/etc/sysctl.d/20-quiet-printk.conf"
              content: "kernel.printk = 3 4 1 3"
              mode: 0644
        # STEP 1: Force the creation of the writable directory via tmpfiles
        # - name: prep-writable-storage
        #   inline:
        #     - path: "/etc/tmpfiles.d/kiosk.conf"
        #       content: |
        #         d /var/www 0755 root root -
        #         d /var/www/html 0755 root root -
        #         d /etc/flightctl/hooks.d/afterupdating 0755 root root -
        #       mode: 0644
        - name: prep-writable-storage
          inline:
            - path: "/etc/tmpfiles.d/kiosk.conf"
              content: |
                d /etc/flightctl/hooks.d/afterupdating 0755 root root -
              mode: 0644
        - name: kiosk-refresh-hook
          inline:
            - path: /etc/flightctl/hooks.d/afterupdating/10-refresh-browser.yaml
              content: |
                - run: /usr/local/bin/kiosk-refresh.sh
                  if:
                    - path: /var/lib/kiosk/html
                      op: [created, updated, deleted]
        # - name: gdm-restart-hook
        #   inline:
        #     - path: /etc/flightctl/hooks.d/afterupdating/11-restart-gdm.yaml
        #       content: |-
        #         - run: /usr/bin/systemctl restart gdm
        #           timeout: 10s
        #           if:
        #             - path: /var/lib/kiosk/html
        #               op: [created, updated, deleted]
        - name: motd-update
          inline:
            - path: "/etc/motd"
              content: "This system is managed by flightctl."
              mode: 0644
        - name: kiosk-frontend
          configType: GitConfigProviderSpec
          gitRef:
            repository: rhem-default
            targetRevision: main
            path: /rhem/kiosk
      resources:
        - monitorType: CPU
          alertRules:
            - severity: Warning
              duration: 10m
              percentage: 70
              description: 'CPU load is above 50% for more than 10 minutes'
              samplingInterval: 30s
      systemd:
        matchPatterns:
          - chronyd.service

Let's re-read the Fleet object definition. There's a lot happening here.  

To make sense of the importance of this Fleet object, we need to take a step back to the OS SOE; in my case this was a bootc RHEL 9.7 image defined using a Dockerfile. The way I set up the SOE image release is:
  • My bootc image contains a lean OS layer with essential, generic dependencies (eg: chromium for the kiosk display).
  • It contains dependencies for the apps (eg: podman, installed).
  • It has an initial placeholder for the front-end app (really just cosmetic; it wasn't strictly needed).
  • It does not contain any apps.
  • It does not contain edge-site-specific configuration.
Then, once the device starts, registers itself for the first time with RHEM, is approved and is tagged with the fleet's label, the device inherits the desired state definition from the Fleet object. In this case, here is what happens:
  • An application named product-api is deployed from the container image quay.io/alitestseverything/product-api:v1.
  • A front-end application (really a set of files) maintained in the Git dir /rhem/kiosk is also deployed. (Under the hood, the front-end connects to the backend product-api app to provide the application functionality, and the browser displays the front-end app.)
  • I added two sample configurations at the OS layer, called "motd-update" and "silence-audit-logs". Imagine these are site-specific configs (eg: different printers, different networks, wifi settings, etc.).
  • I have a RHEM-specific hook implementation called "kiosk-refresh-hook" to refresh the browser after the front-end app is applied.
  • There is a templating feature as well (eg: {{ .metadata.labels.region }}) to make these configurations a bit more dynamic. But, at least in version 1.0, it seems limited. For example: I wanted to point to a local mirror registry dynamically (a common edge deployment pattern is to cache content closer to the site to cater for intermittent connectivity), but found that templating the registry URL is not possible as of yet.
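To show what that templating feature buys you, here is a toy Python simulation (not the real Go-template engine RHEM uses) of resolving a per-device placeholder from the device's labels; the template string and labels are made up:

```python
# Toy simulation of fleet templating: resolve placeholders like
# {{ .metadata.labels.region }} from a device's labels.
import re

def render(template: str, device: dict) -> str:
    def resolve(match):
        # Look up the captured label name; missing labels resolve to ""
        return device["metadata"]["labels"].get(match.group(1), "")
    return re.sub(r"\{\{\s*\.metadata\.labels\.(\w+)\s*\}\}", resolve, template)

device = {"metadata": {"labels": {"region": "apac", "fleet": "default"}}}
print(render("site-config-{{ .metadata.labels.region }}.conf", device))
# site-config-apac.conf
```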
The main things I will highlight here, in terms of what was in the RHEL OS before versus what was applied via the Fleet object, are:
  • The application layer is deliberately segregated from the OS layer. This is standard DevOps. 
  • Site-specific configs are segregated, keeping the OS lean and generic. 
  • By segregating these functionalities we keep the separation of concerns among teams intact, reduce team dependency and keep the release processes independent. 
  • RHEM's Fleet object also provides a dashboard view of the "state of devices" across the fleet. This is like looking at the herd of cattle holistically (instead of each animal individually). 
In my opinion, these are super powerful things to have when managing devices in bulk, and they are the advantages of RHEM over traditional automated processes.


Note: spec.config[name: kiosk-frontend].path is a tricky one. The way this "path" works is that the flightctl agent on the OS/device maps the full path from there. For example: my GitHub dir structure looks like below:

rh-gitops.git
-rhem
--kiosk
---var
----lib
-----kiosk
------html
-------index.html, main.js, main.css
-------images
--------health.png, logo.png etc

The flightctl agent maps this and deploys to the local device (the one it is installed on; ie: the kiosk machine) like this: it deploys index.html, main.js, main.css and images/* to the /var/lib/kiosk/html/ dir, because I specified path=/rhem/kiosk in the spec.config[name: kiosk-frontend].path field. (Read this a few times to understand how the path field works.)
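In other words, the configured path acts as a prefix that gets stripped, and whatever remains becomes the absolute path on the device. A hypothetical Python illustration (not the agent's real code) of that mapping:

```python
# Toy illustration: the GitConfigProviderSpec "path" is a prefix that is
# stripped from each repo file path; the remainder is the on-device path.
from pathlib import PurePosixPath

def device_path(repo_file: str, spec_path: str) -> str:
    """Strip the configured spec path prefix; what's left is the on-device path."""
    rel = PurePosixPath(repo_file).relative_to(PurePosixPath(spec_path))
    return str(PurePosixPath("/") / rel)

repo_files = [
    "/rhem/kiosk/var/lib/kiosk/html/index.html",
    "/rhem/kiosk/var/lib/kiosk/html/images/logo.png",
]

mapped = [device_path(f, "/rhem/kiosk") for f in repo_files]
print(mapped[0])  # /var/lib/kiosk/html/index.html
print(mapped[1])  # /var/lib/kiosk/html/images/logo.png
```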

Device OBJ:

Because I added the config.yaml file as part of the bootc image build (in the Dockerfile), when the VM/device starts it auto-registers itself with the RHEM server, but as "pending approval". This shows up as a Device. Here we can do either of two things:
  • We can simply approve it (and give it a friendly name) and make changes in the UI or via the flightctl CLI.
  • We can approve it (and give it a friendly name), and during approval assign labels matching a fleet definition. In my case, I labeled it "fleet=default", matching it to the Fleet OBJ created earlier. This makes the device inherit the configs defined in the Fleet object, and that's how things are done in bulk (20K, 100K devices). This is really powerful when an ops team wants to manage tens of thousands of devices across different sites (could be at the edge, could be not so at the edge).
Here's a screenshot of a device in the Edge Manager UI with pending approval:


In this case the device has just finished starting and registered in RHEM. Here's a screenshot of the kiosk simulation device (running on a low-powered qemu VM):


Notice the front-end app simply displays a placeholder, "initial bootc rhel".

Upon approval and tagging to add it to a Fleet object, the device should pull the configs and images (apps and OS) as per the fleet definition. 


The device starts "updating" as soon as it is approved.


It takes a few minutes to complete the "update" process; meaning pulling the configs and applying them to the device, pulling images from the image registry and running them as containers, and executing the hooks.


This also means, as per my fleet/desired state definition, the device is also "refreshed" with a running/functioning app (instead of the initial placeholder). 




Conclusion:

RHEM at its 1.0 GA in Jan 2026 (although the open source Flight Control project has been around for a few years prior, and RHEM had been in tech preview before that) comes packed with very cool features and functionality, and it certainly feels "made for edge". I also think it might be useful for fleet management even when the devices are not edge devices. It does have some limitations (eg: no templating for the container registry, no K8s CRDs for pure GitOps, no OSes beyond RHEL, etc.), but I am hopeful those may be on the roadmap.

I cannot form a generic opinion here on whether to use automation or RHEM for fleet management, because bootc narrows the gap. RHEM is a new tool and introduces changes to the ops process, and there may already be existing automated processes in place (using pipelines, AAP, etc.). But it certainly makes a very strong case for fleet management.

 
