Rust for IoT: Building a Secure ESP32 Weather Sensor with MQTT, TLS, and OTA Updates

./esp32.jpg

During downtime between freelance projects, I often use the time to learn new things. This time, I explored the world of IoT and built a weather sensor with an ESP32 board, then decided to document the experience.

The sensor reads temperature, humidity, pressure, CO₂, and air-quality data and sends it securely using MQTT over TLS.

It also supports over-the-air (OTA) firmware upgrades. New versions of firmware are served by a minimal OTA backend that pulls versioned binaries from an OCI registry. This post captures the engineering behind it.

Defining the requirements

For the sensors, I wanted a device that:

  • Is developed with no_std libraries
  • Reports environmental data using MQTT over TLS (aka MQTTS)
  • Uses UART or I2C protocols to read sensor data
  • Can be extended with multiple sensors (bme280, sds011, scd30, etc.)
  • Supports safe remote OTA updates and stores firmware in an OCI-compliant registry

For hosting, I wanted the infrastructure to run on a Talos-managed Kubernetes cluster using recent software releases such as InfluxDB v3.

Given how much care memory management demands in embedded programming, I decided to use Rust as the programming language.

Rust gave me control, safety, and the right tooling for low-level, async, resource-constrained projects.

For the sake of learning, I avoided esp-idf-svc (IDF stands for IoT Development Framework), which would have simplified some aspects but came with limitations (e.g. no MQTT v5 support at the time).

The setup

The diagram below provides an overview of the interactions between the ESP32 and the rest of the stack.

[Diagram: an ESP32 with a BME280 sensor publishes data over MQTTS through the Ingress-NGINX TCP proxy (port 8883) to NanoMQ, then on to Telegraf, InfluxDB, and Grafana; firmware is served by OtaFlux (OTA), backed by Harbor (OCI).]

Hardware

  • ESP32 DevKit v1
  • BME280 sensor for temperature, humidity, pressure
  • Optional SCD30 for temperature, humidity, CO₂
  • Optional SDS011 for PM2.5 and PM10

Infrastructure using Kubernetes

The entire backend stack runs on a Kubernetes cluster built with Talos Linux. Talos is simple to use and provides an immutable, minimal OS for secure, production-grade Kubernetes nodes.

The cluster runs the following components:

  • NanoMQ for MQTTS message routing
  • InfluxDB v3 for storing time-series sensor data
  • Grafana for dashboards and visualizations
  • cert-manager to manage TLS certificates
  • Ingress NGINX for external access; TCP port 8883 is routed directly to the NanoMQ server, which terminates TLS itself
  • OtaFlux to serve firmware updates
  • Harbor as a private OCI registry

I won’t go into detail about each component and how it works, but if you are interested, all config and manifests are stored in this GitHub repo: homie-lab.

This setup gives me full declarative control and a reliable OTA delivery infrastructure.

Firmware

The firmware is structured around async tasks and static buffers:

  • esp-hal and embassy provide ESP32 chip interaction, Wi-Fi management, and task orchestration
  • Sensor read + MQTT publish loop using rust-mqtt
  • OTA version check at boot and every N cycles (default: every hour) using esp-hal-ota, which also supports automatic rollback in case of failure
  • MQTT over TLS 1.2 using esp-mbedtls

The app connects to Wi-Fi, initializes I2C/UART sensors, and cycles between publishing measurements and checking for updates. In case of network failure, there is a retry mechanism in place.
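The post does not describe how the retry mechanism works, so the following is only a minimal sketch of one common approach: capped exponential backoff. The names `backoff_delay` and `retry`, and the injected `sleep` callback, are hypothetical; on the device the sleep would be an async timer rather than a closure.

```rust
use core::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    // `checked_mul` returns None on overflow, in which case we fall back to the cap.
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}

/// Runs `op` until it succeeds or `max_attempts` is reached.
/// At least one attempt is always made. `sleep` is injected so the
/// sketch stays independent of any particular timer implementation.
pub fn retry<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut op: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                sleep(backoff_delay(attempt, base, max));
                attempt += 1;
            }
        }
    }
}
```

Capping the delay keeps a device from waiting indefinitely long after an extended outage, while the doubling avoids hammering the broker when it is down.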

Each firmware binary is compiled with only the required configuration and sensor drivers using Cargo features. This way, each device can enable exactly the sensors it needs, whether it’s BME280, SCD30, SDS011, or any combination while keeping the binary minimal and tailored.
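As a rough illustration of feature-gated drivers, the sketch below uses `cfg!` and `#[cfg]` so that code for a sensor only ends up in the binary when its Cargo feature is enabled. Only the `bme280` feature name appears in the flash command; `scd30` and `sds011` as feature names, and every identifier in the snippet, are assumptions and not taken from the project.

```rust
/// One entry per optional sensor driver: (Cargo feature name, compiled in?).
/// `cfg!` is evaluated at compile time, so this table costs nothing at runtime.
pub static SENSOR_FEATURES: [(&str, bool); 3] = [
    ("bme280", cfg!(feature = "bme280")),
    ("scd30", cfg!(feature = "scd30")),   // assumed feature name
    ("sds011", cfg!(feature = "sds011")), // assumed feature name
];

/// Names of the sensor drivers that were compiled into this build.
pub fn enabled_sensors() -> impl Iterator<Item = &'static str> {
    SENSOR_FEATURES
        .iter()
        .filter(|(_, enabled)| *enabled)
        .map(|(name, _)| *name)
}

/// This function only exists when the `bme280` feature is enabled;
/// without the feature it is absent from the binary entirely.
#[cfg(feature = "bme280")]
pub fn read_bme280() {
    // A real driver call would go here.
}
```

Building with `--no-default-features --features bme280` would then make `enabled_sensors()` yield only `"bme280"`.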

I use the following command to enable features and flash the chip:

cargo espflash flash \
    --release \
    --features influx,tls,mtls,ota,bme280 \
    --no-default-features \
    -T ./partitions.csv \
    --erase-parts otadata

Certificates

Since we are using TLS, I decided to create my own self-signed CA using cert-manager.

In a perfect world, we would have one certificate per device, but for the sake of simplicity I’m using a single certificate shared by all devices.

The self-signed CA is used to sign certificates for all devices but also for NanoMQ (which has the usage set to server auth instead of client auth).

Certificates are defined in YAML and follow GitOps practices:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: esp32
spec:
  isCA: false
  commonName: esp32
  secretName: esp32
  issuerRef:
    name: selfsigned-ca-issuer
    kind: ClusterIssuer
    group: cert-manager.io
  duration: 87600h # 10 years
  renewBefore: 360h # 15d
  privateKey:
    algorithm: RSA
    encoding: PKCS1
    size: 2048
  usages:
    - digital signature
    - key encipherment
    - client auth

Then, I use the following commands to retrieve the certificates and include them in the device configuration file:

# CA certificate (cert-manager stores it under the ca.crt key of the secret)
kubectl get secrets -n devices esp32 -o jsonpath='{.data.ca\.crt}' | base64 -d > tls.ca
# TLS certificate
kubectl get secrets -n devices esp32 -o jsonpath='{.data.tls\.crt}' | base64 -d > tls.crt
# TLS private key
kubectl get secrets -n devices esp32 -o jsonpath='{.data.tls\.key}' | base64 -d > tls.key

The device configuration file is written in TOML:

wifi_ssid = "my-wifi"
wifi_psk = "wifi-password"

device_id = "esp32-living-room"

mqtt_hostname = "nanomq.example.com"
mqtt_port = 8883

# ...

tls_ca = """
-----BEGIN CERTIFICATE-----
// your CA certificate here
-----END CERTIFICATE-----
"""
tls_key = """
-----BEGIN RSA PRIVATE KEY-----
// your private key here
-----END RSA PRIVATE KEY-----
"""
tls_cert = """
-----BEGIN CERTIFICATE-----
// your certificate here
-----END CERTIFICATE-----
"""

Over-the-air (OTA) firmware update

OCI-compliant registry for firmware version control

To support OTA updates, I wanted to store firmware binaries in an OCI registry.

OCI registries are typically used for storing container images, but they can also hold other types of artifacts. That’s where ORAS comes in. ORAS (OCI Registry As Storage) is a toolset that makes it easy to push and pull non-container artifacts such as firmware binaries, SBOMs, or signatures using similar workflows as container images.

Using an OCI registry for firmware makes a lot of sense. OCI artifacts are content-addressable, so a binary referenced by its digest is immutable: once it is pushed, you know it won’t change. Versioning is built in: you can tag each firmware release and let devices fetch specific versions or check for updates. Security is improved through artifact signing and support for SBOMs, which help with verifying and tracing what’s deployed. With built-in access control, you can restrict who can push or pull firmware. And because registries are designed for global distribution, they can reliably serve updates to a fleet of devices with minimal setup.

OTA server: OtaFlux

The esp-hal-ota repo includes a basic OTA example. I used it as inspiration to build OtaFlux, a custom OTA server that fetches firmware from an OCI-compliant registry.

Its key features are:

  • Pulls device-specific firmware artifacts from a Harbor registry
  • Caches latest firmware in-memory, keyed by device ID
  • Supports gzip, zstd, and tar archives
  • Calculates the CRC of the firmware binary to ensure data integrity
  • Prometheus metrics: cache hit/miss, request latency

OtaFlux exposes the following endpoints:

  • /version?device=ID returns:
    0.1.3         # version
    0xdeadbeef    # CRC
    147456        # size
    
  • /firmware?device=ID returns the raw binary

These responses are parsed directly by the device firmware.
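To make the `/version` response format concrete, here is a minimal parsing sketch. It assumes the body consists of exactly three newline-separated fields (version, CRC in hex with a `0x` prefix, size in bytes) and that the `#` annotations in the listing above are explanatory rather than part of the payload. `VersionInfo` and `parse_version_response` are hypothetical names, not the API of esp-hal-ota or OtaFlux.

```rust
/// Parsed `/version` response. Borrows the version string from the
/// response body, so no allocation is needed.
#[derive(Debug, PartialEq)]
pub struct VersionInfo<'a> {
    pub version: &'a str,
    pub crc: u32,
    pub size: u32,
}

/// Parses a three-line body: version, CRC (hex), size (decimal bytes).
/// Returns `None` if a field is missing or malformed.
pub fn parse_version_response(body: &str) -> Option<VersionInfo<'_>> {
    let mut lines = body.lines().map(str::trim).filter(|l| !l.is_empty());

    let version = lines.next()?;

    // Accept the CRC with or without a leading "0x".
    let crc_text = lines.next()?;
    let crc_hex = crc_text.strip_prefix("0x").unwrap_or(crc_text);
    let crc = u32::from_str_radix(crc_hex, 16).ok()?;

    let size = lines.next()?.parse().ok()?;

    Some(VersionInfo { version, crc, size })
}
```

Because the function only uses string slices and integer parsing, the same logic also works in a `no_std` build.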

OTA client firmware update flow

Every hour, each device:

  • Calls /version?device=ID to check for updates
  • Downloads /firmware?device=ID if an update exists
  • Verifies data integrity using CRC (Cyclic Redundancy Check) and size, writes to flash, reboots if valid

All of this is managed by esp-hal-ota, without dynamic memory allocation.
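The integrity check in the last step can be sketched as follows. The post does not say which CRC variant is used, so this assumes the common CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320); the function names are hypothetical, and in the real firmware esp-hal-ota performs this work.

```rust
/// Bitwise CRC-32 (IEEE 802.3). Slower than a table-driven version,
/// but needs no lookup table and no heap.
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All-ones mask if the lowest bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Accepts a downloaded image only if both size and CRC match the
/// values announced by the `/version` endpoint.
pub fn image_is_valid(image: &[u8], expected_crc: u32, expected_size: u32) -> bool {
    image.len() == expected_size as usize && crc32(image) == expected_crc
}
```

Checking the size first is cheap and rejects truncated downloads before any CRC work is done. Note that a CRC detects accidental corruption only; it does not authenticate the firmware.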

Developer experience

When a new firmware version needs to be released, a CI/CD workflow compiles the firmware binary for each device, saves it as an image with espflash, and uploads it to the OCI registry using ORAS.

The compiled binary contains the logic for the enabled sensors and the device-specific configuration (device ID, Wi-Fi SSID, certificates, etc.).

Below are the commands I used:

# compile with device supported features
cargo build --release \
    --features influx,bme280,tls,mtls,ota \
    --no-default-features

# save as binary image
espflash save-image --chip esp32 \
    ./target/xtensa-esp32-none-elf/release/esp32_home_sensor \
    ./firmware.bin

# push to OCI registry
oras push "my-registry.example.com:443/my-repository/esp32-living-room:0.1.2" \
    --config /dev/null:application/vnd.oci.image.config.v1+json \
    firmware.bin:application/vnd.espressif.esp32.firmware.v1+binary

Dashboarding

After setting up the infrastructure and deploying multiple sensors throughout my home, I created a Grafana dashboard to visualize the collected data:

./grafana-weather-dashboard.png

The dashboard tracks several key metrics:

  • Temperature, humidity and pressure trends across different rooms, with min/max alerts for unusual patterns
  • Air quality measurements, including dew point and particulate matter levels
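The post does not say where the dew point is computed (on the device, in the pipeline, or in Grafana). As an illustration only, it can be derived from temperature and relative humidity with the Magnus approximation; the coefficients below (17.62 and 243.12 °C) are one commonly used set, and the function name is hypothetical.

```rust
/// Dew point in °C from air temperature (°C) and relative humidity (%),
/// using the Magnus approximation. Uses `f64::ln`, which needs `std`;
/// a `no_std` build would need a math library such as `libm` instead.
pub fn dew_point_celsius(temp_c: f64, rel_humidity_pct: f64) -> f64 {
    const B: f64 = 17.62;
    const C: f64 = 243.12; // °C

    let gamma = (rel_humidity_pct / 100.0).ln() + B * temp_c / (C + temp_c);
    C * gamma / (B - gamma)
}
```

At 100% relative humidity the formula returns the air temperature itself, which is a handy sanity check.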

Lessons Learned

  • TLS on ESP32 is tight on memory. A full TLS 1.2 handshake with ECC can use around 25–40 KB of RAM, while RSA may push that up to 70 KB. Static buffers and solid error handling help avoid fragmentation and out-of-memory issues.
  • Coordinating Embassy’s task arena and heap allocations is tricky. Finding the right task-arena-size to leave enough heap for TLS and other tasks took some tuning.
  • OTA updates must be safe by design. Use immutable, semver-tagged firmware binaries and always verify CRC before rebooting to apply updates.
  • Avoid dynamic allocations when possible. They make memory use unpredictable and increase the risk of fragmentation on long-running systems.
  • The Embassy async runtime fits well on constrained hardware. Its static memory model makes behavior more predictable and reliable under load.
  • Rust is an excellent choice for embedded systems. Compared to higher-level alternatives, it provides memory safety without a garbage collector, zero-cost abstractions, and full control over low-level behavior, ideal for environments with tight resource constraints.

Next Steps

  • Add support for signed firmware validation
  • Report OTA update status via MQTT
  • Explore delta updates for large binaries
  • Consider event-driven OTA triggers or add jitter to the firmware version check to prevent overloading the server (who knows, maybe one day my OTA server will serve thousands of devices 😉)
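One possible way to add jitter to the version check, sketched under assumptions: derive a stable per-device offset from the device ID so that devices spread their checks across a window without needing a random number generator. The FNV-1a hash, the function names, and the 300-second window in the usage note are illustrative choices, not part of the project.

```rust
/// 32-bit FNV-1a hash: small, allocation-free, and good enough to
/// spread device IDs across a time window (not cryptographic).
pub fn fnv1a(data: &[u8]) -> u32 {
    let mut hash = 0x811c_9dc5u32;
    for &byte in data {
        hash ^= byte as u32;
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

/// Check interval for a device: the base interval plus a stable offset
/// in `0..max_jitter_secs` derived from the device ID.
pub fn check_interval_secs(device_id: &str, base_secs: u32, max_jitter_secs: u32) -> u32 {
    if max_jitter_secs == 0 {
        return base_secs;
    }
    let offset = fnv1a(device_id.as_bytes()) % max_jitter_secs;
    base_secs.saturating_add(offset)
}
```

For example, `check_interval_secs("esp32-living-room", 3600, 300)` yields an interval between 3600 and 3899 seconds that stays the same across reboots. A deterministic offset keeps behaviour reproducible; a hardware RNG would work just as well if per-cycle randomness is preferred.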

Final Thoughts

After completing this project, I’m convinced that Rust represents a significant step forward for embedded systems development. While the learning curve is steeper than using Arduino or ESP-IDF, the benefits become clear once you’re past the initial hurdles:

  • Reliability at scale: For devices that need to run unattended for months or years, Rust’s compile-time guarantees eliminate entire classes of runtime errors.
  • Developer confidence: The ability to refactor code with confidence that memory issues will be caught at compile time dramatically speeds up the development cycle after the initial implementation.
  • Future-proofing: As IoT devices become more security-critical, Rust’s inherent security properties will likely make it increasingly important in the embedded space.
  • Ecosystem growth: The Rust embedded ecosystem, while still maturing, has reached a point where complex projects like this are entirely feasible without falling back to C.

OTA firmware updates were essential to make this project practical, and combining them with an OCI registry creates a surprisingly powerful and flexible deployment pipeline that feels more like modern cloud development than traditional embedded systems work.


If you are into embedded systems, self-hosted infrastructure, or ESP32 development using Rust, you can check the full working implementation from the following repositories:


What would you do differently or automate further? I’m keen on ideas, especially around real-world device scaling.