Why edge AI inferencing is a deployment and operations problem, not just a model problem
Computer vision models running at the edge (inspecting products on a production line, monitoring retail checkouts for loss prevention, analyzing crop health across thousands of hectares of farmland, detecting defects in welded components at sub-millimeter precision) share a set of constraints that make cloud-hosted inference architecturally inappropriate. Latency is the primary driver: a vision model that must classify an object on a conveyor belt has a decision window measured in milliseconds. Sending a frame to a cloud endpoint, waiting for inference, and receiving a classification back will never be fast enough. The model must run where the camera is.
Connectivity is the second constraint. Agricultural edge sites, remote manufacturing facilities, logistics depots, and retail stores do not have guaranteed high-bandwidth internet connections. The inference model must run locally and operate continuously whether or not there is a network path to the outside world. Cloud-dependent inference is not viable in environments where connectivity is intermittent or bandwidth-constrained.
The third constraint is scale. A single retailer might have 5,000 stores, each running several camera-mounted inference workloads. An agricultural technology company might have vision systems operating across hundreds of farms. Managing those workloads (deploying model updates, monitoring inference health, rolling out configuration changes, replacing failed nodes) cannot be done by logging into each device individually. It requires central management of a distributed fleet.
Portainer addresses this through two complementary components. KubeSolo is Portainer's lightweight single-node Kubernetes distribution, designed specifically for GPU- and NPU-enabled edge devices: NVIDIA Jetson Orin, NVIDIA IGX, x86 systems with discrete GPUs like the RTX Pro 6000, Intel NUCs with Arc or Xe graphics, and Raspberry Pi 5 nodes with the AI HAT+. It installs with a single command, runs with a minimal footprint, and exposes GPU and NPU resources to containerized workloads without requiring Kubernetes operational expertise at the edge site. KubeSolo is the first choice for any edge AI vision deployment. For edge sites that do not require dedicated AI accelerator hardware, Docker and Podman are fully supported as lighter-weight alternatives managed through the same Portainer fleet interface. The Portainer agent, running inside KubeSolo (or alongside Docker/Podman) on each device, connects back to the central Portainer server and makes every edge device manageable as part of a unified fleet, regardless of how many sites there are.
Real-world edge AI scenarios where this architecture applies
These are industry examples illustrating where this architecture is operationally relevant. They are not Portainer customer references. In each case, the deployment pattern (GPU-enabled inference at distributed sites, centrally managed, updated without site visits) is exactly what Portainer and KubeSolo are built for.
Vision AI at self-checkout across thousands of stores
Retail loss prevention platforms deploy vision AI models at checkout points to detect loss patterns in real time: mis-scanned items, product substitution, checkout avoidance. The system runs at each store, processes camera feeds locally at inference speed, and surfaces alerts to staff without routing video data to the cloud. Everseen's Evercheck platform is a leading example of this deployment pattern, operating at thousands of stores across major global retailers.
Computer vision for targeted herbicide application in the field
Precision agriculture platforms use high-speed cameras and AI models mounted on agricultural sprayers to identify and target individual weeds while leaving crops untreated. John Deere's See & Spray is a well-documented example of this approach at scale. The model runs on-machine at vehicle speed with no connectivity dependency. Frame classification must complete in under 50ms to actuate spray nozzles correctly.
Vision and sensor-fusion AI for precision crop management
Greenhouse operators and large-scale irrigation networks deploy computer vision models alongside sensor networks to detect plant stress, disease markers, and water distribution anomalies early, before they become crop losses. Sensor and irrigation platforms in this space (Monnit and DripWorks are examples) integrate vision-based analysis with sensor data to drive automated intervention decisions at the field level.
No-code vision AI for quality inspection, PPE compliance, and warehouse operations
Vision AI platforms for manufacturing deploy containerized inference workloads on production-line hardware to inspect components at speeds and resolutions that manual inspection cannot achieve: PCB trace defects, weld quality, surface classification, PPE compliance, and warehouse activity monitoring. Tapway's no-code computer vision platform covers this range of use cases across manufacturing, warehouse, and food and beverage verticals, packaging inference models alongside camera feed ingestion and alerting into a deployable application stack. The no-code architecture means the ISV can ship a single containerized stack that runs across diverse customer sites and hardware without custom integration work per deployment.
Real-time incident detection and traffic flow management at highway and urban scale
AI-powered traffic management platforms run deep learning inference directly on cameras or roadside edge compute to detect incidents, classify vehicles, measure flow, and trigger signal adjustments in real time, with sub-second decision windows that make cloud-routed inference architecturally unworkable. Citilog's automatic incident detection platform is a documented example of this pattern, processing camera feeds at the edge across highway tunnels, bridges, and urban intersections in multiple countries, with inference running on-device or on local server hardware proximate to the camera infrastructure.
Early smoke detection across remote forest and wildland terrain using networked AI camera stations
Wildfire detection platforms deploy networks of rotating HD cameras at tower sites across high-risk terrain, running computer vision inference on each station to detect smoke signatures before a 911 call is placed. The hardware is remote, often solar-powered, and operating on intermittent or bandwidth-constrained connectivity, making cloud-dependent inference impractical. Pano AI is a leading example of this deployment pattern, with camera stations operating across 17 US states, Australia, and Canada for utilities, state fire agencies, and private forest operators. Their systems are documented to alert fire agencies an average of 45 minutes faster than the first 911 call.
How Portainer and KubeSolo manage edge AI inferencing
The architecture for far-edge AI inferencing with Portainer follows a hub-and-spoke model. A central Portainer server instance (which may itself run in the enterprise's data center or on a private cloud) manages a fleet of edge devices, each running KubeSolo and a Portainer agent. The relationship between the central server and each edge agent is outbound-only from the agent's perspective, meaning no inbound firewall rules are required at edge sites.
Each edge site runs KubeSolo: a single-node Kubernetes distribution that installs with a single shell command and requires no Kubernetes expertise to operate at the site level. KubeSolo integrates the NVIDIA GPU Operator, meaning GPU resources are automatically available to containerized workloads (an inference server such as NVIDIA Triton or vLLM, or a custom container) without manual driver configuration. On NVIDIA Jetson hardware this extends to the unified memory architecture, where CPU and GPU share physical memory, which matters on edge devices that have no dedicated GPU VRAM.
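To make the GPU scheduling concrete, here is a minimal sketch of what an inference Deployment looks like on a KubeSolo node once the GPU Operator's device plugin is exposing the `nvidia.com/gpu` resource. The image tag, model path, and names are illustrative, not Portainer defaults.

```yaml
# Minimal sketch: a GPU-backed inference Deployment on a single-node cluster.
# Image tag, model path, and names are illustrative, not Portainer defaults.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vision-inference
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vision-inference
  template:
    metadata:
      labels:
        app: vision-inference
    spec:
      containers:
        - name: triton
          image: nvcr.io/nvidia/tritonserver:24.08-py3   # example tag
          args: ["tritonserver", "--model-repository=/models"]
          resources:
            limits:
              nvidia.com/gpu: 1   # satisfied by the GPU Operator's device plugin
          volumeMounts:
            - name: models
              mountPath: /models
      volumes:
        - name: models
          hostPath:
            path: /opt/models   # model weights pre-staged on the device
```

Nothing in this manifest is edge-specific: the same workload definition runs on any Kubernetes cluster with GPU scheduling, which is what lets one stack definition serve an entire heterogeneous fleet.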
GPU and NPU-enabled edge hardware for vision inferencing
Edge inference hardware spans from purpose-built smart cameras with onboard compute (where inference happens at the lens, with no separate edge server required), through embedded GPU modules for multi-stream workloads, up to site-level servers handling many camera feeds simultaneously. KubeSolo and Portainer manage all of these from the same fleet interface.
| Hardware | Compute / memory | Form factor | Vision inferencing fit |
|---|---|---|---|
| Axis ARTPEC-9 cameras (e.g. P3265-V, Q6135-LE) | Dedicated NPU, 2–4 TOPS | IP camera with onboard Linux, containerized app runtime (ACAP) | Single-stream inference at the camera itself: object detection, classification, anomaly detection. No edge server required. Portainer manages application deployment across camera fleets via Docker-compatible interface. |
| Bosch INTEOX cameras (MIC inteox, FLEXIDOME inteox) | Intel Movidius VPU | IP camera running open Linux, Docker container support natively | On-camera AI inference for retail analytics, perimeter security, and traffic monitoring. Bosch's open platform runs containerized third-party AI applications directly on the camera. Portainer manages deployment and updates across sites. |
| Hanwha Vision AI cameras (QNV-8080R AI, XNV series) | Onboard NPU | IP camera with embedded AI inference engine | Edge-native people counting, vehicle detection, and behavior analytics. Inference runs on the camera with no cloud dependency. Suitable for retail, transport, and smart city deployments at scale. |
| Raspberry Pi 5 + AI HAT+ (Hailo-8L) | 13 TOPS (Hailo-8L NPU) + Pi 5 CPU | SBC, fanless, low power (<10W total), standard Pi form factor | Cost-effective single-stream inference for object detection, classification, and pose estimation. Strong fit for large fleets where per-unit cost is the primary constraint: retail shelf monitoring, agricultural sensors, smart building deployments. Runs containerized inference workloads via Docker; Portainer manages the fleet identically to any other Docker environment. |
| Intel NUC (NUC 14 Pro / NUC 13 Arena Canyon with Intel Arc or Xe GPU) | Intel Arc or Xe integrated GPU, Intel NPU on Core Ultra variants | Mini PC, fanless or low-profile, x86, standard power | Versatile x86 edge node for multi-stream inference, local model serving, and aggregation workloads. Intel OpenVINO runtime optimizes vision models for Xe/Arc hardware. Suitable for smart retail, office automation, and any deployment where x86 compatibility and standard Linux tooling are preferred over ARM embedded. |
| NVIDIA Jetson Orin NX (16GB) | 16 GB unified GPU/CPU memory | Module / embedded, fanless options | Multi-camera aggregation node or standalone inference for higher-complexity models. Single stream at high resolution or light multi-stream at moderate resolution. |
| NVIDIA Jetson AGX Orin (64GB) | 64 GB unified GPU/CPU memory | Developer kit + industrial carrier boards | High-throughput multi-stream inspection, multiple simultaneous vision models, or aggregating feeds from several onboard-compute cameras in a zone. |
| NVIDIA IGX Orin | 64 GB unified, safety-certifiable | Industrial ruggedized enclosure | Medical device inference, robotics vision, and certified industrial applications where functional safety certification is required. |
| NVIDIA RTX Pro 6000 Blackwell | 96 GB GDDR7 ECC | Workstation PCIe card in ruggedized chassis | Site-level multi-model server aggregating feeds from many cameras. Development, validation, and high-throughput production workloads. |
| Advantech / Kontron industrial PCs with embedded GPU | Varies (NVIDIA embedded or discrete) | DIN-rail, rugged enclosure, extended temp range | Factory floor and outdoor deployments where vibration, temperature, and ingress protection ratings are required alongside GPU inference capability. |
How a far-edge AI vision stack is deployed using Portainer and KubeSolo
Install KubeSolo on the edge device
KubeSolo installs with a single shell command: the published installer script is piped straight to `sudo sh`. No Kubernetes configuration expertise is required at the site. The installer configures the single-node Kubernetes cluster, integrates the NVIDIA GPU Operator (making GPU resources immediately available to workloads), and generates a kubeconfig at the standard path. The device is ready for containerized inference workloads within minutes of hardware power-on.
Connect the edge device to the central Portainer server
The Portainer agent is deployed inside KubeSolo and configured with the central Portainer server address. The agent establishes an outbound connection: no inbound ports are required at the edge site. From the central Portainer instance, the edge device immediately appears in the fleet view with its GPU resources, node status, and available workload namespaces visible. This works whether the edge device has a reliable internet connection or an intermittent one; the agent buffers state and reconnects automatically.
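For orientation, an abbreviated sketch of the kind of Edge Agent manifest involved. In practice Portainer generates the complete manifest (including ServiceAccount and RBAC objects) when the environment is added; the identifiers below are placeholders.

```yaml
# Abbreviated sketch of a Portainer Edge Agent deployment. Portainer generates
# the complete manifest when the environment is added; values are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: portainer-agent
  namespace: portainer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: portainer-agent
  template:
    metadata:
      labels:
        app: portainer-agent
    spec:
      containers:
        - name: portainer-agent
          image: portainer/agent:latest   # pin a specific version in production
          env:
            - name: EDGE
              value: "1"                          # Edge mode: outbound-only connection
            - name: EDGE_ID
              value: "<edge-id-from-portainer>"   # placeholder
            - name: EDGE_KEY
              value: "<edge-key-from-portainer>"  # placeholder; encodes the server address
```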
Deploy the vision inference stack via GitOps
The ISV's application stack for the vision workload (an inference server such as NVIDIA Triton, a camera feed ingestion container, a results router, and a telemetry container) is defined as Kubernetes manifests or a Helm chart in a Git repository. Portainer's GitOps engine targets the edge device and deploys the stack automatically when the repository is updated. Model weights (TensorRT or ONNX format) are pre-staged in the device's local storage or pulled from the internal container registry (such as Harbor) on first deployment. Portainer shows the deployment status, container health, and GPU utilization for the deployed stack.
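One hedged way to lay out such a repository, assuming Kustomize is used (plain manifests or a Helm chart work just as well); all file names and the registry path are illustrative.

```yaml
# kustomization.yaml at the repository root: an illustrative layout.
# Portainer's GitOps engine redeploys the stack when this repository changes.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - inference-server.yaml   # e.g. Triton Deployment and Service
  - camera-ingest.yaml      # camera feed ingestion container
  - results-router.yaml     # routes classifications to local alerting/actuation
  - telemetry.yaml          # ships health metrics when a WAN path exists
images:
  - name: registry.internal/acme/vision-infer   # hypothetical internal registry
    newTag: v1.4.2                               # bumping this tag rolls the fleet forward
```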
Configure RBAC for site-level operations access
Field operations teams at each site (or regional teams managing groups of sites) are given scoped access in Portainer. A retail operations team managing 200 stores in a region can view and restart workloads on their stores without access to other regions or to the central management configuration. An ISV support team can access inference server logs for debugging without access to the customer's infrastructure. RBAC scopes are defined once and applied consistently across all edge devices.
Model updates: fleet-scale rolling deploy
When an updated vision model is ready (new TensorRT weights compiled for the target hardware, tested in a staging environment), the Git repository is updated with the new image tag or model path. Portainer's GitOps engine detects the change and triggers a rolling update across all targeted edge devices, sequentially or in configurable batches. Sites that are temporarily disconnected receive the update when they reconnect. Rollback to the previous version is a single Git revert.
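Continuing the hypothetical Kustomize layout sketched above, the entire fleet-wide model rollout reduces to a one-line Git change:

```yaml
# The fleet-wide model rollout, expressed as the Git change
# (continuing the illustrative Kustomize layout from earlier):
images:
  - name: registry.internal/acme/vision-infer
    newTag: v1.5.0   # was v1.4.2; committing this change triggers the rolling deploy
```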
What Portainer and KubeSolo solve that alternative approaches do not
Edge AI inferencing on GPU and NPU hardware is a new deployment category. The organizations building these fleets are not migrating from existing tooling: the microcontroller- and gateway-class hardware that Greengrass and traditional MDM platforms were built for is a different world. They are making a tooling choice for the first time, and the temptation is to build something bespoke: a custom provisioning pipeline, a thin orchestration layer on top of Docker, a home-grown update mechanism written to fit the specific hardware the team is working with today. That works for a pilot. It starts breaking at fifty sites and is genuinely unmanageable at five hundred. The person who built it becomes the single point of failure for the entire fleet. Every new hardware type, every model update cadence change, every ISV application that ships as a Helm chart rather than a tarball, is a new engineering project. Portainer and KubeSolo exist precisely for this moment: before the bespoke tooling debt accumulates, when the right architecture decision costs nothing extra and the wrong one costs years.
KubeSolo turns any GPU edge device into a Kubernetes node in minutes
The single-command install removes the Kubernetes expertise barrier from edge deployment entirely. The person provisioning the edge device does not need to understand Kubernetes internals. KubeSolo handles GPU Operator integration, networking, and kubeconfig generation automatically. The result is a production-ready container runtime with GPU scheduling capability, at a site that may not have a Kubernetes engineer anywhere near it.
Central management of thousands of edge sites from a single Portainer instance
Portainer's fleet management was designed for exactly this pattern: large numbers of heterogeneous environments managed from one place. A retail operator with 5,000 stores, an agricultural technology company with hundreds of field deployments, a logistics network with dozens of depots: all of those devices appear in Portainer's fleet view, each with its health status, GPU utilization, running workloads, and deployment history visible without any per-site configuration overhead.
Model updates without site visits or per-device SSH sessions
Updating a vision model across a fleet of 5,000 edge devices via SSH is not operationally viable. Portainer's GitOps-driven fleet update (commit a new image tag, watch the rolling deploy propagate) turns a model update into a Git operation. The operations team does not touch individual devices. Devices that are offline when the update triggers receive it on reconnection. The entire update cycle is auditable through Portainer's deployment history.
Offline resilience by default
Edge devices that lose connectivity to the central Portainer server continue running their inference workloads exactly as deployed. The Portainer agent reconnects automatically when connectivity is restored and syncs the latest state. This is not a special configuration: it is how the agent is designed to operate. For agricultural and remote industrial deployments where connectivity is intermittent by definition, this is a foundational requirement, not a nice-to-have.
ISV application delivery to customer-managed edge fleets
For ISVs building edge AI applications (vision inspection platforms, agricultural AI systems, retail AI), Portainer provides the deployment and management channel to customer-owned fleets without requiring the ISV to build their own device management infrastructure. The ISV defines the application as a Helm chart or Kubernetes manifest. The customer's Portainer instance manages the fleet. The ISV retains update and configuration control through the GitOps channel. Neither party needs to build custom tooling for the other.
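As a sketch of what that hand-off can look like, here is a hypothetical values file for an ISV's Helm chart: site-specific settings live in the customer's GitOps repository while the chart itself stays generic. Every key, name, and URL below is illustrative.

```yaml
# values.yaml for a hypothetical ISV vision chart; every key and URL is illustrative.
inference:
  image:
    repository: registry.isv.example/vision-inspect
    tag: "2.3.1"   # the ISV bumps this to ship a model or application update
  gpu: true        # chart requests nvidia.com/gpu: 1 on GPU-equipped sites
cameras:
  - name: line-1
    rtspUrl: rtsp://10.0.40.11/stream1   # per-site camera endpoint
alerting:
  webhook: http://factory-mes.local/alerts   # local target; no cloud dependency
```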
Deploy vision AI to your edge fleet
KubeSolo installs on a GPU edge device in under five minutes. The Portainer agent connects it to central fleet management automatically. If you are building an edge AI fleet and deciding on tooling now, this is the conversation to have before the bespoke build starts.
Frequently asked questions
Direct answers to questions about managing edge AI inferencing fleets with Portainer and KubeSolo.
What is KubeSolo and why is it recommended for edge AI deployments?
KubeSolo is Portainer's single-node Kubernetes distribution designed for resource-constrained edge devices. It installs with a single shell command, requires no Kubernetes expertise to operate at the site level, integrates the NVIDIA GPU Operator automatically, and connects to a central Portainer management instance via an outbound agent. It is purpose-built for the constraint profile of GPU-enabled edge compute: minimal overhead, GPU-native, and operationally autonomous when the WAN link drops.
Does Portainer support NVIDIA Jetson and IGX hardware for edge inference?
Yes. KubeSolo runs on NVIDIA Jetson AGX Orin, Jetson Orin NX, Jetson Orin Nano, and IGX hardware. The NVIDIA GPU Operator exposes the unified CPU-GPU memory architecture on Jetson devices to containerized workloads automatically. NVIDIA Triton Inference Server, vLLM, and custom inference containers all deploy and run identically to any other KubeSolo workload.
How does Portainer manage model updates across a large fleet of edge sites?
Portainer's GitOps engine monitors a Git repository for changes. When a new model version is committed (updated image tag or model weights path), Portainer triggers a rolling update across all targeted edge devices, sequentially or in configurable batches. Sites that are offline when the update is published receive it automatically on reconnection. Rollback is a single Git revert.
What happens to edge AI workloads when the site loses internet connectivity?
Nothing. The Portainer edge agent buffers state locally and the KubeSolo workloads continue running regardless of WAN connectivity. The inference server keeps processing camera feeds, the vision model keeps running, and results keep flowing to local systems. Update instructions queued at the central Portainer instance are delivered when connectivity resumes.
Can Portainer manage a mixed fleet of edge sites with different hardware?
Yes. A single Portainer instance manages KubeSolo nodes, Docker environments, and Podman environments simultaneously. Edge groups allow fleet targeting by hardware type, site, customer, or any combination. Different application versions can be deployed to different hardware groups from the same central catalogue.
How does RBAC work for multi-site edge AI deployments with multiple operators?
Portainer's RBAC scopes access to edge groups. A regional operations team managing 200 sites sees only their sites. An ISV support team can access inference server logs for debugging without access to customer infrastructure configuration. Policies are defined once centrally and applied consistently across all edge devices.
