
Commit 36c8af7

welteki authored and alexellis committed
Add instructions for deploying a NATS cluster for OpenFaaS
These instructions used to be hosted in the customers repo.

Signed-off-by: Han Verstraete (OpenFaaS Ltd) <han@openfaas.com>
1 parent 5b9e86c commit 36c8af7

1 file changed

Lines changed: 118 additions & 16 deletions


docs/openfaas-pro/jetstream.md

@@ -43,30 +43,16 @@ On the blog we show reference examples built upon these architectural patterns:
4343

4444
**Embedded NATS server**
4545

46-
For staging and development environments OpenFaaS can be deployed with an embedded version of the NATS server which uses an in-memory store. This is the default when you install OpenFaaS using the [OpenFaaS Helm chart](https://github.com/openfaas/faas-netes/blob/master/chart/openfaas/README.md).
46+
For staging and development environments OpenFaaS can be deployed with an embedded version of the NATS server without persistence. This is the default when you install OpenFaaS using the [OpenFaaS Helm chart](https://github.com/openfaas/faas-netes/blob/master/chart/openfaas/README.md).
4747

4848
If the NATS Pod restarts, you will lose all messages that it contains. This could happen if you update the chart and the version of the NATS server has changed, or if a node is removed from the cluster.
4949

5050
**External NATS server**
5151

5252
For production environments you should install NATS separately using its Helm chart.
53-
5453
NATS can be configured with a quorum of at least 3 replicas so it can recover data if one of the replicas should crash. You can also enable a persistent volume in the NATS chart for additional durability.
5554

56-
If you are running with 3 replicas of the NATS server, then update the OpenFaaS chart to reflect that in the `nats.streamReplication` parameter. With this in place, the stream for queued messages will be replicated across the 3 NATS servers.
57-
58-
```yaml
59-
nats:
60-
  streamReplication: 3
61-
  external:
62-
    enabled: true
63-
    host: "nats.nats"
64-
    port: "4222"
65-
```
66-
67-
By default the NATS helm chart will be installed into the nats namespace with the name of `nats`, but you can customise this if you wish by setting the `nats.external.host` parameter.
68-
69-
Instructions for a recommended NATS production deployment are available for customers through the [customer community repo](https://github.com/openfaas/customers/blob/master/jetstream.md)
55+
See [Deploy NATS for OpenFaaS](#deploy-nats-for-openfaas) for instructions on how to configure OpenFaaS and deploy NATS.
7056

7157
## Features
7258

@@ -172,6 +158,122 @@ The stream is automatically created by the queue-worker if it does not exist. Si
172158
kubectl rollout restart -n openfaas deploy/queue-worker
173159
```
174160

161+
## Deploy NATS for OpenFaaS
162+
163+
[NATS JetStream](https://docs.nats.io/nats-concepts/jetstream) is a highly available, scale-out messaging system. It is used in OpenFaaS Pro for queueing asynchronous invocations.
164+
165+
The OpenFaaS Helm chart installs NATS by default, but the included configuration is designed only for development or non-critical internal use.
166+
167+
By default, NATS runs with a single replica and no persistent storage. This means:
168+
169+
- If the NATS Pod goes down, all asynchronous OpenFaaS requests will fail until it comes back online.
170+
- Pending messages are not stored persistently, so they are lost if the Pod or node crashes.
171+
- All queued messages are lost during redeployments or when the NATS service restarts.
172+
173+
The default setup is often sufficient for development or staging, but it is not suitable for production. For reliable operation, production systems need additional configuration to provide high availability and data durability.
174+
175+
The next section describes our recommended configuration for a production-grade installation of OpenFaaS with an external NATS deployment.
176+
177+
### Configure a NATS Cluster
178+
179+
Deploy NATS in your cluster using the official [NATS Helm chart](https://docs.nats.io/running-a-nats-service/nats-kubernetes).
180+
181+
A minimum of three NATS servers is required for production. This ensures that message processing can continue even if one Pod or node fails.
182+
183+
The Helm chart deploys NATS as a StatefulSet. To avoid a single point of failure it is recommended to schedule each Pod on a different Kubernetes node.
184+
185+
At the networking level, NATS clustering uses Gossip to announce basic cluster membership, and Raft to store and track messages. If you experience issues, check that your network policies and security groups allow these protocols.
186+
187+
Add the following to the values.yaml configuration for your NATS deployment:
188+
189+
```yaml
190+
config:
191+
  cluster:
192+
    enabled: true
193+
    replicas: 3
194+
  jetstream:
195+
    enabled: true
196+
```
197+
198+
It is recommended to add a topology spread constraint to ensure Pods are distributed across different nodes:
199+
200+
```yaml
201+
podTemplate:
202+
  topologySpreadConstraints:
203+
    kubernetes.io/hostname:
204+
      maxSkew: 1
205+
      whenUnsatisfiable: DoNotSchedule
206+
```
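
To confirm that the constraint had the intended effect, you can list the node each NATS Pod landed on and look for duplicates. The helper below is a minimal sketch, not part of the NATS chart: the function name `nats_shared_nodes` is hypothetical, and the default label selector `app.kubernetes.io/component=nats` is an assumption about how the chart labels its server Pods. Check the real labels with `kubectl get pods -n nats --show-labels` and pass a different selector if needed.

```bash
#!/usr/bin/env bash

# Print every node that hosts more than one NATS Pod.
# Empty output means each Pod runs on its own node.
# $1: namespace (assumed default: nats)
# $2: label selector (assumed default: app.kubernetes.io/component=nats)
nats_shared_nodes() {
  local namespace=${1:-nats}
  local selector=${2:-app.kubernetes.io/component=nats}

  kubectl get pods --namespace "$namespace" -l "$selector" \
    -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' \
    | sort | uniq -d
}
```

Run `nats_shared_nodes nats` after the rollout completes. Any node name it prints is carrying two or more NATS servers and is a single point of failure for them.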
207+
208+
### Enable persistent volumes
209+
210+
[Persistent Volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) (PVs) are how Kubernetes adds stateful storage to containers. Applications define a Persistent Volume Claim (PVC), which is fulfilled by a storage engine. The storage is then provisioned and attached to the container as a PV.
211+
212+
The NATS Helm chart enables file storage for JetStream by default. The PV can be configured through a template in the NATS Helm chart.
213+
214+
In your values.yaml file for the NATS Helm chart:
215+
216+
```yaml
217+
config:
218+
  jetstream:
219+
    fileStore:
220+
      enabled: true
221+
      dir: /data
222+
      pvc:
223+
        enabled: true
224+
        size: 10Gi
225+
```
226+
227+
The `size` value will depend on your own usage. We suggest you start with a generous estimate, then monitor actual usage.
228+
If no storage class is specified, the cluster's default storage class is used. You can select a different storage class by setting the `config.jetstream.fileStore.pvc.storageClassName` parameter.
229+
230+
When running on-premises, if you do not have a storage driver, you can try [Longhorn from the CNCF](https://github.com/longhorn/longhorn).
231+
232+
On AWS, EBS volumes are recommended.
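
Once NATS has been deployed, it is worth checking that every replica actually received its volume, since a claim that stays `Pending` usually points to a missing or misconfigured storage class. The following is a sketch under assumptions: the function name `unbound_nats_pvcs` is hypothetical and the namespace defaults to `nats`.

```bash
#!/usr/bin/env bash

# Print the name of every PVC in the namespace that is not in the Bound phase.
# Empty output means all claims were fulfilled.
# $1: namespace (assumed default: nats)
unbound_nats_pvcs() {
  local namespace=${1:-nats}

  kubectl get pvc --namespace "$namespace" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.phase}{"\n"}{end}' \
    | awk '$2 != "Bound" { print $1 }'
}
```

If `unbound_nats_pvcs nats` prints a claim name, `kubectl describe pvc <name> -n nats` shows the provisioning events for it.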
233+
234+
### Deploy NATS
235+
236+
Deploy NATS with your custom configuration values:
237+
238+
```bash
239+
helm repo add nats https://nats-io.github.io/k8s/helm/charts/
240+
241+
kubectl create namespace nats
242+
helm upgrade --install nats nats/nats --namespace nats --values nats-values.yaml
243+
```
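
Before pointing OpenFaaS at the new cluster, you may want to wait until all NATS replicas report Ready. The wrapper below is a minimal sketch: the function name `wait_for_nats` is hypothetical, and the defaults assume both the namespace and the StatefulSet are called `nats`, which matches a release named `nats`. Pass the namespace the release was actually installed into.

```bash
#!/usr/bin/env bash

# Block until every replica of the NATS StatefulSet is rolled out and Ready,
# or fail after the timeout.
# $1: namespace (assumed default: nats)
# $2: StatefulSet name (assumed default: nats)
wait_for_nats() {
  local namespace=${1:-nats}
  local statefulset=${2:-nats}

  kubectl rollout status "statefulset/${statefulset}" \
    --namespace "$namespace" --timeout=180s
}
```

Calling `wait_for_nats nats` returns a non-zero exit code if the rollout does not finish in time, so it can be used as a gate in a deployment script.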
244+
245+
### Connect OpenFaaS to an external NATS server
246+
247+
Update the values.yaml configuration for your OpenFaaS deployment to point to your external NATS installation:
248+
249+
```yaml
250+
nats:
251+
  external:
252+
    enabled: true
253+
    clusterName: "openfaas"
254+
    host: "nats.nats"
255+
    port: "4222"
256+
```
257+
258+
#### Stream replication settings
259+
260+
Each OpenFaaS queue is backed by a dedicated JetStream stream. To protect messages against node or Pod failures, streams should use replication.
261+
262+
The replication factor determines how many servers store a copy of the data:
263+
264+
- Replicas = 1 – Default in OpenFaaS: fastest, but not resilient. A crash or outage can result in message loss.
265+
- Replicas = 3 – Recommended for production: balances performance with resilience. Can tolerate the loss of one NATS server.
266+
- Replicas = 5 – Maximum: tolerates two server failures but with reduced performance and higher resource usage.
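
The failure tolerance figures above follow from Raft's majority rule: a stream stays available only while more than half of its replicas can still reach each other. The arithmetic can be sketched in a few lines of shell. The function names are hypothetical and exist only for illustration.

```bash
#!/usr/bin/env bash

# Smallest number of replicas that forms a majority: floor(n / 2) + 1.
quorum() {
  local replicas=$1
  echo $(( replicas / 2 + 1 ))
}

# Replicas that can be lost while a majority is still available.
tolerated_failures() {
  local replicas=$1
  echo $(( replicas - (replicas / 2 + 1) ))
}

for r in 1 3 5; do
  echo "replicas=$r quorum=$(quorum "$r") tolerated_failures=$(tolerated_failures "$r")"
done
# replicas=1 quorum=1 tolerated_failures=0
# replicas=3 quorum=2 tolerated_failures=1
# replicas=5 quorum=3 tolerated_failures=2
```

The same rule explains why an even replica count adds cost without adding resilience: four replicas need a quorum of three and still tolerate only one failure.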
267+
268+
For production environments, set the replication factor to at least 3 in the OpenFaaS Helm configuration:
269+
270+
```yaml
271+
nats:
272+
  streamReplication: 3
273+
```
274+
275+
See the [NATS documentation on stream replication](https://docs.nats.io/nats-concepts/jetstream#persistent-and-consistent-distributed-storage) for more details.
276+
175277
## See also
176278

177279
- [The Next Generation of Queuing: JetStream for OpenFaaS](https://www.openfaas.com/blog/jetstream-for-openfaas/)
