How to secure egress with F5 Service Proxy for Kubernetes
Outline:

- Securing Egress Challenges
- How F5 can help
- Technical overview of how it works

Getting traffic into your clusters to your workloads is just a small part of the cluster admin's tasks, and there are many options available. Controlling the packets going out is harder and often ignored. This makes your clusters more vulnerable to security risks because they don't follow the same strict rules as your traditional networks. This article will dive deeper into how SPK can control traffic exiting your clusters, even when your application workload uses Multus to attach additional external interfaces.

Secure Egress Challenges

By default, a pod deployed using the Calico CNI will follow the default route to get out of the cluster. Traffic will look like it's coming from the worker host's external IP address on the management interface. While Kubernetes NetworkPolicies can be used for egress, it becomes painful to manage the lifecycle of hundreds or thousands of policies across all namespaces as the cluster grows. If you deploy a pod with Multus interfaces, as commonly seen with telco applications, you add another way for that pod to bypass any NetworkPolicies applied within the cluster. What if there was a way to manage egress dynamically (as pods are spun up and down) and easily, so that the cluster admin could centrally configure and control traffic flowing out of the cluster?

How F5 can help

Service Proxy for Kubernetes (SPK) is a cloud-native application traffic management solution, designed for communication service provider (CoSP) 5G networks and other application workloads. With SPK and its Calico egress gateway feature, managing a pod's default Calico network interface as well as any Multus interfaces becomes easy and consistent with the CSRC daemonset. Kernel routes are automatically configured so that the pod's traffic will always be routed via the SPK pod, where you can apply consistent, namespace-aware network policies, source NAT translation, and other controls. If the "watched" application workload is deleted, the corresponding host rules also get removed.

Technical Overview

This section will provide an overview of how to configure the above scenario.

Host Prerequisites

On the host, two macvlan bridge shims are created on physical interfaces: one for the application pod's Calico traffic and one for the macvlan traffic, which will forward packets on to SPK. These interfaces allow connectivity to the SPK's "internal" and "external2" interfaces, respectively.

    ip link add spk-shim link ens224 type macvlan mode bridge
    ip addr add 10.1.30.244/24 dev shim1
    ip link set shim1 up

    ip link add spk-shim2 link ens256 type macvlan mode bridge
    ip addr add 10.1.10.244/24 dev shim2
    ip link set shim2 up

Application Prerequisites and Configuration

In the SPK controller values.yaml file, configure your application workload namespaces in the watchNamespace block.

    watchNamespace:
      - "spk-apps"
      - "spk-apps-2"

Since we want SPK to do the source NAT for pod egress traffic, we create an IPPool with natOutgoing set to false. This IPPool will be used by the applications.

    apiVersion: crd.projectcalico.org/v1
    kind: IPPool
    metadata:
      name: app-ip-pool
    spec:
      cidr: 10.124.0.0/16
      ipipMode: Always
      natOutgoing: false

Ensure that the application namespaces are annotated as below to use the IPPool.

    kubectl annotate namespace spk-apps "cni.projectcalico.org/ipv4pools"=[\"app-ip-pool\"]
    kubectl annotate namespace spk-apps-2 "cni.projectcalico.org/ipv4pools"=[\"app-ip-pool\"]

Deploy your application.
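The example deployment manifest below attaches its secondary interface by referencing a NetworkAttachmentDefinition named macvlan-conf-ens256-myapp1. That object is not shown in the original write-up, so the following is only an illustrative sketch of what a macvlan attachment on ens256 could look like; the IPAM plugin and address range are assumptions chosen to line up with the addressing seen later in this example.

    apiVersion: k8s.cni.cncf.io/v1
    kind: NetworkAttachmentDefinition
    metadata:
      name: macvlan-conf-ens256-myapp1
      namespace: spk-apps
    spec:
      # Illustrative values only: the IPAM plugin and range below are assumptions.
      config: '{
        "cniVersion": "0.3.1",
        "type": "macvlan",
        "master": "ens256",
        "mode": "bridge",
        "ipam": {
          "type": "host-local",
          "subnet": "10.1.10.0/24",
          "rangeStart": "10.1.10.170",
          "rangeEnd": "10.1.10.180",
          "gateway": "10.1.10.160"
        }
      }'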
See below for an example deployment manifest for the application. Note that I'm attaching a secondary macvlan interface, which is in addition to the default Calico interface. It will get an IP address automatically, as configured in the corresponding NetworkAttachmentDefinition. Note the specific labels used by SPK, which allow you to enable traffic routing to SPK on a per-application basis. Additionally, the enableSecureSPK=true label will instruct SPK to create additional listeners that will pick up traffic coming from the pod's secondary macvlan interface. (These listeners are shown later.)

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
      annotations:
    spec:
      selector:
        matchLabels:
          app: nginx
      replicas: 2
      template:
        metadata:
          annotations:
            k8s.v1.cni.cncf.io/networks: '[ { "name": "macvlan-conf-ens256-myapp1" } ]'
          labels:
            app: nginx
            enableSecureSPK: "true"
            enablePseudoCNI: "true"
            secureSPKPort: "8050"
            secureSPKCNFPodIfName: "net1"
            secondaryCNINodeIfName: "spk-shim2"
            primaryCNINodeIfName: "spk-shim"
            secureSPKNetAttachDefName: "macvlan-conf-ens256"
            secureSPKEgressVlanName: "external"

SPK Configuration

Deploy the custom resource that will configure a listener that does two things:

- listen for traffic coming from the internal VLAN, i.e. the Calico interface of targeted application pods
- SNAT the traffic so that the source IP is an IP address of SPK

    apiVersion: "k8s.f5net.com/v1"
    kind: F5SPKEgress
    metadata:
      name: egress-crd
      namespace: ns-f5-spk
    spec:
      #leave commented out for snat automap
      #egressSnatpool: "snatpool-1"
      dualStackEnabled: false
      maxTmmReplicas: 1
      vlans:
        vlanList: [internal]
        disableListedVlans: false

Next, we deploy the CSRC DaemonSet that dynamically creates the kernel rules and routes for us. Note that I am setting the daemonsetMode to "pseudoCNI", which means I want to route both primary (Calico) and secondary interface traffic to SPK.

values-csrc.yaml:

    image:
      repository: gitlab.tky.lab:5050/registry/spk/200

    # daemonset mode: regular, secureSPK, or pseudoCNI
    #daemonsetMode: "regular"
    daemonsetMode: "pseudoCNI"
    ipFamily: "ipv4"

    imageCredentials:
      name: f5-common-pull-creds

    config:
      iptableid: 200
      interfacename: "spk-shim"
      #tmmpodinterfacename: "internal"

    json:
      ipPoolCidrInfo:
        cidrList:
          - name: cluster-cidr0
            value: "172.21.107.192/26"
          - name: cluster-cidr1
            value: "172.21.66.0/26"
          - name: cluster-cidr2
            value: "10.124.18.192/26"
          - name: node-cidr0
            value: "10.1.11.0/24"
          - name: node-cidr1
            value: "10.1.10.0/24"
        ipPoolList:
          - name: default-ipv4-ippool
            value: "172.21.64.0/18"
          - name: spk-app1-pool
            value: "10.124.0.0/16"

Testing

You can then log onto the worker node that is hosting the applications and confirm that the routes and rules are created. Essentially, the rules make the Calico interfaces use a custom route table that ensures that the default route is via the SPK.

    # ip rule
    0:     from all lookup local
    32254: from all to 172.21.107.192/26 lookup main
    32254: from all to 172.21.66.0/26 lookup main
    32254: from all to 10.124.18.192/26 lookup main
    32254: from all to 172.28.15.0/24 lookup main
    32254: from all to 10.1.10.0/24 lookup main
    32257: from 10.124.18.207 lookup ns-f5-spkshim1ipv4257   <-- match on app pod1 calico IP!!!
    32257: from 10.124.18.211 lookup ns-f5-spkshim1ipv4257   <-- match on app pod2 calico IP!!!
    32258: from 10.1.10.171 lookup ns-f5-spkshim2ipv4258     <-- match on app pod1 macvlan IP!!!
    32258: from 10.1.10.170 lookup ns-f5-spkshim2ipv4258     <-- match on app pod2 macvlan IP!!!
    32766: from all lookup main
    32767: from all lookup default

    # ip route show table ns-f5-spkshim1ipv4257
    default via 10.1.30.242 dev shim1
    10.1.30.242 via 10.1.30.242 dev shim1

    # ip route show table ns-f5-spkshim2ipv4258
    default via 10.1.10.160 dev shim2
    10.1.10.160 via 10.1.10.160 dev shim2

If I then execute a curl command towards a server that exists in a network segment beyond SPK, the application pod will hit the CSRC-configured ip rule and the traffic is then forwarded to its new default gateway, which is SPK. Since SPK has Source NAT enabled, the "Client IP" from the server's perspective is the self-IP of SPK. This means you can apply firewall policies to application workloads in a deterministic way, as well as have visibility into what kind of traffic is coming out of your clusters.

    k exec -it nginx-7d7699f86c-hsx48 -n my-app1 -- curl 10.1.70.30
    ================================================
             (ASCII art banner)
    ================================================
    Node Name: F5 Docker vLab
    Short Name: server.tky.f5se.com
    Server IP: 10.1.70.30
    Server Port: 80
    Client IP: 10.1.30.242
    Client Port: 59248
    Client Protocol: HTTP
    Request Method: GET
    Request URI: /
    host_header: 10.1.70.30
    user-agent: curl/7.88.1

A simple tcpdump command run in the debug container of SPK confirms that the pod's Calico interface IP (10.124.18.192) is the source IP of the incoming traffic on SPK, and that after Source NAT is applied using the self-IP of SPK (10.1.30.242), the packet is sent out towards the server.

    /tcpdump -nni 0.0 tcp port 80
    ----snip----
    12:34:51.964200 IP 10.124.18.192.48194 > 10.1.70.30.80: Flags [P.], seq 1:75, ack 1, win 225, options [nop,nop,TS val 4077628853 ecr 777672368], length 74: HTTP: GET / HTTP/1.1 in slot1/tmm0 lis=egress-ipv4 port=1.1 trunk=
    ----snip----
    12:34:51.964233 IP 10.1.30.242.48194 > 10.1.70.30.80: Flags [P.], seq 1:75, ack 1, win 225, options [nop,nop,TS val 4077628853 ecr 777672368], length 74: HTTP: GET / HTTP/1.1 out slot1/tmm0 lis=egress-ipv4 port=1.1 trunk=

Let's take a look at egress application traffic that is using the secondary macvlan interface. In this case, I have not configured Source NAT, so SPK will forward the traffic out, retaining the original pod IP.

    k exec -it nginx-7d7699f86c-g4hpv -n my-app1 -- curl 10.1.80.30
    ================================================
             (ASCII art banner)
    ================================================
    Node Name: F5 Docker vLab
    Short Name: ue-client3
    Server IP: 10.1.80.30
    Server Port: 80
    Client IP: 10.1.10.170
    Client Port: 56436
    Client Protocol: HTTP
    Request Method: GET
    Request URI: /
    host_header: 10.1.80.30
    user-agent: curl/7.88.1

Another tcpdump command run in the debug container of SPK shows that it receives the above GET request and sends it out without Source NAT in this case.
    /tcpdump -nni 0.0 tcp port 80
    ----snip----
    13:54:40.389281 IP 10.1.10.170.56436 > 10.1.80.30.80: Flags [P.], seq 1:75, ack 1, win 229, options [nop,nop,TS val 4087715696 ecr 61040149], length 74: HTTP: GET / HTTP/1.1 in slot1/tmm0 lis=secure-egress-ipv4-virtual-server port=1.2 trunk=
    ----snip----
    13:54:40.389305 IP 10.1.10.170.56436 > 10.1.80.30.80: Flags [P.], seq 1:75, ack 1, win 229, options [nop,nop,TS val 4087715696 ecr 61040149], length 74: HTTP: GET / HTTP/1.1 out slot1/tmm0 lis=secure-egress-ipv4-virtual-server port=1.2 trunk=

You can use the familiar tmctl command inside the debug container of SPK to confirm the statistics for both listeners that process the pod's primary (egress-ipv4) and secondary (secure-egress-ipv4-virtual-server) interface egress traffic.

    /tmctl -f /var/tmstat/blade/tmm0 virtual_server_stat -s name,clientside.bytes_in,clientside.bytes_out,no_staged_acl_match_accept -w 200
    name                               clientside.bytes_in  clientside.bytes_out  no_staged_acl_match_accept
    ---------------------------------- -------------------  --------------------  --------------------------
    secure-egress-ipv4-virtual-server                  394                   996                           1
    egress-ipv4                                        394                  1011                           1

Now that you have egress traffic routed to the SPK data plane pods, you can use the F5-published custom resource definitions (CRDs) below to apply granular access control lists (ACLs) to meet your security requirements. The firewall configuration is defined as code (YAML manifests), so it natively integrates with Kubernetes and is portable across clusters.

- F5BigContextGlobal: CRD to define the default global firewall behavior and reference the firewall policy.
- F5BigFwPolicy: CRD to define your firewall rules.

In summary, the configuration snippets above show how SPK can capture all egress traffic in a dynamic way, so that you don't have to sacrifice security and control in your ever-changing Kubernetes clusters.

What is Web Cache Exploitation?
Let's talk about Web Cache Exploitation. There was a presentation at BlackHat/DEF CON 2024 discussing this, and here is the link to a writeup done by the presenter: https://portswigger.net/research/gotta-cache-em-all

That article details how different HTTP servers and proxies react when presented with specially crafted URLs. These discrepancies have the potential to be used in different types of web cache attacks. My goal here is to give a brief overview and discuss how NGINX can be involved in this, as well as mitigations that are possible. It is a good idea to reference that article, as I am only summarizing pieces of it here, and the researcher did a great job of writing it up.

Definitions:

First, here are a few terms that will be used in this article:

- Web caching: the process of storing copies of web files either on the user's device or in a third-party device such as a proxy or Content Delivery Network (CDN). The purpose of this is to speed up the serving of static content by presenting it from the store instead of the backend server. This saves time and resources. Web caches use keys to determine which responses should be stored or not. These usually use the URL in some fashion, then map to the stored response.
- Web Cache Poisoning: the act of inserting fake content into the cache, causing clients to inadvertently pull content they were not intending to.
- Web Cache Deception: the act of tricking the backend server into placing dynamic content into a cache thinking that it was static. This can be especially bad if the data is intended for an authenticated user.
- Delimiters: one or more characters in a sequence that indicate a separation (end/beginning) of the elements in a stream of text or data. An example of this could be the question mark in a URI indicating that a query is starting.
- Normalization: concerning web traffic, the process of standardizing data for consistency across network paths. We see this a lot with web traffic using % notation for certain characters, such as %20 for a space.

Detecting Delimiters and Normalization:

The article describes that the RFC (https://datatracker.ietf.org/doc/html/rfc3986) states which characters are used as delimiters. The issue is that the RFC is very permissive and allows each implementation to add to that list. The article then gives a few examples of how to detect the delimiters that backend servers or caches use, which can help determine if there is a discrepancy between them. For example, the article shows sending a request for /home and then a request for /home$abcd to see if the response is the same or not. This can also be used to see if the cached request is served up when specific delimiters are used.

The second discrepancy that the article discusses is with normalization. Using delimiters, the path is extracted and then normalized to resolve any encoded characters or dot-segments that may be used. To explain what those are:

- Encoding is used sometimes when a delimiter character needs to be interpreted by the application rather than the HTTP parser. For example: %2F used instead of a forward slash /.
- Dot-segment normalization is a way to reference a resource from a relative path, also often referred to as a path traversal. For example: ../ used to move back one directory.

The RFC says how URLs should be encoded and how dot-segments should be handled, but it doesn't say how a request should be forwarded or changed, which makes it hard to tell which vendors agree with each other.
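To make the delimiter probing concrete, here is a minimal sketch, not taken from the referenced article, of how you might compare responses for a plain path and the same path with a candidate delimiter and junk suffix appended; the target host and the $abcd suffix are purely illustrative.

    # Illustrative probe (example.com and the '$abcd' suffix are placeholders):
    # if both requests return the same body, something in the chain likely
    # treated '$' as a delimiter and stripped the suffix before resolving the path.
    curl -s 'https://example.com/home' -o plain.txt
    curl -s 'https://example.com/home$abcd' -o suffixed.txt
    diff -q plain.txt suffixed.txt && echo "identical responses: '\$' may be acting as a delimiter"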
Similar to what was done in the delimiter section, the article gives different examples of how to detect discrepancies in decoding behavior. For example, the article gives a table that lists different cache proxies as well as HTTP servers and how each treats a request for /hello..%2fworld. NGINX resolves this to /world, whereas Apache does not normalize it at all.

Deception:

Cache rules are used to determine whether a resource is static and should be stored or not. The discrepancies mentioned in the last section can be leveraged to exploit cache rules, possibly leading to dynamic content being stored. The article describes different data attributes that cache proxies may use to determine if a resource is static or not. These include static extensions, static directories, and static files.

Static extensions may include file types such as .css, .js, .pdf, and more. Some proxies may have rules set up that cause these extensions to allow caching. An example given in the article is where the dollar sign is a delimiter on the backend server but not the proxy. This can cause the response to a specific path to be cached when it should not be. Normalization discrepancies can be used to exploit this as well, by encoding a delimiter.

Example: a request for /account$static.css will be stored by the proxy due to the .css extension, but because of the delimiter, the response from the backend is for /account, which may be a client's authorized account data.

Static directory rules are those that match the path used for the request. Some common examples are /static, /shared, /media, and more. This is similar to static extensions, where delimiter discrepancies and normalization discrepancies can be used for exploitation. This involves hiding a path traversal after a character that is a delimiter on the backend server. The static directory is then placed after the path traversal, causing the proxy to resolve it but not the backend server.

Example:
    request:             /account$/..%2Fstatic/any
    cache proxy sees:    /static/any
    backend server sees: /account

Static files are files that may not necessarily be in a static directory or have a static extension, but are expected to stay static on every site. Examples of these files are /robots.txt or /favicon.ico. Exploiting these types of rules is similar to how static directories are exploited. In other words, the example would look like the previous one, except 'static/any' is replaced with 'robots.txt'.

Poisoning:

If the attacker can get a cache to store a specific response under the key that the cache is using, then they can steer users to that response when they visit. Delimiter and normalization discrepancies can be exploited to carry out cache poisoning, and by combining them it could be possible to modify a cache key so that it points to a highly visited page. There are several ways to combine these techniques, including key normalization and delimiters used by the backend server or by the frontend cache.

Key normalization may happen before the cache key is generated. This can allow for poisoning of the mapped resource if the backend server is interpreting the path differently. This is similar to the example above for static directories: if a path traversal is placed between the path seen by the backend server and the path you want cached, you may be able to map one to the other.

Example:
    URL:            /path/../../home
    Cache Key:      /home
    Backend Server: /path

As this shows, it is possible to create a cache entry with a key pointing to /home that returns the response for /path.
So, when a user visits /home, they will not receive the page they expected; instead they will get the page that the malicious actor wanted them to get.

Server delimiters can be used for this when the cache is not using the same delimiter. This allows for the creation of a key for the response, as the delimiter will prevent the backend server from fully resolving the path. This is similar to the last example, but with the delimiter placed before the path traversal.

Example:
    URL:            /path$/../home
    Cache Key:      /home
    Backend Server: /path

Cache delimiters are harder, since special characters that the browser will allow are harder to find for web caches. The pound sign can do this, though, as some caches use it as a delimiter. This is similar to the previous example, but the other way around, as the backend server path comes last, after the traversal.

Example:
    URL:            /path#../home
    Cache Key:      /path
    Backend Server: /home

Mitigation/Defense:

The first thing to note is that none of this means that vendors are doing anything wrong with their products. The differences in how each handles normalization and delimiters are expected, given the freedom each has to add its own options. Also, I mentioned that I would further discuss how NGINX could be involved in these kinds of attacks. Naturally, as NGINX can be used as a proxy and a web server, it can be involved in these types of transactions. So it really comes down to how NGINX handles normalization and delimiters compared to a web cache being used in the same path. The author of that article does a great job of comparing multiple vendors for backend servers, CDNs, and frameworks. The first defense would be to try to use products that align in how they parse data, to reduce the opportunities for these attacks to happen.

The next defense, and probably the best design choice, would be to add cache controls to your pages to prevent caching of pages that should never be cached. This would mean adding a 'Cache-Control' header with values of 'no-store' and 'private' to any dynamically generated responses, and then also ensuring that none of the cache rules can override the header that is set.

Another option would be to add a WAF into the path of the traffic. Just looking at a lot of the requests used in these examples (path traversal and meta-characters, for instance), ASM/Advanced WAF or NGINX App Protect would be pretty effective at stopping many of these requests.
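Circling back to the Cache-Control recommendation above, and assuming NGINX is the web server in front of the dynamic content, a minimal sketch of that guidance could look like the following; the /account/ path and the upstream name are placeholders, not taken from the article.

    # Minimal sketch: mark dynamic, per-user responses as non-cacheable so an
    # upstream cache or CDN should never store them, regardless of its own rules.
    location /account/ {
        add_header Cache-Control "no-store, private" always;
        proxy_pass http://app_backend;   # hypothetical upstream name
    }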
One thing that was discussed in the article in regard to NGINX was how it handles the newline-encoded byte (%0A) in a rewrite rule. This byte is used as a path delimiter in NGINX. A common use of the rewrite rule is to use the regex (.*) to carry the rest of the path over to the new location. For example:

    rewrite /path/(.*) /newpath/$1 break;

This will work in most situations, but if the newline byte is added, then it will stop at that delimiter. For example:

    /path/test%0abcde  --->  /newpath/test

You can see how it gets cut off once the encoded byte is hit. I did some research on this and found a similar situation with the return rule in NGINX: https://reversebrain.github.io/2021/03/29/The-story-of-Nginx-and-uri-variable/

That blog shows how the Carriage Return Line Feed (CRLF) can be used to inject a header into the response. I tested this by firing up an NGINX container and adding a location configuration to my nginx.conf file like this:

    server {
        location /static/ {
            return 302 http://localhost$uri;
        }
    }

I then send a request with the encoded CRLF (%0D%0A) followed by the header I want injected:

    curl "http://127.0.0.1:8081/static/%0d%0aX-Foo:%20CLRF" -v
    *   Trying 127.0.0.1:8081...
    * Connected to 127.0.0.1 (127.0.0.1) port 8081
    > GET /static/%0d%0aX-Foo:%20CLRF HTTP/1.1
    > Host: 127.0.0.1:8081
    > User-Agent: curl/8.6.0
    > Accept: */*
    >
    < HTTP/1.1 302 Moved Temporarily
    < Server: nginx/1.27.0
    < Date: Thu, 15 Aug 2024 18:15:46 GMT
    < Content-Type: text/html
    < Content-Length: 145
    < Connection: keep-alive
    < Location: http://localhost/static/
    < X-Foo: CLRF                          <----- header injected
    <
    <html>
    <head><title>302 Found</title></head>
    <body>
    <center><h1>302 Found</h1></center>
    <hr><center>nginx/1.27.0</center>
    </body>
    </html>
    * Connection #0 to host 127.0.0.1 left intact

That blog also describes how to avoid that happening by changing the return directive to use $request_uri instead of $uri or $document_uri. This made me wonder if it was possible to similarly modify the rewrite directive to avoid the issue with the newline-encoded byte being used as a path delimiter. After searching, I found this page on GitHub: https://github.com/kubernetes/ingress-nginx/issues/11607, which then links to https://trac.nginx.org/nginx/ticket/2452. These pages discuss the issue with the newline-encoded byte being used as a delimiter. The response in the ticket was to use the regex modifier (?s) to enable single-line mode. I re-configured my NGINX container to add another couple of locations so I could test this:

    server {
        location /static/ {
            return 302 http://localhost$uri;
        }
        location /user/ {
            rewrite /user/(.*) /account/$1 redirect;
        }
        location /test/ {
            rewrite /test/(?s)(.*) /account/$1 redirect;
        }
    }

So now I have two rewrite directives, one for testing the issue and one for testing the workaround. Now send a request and see if it works.

    curl "http://127.0.0.1:8081/user/%0d%0aX-Foo:%20CLRF" -v
    *   Trying 127.0.0.1:8081...
    * Connected to 127.0.0.1 (127.0.0.1) port 8081
    > GET /user/%0d%0aX-Foo:%20CLRF HTTP/1.1
    > Host: 127.0.0.1:8081
    > User-Agent: curl/8.6.0
    > Accept: */*
    >
    < HTTP/1.1 302 Moved Temporarily
    < Server: nginx/1.27.0
    < Date: Thu, 15 Aug 2024 18:56:48 GMT
    < Content-Type: text/html
    < Content-Length: 145
    < Location: http://127.0.0.1/account/%0D   <--- Newline delimiter was hit.
    < Connection: keep-alive
    <
    <html>
    <head><title>302 Found</title></head>
    <body>
    <center><h1>302 Found</h1></center>
    <hr><center>nginx/1.27.0</center>
    </body>
    </html>
    * Connection #0 to host 127.0.0.1 left intact

For the first test, it cut off at the newline-encoded byte as expected. Now to test the workaround.

    curl "http://127.0.0.1:8081/test/%0d%0aX-Foo:%20CLRF" -v
    *   Trying 127.0.0.1:8081...
    * Connected to 127.0.0.1 (127.0.0.1) port 8081
    > GET /test/%0d%0aX-Foo:%20CLRF HTTP/1.1
    > Host: 127.0.0.1:8081
    > User-Agent: curl/8.6.0
    > Accept: */*
    >
    < HTTP/1.1 302 Moved Temporarily
    < Server: nginx/1.27.0
    < Date: Thu, 15 Aug 2024 19:32:50 GMT
    < Content-Type: text/html
    < Content-Length: 145
    < Location: http://127.0.0.1/account/%0D%0AX-Foo:%20CLRF   <------- Appears to have worked.
    < Connection: keep-alive
    <
    <html>
    <head><title>302 Found</title></head>
    <body>
    <center><h1>302 Found</h1></center>
    <hr><center>nginx/1.27.0</center>
    </body>
    </html>
    * Connection #0 to host 127.0.0.1 left intact

Changing the regular expression to enable single-line mode prevents any confusion being introduced by newline characters. This is just an FYI, as I thought it was interesting to see the issues raised in the past by others and what suggestions were given.

Last Thoughts:

First of all, I would like to thank Michael Hedges and Parker Green, both from F5 Networks, for bringing this to our attention. As shown in the examples and the article written by the researcher, these types of attacks are not extremely difficult to carry out and can have very significant ramifications in specific scenarios. As such, taking this into account when setting up a site is key. This includes configuring pages to use cache controls and choosing which vendors to use for both web servers and web caching proxies. The article I referenced at the beginning does a good job of breaking down how each vendor handles different scenarios, which makes it a great reference point to start with.

How I Did it - Migrating Applications to Nutanix NC2 with F5 Distributed Cloud Secure Multicloud Networking
In this edition of "How I Did it", we will explore how F5 Distributed Cloud Services (XC) enables seamless application extension and migration from an on-premises environment to Nutanix NC2 clusters.

SSL Orchestrator Advanced Use Cases: Detecting Generative AI
Introduction

Quick, take a look at the following list and answer this question: "What do these movies have in common?"

- 2001: A Space Odyssey
- Westworld
- Tron
- WarGames
- Electric Dreams
- The Terminator
- The Matrix
- Eagle Eye
- Ex Machina
- Avengers: Age of Ultron
- M3GAN

If you answered, "They're all about artificial intelligence", yes, but... If you answered, "They're all about artificial intelligence that went terribly, sometimes horribly wrong", you'd be absolutely correct. The simple fact is... artificial intelligence (AI) can be scary. Proponents for, and opponents against, will disagree on many aspects, but they can all at least acknowledge there's a handful of ways to do AI correctly... and a million ways to do it badly. Not to be an alarmist, but while SkyNet was fictional, semi-autonomous guns on robot dogs are not...

But then why am I talking about this on a technical forum, you may ask? Well, when most of the above films were made, AI was largely still science fiction. That's clearly not the case anymore, and tools like ChatGPT are just the tip of the coming AI frontier. To be fair, I don't make the claim that all AI is bad, and many have indeed lauded ChatGPT and other generative AI tools as the next great evolution in technology. But it's also fair to say that generative AI tools, like ChatGPT, have a very real potential to cause harm. At the very least, these tools can be convincing, even when they're wrong. And worse, they could lead to sensitive information disclosures. One only has to do a cursory search to find a few examples of questionable behavior:

- Lawyers File Motion Written by AI, Face Sanctions and Possible Disbarment
- Higher Ed Beware: 10 Dangers of ChatGPT Schools Need to Know
- ChatGPT and AI in the Workplace: Should Employers Be Concerned?
- OpenAI's New Chatbot Will Tell You How to Shoplift and Make Explosives
- Giant Bank JP Morgan Bans ChatGPT Use Among Employees
- Samsung Bans ChatGPT Among Employees After Sensitive Code Leak

But again... what does this have to do with a technical forum? And more important, what does this have to do with you? Simply stated, if you are in an organization where generative AI tools could be abused, understanding, and optionally controlling, how and when these tools are accessed could help to prevent the next big exploit or disclosure. If you search beyond the above links, you'll find an abundance of information on both the benefits and the security concerns of AI technologies. And ultimately you'll still be left to decide if these AI tools are safe for your organization. It may simply be worthwhile to understand WHAT tools are being used. And in some cases, it may be important to disable access to these.

Given the general depth and diversity of AI functions within arm's reach today, and growing, it'd be irresponsible to claim "complete awareness". The bulk of these functions are delivered over standard HTTPS, so the best course of action will be to categorize known assets and adjust as new ones come along. As of the publishing of this article, the industry has yet to define a standard set of categories for AI, and specifically, generative AI. So in this article, we're going to build one and attach it to F5 BIG-IP SSL Orchestrator to enable proactive detection and optional control of Internet-based AI tool access in your organization. Let's get started!

BIG-IP SSL Orchestrator Use Case: Detecting Generative AI

The real beauty of this solution is that it can be implemented faster than it probably took to read the above introduction.
Essentially, you're going to create a custom URL category on F5 BIG-IP, populate that with known generative AI URLs, and employ that custom category in a BIG-IP SSL Orchestrator security policy rule. Within that policy rule, you can elect to dynamically decrypt and send the traffic to the set of inspection products in your security enclave.

Step 1: Create the custom URL category and populate it with known AI URLs. Access the BIG-IP command shell and run the following command. This will initiate a script that creates and populates the URL category:

    curl -s https://raw.githubusercontent.com/f5devcentral/sslo-script-tools/main/sslo-generative-ai-categories/sslo-create-ai-category.sh | bash

Step 2: Create a BIG-IP SSL Orchestrator policy rule to use this data. The above script creates/re-populates a custom URL category named SSLO_GENERATIVE_AI_CHAT, and in that category is a set of known generative AI URLs. To use it, navigate to the BIG-IP SSL Orchestrator UI and edit a Security Policy. Click add to create a new rule, use the "Category Lookup (All)" policy condition, then add the above URL category. Set the Action to "Allow", the SSL Proxy Action to "Intercept", and the Service Chain to whatever service chain you've already created. With Summary Logging enabled in the BIG-IP SSL Orchestrator topology configuration, you'll also get Syslog reporting for each AI resource match: who made the request, to what, and when.

The URL category is employed here to identify known AI tools. In this instance, BIG-IP SSL Orchestrator is used to make that assessment and act on it (i.e. allow, TLS intercept, service chain, log). Should you want even more granular control over the conditions and actions applied to decrypted AI tool traffic, you can also deploy an F5 Secure Web Gateway Services policy inside the SSL Orchestrator service chain. With SWG, you can expand beyond simple detection and blocking, and build more complex rules to decide who can access, when, and how.

It should be said that beyond logging, allowing, or denying access to generative AI tools, SSL Orchestrator also provides decryption and the opportunity to dynamically steer the decrypted AI traffic to any set of security products best suited to protect against potential malware.

Summary

As previously alluded to, this is not an exhaustive list of AI tool URLs. Not even close. But it contains the most common ones you'll see in the wild. The above script populates the category with an initial list of URLs that you are free to update as you become aware of new ones. And of course we invite you to recommend additional AI tools to add to this list.

References: https://github.com/f5devcentral/sslo-script-tools/tree/main/sslo-generative-ai-categories

Securing the LLM User Experience with an AI Firewall
As artificial intelligence (AI) seeps into the core day-to-day operations of enterprises, a need exists to exert control over the intersection point of AI-infused applications and the actual large language models (LLMs) that answer the generated prompts. This control point should serve to impose security rules to automatically prevent issues such as personally identifiable information (PII) inadvertently exposed to LLMs. The solution must also counteract motivated, intentional misuse such as jailbreak attempts, where the LLM can be manipulated to provide often ridiculous answers, with the ensuing screenshotting attempting to discredit the service.

Beyond the security aspect and the overwhelming concern of regulated industries, other drivers include basic fiscal prudence 101: ensuring the token consumption of each offered LLM model is not out of hand. This entire discussion around observability and policy enforcement for LLM consumption has given rise to a class of solutions most frequently referred to as AI Firewalls or AI Gateways (AI GW).

An AI FW might be leveraged by a browser plugin, or perhaps by applying a software development kit (SDK) during the coding process for AI applications. Arguably, the most scalable and most easily deployed approach to inserting AI FW functionality into live traffic to LLMs is to use a reverse proxy. A modern approach includes the F5 Distributed Cloud service, coupled with an AI FW/GW service, cloud-based or self-hosted, that can inspect traffic intended for LLMs like those of OpenAI, Azure OpenAI, or privately operated LLMs like those downloaded from Hugging Face.

A key value offered by this topology, a reverse proxy handing off LLM traffic to an AI FW, which in turn can allow traffic to reach target LLMs, stems from the fact that traffic is seen, and thus controllable, in both directions. Should an issue be present in a user's submitted prompt, also known as an "inference", it can be flagged: PII (personally identifiable information) leakage is a frequent concern at this point. In addition, any LLM responses to prompts are also seen in the reverse path: consider a corrupted LLM providing toxicity in its generated replies. Not good.

To achieve a highly performant reverse proxy approach to secured LLM access, a solution that can span a global set of users, F5 worked with Prompt Security to deploy an end-to-end AI security layer. This article will explore the efficacy and performance of the live solution.

Impose LLM Guardrails with the AI Firewall and Distributed Cloud

An AI firewall such as the Prompt Security offering can get in-line with AI LLM flows through multiple means. API calls from Curl or Postman can be modified to transmit to Prompt Security when trying to reach targets such as OpenAI or Azure OpenAI Service. Simple firewall rules can prevent employees' direct access to these well-known API endpoints, thus making the Prompt Security route the sanctioned method of engaging with LLMs. A number of other methods could be considered, but each has concerns. Browser plug-ins have the advantage of working outside the encryption of the TLS layer, in a manner similar to how users can use a browser's developer tools to clearly see targets and HTTP headers of HTTPS transactions encrypted on the wire. Prompt Security supports plugins. A downside, however, of browser plug-ins is the manageability issue: how to enforce and maintain across-the-board usage; simply consider the headache of non-corporate assets used in the work environment.
Another approach, interesting for non-browser, thick applications on desktops (think of an IDE like VSCode), might be an agent approach, whereby outbound traffic is handled by an on-board local proxy. Again, Prompt can fit in this model; however, the complexity of enforcing the agent, like the browser approach, may not always be easy or aligned with complete A-to-Z security of all endpoints.

One of the simplest approaches is to ingest LLM traffic through a network-centric approach. An F5 Distributed Cloud HTTPS load balancer, for instance, can ingest LLM-bound traffic and thoroughly secure the traffic at the API layer: things like WAF policy and DDoS mitigations, as examples. HTTP-based control plane security is the focus here, as opposed to the encapsulated requests a user is sending to an LLM. The HTTPS load balancer can in turn hand off traffic intended for the likes of OpenAI to the AI gateway for prompt-aware inspections.

F5 Distributed Cloud (XC) is a good architectural fit for inserting a third-party AI firewall service in-line with an organization's inferencing requests. Simply project a FQDN for the consumption of AI services; in this article we used the domain name "llmsec.busdevF5.net" in the global DNS, advertising one single IP address mapping to the name. This DNS advertisement can be done with XC. The IP address, through BGP-4 support for anycast, will direct any traffic sent to this address to the closest of the 27 international points of presence of the XC global fabric. Traffic from a user in Asia may be attracted to the Singapore or Mumbai F5 sites, whereas a user in Western Europe might enter the F5 network in Paris or Frankfurt.

As depicted, a distributed HTTPS load balancer can be configured; "distributed" reflects the fact that traffic ingressing at any of the global sites can be intercepted by the load balancer. Normally, the server name indicator (SNI) value in the TLS Client Hello can be easily used to pick the correct load balancer to process this traffic.

The first step in AI security is traditional reverse proxy core security features, all imposed by the XC load balancer. These features, to name just a few, might include geo-IP service policies to preclude traffic from certain regions, automatic malicious user detection, and API rate limiting; there are many capabilities bundled together. Clean traffic can then be selected for forwarding to an origin pool member, which is the standard operation of any load balancer. In this case, the Prompt Security service is the exclusive member of our origin pool. For this article, it is a cloud-instantiated service; options exist to forward to Prompt implemented on a Kubernetes cluster or running on a Distributed Cloud AppStack Customer Edge (CE) node.

Block Sensitive Data with Prompt Security In-Line

AI inferences, upon reaching Prompt's security service, are subjected to a wide breadth of security inspections.
Some of the more important categories include:

- Sensitive data leakage: although potentially contained in LLM responses, intuitively the larger proportion of risk is within the requesting prompt, with users perhaps inadvertently disclosing data which should not reach an LLM.
- Source code fragments within submissions to LLMs: various programming languages may be scanned for and blocked, and the code may be enterprise intellectual property.
- OWASP LLM Top 10 high-risk violations, such as LLM jailbreaking, where the intent is to make the LLM behave and generate content that is not aligned with the service intentions; the goal may be embarrassing "screenshots", such as having a chatbot for automobile vendor A actually recommend a vehicle from vendor B.
- OWASP prompt injection detection, considered one of the most dangerous threats, as the intention is for rogue users to exfiltrate valuable data from sources the LLM may have privileged access to, such as backend databases.
- Token layer attacks, such as unauthorized and excessive use of tokens for LLM tasks, the so-called "Denial of Wallet" threat.
- Content moderation, ensuring a safe interaction with LLMs devoid of toxicity and racial or gender discriminatory language, and an overall curated AI experience aligned with the productivity gains that LLMs promise.

To demonstrate sensitive data leakage protection, a Prompt Security policy was active which blocked LLM requests containing, among many PII fields, an exposed mailing address. To reach OpenAI GPT-3.5-Turbo, one of the most popular and cost-effective models in the OpenAI model lineup, prompts were sent to an F5 XC HTTPS load balancer at address llmsec.busdevf5.net. Traffic not violating the comprehensive F5 WAF security rules was proxied to the Prompt Security SaaS offering. The prompt below clearly involves a mailing address in the data portion.
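The original screenshot of the offending request is not reproduced here, so the following is a representative sketch in the same curl format used for the other examples in this article; the street address, API key, and exact prompt wording are illustrative placeholders rather than the actual test data.

    # Representative request only: the address and key below are placeholders.
    curl --request POST \
      --url https://llmsec.busdevf5.net/v1/chat/completions \
      --header 'authorization: Bearer sk-<redacted>' \
      --header 'content-type: application/json' \
      --header 'user-agent: vscode-restclient' \
      --data '{"model": "gpt-3.5-turbo","messages": [{"role": "user","content": "Draft a delivery confirmation for our customer at 350 5th Avenue, New York, NY 10118"}]}'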
The ensuing prompt is intercepted by both the F5 and Prompt Security solutions. The first interception, the distributed HTTPS load balancer offered by F5, provides rich details on the transaction, and since no WAF rules or other security policies are violated, the transaction is forwarded to Prompt Security. The F5 console captures the interesting details surrounding the transaction once completed: the transaction was successful at the HTTP layer, producing a 200 OK outcome; the traffic originated in the municipality of Ashton, in Canada, and was received into Distributed Cloud at F5's Toronto (tr2-tor) RE site; and the full details around the targeted URL path, such as the OpenAI /v1/chat/completions target and the user-agent involved, vscode-restclient, are both provided.

Although the HTTP transaction was successful, the actual AI prompt was rejected, as hoped for, by Prompt Security. Drilling into the Activity Monitor in the Prompt UI, one can get a detailed verdict on the transaction: the prompt was blocked, and the violation is "Sensitive Data". The specific offending content, the New York City street address, is flagged as a precluded entity type of "mailing address". Other fields that are potential blocking candidates with Prompt's solution include various international passport or driver's license formats, credit card numbers, emails, and IP addresses, to name but a few.

A nice, time-saving feature offered by the Prompt Security user interface is to simply choose an individual security framework of interest, such as GDPR or PCI, and the solution will automatically invoke the related sensitive data types to detect. An important idea to grasp: the solution from Prompt is much more nuanced and advanced than simple regex; it invokes the power of AI itself to secure customer journeys into safe AI usage. Machine learning models, often transformer-based, have been fine-tuned and orchestrated to interpret the overall tone and tenor of prompts, gaining a real semantic understanding of what is being conveyed in the prompt to counteract simple obfuscation attempts. For instance, using printed numbers, such as one, two, three, to circumvent regex rules predicated on numerals being present will not succeed.

This AI-infused ability to interpret context and intent allows for preset industry guidelines for safe LLM enforcement. For instance, simply indicating that the business sector is financial will allow the Prompt Security solution to pass judgement on, and block if desired, financial reports, investment strategy documents and revenue audits, to name just a few. Similar awareness for sectors such as healthcare or insurance is simply a pull-down menu item away with the policy builder.

Source Code Detection

A common use case for LLM security solutions is identification and, potentially, blocking of submissions of enterprise source code to LLM services. In this scenario, this small snippet of Python is delivered to the Prompt service:

    def trial():
        return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500

    sum(trial() for i in range(10_000)) / 10_000

A policy is in place for Python and JavaScript detection and was invoked as hoped for.

    curl --request POST \
      --url https://llmsec.busdevf5.net/v1/chat/completions \
      --header 'authorization: Bearer sk-oZU66yhyN7qhUjEHfmR5T3BlbkFJ5RFOI***********' \
      --header 'content-type: application/json' \
      --header 'user-agent: vscode-restclient' \
      --data '{"model": "gpt-3.5-turbo","messages": [{"role": "user","content": "def trial():\n return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500\n\nsum(trial() for i in range(10_000)) / 10_000"}]}'

Content Moderation for Interactions with LLMs

One common manner of preventing LLM responses from veering into undesirable territory is for the service provider to implement a detailed system prompt, a set of guidelines that the LLM should be governed by when responding to user prompts. For instance, the system prompt might instruct the LLM to serve as a polite, helpful and succinct assistant for customers purchasing shoes in an online e-commerce portal. A request for help involving the trafficking of narcotics should, intuitively, be denied.
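To illustrate where such a system prompt sits in an API call, here is a hedged sketch using the same chat completions format as the other examples in this article; the system message wording and the user question are invented for illustration and are not the actual guardrails or test prompts used here.

    # Illustrative only: the system message text below is an assumption, not the
    # guardrail configuration used in this article.
    curl --request POST \
      --url https://llmsec.busdevf5.net/v1/chat/completions \
      --header 'authorization: Bearer sk-<redacted>' \
      --header 'content-type: application/json' \
      --data '{"model": "gpt-3.5-turbo","messages": [{"role": "system","content": "You are a polite, succinct shopping assistant for an online shoe store. Only discuss shoes, sizing, and orders."},{"role": "user","content": "Which trail running shoes do you recommend?"}]}'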
Defense in depth has traditionally meant no single point of failure. In the above scenario, screening both the user prompt and the ensuing LLM response for a wide range of topics leads to a more ironclad security outcome. Prompt Security can intelligently seek out many such topics; in this simple example, the topic of "News & Politics" has been singled out to block as a demonstration.

Testing can be performed with this easy curl command, asking for a prediction on a possible election result in Canadian politics:

    curl --request POST \
      --url https://llmsec.busdevf5.net/v1/chat/completions \
      --header 'authorization: Bearer sk-oZU66yhyN7qhUjEHfmR5T3Blbk*************' \
      --header 'content-type: application/json' \
      --header 'user-agent: vscode-restclient' \
      --data '{"model": "gpt-3.5-turbo","messages": [{"role": "user","content": "Who will win the upcoming Canadian federal election expected in 2025"}],"max_tokens": 250,"temperature": 0.7}'

The response, available in the Prompt Security console, is also presented to the user, in this case a curl user leveraging the VSCode IDE. The response has been largely truncated for brevity; fields of interest include an HTTP "X-header" indicating the transaction utilized the F5 site in Toronto, and the number of tokens consumed in the request and response.

Advanced LLM Security Features

Many of the AI security concerns are given prominence by the OWASP Top Ten for LLMs, an evolving and curated list of potential concerns around LLM usage from subject matter experts. Among these are prompt injection attacks and malicious instructions often perceived as benign by the LLM. Prompt Security uses a layered approach to thwart prompt injection. For instance, during the uptick in interest in ChatGPT, DAN (Do Anything Now) prompt injection was widespread and a very disruptive force, as discussed here. User prompts will be closely analyzed for the presence of the various DAN templates that have evolved over the past 18 months. More significantly, the use of AI itself allows the Prompt solution to recognize zero-day, bespoke prompts attempting to conduct mischief. The interpretative powers of fine-tuned, purpose-built security inspection models are likely the only way to stay one step ahead of bad actors.

Another chief concern is protection of the system prompt, the guidelines that rein in unwanted behavior of the offered LLM service, like the one that instructed our LLM earlier in its role as a shoe sales assistant. The system prompt, if somehow manipulated, would be a significant breach in AI security; havoc could be created with an LLM directed astray. As such, Prompt Security offers a policy to compare the user-provided prompt, the configured system prompt in the API call, and the response generated by the LLM. In the event that a similarity threshold with the system prompt is exceeded in the other fields, the transaction can be immediately blocked.

An interesting advanced safeguard is the support for a "canary" word: a specific value that a well-behaved LLM should never present in any response, ever. The detection of the canary word by the Prompt solution will raise an immediate alert.

One particularly broad and powerful feature in the AI firewall is the ability to find secrets, meaning tokens or passwords, frequently for cloud-hosted services, that are revealed within user prompts. Prompt Security offers the ability to scour LLM traffic for in excess of 200 meaningful values. As a small representative sample of the industry's breadth of secrets, these can all be detected and acted upon:

- Azure Storage Keys Detector
- Artifactory Detector
- Databricks API tokens
- GitLab credentials
- NYTimes Access Tokens
- Atlassian API Tokens

Besides simple blocking, a useful redaction option can be chosen. Rather than risk compromise of credentials, an obfuscated value will instead be seen by the LLM.
F5 Positive Security Models for AI Endpoints

The AI traffic delivered to and received from Prompt Security's AI firewall is both discovered and subjected to API-layer policies by the F5 load balancer. Consider the token awareness features of the AI firewall: excessive token consumption can trigger an alert and even transaction blocking. This behavior, a boon when LLMs like the premium OpenAI GPT-4 models may have substantial costs, allows organizations to automatically shut down a malicious actor who illegitimately got hold of an OPENAI_API key value and bombarded the LLM with prompts. This is often referred to as a "Denial of Wallet" situation.

F5 Distributed Cloud, with its focus upon the API layer, has congruent safeguards. Each unique user of an API service is tracked to monitor transactional consumption. By setting safeguards for API rate limiting, an excessive load placed upon the API endpoint will result in HTTP 429 "Too Many Requests" responses to abusive behavior. A key feature of F5 API Security is the fact that it is actionable in both directions, and it is also an in-line offering, unlike some API solutions which reside out of band and consume proxy logs for reporting and threat detection.

With the automatic discovery of API endpoints, the F5 administrator can see the full URL path, which in this case exercises the familiar OpenAI /v1/chat/completions endpoint. The schema of traffic to API endpoints is fully downloadable as an OpenAPI Specification (OAS), formerly known as a Swagger file. This layer of security means fields in API headers and bodies can be validated for syntax, such that a field whose schema expects a floating-point number will see any different encoding, such as a string, blocked in real time in either direction.

A possible and valuable use case: allow initial unfettered access to a service such as OpenAI, by means of Prompt Security's AI firewall service, for a period of perhaps 48 hours. After a baseline of API endpoints has been observed, the API definition can be loaded from the saved Swagger files at the end of this "observation" period. The loaded version can be fully pruned of undesirable or disallowed endpoints, and all future traffic must conform or be dropped. This is an example of a "positive security model", considered a gold standard by many risk-averse organizations. Simply put, a positive security model allows what has been agreed upon and rejects everything else.

This ability to learn and review your own traffic, and then only present Prompt Security with the LLM endpoints that an organization wants exposed, is an interesting example of complementing an AI security solution with rich API-layer features.
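To make the pruned positive security model concrete, a hand-trimmed OpenAPI document for this scenario could be as small as the sketch below; the title, version, and response detail are placeholder values, and only the single endpoint observed in this article is retained.

    # Hypothetical, hand-pruned spec: only the one observed endpoint remains allowed.
    openapi: 3.0.3
    info:
      title: llm-allowed-endpoints   # placeholder name
      version: "1.0"
    paths:
      /v1/chat/completions:
        post:
          summary: Chat completion requests forwarded via the AI firewall
          responses:
            "200":
              description: OK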
Summary

The world of AI and LLMs is rapidly seeing investment, in time and money, from virtually all economic sectors; the promise of rapid dividends in the knowledge economy is hard to resist. As with any rapid deployment of new technology, safe consumption is not guaranteed, and it is not built in. Although LLMs often suggest guardrails are baked into offerings, a 30-second search of the Internet will expose firsthand experiences of unexpected outcomes when invoking AI. Brand reputation is at stake, and false information can be hallucinated or coerced out of LLMs by determined parties.

By combining the ability to ingest globally dispersed users at high speed and apply a first level of security protections, F5 Distributed Cloud can be leveraged as an onboarding point for LLM workloads. As depicted in this article, Prompt Security can in turn handle traffic egressing F5's distributed HTTPS load balancers and provide state-of-the-art AI safeguards, including sensitive data detection, content moderation and other OWASP-aligned mechanisms like jailbreak and prompt injection mitigation. Other deployment models exist: deploying Prompt Security's solution on-premises, self-hosting it in cloud tenants, and running the solution on Distributed Cloud CE nodes themselves are all supported.

Secure, Deliver and Optimize Your Modern Generative AI Apps with F5
In this demo, Foo-Bang Chan explores how F5's solutions can help you implement, secure, and optimize your chatbots and other AI applications. This will ensure they perform at their best while protecting sensitive data. One of the AI frameworks shown is Enterprise Retrieval-Augmented Generation (RAG). This demo leverages F5 Distributed Cloud (XC) AppStack, Distributed Cloud WAAP, NGINX Plus as API Gateway, API Discovery, API Protection, LangChain, vector databases, and Flowise AI.

OWASP Tactical Access Defense Series: Broken Function Level Authorization (BFLA)
Broken Function Level Authorization (BFLA) is a type of security vulnerability in web applications where an attacker can access functionality or perform actions they should not be authorized to perform. This problem happens when an application doesn't check access control on functions or endpoints correctly, letting users do things that are not allowed. In this article, we go through the API5 item from the OWASP Top 10 API Security Risks and explore the role of F5 BIG-IP Access Policy Manager (APM) in our arsenal.

Let's consider our test application, which allows each retail agent to submit their sales data, but without the ability to retrieve any data from the system. In HTTP terms, the retail agent can POST but is not allowed to perform GET, while the manager can perform GET to check agents' performance and the collected data (a request-level sketch of this distinction appears after the related content below).

Mitigating Risks with BIG-IP APM

BIG-IP APM per-request granularity: with per-request granularity, organizations can dynamically enforce access policies based on various factors such as user identity, device characteristics, and contextual information. This enables organizations to implement fine-grained access controls at the API level, mitigating the risks associated with Broken Function Level Authorization.

Key Features:

- Dynamic Access Control Policies: BIG-IP APM empowers organizations to define dynamic access control policies that adapt to changing conditions in real time. By evaluating each API request against these policies, BIG-IP APM ensures that authorized users can only perform the specific authorized functions (actions) on specified resources.
- Granular Authorization Rules: BIG-IP APM enables organizations to define granular authorization rules that govern access to individual objects or resources within the API ecosystem. By enforcing strict permission checks at the object level, F5 BIG-IP APM prevents unauthorized functions.

Related Content

- F5 BIG-IP Access Policy Manager | F5
- Introduction to OWASP API Security Top 10 2023
- OWASP Top 10 API Security Risks – 2023 - OWASP API Security Top 10
- API Protection Concepts
- OWASP Tactical Access Defense Series: How BIG-IP APM Strengthens Defenses Against OWASP Top 10
- OWASP Tactical Access Defense Series: Broken Object-Level Authorization and BIG-IP APM
- F5 Hybrid Security Architectures (Part 5 - F5 XC, BIG-IP APM, CIS, and NGINX Ingress Controller)
- OWASP Tactical Access Defense Series: Broken Authentication and BIG-IP APM
- OWASP Tactical Access Defense Series: Broken Object Property-Level Authorization and BIG-IP APM
- OWASP Tactical Access Defense Series: Unrestricted Resource Consumption
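As referenced above, here is a request-level sketch of the agent/manager example; the /api/sales endpoint, hostname, and tokens are hypothetical, purely to illustrate what function-level enforcement should look like on the wire.

    # Hypothetical endpoint and tokens, for illustration only.
    # Retail agent: submitting sales data is allowed.
    curl -X POST https://api.example.com/api/sales \
         -H "Authorization: Bearer <agent-token>" \
         -d '{"item": "shoes", "amount": 120}'        # expected: 200/201

    # Retail agent: reading sales data should be rejected by function-level checks.
    curl -X GET https://api.example.com/api/sales \
         -H "Authorization: Bearer <agent-token>"      # expected: 403 Forbidden

    # Manager: reading sales data is allowed.
    curl -X GET https://api.example.com/api/sales \
         -H "Authorization: Bearer <manager-token>"    # expected: 200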
F5 Hybrid Security Architectures: Part 3 F5 XC API Protection and NGINX Ingress
Here in this example solution, we will be using DevSecOps practices to deploy an AWS Elastic Kubernetes Service (EKS) cluster running the Brewz test web application, serviced by F5 NGINX Ingress Controller for Kubernetes. For protection, we will provide API Discovery and Security with F5 Distributed Cloud's Web App and API Protection service.
Introduction: For those of you following along with the F5 Hybrid Security Architectures series, welcome back! If this is your first foray into the series and you would like some background, have a look at the intro article. This series uses the F5 Hybrid Security Architectures GitHub repo and CI/CD platform to deploy F5-based hybrid security solutions built on DevSecOps principles. This repo is a community-supported effort to provide not only a demo and workshop, but also a stepping stone for utilizing these practices in your own F5 deployments. If you find any bugs or have any enhancement requests, open an issue, or better yet, contribute!
API Security: APIs are an integral part of our daily routine, facilitating everything from critical to mundane tasks. From banking and ride-sharing apps to the weather updates we check before stepping out, APIs enable these functionalities. Given the sensitive nature of the data that can be exposed by unprotected APIs, the need for effective security cannot be stressed enough. With F5 Distributed Cloud Web App and API Protection, security teams can discover, inventory, and secure these critical APIs. In this example solution, we will be using DevSecOps practices to deploy an AWS Elastic Kubernetes Service (EKS) cluster running the Brewz test web application serviced by F5 NGINX Ingress Controller. To secure our application and APIs, we will deploy F5 Distributed Cloud's Web App and API Protection service. This provides API Discovery and Security as well as a traditional Web Application Firewall and Malicious User Detection.
Distributed Cloud WAAP: Available as a SaaS-based deployment, it provides comprehensive security solutions designed to safeguard web applications and APIs from a wide range of cyber threats. The solution uses a distributed cloud architecture, which enables it to provide real-time protection and scale to meet the needs of large enterprises.
NGINX Ingress Controller for Kubernetes: A lightweight software solution that helps manage app connectivity at the edge of a Kubernetes cluster by directing requests to the appropriate services and pods. It provides advanced load balancing, routing, identity, and security, as well as monitoring and observability features.
XC WAAP + NGINX Ingress Controller Workflow
GitHub Repo: F5 Hybrid Security Architectures
Prerequisites:
F5 Distributed Cloud Account (F5 XC)
Create an F5 XC API certificate
NGINX Ingress Controller license
AWS Account - due to the assets being created, the free tier will not work
Terraform Cloud Account
GitHub Account
Assets:
xc: F5 Distributed Cloud WAAP
nic: NGINX Ingress Controller
infra: AWS Infrastructure (VPC, IGW, etc.)
eks: AWS Elastic Kubernetes Service
brewz: Brewz SPA test web application
Tools:
Cloud Provider: AWS
Infrastructure as Code: Terraform
Infrastructure as Code State: Terraform Cloud
CI/CD: GitHub Actions
Terraform Cloud Workspaces: Create a workspace for each asset in the chosen workflow.
Workflow: xc-nic
Workspaces: infra, eks, nic, brewz, xc
Your Terraform Cloud console should resemble the following:
Variable Set: Create a Variable Set with the following values.
IMPORTANT: Ensure sensitive values are appropriately marked.
AWS_ACCESS_KEY_ID: Your AWS Access Key ID - Environment Variable
AWS_SECRET_ACCESS_KEY: Your AWS Secret Access Key - Environment Variable
AWS_SESSION_TOKEN: Your AWS Session Token - Environment Variable
VOLT_API_P12_FILE: Your F5 XC API certificate. Set this to api.p12 - Environment Variable
VES_P12_PASSWORD: Set this to the password you supplied when creating your F5 XC API key - Environment Variable
nginx_jwt: Your NGINX Java Web Token associated with your NGINX license - Terraform Variable
ssh_key: Your ssh key for access to created compute assets - Terraform Variable
tf_cloud_organization: Your Terraform Cloud Organization name - Terraform Variable
Your Variable Set should resemble the following:
GitHub Fork and Clone Repo: F5 Hybrid Security Architectures
Actions Secrets: Create the following GitHub Actions secrets in your forked repo
XC_P12: The base64 encoded F5 XC API certificate
TF_API_TOKEN: Your Terraform Cloud API token
TF_CLOUD_ORGANIZATION: Your Terraform Cloud Organization
TF_CLOUD_WORKSPACE_workspace: Create one for each workspace used in your workflow. EX: TF_CLOUD_WORKSPACE_XC would be created with the value xc
Your GitHub Actions Secrets should resemble the following:
Setup Deployment Branch and Terraform Local Variables:
Step 1: Check out a branch for the deploy workflow using the following naming convention.
xc-nic deployment branch: deploy-xcapi-nic
Step 2: Rename infra/terraform.tfvars.examples to infra/terraform.tfvars and add the following data:
#Global
project_prefix = "Your project identifier"
resource_owner = "You"
#AWS
aws_region = "Your AWS region" ex: us-west-1
azs = "Your AWS availability zones" ex: ["us-west-1a", "us-west-1b"]
#Assets
nic = true
nap = false
bigip = false
bigip-cis = false
Step 3: Rename xc/terraform.tfvars.examples to xc/terraform.tfvars and add the following data:
#XC Global
api_url = "https://.console.ves.volterra.io/api"
xc_tenant = "Your XC Tenant Name"
xc_namespace = "Your XC namespace"
#XC LB
app_domain = "Your App Domain"
#XC WAF
xc_waf_blocking = true
#XC AI/ML Settings for MUD, APIP - NOTE: Only set if using AI/ML settings from the shared namespace
xc_app_type = []
xc_multi_lb = false
#XC API Protection and Discovery
xc_api_disc = true
xc_api_pro = true
xc_api_spec = ["Path to uploaded API spec"] *See the note below for how to obtain this value.
#XC Bot Defense
xc_bot_def = false
#XC DDoS
xc_ddos = false
#XC Malicious User Detection
xc_mud = true
For the path to the API spec, navigate to Manage -> Files -> Swagger Files, click the three dots next to your OAS, and choose "Copy Latest Version's URL". Paste this into xc_api_spec in xc/terraform.tfvars.
Step 4: Modify line 16 in the .gitignore: comment out the *.tfvars line with # and save the file.
Step 5: Commit your changes.
Deployment:
Step 1: Push your deploy branch to the forked repo.
Step 2: Back in GitHub, navigate to the Actions tab of your forked repo and monitor your build.
Step 3: Once the pipeline completes, verify your assets were deployed to AWS and F5 XC.
Step 4: Check your Terraform Outputs for XC and verify your app is available by navigating to the FQDN.
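If you would rather script this final check than verify it in a browser, a single request against the application's FQDN confirms the deployment is serving traffic. This is an optional convenience and not part of the repo's pipeline; the hostname below is a placeholder for the app_domain value you set in xc/terraform.tfvars.

# Optional sanity check: confirm the deployed app answers on its FQDN.
# Replace the placeholder hostname with your own app_domain value.
import requests

app_fqdn = "https://brewz.example.com"  # placeholder, not a real deployment value

resp = requests.get(app_fqdn, timeout=10)
print(resp.status_code)  # expect 200 once the XC load balancer is active
resp.raise_for_status()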
API Discovery and Security Dashboards: After leaving the Brewz test app deployed for a while, we can start to see the API graph form. The F5 XC WAAP platform learns the schema structure of the API by analyzing sampled request data, then reverse-engineers the schema to generate an OpenAPI spec. The platform validates what is deployed versus what is discovered and tags any shadow APIs that are found. We can also check the dashboards for any attacks that may have occurred while we were waiting for discovery to finish. The internet being what it is, it didn't take long for the platform to protect us against some attacks.
Deployment Teardown:
Step 1: From your deployment branch, check out a branch for the destroy workflow using the following naming convention.
xc-nic destroy branch: destroy-xcapi-nic
Step 2: Push your destroy branch to the forked repo.
Step 3: Back in GitHub, navigate to the Actions tab of your forked repo and monitor your build.
Step 4: Once the pipeline completes, verify your assets were destroyed.
Conclusion: In this article we have shown how to utilize the F5 Hybrid Security Architectures GitHub repo and CI/CD pipeline to deploy a tiered security architecture utilizing F5 XC WAAP and NGINX Ingress Controller to protect a test API running in AWS EKS. While the code and security policies deployed are generic and not inclusive of all use cases, they can be used as a stepping stone for deploying F5-based hybrid architectures in your own environments. Workloads are increasingly deployed across multiple diverse environments and application architectures. Organizations need the ability to protect their essential applications regardless of deployment or architecture circumstances. Equally important is the need to deploy these protections with the same flexibility and speed as the apps they protect. With the F5 WAF portfolio, coupled with DevSecOps principles, organizations can deploy and maintain industry-leading security without sacrificing the time to value of their applications. Not only can Edge and Shift Left principles exist together, but they can also work in harmony to provide a more effective security solution.
Article Series:
F5 Hybrid Security Architectures (Intro - One WAF Engine, Total Flexibility)
F5 Hybrid Security Architectures (Part 1 - F5's Distributed Cloud WAF and BIG-IP Advanced WAF)
F5 Hybrid Security Architectures (Part 2 - F5's Distributed Cloud WAF and NGINX App Protect WAF)
F5 Hybrid Security Architectures (Part 3 - F5 XC API Protection and NGINX Ingress Controller)
F5 Hybrid Security Architectures (Part 4 - F5 XC BOT and DDoS Defense and BIG-IP Advanced WAF)
F5 Hybrid Security Architectures (Part 5 - F5 XC, BIG-IP APM, CIS, and NGINX Ingress Controller)
For further information or to get started:
F5 Distributed Cloud Platform (Link)
F5 Distributed Cloud WAAP Services (Link)
F5 Distributed Cloud WAAP YouTube series (Link)
F5 Distributed Cloud WAAP Get Started (Link)