APM Configuration to Support Duo MFA using iRule
Overview BIG-IP APM has supported Duo as an MFA provider for a long time with RADIUS-based integration. Recently, Duo has added support for Universal Prompt that uses Open ID Connect (OIDC) protocol to provide two-factor authentication. To integrate APM as an OIDC client and resource server, and Duo as an Identity Provider (IdP), Duo requires the user’s logon name and custom parameters to be sent for Authentication and Token request. This guide describes the configuration required on APM to enable Duo MFA integration using an iRule. iRules addresses the custom parameter challenges by generating the needed custom values and saving them in session variables, which the OAuth Client agent then uses to perform MFA with Duo. This integration procedure is supported on BIG-IP versions 13.1, 14.1x, 15.1x, and 16.x. To integrate Duo MFA with APM, complete the following tasks: 1. Choose deployment type: Per-request or Per-session 2. Configure credentials and policies for MFA on the DUO web portal 3. Create OAuth objects on the BIG-IP system 4. Configure the iRule 5. Create the appropriate access policy/policies on the BIG-IP system 6. Apply policy/policies and iRule to the APM virtual server Choose deployment type APM supports two different types of policies for performing authentication functions. Per-session policies: Per-session policies provide authentication and authorization functions that occur only at the beginning of a user’s session. These policies are compatible with most APM use cases such as VPN, Webtop portal, Remote Desktop, federation IdP, etc. Per-request policies: Per-request policies provide dynamic authentication and authorization functionality that may occur at any time during a user’s session, such as step-up authentication or auditing functions only for certain resources. These policies are only compatible with Identity Aware Proxy and Web Access Management use cases and cannot be used with VPN or webtop portals. This guide contains information about setting up both policy types. Prerequisites Ensure the BIG-IP system has DNS and internet connectivity to contact Duo directly for validating the user's OAuth tokens. Configure credentials and policies for MFA on Duo web portal Before you can protect your F5 BIG-IP APM Web application with Duo, you will first need to sign up for a Duo account. 1. Log in to the Duo Admin Panel and navigate to Applications. 2. Click Protect an application. Figure 1: Duo Admin Panel – Protect an Application 3. Locate the entry for F5 BIG-IP APM Web in the applications list and click Protect to get the Client ID, Client secret, and API hostname. You will need this information to configure objects on APM. Figure 2: Duo Admin Panel – F5 BIG-IP APM Web 4. As DUO is used as a secondary authentication factor, the user’s logon name is sent along with the authentication request. Depending on your security policy, you may want to pre-provision users in Duo, or you may allow them to self-provision to set their preferred authentication type when they first log on. To add users to the Duo system, navigate to the Dashboard page and click the Add New...-> Add User button. A Duo username should match the user's primary authentication username. Refer to the https://duo.com/docs/enrolling-users link for the different methods of user enrollment. Refer to Duo Universal Prompt for additional information on Duo’s two-factor authentication. 
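Before moving on to the BIG-IP configuration, it can save troubleshooting time to confirm the DNS and internet connectivity prerequisite from the BIG-IP itself. This is a minimal sketch run from the BIG-IP bash shell; the API hostname shown is a placeholder that should be replaced with the API hostname from your Duo application, and an HTTP error code from the endpoint is fine here since the goal is only to prove name resolution and TLS reachability.

# Resolve the Duo API hostname using the DNS servers configured on the BIG-IP
nslookup api-xxxxxxxx.duosecurity.com

# Confirm outbound TCP/443 reachability and TLS negotiation to Duo
# (/oauth/v1/token is the same path referenced later in the token-request configuration)
curl -vk https://api-xxxxxxxx.duosecurity.com/oauth/v1/token -o /dev/null

If name resolution fails, verify the DNS settings under System > Configuration > Device > DNS before continuing.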
Create OAuth objects on the BIG-IP system Create a JSON web key When APM is configured to act as an OAuth client or resource server, it uses JSON web keys (JWKs) to validate the JSON web tokens it receives from Duo. To create a JSON web key: 1. On the Main tab, select Access > Federation > JSON Web Token > Key Configuration. The Key Configuration screen opens. 2. To add a new key configuration, click Create. 3. In the ID and Shared Secret fields, enter the Client ID and Client Secret values respectively obtained from Duo when protecting the application. 4. In the Type list, select the cryptographic algorithm used to sign the JSON web key. Figure 3: Key Configuration screen 5. Click Save. Create a JSON web token As an OAuth client or resource server, APM validates the JSON web tokens (JWT) it receives from Duo. To create a JSON web token: 1. On the Main tab, select Access > Federation > JSON Web Token > Token Configuration. The Token Configuration screen opens. 2. To add a new token configuration, click Create. 3. In the Issuer field, enter the API hostname value obtained from Duo when protecting the application. 4. In the Signing Algorithms area, select from the Available list and populate the Allowed and Blocked lists. 5. In the Keys (JWK) area, select the previously configured JSON web key in the allowed list of keys. Figure 4: Token Configuration screen 6. Click Save. Configure Duo as an OAuth provider APM uses the OAuth provider settings to get URIs on the external OAuth authorization server for JWT web tokens. To configure an OAuth provider: 1. On the Main tab, select Access > Federation > OAuth Client / Resource Server > Provider. The Provider screen opens. 2. To add a provider, click Create. 3. In the Name field, type a name for the provider. 4. From the Type list, select Custom. 5. For Token Configuration (JWT), select a configuration from the list. 6. In the Authentication URI field, type the URI on the provider where APM should redirect the user for authentication. The hostname is the same as the API hostname in the Duo application. 7. In the Token URI field, type the URI on the provider where APM can get a token. The hostname is the same as the API hostname in the Duo application. Figure 5: OAuth Provider screen 8. Click Finished. Configure Duo server for APM The OAuth Server settings specify the OAuth provider and role that Access Policy Manager (APM) plays with that provider. It also sets the Client ID, Client Secret, and Client’s SSL certificates that APM uses to communicate with the provider. To configure a Duo server: 1. On the Main tab, select Access > Federation > OAuth Client / Resource Server > OAuth Server. The OAuth Server screen opens. 2. To add a server, click Create. 3. In the Name field, type a name for the Duo server. 4. From the Mode list, select how you want the APM to be configured. 5. From the Type list, select Custom. 6. From the OAuth Provider list, select the Duo provider. 7. From the DNS Resolver list, select a DNS resolver (or click the plus (+) icon, create a DNS resolver, and then select it). 8. In the Token Validation Interval field, type a number. In a per-request policy subroutine configured to validate the token, the subroutine repeats at this interval or the expiry time of the access token, whichever is shorter. 9. In the Client Settings area, paste the Client ID and Client secret you obtained from Duo when protecting the application. 10. From the Client's ServerSSL Profile Name, select a server SSL profile. Figure 6: OAuth Server screen 11. 
Click Finished. Configure an auth-redirect-request and a token-request Requests specify the HTTP method, parameters, and headers to use for the specific type of request. An auth-redirect-request tells Duo where to redirect the end-user, and a token-request accesses the authorization server for obtaining an access token. To configure an auth-redirect-request: 1. On the Main tab, select Access > Federation > OAuth Client / Resource Server > Request. The Request screen opens. 2. To add a request, click Create. 3. In the Name field, type a name for the request. 4. For the HTTP Method, select GET. 5. For the Type, select auth-redirect-request. 6. As shown in Figure 7, specify the list of GET parameters to be sent: request parameter with value depending on the type of policy For per-request policy: %{subsession.custom.jwt_duo} For per-session policy: %{session.custom.jwt_duo} client_id parameter with type client-id response_type parameter with type response-type Figure 7: Request screen with auth-redirect-request (Use “subsession.custom…” for Per-request or “session.custom…” for Per-session) 7. Click Finished. To configure a token-request: 1. On the Main tab, select Access > Federation > OAuth Client / Resource Server > Request. The Request screen opens. 2. To add a request, click Create. 3. In the Name field, type a name for the request. 4. For the HTTP Method, select POST. 5. For the Type, select token-request. 6. As shown in Figure 8, specify the list of POST parameters to be sent: client_assertion parameter with value depending on the type of policy For per-request policy: %{subsession.custom.jwt_duo_token} For per-session policy: %{session.custom.jwt_duo_token} client_assertion_type parameter with value urn:ietf:params:oauth:client-assertion-type:jwt-bearer grant_type parameter with type grant-type redirect_uri parameter with type redirect-uri Figure 8: Request screen with token-request (Use “subsession.custom…” for Per-request or “session.custom…” for Per-session) 7. Click Finished. Configure the iRule iRules gives you the ability to customize and manage your network traffic. Configure an iRule that creates the required sub-session variables and usernames for Duo integration. Note: This iRule has sections for both per-request and per-session policies and can be used for either type of deployment. To configure an iRule: 1. On the Main tab, click Local Traffic > iRules. 2. To create an iRules, click Create. 3. In the Name field, type a name for the iRule. 4. Copy the sample code given below and paste it in the Definition field. Replace the following variables with values specific to the Duo application: <Duo Client ID> in the getClientId function with Duo Application ID. <Duo API Hostname> in the createJwtToken function with API Hostname. For example, https://api-duohostname.com/oauth/v1/token. <JSON Web Key> in the getJwkName function with the configured JSON web key. Note: The iRule ID here is set as JWT_CREATE. You can rename the ID as desired. You specify this ID in the iRule Event agent in Visual Policy Editor. Note: The variables used in the below example are global, which may affect your performance. Refer to the K95240202: Understanding iRule variable scope article for further information on global variables, and determine if you use a local variable for your implementation. proc randAZazStr {len} { return [subst [string repeat {[format %c [expr {int(rand() * 26) + (rand() > .5 ? 
97 : 65)}]]} $len]] } proc getClientId { return <Duo Client ID> } proc getExpiryTime { set exp [clock seconds] set exp [expr $exp + 900] return $exp } proc getJwtHeader { return "{\"alg\":\"HS512\",\"typ\":\"JWT\"}" } proc getJwkName { return <JSON Web Key> #e.g. return "/Common/duo_jwk" } proc createJwt {duo_uname} { set header [call getJwtHeader] set exp [call getExpiryTime] set client_id [call getClientId] set redirect_uri "https://" set redirect [ACCESS::session data get "session.server.network.name"] append redirect_uri $redirect append redirect_uri "/oauth/client/redirect" set payload "{\"response_type\": \"code\",\"scope\":\"openid\",\"exp\":${exp},\"client_id\":\"${client_id}\",\"redirect_uri\":\"${redirect_uri}\",\"duo_uname\":\"${duo_uname}\"}" set jwt_duo [ ACCESS::oauth sign -header $header -payload $payload -alg HS512 -key [call getJwkName] ] return $jwt_duo } proc createJwtToken { set header [call getJwtHeader] set exp [call getExpiryTime] set client_id [call getClientId] set aud "<Duo API Hostname>/oauth/v1/token" #Example: set aud https://api-duohostname.com/oauth/v1/token set jti [call randAZazStr 32] set payload "{\"sub\": \"${client_id}\",\"iss\":\"${client_id}\",\"aud\":\"${aud}\",\"exp\":${exp},\"jti\":\"${jti}\"}" set jwt_duo [ ACCESS::oauth sign -header $header -payload $payload -alg HS512 -key [call getJwkName] ] return $jwt_duo } when ACCESS_POLICY_AGENT_EVENT { set irname [ACCESS::policy agent_id] if { $irname eq "JWT_CREATE" } { set ::duo_uname [ACCESS::session data get "session.logon.last.username"] ACCESS::session data set session.custom.jwt_duo [call createJwt $::duo_uname] ACCESS::session data set session.custom.jwt_duo_token [call createJwtToken] } } when ACCESS_PER_REQUEST_AGENT_EVENT { set irname [ACCESS::perflow get perflow.irule_agent_id] if { $irname eq "JWT_CREATE" } { set ::duo_uname [ACCESS::session data get "session.logon.last.username"] ACCESS::perflow set perflow.custom [call createJwt $::duo_uname] ACCESS::perflow set perflow.scratchpad [call createJwtToken] } } Figure 9: iRule screen 5. Click Finished. Create the appropriate access policy/policies on the BIG-IP system Per-request policy Skip this section for a per-session type deployment The per-request policy is used to perform secondary authentication with Duo. Configure the access policies through the access menu, using the Visual Policy Editor. The per-request access policy must have a subroutine with an iRule Event, Variable Assign, and an OAuth Client agent that requests authorization and tokens from an OAuth server. You may use other per-request policy items such as URL branching or Client Type to call Duo only for certain target URIs. Figure 10 shows a subroutine named duosubroutine in the per-request policy that handles Duo MFA authentication. Figure 10: Per-request policy in Visual Policy Editor Configuring the iRule Event agent The iRule Event agent specifies the iRule ID to be executed for Duo integration. In the ID field, type the iRule ID as configured in the iRule. Figure 11: iRule Event agent in Visual Policy Editor Configuring the Variable Assign agent The Variable Assign agent specifies the variables for token and redirect requests and assigns a value for Duo MFA in a subroutine. This is required only for per-request type deployment. Add sub-session variables as custom variables and assign their custom Tcl expressions as shown in Figure 12. 
subsession.custom.jwt_duo_token = return [mcget {perflow.scratchpad}]
subsession.custom.jwt_duo = return [mcget {perflow.custom}]

Figure 12: Variable Assign agent in Visual Policy Editor

Configuring the OAuth Client agent
An OAuth Client agent requests authorization and tokens from the Duo server. Specify OAuth parameters as shown in Figure 13. In the Server list, select the Duo server to which the OAuth client directs requests. In the Authentication Redirect Request list, select the auth-redirect-request configured earlier. In the Token Request list, select the token-request configured earlier. Some deployments may not need the additional information provided by OpenID Connect; in that case, you can disable it.

Figure 13: OAuth Client agent in Visual Policy Editor

Per-session policy
Configure the per-session policy as appropriate for your chosen deployment type.
Per-request: The per-session policy must contain at least one logon page to set the username variable in the user's session. Preferably it should also perform some type of primary authentication. This validated username is used later in the per-request policy.
Per-session: The per-session policy is used for all authentication. A per-request policy is not used.
Figures 14a and 14b show a per-session policy that runs when a client initiates a session. Depending on the actions you include in the access policy, it can authenticate the user and perform actions that populate session variables with data for use throughout the session.

Figure 14a: Per-session policy in Visual Policy Editor performing both primary authentication and Duo authentication (per-session use case)
Figure 14b: Per-session policy in Visual Policy Editor performing primary authentication only (per-request use case)

Apply policy/policies and iRule to the APM virtual server
Finally, apply the per-request policy, per-session policy, and iRule to the APM virtual server. You assign the iRule as a resource on the virtual server to which users connect. Configure the virtual server's default pool to point to the protected local web resource.

Apply policy/policies to the virtual server
Per-request policy
To attach policies to the virtual server:
1. On the Main tab, click Local Traffic > Virtual Servers.
2. Select the virtual server.
3. In the Access Policy section, select the policy you created.
4. Click Finished.
Figure 15: Access Policy section in Virtual Server (per-request policy)

Per-session policy
Figure 16 shows the Access Policy section in Virtual Server when the per-session policy is deployed.
Figure 16: Access Policy section in Virtual Server (per-session policy)

Apply iRule to the virtual server
To attach the iRule to the virtual server:
1. On the Main tab, click Local Traffic > Virtual Servers.
2. Select the virtual server.
3. Select the Resources tab.
4. Click Manage in the iRules section.
5. Select the iRule from the Available list and add it to the Enabled list.
6. Click Finished.

Architecture Options for Kubernetes Service Discovery in Distributed Cloud
The F5 Distributed Cloud (XC) Virtual Edition (VE) Customer Edge (CE) platform can be deployed within your data center or cloud environment. It can perform service discovery for services in your Kubernetes (K8s) clusters.

Why do Service discovery?
Service discovery is important in systems that change and move around, like microservices architectures. It helps find and connect services automatically. Instead of hard coding network locations, service discovery makes sure that services can easily find and communicate with each other, even when they scale or change locations. This improves scalability and resilience, and simplifies managing services in complex environments like Kubernetes or cloud infrastructures. By reducing manual intervention, service discovery enhances the overall efficiency and reliability of application deployments.

The F5 Distributed Cloud (XC) CE can use the native kube-apiserver, or Consul, to query for services as they come online, enabling admins to reference these discovered services. These services become XC origin pool definitions and can then be published locally through a proxy (HTTP load balancer) on the CE itself or via our Global Application Delivery Network (ADN) (Regional Edge deployment). The F5 XC Load Balancer does more than just balance packets. It offers a set of SaaS security services that are easy to use. Customers can have a globally redundant layer of security while serving content from private K8s clusters.

This write-up covers two distinct service discovery architecture options available with XC:
Secure K8s Gateway (VE CE)
Kubernetes Sitetype Customer Edge (K8s sitetype CE)
Depending on your service discovery use case, you may end up with one of these two options or both, as they are not mutually exclusive. The first option, using the CE as a Secure K8s Gateway, may be the easier option for folks not particularly versed in the nuances of Kubernetes.

Architecture 1: Virtual Edition Customer Edge (VE CE)
If a picture is worth a thousand words, then a working lab environment is worth a million. This repo walks through an entire Secure K8s GW setup and will leave you with a config that could easily be expanded upon. You can quickly build a PoC and start getting familiar with these modern app capabilities by using these tools. The readme includes details on how to use everything and what functions the various tools provide. It's all shell script and YAML, so it's very easy to read through these and understand what's going on.

https://github.com/dober-man/ve-ce-secure-k8s-gw

This repo is designed to automate the deployment and configuration of a secure Kubernetes Gateway in the F5 Distributed Cloud (F5 XC) environment. It provides scripts and YAML configurations to set up secure communication, networking policies, and infrastructure components required to establish a secure gateway in a Kubernetes cluster. The readme file also documents the pre-reqs, but you will at a minimum need an XC tenant, an XC Virtual Edition CE, and an Ubuntu 22.04 server. If you do not have an XC tenant or VE CE, reach out to your local F5 account team. Please use the issues feature of GitHub to report any discrepancies with the builds or documentation.

Architecture 2: Kubernetes Sitetype Customer Edge (K8s sitetype CE)
In this architecture, the entire CE runs as a service within the cluster. This model requires a bit more fundamental knowledge of Kubernetes and more tightly integrates with the cluster.
You can quickly build a PoC and start getting familiar with these modern app capabilities by using this repo.

https://github.com/dober-man/k8s-sitetype-ce

This repo is focused on automating the deployment of the k8s-sitetype CE in a Kubernetes cluster. It provides scripts to simplify the process of setting up a secure site gateway for handling network traffic between cloud environments, on-premises infrastructure, and edge locations. The readme file documents the pre-reqs, but you will at a minimum need an XC tenant and an Ubuntu 22.04 server. If you do not have an XC tenant, reach out to your local F5 account team. Please use the issues feature of GitHub to report any discrepancies with the builds or documentation.

Summary - F5 Distributed Cloud offers a number of Kubernetes integration options for service discovery, but also offers several other capabilities including Virtual K8s (Namespace as a Service) and Managed K8s, which will be covered in future articles. Please feel free to drop a like or leave a comment below.

How to secure egress with F5 Service Proxy for Kubernetes
Outline:
- Securing Egress Challenges
- How F5 can help
- Technical bit on how it works

Getting traffic into your clusters to your workloads is just a small part of the cluster admin's tasks, and there are many options available. Controlling the packets going out is harder and often ignored. This makes your clusters more vulnerable to security risks because they don't follow the same strict rules as your traditional networks. This article will dive deeper into how SPK can control traffic exiting your clusters, even when your application workload uses multus to attach additional external interfaces.

Secure Egress Challenges
By default, a pod deployed using calico CNI will follow the default route to get out of the cluster. Traffic will look like it's coming from the worker host's external IP address on the management interface. While Kubernetes NetworkPolicies can be used for egress, it becomes painful to manage the lifecycle of hundreds or thousands of policies across all namespaces as the cluster grows. If you deploy a pod with multus interfaces, as commonly seen with telco applications, you add another way for that pod to bypass any NetworkPolicies applied within the cluster. What if there was a way to manage egress dynamically (as pods are spun up and down) and easily, so that the cluster admin could centrally configure and control traffic flowing out of the cluster?

How F5 can help
Service Proxy for Kubernetes (SPK) is a cloud-native application traffic management solution, designed for communication service provider (CoSP) 5G networks and other application workloads. With SPK and its Calico egress gateway feature, managing a pod's default calico network interface as well as any multus interfaces becomes easy and consistent with the CSRC daemonset. Kernel routes are automatically configured so that the pod's traffic will always be routed via the SPK pod, where you can apply consistent, namespace-aware network policies, source NAT translation, and other controls. If the "watched" application workload is deleted, the corresponding host rules also get removed.

Technical Overview
This section will provide an overview of how to configure the above scenario.

Host Prerequisites
On the host, two shims of type macvlan bridge are created on physical interfaces: one for the application pod's calico traffic and one for the macvlan traffic, which will forward packets on to SPK. These interfaces allow connectivity to the SPK's "internal" and "external2" interfaces, respectively.

ip link add spk-shim link ens224 type macvlan mode bridge
ip addr add 10.1.30.244/24 dev spk-shim
ip link set spk-shim up

ip link add spk-shim2 link ens256 type macvlan mode bridge
ip addr add 10.1.10.244/24 dev spk-shim2
ip link set spk-shim2 up

Application Prerequisites and Configuration
In the SPK controller values.yaml file, configure your application workload namespaces in the watchNamespace block.

watchNamespace:
  - "spk-apps"
  - "spk-apps-2"

Since we want SPK to do the source NAT for pod egress traffic, we create an IPPool with natOutgoing set to false. This IPPool will be used by the applications.

apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
  name: app-ip-pool
spec:
  cidr: 10.124.0.0/16
  ipipMode: Always
  natOutgoing: false

Ensure that the application namespaces are annotated like below to use the IPPool.

kubectl annotate namespace spk-apps "cni.projectcalico.org/ipv4pools"=[\"app-ip-pool\"]
kubectl annotate namespace spk-apps-2 "cni.projectcalico.org/ipv4pools"=[\"app-ip-pool\"]

Deploy your application.
See below for an example deployment manifest for the application. Note that I'm attaching a secondary macvlan interface, which is in addition to the default calico interface. It will get an IP address automatically as configured in the corresponding NetworkAttachmentDefinition. Note the specific labels used by SPK, which allows you to enable traffic routing to SPK on a per application basis. Additionally, the enableSecureSPK=true label will instruct SPK to create additional listeners that will pick up traffic coming from the pod's secondary macvlan interface. (Will show these listeners later) apiVersion: apps/v1 kind: Deployment metadata: name: nginx annotations: spec: selector: matchLabels: app: nginx replicas: 2 template: metadata: annotations: k8s.v1.cni.cncf.io/networks: '[ { "name": "macvlan-conf-ens256-myapp1" } ]' labels: app: nginx enableSecureSPK: "true" enablePseudoCNI: "true" secureSPKPort: "8050" secureSPKCNFPodIfName: "net1" secondaryCNINodeIfName: "spk-shim2" primaryCNINodeIfName: "spk-shim" secureSPKNetAttachDefName: "macvlan-conf-ens256" secureSPKEgressVlanName: "external" SPK Configuration Deploy the custom resource that will configure a listener that does two things: listen for traffic coming from the internal vlan, or the calico interface of targeted application pods SNAT the traffic so that the source IP is an IP address of SPK apiVersion: "k8s.f5net.com/v1" kind: F5SPKEgress metadata: name: egress-crd namespace: ns-f5-spk spec: #leave commented out for snat automap #egressSnatpool: "snatpool-1" dualStackEnabled: false maxTmmReplicas: 1 vlans: vlanList: [internal] disableListedVlans: false Next, we deploy the CSRC Daemonset that dynamically creates the kernel rules and routes for us. Note that I am setting the daemonsetMode to "pseudoCNI" which means I want to route both primary (calico) and secondary interface traffic to SPK. values-csrc.yaml image: repository: gitlab.tky.lab:5050/registry/spk/200 # daemonset mode, regular, secureSPK, or pseudoCNI #daemonsetMode: "regular" daemonsetMode: "pseudoCNI" ipFamily: "ipv4" imageCredentials: name: f5-common-pull-creds config: iptableid: 200 interfacename: "spk-shim" #tmmpodinterfacename: "internal" json: ipPoolCidrInfo: cidrList: - name: cluster-cidr0 value: "172.21.107.192/26" - name: cluster-cidr1 value: "172.21.66.0/26" - name: cluster-cidr2 value: "10.124.18.192/26" - name: node-cidr0 value: "10.1.11.0/24" - name: node-cidr1 value: "10.1.10.0/24" ipPoolList: - name: default-ipv4-ippool value: "172.21.64.0/18" - name: spk-app1-pool value: "10.124.0.0/16" Testing You can then log onto the worker node that is hosting the applications and confirm the routes and rules are created. Essentially, the rules are making calico interfaces use a custom route table that ensures that the default route is via the SPK. # ip rule 0: from all lookup local 32254: from all to 172.21.107.192/26 lookup main 32254: from all to 172.21.66.0/26 lookup main 32254: from all to 10.124.18.192/26 lookup main 32254: from all to 172.28.15.0/24 lookup main 32254: from all to 10.1.10.0/24 lookup main 32257: from 10.124.18.207 lookup ns-f5-spkshim1ipv4257 <--match on app pod1 calico IP!!! 32257: from 10.124.18.211 lookup ns-f5-spkshim1ipv4257 <--match on app pod2 calico IP!!! 32258: from 10.1.10.171 lookup ns-f5-spkshim2ipv4258 <--match on app pod1 macvlan IP!!! 32258: from 10.1.10.170 lookup ns-f5-spkshim2ipv4258 <--match on app pod2 macvlan IP!!! 
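Before moving on to testing, it helps to confirm that the egress custom resource was accepted and that the CSRC daemonset is actually running on every worker node, since the kernel rules and routes are only programmed where a CSRC pod is present. This is a quick, generic sketch using standard kubectl commands; the namespace ns-f5-spk matches the one used elsewhere in this walkthrough, and the exact custom resource name on your cluster can be confirmed with kubectl api-resources.

# Confirm the CSRC daemonset has one ready pod per worker node
kubectl get daemonset -n ns-f5-spk -o wide
kubectl get pods -n ns-f5-spk -o wide

# List the F5-provided custom resource types and verify the egress object exists
kubectl api-resources | grep -i f5
kubectl get f5spkegress -n ns-f5-spk

If the daemonset shows fewer ready pods than worker nodes, check for node selectors or taints before expecting the ip rules shown below to appear.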
32766: from all lookup main 32767: from all lookup default # ip route show table ns-f5-spkshim1ipv4257 default via 10.1.30.242 dev shim1 10.1.30.242 via 10.1.30.242 dev shim1 # ip route show table ns-f5-spkshim2ipv4258 default via 10.1.10.160 dev shim2 10.1.10.160 via 10.1.10.160 dev shim2 If I then try to execute a curl command towards a server that exists in a network segment beyond SPK, the application pod will hit the CSRC-configured ip rule and then forwarded to its new default gateway, which is SPK. Since SPK has Source NAT enabled, the "Client IP" from the server perspective is the self-IP of SPK. This means you can apply firewall policies to application workloads in a deterministic way as well as have visibility into what kind of traffic is coming out of your clusters. k exec -it nginx-7d7699f86c-hsx48 -n my-app1 -- curl 10.1.70.30 ================================================ ___ ___ ___ _ | __| __| | \ ___ _ __ ___ /_\ _ __ _ __ | _||__ \ | |) / -_) ' \/ _ \ / _ \| '_ \ '_ \ |_| |___/ |___/\___|_|_|_\___/ /_/ \_\ .__/ .__/ |_| |_| ================================================ Node Name: F5 Docker vLab Short Name: server.tky.f5se.com Server IP: 10.1.70.30 Server Port: 80 Client IP: 10.1.30.242 Client Port: 59248 Client Protocol: HTTP Request Method: GET Request URI: / host_header: 10.1.70.30 user-agent: curl/7.88.1 A simple tcpdump command run in the debug container of SPK confirms that the pod's calico interface IP (10.124.18.192) is the source IP of the incoming traffic on SPK, and after Source NAT is applied using the self-IP of SPK (10.1.30.242), the packet is sent out towards the server. /tcpdump -nni 0.0 tcp port 80 ----snip---- 12:34:51.964200 IP 10.124.18.192.48194 > 10.1.70.30.80: Flags [P.], seq 1:75, ack 1, win 225, options [nop,nop,TS val 4077628853 ecr 777672368], length 74: HTTP: GET / HTTP/1.1 in slot1/tmm0 lis=egress-ipv4 port=1.1 trunk= ----snip---- 12:34:51.964233 IP 10.1.30.242.48194 > 10.1.70.30.80: Flags [P.], seq 1:75, ack 1, win 225, options [nop,nop,TS val 4077628853 ecr 777672368], length 74: HTTP: GET / HTTP/1.1 out slot1/tmm0 lis=egress-ipv4 port=1.1 trunk= Let's take a look at egress application traffic that is using the secondary macvlan interface. In this case, I have not configured Source NAT so SPK will forward the traffic out, retaining the original pod IP. k exec -it nginx-7d7699f86c-g4hpv -n my-app1 -- curl 10.1.80.30 ================================================ ___ ___ ___ _ | __| __| | \ ___ _ __ ___ /_\ _ __ _ __ | _||__ \ | |) / -_) ' \/ _ \ / _ \| '_ \ '_ \ |_| |___/ |___/\___|_|_|_\___/ /_/ \_\ .__/ .__/ |_| |_| ================================================ Node Name: F5 Docker vLab Short Name: ue-client3 Server IP: 10.1.80.30 Server Port: 80 Client IP: 10.1.10.170 Client Port: 56436 Client Protocol: HTTP Request Method: GET Request URI: / host_header: 10.1.80.30 user-agent: curl/7.88.1 Another tcpdump command run in the debug container of SPK shows that it receives the above GET request and sends it out without Source NAT in this case. 
/tcpdump -nni 0.0 tcp port 80
----snip----
13:54:40.389281 IP 10.1.10.170.56436 > 10.1.80.30.80: Flags [P.], seq 1:75, ack 1, win 229, options [nop,nop,TS val 4087715696 ecr 61040149], length 74: HTTP: GET / HTTP/1.1 in slot1/tmm0 lis=secure-egress-ipv4-virtual-server port=1.2 trunk=
----snip----
13:54:40.389305 IP 10.1.10.170.56436 > 10.1.80.30.80: Flags [P.], seq 1:75, ack 1, win 229, options [nop,nop,TS val 4087715696 ecr 61040149], length 74: HTTP: GET / HTTP/1.1 out slot1/tmm0 lis=secure-egress-ipv4-virtual-server port=1.2 trunk=

You can use the familiar tmctl command inside the debug container of SPK to confirm the statistics for both listeners that process the pod's primary (egress-ipv4) and secondary (secure-egress-ipv4-virtual-server) interface egress traffic.

/tmctl -f /var/tmstat/blade/tmm0 virtual_server_stat -s name,clientside.bytes_in,clientside.bytes_out,no_staged_acl_match_accept -w 200
name                                clientside.bytes_in clientside.bytes_out no_staged_acl_match_accept
----------------------------------- ------------------- -------------------- --------------------------
secure-egress-ipv4-virtual-server                   394                  996                          1
egress-ipv4                                         394                 1011                          1

Now that you have egress traffic routed to the SPK data plane pods, you can use the below F5-published custom resource definitions (CRDs) to apply granular access control lists (ACLs) to meet your security requirements. The firewall configuration is defined as code (YAML manifests), so it natively integrates with K8s and is portable across clusters.

F5BigContextGlobal: CRD to define the default global firewall behavior and reference the firewall policy.
F5BigFwPolicy: CRD to define your firewall rules.

In summary, the above diagrams and configuration snippets show how SPK can capture all egress traffic in a dynamic way so that you don't have to sacrifice security and control in your ever-changing Kubernetes clusters.

What is Web Cache Exploitation?
Let’s talk about Web Cache Exploitation. There was a presentation done at BlackHat/DefCon 2024 discussing this, and here is the link to a writeup done by the presenter: https://portswigger.net/research/gotta-cache-em-all That article details how different HTTP servers and proxies react when presented with specially crafted URLs. These discrepancies have the potential to be used for use in different types of web cache attacks. My goal here is to give a brief overview and discuss further about how NGINX can be involved in this as well as mitigations that are possible. As such, it is a good idea to reference that article as I am only summarizing pieces of it here. Especially since the researcher did such a great job of writing this up. Definitions: First, here are a few terms that will be used in this article: Web caching — the process of storing copies of web files either on the user’s device or in a third-party device such as a proxy or Content Delivery Network (CDN). The purpose of this is to speed up the serving of static content by presenting it from the store instead of the backend server. This saves time and resources. Web caches use keys to determine which responses should be stored or not. These usually use the URL in some fashion, then map to the stored response. Web Cache Poisoning — the act of inserting fake content into the cache, causing clients to pull content they were not intending to inadvertently. Web Cache Deception — the act of tricking the backend server to place dynamic content into a cache thinking that it was static. This can be especially bad if the data is intended for an authenticated user. Delimiters — one or more characters in a sequence that indicate a separation (end/beginning) of the elements in a stream of text or data. An example of this could be the question mark in a URI indicating that a query is starting. Normalization - concerning web traffic, the process of standardizing data for consistency across network paths. We see this a lot with web traffic using % notation for certain characters, such as %20 for a space. Detecting Delimiters and Normalization: The article describes that the RFC (https://datatracker.ietf.org/doc/html/rfc3986) states which characters are used as delimiters. The issue is that the RFC is very permissive and allows each instance to add to that list. They then give a few examples of how to detect the delimiters that backend servers or caches use. This can then help to determine if there is a discrepancy between them. For example: the article shows sending a request for /home and then a request for /home$abcd to see if the response is the same or not. This can also be used to see if the cached request is served up when specific delimiters are used. The second discrepancy that the article discusses is with normalization. Using delimiters, the path is extracted and then it is normalized to determine any encoded character or dot-segments that may be used. I will explain what those are: Encoding is used sometimes when a delimiter character needs to be interpreted by the application rather than the HTTP parser. For example: %2F used instead of a forward slash /. Dot-segment normalization is a way to reference a resource from a relative path. Also referred to as a path traversal a lot of the time. For example: ../ used to move back to one directory. The RFC says how to code URLs and handle dot-segments. But it doesn’t say how a request should be forwarded or changed, which makes it hard to tell which vendors agree with each other. 
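The delimiter-detection approach described above is easy to turn into a quick probe. The sketch below is a minimal bash loop against a placeholder host you are authorized to test; it compares the response for a known page with responses for the same path plus a candidate delimiter and a junk suffix, so any match hints that the character is treated as a path delimiter somewhere in the chain.

#!/usr/bin/env bash
TARGET="https://target.example.com"   # placeholder - only probe systems you are authorized to test
BASELINE=$(curl -s "$TARGET/home" | md5sum | cut -d' ' -f1)

# Candidate delimiter characters to try between a real path and a junk suffix
for delim in '$' ';' '%0a' '%23'; do
  PROBE=$(curl -s "$TARGET/home${delim}abcd" | md5sum | cut -d' ' -f1)
  if [ "$PROBE" = "$BASELINE" ]; then
    echo "Possible path delimiter: $delim (response matches /home)"
  else
    echo "Not treated as a delimiter here: $delim"
  fi
done

For pages whose bodies change on every request, comparing status codes or content length instead of a full body hash reduces false negatives.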
Similar to what was done in the delimiter section, the article gives different examples of how to detect discrepancies in decoding behavior. For example: the article gives a table that lists different cache proxies as well as HTTP servers and how each treats a request for /hello..%2fworld. NGINX resolves this to /world whereas Apache does not normalize it at all. Deception: Cache rules are used to determine if a resource is static and should be stored or not. The discrepancies mentioned in the last section can be leveraged to exploit cache rules possibly leading to dynamic content being stored. The article describes different data attributes that cache proxies may use to determine if a resource is static or not. These include static extensions, static directories, and static files. Static extensions may include file types such as .css, .js, .pdf, and more. Some proxies may have rules setup that cause these extensions to allow caching. An example given in the article is where the dollar sign is a delimiter on the backend server but not the proxy. This can cause the response to a specific path to be cached when it should not be. Normalization discrepancies can be used to exploit this as well by encoding a delimiter. Example: request for /account$static.css will be stored by the proxy due to the .css extension, but due to the delimiter, the response from the backend is for /account which may be a client's authorized account data. Static directory rules are those that match the path used for the request. Some common examples are /static, /shared, /media, and more.. This is similar to static extensions, where delimiter discrepancies and normalization discrepancies can be used for exploitation. This involves hiding a path traversal after a character that is a delimiter on the backend server. The static directory is then placed after the path traversal, causing the proxy to resolve it but not the backend server. Example: request: /account$/..%2Fstatic/any cache proxy sees: /static/any backend server sees: /account Static files are files that may not necessarily be in a static directory or have a static extension but are expected to stay static on every site. Examples of these files are /robots.txt or /favicon.ico. Exploiting these types of rules is similar to how static directories are exploited. In other words, this example would look like the previous except replace 'static/any' with 'robots.txt'. Poisoning: If the attacker can get a cache to store a specific response to the key that the cache is using, then they can steer users to that response when they visit. Delimiters and normalization can be exploited to carry out cache poisoning. By combining these with cache poisoning, it could be possible to modify a cache key to point to a highly visited site. There are many ways to combine these to try and use this. These include key normalization and delimiters used by both the backend server and the cache on the frontend. Key normalization may happen before the cache key is generated. This can allow for poisoning of the mapped resource if the backend server is interpreting the path differently. This is similar to our above example for static directories. If a path traversal is placed between the path for the backend server and the path you want cached, you may be able to map one to the other. Example: URL: /path/../../home Cache Key: /home Backend Server: /path As this shows, it is possible to create the cache with a key pointing to /home but returns the response for /path. 
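A quick way to confirm whether one of these static-extension or static-directory rules is actually caching a dynamic page is to request the crafted URL twice and watch the caching headers. This is a small illustrative sketch against a placeholder host; header names vary by cache vendor, so Age, X-Cache, and CF-Cache-Status are just common examples.

# First request primes the cache, second request shows whether it was served from cache
for i in 1 2; do
  curl -s -o /dev/null -D - 'https://target.example.com/account$static.css' \
    | grep -iE '^(HTTP|age|x-cache|cf-cache-status|cache-control)'
  echo "---"
done

If the second response shows an Age header or an X-Cache: HIT while the body still contains account data, the dynamic response has been deceived into the cache.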
So, when a user visits /home they will not receive the page expected, but instead they will get the page that the malicious actor wanted them to get. Server delimiters can be used for this when the cache is not using the same delimiter. This allows for the creation of a key for the response as the delimiter will prevent the backend server from fully resolving the path. This is similar to the last example, but with the delimiter placed before the path traversal. Example: URL: /path$/../home Cache Key: /home Backend Server: /path Cache delimiters are harder since special characters that the browser will allow are harder to find for web caches. The pound sign can do this, though, as some caches use it as a delimiter. This is similar to the previous example but would be the other way around as the backend server path would be last after the traversal. Example: URL: /path#../home Cache Key: /path Backend Server: /home Mitigation/Defense: The first thing to note is that none of this means that vendors are doing anything wrong with their products. The differences in how each handles normalization and delimiters is expected given the freedom to add their own options. Also, I mentioned that I would further discuss how NGINX could be involved in these kinds of attacks. Naturally, as NGINX can be used as a proxy and a web server, it can be involved in these types of transactions. So it really falls on how NGINX handles normalization and delimiters when compared to a web cache being used in the same path. The author of that article does a great job of comparing multiple vendors for backend servers, CDNs, and frameworks. The first defense would be to try and use products that will align in how they parse data to try and prevent as many opportunities as possible for this to happen. The next defense and probably the best design choice would be to add a cache control to your pages to prevent caching of pages that should never be cached. This would mean adding a 'Cache-Control' header with values of 'no-store' and 'private' to any dynamically generated responses. Then also ensure that any of the cache rules cannot override the header that is set. Another option would be to add a WAF into the path of the traffic. Just looking at a lot of the requests used in these examples, I can see that ASM/Advanced WAF or NGINX App Protect would be pretty effective at stopping a lot of these requests. Path traversal and meta-character One thing that was discussed in the article in regard to NGINX was how it handles the newline-encoded byte (%0A) in a rewrite rule. This byte is used as a path delimiter in NGINX. A common use of the rewrite rule is to use the regex of (.*) to write the rest of the path to then new location. For example: rewrite /path/.(*) /newpath/$1 break; This will work in most situations, but if the newline byte is added then it will stop at that delimiter. For example: /path/test%0abcde ---> /newpath/test You can see how it gets cut off after the encoded byte is hit. I did some research on this and found a similar situation with the return rule in NGINX.https://reversebrain.github.io/2021/03/29/The-story-of-Nginx-and-uri-variable/ This blog shows how the Carriage Return Line Feed (CRLF) can be used to inject a header into the response. 
I tested this by firing up an NGINX container, and adding a location configuration to my nginx.conf file like this: server { location /static/ { return 302 http://localhost$uri; } I then send a request with the encoded CRLF (%0D%0A) and then the header I want injected after that: curl "http://127.0.0.1:8081/static/%0d%0aX-Foo:%20CLRF" -v * Trying 127.0.0.1:8081... * Connected to 127.0.0.1 (127.0.0.1) port 8081 > GET /static/%0d%0aX-Foo:%20CLRF HTTP/1.1 > Host: 127.0.0.1:8081 > User-Agent: curl/8.6.0 > Accept: */* > < HTTP/1.1 302 Moved Temporarily < Server: nginx/1.27.0 < Date: Thu, 15 Aug 2024 18:15:46 GMT < Content-Type: text/html < Content-Length: 145 < Connection: keep-alive < Location: http://localhost/static/ < X-Foo: CLRF <-----header injected < <html> <head><title>302 Found</title></head> <body> <center><h1>302 Found</h1></center> <hr><center>nginx/1.27.0</center> </body> </html> * Connection #0 to host 127.0.0.1 left intact That blog also describes how to avoid that happening by changing the return directive to use $request_uri instead of $uri or $document_uri. This made me wonder if it was possible to similarly modify the rewrite directive to avoid the issue with the newline-encoded byte being used as a path delimiter. After searching, I found this page in GitHub:https://github.com/kubernetes/ingress-nginx/issues/11607 Which then links to: https://trac.nginx.org/nginx/ticket/2452 These pages are discussing this issue with using the newline-encoded byte as a delimiter. The response in the ticket was to use this regex (?s) to enable single-line mode. I re-configured my NGINX container to add another couple of locations so I could test this: server { location /static/ { return 302 http://localhost$uri; } location /user/ { rewrite /user/(.*) /account/$1 redirect; } location /test/ { rewrite /test/(?s)(.*) /account/$1 redirect; } So now I have two rewrite directives, one for testing the issue and one for testing the workaround. Now send a request and see if it works. curl "http://127.0.0.1:8081/user/%0d%0aX-Foo:%20CLRF" -v * Trying 127.0.0.1:8081... * Connected to 127.0.0.1 (127.0.0.1) port 8081 > GET /user/%0d%0aX-Foo:%20CLRF HTTP/1.1 > Host: 127.0.0.1:8081 > User-Agent: curl/8.6.0 > Accept: */* > < HTTP/1.1 302 Moved Temporarily < Server: nginx/1.27.0 < Date: Thu, 15 Aug 2024 18:56:48 GMT < Content-Type: text/html < Content-Length: 145 < Location: http://127.0.0.1/account/%0D <---Newline delimiter was hit. < Connection: keep-alive < <html> <head><title>302 Found</title></head> <body> <center><h1>302 Found</h1></center> <hr><center>nginx/1.27.0</center> </body> </html> * Connection #0 to host 127.0.0.1 left intact For the first test, it cutoff at the newline-encoded byte as expected. Now to test the workaround. curl "http://127.0.0.1:8081/test/%0d%0aX-Foo:%20CLRF" -v * Trying 127.0.0.1:8081... * Connected to 127.0.0.1 (127.0.0.1) port 8081 > GET /test/%0d%0aX-Foo:%20CLRF HTTP/1.1 > Host: 127.0.0.1:8081 > User-Agent: curl/8.6.0 > Accept: */* > < HTTP/1.1 302 Moved Temporarily < Server: nginx/1.27.0 < Date: Thu, 15 Aug 2024 19:32:50 GMT < Content-Type: text/html < Content-Length: 145 < Location: http://127.0.0.1/account/%0D%0AX-Foo:%20CLRF <-------Appears to have worked. 
< Connection: keep-alive
<
<html>
<head><title>302 Found</title></head>
<body>
<center><h1>302 Found</h1></center>
<hr><center>nginx/1.27.0</center>
</body>
</html>
* Connection #0 to host 127.0.0.1 left intact

Changing regular expressions to enable single-line mode prevents the possibility of any confusion being introduced by newline characters. This is just an FYI, as I thought it was interesting to see issues raised in the past by others and what suggestions were given.

Last Thoughts:
First of all, I would like to thank Michael Hedges and Parker Green, both from F5 Networks, for bringing this to our attention. As shown in the examples and the article written by the researcher, these types of attacks are not extremely difficult to carry out and can have very significant ramifications in specific scenarios. As such, taking this into account when setting up a site is key. This includes configuring pages with appropriate cache controls and choosing which vendors to use for both web servers and web caching proxies. The article I referenced at the beginning does a good job of breaking down how each vendor handles different scenarios. That makes for a great reference point to start with.

How I Did it - Migrating Applications to Nutanix NC2 with F5 Distributed Cloud Secure Multicloud Networking
In this edition of "How I Did it", we will explore how F5 Distributed Cloud Services (XC) enables seamless application extension and migration from an on-premises environment to Nutanix NC2 clusters.

Scuba Gear from CISA, ROBLOX Malware Campaign, and RUST backdoo-rs
Hello, this week Jordan_Zebor is your editor looking at the notable security news for Scuba Gear from CISA, a ROBLOX malware campaign, and a Rust-based meterpreter named backdoo-rs.

Scuba Gear from CISA
ScubaGear is a CISA-developed tool designed to assess and verify whether a Microsoft 365 (M365) tenant's configuration aligns with the Secure Cloud Business Applications (SCuBA) Security Configuration Baseline. This tool ensures that organizations are following CISA's recommended security settings for cloud environments, helping to identify vulnerabilities or misconfigurations in their M365 setup. The value of running ScubaGear lies in its ability to enhance an organization's cybersecurity posture, mitigate risks, and maintain compliance with security standards, which is crucial for protecting sensitive data in cloud-based systems. ScubaGear addresses the growing need for secure cloud deployments by automating the assessment process, making it easier for IT and security teams to identify gaps and take corrective actions. Regular assessments with this tool can help reduce the chances of data breaches, unauthorized access, and other security threats, thereby maintaining the integrity and confidentiality of business operations. Additionally, it supports organizations in staying ahead of compliance requirements by ensuring they meet the security baselines recommended by CISA.

ROBLOX Malware Campaign
Checkmarx recently discovered a year-long malware campaign targeting Roblox developers through malicious npm packages that mimic the popular "noblox.js" library. The attackers used tactics like brandjacking and typosquatting to create malicious packages that appeared legitimate, aiming to steal sensitive data like Discord tokens, deploy additional payloads, and maintain persistence on compromised systems. Despite efforts to remove these packages, new versions keep appearing on the npm registry, indicating an ongoing threat.

RUST backdoo-rs
The article "Learning Rust for Fun and backdoo-rs" describes the author's journey of learning Rust by developing a custom meterpreter. While Rust is designed to avoid common programming errors, ensuring software is secure from the outset, the choice of using it to create red teaming tools is also a great use case. A key aspect I covered recently is how Rust helps eliminate vulnerabilities like buffer overflows and use-after-free errors. These are traditionally common in C and C++, but Rust's ownership model prevents such risks by ensuring safe memory usage. In addition, Rust's growing adoption in the cybersecurity community, driven by companies like Google and Microsoft, emphasizes its role in secure software development, underscoring the "secure by design" principles that CISA advocates for. Projects like "backdoo-rs" demonstrate Rust's potential for secure, reliable development in any context.

How to Identify and Manage Scrapers (Pt. 2)
Introduction Welcome back to part two of the article on how to identify and manage scrapers. While part one focused on ways to identify and detect scrapers, part two will highlight various approaches to prevent, manage, and reduce scraping. 9 Ways to Manage Scrapers We'll start by highlighting some of the top methods used to manage scrapers to help you find the method best suited for your use case. 1. Robots.txt The robots.txt file on a website contains rules for bots and scrapers, but it lacks enforcement power. Often, scrapers ignore these rules, scraping data they want. Other scraper management techniques are needed to enforce compliance and prevent scrapers from ignoring these rules. 2. Site, App, and API Design to Limit Data Provided to Bare Minimum To manage scrapers, remove access to desired data, which may not always be feasible due to business-critical requirements. Designing websites, mobile apps, and APIs to limit or remove exposed data effectively reduces unwanted scraping. 3. CAPTCHA/reCAPTCHA CAPTCHAs (including reCAPTCHA and other tests) are used to manage and mitigate scrapers by presenting challenges to prove human identity. Passing these tests grants access to data. However, they cause friction and decrease conversion rates. With advancements in recognition, computer vision, and AI, scrapers and bots have become more adept at solving CAPTCHAs, making them ineffective against more sophisticated scrapers. 4. Honey Pot Links Scrapers, unlike humans, can see hidden elements on a web page, such as form fields and links. Security teams and web designers can add these to web pages, allowing them to respond to transactions performed by scrapers, such as forwarding them to a honeypot or providing incomplete results. 5. Require All Users to be Authenticated Most scraping occurs without authentication, making it difficult to enforce access limits. To improve control, all users should be authenticated before data requests. Less motivated scrapers may avoid creating accounts, while sophisticated scrapers may resort to fake account creation. F5 Labs published an entire article series focusing on fake account creation bots. These skilled scrapers distribute data requests among fake accounts, adhering to account-level request limits. Implementing authentication measures could discourage less-motivated scrapers and improve data security. 6. Cookie/Device Fingerprint-Based Controls To limit user requests, cookie-based tracking or device/TLS fingerprinting can be used, but they are invisible to legitimate users and can't be used for all users. Challenges include cookie deletion, collisions, and divisions. Advanced scrapers using tools like Browser Automation Studio (BAS) have anti-fingerprint capabilities including fingerprint switching, which can help them bypass these types of controls. 7. WAF Based Blocks and Rate Limits (UA and IP) Web Application Firewalls (WAFs) manage scrapers by creating rules based on user agent strings, headers, and IP addresses, but are ineffective against sophisticated scrapers who use common user agent strings, large numbers of IP addresses, and common header orders. 8. Basic Bot Defense Basic bot defense solutions use JavaScript, CAPTCHA, device fingerprinting, and user behavior analytics to identify scrapers. They don't obfuscate signals collection scripts, encrypt, or randomize them, making it easy for sophisticated scrapers to reverse engineer. IP reputation and geo-blocking are also used. 
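Before looking at how these simpler defenses get bypassed, here is a minimal NGINX sketch of the user-agent blocking and per-IP rate limiting described in method 7. It is illustrative only: the user-agent strings, rate, and burst values are placeholder assumptions, and as noted above, sophisticated scrapers will rotate IPs and spoof common user agents to slip past exactly this kind of rule.

# (inside the http {} context)
# Define a per-client-IP rate limit zone (10 MB of state, 10 requests/second)
limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s;

# Flag a few obviously automated user agents (placeholder list)
map $http_user_agent $is_scraper_ua {
    default                                0;
    ~*(python-requests|scrapy|curl|wget)   1;
}

server {
    listen 80;

    location / {
        # Reject the flagged user agents outright
        if ($is_scraper_ua) {
            return 403;
        }
        # Apply the per-IP rate limit with a small burst allowance
        limit_req zone=per_ip burst=20 nodelay;
        proxy_pass http://backend_app;
    }
}

Note that the proxy_pass upstream name is a placeholder, and that per-IP limits like this can false-positive on users behind shared NAT, which is one reason the comparison below rates this class of control in the middle of the pack.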
However, basic bot defense solutions can be bypassed using new-generation automation tools like BAS and Puppeteer, or by using high-quality proxy networks with high-reputation IP addresses. Advanced scrapers can easily craft spoofed packets to bypass the defense system.

9. Advanced Bot Defense
Advanced enterprise-grade bot defense solutions use randomized, obfuscated signals collection to prevent reverse engineering and tampering. They use encryption and machine learning (ML) to build robust detection and mitigation systems. These solutions are effective against sophisticated scrapers, including AI companies, and adapt to varying automation techniques, providing long-term protection against both identified and unidentified scrapers.

Scraper Management Methods/Controls Comparison and Evaluation
Table 1 (below) evaluates scraper management methods and controls, providing a rating score (out of 5) for each, with higher scores indicating more effective control.

Robots.txt - Pros: cheap; easy to implement; effective against ethical bots. Cons: no enforcement; ignored by most scrapers. Rating: 1
Application redesign - Pros: cheap. Cons: not always feasible due to business need. Rating: 1.5
CAPTCHA - Pros: cheap; easy to implement. Cons: not always feasible due to business need. Rating: 1.5
Honey pot links - Pros: cheap; easy to implement. Cons: easily bypassed by more sophisticated scrapers. Rating: 1.5
Require authentication - Pros: cheap; easy to implement; effective against less motivated scrapers. Cons: not always feasible due to business need; results in a fake account creation problem. Rating: 1.5
Cookie/fingerprint based controls - Pros: cheaper than other solutions; easier to implement; effective against low-sophistication scrapers. Cons: high risk of false positives from collisions; ineffective against medium- to high-sophistication scrapers. Rating: 2
Web Application Firewall - Pros: cheaper than other solutions; effective against low- to medium-sophistication scrapers. Cons: high risk of false positives from UA, header, or IP based rate limits; ineffective against medium- to high-sophistication scrapers. Rating: 2.5
Basic bot defense - Pros: effective against low- to medium-sophistication scrapers. Cons: relatively expensive; ineffective against high-sophistication scrapers; poor long-term efficacy; complex to implement and manage. Rating: 3.5
Advanced bot defense - Pros: effective against the most sophisticated scrapers; long-term efficacy. Cons: expensive; complex to implement and manage. Rating: 5

Conclusion
There are many methods of identifying and managing scrapers, as highlighted above, each with its pros and cons. Advanced bot defense solutions, though costly and complex, are the most effective against all levels of scraper sophistication. To read the full article in its entirety, including more detail on all the management options described here, head over to our post on F5 Labs.

SSL Orchestrator Advanced Use Cases: Detecting Generative AI
SSL Orchestrator Advanced Use Cases: Detecting Generative AI

Introduction
Quick, take a look at the following list and answer this question: "What do these movies have in common?"

2001: A Space Odyssey
Westworld
Tron
WarGames
Electric Dreams
The Terminator
The Matrix
Eagle Eye
Ex Machina
Avengers: Age of Ultron
M3GAN

If you answered, "They're all about artificial intelligence", yes, but... If you answered, "They're all about artificial intelligence that went terribly, sometimes horribly wrong", you'd be absolutely correct. The simple fact is... artificial intelligence (AI) can be scary. Proponents for and opponents against will disagree on many aspects, but they can all at least acknowledge there's a handful of ways to do AI correctly... and a million ways to do it badly. Not to be an alarmist, but while SkyNet was fictional, semi-autonomous guns on robot dogs are not...

But then why am I talking about this on a technical forum, you may ask? Well, when most of the above films were made, AI was largely still science fiction. That's clearly not the case anymore, and tools like ChatGPT are just the tip of the coming AI frontier. To be fair, I don't claim that all AI is bad, and many have indeed lauded ChatGPT and other generative AI tools as the next great evolution in technology. But it's also fair to say that generative AI tools, like ChatGPT, have a very real potential to cause harm. At the very least, these tools can be convincing, even when they're wrong. And worse, they could lead to sensitive information disclosures. One only has to do a cursory search to find a few examples of questionable behavior:

Lawyers File Motion Written by AI, Face Sanctions and Possible Disbarment
Higher Ed Beware: 10 Dangers of ChatGPT Schools Need to Know
ChatGPT and AI in the Workplace: Should Employers Be Concerned?
OpenAI's New Chatbot Will Tell You How to Shoplift and Make Explosives
Giant Bank JP Morgan Bans ChatGPT Use Among Employees
Samsung Bans ChatGPT Among Employees After Sensitive Code Leak

But again... what does this have to do with a technical forum? And more important, what does this have to do with you? Simply stated, if you are in an organization where generative AI tools could be abused, understanding, and optionally controlling, how and when these tools are accessed could help prevent the next big exploit or disclosure. If you search beyond the above links, you'll find an abundance of information on both the benefits and the security concerns of AI technologies. And ultimately you'll still be left to decide whether these AI tools are safe for your organization. It may simply be worthwhile to understand WHAT tools are being used, and in some cases, it may be important to disable access to them. Given the general depth and diversity of AI functions within arm's reach today, and growing, it would be irresponsible to claim "complete awareness". The bulk of these functions are delivered over standard HTTPS, so the best course of action is to categorize on known assets and adjust as new ones come along. As of the publishing of this article, the industry has yet to define a standard set of categories for AI, and specifically, generative AI. So in this article, we're going to build one and attach it to F5 BIG-IP SSL Orchestrator to enable proactive detection and optional control of Internet-based AI tool access in your organization. Let's get started!

BIG-IP SSL Orchestrator Use Case: Detecting Generative AI
The real beauty of this solution is that it can be implemented faster than it probably took to read the above introduction.
Essentially, you're going to create a custom URL category on F5 BIG-IP, populate it with known generative AI URLs, and employ that custom category in a BIG-IP SSL Orchestrator security policy rule. Within that policy rule, you can elect to dynamically decrypt and send the traffic to the set of inspection products in your security enclave.

Step 1: Create the custom URL category and populate it with known AI URLs
Access the BIG-IP command shell and run the following command. This initiates a script that creates and populates the URL category:

curl -s https://raw.githubusercontent.com/f5devcentral/sslo-script-tools/main/sslo-generative-ai-categories/sslo-create-ai-category.sh | bash

Step 2: Create a BIG-IP SSL Orchestrator policy rule to use this data
The above script creates (or re-populates) a custom URL category named SSLO_GENERATIVE_AI_CHAT, and in that category is a set of known generative AI URLs. To use it, navigate to the BIG-IP SSL Orchestrator UI and edit a Security Policy. Click Add to create a new rule, use the "Category Lookup (All)" policy condition, then add the above URL category. Set the Action to "Allow", the SSL Proxy Action to "Intercept", and the Service Chain to whatever service chain you've already created. With Summary Logging enabled in the BIG-IP SSL Orchestrator topology configuration, you'll also get Syslog reporting for each AI resource match: who made the request, to what, and when.

The URL category is employed here to identify known AI tools. In this instance, BIG-IP SSL Orchestrator is used to make that assessment and act on it (i.e., allow, TLS intercept, service chain, log). Should you want even more granular control over the conditions and actions applied to the decrypted AI tool traffic, you can also deploy an F5 Secure Web Gateway Services policy inside the SSL Orchestrator service chain. With SWG, you can expand beyond simple detection and blocking and build more complex rules to decide who can access these tools, when, and how. It should be said that beyond logging, allowing, or denying access to generative AI tools, SSL Orchestrator also provides decryption and the opportunity to dynamically steer the decrypted AI traffic to any set of security products best suited to protect against potential malware.

Summary
As previously alluded to, this is not an exhaustive list of AI tool URLs. Not even close. But it contains the most common ones you'll see in the wild. The script above populates the category with an initial list of URLs that you are free to update as you become aware of new ones. And of course, we invite you to recommend additional AI tools to add to this list.

References:
https://github.com/f5devcentral/sslo-script-tools/tree/main/sslo-generative-ai-categories
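If you want to get a feel for which AI destinations your users are already visiting before enforcing anything, the category lookup that BIG-IP SSL Orchestrator performs can be roughly approximated offline, for example against exported proxy or DNS logs. The sketch below is illustrative Python only: the hostnames are an assumed starter subset, not the full contents of the SSLO_GENERATIVE_AI_CHAT category, and the authoritative matching happens on BIG-IP, not in this script.

from urllib.parse import urlparse

# Assumed starter set of generative AI hostnames, for illustration only; the
# script referenced above maintains the fuller list inside the
# SSLO_GENERATIVE_AI_CHAT category on BIG-IP.
KNOWN_GENAI_HOSTS = {
    "chat.openai.com",
    "chatgpt.com",
    "gemini.google.com",
    "claude.ai",
    "copilot.microsoft.com",
}

def is_generative_ai(url_or_host: str) -> bool:
    # Normalize to a bare hostname, then match the host itself or any subdomain
    # of a known generative AI service.
    host = urlparse(url_or_host).hostname or url_or_host
    host = host.lower().rstrip(".")
    return any(host == known or host.endswith("." + known) for known in KNOWN_GENAI_HOSTS)

if __name__ == "__main__":
    # Example: audit a handful of requested URLs (e.g., pulled from proxy logs).
    for url in ("https://chatgpt.com/", "https://gemini.google.com/app", "https://intranet.example.com/"):
        label = "generative AI" if is_generative_ai(url) else "other"
        print(f"{url} -> {label}")

An inventory produced this way can help you decide which policy action (allow, intercept, block) and which service chain make the most sense when you build the rule in Step 2 above.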