Securely Scale RAG - Azure OpenAI Service, F5 Distributed Cloud and NetApp
Arguably, the easiest and most massively scalable approach to harnessing Large Language Models (LLMs) is to consume leading services like OpenAI endpoints, the most well-known of cloud-based offering delivered to enterprises over the general Internet. Access to hardware, such as GPUs, and the significant skillset to run LLMs on your own become non-issues, consumption is simply an API call away. One concern, and a serious one, is that sensitive inferencing (AI prompts, both the requests and responses) travels "in the wild" to these LLMs found through DNS at public endpoints. Retrieval Augmented Generation (RAG) adds potentially very sensitive corporate data to prompts, to leverage AI for internal use cases, thus ratcheting up even further the uneasiness with using the general Internet as a conduit to reach LLMs. RAG is a popular method to greatly increase the accuracy and relevancy of generative AI for a company’s unique set of problems. Finally, to leverage sensitive data with RAG, the source documents must be vectorized with similarly remote “embedding” LLMS; once again sensitive, potentially proprietary sensitive data will leave the corporate premises to leverage the large AI solutions like OpenAI or Azure OpenAI. Unlike purveyors of locally executed models, say a repository like Huggingface.com, which allow downloading of binaries to be harnessed on local compute, industry leading solutions like OpenAI and Azure OpenAI Service are founded on the paradigm of remote compute. Beyond the complexity and resources of quickly and correctly setting up performant on-prem models one time, the choice to consume remote endpoints allows hassle-free management like models perpetually updated to latest revisions and full white-glove support available to enterprise customers consuming SaaS AI models. In this article, an approach will be presented where, using F5 Distributed Cloud (XC) and NetApp, Azure OpenAI Service can be leveraged with privacy, where prompts are carried over secured, encrypted tunnels over XC between on-premises enterprise locations and that enterprise’s Azure VNET. The Azure OpenAI models are then exclusively exposed as private endpoints within that VNET, nowhere else in the world. This means both the embedding LLM activity to vectorize sensitive corporate data, and the actual generative AI prompts to harness value from that data are encrypted in flight. All source data and resultant vector databases remain on-premises in well-known solutions like a NetApp ONTAP storage appliance. Why is the Azure OpenAI Service a Practical Enabler of AI Projects? Some of the items that distinguish Azure OpenAI Service include the following: Prompts sent to Azure OpenAI are not forwarded to OpenAI, the service exists within Microsoft Azure, benefiting from the performance of Microsoft’s enormous cloud computing platform Customer prompts are never used for training data to build new or refine existing models Simplified billing, think of the Azure OpenAI Service as analogous to an “all you can eat buffet”, simply harness the AI service and settle the charge incurred on a regular monthly billing cycle With OpenAI, models are exposed at universal endpoints shared by a global audience, added HTTP headers such as the OPENAI_API_KEY value distinguish users and allow billing to occur in accordance with consumption. Azure OpenAI Service is slightly different. No models actually exist to be used until they are setup under an Azure subscription. At this point, beyond receiving an API key to identify the source user, the other major difference is unique API "base" URL (endpoint) is setup for accessing LLMs an organization wishes to use. Examples would be a truly unique enterprise endpoint for GPT-3.5-Turbo, GPT4 or perhaps an embedding LLM used in vectorization, such as the popular text-embedding-ada-002 LLM. This second feature of Azure OpenAI Service presents a powerful opportunity to F5 Distributed Cloud (XC) customers. This stems from the fact that unlike traditional OpenAI, this per-organization, unique base URL for API communications does not have to be projected into the global DNS, reachable from anywhere on the Internet. Instead, Microsoft Azure allows the OpenAI service to be constrained to a private endpoint, accessible only from where the customer chooses. Leveraging F5 XC Multicloud Networking offers a way to secure and encrypt communications between on-premises locations and Azure subnets only available from within the organization. What does this add up to for the enterprise with generative AI projects? It means huge scalability for AI services and consuming the very much leading-edge modern OpenAI models, all in a simple manner an enterprise can now consume today with limited technical onus on corporate technology services. The sense of certainty that sensitive data is not cavalierly exposed on the Internet is a critical cog in the wheel of good data governance. Tap Into Secure Data from NetApp ONTAP Clusters for Fortified Access to OpenAI Models The F5 Distributed Cloud global fabric consists of points of presence in 26+ metropolitan markets worldwide, such as Paris, New York, Singapore, that are interconnected with high-speed links aggregating to more than 14 Tbps of bandwidth in total, it is growing quarterly. With the F5 multicloud networking (MCN) solution, customers can easily set up dual-active encrypted tunnels (IPSec or SSL) to two points on the global fabric. The instances connected to are referred to as RE’s (Regional Edge nodes) and the customer-side sites are made up of CE’s (Customer Edge nodes, scalable from one to a full cluster). The service is a SaaS solution and setup is turn-key based upon menu click-ops or Terraform. The customer sites, beyond being in bricks-and-mortar customer data centers and office locations, can also exist within cloud locations such as Microsoft Azure Resource Groups or AWS VPCs, among others. Enterprise customers with existing bandwidth solutions may choose to directly interconnect sites as opposed to leveraging the high-speed F5 global fabric. The net result of an F5 XC Layer 3 multicloud network is high-speed, encrypted communications between customer sites. By disabling the default network access provided by Azure OpenAI Service, and only allowing private endpoint access, one can instantiate a private approach to running workloads with well-known OpenAI models. With this deployment in place, customers may tap into years of data acquired and stored on trusted on-premises NetApp storage appliances to inject value into AI use cases, customized and enhanced inference results using well-regarded, industry-leading OpenAI models. A perennial industry leader in storage is ONTAP from NetApp, a solution that can safely expose volumes to file systems, through protocols such as NFS and SMB/CIFS. The ability to also expose LUNs, meaning block-level data that constitutes remote disks, is also available using protocols like iSCSI. In the preceding diagram, one can leverage AI through a standard Python approach, in the case shown harnessing an Ubuntu Linux server, and volumes provided by ONTAP. AI jobs, rather than calling out to an Internet-routed Azure OpenAI public endpoint can instead interact with a private endpoint, one which resolves through private DNS to an address on a subnet behind a customer Azure CE node. This endpoint cannot be reached from the Internet, it is restricted to only communicating with customer subnets (routes) located in the L3 multicloud deployment. In use cases that leverage one’s own data, a leading approach is Retrieval Augmented Generation (RAG) in order to empower Large Language Models (LLMs) to deliver niche, hyper-focused responses pertaining to specialized, sometimes proprietary, documents representing the corporate body of knowledge. Simple examples might include highly detailed, potentially confidential, company-specific information distilled from years of financial internal reporting. Another prominent early use case of RAG is to backstop frontline, customer helpdesk employees. With customers sensitive to delays in handling support requests, and pressure to reduce support staff research delays, the OpenAI LLM can harvest only relevant knowledge base (KB) articles, releases notes, and private engineering documents not normally exposed in their entirety. The net result is a much more effective helpdesk experience, with precise, relevant help provided to the support desk employee in seconds. RAG Using Microsoft Azure OpenAI, F5 and NetApp in Nutshell In the sample deployment, one of the more important items to recognize is that two OpenAI models will be harnessed, an embedding LLM and a generative transformer based GPT family LLM. A simple depiction of RAG would be as follows: Using OpenAI Embedding LLMs The OpenAI embedding modeltext-embedding-ada-002 is used first to vectorize data sourced from the on-premises ONTAP system, via NFS volumes mounted to the server hosting Python. The embedding model consumes “chunks” of text from each sourced document and converts the text to numbers, specifically long sequences of numbers, typically in the range of 700 to 1,500 values. These are known as vectors. The vectors returned in the private OpenAI calls are then stored in a vector database, in this case ChromaDB was used. It is important to note, the ChromaDB itself was directed to install itself within a volume supported by the on-premises ONTAP cluster, as such the content at rest is governed by the same security governance as the source content in its native format. Other common industry solutions for vector storage and searches include Milvus and for those looking to cloud-hosted vectors Pinecone. Vector databases are purpose-built to manage vector embeddings. Conventional databases can, in fact, store vectors but the art of doing a semantic search, finding similarities between vectors, would then require vector indices solutions. One of the best known in FAISS (Facebook AI Similarity Search) which is a library that allows developers to quickly search for embeddings of multimedia documents. These semantic searches would otherwise be inefficient or impossible with standard database engines (SQL). When a prompt is first generated by a client, the text in the prompt is vectorized by the very same OpenAI embedding model, producing a vector on the fly. The key to RAG, the “retriever” function, then compares the newly arrived query with semantically similar text chunks in the database. The actual semantic similarity of the query and previously stored chunks is arrived at through a nearest neighbor search of the vectors, in other words, phrases and sentences that might augment the original prompt can be provided to the OpenAI GPT model. The art of finding semantic similarities relies upon comparing the lengthy vectors. The objective, for instance, to find supportive text around the user query “how to nurture shrub growth” might reasonably align more closely with a previously vectorized paragraph that included “gardening tips for the North American spring of 2024” and less so with vectorized content stemming from a user guide for the departmental photocopy machine. The suspected closeness of vectors, are text samples actually similar topic wise, is a feature of semantic similarity search algorithms, many exist in themarketplace and two approaches commonly leveraged are cosine similarity and Euclidean distance; a brief description for those interested can be found here. The source text chunks corresponding to vectors are retained in the database and it is this source text that augments the prompt after the closest neighbor vectors are calculated. Using OpenAI GPT LLMs Generative Pre-trained Transformer (GPT) refers to a family of LLMs created by OpenAI that are built on atransformer architecture. The specific OpenAI model used in this model is not necessarily the latest, premium model, GPT-4o and GPT-4 Turbo are more recent, however the utilized gpt-35-turbo model is a good intersection of price versus performance and has been used extensively in deployed projects. With the retriever function helping to build an augmented prompt, the default use case documented included three text chunks to buttress the original query. The OpenAI prompt response will not only be infused with the provided content extracted from the customer but unlike normal GPT responses, RAG will have specific attributions to which documents and specific paragraphs led to the response. Brief Overview of Microsoft OpenAI Service Setup Microsoft Azure has a long history of adding innovative new functions as subscribed “opt in” service resources, the Azure OpenAI Service is no different. A thorough, step-by-step guide to setting up the OpenAI service can be found here. This screenshot demonstrates the rich variety of OpenAI models available within Azure, specifically showing the Azure OpenAI Studio interface, highlighting models such as gpt-4, gpt-4o and dall-e-3. In this article, two models are added, one embedding and the other GPT. The following OpenAI Service Resource screen shows the necessary information to actually use our two models. This information consists of the keys (use either KEY1 and KEY2, both can be seen and copied with the Show Keys button) and the unique, per customer endpoint path, frequently referred to as the base URL by OpenAI users. Perhaps the key Azure feature that empowers this article is the ability to disable network access to the configured OpenAI model, as seen below. With traditional network access disabled, we can then enable private endpoint access and set the access point to a network interface on the private subnet connected to the inside interface of our F5 Distributed Cloud CE node. The following re-visits the earlier topology diagram, with focus upon where the Azure OpenAI service interacts with our F5 Distributed Cloud multicloud network. The steps involved in setting up an Azure site in F5 Distributed Cloud are found here. The corresponding steps for configuring an on-premises Distributed Cloud site are found in this location. Many options exist, such as using KVM or a bare metal server, the link provided highlights the VMware ESXi approach to on-premises site creation. Demonstrating RAG in Action using OpenAI Models with a Secure Private Endpoint The RAG setup, in lieu of vectorizing actual private and sensitive documents, utilized the OpenAI embedding LLM to process chunks taken from the classic H.G. Wells 1895 science fiction novel “The Time Machine” in text or markdown format. The novel is one of many in the public domain through the Gutenberg Project. Two NFS folders supported by the NetApp ONTAP appliance in a Redmond, Washington office were used: one for source content and one for supporting the ChromaDB vector database. The NFS mounts are seen below, with the Megabytes consumed and remaining available seen per volume, the ONTAP address can be seen as 10.50.0.220. (Linux Host) #df -h 10.50.0.220:/RAG_Source_Documents_2024 1.9M 511M 1% /mnt/rag_source_files 10.50.0.220:/Vectors 17M 803M 3% /home/sgorman/langchain-rag-tutorial-main/chroma2 The creation of the vector database was handled by one Python script and the actual AI prompts generated against the OpenAI gpt-35-turbo model were housed in another script. This may often make sense, as the vector database creation may be an infrequently run script, only executed when new source content is introduced (/mnt/rag_source_files) whereas the generative AI tasks targeting gpt-3.5-turbo are likely run continuously for imperative business needs like helpdesk or code creations, as example purposes. Creating the vector database first entails preparing the source text, typically remove extraneous formatting or less than valuable text fields, think of boilerplate statements such as repetitive footnotes or perhaps copyright/privacy statements that might be found on every single page of some corporate documents. The next step is to create text chunks for embedding, the tradeoff of using too short chunks will be lack of semantic meaning in any one chunk and a growth in the vector count. Using overly long chunks, on the other hand, could lead to lengthy augmented prompts sent to gpt-35-turbo that significantly grow the token count for requests, although many models now support very large token counts a common value remains a total, for requests and responses, of 4,096 tokens. Token counts are the foundation for most billing formulae of endpoint-based AI models. Finally, it is important to have some degree of overlap of generated chunks such that meanings and themes within documents are not lost; if an idea is fragmented at the demarcation point of adjacent chunks the model may not pickup on its importance. The vectorization script for “The Time Machine” resulted in 978 chunks being created from the source text, with character counts per chunk not to exceed 300 characters. The text splitting function is loaded from LangChain and the pertinent code lines include: from langchain.text_splitter import RecursiveCharacterTextSplitter text_splitter = RecursiveCharacterTextSplitter( chunk_size=300, chunk_overlap=100, } The values of 100 characters of overlaps suggests each chunk will incorporate 200 characters of new text within the total of 300 grabbed. It is important to remember all characters, even white space, count towards totals. As per the following screenshot, the source novel, when split into increments of 200 new characters per chunk does indicate 978 chunks were indeed a correct total (double click to expand). With the source data vectorized and secure on the NetApp appliance, the actual use of the gpt-35-turbo OpenAI model could commence. The following shows an example, where the model is instructed in the system prompt to only respond via information it can glean from the RAG augmented prompt text, the response portions shown in red font. python3 create99.py “What is the palace of green porcelain?” <response highlights below, the response also included the full text chunks RAG suggested would potentially support the LLM in answering the posed question> Answer the question based on the above context: What is the palace of green porcelain? Response: content='The Palace of Green Porcelain is a deserted and ruined structure with remaining glass fragments in its windows and corroded metallic framework.' response_metadata={'token_usage': {'completion_tokens': 25, 'prompt_tokens': 175, 'total_tokens': 200}, 'model_name': 'gpt-35-turbo', The response of gpt-35-turbo is correct, and we see that the token consumption is heavily slanted towards the request (the “prompt”), with 175 tokens used whereas the response required only 25 tokens. The key takeaways are that the prompt and its response did not travel hop-by-hop over the Internet to a public endpoint, all traffic traveled with VPN-like security from the on-premises server and ONTAP to a private Azure subnet using F5 Distributed Cloud. The OpenAI model was utilized as a private endpoint, corresponding a network interface available only on that private subnet and not found within the global DNS, only the private corporate DNS or /etc/hosts files. Adding Laser Precision to RAG Using the default chunking strategy did lead to sub-optimal results, when ideas, themes and events were lost across chunk boundaries, even when including some degree of overlap. The following is one example: A key moment in the H.G. Wells book involves the protagonist meeting a character Weena, who provides strange white flowers which are pocketed. Upon returning to the present time, the time traveler relies upon the exotic and foreign look of the white flowers to attempt to prove to friends the veracity of his tale. # python3 query99.py “What did Weena give the Time Traveler?” As captured in the response below, the chunks provided by RAG do not provide all the details, only that something of note was pocketed, but gpt-35-turbo can therefore not return a sufficient answer as the full details are not provided in the augmented prompt. The screenshot shows first the three chunks and at the end the best answer the LLM could provide (double click to expand). The takeaway is that some effort will be required to adjust the vectorization process to pick optimally large chunk sizes, and sufficient numbers to properly empower the OpenAI model. In this demonstration, based upon vectors and their corresponding text, only three text chunks were harnessed to augment the user prompt. By increasing this number to 5 or 10, and increasing each of the chunk sizes, all of course at the expense of token consumption, one would expect more accurate results from the LLM. Summary This article demonstrated a more secure approach to using OpenAI models as a programmatic endpoint service in which proprietary company information can be kept secure by not using the general purpose, insecure Internet to provide prompts for vectorization and general AI inquiries. Instead, an approach was followed where the Azure OpenAI service was deployed as a private endpoint, exclusively available at an address on a private subnet within an enterprise’s Azure subscription, a subnet with no external access. By utilizing F5 Distributed Cloud Multicloud Networking, existing corporate locations and data centers can be connected to that enterprise’s Azure resource groups and private, encrypted communications can take place between these networks, the necessary routing and tunneling technologies are deployed in a turn-key manner without requiring advanced network skillsets. When leveraging NetApp ONTAP as the continued enterprise storage solution, RAG deployments based upon Azure OpenAI service can continue to be managed and secured with well-developed storage administration skills. In this example, ONTAP housed both the source, sensitive enterprise content and the actual vector database resulting from interactions with the Azure OpenAI embedding LLM. Subsequent to a discussion on vectors and optimal chunking strategies, RAG was utilized to answer questions on private documents using the well-known OpenAI chat-35-turbo model.252Views2likes1CommentF5 Distributed Cloud Customer Edge on F5 rSeries – Reference Architecture
Traditionally, to advertise an application to the internet or to connect applications across multi-cloud environments enterprises must configure and manage multiple networking and security devices from different vendors in the DMZ of the data center. CE on F5 rSeries is a single vendor, converged solution for all enterprise multi-cloud application connectivity and security needs.678Views2likes2CommentsF5 Distributed Cloud (XC) Global Applications Load Balancing in Cisco ACI
Introduction F5 Distributed Cloud (XC) simplify cloud-based DNS management with global server load balancing (GSLB) and disaster recovery (DR). F5 XC efficiently directs application traffic across environments globally, performs health checks, and automates responses to activities and events to maintain high application performance with high availability and robustness. In this article, we will discuss how we can ensure high application performance with high availability and robustness by using XC to load-balance global applications across public clouds and Cisco Application Centric Infrastructure (ACI) sites that are geographically apart. We will look at two different XC in ACI use cases. Each of them uses a different approach for global applications delivery and leverages a different XC feature to load balance the applications globally and for disaster recovery. XC DNS Load Balancer Our first XC in ACI use case is very commonly seen where we use a traditional network-centric approach for global applications delivery and disaster recovery. We use our existing network infrastructure to provide global applications connectivity and we deploy GSLB to load balance the applications across sites globally and for disaster recovery. In our example, we will show you how to use XC DNS Load Balancer to load-balance a global application across ACI sites that are geographically dispersed. One of the many advantages of using XC DNS Load Balancer is that we no longer need to manage GSLB appliances. Also, we can expect high DNS performance thanks to XC global infrastructure. In addition, we have a single pane of glass, the XC console, to manage all of our services such as multi-cloud networking, applications delivery, DNS services, WAAP etc. Example Topology Here in our example, we use Distributed Cloud (XC) DNS Load Balancer to load balance our global application hello.bd.f5.com, which is deployed in a hybrid multi-cloud environment across two ACI sites located in San Jose and New York. Here are some highlights at each ACI site from our example: New York location XC CE is deployed in ACI using layer three attached with BGP XC advertises custom VIP 10.10.215.215 to ACI via BGP XC custom VIP 10.10.215.215 has an origin server 10.131.111.88 on AWS BIG-IP is integrated into ACI BIG-IP has a public VIP 12.202.13.149 that has two pool members: on-premise origin server 10.131.111.161 XC custom VIP 10.10.215.215 San Jose location XC CE is deployed in ACI using layer three attached with BGP XC advertises custom VIP 10.10.135.135 to ACI via BGP XC custom VIP 10.10.135.135 has an origin server 10.131.111.88 on Azure BIG-IP is integrated into Cisco ACI BIG-IP has a public VIP 12.202.13.147 that has two pool members: on-premise origin server 10.131.111.55 XC custom VIP 10.10.135.135 *Note:Click here to review on how to deploy XC CE in ACI using layer three attached with BGP. DNS Load Balancing Rules A DNS Load Balancer is an ingress controller for the DNS queries made to your DNS servers. The DNS Load Balancer receives the requests and answers with an IP address from a pool of members based on the configured load balancing rules. On the XC console, go to "DNS Management"->"DNS Load Balancer Management" to create a DNS Load Balancer and then define the load balancing rules. Here in our example, we created a DNS Load Balancer and defined the load balancing rules for our global application hello.bd.f5.com (note: as a prerequisite, F5 XC must be providing primary DNS for the domain): Rule #1: If the DNS request tohello.bd.f5.com comes from United States or United Kingdom, respond with BIG-IP VIP 12.203.13.149 in the DNS response so that the application traffic will be directed to New York ACI site and forwarded to an origin server that is located in AWS or on-premise: Rule #2: If the DNS request tohello.bd.f5.com comes from United States or United Kingdom and if New York ACI site become unavailable, respond with BIG-IP VIP 12.203.13.147 in the DNS response so that the application traffic will be directed to San Jose ACI site and forwarded to an origin server that is located on-premise or in Azure: Rule #3: If the DNS request tohello.bd.f5.comcomes from somewhere outside of United States or United Kingdom, respond with BIG-IP VIP 12.203.13.147 in the DNS response so that the application traffic will be directed to San Jose ACI and forwarded to an origin server that is located on-premise or in Azure: Validation Now, let's see what happens. When a machine located in the United States tries to reach hello.bd.f5.comand if both ACI sites are up, the traffic is directed to New York ACI site and forwarded to an origin server that is located on-premise or in AWS as expected: When a machine located in the United States tries to reachhello.bd.f5.comand if the New York ACI site is down or becomes unavailable, the traffic is re-directed to San Jose ACI site and forwarded to an origin server that is located on-premise or in Azure as expected: When a machine tries to accesshello.bd.f5.com from outside of United States or United Kingdom, it is directed to San Jose ACI site and forwarded to an origin server that is located on-premise or in Azure as expected: On the XC console, go to"DNS Management"and select the appropriate DNS Zone to view theDashboardfor information such as the DNS traffic distribution across the globe, the query types etc andRequests for DNS requests info: XC HTTP Load Balancer Our second XC in ACI use case uses a different approach for global applications delivery and disaster recovery. Instead of using the existing network infrastructure for global applications connectivity and utilizing XC DNS Load Balancer for global applications load balancing, we simplify the network layer management by securely deploying XC to connect our applications globally and leveraging XC HTTP Load Balancer to load balance our global applications and for disaster recovery. Example Topology Here in our example, we use XC HTTP load balancer to load balance our global applicationglobal.f5-demo.com that is deployed across a hybrid multi-cloud environment. Here are some highlights: XC CE is deployed in each ACI site using layer three attached with BGP New York location: ACI advertises on-premise origin server 10.131.111.161 to XC CE via BGP San Jose location: ACI advertises on-premise origin server 10.131.111.55 to XC CE via BGP An origin server 10.131.111.88 is located in AWS An origin server 10.131.111.88 is located in Azure *Note:Click here to review on how to deploy XC CE in ACI using layer three attached with BGP. XC HTTP Load Balancer On the XC console, go to “Multi-Cloud App Connect” -> “Manage” -> “Load Balancers” -> “HTTP Load Balancers” to “Add HTTP Load Balancer”. In our example, we created a HTTPS load balancer named globalwith domain name global.f5-demo.com. Instead of bringing our own certificate, we took advantage of the automatic TLS certificate generation and renewal supported by XC: Go to “Origins” section to specify the origin servers for the global application. In our example, we included all origin servers across the public clouds and ACI sites for our global application global.f5-demo.com: Next, go to “Other Settings” -> “VIP Advertisement”. Here, select either “Internet” or “Internet (Specified VIP)” to advertise the HTTP Load Balancer to the Internet. In our example, we selected “Internet” to advertise global.f5-demo.com globally because we decided not to manage nor to acquire a public IP: In our first use case, we defined a set of DNS load balancing rules on the XC DNS Load Balancer to direct the application traffic based on our requirement: If the request toglobal.f5-demo.com comes from United States or United Kingdom, application traffic should be directed to an origin server that is located on-premise in New York ACI site or in AWS. If the request toglobal.f5-demo.com comes from United States or United Kingdom and if the origin servers in New York ACI site and AWS become unavailable, application traffic should be re-directed to an origin server that is located on-premise in San Jose ACI site or in Azure. If the request toglobal.f5-demo.com comes from somewhere outside of United States or United Kingdom, application traffic should be directed to an origin server that is located on-premise in San Jose ACI site or in Azure. We can accomplish the same with XC HTTP Load Balancer by configuring Origin Server Subset Rules. XC HTTP Load Balancer Origin Server Subset Rules allow users to create match conditions on incoming source traffic to the XC HTTP Load Balancer and direct the matched traffic to the desired origin server(s). The match condition can be based on country, ASN, regional edge (RE), IP address, or client label selector. As a prerequisite, we create and assign a label (key-value pair) to an origin server so that we can specify where to direct the matched traffic to in reference to the label in Origin Server Subset Rules. Go to “Shared Configuration” -> “Manage” -> “Labels” -> “Known Keys” and “Add Know Key” to create labels. In our example, we created a key named jy-key with two labels: us-uk and other : Now, go to "Origin pool" under “Multi-Cloud App Connect” and apply the labels to the origin servers: In our example, origin servers in New York ACI site and AWS are labeledus-uk while origin servers in San Jose ACI site and Azure are labeled other : Then, go to “Other Settings” to enable subset load balancing. In our example, jy-key is our origin server subsets class, and we configured to use default subset original pool labeled other as our fallback policy choice based on our requirement that is if the origin servers in New York ACI site and AWS become unavailable, traffic should be directed to an origin server in San Jose ACI site or Azure: Next, on the HTTP Load Balancer, configure the Origin Server Subset Rules by enabling “Show Advanced Fields” in the "Origins" section: In our example, we created following Origin Server Subset Rules based on our requirement: us-uk-rule: If the request toglobal.f5-demo.com comes from United States or United Kingdom, direct the application traffic to an origin server labeled us-uk that is either in New York ACI site or AWS. other-rule: If the request to global.f5-demo.comdoes not come from United States or United Kingdom, direct the application traffic to an origin server labeled other that is either in San Jose ACI site or Azure. Validation As a reminder, we use XC automatic TLS certificate generation and renewal feature for our HTTPS load balancer in our example. First, let's confirm the certificate status: We can see the certificate is valid with an auto renew date. Now, let’s run some tests and see what happens. First, let’s try to access global.f5-demo.com from United Kingdom: We can see the traffic is directed to an origin server located in New York ACI site or AWS as expected. Next, let's see what happens if the origin servers from both of these sites become unavailable: The traffic is re-directed to an origin server located in San Jose ACI site or Azure as expected. Last, let’s try to access global.f5-demo.com from somewhere outside of United States or United Kingdom: The traffic is directed to an origin server located in San Jose ACI site or Azure as expected. To check the requests on the XC Console, go to "Multi-Cloud App Connect" -> “Performance” -> "Requests" from the selected HTTP Load Balancer. Below is a screenshot from our example and we can see the request to global.f5-demo.com came from Australia was directed to the origin server 10.131.111.55 located in San Jose ACI site based on the configured Origin Server Subset Rules other-rule: Here is another example that the request came from United States was sent to the origin server 10.131.111.88 located in AWS based on the configured Origin Server Subset Rules us-uk-rule: Summary F5 XC simplify cloud-based DNS management with global server load balancing (GSLB) and disaster recovery (DR). By deploying F5 XC in Cisco ACI, we can securely deploy and load balance our global applications across ACI sites (and public clouds) efficiently while maintaining high application performance with high availability and robustness among global applications at all times. Related Resources *On-Demand Webinar*Deploying F5 Distributed Cloud Services in Cisco ACI Deploying F5 Distributed Cloud (XC) Services in Cisco ACI - Layer Three Attached Deployment Deploying F5 Distributed Cloud (XC) Services in Cisco ACI - Layer Two Attached Deployment361Views0likes0CommentsDeploying F5 Distributed Cloud (XC) Services in Cisco ACI - Layer Two Attached Deployment
Introduction F5 Distributed Cloud (XC) Services are SaaS-based security, networking, and application management services that can be deployed across multi-cloud, on-premises, and edge locations. This article will show you how you can deploy F5 Distributed Cloud’s Customer Edge (CE) site in Cisco Application Centric Infrastructure (ACI) so that you can securely connect your application and distribute the application workloads in a Hybrid Multi-Cloud environment. F5 XC Layer Two Attached CE in Cisco ACI Besides Layer Three Attached deployment option, which we discussed in another article, a F5 Distributed Cloud Customer Edge (CE) site can also be deployed with Layer Two Attached in Cisco ACI environment using an ACI Endpoint of an Endpoint Group (EPG). As a reminder, Layer Two Attached is one of the deployment models to get traffic to/from a F5 Distributed Cloud CE site, where the CE can be a single node or a three-nodes cluster. F5 Distributed Cloud supports Virtual Router Redundancy Protocol (VRRP) for virtual IP (VIP) advertisement. When VRRP is enabled for VIPs advertisement, there is a VRRP Master for each of the VIPs and the VRRP Master for each of the VIPs can possibly be distributed across the CE nodes within the cluster. In this article, we will look at how we can deploy a Layer Two Attached CE site in Cisco ACI. F5 XC VRRP Support for VIPs Advertisement F5 XC Secure Mesh Sites are specifically engineered for non-cloud CE deployments, which support additional configurations that are not available using Fleet or regular Site management functionalities such as VRRP for VIPs advertisement. We recommend Secure Mesh Sites for non-cloud CE deployment and specifically, in Layer Two Attached CE deployment model, we recommend deploying CE site as a Secure Mesh Site to take advantage of the VRRPs support for VIPs advertisement. With VRRP enabled for VIPs advertisement, one of the CE nodes within the cluster will become the VRRP Master for a VIP and starts sending gratuitous ARPs (GARPS) while the rest of the CE nodes will become the VRRP Backup. Please note that in CE software, VRRP virtual MAC is not used for the VIP. Instead, the CE node, which is the VRRP Master for the VIP uses its physical MAC address in ARP responses for the VIP. When a failover happens, a VRRP Backup CE will become the new VRRP Master for the VIP and starts sending GARPs to update the ARP table of the devices in the broadcast domain. As of today, there isn't a way to configure the VRRP priority and the VRRP Master assignment is at random. Thus, if there are multiple VIPs, it is possible that a CE node within the cluster can be the VRRP Master for one or more VIPs, or none. F5 XC Layer Two Attached CE in ACI Example In this section, we will use an example to show you how to successfully deploy a Layer Two Attached CE site in Cisco ACI fabric so that you can securely connect your application and distribute the application workloads in a Hybrid Multi-Cloud environment. Topology In our example, CE is a three nodes cluster (Master-0, Master-1 and Master-2) which connects to the ACI fabric using an endpoint of an EPG namedexternal-epg: Example reference - ACI EPG external-epg endpoints table: HTTP load balancersite2-secure-mesh-cluster-app has a Custom VIP of 172.18.188.201/32 epg-xc.f5-demo.com with workloads 10.131.111.66 and 10.131.111.77 in the cloud (Azure) and it advertises the VIP to the CE site: F5 XC Configuration of VRRP for VIPs Advertisement To enable VRRP for VIPs advertisement, go to "Multi-Cloud Network Connect" -> "Manage" -> "Site Management" -> "Secure Mesh Sites" -> "Manage Configuration" from the selected Secure Mesh Site: Next, go to "Network Configuration" and select "Custom Network Configuration" to get to "Advanced Configuration" and make sure "Enable VRRP for VIP(s)" is selected for VIP Advertisement Mode: Validation We can now securely connect to our application: Note from above, after F5 XC is deployed in Cisco ACI, we also use F5 XC DNS as our primary nameserver: To check the requests on the F5 XC Console, go to"Multi-Cloud App Connect" -> "Overview: Applications" to bring out our HTTP load balancer, then go to "Performance Monitoring" -> "Requests": *Note: Make sure you are in the right namespace. As a reminder, VRRP for VIPs advertisement is enabled in our example. From the request shown above, we can see that CE node Master-2 is currently the VRRP Master for VIP 172.18.188.201 and if we go to the APIC, we can see the VIP is learned in the ACI endpoint table for EPG external-epgtoo: Example reference - a sniffer capture of GARP from CE node Master-2 for VIP 172.18.188.201: Summary A F5 Distributed Cloud Customer Edge (CE) site can be deployed with Layer Two Attached deployment model in Cisco ACI environment using an ACI Endpoint of an Endpoint Group (EPG). Layer Two Attached deployment model can be more desirable and easier for CE deployment when compared to Layer Three Attached. It is because Layer Two Attached does not require layer three/routing which means one less layer to take care of and it also brings the applications closer to the edge. With F5 Distributed Cloud Customer Edge (CE) site deployment, you can securely connect your on-premises to the cloud quickly and efficiently. Next Check out this video for some examples of Layer Two Attached CE use cases in Cisco ACI: Related Resources *On-Demand Webinar* Deploying F5 Distributed Cloud Services in Cisco ACI F5 Distributed Cloud (XC) Global Applications Load Balancing in Cisco ACI Deploying F5 Distributed Cloud (XC) Services in Cisco ACI - Layer Three Attached Deployment Customer Edge Site - Deployment & Routing Options Cisco ACI Endpoint Learning White Paper420Views0likes0CommentsDeploying F5 Distributed Cloud (XC) Services in Cisco ACI - Layer Three Attached Deployment
Introduction F5 Distributed Cloud (XC) Services are SaaS-based security, networking, and application management services that can be deployed across multi-cloud, on-premises, and edge locations. This article will show you how you can deploy F5 Distributed Cloud Customer Edge (CE) site in Cisco Application Centric Infrastructure (ACI) so that you can securely connect your application in Hybrid Multi-Cloud environment. XC Layer Three Attached CE in Cisco ACI A F5 Distributed Cloud Customer Edge (CE) site can be deployed with Layer Three Attached in Cisco ACI environment using Cisco ACI L3Out. As a reminder, Layer Three Attached is one of the deployment models to get traffic to/from a F5 Distributed Cloud CE site, where the CE can be a single node or a three nodes cluster. Static routing and BGP are both supported in the Layer Three Attached deployment model. When a Layer Three Attached CE site is deployed in Cisco ACI environment using Cisco ACI L3Out, routes can be exchanged between them via static routing or BGP. In this article, we will focus on BGP peering between Layer Three Attached CE site and Cisco ACI Fabric. XC BGP Configuration BGP configuration on XC is simple and it only takes a couple steps to complete: 1) Go to "Multi-Cloud Network Connect" -> "Networking" -> "BGPs". *Note: XC homepage is role based, and to be able to configure BGP, "Advanced User" is required. 2) "Add BGP" to fill out the site specific info, such as which CE Site to run BGP, its BGP AS number etc., and "Add Peers" to include its BGP peers’ info. *Note: XC supports direct connection for BGP peering IP reachability only. XC Layer Three Attached CE in ACI Example In this section, we will use an example to show you how to successfully bring up BGP peering between a F5 XC Layer Three Attached CE site and a Cisco ACI Fabric so that you can securely connect your application in Hybrid Multi-Cloud environment. Topology In our example, CE is a three nodes cluster(Master-0, Master-1 and Master-2) that has a VIP 10.10.122.122/32 with workloads, 10.131.111.66 and 10.131.111.77, in the cloud (AWS): The CE connects to the ACI Fabricvia a virtual port channel (vPC) that spans across two ACI boarder leaf switches. CE and ACI Fabric are eBGP peers via an ACI L3Out SVI for routes exchange. CE is eBGP peered to both ACI boarder leaf switches, so that in case one of them is down (expectedly or unexpectedly), CE can still continue to exchange routes with the ACI boarder leaf switch that remains up and VIP reachability will not be affected. XC BGP Configuration First, let us look at the XC BGP configuration ("Multi-Cloud Network Connect" -> "Networking" -> "BGPs"): We"Add BGP" of "jy-site2-cluster" with site specific BGP info along with a total of six eBGP peers (each CE node has two eBGP peers; one to each ACI boarder leaf switch): We "Add Item" to specify each of the six eBPG peers’ info: Example reference - ACI BGP configuration: XC BGP Peering Status There are a couple of ways to check the BGP peering status on the F5 Distributed Cloud Console: Option 1 Go to "Multi-Cloud Network Connect" -> "Networking" -> "BGPs" -> "Show Status" from the selected CE site to bring up the "Status Objects" page. The "Status Objects" page provides a summary of the BGP status from each of the CE nodes. In our example, all three CE nodes from "jy-site2-cluster" are cleared with "0 Failed Conditions" (Green): We can simply click on a CE node UID to further look into the BGP status from the selected CE node with all of its BGP peers. Here, we clicked on the UID of CE node Master-2 (172.18.128.14) and we can see it has two eBGP peers: 172.18.128.11 (ACI boarder leaf switch 1) and 172.18.128.12 (ACI boarder leaf switch 2), and both of them are Up: Here is the BGP status from the other two CE nodes - Master-0 (172.18.128.6) and Master-1 (172.18.128.10): For reference, here is an example of a CE node with "Failed Conditions" (Red) due to one of its BGP peers is down: Option 2 Go to "Multi-Cloud Network Connect" -> "Overview" -> "Sites" -> "Tools" -> "Show BGP peers" to bring up the BGP peers status info from all CE nodes from the selected site. Here, we can see the same BGP status of CE node master-2 (172.18.128.14) which has two eBGP peers: 172.18.128.11 (ACI boarder leaf switch 1) and 172.18.128.12 (ACI boarder leaf switch 2), and both of them are Up: Here is the output of the other two CE nodes - Master-0 (172.18.128.6) and Master-1 (172.18.128.10): Example reference - ACI BGP peering status: XC BGP Routes Status To check the BGP routes, both received and advertised routes, go to "Multi-Cloud Network Connect" -> "Overview" -> "Sites" -> "Tools" -> "Show BGP routes" from the selected CE sites: In our example, we see all three CE nodes (Master-0, Master-1 and Master-2) advertised (exported) 10.10.122.122/32 to both of its BPG peers: 172.18.128.11 (ACI boarder leaf switch 1) and 172.18.128.12 (ACI boarder leaf switch 2), while received (imported) 172.18.188.0/24 from them: Now, if we check the ACI Fabric, we should see both 172.18.128.11 (ACI boarder leaf switch 1) and 172.18.128.12 (ACI boarder leaf switch 2) advertised 172.18.188.0/24 to all three CE nodes, while received 10.10.122.122/32 from all three of them (note "|" for multipath in the output): XC Routes Status To view the routing table of a CE node (or all CE nodes at once), we can simply select "Show routes": Based on the BGP routing table in our example (shown earlier), we should see each CE node has two Equal Cost Multi-Path (ECMP) installed in the routing table for 172.18.188.0/24: one to 172.18.128.11 (ACI boarder leaf switch 1) and one to 172.18.128.12 (ACI boarder leaf switch 2) as the next-hop, and we do (note "ECMP" for multipath in the output): Now, if we check the ACI Fabric, each of the ACI boarder leaf switch should have three ECMP installed in the routing table for 10.10.122.122: one to each CE node (172.18.128.6, 172.18.128.10 and 172.18.128.14) as the next-hop, and we do: Validation We can now securely connect our application in Hybrid Multi-Cloud environment: *Note: After F5 XC is deployed, we also use F5 XC DNS as our primary nameserver: To check the requests on the F5 Distributed Cloud Console, go to"Multi-Cloud Network Connect" -> "Sites" -> "Requests" from the selected CE site: Summary A F5 Distributed Cloud Customer Edge (CE) site can be deployed with Layer Three Attached deployment model in Cisco ACI environment. Both static routing and BGP are supported in the Layer Three Attached deployment model and can be easily configured on F5 Distributed Cloud Console with just a few clicks. With F5 Distributed Cloud Customer Edge (CE) site deployment, you can securely connect your application in Hybrid Multi-Cloud environment quickly and efficiently. Next Check out this video for some examples of Layer Three Attached CE use cases in Cisco ACI: Related Resources *On-Demand Webinar*Deploying F5 Distributed Cloud Services in Cisco ACI F5 Distributed Cloud (XC) Global Applications Load Balancing in Cisco ACI Deploying F5 Distributed Cloud (XC) Services in Cisco ACI - Layer Two Attached Deployment Customer Edge Site - Deployment & Routing Options Cisco ACI L3Out White Paper1.3KViews4likes0CommentsF5 Hybrid Security Architectures (Part 2 - F5's Distributed Cloud WAF and NGINX App Protect WAF)
Here in this example solution, we will be using Terraform to deploy an AWS Elastic Kubernetes Service cluster running the Arcadia Finance test web application serviced by F5 NGINX Kubernetes Ingress Controller and protected by NGINX App Protect WAF. We will supplement this with F5 Distributed Cloud Web App and API Protection to provide complimentary security at the edge. Everything will be tied together using GitHub Actions for CI/CD and Terraform Cloud to maintain state.5KViews4likes0CommentsAutomate Multicloud Networking w/ Terraform: routing and app connect on F5 Distributed Cloud
Use Terraform and GitHub Actions with F5 Distributed Cloud to deploy infrastructure in multi-cloud environments, establish site-to-site network connectivity, deploy an example Kubernetes micro-services based app, then secure and deliver it publicly. This is an end-to-end fully automated workflow and solution intended for NetOps and DevOps teams supporting large enterprise environments.133Views0likes0CommentsSimplify Network Segmentation for Hybrid Cloud
Introduction Enterprises have always had the need to maintain separate development and production environments. Operational efficiency, reduction of blast radius, security and compliance are generally the common objectives behind separating these environments. By dividing networks into smaller, isolated segments, organizations can enhance security, optimize performance, and ensure regulatory compliance. This article demonstrates a practical strategy for implementing network segmentation in modern multicloud environments that also connect on-prem infrastructure. This uses F5 Distributed Cloud (F5 XC) services to connect and secure network segments in cloud environments like Amazon Web Services (AWS) and on-prem datacenters. Need for Segmentation Network segmentation is critical for managing complex enterprise environments. Traditional methods like Virtual Routing and Forwarding (VRFs) and Multiprotocol Label Switching (MPLS) have long been used to create isolated network segments in on-prem setups. F5 XC ensures segmentation in environments like AWS and it can extend the same segmentation to on-prem environments. These techniques separate traffic, enhance security, and improve network management by preventing unauthorized access and minimizing the attack surface. Scenario Overview Our scenario depicts an enterprise with three different environments (prod, dev, and shared services) extended between on-prem and cloud. A 3rd party entity requires access to a subset of the enterprise's services. This article, covers the following two networking segmentation use-cases: Hybrid Cloud Transit Extranet (servicing external 3 rd party partners/customers) Hybrid Cloud Transit Consider an enterprise with three distinct environments: Production (Prod), Development (Dev), and Shared Services. Each environment requires strict isolation to ensure security and performance. Using F5 XC Cloud Connect, we can assign each VPC a network segment effectively isolating the VPC’s. Segments in multiple locations (or VPC’s) can traverse F5 XC to reach distant locations whether in another cloud environment or on-prem. Network segments are isolated by default, for example, our Prod segment cannot access Shared. A segment connector is needed to allow traffic between Prod and Shared. The following diagram shows the VPC segments, ensuring complete "ships in the night" isolation between environments. In this setup, Prod, Dev, and Shared Services environments operate independently and are completely isolated from one another at the control plane level. This ensures that any issues or attacks in one environment do not affect the others. Customer Requirement: Shared Services Access Many enterprises deploy common services across their organization to support internal workloads and applications. Some examples include DHCP, DNS, NTP, and NFS, services that need to be accessible to both Prod and Dev environments while keeping Prod and Dev separate from each other. Segment Connectors is a method to allow communication between two isolated segments by leaking the routes between the source and destination segments. It is important to note that segment connector can be of type Direct or SNAT. Direct allows bidirectional communication between segments whereas the SNAT option allows unidirectional communication from the source to the destination. Extending Segmentation to On-Premises Enterprises already use segmented networks within their on-premises infrastructure. Extending this segmentation to AWS involves creating similar isolated segments in the cloud and establishing secure communication channels. F5 XC allows you to easily extend this segmentation from on-prem to the cloud regardless of the underlay technology. In this scenario, communication between the on-premises Prod segment and its cloud counterpart is seamless, and the same also applies for the Dev segment. Meanwhile Dev and Prod stay separate ensuring that existing security and isolation is preserved across the hybrid environment. Extranet In this scenario an external entity (customer/partner) needs access to a few applications within our Prod segment. There are two different ways to enable this access, Network-centric and App-centric. Let’s refer to the external entity as Company B. In order to connect Company B we generally need appropriate cloud credentials, but Company B will not share their cloud credentials with us. To solve this problem, F5 XC recommends using AWS STS:AssumeRole functionality whereby Company B creates an AWS IAM Role that trusts F5 XC with the minimum privileges necessary to configure Transit Gateway (TGW) attachments and TGW route table entries to extend access to the F5 XC network or network segments. Section 1 – Network-centric Extranet Many times, partners & customers need to access a unique subset of your enterprise’s applications. This can be achieved with F5 XC’s dedicated network segments and segment connectors. With a segment connector for the external and prod network segments, we can give Company B access to the required HTTP service without gaining broader access to other non-Prod segments. Locking Down with Firewall Policies We can implement a Zero Trust firewall policy to lock down access from the external segment. By refining these policies, we ensure that third-party consumers can only access the services they are authorized to use. Our firewall policy on the CE only allows access from the external segment to the intended application on TCP/80 in Prod. [ec2-user@ip-10-150-10-146 ~]$ curl --head 10.1.10.100 HTTP/1.1 200 OK Server: nginx/1.24.0 (Ubuntu) Date: Thu, 30 May 2024 20:50:30 GMT Content-Type: text/html Content-Length: 615 Last-Modified: Wed, 22 May 2024 21:35:11 GMT Connection: keep-alive ETag: "664e650f-267" Accept-Ranges: bytes [ec2-user@ip-10-150-10-146 ~]$ ping -O 10.1.10.100 PING 10.1.10.100 (10.1.10.100) 56(84) bytes of data. no answer yet for icmp_seq=1 no answer yet for icmp_seq=2 no answer yet for icmp_seq=3 ^C --- 10.1.10.100 ping statistics --- 4 packets transmitted, 0 received, 100% packet loss, time 3153ms After applying the new policies, we confirm that the third-party access is restricted to the intended services only, enhancing security and compliance. This demonstrates how F5 Distributed Cloud services enable networking segmentation across on-prem and cloud environments, with granular control over security policies applied between the segments. Section 2 - App-centric Extranet In the scenario above, Company B can directly access one or more services in Prod with a segment connector and we’ve locked it down with a firewall policy. For the App-centric method, we’ll only publish the intended services that live in Prod to the external segment. App-centric connectivity is made possible without a segment connector by using load balancers within App Connect that target the application within the Prod segment and advertises its VIP address to the external segment. The following illustration shows how to configure each component in the load balancer. Visualization of Traffic Flows The visualization flow analysis tool in the F5 XC Console shows traffic flows between the connected environments. By analyzing these flows, particularly between third-party consumers and the Prod environment, we can identify any unintended access or overreach. The following diagram is for a Network-centric connection flow: This following diagram shows an App-centric connection flow using the load balancer: Product Feature Demo Conclusion Effective network segmentation is a cornerstone of secure and efficient cloud environments. We’ve discussed how F5 XC enables hybrid cloud transit and extranet communication. Extranet can be done with either a network centric or app-centric deployment. F5 XC is an end to end platform that manages and orchestrates end-to-end segmentation and security in hybrid-cloud environments. Enterprises can achieve comprehensive segmentation, ensuring isolation, secure access, and compliance. The strategies and examples provided demonstrate how to implement and manage segmentation across hybrid environments, catering to diverse requirements and enhancing overall network security. Additional Resources More features and guidance are provided in the comprehensive guide below, where showing exactly how you can use the power and flexibility of F5 Distributed Cloud and Cloud Connect to deliver a Network-centric approach with a firewall and an App-centric approach with a load balancer. Create and manage segmented networks inyour own cloud and on-prem environments, and achieve the following benefits: Ability to isolate environments within AWS Ability to extend segmentation to on-prem environments Ability to connect external partners or customers to a specific segment Use Enhanced Firewall Policies to limit access and reduce the blast radius Enhance the compliance and regulatory requirements by isolating sensitive data and systems Visualize and monitor the traffic flows and policies across segments and network domains Workflow Guide - Secure Network Fabric (Multi-Cloud Networking) YouTube: Using network segmentation for hybrid-cloud and extranet with F5 Distributed Cloud Services DevCentral:Secure Multicloud Networking Article Series GitHub: S-MCN Use-case Playbooks (Console, Automation) for F5 Distributed Cloud Customers F5.com: Product Information Product Documentation Network Segmentation Cloud Connect Network Segment Connectors App Security App Networking CE Site Management236Views0likes0CommentsCustomer driven Site Deployment Using AWS and F5 Distributed Cloud Terraform Modules
Introduction and Problem Scope F5 Distributed Cloud Mesh’s Secure Networking provides connectivity and security services for your applications running on the Edge, Private Clouds, or Public Clouds. This simplifies the deployment and configuration of connectivity and security services for your Multi-Cloud and Edge Cloud deployment needs across heterogeneous environments. F5 Distributed Cloud Services leverage the“Site” construct to deploy our Secure Mesh or AppStack Site instances to manage workloads. A Site could be a customer location like AWS, Azure, Google Cloud Platform (GCP), private cloud, or an edge site. To run F5 Distributed Cloud Services, the site needs to be deployed with one or more instances ofF5 Distributed Cloud Node, a software appliance that is managed by F5 Distributed Cloud Console. This site is where customer applications and F5 Distributed Cloud services are running. To deploy a Node, different options are available: Use F5 Distributed Cloud Services Console to deploy a site Leverage F5 Distributed Cloud Services Terraform provider to deploy a site following F5 Distributed Cloud Services Console user experience Use F5 Distributed Cloud Services Terraform modules Documentation of all the different deployment patterns found at https://docs.cloud.f5.com/docs-v2/multi-cloud-network-connect/how-to/site-management A customer may not want to leverage the above two options since they rely on using F5 Distributed Cloud Services Console. Reasons not to use the mentioned two options could be: Security and Privacy Concerns Data Security: Reluctance to share sensitive data with another organization. Access Keys: Not willing to share cloud provider access keys or credentials. Compliance: Need to comply with specific regulatory requirements (e.g., GDPR (General Data Protection Regulation), HIPAA) that require control over data. Control and Customization Customization: Need for a highly customized orchestration solution tailored to specific requirements, to create networking and service topologies considering brownfield realities Cost and Resource Management: Resource Allocation: Better control over resource allocation and optimization Operational Considerations Support: Preference for internal support and troubleshooting over relying on external support. Uptime and SLAs (Service Level Agreements): Concerns about meeting service level agreements (SLAs) and uptime requirements To be able to roll out a site despite the points mentioned above, it is possible for the customer to manage the lifecycle of a site outside the F5 Distributed Cloud Services Console. F5 Distributed Cloud Services created a set ofterraform modules to help customers manage the lifecycle of a site outside of F5 Distributed Cloud Services Console. Those modules are available at: AWS module Azure module GCP module The F5 Distributed Cloud Services Site Management documentation provides an overview of all available site types and their documentation on the topic of provisioning. Though many topologies could be deployed via F5 Distributed Cloud Services Console, the following AWS, Azure, GCP topologies can only be realized using Terraform modules: Single Node Single NIC existing VPC / subnet and 3rd party NAT GW Single Node Multi NIC existing VPC / subnet and 3rd party NAT GW Three Node Single NIC existing VPC / subnet and 3rd party NAT GW Three Node Multi NIC existing VPC / subnet and 3rd party NAT GW Any other external resource and its attributes that are to be used, e.g. credentials from Vault systems, IAM policies, SSH keys Deployment Scenario in AWS The F5 DevCentral GitHubproject contains Terraform templates to provision greenfield and / or brownfield Customer Edge (CE) topologies in AWS, GCP, and Azure with multiple use case script templates in respective repositories. To exemplify one of the scenarios, in this article, we walk through the journey a customer would undertake to provision a CE site in AWS using Terraformmodules. High-level Sequence workflow All of the AWS, GCP, and Azure scenarios follow similar high-level steps, as shown in Fig.1. Step 1: F5 Distributed Cloud Services tenant needs to be ready and user to access tenant set up. Step 2: Clone the desired AWS, GCP, or Azure repo from F5 DevCentral GitHub project. For AWS, this is https://github.com/f5devcentral/terraform-xc-aws-ce. Each of these repositories contains multiple deployment scenarios called topologies. Each topology is described by its own readme "readme.md" file. The description includes The resource objects that are created Use instructions and all requirements to be able to create the topology. Especially in brownfield environments Step 3: Customize the “terraform.tfvars” file to the customer’s specific context. These include Distributed Cloud specific parameters. The parameters in this file are described in relation to the function it serves for the specific scenario. Step 4: Run through the Init/Plan/Deploy workflow of Terraform deployment and verify the status of the CE Site using F5 Distributed Cloud Services Console. The Terraform reconciliation functions ensure meeting the intended objectives. Fig. 1: High-Level Sequence Diagram Customer deployment topology description We will explain the above steps in the context of a greenfield deployment, the Terraform scripts of which are available here. The corresponding logical topology view of this deployment is shown in Fig.2. This deployment scenario instantiates the following resources: Single-node CE cluster AWS SLO interface AWS VPC AWS SLO interface subnet AWS route tables AWS Internet Gateway Assign AWS EIP to SLO The objective of this deployment is to create a Site with a single CE node in a new VPC for the provided AWS region and availability zone. The CE will be created as an AWS EC2 instance. An AWS subnet is created within the VPC. The CE Site Local Outside (SLO) interface will be attached to the VPC subnet and the created EC2 instance. SLO is a logical interface of a site (CE node) through which reachability is achieved to external (e.g. Internet or other services outside the public cloud site). To enable reachability to the Internet, the default route of the CE node will point to the AWS Internet gateway. Also, the SLO will be configured with an AWS External IP address (Elastic IP). Fig.2. Customer Deployment Topology in AWS Description of input parameters in Terraform vars file Parameters must be customized to adapt to the customer’s environment. The definition of the parameters in the “terraform.tfvars” is as follows: Parameters Definitions owner Identifies the email of the IT manager used to authenticate to the AWS system project_prefix Prefix that will be used to identify the resource objects in AWS and XC. project_suffix The suffix that will be used to identify the site resources in AWS and XC ssh_public_key_file Local file system’s path to ssh public key file f5xc_tenant Full F5XC tenant name f5xc_api_url F5XC API url f5xc_cluster_name Name of the Cluster f5xc_api_p12_file Local file system path to api_cert_file (downloaded from XC Console) aws_region AWS region for the XC Site aws_existing_vpc_id Existing VPC ID (brownfield) aws_vpc_cidr_block CIDR Block of the VPC aws_availability_zone AWS Availability Zone (a) aws_vpc_slo_subnet_node0 AWS Subnet in the VPC for the SLO subnet Configuring other environmental variables Export the following environment variables in the working shell, setting it to customer’s deployment context. Environment Variables Definitions AWS_ACCESS_KEY AWS Access key for authentication AWS_SECRET_ACCESS_KEY AWS Secret key for authentication VES_P12_PASSWORD XC P12 Password from Console TF_VAR_f5xc_api_p12_cert_password Same as VES_P12_PASSWORD Deploy Topology Deploy the topology with: terraform init terraform plan terraform deploy –auto-approve and monitor the status of the Sites on the F5 Distributed Cloud Services Console. Created site object will be available in Secure Mesh Site section of the F5Distributed CloudServices Console. Video-based description of the deployment Scenario This demonstration video shows the procedure for provisioning the deployment topology described above in three steps. <p><iframe src="https://www.youtube.com/watch?v=8_T3dQSEdhc" width="750" height="422" frameborder="0" allowfullscreen></iframe></p> References https://docs.cloud.f5.com/docs-v2/platform/services/mesh/secure-networking https://docs.cloud.f5.com/docs-v2/platform/concepts/site https://docs.cloud.f5.com/docs-v2/multi-cloud-network-connect/how-to/site-management https://docs.cloud.f5.com/docs-v2/multi-cloud-network-connect/how-to/site-management/deploy-aws-site-terraform https://docs.cloud.f5.com/docs-v2/multi-cloud-network-connect/troubleshooting/troubleshoot-manual-ce-deployment-registration-issues Note: This project is open source and actively monitored by F5 XC on a best-effort basis. While there is no formal commitment regarding service level agreements (SLA) or support assistance, we encourage the community to report any issues through GitHub. Customers and partners are warmly invited to contribute to the code, fostering a collaborative environment that enhances the project's development and usability.168Views0likes0Comments