Amazon MSK Serverless is a cluster type of Amazon Managed Streaming for Apache Kafka (Amazon MSK) that makes it easy for you to run Apache Kafka without having to manage and scale cluster capacity. MSK Serverless automatically provisions and scales compute and storage resources. With MSK Serverless, you can use Apache Kafka on demand and pay for the data you stream and retain on a usage basis.
Deploying infrastructure across multiple VPCs and multiple accounts is considered best practice, facilitating scalability while maintaining isolation boundaries. In a multi-account environment, Kafka producers and consumers can exist within the same VPC; however, they are often located in different VPCs, sometimes within the same account, in a different account, or even in multiple different accounts. There is a need for a solution that can extend access to MSK Serverless clusters to producers and consumers from multiple VPCs within the same AWS account and across multiple AWS accounts. The solution needs to be scalable and straightforward to maintain.
In this post, we walk you through multiple solution approaches that address the MSK Serverless cross-VPC and cross-account access connectivity options, and we discuss the advantages and limitations of each approach.
MSK Serverless connectivity and authentication
When an MSK Serverless cluster is created, AWS manages the cluster infrastructure on your behalf and extends private connectivity back to your VPCs through VPC endpoints powered by AWS PrivateLink. You bootstrap your connection to the cluster through a bootstrap server that holds a record of all your underlying brokers.
At creation, a fully qualified domain name (FQDN) is assigned to your cluster bootstrap server. The bootstrap server FQDN has the general format of boot-ClusterUniqueID.xx.kafka-serverless.Region.amazonaws.com, and your cluster brokers follow the format of bxxxx-ClusterUniqueID.xx.kafka-serverless.Region.amazonaws.com, where ClusterUniqueID.xx is unique to your cluster and bxxxx is a dynamic broker range (b0001, b0037, and b0523 can be some of your assigned brokers at a point in time). It's worth noting that the brokers assigned to your cluster are dynamic and change over time, but your bootstrap address remains the same for the cluster. All your communication with the cluster starts with the bootstrap server, which can respond with the list of active brokers when required. For proper Kafka communication, your MSK client needs to be able to resolve the domain names of your bootstrap server as well as all your brokers.
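To make the naming scheme concrete, the following is a minimal Python sketch that recognizes bootstrap and broker FQDNs and derives a broker name from a bootstrap name. The helper names are hypothetical, and the exact character set of ClusterUniqueID.xx and the four-digit broker number are assumptions based only on the example formats above.

```python
import re

# Assumed character classes: the post only shows the general layout
# (boot-ClusterUniqueID.xx... and bxxxx-ClusterUniqueID.xx...), so the
# alphanumeric ID pattern and the four-digit broker number are guesses.
_SUFFIX = (
    r"(?P<cluster>[A-Za-z0-9]+\.[A-Za-z0-9]+)"
    r"\.kafka-serverless\.(?P<region>[a-z0-9-]+)\.amazonaws\.com"
)
BOOTSTRAP_RE = re.compile(r"^boot-" + _SUFFIX + r"$")
BROKER_RE = re.compile(r"^b(?P<broker>\d{4})-" + _SUFFIX + r"$")


def classify_fqdn(fqdn):
    """Return (kind, cluster_id, region); kind is 'bootstrap', 'broker', or 'other'."""
    for kind, pattern in (("bootstrap", BOOTSTRAP_RE), ("broker", BROKER_RE)):
        match = pattern.match(fqdn)
        if match:
            return kind, match.group("cluster"), match.group("region")
    return "other", None, None


def broker_fqdn(bootstrap_fqdn, broker_number):
    """Derive a broker FQDN from the bootstrap FQDN of the same cluster."""
    kind, cluster, region = classify_fqdn(bootstrap_fqdn)
    if kind != "bootstrap":
        raise ValueError("not a bootstrap FQDN: " + bootstrap_fqdn)
    return f"b{broker_number:04d}-{cluster}.kafka-serverless.{region}.amazonaws.com"
```

Because broker numbers change over time, a client cannot enumerate these names in advance; this is why the DNS approaches later in the post resolve the whole kafka-serverless.Region.amazonaws.com domain rather than individual hosts.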
At cluster creation, you specify the VPCs that you would like the cluster to communicate with (up to five VPCs in the same account as your cluster). For each VPC specified during cluster creation, cluster VPC endpoints are created along with a private hosted zone that includes a list of your bootstrap server and all dynamic brokers, kept up to date. The private hosted zones facilitate resolving the FQDNs of your bootstrap server and brokers, from within the associated VPCs defined during cluster creation, to the respective VPC endpoints for each.
Cross-account access
To extend private connectivity of your Kafka producers and consumers to your MSK Serverless cluster, you need to consider three main aspects: private connectivity, authentication and authorization, and DNS resolution.
The following diagram highlights the possible connectivity options. Although the diagram shows all of them here for demonstration purposes, in most cases you would use one or more of these options depending on your architecture, not necessarily all in the same setup.
In this section, we discuss the different connectivity options along with their pros and cons. We also cover the authentication and DNS resolution aspects associated with the relevant connectivity options.
Private connectivity layer
This is the underlying private network connectivity. You can achieve this connectivity using VPC peering, AWS Transit Gateway, or PrivateLink, as indicated in the preceding diagram. VPC peering simplifies the setup, but it lacks support for transitive routing. In most cases, peering is used when you have a limited number of VPCs or if your VPCs generally communicate with a limited number of core services VPCs without the need for lateral connectivity or transitive routing. On the other hand, AWS Transit Gateway facilitates transitive routing and can simplify the architecture when you have a large number of VPCs, and especially when lateral connectivity is required. PrivateLink is more suited to extending connectivity to a specific resource unidirectionally across VPCs or accounts without exposing full VPC-to-VPC connectivity, thereby adding a layer of isolation. PrivateLink is useful if you have overlapping CIDRs, which is a case that isn't supported by Transit Gateway or VPC peering. PrivateLink is also useful when your connected parties are administered separately, and when one-way connectivity and isolation are required.
If you choose PrivateLink as a connectivity option, you need to use a Network Load Balancer (NLB) with an IP type target group whose registered targets are the IP addresses of the zonal endpoints of your MSK Serverless cluster.
Cluster authentication and authorization
In addition to having private connectivity and being able to resolve the bootstrap server and broker domain names, for your producers and consumers to have access to your cluster, you need to configure your clients with proper credentials. MSK Serverless supports AWS Identity and Access Management (IAM) authentication and authorization. For cross-account access, your MSK client needs to assume a role that has proper credentials to access the cluster. This post focuses mainly on the cross-account connectivity and name resolution aspects. For more details on cross-account authentication and authorization, refer to the following GitHub repo.
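As an illustration of the client-side configuration, the following Python sketch generates Kafka client properties for IAM authentication with an assumed role. The property names and class names follow the documented settings of the open-source aws-msk-iam-auth library for Java clients; the post itself doesn't prescribe a client library, so treat that choice as an assumption. The function names and the role ARN are hypothetical.

```python
def iam_client_properties(role_arn=None):
    """Build Kafka client properties for MSK IAM authentication.

    role_arn is optional; when given, the client is asked to assume that
    role (for example, a role in the MSK cluster account).
    """
    jaas = "software.amazon.msk.auth.iam.IAMLoginModule required"
    if role_arn:
        jaas += f' awsRoleArn="{role_arn}"'
    jaas += ";"
    return {
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "AWS_MSK_IAM",
        "sasl.jaas.config": jaas,
        "sasl.client.callback.handler.class": (
            "software.amazon.msk.auth.iam.IAMClientCallbackHandler"
        ),
    }


def to_properties_text(props):
    """Render the dict in Java .properties format, one key=value per line."""
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


if __name__ == "__main__":
    # Placeholder account ID and role name, for illustration only.
    print(to_properties_text(
        iam_client_properties("arn:aws:iam::111122223333:role/msk-client-role")
    ))
```

The role referenced by awsRoleArn still needs a trust policy that allows the client's account to assume it, plus the Kafka cluster permissions; those policies are outside the scope of this sketch.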
DNS resolution
For Kafka producers and consumers located in accounts across the organization to be able to produce to and consume from the centralized MSK Serverless cluster, they need to be able to resolve the FQDNs of the cluster bootstrap server as well as each of the cluster brokers. Given the dynamic nature of broker allocation, the solution must accommodate this requirement. In the next section, we address how we can satisfy this part of the requirements.
Cluster cross-account DNS resolution
Now that we have discussed how MSK Serverless works, how private connectivity is extended, and the authentication and authorization requirements, let's discuss how DNS resolution works for your cluster.
For every VPC associated with your cluster during cluster creation, a VPC endpoint is created along with a private hosted zone. Private hosted zones enable name resolution of the FQDNs of the cluster bootstrap server and the dynamically allocated brokers, from within each respective VPC. This works well when requests come from within any of the VPCs that were added during cluster creation because they already have the required VPC endpoints and associated private hosted zones.
Let's discuss how you can extend name resolution to other VPCs within the same account that weren't included during cluster creation, and to others that may be located in other accounts.
You've already made your choice of the private connectivity option that best fits your architecture requirements, be it VPC peering, PrivateLink, or Transit Gateway. Assuming that you have also configured your MSK clients to assume roles that have the right IAM credentials to facilitate cluster access, you now need to address the name resolution aspect of connectivity. It's worth noting that, although we list different connectivity options using VPC peering, Transit Gateway, and PrivateLink, in most cases only one or two of these connectivity options are present. You don't necessarily need to have all of them; they are listed here to demonstrate your options, and you are free to choose the ones that best fit your architecture and requirements.
In the following sections, we describe two different methods to address DNS resolution. Each method has advantages and limitations.
Private hosted zones
The following diagram highlights the solution architecture and its components. Note that, to simplify the diagram, and to make room for more relevant details required in this section, we have eliminated some of the connectivity options.
The solution starts with creating a private hosted zone, followed by creating a VPC association.
Create a private hosted zone
We start by creating a private hosted zone for name resolution. To make the solution scalable and straightforward to maintain, you can choose to create this private hosted zone in the same MSK Serverless cluster account; in some cases, creating the private hosted zone in a centralized networking account is preferred. Having the private hosted zone created in the MSK Serverless cluster account facilitates centralized management of the private hosted zone alongside the MSK cluster. We can then associate the centralized private hosted zone with VPCs within the same account, or in other accounts. Choosing to centralize your private hosted zones in a networking account is also a viable option to consider.
The purpose of the private hosted zone is to resolve the FQDNs of the bootstrap server as well as all the dynamically assigned cluster-associated brokers. As discussed earlier, the bootstrap server FQDN format is boot-ClusterUniqueID.xx.kafka-serverless.Region.amazonaws.com, and the cluster brokers use the format bxxxx-ClusterUniqueID.xx.kafka-serverless.Region.amazonaws.com, with bxxxx being the broker ID. You need to create the new private hosted zone with the primary domain set as kafka-serverless.Region.amazonaws.com, with an A-Alias record called *.kafka-serverless.Region.amazonaws.com pointing to the Regional VPC endpoint of the MSK Serverless cluster in the MSK cluster VPC. This should be sufficient to direct all traffic targeting your cluster to the primary cluster VPC endpoints that you specified in your private hosted zone.
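The hosted zone and wildcard record described above can be sketched as request payloads for the Route 53 API. The following Python functions only build the parameter dictionaries in the shape expected by the boto3 Route 53 client methods create_hosted_zone and change_resource_record_sets; they make no AWS calls. The function names are hypothetical, and the VPC ID, endpoint DNS name, and endpoint hosted zone ID are values you would look up for your own cluster VPC endpoint.

```python
import uuid


def hosted_zone_request(region, vpc_id):
    """Parameters for route53.create_hosted_zone(**request) (private zone)."""
    return {
        "Name": f"kafka-serverless.{region}.amazonaws.com",
        # A private hosted zone must be created with an initial VPC.
        "VPC": {"VPCRegion": region, "VPCId": vpc_id},
        "CallerReference": str(uuid.uuid4()),  # must be unique per request
        "HostedZoneConfig": {
            "Comment": "MSK Serverless cross-VPC name resolution",
            "PrivateZone": True,
        },
    }


def wildcard_alias_change(region, endpoint_dns_name, endpoint_hosted_zone_id):
    """ChangeBatch for route53.change_resource_record_sets(...).

    endpoint_dns_name and endpoint_hosted_zone_id are the DNS name and hosted
    zone ID reported for the cluster's Regional VPC endpoint.
    """
    return {
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": f"*.kafka-serverless.{region}.amazonaws.com",
                    "Type": "A",
                    "AliasTarget": {
                        "HostedZoneId": endpoint_hosted_zone_id,
                        "DNSName": endpoint_dns_name,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ]
    }
```

With boto3 available, these would be passed as client.create_hosted_zone(**hosted_zone_request(...)) and client.change_resource_record_sets(HostedZoneId=zone_id, ChangeBatch=wildcard_alias_change(...)).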
Now that you have created the private hosted zone, for name resolution to work, you need to associate the private hosted zone with every VPC where you have clients for the MSK cluster (producer or consumer).
Associate a private hosted zone with VPCs in the same account
For VPCs that are in the same account as the MSK cluster and weren't included in the configuration during cluster creation, you can associate them with the private hosted zone using the AWS Management Console by editing the private hosted zone settings and adding the respective VPCs. For more information, refer to Associating more VPCs with a private hosted zone.
Associate a private hosted zone in cross-account VPCs
For VPCs that are in a different account from the MSK cluster account, refer to Associating an Amazon VPC and a private hosted zone that you created with different AWS accounts. The key steps are as follows:
- Create a VPC association authorization in the account where the private hosted zone is created (in this case, the same account as the MSK Serverless cluster) to authorize the remote VPCs to be associated with the hosted zone.
- Associate the VPC with the private hosted zone in the account where you have the VPCs with the MSK clients (remote account), referencing the association authorization you created earlier.
- Delete the VPC authorization to associate the VPC with the hosted zone.
Deleting the authorization doesn't affect the association; it just prevents you from re-associating the VPC with the hosted zone in the future. If you want to re-associate the VPC with the hosted zone, you need to repeat the first two steps of this procedure.
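The three steps above can be sketched as an ordered plan of Route 53 API calls. This Python helper (a hypothetical name) only returns which boto3 Route 53 client method would be called, in which account, and with which parameters; it performs no AWS calls. The hosted zone ID and VPC ID are placeholders.

```python
def cross_account_association_plan(hosted_zone_id, vpc_id, vpc_region):
    """Return the ordered (account, boto3 method, parameters) steps.

    All three Route 53 calls take the same HostedZoneId and VPC arguments;
    what differs is the account whose credentials make the call.
    """
    params = {
        "HostedZoneId": hosted_zone_id,
        "VPC": {"VPCRegion": vpc_region, "VPCId": vpc_id},
    }
    return [
        # Step 1: run with credentials of the account that owns the hosted zone.
        ("hosted-zone account", "create_vpc_association_authorization", dict(params)),
        # Step 2: run with credentials of the remote account that owns the VPC.
        ("remote account", "associate_vpc_with_hosted_zone", dict(params)),
        # Step 3: clean up the authorization; the association itself remains.
        ("hosted-zone account", "delete_vpc_association_authorization", dict(params)),
    ]


if __name__ == "__main__":
    for account, method, params in cross_account_association_plan(
        "ZEXAMPLEZONEID", "vpc-0example", "us-east-1"
    ):
        print(f"[{account}] route53.{method}({params})")
```

In practice, each step would be executed as getattr(route53_client, method)(**params), using a client created with credentials for the account named in that step.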
Note that your VPC needs to have the enableDnsSupport and enableDnsHostnames DNS attributes enabled for this to work. These two settings can be configured under the VPC DNS settings. For more information, refer to DNS attributes in your VPC.
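As a small illustration of this prerequisite, the following Python sketch takes the current state of the two attributes and returns the requests needed to enable whichever is missing. Each returned dictionary matches the parameter shape of the boto3 EC2 client method modify_vpc_attribute, which accepts one attribute per call. The function name and VPC ID are hypothetical, and no AWS calls are made here.

```python
# Maps the VPC attribute name to the corresponding modify_vpc_attribute parameter.
REQUIRED_ATTRIBUTES = {
    "enableDnsSupport": "EnableDnsSupport",
    "enableDnsHostnames": "EnableDnsHostnames",
}


def dns_attribute_fixes(vpc_id, current):
    """Return one modify_vpc_attribute request per attribute not yet enabled.

    current maps attribute names to booleans, for example the values read
    back from describe_vpc_attribute for each attribute.
    """
    requests = []
    for attribute, parameter in REQUIRED_ATTRIBUTES.items():
        if not current.get(attribute, False):
            requests.append({"VpcId": vpc_id, parameter: {"Value": True}})
    return requests
```

An empty list means the VPC already satisfies both requirements.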
These procedures work well for all remote accounts when connectivity is extended using VPC peering or Transit Gateway. If your connectivity option uses PrivateLink, the private hosted zone needs to be created in the remote account instead (the account where the PrivateLink VPC endpoints are). In addition, an A-Alias record that resolves to the PrivateLink endpoint instead of the MSK cluster endpoint needs to be created, as indicated in the earlier diagram. This facilitates name resolution to the PrivateLink endpoint. If other VPCs need access to the cluster through that same PrivateLink setup, follow the same private hosted zone association procedure described earlier and associate your other VPCs with the private hosted zone created for your PrivateLink VPC.
Limitations
The private hosted zones solution has some key limitations.
Firstly, because you're using kafka-serverless.Region.amazonaws.com as the primary domain for your private hosted zone, and your A-Alias record uses *.kafka-serverless.Region.amazonaws.com, all traffic to the MSK Serverless service originating from any VPC associated with this private hosted zone is directed to the one specific cluster VPC Regional endpoint that you specified in the hosted zone A-Alias record.
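The effect of the wildcard record can be illustrated with a toy model. The following Python sketch is not a DNS implementation; it only mimics a private hosted zone whose single record is a wildcard alias, to show that names belonging to two different clusters end up at the same endpoint. The function name, cluster IDs, and endpoint name are all made up for illustration.

```python
def resolve(fqdn, zone_domain, alias_target):
    """Toy lookup for a zone that contains only a wildcard alias record.

    Any name under zone_domain matches the wildcard and returns the single
    alias target; anything else is outside the zone and returns None.
    """
    if fqdn.endswith("." + zone_domain):
        return alias_target
    return None


ZONE = "kafka-serverless.us-east-1.amazonaws.com"
ENDPOINT_OF_CLUSTER_A = "vpce-cluster-a.example.internal"  # placeholder target

# Names for two different (made-up) clusters in the same Region.
cluster_a = "boot-aaaa1111.c1." + ZONE
cluster_b = "boot-bbbb2222.c2." + ZONE

# Both lookups return cluster A's endpoint, so cluster B is unreachable
# from any VPC associated with this zone.
print(resolve(cluster_a, ZONE, ENDPOINT_OF_CLUSTER_A))
print(resolve(cluster_b, ZONE, ENDPOINT_OF_CLUSTER_A))
```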
This solution is valid if you have one MSK Serverless cluster in your centralized service VPC. If you need to provide access to multiple MSK Serverless clusters, you can use the same solution but adopt a distributed private hosted zone approach as opposed to a centralized approach. In a distributed private hosted zone approach, each private hosted zone can point to a specific cluster. The VPCs associated with that specific private hosted zone communicate only with the respective cluster listed under that private hosted zone.
In addition, after you establish a VPC association with a private hosted zone resolving *.kafka-serverless.Region.amazonaws.com, the respective VPC can communicate only with the cluster defined in that specific private hosted zone and no other cluster. An exception to this rule is if a local cluster is created within the same client VPC, in which case the clients within the VPC can communicate only with the local cluster.
You can also use PrivateLink to accommodate multiple clusters by creating a PrivateLink plus private hosted zone per cluster, replicating the configuration steps described earlier.
Both solutions, using distributed private hosted zones or PrivateLink, are still subject to the limitation that for each client VPC, you can communicate only with the one MSK Serverless cluster that your associated private hosted zone is configured for.
In the next section, we discuss another possible solution.
Resolver rules and AWS Resource Access Manager
The following diagram shows a high-level overview of the solution using Amazon Route 53 resolver rules and AWS Resource Access Manager.
The solution starts with creating Route 53 inbound and outbound resolver endpoints, which are associated with the MSK cluster VPC. You then create a resolver forwarding rule in the MSK account that is not associated with any VPC. Next, you share the resolver rule across accounts using Resource Access Manager. In the remote account where you need to extend name resolution (the account where the MSK clients are located), you need to accept the resource share and associate the resolver rules with your target VPCs.
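The flow above can be sketched as request payloads for the relevant APIs. The following Python functions only build parameter dictionaries in the shape expected by the boto3 methods create_resolver_rule and associate_resolver_rule (route53resolver client) and create_resource_share (ram client); they make no AWS calls. The function names are hypothetical, and the endpoint ID, IP addresses, rule ARN, rule ID, and account ID are placeholders that you would take from your own environment.

```python
import uuid


def forward_rule_request(region, outbound_endpoint_id, inbound_endpoint_ips):
    """MSK account: parameters for route53resolver.create_resolver_rule.

    The rule forwards queries for the MSK Serverless domain through the
    outbound endpoint to the IP addresses of the inbound endpoint.
    """
    return {
        "CreatorRequestId": str(uuid.uuid4()),  # idempotency token
        "Name": "msk-serverless-forward",
        "RuleType": "FORWARD",
        "DomainName": f"kafka-serverless.{region}.amazonaws.com",
        "TargetIps": [{"Ip": ip, "Port": 53} for ip in inbound_endpoint_ips],
        "ResolverEndpointId": outbound_endpoint_id,
    }


def share_request(rule_arn, remote_account_ids):
    """MSK account: parameters for ram.create_resource_share."""
    return {
        "name": "msk-serverless-resolver-rule",
        "resourceArns": [rule_arn],
        "principals": list(remote_account_ids),
    }


def association_request(rule_id, remote_vpc_id):
    """Remote account: parameters for route53resolver.associate_resolver_rule.

    Run after the resource share has been accepted in the remote account.
    """
    return {
        "ResolverRuleId": rule_id,
        "VPCId": remote_vpc_id,
        "Name": "msk-serverless-forward-association",
    }
```

The rule is deliberately left unassociated in the MSK account; only the remote client VPCs are associated with it, so their queries for the MSK Serverless domain are answered from within the cluster VPC.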
For more information about this approach, refer to the third use case in Simplify DNS management in a multi-account environment with Route 53 Resolver.
This solution accommodates multiple centralized MSK Serverless clusters in a more scalable and flexible way. It relies on directing DNS requests to be resolved by the VPC where the MSK clusters are. Multiple MSK Serverless clusters can coexist, and clients in a particular VPC can communicate with one or more of them at the same time. This option is not supported with the private hosted zone approach.
Limitations
Though this resolution has its benefits, it additionally has a couple of limitations.
Firstly, for a selected goal client or producer account, all of your MSK Serverless clusters have to be in the identical core service VPC within the MSK account. This is because of the truth that the resolver rule is ready on an account degree and makes use of.kafka-serverless.Area.amazonaws.com as the first area, directing its decision to 1 particular VPC resolver endpoint inbound/outbound pair inside that service VPC. If you must have separate clusters in several VPCs, take into account creating separate accounts.
The second limitation is that every one your shopper VPCs have to be in the identical Area as your core MSK Serverless VPC. The explanation behind this limitation is that resolver guidelines pointing to a resolver endpoint pair (in actuality, they level to the outbound endpoint that loops into the inbound endpoints) have to be in the identical Area because the resolver guidelines, and Useful resource Entry Supervisor will prolong the share solely throughout the identical Area. Nonetheless, this resolution is sweet when you might have a number of MSK clusters in the identical core VPC, and though your distant shoppers are in several VPCs and accounts, they’re nonetheless throughout the identical Area. A workaround for this limitation is to duplicate the creation of resolver guidelines and outbound resolver endpoint in a second Area, the place the outbound endpoint loops again via the unique first Area inbound resolver endpoint related to the MSK Serverless cluster VPC (assuming IP connectivity is facilitated). This second Area resolver rule can then be shared utilizing Useful resource Entry Supervisor throughout the second Area.
Conclusion
You can configure MSK Serverless cross-VPC and cross-account access in multi-account environments using private hosted zones or Route 53 resolver rules. The solution discussed in this post allows you to centralize your configuration while extending cross-account access, making it a scalable and straightforward-to-maintain solution. You can create your MSK Serverless clusters with cross-account access for producers and consumers, keep your focus on your business outcomes, and gain insights from sources of data across your organization without having to right-size and manage a Kafka infrastructure.
About the Author
Tamer Soliman is a Senior Solutions Architect at AWS. He helps Independent Software Vendor (ISV) customers innovate, build, and scale on AWS. He has over two decades of industry experience, and is an inventor with three granted patents. His experience spans multiple technology domains including telecom, networking, application integration, data analytics, AI/ML, and cloud deployments. He specializes in AWS Networking and has a profound passion for machine learning, AI, and Generative AI.