Saturday, July 6, 2024

Simplify authentication with native LDAP integration on Amazon EMR

Many companies store corporate identities in identity providers (IdPs) such as Active Directory (AD) or OpenLDAP. Previously, customers using Amazon EMR could integrate their clusters with Active Directory by configuring a one-way realm trust between their AD domain and the EMR cluster Kerberos realm. For more details, refer to Tutorial: Configure a cross-realm trust with an Active Directory domain.

This setup has been a key enabler for making corporate users and groups available inside EMR clusters and for defining access control policies that govern their data access (for example, through the Amazon EMR native Apache Ranger integration).

Although this option is still available, Amazon EMR has released support for native LDAP authentication, a new security feature that simplifies the integration with OpenLDAP and Active Directory.

This feature enables the following:

  • Automated configuration of security for the supported applications (HiveServer2, Trino, Presto, and Livy) to use the Kerberos protocol under the hood and LDAP as external authentication. This allows a more straightforward integration from external tools: to connect to cluster endpoints, they no longer have to set up Kerberos authentication and can instead simply be configured to provide an LDAP user name and password.
  • Fine-grained access control (FGAC) over who can access your EMR clusters through SSH.
  • Fine-grained authorization policies on top of Hive Metastore databases and tables if used in combination with the native Amazon EMR Apache Ranger integration.

In this post, we dive deep into Amazon EMR LDAP authentication, showing how the authentication flow works, how to retrieve and test the needed LDAP configurations, and how to confirm that an EMR cluster is properly LDAP integrated.

Using the information in this post:

  • Teams managing EMR clusters can improve coordination with their LDAP IdP administrators in order to request the correct information and properly perform pre-configuration checks
  • EMR cluster end-users can understand how simple it is to connect from external tools to LDAP-enabled EMR clusters compared to the previous Kerberos-based authentication

How Amazon EMR LDAP integration works

When talking about authentication in the context of EMR frameworks, we can distinguish between two levels:

  • External authentication – Used by users and external components to interact with the installed frameworks
  • Internal authentication – Used within the frameworks to authenticate the communications of internal components

With this new feature, internal framework authentication is still managed through Kerberos, but this is transparent to the end-users and external services, which instead use a user name and password to authenticate.

The supported frameworks installed on EMR implement an LDAP-based authentication method that, given a user name and password, validates them against the LDAP endpoint and, in the case of success, allows the use of the framework.

The following diagram summarizes how the authentication flow works.

The workflow consists of the following steps:

  1. A user connects to one of the supported endpoints (such as HiveServer2, Trino/Presto Coordinator, or Hue WebUI) and provides their corporate credentials (user name and password).
  2. The contacted framework uses a custom authenticator that performs the authentication using the EMR Secret Agent service running inside the cluster instances.
  3. The EMR Secret Agent service validates the provided credentials against the LDAP endpoint.
  4. In the case of success, the following occurs:
    • A Kerberos principal is created for the specific user on the cluster MIT key distribution center (MIT KDC) running inside the primary node.
    • The Kerberos principal keytab is created inside the home directory of the user on the primary node.

After the authentication is complete, the user can start using the framework.

Inside all the cluster instances, the SSSD service is configured to retrieve users and groups from the LDAP endpoint and make them available as system users.

The authentication flow when connecting with SSH is a bit different, and is summarized in the following diagram.

The workflow consists of the following steps:

  1. A user connects with SSH to the EMR primary instance, providing the corporate credentials (user name and password).
  2. The contacted SSHD service uses the SSSD service to validate the provided credentials.
  3. The SSSD service validates the provided credentials against the LDAP endpoint. In the case of success, the user lands in the related home directory. At this point, the user can use the different CLIs (beeline, trino-cli, presto-cli, curl) to access Hive, Trino/Presto, or Livy.
  4. To use the Spark CLIs (spark-submit, pyspark, spark-shell), the user has to invoke the ldap-kinit script and provide the requested user name and password.
  5. The authentication is performed using the EMR Secret Agent service running inside the cluster instances.
  6. The EMR Secret Agent service validates the provided credentials against the LDAP endpoint.
  7. In the case of success, the following occurs:
    • A Kerberos principal is created for the specific user on the cluster MIT KDC running inside the primary node.
    • The Kerberos principal keytab is created inside the home directory of the user on the primary node.
    • A Kerberos ticket is obtained and stored in the user Kerberos ticket cache on the primary node.

After the ldap-kinit script completes, the user can start using the Spark CLIs.

In the following sections, we show how to retrieve the required LDAP setting values, how to launch a cluster with EMR LDAP authentication, and how to test it.

Find the correct LDAP parameters

To configure LDAP authentication for Amazon EMR, the first step is to retrieve the LDAP properties to be used to set up your cluster. You need the following information:

  • The LDAP server DNS name
  • A certificate in PEM format to be used to interact over Secure LDAP (LDAPS) with the LDAP endpoint
  • The LDAP user search base, which is a path (or branch) on the LDAP tree from where to search users (only users belonging to this branch will be retrieved)
  • The LDAP groups search base, which is a path (or branch) on the LDAP tree from where to search groups (only groups belonging to this branch will be retrieved)
  • The LDAP server bind user credentials, which are the user name and password for a service user (usually called a bind user) to be used to run LDAP queries and retrieve user information such as user name and group membership

With Active Directory, an AD admin can retrieve this information directly from the Active Directory Users and Computers tool. When you choose a user in this tool, you can see the related attributes (for example, distinguishedName). The following screenshot shows an example.

From the screenshot, we can see that the distinguishedName for the user john is CN=john,OU=users,OU=italy,OU=emr,DC=awsemr,DC=com, which means that john belongs to the following search bases, ordered from the narrowest to the widest:

  • OU=users,OU=italy,OU=emr,DC=awsemr,DC=com
  • OU=italy,OU=emr,DC=awsemr,DC=com
  • OU=emr,DC=awsemr,DC=com
  • DC=awsemr,DC=com

Depending on the number of entries in a company LDAP directory, using a wide search base may lead to long retrieval times and timeouts. It’s a good practice to configure the search base to be as narrow as possible while still including all the needed users. In the preceding example, OU=users,OU=italy,OU=emr,DC=awsemr,DC=com may be a good search base if all the users you want to give access to the EMR cluster are part of that organizational unit.
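
The walk from a distinguished name up to its candidate search bases can be sketched in shell. This is only an illustration, not something shipped with Amazon EMR: the function name list_search_bases is hypothetical, and the sketch assumes that no RDN value contains an escaped comma.

```shell
#!/bin/sh
# Print the candidate search bases for a DN, narrowest first.
# Hypothetical helper; assumes RDN values contain no escaped commas.
list_search_bases() {
  rest=${1#*,}                        # drop the leading RDN (for example CN=john)
  while :; do
    printf '%s\n' "$rest"
    case $rest in
      OU=*|ou=*) rest=${rest#*,} ;;   # climb one organizational unit
      *) break ;;                     # stop once only domain components remain
    esac
  done
}

list_search_bases "CN=john,OU=users,OU=italy,OU=emr,DC=awsemr,DC=com"
```

The first line printed is the narrowest base; moving down the list trades retrieval speed for coverage.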

Another way to retrieve user attributes is by using the ldapsearch tool. You can use this method for Active Directory as well as OpenLDAP, and it’s extremely useful for testing the connectivity with the LDAP endpoint.

The following is an example with Active Directory (OpenLDAP is similar).

The LDAP endpoint must be resolvable and reachable by the Amazon Elastic Compute Cloud (Amazon EC2) EMR cluster instances over TCP on port 636. We suggest running the test from an Amazon Linux 2 EC2 instance that belongs to the same subnet as the EMR cluster and has the same EMR security group associated as the EMR cluster instances.

After you launch an EC2 instance, install the nc tool and test the DNS resolution and connectivity. Assuming that DC1.awsemr.com is the DNS name for the LDAP endpoint, run the following commands:

sudo yum install nc
nc -vz DC1.awsemr.com 636

If the DNS resolution isn’t working properly, you should receive an error like the following:

Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Could not resolve hostname "DC1.awsemr.com": Name or service not known. QUITTING.

If the endpoint is not reachable, you should receive an error like the following:

Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection timed out.

In either of these cases, involve the networking and DNS team to troubleshoot and solve the issues.

In case of success, the output should look like the following:

Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 10.0.1.235:636.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.
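
When this check is part of a pre-launch script, the three outcomes above can be told apart by matching on the Ncat messages. The following is a rough sketch: classify_ncat_output is a hypothetical helper, the matched strings come from the sample outputs in this section, and "Connection refused" is an extra case added here as an assumption.

```shell
#!/bin/sh
# Map Ncat's verbose output to a short status word.
# Hypothetical helper; message strings are taken from the samples above.
classify_ncat_output() {
  case $1 in
    *"Could not resolve hostname"*)                  echo "dns-failure" ;;
    *"Connection timed out"*|*"Connection refused"*) echo "unreachable" ;;
    *"Connected to"*)                                echo "ok" ;;
    *)                                               echo "unknown" ;;
  esac
}

# On a real instance you would feed it live output (Ncat logs to stderr):
#   classify_ncat_output "$(nc -vz DC1.awsemr.com 636 2>&1)"
classify_ncat_output 'Ncat: Connected to 10.0.1.235:636.'
classify_ncat_output 'Ncat: Connection timed out.'
```

A "dns-failure" result points to the DNS configuration, whereas "unreachable" usually points to security groups, routing, or firewalls.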

If everything works, continue with the testing and install the OpenLDAP clients as follows:

sudo yum install openldap-clients

Then run ldapsearch commands to retrieve information about users and groups from the LDAP endpoint. The following are sample ldapsearch commands:

#Customize these 6 variables
LDAPS_CERTIFICATE=/path/to/ldaps_cert.pem
LDAPS_ENDPOINT=DC1.awsemr.com
BINDUSER="CN=binduser,CN=Users,DC=awsemr,DC=com"
BINDUSER_PASSWORD=binduserpassword
SEARCH_BASE=DC=awsemr,DC=com
USER_TO_SEARCH=john
#Quote the filter so that the shell does not interpret the parentheses
FILTER="(sAMAccountName=${USER_TO_SEARCH})"
INFO_TO_SEARCH="*"

#Search user
LDAPTLS_CACERT=${LDAPS_CERTIFICATE} ldapsearch -LLL -x -H ldaps://${LDAPS_ENDPOINT} -v -D "${BINDUSER}" -w "${BINDUSER_PASSWORD}" -b "${SEARCH_BASE}" "${FILTER}" "${INFO_TO_SEARCH}"

We use the following parameters:

  • -x – This enables simple authentication.
  • -D – This indicates the user (bind DN) that performs the search.
  • -w – This indicates the user password.
  • -H – This indicates the URL of the LDAP server.
  • -b – This is the search base.
  • LDAPTLS_CACERT – This indicates the LDAPS endpoint SSL PEM public certificate or the LDAPS endpoint root certificate authority SSL PEM public certificate. This can be obtained from an AD or OpenLDAP admin user.

The following is a sample output of the preceding command:

filter: (sAMAccountName=john)
requesting: *
dn: CN=john,OU=users,OU=italy,OU=emr,DC=awsemr,DC=com
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: user
cn: john
givenName: john
distinguishedName: CN=john,OU=users,OU=italy,OU=emr,DC=awsemr,DC=com
instanceType: 4
whenCreated: 20230804094021.0Z
whenChanged: 20230804094021.0Z
displayName: john
uSNCreated: 262459
memberOf: CN=data-engineers,OU=groups,OU=italy,OU=emr,DC=awsemr,DC=com
uSNChanged: 262466
name: john
objectGUID:: gTxn8qYvy0SVL+mYAAbb8Q==
userAccountControl: 66048
badPwdCount: 0
codePage: 0
countryCode: 0
badPasswordTime: 0
lastLogoff: 0
lastLogon: 0
pwdLastSet: 133356156212864439
primaryGroupID: 513
objectSid:: AQUAAAAAAAUVAAAAIKyNe7Dn3azp7Sh+rgQAAA==
accountExpires: 9223372036854775807
logonCount: 0
sAMAccountName: john
sAMAccountType: 805306368
userPrincipalName: john@awsemr.com
objectCategory: CN=Person,CN=Schema,CN=Configuration,DC=awsemr,DC=com
dSCorePropagationData: 20230804094021.0Z
dSCorePropagationData: 16010101000000.0Z

As we can see from the sample output, the user john is identified by the distinguished name CN=john,OU=users,OU=italy,OU=emr,DC=awsemr,DC=com, and the data-engineers group to which the user belongs (memberOf value) is identified by the distinguished name CN=data-engineers,OU=groups,OU=italy,OU=emr,DC=awsemr,DC=com.

We can now run our ldapsearch queries to retrieve the user and group information using a narrowed search base:

#Customize these 9 variables
LDAPS_CERTIFICATE=/path/to/ldaps_cert.pem
LDAPS_ENDPOINT=DC1.awsemr.com
BINDUSER="CN=binduser,CN=Users,DC=awsemr,DC=com"
BINDUSER_PASSWORD=binduserpassword
SEARCH_BASE=DC=awsemr,DC=com
USER_SEARCH_BASE=OU=users,OU=italy,OU=emr,DC=awsemr,DC=com
GROUPS_SEARCH_BASE=OU=groups,OU=italy,OU=emr,DC=awsemr,DC=com
USER_TO_SEARCH=john
GROUP_TO_SEARCH=data-engineers

#Search user
LDAPTLS_CACERT=${LDAPS_CERTIFICATE} ldapsearch -LLL -x -H ldaps://${LDAPS_ENDPOINT} -v -D "${BINDUSER}" -w "${BINDUSER_PASSWORD}" -b "${USER_SEARCH_BASE}" "(sAMAccountName=${USER_TO_SEARCH})" "*"

#Search group
LDAPTLS_CACERT=${LDAPS_CERTIFICATE} ldapsearch -LLL -x -H ldaps://${LDAPS_ENDPOINT} -v -D "${BINDUSER}" -w "${BINDUSER_PASSWORD}" -b "${GROUPS_SEARCH_BASE}" "(sAMAccountName=${GROUP_TO_SEARCH})" "*"

You can also apply other filters while searching. For more information about how to create LDAP filters, refer to LDAP Filters.

By running ldapsearch commands, you can test the LDAP connectivity and LDAP properties, and determine the needed setup.
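
When a filter value comes from user input or a script variable, the characters that are special in LDAP filters should be escaped first. RFC 4515 defines the escapes \5c, \2a, \28, and \29 for backslash, asterisk, and the two parentheses. The helper below is a minimal sketch with a hypothetical name (ldap_escape); it does not handle NUL bytes.

```shell
#!/bin/sh
# Escape a value for safe use inside an LDAP search filter (RFC 4515).
# Hypothetical helper; the backslash must be escaped first.
ldap_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\5c/g' \
                         -e 's/\*/\\2a/g' \
                         -e 's/(/\\28/g' \
                         -e 's/)/\\29/g'
}

USER_TO_SEARCH='jo*hn'
FILTER="(sAMAccountName=$(ldap_escape "$USER_TO_SEARCH"))"
printf '%s\n' "$FILTER"
```

With the escaping in place, an asterisk in the input is matched literally instead of acting as a wildcard.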

Test the solution

After you have verified that the connectivity to the LDAP endpoint is open and the LDAP configurations are correct, proceed with setting up the environment to launch an LDAP-enabled EMR cluster.

Create AWS Secrets Manager secrets

Before you create the EMR security configuration, you need to create two AWS Secrets Manager secrets. The cluster uses these secrets to interact with the LDAP endpoint and retrieve user details such as user name and group membership.

  1. On the Secrets Manager console, choose Secrets in the navigation pane.
  2. Choose Store a new secret.
  3. For Secret type, select Other type of secret.
  4. Create a new secret specifying the bind user distinguished name as the key and the bind user password as the value.
  5. Create a second secret specifying in plaintext the LDAPS endpoint SSL public certificate or the LDAPS root certificate authority public certificate.
    This certificate is trusted, allowing secure communication between the EMR cluster and the LDAPS endpoint.
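
If you prefer to script the first secret instead of using the console, the key/value pair corresponds to a one-entry JSON object. The following sketch only builds that JSON string; the helper names (json_escape, bind_secret_json) and the secret name in the comment are hypothetical, and the password shown is a placeholder.

```shell
#!/bin/sh
# Build the JSON body for the bind-user secret: {"<bind DN>":"<password>"}.
# Hypothetical helpers; only backslashes and double quotes are escaped.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

bind_secret_json() {
  printf '{"%s":"%s"}\n' "$(json_escape "$1")" "$(json_escape "$2")"
}

bind_secret_json "CN=binduser,CN=Users,DC=awsemr,DC=com" "binduserpassword"

# The result could then be passed to the AWS CLI, for example:
#   aws secretsmanager create-secret --name emr-ldap-bind-user \
#     --secret-string "$(bind_secret_json "$BINDUSER" "$BINDUSER_PASSWORD")"
```

Avoid putting the real password on the command line of a shared machine, because it can end up in the shell history.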

Create the EMR security configuration

Complete the following steps to create the EMR security configuration:

  1. On the Amazon EMR console, choose Security configurations under EMR on EC2 in the navigation pane.
  2. Choose Create.
  3. For Security configuration name, enter a name.
  4. For Security configuration setup options, select Choose custom settings.
  5. For Encryption, select Turn on in-transit encryption.
  6. For Certificate provider type, select PEM.
  7. For Choose PEM certificate location, enter either a PEM bundle located in Amazon Simple Storage Service (Amazon S3) or a Java custom certificate provider.
    Note that in-transit encryption is mandatory in order to use the LDAP authentication feature. For more information about in-transit encryption, refer to Providing certificates for encrypting data in transit with Amazon EMR encryption.
  8. Choose Next.
  9. Select LDAP for Authentication protocol.
  10. For LDAP server location, enter the LDAPS endpoint (ldaps://<ldap_endpoint_DNS_name>).
  11. For LDAP SSL certificate, enter the second secret you created in Secrets Manager.
  12. For LDAP access filter, enter an LDAP filter that restricts access to a subset of the users retrieved from the LDAP user search base. If the field is left empty, no filters are applied and all users belonging to the LDAP user search base can access the EMR LDAP-protected endpoints with their corporate credentials. The following are example filters and their functions:
    • (objectClass=person) – Filter users with the attribute objectClass set as person
    • (memberOf=CN=admins,OU=groups,OU=italy,OU=emr,DC=awsemr,DC=com) – Filter users belonging to the admins group
    • (|(memberof=CN=data-engineers,OU=groups,OU=italy,OU=emr,DC=awsemr,DC=com)(memberof=CN=admins,OU=groups,OU=italy,OU=emr,DC=awsemr,DC=com)) – Filter users belonging to either the data-engineers or the admins group (which we use for this post)
  13. Enter values for LDAP user search base and LDAP group search base. Note that the two search bases don’t support inline filters (for example, the following is not supported: OU=users,OU=italy,OU=emr,DC=awsemr,DC=com?subtree?(|(memberof=CN=data-engineers,OU=groups,OU=italy,OU=emr,DC=awsemr,DC=com)(memberof=CN=admins,OU=groups,OU=italy,OU=emr,DC=awsemr,DC=com))).
  14. Select Turn on SSH login. This is needed only if you want your LDAP users to be able to SSH into cluster instances with their corporate credentials. If SSH login is enabled, the LDAP access filter is required; otherwise, SSH authentication will fail.
  15. For LDAP server bind credentials, enter the first secret you created in Secrets Manager.
  16. In the Authorization section, keep the defaults selected:
    • For IAM role for applications, select Instance profile.
    • For Fine-grained access control method, select None.
  17. Choose Next.
  18. Review the configuration summary and choose Create.
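
The group-based access filters shown for the LDAP access filter field follow a regular pattern, so they can be generated from a list of group distinguished names. The following is a small sketch with a hypothetical function name (build_group_filter); it assumes the DNs contain no characters that need filter escaping.

```shell
#!/bin/sh
# Build an LDAP access filter that admits members of any of the given groups.
# Hypothetical helper; one DN yields a plain memberOf clause, several yield (|...).
build_group_filter() {
  [ "$#" -gt 0 ] || return 1
  if [ "$#" -eq 1 ]; then
    printf '(memberOf=%s)\n' "$1"
    return 0
  fi
  filter='(|'
  for dn in "$@"; do
    filter="${filter}(memberOf=${dn})"
  done
  printf '%s)\n' "$filter"
}

build_group_filter \
  "CN=data-engineers,OU=groups,OU=italy,OU=emr,DC=awsemr,DC=com" \
  "CN=admins,OU=groups,OU=italy,OU=emr,DC=awsemr,DC=com"
```

You can paste the generated string into the LDAP access filter field, and test it beforehand by passing it as the filter argument of ldapsearch.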

Launch the EMR cluster

You can launch the EMR cluster using the AWS Management Console, the AWS Command Line Interface (AWS CLI), or any AWS SDK.

When you’re creating the EMR on EC2 cluster, make sure to specify the following configurations:

  • EMR version – Use Amazon EMR 6.12.0 or later.
  • Applications – Select Hadoop, Spark, Hive, Hue, Livy, and Presto/Trino.
  • Security configuration – Specify the security configuration you created in the previous step.
  • EC2 key pair – Use an existing key pair.
  • Network and security groups – Use a configuration that allows the EMR EC2 instances to interact with the LDAPS endpoint. You should already have confirmed a valid setup while finding the correct LDAP parameters earlier in this post.
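
In launch automation, the minimum-version requirement can be checked before the cluster is created. The sketch below compares a release label against 6.12.0; emr_supports_ldap is a hypothetical helper, and it assumes labels of the form emr-MAJOR.MINOR.PATCH with numeric parts.

```shell
#!/bin/sh
# Return success if the release label is Amazon EMR 6.12.0 or later.
# Hypothetical helper; expects labels such as emr-6.12.0.
emr_supports_ldap() {
  version=${1#emr-}        # emr-6.12.0 -> 6.12.0
  major=${version%%.*}     # 6
  rest=${version#*.}       # 12.0
  minor=${rest%%.*}        # 12
  if [ "$major" -gt 6 ]; then
    return 0
  fi
  [ "$major" -eq 6 ] && [ "$minor" -ge 12 ]
}

for label in emr-6.11.1 emr-6.12.0 emr-7.0.0; do
  if emr_supports_ldap "$label"; then
    echo "$label: LDAP authentication available"
  else
    echo "$label: release too old for LDAP authentication"
  fi
done
```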

Confirm the LDAP authentication is working

When the cluster is up and running, you can confirm that the LDAP authentication is working properly.

If SSH login was enabled as part of LDAP authentication inside the EMR security configuration, you can SSH into your cluster by specifying an LDAP user and providing the related password when prompted:

ssh myldapuser@<emr_primary_node>

If SSH login was disabled, you can SSH into the cluster by using the EC2 key pair specified during cluster creation:

ssh -i mykeypair.pem ec2-user@<emr_primary_node>

An alternative way to access the primary instance, if you prefer, is to use Session Manager, a capability of AWS Systems Manager. For more information, refer to Connect to your Linux instance with AWS Systems Manager Session Manager.

When you’re inside the primary instance, you can test that the LDAP users and groups are properly retrieved by using the id command. The following is a sample command to check whether the user john is properly retrieved with the related groups:

[ec2-user@ip-10-0-2-237 ~]# id john
uid=941601122(john) gid=941600513(users-group) groups=941600513(users-group),941601123(data-engineers)
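
The same check can be scripted by looking for an expected group in the output of id -Gn, which prints the group names separated by spaces. This is a rough sketch: has_group is a hypothetical helper, and it assumes that group names contain no spaces.

```shell
#!/bin/sh
# Succeed if the space-separated group list in $1 contains the group $2.
# Hypothetical helper; breaks on group names that contain spaces.
has_group() {
  for g in $1; do
    [ "$g" = "$2" ] && return 0
  done
  return 1
}

# On the primary instance you would pass live output: has_group "$(id -Gn john)" data-engineers
groups_of_john="users-group data-engineers"
if has_group "$groups_of_john" data-engineers; then
  echo "john is in data-engineers"
else
  echo "john is NOT in data-engineers"
fi
```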

You can then test authentication on the different installed frameworks.

First, let’s retrieve the frameworks’ public certificate and store it inside a truststore. All the frameworks share the same public certificate (the one we used to set up in-transit encryption), so you can use any of the SSL-protected endpoints (Hive port 10000, Presto/Trino port 8446, Livy port 8998) to retrieve it. Take the certificate from the HiveServer2 endpoint (port 10000):

#Export the HiveServer2 public SSL certificate to a PEM file
openssl s_client -showcerts -connect $(hostname -f):10000 </dev/null 2>/dev/null|openssl x509 -outform PEM > certificates.pem

#Import the PEM certificate into a truststore
echo "yes" | keytool -import -alias hive_cert -file certificates.pem -storetype JKS -keystore truststore.jks -storepass myStrongPassword

Then use this truststore to securely communicate with the different frameworks.

Use the following code to test HiveServer2 authentication with beeline:

#Use the truststore to connect to HiveServer2
beeline -u "jdbc:hive2://$(hostname -f):10000/default;ssl=true;sslTrustStore=truststore.jks;trustStorePassword=myStrongPassword" -n john -p johnPassword
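
When the same connection string is needed for several hosts or tools, it can help to assemble it in one place. The following sketch builds the HiveServer2 JDBC URL used by the beeline command above; hive_jdbc_url is a hypothetical helper, and the host name in the example is a placeholder.

```shell
#!/bin/sh
# Assemble the SSL-enabled HiveServer2 JDBC URL.
# Hypothetical helper. Arguments: host, truststore path, truststore password, optional port.
hive_jdbc_url() {
  printf 'jdbc:hive2://%s:%s/default;ssl=true;sslTrustStore=%s;trustStorePassword=%s\n' \
    "$1" "${4:-10000}" "$2" "$3"
}

hive_jdbc_url "ip-10-0-2-237.ec2.internal" "truststore.jks" "myStrongPassword"

# Example use: beeline -u "$(hive_jdbc_url "$(hostname -f)" truststore.jks myStrongPassword)" -n john
```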

If using Presto, test Presto authentication with the Presto CLI (provide the user password when prompted):

#Use the truststore to connect to the Presto coordinator
presto-cli \
--user john \
--password \
--catalog hive \
--server https://$(hostname -f):8446 \
--truststore-path truststore.jks \
--truststore-password myStrongPassword

If using Trino, test Trino authentication with the Trino CLI (provide the user password when prompted):

#Use the truststore to connect to the Trino coordinator
trino-cli \
--user john \
--password \
--catalog hive \
--server https://$(hostname -f):8446 \
--truststore-path truststore.jks \
--truststore-password myStrongPassword

Test Livy authentication with curl:

#Trust the PEM certificate to connect to the Livy server

#Start session
curl --cacert certificates.pem -X POST \
-u "john:johnPassword" \
--data '{"kind": "spark"}' \
-H "Content-Type: application/json" \
https://$(hostname -f):8998/sessions \
-c cookies.txt

#Example of output
#{"id":0,"name":null,"appId":null,"owner":"john","proxyUser":"john","state":"starting","kind":"spark","appInfo":{"driverLogUrl":null,"sparkUiUrl":null},"log":["stdout: ","\nstderr: ","\nYARN Diagnostics: "]}
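
To confirm in a script that the session was created for the expected user, you can pull single string fields out of the Livy response. The sketch below uses sed for illustration only, with a hypothetical helper name (livy_field); it works for simple string values such as owner and state, and a real JSON parser such as jq is more robust.

```shell
#!/bin/sh
# Extract the value of a string field from a one-line JSON response.
# Hypothetical helper; handles only plain "key":"value" pairs.
livy_field() {
  printf '%s\n' "$1" | sed -n 's/.*"'"$2"'":"\([^"]*\)".*/\1/p'
}

# Abbreviated copy of the sample response shown above.
response='{"id":0,"name":null,"appId":null,"owner":"john","proxyUser":"john","state":"starting","kind":"spark"}'

echo "owner: $(livy_field "$response" owner)"
echo "state: $(livy_field "$response" state)"
```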

Test Spark commands with pyspark:

#SSH into the primary instance with the specific user
ssh john@<emr-primary-node>
#Or impersonate the user
sudo su - john

#Create a keytab and obtain a Kerberos ticket by running the ldap-kinit tool
$ ldap-kinit
Username: john
Password: 

#Output
{"message":"ok","contents":{"username":"john","expirationTime":"2023-09-14T15:24:06.303Z[UTC]"}}

#Check that the Kerberos ticket has been created
$ klist

#Test the Spark CLIs
$ pyspark

>>> spark.sql("show databases").show()
>>> quit()

Note that here we tested the authentication from within the cluster, but we can interact with Trino, Hive, Presto, and Livy from outside the cluster as long as connectivity and DNS resolution are properly configured. The Spark CLIs are the only ones that can be used only from inside the cluster.

To test Hue authentication, complete the following steps:

  1. Navigate to the Hue web UI hosted on http://<emr_primary_node>:8888/ and provide an LDAP user name and password.
  2. Test SQL queries inside the Hive and Trino/Presto editors.

To test with an external SQL tool (such as DBeaver connecting to Trino), complete the following steps. Make sure to configure the EMR primary node security group so that it allows TCP traffic from the DBeaver IP to the desired framework endpoint port (for example, 10000 for HiveServer2, 8446 for Trino/Presto), and to configure DNS resolution on the DBeaver client machine so that it resolves the EMR primary node hostname.

  1. From your EMR cluster primary instance, copy to an S3 bucket the files truststore.jks (previously created) and /usr/lib/trino/trino-jdbc/trino-jdbc-XXX-amzn-0.jar (change the version XXX depending on the EMR version).
  2. Download the truststore.jks and trino-jdbc-XXX-amzn-0.jar files to your DBeaver client machine.
  3. Open DBeaver and choose Database, then choose Driver Manager.
  4. Choose New to create a new driver.
  5. On the Settings tab, provide the following information:
    • For Driver Name, enter EMR Trino.
    • For Class Name, enter io.trino.jdbc.TrinoDriver.
    • For URL Template, enter jdbc:trino://{host}:{port}.
  6. On the Libraries tab, complete the following steps:
    • Choose Add File.
    • Choose the Trino JDBC driver JAR file from the local file system (trino-jdbc-XXX-amzn-0.jar).
  7. Choose OK to create the driver.
  8. Choose Database and New Database Connection.
  9. On the Main tab, specify the following:
    • For Connect by, select Host.
    • For Host, enter the EMR primary node.
    • For Port, enter the Trino port (8446 by default).
  10. On the Driver properties tab, add the following properties:
    • Add SSL with True as the value.
    • Add SSLTrustStorePath with the truststore.jks file location as the value.
    • Add SSLTrustStorePassword with the password that you used when creating truststore.jks as the value.
  11. Choose Finish.
  12. Choose the created connection and choose the Connect icon.
  13. Enter your LDAP user name and password, then choose OK.

If everything is working, you should be able to browse the Trino catalogs, databases, and tables in the navigation pane. To run queries, choose SQL Editor, then choose Open SQL Editor.

From the SQL Editor, you can query your tables.

Next steps

The new Amazon EMR LDAP authentication feature simplifies the way users can gain access to the frameworks installed on EMR. When users are using a framework, you might want to govern the data they can access. For this specific topic, you can use LDAP authentication together with the native EMR Apache Ranger integration. For more information, refer to Integrate Amazon EMR with Apache Ranger.

Clean up

Complete the following cleanup actions to remove the resources you created following this post and avoid incurring additional costs. For this post, we clean up using the AWS CLI. You can also clean up using similar actions through the console.

  1. If you launched an EC2 instance to check the LDAP connectivity and don’t need it anymore, terminate it with the following command (specify your instance ID):
    aws ec2 terminate-instances \
    --instance-ids i-XXXXXXXX \
    --region <your-aws-region>

  2. If you launched an EC2 instance to test DBeaver and don’t need it anymore, you can use the preceding command to terminate it.
  3. Terminate the EMR cluster with the following command (specify your EMR cluster ID):
    aws emr terminate-clusters \
    --cluster-ids j-XXXXXXXXXXXXX \
    --region <your-aws-region>

    Note that if the EMR cluster has termination protection enabled, you have to disable it before you run the preceding terminate-clusters command. You can do so with the following command (specify your EMR cluster ID):

    aws emr modify-cluster-attributes \
    --cluster-ids j-XXXXXXXXXXXXX \
    --no-termination-protected \
    --region <your-aws-region>

  4. Delete the EMR security configuration with the following command:
    aws emr delete-security-configuration \
    --name <your-security-configuration> \
    --region <your-aws-region>

  5. Delete the Secrets Manager secrets with the following commands:
    aws secretsmanager delete-secret \
    --secret-id <first-secret-name> \
    --force-delete-without-recovery \
    --region <your-aws-region>

    aws secretsmanager delete-secret \
    --secret-id <second-secret-name> \
    --force-delete-without-recovery \
    --region <your-aws-region>

Conclusion

In this post, we discussed how you can configure and test LDAP authentication on EMR on EC2 clusters. We discussed how to retrieve the needed LDAP settings, test connectivity with the LDAP endpoint, configure your EMR security configuration, and test that the LDAP authentication is working properly. This post also highlighted how the authentication flow is simplified compared to the standard Active Directory cross-realm trust configuration. To learn more about this feature, refer to Use Active Directory or LDAP servers for authentication with Amazon EMR.


About the Authors

Stefano Sandona is a Senior Big Data Solution Architect at AWS. He loves data, distributed systems, and security. He helps customers around the world architect secure, scalable, and reliable big data platforms.

Adnan Hemani is a Software Development Engineer at AWS working with the EMR team. He focuses on the security posture of applications running on EMR clusters. He’s interested in modern big data applications and how customers interact with them.
