Transfer data from provisioned domains to Serverless
Set up Amazon OpenSearch Ingestion
To get started, you should have an active OpenSearch Service domain (source) and an OpenSearch Serverless collection (sink). Complete the following steps to set up an OpenSearch Ingestion pipeline for migration:
- On the OpenSearch Service console, choose Pipelines under Ingestion in the navigation pane.
- Choose Create a pipeline.
- For Pipeline name, enter a name (for example, octank-migration).
- For Pipeline capacity, you can define the minimum and maximum capacity to scale the resources. For now, you can leave the default minimum of 1 and maximum of 4.
- For Configuration Blueprint, select AWS-OpenSearchDataMigrationPipeline.
- Update the following information for the source (a filled-in example configuration follows these steps):
- Uncomment hosts and specify the endpoint of the existing OpenSearch Service domain.
- Uncomment distribution_version if your source cluster is an OpenSearch Service cluster with compatibility mode enabled; otherwise, leave it commented out.
- Uncomment indices, include, and index_name_regex, and add an index name or pattern that you want to migrate (for example, octank-iot-logs-2023.11.0*).
- Update region under aws to the Region of your source domain (for example, us-west-2).
- Update sts_role_arn under aws to the role that has permission to read data from the OpenSearch Service domain (for example, arn:aws:iam::111122223333:role/osis-pipeline). This role should be added as a backend role within the OpenSearch Service security roles.
- Update the following information for the sink:
- Uncomment hosts and specify the endpoint of the existing OpenSearch Serverless collection.
- Update sts_role_arn under aws to the role that has permission to write data into the OpenSearch Serverless collection (for example, arn:aws:iam::111122223333:role/osis-pipeline). This role should be added as part of the data access policy in the OpenSearch Serverless collection (see the example policy after these steps).
- Update the serverless flag to true.
- For index, you can leave the default, which gets the index name from the source metadata and writes to an index of the same name in the destination. Alternatively, if you want a different index name in the destination, modify this value to your desired name.
- For document_id, you can get the ID from the document metadata in the source and use the same ID in the target. Note that custom document IDs are supported only for the SEARCH type of collection; if your collection is of the TIMESERIES or VECTORSEARCH type, you should comment out this line.
- Next, you can validate your pipeline to check the connectivity of the source and sink and confirm that the endpoints exist and are accessible.
- For Network settings, choose your preferred setting:
- Choose VPC access and select your VPC, subnet, and security group to set up the access privately.
- Choose Public to use public access. AWS recommends that you use a VPC endpoint for all production workloads, but for this walkthrough, select Public.
- For Log Publishing Option, you can either create a new Amazon CloudWatch log group or use an existing CloudWatch log group to write the ingestion logs. This provides access to information about errors and warnings raised during the operation, which can help during troubleshooting. For this walkthrough, choose Create new group.
- Choose Next, and verify the details you specified for your pipeline settings.
- Choose Create pipeline.
It should take a couple of minutes to create the ingestion pipeline.
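For reference, a completed blueprint based on the preceding steps might look like the following sketch. The domain and collection endpoints, index pattern, and account ID are placeholders, and your copy of the AWS-OpenSearchDataMigrationPipeline blueprint may differ slightly in option names and defaults, so treat this as illustrative rather than a drop-in configuration.

version: "2"
octank-migration:
  source:
    opensearch:
      # Placeholder endpoint of the source OpenSearch Service domain
      hosts: [ "https://search-source-domain.us-west-2.es.amazonaws.com" ]
      # Uncomment only if the source domain runs with compatibility mode enabled
      # distribution_version: "es7"
      indices:
        include:
          # Index name or pattern to migrate
          - index_name_regex: "octank-iot-logs-2023.11.0*"
      aws:
        region: "us-west-2"
        # Role with permission to read from the source domain (mapped as a backend role)
        sts_role_arn: "arn:aws:iam::111122223333:role/osis-pipeline"
  sink:
    - opensearch:
        # Placeholder endpoint of the target OpenSearch Serverless collection
        hosts: [ "https://abcdefghij1234567890.us-west-2.aoss.amazonaws.com" ]
        aws:
          region: "us-west-2"
          # Role with permission to write to the collection (listed in its data access policy)
          sts_role_arn: "arn:aws:iam::111122223333:role/osis-pipeline"
          serverless: true
        # Reuse the source index name and document ID; comment out document_id
        # if the collection type is TIMESERIES or VECTORSEARCH
        index: "${getMetadata(\"opensearch-index\")}"
        document_id: "${getMetadata(\"opensearch-document_id\")}"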
The following graphic provides a quick demonstration of creating the OpenSearch Ingestion pipeline via the preceding steps.
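Before the pipeline can write to the collection, the sink role also has to be listed as a principal in the collection's data access policy, as noted in the sink steps. The following is a minimal sketch of such a policy; the collection name, account ID, and the exact set of index permissions you grant are placeholders to adapt to your setup.

[
  {
    "Rules": [
      {
        "ResourceType": "index",
        "Resource": ["index/octank-migration-collection/*"],
        "Permission": [
          "aoss:CreateIndex",
          "aoss:WriteDocument",
          "aoss:UpdateIndex"
        ]
      }
    ],
    "Principal": ["arn:aws:iam::111122223333:role/osis-pipeline"]
  }
]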
Verify ingested data in the target OpenSearch Serverless collection
After the pipeline is created and active, log in to OpenSearch Dashboards for your OpenSearch Serverless collection and run the following command to list the indexes:
GET _cat/indices?v
The following graphic provides a quick demonstration of listing the indexes before and after the pipeline becomes active.
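To narrow the listing to just the migrated indexes, you can pass the same index pattern that you configured in the pipeline to the _cat API; the pattern below mirrors the earlier example and is a placeholder for your own:

GET _cat/indices/octank-iot-logs-2023.11.0*?v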
Conclusion
In this post, we saw how OpenSearch Ingestion can ingest data into an OpenSearch Serverless collection without the need for third-party solutions. With minimal data producer configuration, it automatically ingested data into the collection. OSI also allows you to transform or reindex data from an ES 7.x version before ingesting it into an OpenSearch Service domain or OpenSearch Serverless collection. OSI eliminates the need to provision, scale, or manage servers. AWS offers various resources for you to quickly start building pipelines using OpenSearch Ingestion. You can use various built-in pipeline integrations to quickly ingest data from Amazon DynamoDB, Amazon Managed Streaming for Apache Kafka (Amazon MSK), Amazon Security Lake, Fluent Bit, and many more. The following OpenSearch Ingestion blueprints enable you to build data pipelines with minimal configuration changes.
About the Authors
Muthu Pitchaimani is a Search Specialist with Amazon OpenSearch Service. He builds large-scale search applications and solutions. Muthu is interested in the topics of networking and security, and is based out of Austin, Texas.
Prashant Agrawal is a Sr. Search Specialist Solutions Architect with Amazon OpenSearch Service. He works closely with customers to help them migrate their workloads to the cloud and helps existing customers fine-tune their clusters to achieve better performance and save on cost. Before joining AWS, he helped various customers use OpenSearch and Elasticsearch for their search and log analytics use cases. When not working, you can find him traveling and exploring new places. In short, he likes doing Eat → Travel → Repeat.
Rahul Sharma is a Technical Account Manager at Amazon Web Services. He is passionate about the data technologies that help leverage data as a strategic asset, and is based out of New York City, New York.