OpenSearch Searchable Snapshots
Overview
Searchable snapshots, available from OpenSearch 2.7.0, let you search indices held in remote snapshot storage without first restoring them to local storage. Because no full restore is needed before searching, this saves time and money when managing snapshots. All of this functionality can be quickly and easily configured on the Instaclustr Managed Platform.
Pre-requisites
- Searchable snapshots are currently supported only for Run-In-Your-Own-Account (RIYOA) customers. Support for Run-In-Instaclustr-Account (RIIA) will be added in coming releases. If you are unsure whether you have a RIYOA or a RIIA account, contact our support team.
- To use searchable snapshots, you need a new storage bucket with no data expiration policy applied to it. This “without auto-expiry” bucket is required so that snapshot data can be kept for longer periods. To set up a without auto-expiry bucket and connect it to your cluster, please contact our support team.
Provision clusters with searchable snapshots enabled
Once you select an eligible OpenSearch version (2.7.0 or later), the console offers an option to provision the cluster with searchable snapshots enabled. To use this feature, you need a cloud provider account that you manage yourself, plus backup storage with no automatic expiry policies (such as lifecycle management policies) applied. Contact Instaclustr Support if you would like help setting this up.
Clusters with searchable snapshots can also be provisioned through the API or Terraform V1. See https://instaclustr.redoc.ly/#operation/extendedProvisionRequestHandler for details of the API request.
Create a secondary snapshot repository to store your searchable snapshots
Searchable snapshots require a secondary snapshot repository, so that you have a separate, dedicated place to store, search and organise them. You can create this secondary repository using the OpenSearch Dashboards Index State Management plugin or using curl commands.
Example:

    PUT /_snapshot/secondary-repository
    {
      "type": "s3",
      "settings": {
        "bucket": "<bucket-without-automatic-expiry>",
        "base_path": "<clusterID>/<custom-path>"
      }
    }
For more information about setting up snapshot repositories, see your RIYOA setup guide or contact our support team.
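The repository request above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the `BASE_URL` value and the `build_repository_request` helper are hypothetical, authentication is omitted, and the request is only built, not sent.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with your own cluster address.
BASE_URL = "https://localhost:9200"


def build_repository_request(repo_name, bucket, base_path):
    """Build (but do not send) the PUT request that registers an S3 repository."""
    body = {
        "type": "s3",
        "settings": {"bucket": bucket, "base_path": base_path},
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/_snapshot/{repo_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholders are kept as in the example request; fill in real values before use.
request = build_repository_request(
    "secondary-repository",
    "<bucket-without-automatic-expiry>",
    "<clusterID>/<custom-path>",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To actually send it, you would pass `request` to `urllib.request.urlopen` after adding whatever credentials and TLS settings your cluster requires.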
Take a snapshot of the index
Now you can take a snapshot of the index and store it in the secondary repository created in the previous step, ready to be used as a searchable snapshot. There is no difference between a searchable snapshot and a regular OpenSearch snapshot, so you can take the snapshot in the normal way, or you can use a snapshot created by a snapshot management policy.
Example:

    PUT /_snapshot/secondary-repository/test-snapshot
    {
      "indices": "test_index",
      "ignore_unavailable": true,
      "include_global_state": false,
      "partial": false
    }
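If you take snapshots from a script, the same request can be built from a list of index names. This is a small sketch under the assumption that you send the result with your own HTTP client; `build_snapshot_request` is a hypothetical helper, not part of any OpenSearch library.

```python
import json


def build_snapshot_request(repository, snapshot, indices):
    """Return the method, path and JSON body for a snapshot of the given indices."""
    path = f"/_snapshot/{repository}/{snapshot}"
    body = {
        # OpenSearch accepts a comma-separated list of index names here.
        "indices": ",".join(indices),
        "ignore_unavailable": True,
        "include_global_state": False,
        "partial": False,
    }
    return "PUT", path, body


method, path, body = build_snapshot_request(
    "secondary-repository", "test-snapshot", ["test_index"]
)
print(method, path)
print(json.dumps(body, indent=2))
```

Serialising the body with `json.dumps` turns the Python booleans into the lowercase `true` and `false` that the REST API expects.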
Restore and search the snapshot
Once the snapshot has been created and stored in the secondary repository, you can restore it as a searchable snapshot so that it can be searched like any other index. Note that although this is a restore operation, it is not a full restore: OpenSearch only fetches the data it needs rather than the entire index, so the restored index takes up far less local disk than a regular restore would.
- Restore the snapshot as a searchable snapshot:

  Example:

      POST /_snapshot/secondary-repository/test-snapshot/_restore
      {
        "indices": "test_index",
        "storage_type": "remote_snapshot",
        "rename_pattern": "(.+)",
        "rename_replacement": "restored_$1"
      }

  Notice that the “storage_type” field specifies that the index will be restored as a “remote_snapshot” (a searchable snapshot), not a regular one. The “rename_pattern” and “rename_replacement” fields rename the restored index, so it does not clash with an index of the same name that already exists in the cluster.

- Search the searchable snapshot:

  Example:

      GET /restored_test_index/_search
      {
        <Enter search query here>
      }

  Notice that once it has been restored, you can search a searchable snapshot as you would any regular index in the cluster.
Searchable snapshots consume little disk space in your cluster because the snapshot data resides in remote storage.
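To see how the rename fields determine the name you later search against, the sketch below builds the restore body and previews the resulting index name. It is an illustration only: `build_restore_body` and `preview_restored_name` are hypothetical helpers, the preview approximates OpenSearch's Java-style `$1` group references with Python's `re` module, and the `match_all` query is just a stand-in for your own search query.

```python
import re


def build_restore_body(indices, prefix="restored_"):
    """Body for POST /_snapshot/<repository>/<snapshot>/_restore as a searchable snapshot."""
    return {
        "indices": ",".join(indices),
        "storage_type": "remote_snapshot",  # restore as a searchable snapshot
        "rename_pattern": "(.+)",
        "rename_replacement": f"{prefix}$1",
    }


def preview_restored_name(index_name, rename_pattern, rename_replacement):
    """Approximate the restored index name locally (no cluster call is made)."""
    # Convert Java-style "$1" group references to Python's "\g<1>" form.
    python_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", rename_replacement)
    return re.sub(rename_pattern, python_replacement, index_name, count=1)


restore_body = build_restore_body(["test_index"])
restored_name = preview_restored_name(
    "test_index",
    restore_body["rename_pattern"],
    restore_body["rename_replacement"],
)

# The restored index is then searched like any other index.
search_path = f"/{restored_name}/_search"
search_body = {"query": {"match_all": {}}}  # placeholder query
print(search_path)
```

Checking the computed name before restoring is a cheap way to confirm the rename will not collide with an index already present in the cluster.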
Configure snapshots using ISM
In practice you would not normally take and store searchable snapshots manually as in the examples above. You would most likely automate snapshots using the Index State Management (ISM) plugin. You can set up an ISM policy to take snapshots automatically at the desired interval (e.g. daily) and store them in your secondary snapshot repository. For more on setting up an ISM policy, see the Index State Management documentation or contact our support team.
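As a rough illustration of what such a policy might look like, the sketch below builds an ISM policy document that snapshots an index into the secondary repository once the index is a day old. The state names, policy ID, index pattern and the `build_snapshot_policy` helper are all assumptions made for this example. Note also that an ISM snapshot action runs once per index when its condition is met, so this suits time-based indices that roll over daily rather than a recurring snapshot of one long-lived index.

```python
import json


def build_snapshot_policy(repository, snapshot_name, index_pattern, min_index_age="1d"):
    """Build an ISM policy body that snapshots matching indices after a given age."""
    return {
        "policy": {
            "description": "Snapshot indices into the secondary repository",
            "default_state": "active",
            "states": [
                {
                    # Initial state: wait until the index reaches the given age.
                    "name": "active",
                    "actions": [],
                    "transitions": [
                        {
                            "state_name": "snapshotted",
                            "conditions": {"min_index_age": min_index_age},
                        }
                    ],
                },
                {
                    # Take the snapshot into the secondary repository.
                    "name": "snapshotted",
                    "actions": [
                        {"snapshot": {"repository": repository, "snapshot": snapshot_name}}
                    ],
                    "transitions": [],
                },
            ],
            # Attach the policy automatically to newly created matching indices.
            "ism_template": {"index_patterns": [index_pattern], "priority": 100},
        }
    }


policy_id = "daily-searchable-snapshot"  # hypothetical policy name
policy_path = f"/_plugins/_ism/policies/{policy_id}"
policy = build_snapshot_policy("secondary-repository", "daily-snapshot", "test_index*")
print("PUT", policy_path)
print(json.dumps(policy, indent=2))
```

The printed body is intended to be sent as a `PUT` to the printed path; review it against the ISM documentation for your OpenSearch version before applying it to a production cluster.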