Data migration is an inevitable part of website hosting. Conventionally, migrations are done manually across servers, which is tedious and error-prone. AWS offers a service called AWS DataSync that can transfer data between S3 buckets, including buckets in different AWS Regions or accounts. With DataSync we can run a one-time data transfer or schedule the migration to run whenever needed.
We will walk through some examples of data transfer below to get a proper understanding of AWS DataSync.
Transferring data within the same Amazon S3 account
Open the DataSync page in the AWS console and create a task using the ‘Tasks’ option in the left menu. Then choose ‘Create a new location’ and select ‘Amazon S3’ as the location type for the source.
After that, select the AWS Region, the S3 bucket, the storage class, and the folder from which the data will be transferred. For the IAM role, click the ‘Autogenerate’ button.
The same steps are followed for the destination location options. Choose the destination accordingly, with the location type as ‘Amazon S3’.
After that, we need to give the task a name.
After a final review of the configuration entered earlier, start the task by clicking the ‘Create task’ button.
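The same-account setup above can also be sketched from the AWS CLI. This is a hedged outline, not part of the console walkthrough: the bucket names, account ID, role name, and task name below are placeholders you would replace with your own values.

```shell
# Create the source and destination S3 locations.
# SOURCEBUCKET, DESTBUCKET, ACCOUNT-ID, and DATASYNC-ROLE are placeholders.
SRC_LOC=$(aws datasync create-location-s3 \
  --s3-bucket-arn arn:aws:s3:::SOURCEBUCKET \
  --s3-config '{"BucketAccessRoleArn":"arn:aws:iam::ACCOUNT-ID:role/DATASYNC-ROLE"}' \
  --query LocationArn --output text)

DST_LOC=$(aws datasync create-location-s3 \
  --s3-bucket-arn arn:aws:s3:::DESTBUCKET \
  --s3-config '{"BucketAccessRoleArn":"arn:aws:iam::ACCOUNT-ID:role/DATASYNC-ROLE"}' \
  --query LocationArn --output text)

# Create the task and start a one-time execution.
TASK=$(aws datasync create-task \
  --source-location-arn "$SRC_LOC" \
  --destination-location-arn "$DST_LOC" \
  --name my-s3-copy-task \
  --query TaskArn --output text)

aws datasync start-task-execution --task-arn "$TASK"
```

These commands need valid AWS credentials and an IAM role that DataSync can assume, so run them only after the role from the console steps (or an equivalent) exists.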
Transferring data across accounts
First, create the necessary IAM role in the destination AWS account; this role is used to access the source S3 bucket. Then attach the following policy to the created IAM role, granting it access to the source bucket:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::SOURCEBUCKET"
    },
    {
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:ListMultipartUploadParts",
        "s3:PutObjectTagging",
        "s3:GetObjectTagging",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::SOURCEBUCKET/*"
    }
  ]
}
After that, we need to add the following trust relationship to the IAM role so that DataSync can assume it:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "datasync.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
From the AWS command line, we need to create an S3 location for the source bucket in DataSync. Take note of the Amazon Resource Names (ARNs) used while creating the DataSync location, as they are referenced in the following S3 bucket policy on the source bucket.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BucketPolicyForDataSync",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::DEST-ACCOUNT-ID:role/DEST-ACCOUNT-ROLE",
          "arn:aws:iam::DEST-ACCOUNT-ID:role/DEST-ACCOUNT-USER"
        ]
      },
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:AbortMultipartUpload",
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:ListMultipartUploadParts",
        "s3:PutObject",
        "s3:GetObjectTagging",
        "s3:PutObjectTagging"
      ],
      "Resource": [
        "arn:aws:s3:::SOURCEBUCKET",
        "arn:aws:s3:::SOURCEBUCKET/*"
      ]
    }
  ]
}
Now, using the IAM role specified in the source S3 bucket policy, open the command line and run the following to confirm the identity you are operating as:
aws sts get-caller-identity
aws datasync create-location-s3 --s3-bucket-arn arn:aws:s3:::SOURCEBUCKET --s3-config '{"BucketAccessRoleArn":"arn:aws:iam::DEST-ACCOUNT-ID:role/DEST-ACCOUNT-ROLE"}'
This will give output similar to the following, which means the DataSync location for the source S3 bucket has been created.
{
  "LocationArn": "arn:aws:datasync:Region:DEST-ACCOUNT-ID:location/loc-xxxxxxxxxxxxxx"
}
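If needed, the newly created location can be verified from the CLI as well. This is an optional check; the location ARN below is the placeholder from the output above, not a real value.

```shell
# Show the configuration of the S3 location just created (placeholder ARN).
aws datasync describe-location-s3 \
  --location-arn arn:aws:datasync:Region:DEST-ACCOUNT-ID:location/loc-xxxxxxxxxxxxxx
```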
Except for the location choice, the other steps are the same as in the same-account data transfer above. Since we have already created the source location from the CLI, choose the existing source location instead of creating a new one.
After configuring the source and destination, click ‘Create task’ to start the data transfer.
If you require help, contact SupportPRO Server Admin