
MySQL to PostgreSQL: A Hands-On AWS DMS Migration Guide with AWS RDS Aurora

7 min read

Database migration

Databases are a core part of almost every application, yet migrations between them are often unavoidable for reasons such as cost, compliance, or application compatibility. In this blog we will discuss why it is important to do a database migration the right way, and walk through a hands-on example using AWS DMS.

Why a live database migration is a challenge

During a live migration the database must remain operational throughout the transition, which risks data inconsistency, extended downtime, and increased manual effort. Database migration is challenging for several reasons:

  • Data Type Differences: MySQL and PostgreSQL do not share identical data types or schema conventions, so discrepancies between them must be handled explicitly.

  • Schema Conversion: Adapting schema definitions and constraints to match PostgreSQL requirements.

  • Downtime: Potential service disruption if the database is live and critical.

  • Manual Effort: Increased risk of human error and additional time in manual configuration.

  • Data Integrity: Ensuring consistency and accuracy during data transformation and transfer.

Different ways to perform database migration

There are several ways to perform a database migration, each with its own pros and cons.

One of the most popular approaches is using SQL dumps: export the data with mysqldump and import it into PostgreSQL. This works, but it is manual and error prone. Similarly, ETL tools like Talend or Apache NiFi can be used, and there are specialized utilities such as pgloader for automated schema and data conversion.

But all of these approaches carry significant risks of data loss, downtime, and human error. This is where AWS DMS comes into the picture.

What is AWS DMS?

AWS DMS is a managed service from AWS that facilitates database migration to AWS. It supports both homogeneous (same database engine) and heterogeneous (different database engines) migrations.

When to Use AWS DMS

  • Migrating on-premises databases to the cloud.

  • Modernizing or consolidating databases across different environments.

  • Enabling continuous data replication during live migrations with minimal downtime.

Benefits of AWS DMS

  • Cost-Efficient: Reduces overhead with a managed solution.

  • Minimal Downtime: Supports continuous data replication, ensuring business continuity.

  • Simplified Management: Automates replication and offers robust monitoring tools.

  • Flexibility: Adapts to various migration scenarios, whether homogeneous or heterogeneous.

How to set up DMS

Migrating databases between different engines or environments can be challenging. As discussed earlier, of the various approaches, AWS DMS provides the most robust way to perform this task. Let’s set up a DMS service using Terraform.

Automating AWS DMS Cluster Deployment with Terraform


Setting up AWS Database Migration Service (DMS) with Terraform makes the process highly automated and reproducible. In this lab, we’ll create a full Terraform configuration that provisions a DMS replication instance, configures source and target endpoints, defines a replication task for migrating an entire schema, and sets up the necessary IAM roles for seamless operation.


1. AWS Provider Setup

Terraform starts by configuring the AWS provider. This tells Terraform where to deploy your resources. Adjust the region as needed:

provider "aws" {
  region = "us-east-1"  # Adjust region as needed
}
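To keep runs reproducible, it also helps to pin the provider version. A minimal sketch (the exact version constraints are illustrative, not prescriptive):

```hcl
terraform {
  required_version = ">= 1.3.0" # illustrative constraint

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # pin a major version so provider upgrades are deliberate
    }
  }
}
```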

2. Creating the DMS Replication Instance

The replication instance is the core compute resource for DMS. It handles the data migration process. Here’s the complete resource configuration:

resource "aws_dms_replication_instance" "dms_instance" {
  replication_instance_id   = "dms-instance"
  replication_instance_class = "dms.t3.medium"  # A small instance class to control cost
  allocated_storage         = 50   # Modest storage, enough for a lab setup
  publicly_accessible       = true # Keep this false in a production environment
}

This creates a replication instance named dms-instance using the dms.t3.medium class, with 50 GB of allocated storage.
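In a real VPC you would typically also place the instance in a replication subnet group so it can reach your databases privately. A hedged sketch (the subnet IDs are placeholders):

```hcl
resource "aws_dms_replication_subnet_group" "dms_subnets" {
  replication_subnet_group_id          = "dms-subnet-group"
  replication_subnet_group_description = "Subnets for the DMS replication instance"
  subnet_ids                           = ["subnet-aaaa1111", "subnet-bbbb2222"] # placeholders
}
```

You would then reference it from the replication instance with `replication_subnet_group_id = aws_dms_replication_subnet_group.dms_subnets.id`.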


3. Configuring DMS Endpoints

Endpoints specify where data is coming from (source) and where it is going (target). This example uses an Aurora MySQL database as the source and an Aurora PostgreSQL database as the target.

Source Endpoint – Aurora MySQL

resource "aws_dms_endpoint" "source_mysql" {
  endpoint_id   = "source-mysql-endpoint"
  endpoint_type = "source"
  engine_name   = "mysql"

  username      = "admin"
  password      = "SUPERSECRET"
  server_name   = "rds-mysql-url.us-east-1.rds.amazonaws.com"
  port          = 3306
  database_name = "sample_db"
}

Target Endpoint – Aurora PostgreSQL

resource "aws_dms_endpoint" "target_postgres" {
  endpoint_id   = "target-postgres-endpoint"
  endpoint_type = "target"
  engine_name   = "postgres"

  username      = "master"
  password      = "SUPERSECRET"
  server_name   = "rds-postgres-url.us-east-1.rds.amazonaws.com"
  port          = 5432
  database_name = "sample_db"
}

Here we set up the source and the target: DMS reads from the MySQL endpoint and writes to the PostgreSQL endpoint, so both sets of connection details must be correct for DMS to know where to load the data.
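Hard-coding a password like SUPERSECRET is fine for a lab, but in practice you would pass credentials in as sensitive variables (or fetch them from a secrets store). A minimal variable-based sketch:

```hcl
variable "source_db_password" {
  type      = string
  sensitive = true # keeps the value out of plan/apply output
}

# Then, inside the endpoint resource:
#   password = var.source_db_password
```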

4. Defining the DMS Replication Task

The replication task orchestrates the data migration. This task defines what to migrate, the migration type, and includes detailed settings to control the behavior during the migration.

resource "aws_dms_replication_task" "dms_task" {
  replication_task_id      = "mysql-to-postgres-migration"
  replication_instance_arn = aws_dms_replication_instance.dms_instance.replication_instance_arn
  source_endpoint_arn      = aws_dms_endpoint.source_mysql.endpoint_arn
  target_endpoint_arn      = aws_dms_endpoint.target_postgres.endpoint_arn

  migration_type = "full-load"  # Use "full-load-and-cdc" for ongoing changes

  table_mappings = <<EOF
{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "IncludeAllTables",
      "object-locator": {
        "schema-name": "sample_db",
        "table-name": "%"
      },
      "rule-action": "include"
    }
  ]
}
EOF

  replication_task_settings = <<EOF
{
  "TargetMetadata": {
      "TargetSchema": "",
      "SupportLobs": true,
      "FullLobMode": false,
      "LobChunkSize": 64,
      "LimitedSizeLobMode": true,
      "LobMaxSize": 32,
      "InlineLobMaxSize": 0,
      "LoadMaxFileSize": 0,
      "ParallelLoadThreads": 0,
      "ParallelLoadBufferSize": 0,
      "BatchApplyEnabled": false,
      "TaskRecoveryTableEnabled": false
  },
  "FullLoadSettings": {
      "TargetTablePrepMode": "DO_NOTHING",
      "CreatePkAfterFullLoad": false,
      "StopTaskCachedChangesApplied": false,
      "StopTaskCachedChangesNotApplied": false,
      "BatchApplyEnabled": false,
      "BatchApplyPreserveTransaction": false,
      "BatchApplyTimeoutMin": 0,
      "BatchApplySkipErrorTables": false
  },
  "Logging": {
      "EnableLogging": true
  }
}
EOF
}

Important details:

  • Task Identification: The replication_task_id uniquely labels the task.

  • Migration Type: It uses full-load to migrate the entire schema and data at once.

  • Table Mappings: The JSON mapping rule ensures that all tables in the sample_db schema are included.

  • Task Settings: These JSON blocks provide detailed configurations like LOB handling and logging options, ensuring that the task behaves as expected.
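As the comment in the task resource notes, switching the migration type enables ongoing replication after the initial load; only one argument changes (sketch — the rest of the resource stays as above):

```hcl
# Variant: keep replicating changes after the initial full load.
# The MySQL source must have binary logging enabled (binlog_format = ROW)
# for change data capture to work.
migration_type = "full-load-and-cdc"
```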


5. Configuring IAM Roles for DMS

AWS DMS requires specific IAM roles to interact with your VPC and CloudWatch logs. Two roles are created—one for VPC management and another for CloudWatch logging.

IAM Role for VPC Management

resource "aws_iam_role" "dms_vpc_role" {
  name = "dms-vpc-role"
  assume_role_policy = jsonencode({
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "Service": "dms.amazonaws.com"
        },
        "Action": "sts:AssumeRole"
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "dms_vpc_role_attach" {
  role       = aws_iam_role.dms_vpc_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonDMSVPCManagementRole"
}

Here, the assume-role policy grants the DMS service (dms.amazonaws.com) permission to assume the role, and the policy attachment grants it the AWS-managed AmazonDMSVPCManagementRole permissions.

IAM Role for CloudWatch Logs

This role handles the logging permissions. The policy attachment ensures DMS can write logs to CloudWatch for monitoring and troubleshooting.

resource "aws_iam_role" "dms_cloudwatch_logs_role" {
  name = "dms-cloudwatch-logs-role"
  assume_role_policy = jsonencode({
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "Service": "dms.amazonaws.com"
        },
        "Action": "sts:AssumeRole"
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "dms_cloudwatch_logs_role_attach" {
  role       = aws_iam_role.dms_cloudwatch_logs_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonDMSCloudWatchLogsRole"
}
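One common gotcha: the replication instance cannot be created until the dms-vpc-role exists, and Terraform does not infer that dependency because the role is looked up by name rather than referenced by attribute. Adding an explicit depends_on to the replication instance defined earlier avoids the race (sketch — add this inside the existing resource block, don’t duplicate it):

```hcl
resource "aws_dms_replication_instance" "dms_instance" {
  # ... arguments as shown earlier ...

  # Ensure the IAM roles and their policy attachments exist first
  depends_on = [
    aws_iam_role_policy_attachment.dms_vpc_role_attach,
    aws_iam_role_policy_attachment.dms_cloudwatch_logs_role_attach,
  ]
}
```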

Creating the DMS cluster

Let’s apply the Terraform configuration and observe the changes.

terraform apply
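It can help to export the relevant ARNs so you can track the task after the apply; an optional output sketch:

```hcl
output "replication_instance_arn" {
  value = aws_dms_replication_instance.dms_instance.replication_instance_arn
}

output "replication_task_arn" {
  value = aws_dms_replication_task.dms_task.replication_task_arn
}
```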

Observe the created DMS service

If everything is correct, Terraform will start creating the DMS replication instance.

Pre-migration

A pre-migration assessment is an important process that analyzes the source database to identify compatibility issues, estimate migration effort, and plan the migration strategy.

Why is it important?

  • Detects schema and data type mismatches.

  • Estimates required changes and risks.

  • Helps choose the right tools and migration path.

  • Reduces surprises during migration.

Once you hit Create migration assessment, DMS starts the process and produces a migration assessment report.

Reviewing migration results

After the migration, AWS DMS can store detailed logs and migration reports in an S3 bucket. When reviewing them, focus on these points:

  • Verify which tables and rows were migrated.

  • Identify any skipped or failed records.

  • Troubleshoot issues using detailed error logs.

  • Ensure data consistency between source and target.

Here is what a sample report looks like.

Finally, let’s start migration

Now comes the exciting part—starting the migration!

With everything configured and validated, kicking off the DMS task sets your data in motion. Watch as your tables begin flowing from MySQL to PostgreSQL, live and in real time. It’s the moment where planning turns into action, all with minimal downtime and full control via DMS.
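If you prefer to kick off the task from Terraform rather than the console, recent AWS provider versions expose a start flag on the task resource (hedged — verify your provider version supports it before relying on it):

```hcl
resource "aws_dms_replication_task" "dms_task" {
  # ... arguments as before ...

  start_replication_task = true # starts the task as part of the apply
}
```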

Once this task completes, your database has been successfully migrated from MySQL to PostgreSQL. You can celebrate now! Yeyyy!

Chandradeo Arya's Tech Blog

DevOps & Cloud Instructor, Curriculum Author, Solutions & AI Architect