AWS DMS Setup

How to set up AWS DMS:

This assumes the PostgreSQL DB has already been created.

SETUP SCHEMA:

A. Get the sqlserver2pgsql.pl script from https://github.com/dalibo/sqlserver2pgsql/blob/master/INSTALL.md 

B. Install perl

C. In SQL Server Management Studio, right-click the database you want to migrate and select Tasks -> Generate Scripts

D. In the terminal, go to where the sqlserver2pgsql.pl script is kept and run the following:
~/sqlserver2pgsql.pl -f <THE NAME OF THE FILE DUMP THAT CAME FROM PART C> -b before.sql -a after.sql -u unsure.sql -keep_identifier_case -stringtype_unspecified

E. Connect to the PostgreSQL DB and run the before script (this will build the tables, types, and columns)
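For example, running the before script from the terminal might look like this (host, user, and database names are placeholders for your environment):

```
psql -h <postgres-host> -U <user> -d <database> -f before.sql
```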

F. Migrate your data over as listed in the 'SETUP AND RUN AWS DMS' section

G. Make sure that everything is migrated over

H. Connect to the PostgreSQL DB and run the after and unsure scripts
- Do so only after you've reviewed them for accuracy and confirmed that all constraint names are 63 characters or less
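One quick way to spot over-long constraint names is a grep over the generated scripts (a sketch; it assumes after.sql and unsure.sql from step D are in the current directory). PostgreSQL silently truncates identifiers longer than 63 characters, which can cause duplicate-name collisions.

```shell
# Print any line declaring a constraint whose name is 64+ characters.
# "|| true" keeps the exit status clean when nothing matches.
grep -nE 'CONSTRAINT +"?[A-Za-z0-9_]{64,}' after.sql unsure.sql || true
```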

I. Go play some Starcraft, you've earned it

SETUP AND RUN AWS DMS:

  1. Create your Replication Instance on AWS DMS
    • Not needed anymore, can just use the one that's already built
    • Make sure the Replication Instance's IP address is allowed in the VPC security groups of both the Source endpoint's DB and the Target endpoint's DB; if not, add it to both
  2. Create your endpoints (source and target)
    • Test the connection with the Replication Instance before creating each endpoint (the endpoint will be created even if the Replication Instance can't connect to it)
  3. Do everything here on the MSSQL DB
    • https://aws.amazon.com/premiumsupport/knowledge-center/dms-migrate-rds-sqlserver/ 
    • I know it sucks to have to enable CDC for each table, so just run the AutoWriter program if the tables differ from what's already there; make sure you replace the strings in the file first
    • I've noticed better results (i.e., the task runs without errors) by just running 'exec sys.sp_cdc_stop_job;' and not turning the job back on. Keep in mind that you probably won't get your logs that way, so turn it back on if you need them
    • You may get a table that fails; that's fine, as it's probably a table MSSQL created itself and isn't part of the DB
      • Run these to get the tables if needed:

        For the dbo SCHEMA:
          SELECT TABLE_NAME
          FROM INFORMATION_SCHEMA.TABLES
          WHERE TABLE_TYPE = 'BASE TABLE' AND TABLE_SCHEMA = 'dbo'

        For the workflow SCHEMA:
          SELECT TABLE_NAME
          FROM INFORMATION_SCHEMA.TABLES
          WHERE TABLE_TYPE = 'BASE TABLE' AND TABLE_SCHEMA = 'workflow'
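If you'd rather not use AutoWriter, the per-table CDC calls it automates can be generated with a quick loop like this (a sketch; the table names are placeholders — substitute the output of the queries above, and adjust @source_schema for the workflow tables):

```shell
# Emit one sys.sp_cdc_enable_table statement per table in the dbo schema.
# Replace the placeholder table list with the INFORMATION_SCHEMA results.
for t in SomeTable AnotherTable; do
  printf "EXEC sys.sp_cdc_enable_table @source_schema = N'dbo', @source_name = N'%s', @role_name = NULL;\n" "$t"
done
```

Run the emitted statements on the MSSQL DB.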
  4. Create a DB migration task
    • Task Identifier: Something unique
    • Descriptive Amazon Resource Name (ARN): Skip this
    • Replication instance: Choose the Replication Instance from step 1
    • Select your Endpoints
    • Migration Type - I recommend 'Migrate existing data and replicate ongoing changes'
    • Editing mode: Wizard
    • Target table preparation mode: Do Nothing
    • Include LOB columns in replication: Full LOB mode
    • Maximum LOB size (KB): I set it to 10,000 but you can set it to whatever is needed
    • Enable validation: true
    • Enable CloudWatch logs: true
    • Advanced task settings
      • Create control table in target using schema: dms
      • Enable control table - enable all
    • Table mappings
      • Create two new selection rules and include SCHEMAs 'dbo' and 'workflow'
      • Create one new transformation rule with Target -> Schema, Schema name -> 'dbo', Action -> 'Rename to', Value -> 'public'
    • Migration task startup configuration: Automatically start
    • Save and start it
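For reference, the selection and transformation rules above correspond roughly to this table-mappings JSON (a sketch; the rule ids and names are arbitrary placeholders):

```json
{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "include-dbo",
      "object-locator": { "schema-name": "dbo", "table-name": "%" },
      "rule-action": "include"
    },
    {
      "rule-type": "selection",
      "rule-id": "2",
      "rule-name": "include-workflow",
      "object-locator": { "schema-name": "workflow", "table-name": "%" },
      "rule-action": "include"
    },
    {
      "rule-type": "transformation",
      "rule-target": "schema",
      "rule-id": "3",
      "rule-name": "rename-dbo-to-public",
      "object-locator": { "schema-name": "dbo" },
      "rule-action": "rename",
      "value": "public"
    }
  ]
}
```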

Revision #2
Created 18 July 2022 23:24:53 by Bryce Holloway
Updated 9 February 2023 20:20:49 by Bryce Holloway