Seeding Initial Data Using Docker Compose and SQL Scripts

Table of Contents

Introduction
#

When setting up a new application, especially in development and testing environments, having a consistent and repeatable process for initializing your database with essential data is crucial. This article explores how the Contact Management Application uses Docker Compose and SQL scripts to automate the seeding process, ensuring that every instance of the application starts with the necessary baseline data.

This guide will cover:

Why data seeding is important.
Using Docker Compose to run MS SQL Server and execute SQL scripts for data seeding.
Writing SQL seed scripts for initial data population.
Automating the data seeding process with Docker Compose and ensuring the application is always ready with required data.

Part of the Series
#

This article is part of a series on building and understanding the Contact Management Application, a sample project showcasing clean architecture principles. Below are the other articles in the series:

Clean Architecture: Introduction to the Project Structure{:target="_blank" rel=“noopener noreferrer”}
Clean Architecture: Implementing AutoMapper for DTO Mapping and Audit Logging{:target="_blank" rel=“noopener noreferrer”}
Clean Architecture: Validating Inputs with FluentValidation{:target="_blank" rel=“noopener noreferrer”}
Clean Architecture: Dependency Injection Setup Across Layers{:target="_blank" rel=“noopener noreferrer”}
Clean Architecture: Handling Authorization and Role-Based Access Control (RBAC){:target="_blank" rel=“noopener noreferrer”}
Clean Architecture: Implementing Activity Logging with Custom Attributes{:target="_blank" rel=“noopener noreferrer”}
Clean Architecture: Unit of Work Pattern and Its Role in Managing Transactions{:target="_blank" rel=“noopener noreferrer”}
Clean Architecture: Using Dapper for Data Access and Repository Pattern{:target="_blank" rel=“noopener noreferrer”}
Clean Architecture: Best Practices for Creating and Using DTOs in the API{:target="_blank" rel=“noopener noreferrer”}
Clean Architecture: Error Handling and Exception Management in the API{:target="_blank" rel=“noopener noreferrer”}
Clean Architecture: Dockerizing the .NET Core API, Angular and MS SQL Server{:target="_blank" rel=“noopener noreferrer”}
Clean Architecture: Seeding Initial Data Using Docker Compose and SQL Scripts (You are here)

1. Why Data Seeding is Important
#

Data seeding is the process of pre-populating the database with initial data that is necessary for the application to function, or for testing purposes. Some of the main reasons for using data seeding include:

Initial configuration data: Populating lookup tables or default values (e.g., roles, permissions).
Test data: Providing sample data for developers or testers to interact with.
Consistent environments: Ensuring the database is in a known state every time the application starts, making it easier to debug or run tests.

2. Using Docker Compose to Seed Data
#

Docker Compose allows you to orchestrate containers, such as MS SQL Server and your API, and automate the data seeding process. To seed the database with initial data, you can use SQL scripts that run when the MS SQL Server container starts.

2.1 Extending the Docker Compose File for Data Seeding
#

We will update the docker-compose.yml file to include a volume that will mount the SQL seed scripts to the MS SQL Server container.

services:
  frontend:
    build:
      context: ./frontend
      dockerfile: debug.dockerfile
    command: ["npm", "run", "start:debug"]
    ports:
      - 4200:4200
      - 49153:49153
    volumes:
      - ./frontend:/app
      - /app/node_modules
    stdin_open: true
    tty: true
    depends_on:
      - api
    networks:
      - mssql_network
 
  api:
    build:
      context: ./backend/src
      dockerfile: Debug.Dockerfile
    command: ["dotnet", "watch", "--project", "Contact.Api/Contact.Api.csproj", "run", "--urls", "http://0.0.0.0:5000"] 
    ports:
      - 5000:5000

      
    environment:
      - ASPNETCORE__ENVIRONMENT=${ENVIRONMENT}
      - DOTNET_SKIP_POLICY_LOADING=false
      - AppSettings__ConnectionStrings__DefaultConnection=Server=${SQL_SERVER};Database=${SQL_DATABASE};User ID=${SQL_USER};Password=${SQL_PASSWORD};Trusted_Connection=False;Encrypt=False;
      - AppSettings__Secret=${JWT_SECRET}
      - AppSettings__Issuer=${JWT_ISSUER}
      - AppSettings__Audience=${JWT_AUDIENCE}
      - AppSettings__PasswordResetUrl=${PASSWORD_RESET_URL}
      - SmtpSettings__SmtpServer=${SMTP_SERVER}
      - SmtpSettings__Port=${SMTP_PORT}
      - SmtpSettings__Username=${SMTP_USERNAME}
      - SmtpSettings__Password=${SMTP_PASSWORD}
      - SmtpSettings__FromEmail=${SMTP_FROM_EMAIL}
      - SmtpSettings__EnableSsl=${SMTP_ENABLE_SSL}
    volumes:
      - ./backend/src:/app
      - ~/.vsdbg:/remote_debugger:rw
    depends_on:
      - mssql
    networks:
      - mssql_network
      
  mssql:
    image: mcr.microsoft.com/mssql/server:2022-latest
    container_name: sqlserver_express
    environment:
      - ACCEPT_EULA=Y
      - MSSQL_PID=Express   # Specifies the edition to run as Express
      - MSSQL_SA_PASSWORD=${SQL_PASSWORD}   # Set the SA (System Administrator) password
    ports:
      - "1433:1433"  # Expose SQL Server port 1433
    volumes:
      - mssql_data:/var/opt/mssql  # Persist database data outside of the container
      - ./backend/scripts:/scripts  # Mount for SQL scripts
    entrypoint:
      - /bin/bash
      - -c
      - |
        /opt/mssql/bin/sqlservr & sleep 15s;
        /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P ${SQL_PASSWORD} -d master -i /scripts/seed-data.sql; wait
    networks:
      - mssql_network

volumes:
  mssql_data: # Named volume to persist data

networks:
  mssql_network:
    driver: bridge

Explanation:
#

volumes: The db service uses a volume to mount the ./scripts directory from your host machine to the container directory seed-data.sql. This directory will contain SQL scripts that will be executed when the container starts, it is configured in entrypoint, here it is waiting for 15 s and then running the command to seed the data.

3. Writing SQL Seed Scripts
#

The SQL seed scripts are responsible for inserting the initial data into your database. For example, we can create scripts to insert data into roles, permissions, and other necessary tables.

Create the directory ./scripts/ and add the SQL script files.

Example seed-data.sql
#

USE ContactDb;

-- Insert default roles
INSERT INTO Roles (Id, Name, CreatedOn)
VALUES (NEWID(), 'Admin', GETDATE()),
       (NEWID(), 'User', GETDATE());

-- Insert default permissions
INSERT INTO Permissions (Id, Name, CreatedOn)
VALUES (NEWID(), 'Contacts.Create', GETDATE()),
       (NEWID(), 'Contacts.Update', GETDATE()),
       (NEWID(), 'Contacts.Delete', GETDATE()),
       (NEWID(), 'Contacts.Read', GETDATE());

-- Map Admin role to permissions
DECLARE @AdminRoleId UNIQUEIDENTIFIER;
SET @AdminRoleId = (SELECT Id FROM Roles WHERE Name = 'Admin');

INSERT INTO RolePermissions (Id, RoleId, PermissionId, CreatedOn)
SELECT NEWID(), @AdminRoleId, p.Id, GETDATE()
FROM Permissions p;

Explanation:
#

The script seeds the Roles and Permissions tables with default data.
The Admin role is associated with all permissions for contacts (e.g., create, update, delete, read).

For full script for this project please refer seed-data.sql

4. Automating the Seeding Process
#

When the MS SQL Server container starts, it will automatically execute the SQL scripts located in /scripts/ (mounted via the volume). This ensures that the database is always seeded with the initial data.

4.1 Adding Seed Scripts to Docker Compose Workflow
#

To automate this process:

Ensure that all the SQL scripts are located in ./scripts/.
Updated the enterypoint command

 entrypoint:
      - /bin/bash
      - -c
      - |
        /opt/mssql/bin/sqlservr & sleep 15s;
        /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P ${SQL_PASSWORD} -d master -i /scripts/seed-data.sql; wait

Use the docker-compose up command to start both the API and the database containers. The database container will automatically execute the seed scripts upon startup.

4.2 Running Docker Compose

docker-compose up --build

Once the containers are running, the seed scripts will be executed, and you can access the MS SQL Server database with the pre-populated data.

5. Verifying the Seed Data
#

You can verify that the data has been correctly seeded by accessing the MS SQL Server container and querying the database.

5.1 Connecting to the MS SQL Server Container
#

To connect to the MS SQL Server container, use the following command:

docker exec -it <db-container-name> /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P YourPassword

Once connected, you can run queries to verify the data has been seeded:

USE ContactDb;
SELECT * FROM Roles;
SELECT * FROM Permissions;
SELECT * FROM RolePermissions;

These queries will return the data that was inserted by the seed scripts.

6. Benefits of Seeding Data Using Docker
#

6.1 Consistency Across Environments
#

Using Docker Compose to seed data ensures that every environment (local, testing, staging, or production) is initialized with the same data, reducing the chances of inconsistencies.

6.2 Automation
#

The seed data is automatically applied whenever the database container starts, simplifying the process of setting up the database. This is particularly useful when working in CI/CD environments or when multiple developers are working on the same project.

6.3 Reproducibility
#

By defining the seed data in SQL scripts, you can easily reproduce the same initial data set across multiple environments, ensuring that tests and development are done on identical data.

Conclusion
#

In this article, we explored how to use Docker Compose and SQL Scripts to seed initial data in the MS SQL Server database for the Contact Management Application. This process ensures that the database is always pre-populated with the necessary data whenever the application is run, making the development process more efficient and consistent.

In the next article, we will discuss error handling and exception management in the Contact Management API.

For more details, check out the full project on GitHub:

GitHub Repository: Contact Management Application{:target="_blank" rel=“noopener noreferrer”}

Introduction#

Part of the Series#

1. Why Data Seeding is Important#

2. Using Docker Compose to Seed Data#

2.1 Extending the Docker Compose File for Data Seeding#

Explanation:#

3. Writing SQL Seed Scripts#

Example seed-data.sql#

Explanation:#

4. Automating the Seeding Process#

4.1 Adding Seed Scripts to Docker Compose Workflow#

5. Verifying the Seed Data#

5.1 Connecting to the MS SQL Server Container#

6. Benefits of Seeding Data Using Docker#

6.1 Consistency Across Environments#

6.2 Automation#

6.3 Reproducibility#

Conclusion#

Related

Introduction
#

Part of the Series
#

1. Why Data Seeding is Important
#

2. Using Docker Compose to Seed Data
#

2.1 Extending the Docker Compose File for Data Seeding
#

Explanation:
#

3. Writing SQL Seed Scripts
#

Example seed-data.sql
#

Explanation:
#

4. Automating the Seeding Process
#

4.1 Adding Seed Scripts to Docker Compose Workflow
#

5. Verifying the Seed Data
#

5.1 Connecting to the MS SQL Server Container
#

6. Benefits of Seeding Data Using Docker
#

6.1 Consistency Across Environments
#

6.2 Automation
#

6.3 Reproducibility
#

Conclusion
#