The worldwide Infrastructure as Code (IaC) market is expected to grow at a compound annual growth rate (CAGR) of 20.3%, reaching USD 3.3 billion by 2030. IaC remains a key way to automate both infrastructure and applications as industries continue to use cloud technology to improve their DevOps processes. Demand for automated migration has grown accordingly among businesses looking to move to the cloud. Since everything in the cloud is exposed through APIs, cloud migration automation tools can accelerate the adoption of sound cloud practices such as Cloud Security Posture Management (CSPM), DevOps with IaC and Configuration as Code (CaC), and Continuous Integration and Continuous Deployment (CI/CD). In this blog, we look at the challenges of large-scale migrations, how automation enablers and key practices help, and the solutions and tools that have worked for us in migrating large workloads.
Automated cloud migration techniques and tools have become an invaluable resource for managing large-scale cloud migrations effectively. Recent statistics indicate that 45% of companies are now employing Infrastructure as Code (IaC) tools, reflecting an increasing embrace of this method. In addition, 74% of IT executives are convinced that IaC will play a critical role in their future cloud strategies, underscoring a high level of confidence in its prospective benefits. Streamlining migrations and lowering risks can be achieved by establishing a structured framework, standardizing the procedures and by using the best practices in cloud migration automation.
Key pillars from the Google Cloud Architecture Framework, such as reliability, operational excellence, cost optimization, performance efficiency, and sustainability, act as guiding principles to ensure a seamless and effective migration process. The objective of automating cloud migration is to establish automation and standards that accelerate implementation of the platform’s setup, migration, and support services. Let’s look at some of the common challenges businesses face when migrating large workloads to the cloud.
Prevalent Hurdles with Large Scale Migrations
Drawing on our extensive migration experience, we have found the following issues to be common in large-scale migrations:
- Lack of Architecture and Design Framework: There is a notable lack of architecture and design framework for the different applications and databases that are being migrated by businesses. This can result in multiple issues including operational inefficiencies, scalability hurdles, unplanned downtime and increased costs.
- Manual Migrations or Using Tools Without Planning: Migrations are often done manually, or with migration tools but without a plan, which increases the chance of human error, downtime, and inconsistent processes.
- Fragmented Automation Practices: Migrations done with Terraform (IaC), Ansible (CaC), and CI/CD are often built in isolation and do not add up to standard solutions. Reusability is not considered when devising the automation artifacts and solutions.
- Organisation Preparedness for Automation: It is important to understand the organisation's current state of automation, expertise, and skills; otherwise, the team will not be able to move at full capacity.
Key Practices that Enable Efficient Migrations
Certain enablers help accelerate the tedious process of large-scale migration. Below are the ones that have worked well for our large migrations:
- Detailed Design Document: Having a detailed design document encompassing the migration approach, automation strategy, and building of reusable components can enable better implementation of the migration strategy.
- Grouping Applications into Buckets with Similar Technology Stacks: Organizing applications into clusters that share similar technology stacks streamlines the automated cloud migration process, because uniform procedures and configurations can be applied to each cluster. Uniform solutions guarantee readiness for production through integrated features for security, monitoring, and support, which minimizes variability and the risk of errors.
- Standard Solutions: This involves an opinionated approach to infrastructure, application, and configuration that can still be tailored to specific needs. The default configuration should be production-ready and include key elements such as security, monitoring, and support.
- Infrastructure as Code (IaC) and Configuration as Code (CaC): Infrastructure can be created with Terraform or an equivalent tool. Migration of the application and database should be covered by IaC with Terraform and CaC with Ansible or ad hoc scripts.
- DevOps and CI/CD for Application Migration: Embracing DevOps methodologies and implementing CI/CD pipelines for transitioning VMs, GKE, Cloud Run, or Cloud Functions speeds up deployment, guarantees consistency, and adheres to best practices. These pipelines facilitate automated builds, testing, and deployments, decreasing manual effort.
- SRE Practices: Site reliability engineering (SRE) as a practice means defining SLIs, SLOs, and SLAs for application journeys and, more importantly, devoting time to automating mundane tasks and creating runbooks for them.
- Automated Testing and Validation: Testing and Validation should be automated, including network testing, component testing for validation of configuration and best practices, integration testing, etc.
- Minimal and Efficient Cutover and Rollback Steps: Keeping the cutover process to a few steps and clearly outlining rollback methods minimizes downtime and recovery duration, facilitating smoother transitions.
- Open-Source Tooling: Tools like Terraform and Ansible provide flexibility, cost-efficiency, and broad community support, enabling scalable and adaptable migration workflows.
- Comprehensive CI/CD Pipeline: Establishing CI/CD for infrastructure and applications, complete with approval workflows, security assessments, and testing, promotes a regulated and secure migration process. This guarantees that every change is reviewed and confirmed, reducing the chances of introducing vulnerabilities or errors.
- DevSecOps Pipeline Integration: Incorporating DevSecOps guarantees that security is embedded at every stage of the migration process, safeguarding sensitive data and infrastructure against possible threats.
- Tech Debt for Manual Tasks: The goal is end-to-end automation, but we will often find ourselves fighting timelines when automating features or tasks that are not yet supported by Terraform, or that take time to script or build as Ansible playbooks. In such cases, we record the manual steps as tech debt and automate them when necessary.
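The bucketing practice described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not our production tooling: the `App` record, the inventory entries, and the choice of runtime plus database as the "stack" are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    runtime: str   # e.g. "java", "python"
    database: str  # e.g. "mysql", "postgresql"


def bucket_by_stack(apps):
    """Group applications that share the same (runtime, database) stack."""
    buckets = defaultdict(list)
    for app in apps:
        buckets[(app.runtime, app.database)].append(app.name)
    # Sort the names so the result is stable across runs.
    return {stack: sorted(names) for stack, names in buckets.items()}


# Hypothetical inventory, for illustration only.
INVENTORY = [
    App("billing", "java", "mysql"),
    App("catalog", "python", "postgresql"),
    App("invoices", "java", "mysql"),
]

for stack, names in bucket_by_stack(INVENTORY).items():
    print(stack, names)
```

Each resulting bucket can then be served by one standard solution instead of one solution per application.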
Cloud Migration Automation Solutions
For effective large-scale migrations, the following instruments and techniques have proven to be vital:
- Documentation of Design and Architecture: Detailed documentation of every component guarantees understanding and alignment.
- Assessment: We use assessment tools like Velostrata and ad hoc scripts to assess the current infrastructure and build a plan for GCP migration.
- Setting Up Infrastructure: Terraform is used to provision infrastructure efficiently and reliably. Depending on the client's maturity, we use Terraform Cloud or Terraform Enterprise, and we keep the code scalable and reusable through modules, data sources, outputs, locals, and other Terraform best practices.
- Managing Configurations as Code (CaC): Ansible is utilized for the seamless management of configurations.
- Scripting and Validation: Python, Bash, or PowerShell are leveraged for creating temporary scripts and performing tests. Terratest provides automated testing to maintain system integrity.
- Task Automation: Bash scripts are employed to automate various tasks, thereby enhancing efficiency.
- Security Evaluations: Tools such as Checkov are incorporated for comprehensive security evaluations.
- Deployment and Management: Implementing a DevOps methodology guarantees the effective deployment of applications. Monitoring and observability are crucial for assessing performance and pinpointing issues.
- Enterprise or Premium Solutions: Open-source tooling such as Jenkins and GitLab for CI/CD, together with Terraform OSS, is a workable starting point; however, premium or enterprise editions of these DevOps tools can help. They offer advanced code management features and integrate with existing tooling such as ITSM and SMTP to make end-to-end automation possible.
- Cloud Operations and Ongoing Support: Detailed documentation aids the Cloud Operations team with post-deployment tasks.
Automated Cloud Migration Tools
Below are some common tools we use to automate all phases of a large migration and infrastructure setup:
- IaC with Terraform: Terraform creates and oversees infrastructure resources (such as VMs, networks, storage) in a uniform, repeatable way using declarative configuration files. It also automates the establishment of the necessary target infrastructure for migration, facilitating a seamless transition by ensuring consistent environments across development, staging, and production stages. We build modules or solutions for each application, ensuring that each needs minimal inputs and is adaptable to different environments.
- Ansible for CaC: Ansible automates the setup and management of systems, guaranteeing that servers and applications are configured uniformly. It also manages post-provisioning actions, including installing dependencies, applying system settings, and preparing the environment for application deployment. Playbooks and collections from Ansible Galaxy can be used, and we create ad hoc ones as per client requirements. These should be invoked via Terraform or via the CI/CD pipeline after the infrastructure has been created.
- Jenkins for CI: Jenkins or an equivalent CI tool orchestrates the CI pipeline by automating the testing, building, and packaging of applications to ensure code reliability prior to deployment. It enables the testing and validation of migrated workloads to confirm their functionality in the new setting.
- Deployment to Kubernetes with ArgoCD: ArgoCD facilitates continuous delivery by syncing Kubernetes manifests from Git repositories to clusters. It also oversees the deployment of applications to Kubernetes clusters, ensuring that the configurations align with the source of truth in the repository.
- Bash or Python Script for Ad Hoc Tasks: These scripts cater to specific, one-time tasks that are not covered by pre-configured tools (e.g., data transfers, file changes, or environment-specific adjustments). They also manage unique situations and quick fixes during migration, such as transferring data files or executing health checks.
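To illustrate the ordering described above (provision with Terraform first, then configure with Ansible), here is a minimal Python sketch of the kind of wrapper a pipeline stage might call. The directory, inventory, and playbook names are hypothetical; the commands themselves use the standard `terraform` and `ansible-playbook` CLIs, which are assumed to be installed on the runner.

```python
import subprocess


def build_pipeline(tf_dir, inventory, playbook):
    """Return the ordered commands: provision first, then configure."""
    return [
        ["terraform", f"-chdir={tf_dir}", "init", "-input=false"],
        ["terraform", f"-chdir={tf_dir}", "plan", "-input=false", "-out=tfplan"],
        ["terraform", f"-chdir={tf_dir}", "apply", "-input=false", "tfplan"],
        ["ansible-playbook", "-i", inventory, playbook],
    ]


def run_pipeline(commands, runner=subprocess.run):
    """Run each command in order; check=True stops at the first failure."""
    for cmd in commands:
        runner(cmd, check=True)


# Hypothetical paths, for illustration only.
COMMANDS = build_pipeline("infra/billing", "inventories/prod.ini", "site.yml")
```

Calling `run_pipeline(COMMANDS)` would execute the steps for real. Applying a saved plan file means that what gets applied is exactly what was reviewed, which fits an approval workflow.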
Coding Best Practices
- All code should be stored in a source code repository such as GitLab.
- Organize the code in a clear structure, for example three subfolders: (A) Landing zone, (B) Migration, and (C) CloudOps.
- Use a proper branching strategy.
- For Terraform, ensure the state is stored in a remote backend (for example Terraform Cloud) rather than on a local machine.
- Use existing Terraform modules from a Cloud provider or the community and then customize them as per the requirement.
- Ensure Ansible Playbook and Collections are well organised and well integrated with the pipeline.
Documentation
- For each solution, there should be a README file.
- A Wiki page should be provided on how to implement the solution.
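The repository and documentation conventions above are easy to enforce automatically. The following is a minimal sketch of a check that could run in CI; the folder names and the `README.md` filename are assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical folder names mirroring the three-subfolder layout.
REQUIRED_FOLDERS = ("landing-zone", "migration", "cloudops")


def missing_items(repo_root):
    """List missing top-level folders and solution folders without a README."""
    root = Path(repo_root)
    problems = []
    for folder in REQUIRED_FOLDERS:
        top = root / folder
        if not top.is_dir():
            problems.append(f"missing folder: {folder}")
            continue
        # Treat every direct subdirectory as one "solution".
        for solution in sorted(p for p in top.iterdir() if p.is_dir()):
            if not (solution / "README.md").is_file():
                problems.append(f"missing README: {folder}/{solution.name}")
    return problems
```

A pipeline step can fail the build whenever `missing_items(".")` returns a non-empty list.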
Setup and Migration – The Stages and Approach to Migrate to GCP
Migration involves different stages and approaches which can be automated or facilitated with correct tooling. Below, we showcase the approach specifically tailored for Google Cloud Platform:
For Lift and Shift Migration of VM
Re-hosting
Replatform
For re-platforming an application, here are some notable methods that can help migrate efficiently:
- VMware to GCP VM with autoscaling (MIG)
- VMware to GCVE with HCX.
In this journey, migration tools can migrate the VMs, orchestrate the work through migration runbooks, and help minimize downtime. Once a VM is in GCP, we need to bring it under Terraform management, either with Terraform import or by using its image or disk to recreate the application on GCP. An equivalent approach applies in GCVE.
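As a sketch of the Terraform import step, the helper below builds one `terraform import` command per migrated VM. The resource names, project, zone, and instance names are hypothetical, and the ID format shown is the one we understand the Google provider to accept for `google_compute_instance`; verify it against the provider documentation for your version.

```python
def import_command(resource_name, project, zone, instance):
    """Build a `terraform import` command for one migrated Compute Engine VM."""
    address = f"google_compute_instance.{resource_name}"
    # Assumed import ID format for google_compute_instance.
    instance_id = f"projects/{project}/zones/{zone}/instances/{instance}"
    return ["terraform", "import", address, instance_id]


# Hypothetical migrated VMs, for illustration only.
MIGRATED = [
    ("billing_vm", "demo-project", "asia-south1-a", "billing-01"),
    ("catalog_vm", "demo-project", "asia-south1-a", "catalog-01"),
]

for vm in MIGRATED:
    print(" ".join(import_command(*vm)))
```

Note that `terraform import` expects a matching resource block to exist in the configuration already, so the module code has to be written before the command is run.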
Modernizing Applications
For modernizing an application with cloud, we have seen the following methodologies work well for different applications and scenarios:
- App on VM to MIG
- App on VM to GKE (microservices)
- App on VM to App Engine for webapps
- App on VM to Cloud Run (microservices)
- App on VM to Cloud Function (serverless)
During this journey, infrastructure CI/CD pipeline and application CI/CD pipeline can be built. Tools like Google Cloud Deployment Manager, Terraform, and Cloud Build automate the creation and management of infrastructure and application CI/CD pipelines. For infrastructure, automation handles provisioning resources like VMs, networks, and storage. For applications, it automates builds, tests, and deployments, enabling continuous delivery with minimal manual intervention. These automated workflows ensure consistency, reduce errors, and accelerate the overall modernization journey.
Automation of App Infra Setup
When automating the setup of application infrastructure, we have noted the following to work well:
- Setup of GCP VMs with Terraform, and Packer for golden image creation.
- Setup of GKE with Terraform
- Setup of GCVE with Terraform
- Setup of Cloud Run with Terraform
- Setup of Cloud Functions with Terraform
- Setup of App Engine with Terraform
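One way to keep such Terraform setups reusable across environments is to generate the per-environment inputs rather than hand-edit them. The sketch below renders a `terraform.tfvars.json` style document from shared defaults plus environment overrides. The variable names and values are hypothetical and would have to match the variables your module actually declares.

```python
import json

# Hypothetical module inputs shared by every environment.
BASE = {"machine_type": "e2-medium", "min_replicas": 1, "max_replicas": 2}

# Hypothetical per-environment overrides.
OVERRIDES = {
    "dev": {},
    "prod": {"machine_type": "e2-standard-4", "min_replicas": 3, "max_replicas": 10},
}


def render_tfvars(app, env):
    """Merge defaults and overrides into JSON that Terraform can load as variables."""
    values = {"app_name": app, "environment": env, **BASE, **OVERRIDES[env]}
    return json.dumps(values, indent=2, sort_keys=True)


print(render_tfvars("billing", "prod"))
```

Writing the output to a file named `terraform.tfvars.json` in the module directory lets Terraform pick it up automatically.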
Automation of Deployments on VM
Application deployment on VMs can be automated efficiently with the following:
- App deployment on a standalone VM via CI/CD.
- App deployment on a standalone VM using blue/green deployment with a load balancer and DevOps tooling.
- App deployment on MIG via CD.
- App deployment on MIG as blue/green deployment via CD.
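The control flow of a blue/green deployment is small enough to sketch directly. In this illustration the `router` dictionary stands in for the load balancer, and `deploy` and `health_check` are hypothetical callables that a real pipeline would replace with its own deployment and probing steps.

```python
def blue_green_cutover(router, deploy, health_check):
    """Deploy to the idle colour, then switch traffic only if it is healthy."""
    live = router["live"]
    idle = "green" if live == "blue" else "blue"
    deploy(idle)
    if health_check(idle):
        router["live"] = idle  # cutover is a single step
        return f"switched to {idle}"
    # Rollback is trivial: traffic never left the live colour.
    return f"kept {live}"


router = {"live": "blue"}
print(blue_green_cutover(router, deploy=lambda colour: None, health_check=lambda colour: True))
```

Because the old colour keeps running until the new one passes its health check, both the cutover and the rollback stay at one step each.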
Automation of App Configuration Changes
While automating application configuration changes, the following method can help build a robust process for deployment:
- When setting up an app on a GCP VM, check the app configuration and update it to the relevant ones.
- When setting up an app on GCVE, find the relevant app configuration and change it accordingly.
- TLS validation changes:
  - Creation and TLS validation between app and the database
  - Creation and TLS validation for load balancer setup on GCP
This automation can be implemented with Bash or PowerShell scripts. We can use Google Secret Manager or HashiCorp Vault to keep the configuration outside the application, which makes it easier to manage and more secure. Application code changes will be required for this.
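As an illustration of automated TLS validation for a load balancer endpoint, the sketch below uses only Python's standard `ssl` and `socket` modules. The host name passed in is up to the caller and the alerting threshold is left out; both are assumptions to adapt.

```python
import socket
import ssl


def fetch_peer_cert(host, port=443, cafile=None, timeout=5.0):
    """Connect with certificate and hostname verification and return the peer cert."""
    context = ssl.create_default_context(cafile=cafile)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def days_until_expiry(not_after, now_epoch):
    """Whole days between now and the certificate's notAfter timestamp."""
    expires = ssl.cert_time_to_seconds(not_after)
    return int((expires - now_epoch) // 86400)
```

A check could call `fetch_peer_cert("lb.example.com")["notAfter"]` (a hypothetical host) and pass the result to `days_until_expiry` together with `time.time()`. Database protocols such as MySQL and PostgreSQL negotiate TLS inside their own handshake, so for app-to-database validation the database client's TLS options are the better fit.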
Infra Testing – Validation and Components Testing
For Validation and Components Testing, businesses can use the following steps to improve their infra testing processes:
- Automated testing of infra best practices using Checkov and Steampipe.
- Test plan for validation of all different components and their features used. This can be automated using an Ad Hoc script.
- Test plan for integration testing of all different components. This is automated using an Ad Hoc script.
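An ad hoc validation script of the kind mentioned above can be very small. The following sketch runs a list of named checks and reports the ones that fail; the TCP probe is one example check, and the host and port in the usage note are hypothetical.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_test_plan(checks):
    """Run (name, callable) pairs and return the names of the checks that failed."""
    return [name for name, check in checks if not check()]
```

A plan might be declared as `[("db reachable", lambda: tcp_port_open("10.0.0.5", 5432))]`, with the pipeline failing whenever `run_test_plan` returns a non-empty list.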
Automation of Adoption of Microservices to GKE
The containerization platform from Niveus can help with most of the below activities:
- Utility to create microservices
- Manual Dockerfile creation
- CI automation of building apps
- Automation (CI) of Kubernetes manifest (Helm) for deployment
- Automation of deployment via CD like Jenkins, ArgoCD or similar
- Blue/green deployment
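To show what automated manifest generation can look like, here is a minimal Python sketch that builds a Kubernetes Deployment per colour and a Service whose selector decides which colour receives traffic. It emits plain JSON manifests rather than a Helm chart, and the application name, image, and the `colour` label are hypothetical.

```python
import json


def deployment(app, colour, image, replicas=2):
    """Build an apps/v1 Deployment manifest for one colour of the app."""
    labels = {"app": app, "colour": colour}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"{app}-{colour}", "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": app, "image": image}]},
            },
        },
    }


def service(app, live_colour, port=80):
    """Build a Service that routes traffic to the live colour only."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": app},
        "spec": {
            "selector": {"app": app, "colour": live_colour},
            "ports": [{"port": port, "targetPort": port}],
        },
    }


# Hypothetical image reference, for illustration only.
print(json.dumps(deployment("catalog", "green", "example.com/catalog:2.0"), indent=2))
```

Switching from blue to green then amounts to re-rendering the Service with `live_colour="green"` and committing it, so that a CD tool such as ArgoCD can sync the change from Git.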
Databases
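We already use the Database Migration Service (DMS) utility, and we follow a standard operating procedure (SOP) to ensure it is well understood. After migration, use the Terraform import command to bring the resulting instances under Terraform management. The migration paths covered are:
- MySQL/PostgreSQL on-prem to Cloud SQL MySQL/PostgreSQL (DMS)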
- RDS MySQL/PostgreSQL to Cloud SQL MySQL/PostgreSQL (DMS)
- Azure MySQL/PostgreSQL to Cloud SQL MySQL/PostgreSQL (DMS)
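After a database migration, a simple automated check is to compare row counts between source and target. The sketch below shows the idea with Python's built-in `sqlite3` module purely as a stand-in so that it runs anywhere; against MySQL or PostgreSQL the same logic would use the corresponding database driver, and the table names are hypothetical.

```python
import sqlite3


def row_counts(conn, tables):
    """Count rows per table. Table names must come from a trusted, fixed list."""
    return {
        table: conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        for table in tables
    }


def count_mismatches(source, target, tables):
    """Return {table: (source_count, target_count)} for tables that differ."""
    src = row_counts(source, tables)
    dst = row_counts(target, tables)
    return {t: (src[t], dst[t]) for t in tables if src[t] != dst[t]}
```

Matching counts do not prove that the data is identical, so this works best as a fast first gate before deeper checks such as checksums.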
The following data stores are migrated using live migration or a dump:
- Elasticsearch, Logstash/Filebeat, Elastic Agent, and Kibana
- Solr
- MongoDB
- Redis
Monitoring
For automating monitoring, we suggest the following tools for comprehensive coverage:
- Prometheus and Grafana
- Cloud Monitoring
- ELK Dashboards
- Dynatrace (Third Party)
- AppDynamics (Third Party)
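Whichever of these tools collects the metrics, the SLIs and SLOs mentioned under SRE practices reduce to simple arithmetic. The sketch below computes how much of an error budget remains for a request-based SLO; the function name and the example figures are hypothetical.

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_failures = (1.0 - slo_target) * total_requests
    if allowed_failures <= 0:
        raise ValueError("no error budget: SLO is 100% or there were no requests")
    return 1.0 - failed_requests / allowed_failures


# Hypothetical month: 99.9% SLO, one million requests, 250 failures.
print(round(error_budget_remaining(0.999, 1_000_000, 250), 2))
```

With a 99.9% target, one million requests allow about 1,000 failures, so 250 failures leave roughly three quarters of the budget. A shrinking budget is a useful signal to pause further migration waves and stabilise first.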
Conclusion
As businesses move to the cloud, streamlining the migration process is becoming an essential requirement for cloud partners. As discussed, automation helps reduce human error and accelerates the implementation of cloud best practices. Managing large-scale migrations requires a methodical approach and the creation of reusable automated artifacts. Scalability, security, and reliability are improved by implementing tools like Terraform, Ansible, Jenkins, and Kubernetes. Automation offers many benefits, but taking one step at a time and creating curated, reusable automated artifacts is what helps most in large migrations. We at Niveus use these strategies to streamline and optimise cloud migrations, helping businesses focus on growth while we manage the complexities.