Job Description
As a Data Architect at SAVYMINDS, you will design, deploy, and manage scalable data architectures, both in the cloud and on-premises, that serve as the backbone of our AI-driven solutions. You will implement scalable, secure, highly available cloud architectures, integrate automation, and ensure data quality, building cloud-based infrastructure that supports advanced analytics and machine learning. You will also bridge the gap between development and operations, automating workflows and enabling rapid deployment of applications in cloud environments.
Key Responsibilities
- Design and deploy cloud-based infrastructure using platforms like AWS, Azure, or Google Cloud.
- Implement serverless computing solutions to optimize performance and reduce costs.
- Manage cloud resources, ensuring scalability, security, and efficiency across all systems.
- Ensure compliance with security best practices and industry regulations within the cloud environment.
- Develop and maintain comprehensive data architecture frameworks that ensure data consistency and scalability.
- Design roadmaps for infrastructure upgrades and integration with CI/CD pipelines.
- Assess and improve data quality, ensuring that data pipelines are robust and resilient.
- Design, implement, and maintain CI/CD pipelines for automated deployment of applications and infrastructure.
- Automate infrastructure provisioning and scaling using infrastructure-as-code (IaC) tools.
- Monitor system performance and troubleshoot any issues to ensure high availability and reliability.
- Implement security best practices within CI/CD workflows to ensure compliance and protect against threats.
- Act as a strategic advisor on data governance and compliance, ensuring that all data practices align with industry best practices and standards.
- Build and manage scalable ETL (Extract, Transform, Load) pipelines to process and integrate data from various sources.
- Design comprehensive data flow solutions that ensure data quality and efficiency.
- Implement both real-time and batch processing pipelines, optimizing them for performance and reliability.
- Collaborate with Data Scientists and ML Engineers to provide clean, well-structured data for modeling and analysis, and integrate cloud services with data pipelines and applications.
Qualifications
- Proficiency in cloud platforms such as AWS, Azure, or Google Cloud.
- Experience with infrastructure-as-code tools such as Terraform or CloudFormation.
- Strong understanding of networking, security, and cloud architecture principles.
- Knowledge of serverless computing, containers (e.g., Docker, Kubernetes), and microservices architecture.
- Proven experience in designing cloud-based and on-premises data architectures.
- Proficient in data modeling, ETL design, and data warehousing.
- Experience with CI/CD tools and data pipeline automation.
- Strong understanding of data governance, security, and compliance regulations.
- Experience with cloud-based ETL tools such as Apache NiFi, AWS Glue, or Azure Data Factory.
- Strong programming skills in Python, SQL, and other relevant languages.
- Knowledge of big data frameworks like Apache Kafka, Spark, and Hadoop.
- Proficiency in DataOps practices for automating data pipelines and ensuring observability.
- Familiarity with both on-premises and cloud data storage solutions, including data lakes and warehouses.
- Hands-on experience with tools such as Apache Kafka, Spark, AWS, Azure, and GCP.
- Experience with CI/CD tools such as Jenkins or GitLab CI.
- Understanding of security practices and tools (e.g., Vault, IAM policies).
- Ability to work in a fast-paced environment and manage multiple projects simultaneously.
- Strong communication skills, capable of translating complex technical concepts for business stakeholders and working with cross-functional teams.
- Problem-solving mindset with a keen eye for optimizing data infrastructure and automation.
