Wednesday, August 6, 2025

SRE/DevOps Engineer - Atlanta, GA

Title: Site Reliability Engineer (SRE) Profile

Client: Equifax

Location: Alpharetta, GA (Onsite)

Job Description

Seeking an experienced Site Reliability Engineer who can operate independently with limited guidance and oversight. This individual will be passionate about end-user experience and will be part of a tight-knit, distributed engineering team developing and delivering a comprehensive data operations management solution. The SRE plays a critical role across the entire SDLC, from coding and scaling to ensuring production stability, which includes responding to on-call incidents.

The data operations management solution consists of:

A web portal UI/UX that provides a single point of access to all data management and data reliability engineering

A suite of backend API services that serve the UI and integrate with the low-level Data Fabric and other third-party system APIs

A modern data lakehouse (data lake, data warehouse, batch and streaming ELT pipelines)

The data operations roadmap envisions a set of rich management capabilities including:

Serving a large community of geographically dispersed data operations stakeholders

Data quality and observability management to detect, alert on, and prevent data anomalies

Troubleshooting, triaging, and resolving data and data pipeline issues

OLAP, batch and streaming big data processing, and BI reporting

MLOps

Real-time dashboards, alerting and notifications, case management, user/group management, AuthZ, and many other foundational capabilities

Tech Stack

Frontend: Angular 17+, JavaScript, TypeScript, HTML, SCSS, Webpack Module Federation, Tailwind CSS, Angular Material, Angular Elements

Backend: Java (JDK 17+), Spring Framework 6.X.X, Spring Boot 3.X.X, NestJS 10.X.X, REST and GraphQL microservices, Node.js

Tools & Frameworks: Nx build management, Monorepo architecture, Jenkins CI/CD, Fortify, Sonar, GitHub

Cloud & Data: GCP (GKE, Composer + Airflow, Dataflow + Apache Beam, BigQuery, Bigtable, Firestore, GCS, Pub/Sub, Vertex AI), Terraform, Helm Charts, GitOps

Other Technologies: WebSockets, SSE, event-driven architecture


 

Environment

 Culture: Fast-paced, creative, results-oriented

Team Structure: Agile, working in 2-week sprints using Aha and Jira for project management

Expectations: Self-starters who can work independently with limited guidance, delivering solutions that end-users value and love

General Responsibilities

Contribute to Development Activities: The SRE is expected to participate in all SDLC activities (design, development, testing, deployment, and operations), covering both frontend and backend

Cross-Functional Work: Collaborate with global teams to integrate with existing internal systems and GCP cloud

Issue Resolution: Triage and resolve product or system issues, ensuring quality and performance

Documentation: Write technical documentation, support guides, and runbooks

Agile Practices: Participate in sprint planning, retrospectives, and other agile activities

Compliance: Ensure software meets secure development guidelines and engineering standards

SRE Accountability

General: Use coding, automation, and software engineering principles to deliver scalability, performance, and reliability efficiently and with minimal toil

IaC: Build infrastructure as code (IaC) patterns that meet security and engineering standards using one or more technologies (Terraform, scripting with the cloud CLI, or programming with the cloud SDK)

CI/CD: Build CI/CD pipelines for the build, test, and deployment of application and cloud architecture patterns, using platform (Jenkins) and cloud-native toolchains

