Why this role exists
Keep ClubSpot reliable at scale while we accelerate delivery. This is an operations-heavy SRE/DevOps role (not a “build tools only” position), preferably based in Budapest, Hungary, with 1 day/week in the office.
Responsibilities
- Own day-to-day infrastructure operations: observability, incident response, capacity, backups/restore, runbooks.
- Maintain and improve CI/CD reliability (fast pipelines, caching, rollback paths), partnering with senior leads who drive weekly releases.
- Harden production: auth/secrets, change control, infra-as-code, environment parity, zero-downtime deploys.
- Lead post-incident reviews and turn outcomes into automation (reduce toil).
- Own and drive platform hardening efforts.
What you’ve done
- 4+ years in DevOps/SRE/Production Engineering on AWS, including on-call ownership.
- Built observability stacks (metrics/logs/traces) and defined SLOs/SLIs; familiar with establishing incident drills.
- Used PostgreSQL (or similar) daily.
- Shipped via modern CI/CD (e.g., GitHub Actions/GitLab), with cache-savvy pipelines and one-click rollbacks.
- Hands-on experience building blue/green or canary deployments.
- Infra-as-code (Terraform or similar), networking basics, Linux, containers.
- Nice to have: security testing exposure/coordination, data retention/GDPR operationalization.
Ways of working
- Hybrid in Budapest, Hungary preferred; in-office 1 day/week.
- Collaborate with product engineers and senior leads who own the weekly release rhythm.
Requirements: Terraform, Docker, AWS, Bash, JavaScript, PostgreSQL
Tools: GitHub, Confluence, Git, Agile, Kanban
Additionally: free coffee, shower, bike parking