Site Reliability Engineer
NEAR is a sharded, developer-friendly, proof-of-stake public blockchain, created by a world-class team that has built some of the world's only sharded databases at scale.
We are looking for a person who loves automating manual work and solving complicated problems, says no to downtime, enjoys working in an energetic and free-thinking environment, feels comfortable challenging opinions, and, most importantly, shares our desire to build the distributed Web.
The NEAR Protocol Engineering team is looking for a Site Reliability Engineer to work alongside all core engineering teams and help them cope with the operational load of a fast-growing organization. At NEAR Protocol, we must deliver availability, performance, efficiency, monitoring, and emergency response, all while enabling decentralization of the NEAR Protocol's Open Web infrastructure. We are looking for a person to join our distributed on-call rotation and help us create a self-sustaining blockchain infrastructure.
This is a high-productivity and highly dynamic startup environment, so you will need to be comfortable operating quickly but precisely amidst changing needs. There is an opportunity to apply your creativity to almost any aspect of blockchain development.
Requirements:
- Excellent written and verbal communication skills in English
- Proven ability to be effective on a distributed team
- Passion for open source
- Advanced Python coding skills
- Solid understanding of UNIX internals
- Sharp troubleshooting skills and the conviction that no problem is impossible to solve
- Experience with cloud provisioning tooling such as Terraform, Packer, Ansible, and Docker
- Experience with monitoring infrastructure such as Grafana, Prometheus, and Datadog
- Experience with CI infrastructure such as Travis, CircleCI, or Jenkins
- Experience in keeping services up 24/7
- Expertise in large-scale distributed systems
Nice to Have:
- Experience with the Rust programming language
- Experience with multiple cloud providers (AWS, Azure, and Google Cloud Platform)
- Knowledge of blockchain technologies
Responsibilities:
- Share the 24/7 on-call rotation with the engineering team
- Help build self-driving services that run and repair themselves
- Help define SLOs and mission-critical metrics
- Drive our incident management and response processes
- Build an emergency response playbook with monitoring and alerting
- Work with our core blockchain, middleware, and apps teams to deliver secure, highly available services
- Collaborate with a geographically distributed team, work in the open as part of the NEAR Protocol open source project, and engage with NEAR Protocol's global community