Who we are

Balena is a highly distributed company that has embraced a remote-first approach since 2013. We are a group of individuals from across the globe working together to achieve our mission: “reduce friction for fleet owners and unlock the power of physical computing”. For us, this means removing the barriers to entry for developing IoT products, whether that’s easing software deployments with balenaCloud, simplifying image flashing with balenaEtcher, or offering our own hardware based on our experience seeing thousands of devices running in production. We are developing an end-to-end solution that makes it easy for developers to build applications at the Edge.

Our culture

We place trust and autonomy in our team to own the outcome of their work. We practice radical candor and transparency with open, honest, and clear communications. We embrace first-principles thinking and constantly challenge our assumptions. We organize ourselves based on the best use of our collective abilities to solve our highest priority problems at any given time, rather than by a strict hierarchy. We’re not afraid to fail as long as we learn from our mistakes. We’re always looking for common patterns that allow us to reduce complexity. We embrace short term pain for long term gain, building products that will stand the test of time.

Balena

Fleet Reliability Engineer (Remote)

What you will do

The ‘balena fleet’ is ever-growing and heterogeneous, with hundreds of thousands of devices of different types and architectures distributed across the globe. The mission of our Fleet Reliability Engineers is to enable our users to safely deploy, monitor, and manage the health of their IoT devices and ensure they continue to scale their own fleets and succeed with balena.

As a member of the team, you will be at the cutting edge of support-driven development. You will be part operator and part product engineer. You will investigate issues, help users solve immediate problems, and work at all levels of the stack to help us build compatibility between previous and new versions of our components and to sustainably scale both the devices connected to our backend and the backend itself.

You will also develop solutions to high-impact, high-complexity challenges affecting the entire meta-fleet and contribute to the platform roadmap with data from the field. On-device metrics, monitoring, data visualization, and debugging are all common territory. Examples of past projects include balenahup, our solution for managing host OS updates, and configizer, a solution for safely adjusting on-device configuration remotely.

Responsibilities

  • Identify user needs and patterns in feedback and understand the root causes of friction while keeping a global view of all of our customers' fleets
  • Lead the shift away from reactive support to preventative maintenance by making existing tools more robust and scalable and building new ones
  • Help brainstorm long-term solutions and own the implementation of new features and products for balena fleet owners including development, testing, deployment, and maintenance
  • Contribute to documentation and user-facing guides for your implementations
  • Be a source of advice for peers, learning and teaching how to best help users and customers monitor and debug their fleets of devices
  • Participate in customer support – educate balena users on best practices for going to production and scaling and managing their fleets

Requirements

  • Background in software development, infrastructure, and/or system operations
  • Experience writing high-quality, production-ready code and debugging complex issues
  • Working knowledge of Linux operating system internals and scripting
  • Ability to manage ambiguity, make critical trade-off decisions, and push projects to completion
  • Continuous improvement mindset, and desire to make self and others more effective
  • Willingness to constantly build on your product knowledge (through projects, tutorials, support shifts, etc.)
  • Excellent verbal and written communication skills, and fluency in English

Bonus points

  • Firm grasp of technologies like TypeScript, Node.js, Bash, Go, and Docker
  • Strong understanding of networking concepts (load balancers, routers, etc.)
  • Experience developing internal tooling and automation
  • Familiarity with IoT, embedded computing, or the balena platform as a user/contributor
  • Contributions to OSS projects and community involvement
  • Background in leading projects and working across functions to build reliable products

Make sure to let us know if any of these items apply to you!