We are seeking a highly motivated IT Operations Center (ITOC) Engineer II to join an exciting team located in Seattle, WA. The ITOC Engineer II plays a vital role in ensuring a healthy technology environment for our customers. The Chewy ITOC Engineer will be the level 2 escalation point for the ITOC analysts, performing advanced troubleshooting and operational tasks to support the Chewy IT environment. This role will be heavily concentrated on our server environment to support the infrastructure engineers and architects. The ITOC Engineer II will have access to perform advanced daily tasks such as disk cleanups, extends, migrations, and resource allocations. Daily operations and troubleshooting of networking-related events are also expected of this role. The ITOC team constantly engages with the core IT teams to find ways to proactively monitor the environment and develop skills to handle vital operations. The shift is Monday-Friday, 8:00am-5:00pm PST.
What You'll Do:
- Act as an escalation point for ITOC analysts as they identify issues regarding our technology environment.
- Respond promptly to alerts by performing corrective tasks to remediate issues and prevent larger issues from occurring.
- Provide daily support for the infrastructure team by performing operations that optimize the systems environment. This includes, but is not limited to, resource allocation, VM migrations, storage analysis, and system upgrades.
- Provide daily support for the infrastructure team by performing operations that optimize the networking and security environment. This includes, but is not limited to, adding networking equipment to the environment, fail-over and maintenance testing, and OS upgrades.
- Support various teams to ensure stable operations, including, but not limited to, the DevOps, Database, and Application teams.
- Effectively escalate service-impacting issues to the correct teams and collaborate to ensure each issue is resolved as soon as possible. Preliminary troubleshooting before escalating is required.
- Ensure monitoring tools are tuned and configured properly to send proactive alerts effectively. This includes tool integrations, engineering monitoring capabilities, Splunk knowledge, etc.
What You'll Need:
- At least 5 years of experience in an IT Operations Center or similar environment.
- Datadog, SolarWinds, or similar system administration experience.
- AWS Cloud infrastructure experience.
- Advanced knowledge of network technologies, connectivity, protocols, and security.
- Extensive experience with Linux CLI.
- Splunk search query experience.
- Excellent organizational and troubleshooting skills.
- Ability to handle multiple tasks in a fast-paced environment.
- Effective communicator and collaborative worker at all levels of the organization.
- On-call availability outside of business hours to resolve urgent system issues.
- Ability to manage multiple projects with competing priorities.
- Application development knowledge and experience with associated tools such as Ansible, Terraform, and Jenkins.
- Position may require travel.
- Certified Cisco, CompTIA, or Microsoft network professional preferred.
- AWS or similar cloud services certification.
- Linux certification.
- ITIL v4 certified.
- Splunk Core certified.
- Scripting in Python, Bash, PowerShell, or similar.