A data engineer at Cookpad is responsible for improving the value of employees' work by providing accurate, timely data and the ability to process it.
A data engineer's job starts with planning (identifying issues and working out possible solutions), which may require discussion with other teams or detailed analysis of current systems. In this planning phase, a data engineer should be self-driven, a good communicator, and a strong problem solver.
Design, development, and operation are carried out by the team or an individual, depending on the scope of the system. This phase requires the ability to design scalable and reliable systems that process massive data, to implement them in the relevant programming languages, and to deploy and operate them on cloud platforms such as AWS.
People of any gender, background, and nationality are welcome and will not be discriminated against. We welcome engineers who can build better data platforms for Cookpad with us.
- Plan, build, operate, and optimise the company-wide data platform used by multiple teams.
- Provide a holistic view of users of our service by ensuring consistent data collection across multiple touch points.
- Provide one-stop service for all data by collecting and integrating data from multiple sources.
- Help employees locate and understand data of interest by preparing metadata and documentation.
- Provide data, and the ability to process it, to people or applications at the right time, by supporting both low-latency streaming processing and high-volume batch processing.
- Help employees to find and share insights by providing tools for analytics (Business Intelligence tools).
- Keep data quality consistent through data monitoring, to ensure correct analysis results.
- Ensure that personal and sensitive information is processed in accordance with the GDPR and other applicable regulations.
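To make the data-monitoring responsibility above concrete, here is a minimal illustrative sketch of the kind of event-quality check such a platform might run before loading events into the warehouse. The field names (`user_id`, `event`, `ts`) are hypothetical examples, not Cookpad's actual schema.

```python
# Illustrative sketch only: quarantining malformed events instead of
# letting them silently corrupt downstream analysis.
# Field names below are hypothetical, not an actual Cookpad schema.

REQUIRED_FIELDS = {"user_id", "event", "ts"}

def validate_event(event: dict) -> list:
    """Return a list of quality problems found in a raw event (empty if clean)."""
    problems = []
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        problems.append("missing fields: %s" % sorted(missing))
    if "ts" in event and not isinstance(event["ts"], (int, float)):
        problems.append("ts is not a numeric timestamp")
    return problems

def partition_events(events):
    """Split a batch into clean events and (event, problems) rejects."""
    clean, rejects = [], []
    for e in events:
        issues = validate_event(e)
        if issues:
            rejects.append((e, issues))
        else:
            clean.append(e)
    return clean, rejects
```

Rejected rows would typically be written to a quarantine location and surfaced through monitoring, so data quality issues are caught at ingestion rather than during analysis.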
- Extensive experience developing and operating data pipelines that are highly efficient, scalable, and reliable.
- Extensive experience operating and using an MPP database or distributed data processing platform, such as Redshift, BigQuery, or Spark. We are using Redshift.
- Experience operating a company-wide data platform used by multiple teams.
- Experience with event streaming data platforms like Apache Kafka.
- Experience with deploying and operating a system on AWS or other cloud platforms.
- Experience with Git and code review on GitHub.
- Experience with basic Linux commands and MySQL/PostgreSQL administration.
- Experience with web application development using frameworks such as Ruby on Rails.
- Experience with Infrastructure as Code. We are using Terraform and Docker on AWS ECS.
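As one hedged example of the pipeline reliability mentioned above: a common pattern for batch loads is to write into a staging table and atomically swap it in, so a failed run never leaves partial data behind. The sketch below uses `sqlite3` purely as a stand-in for a warehouse such as Redshift; the table and column names are hypothetical.

```python
import sqlite3

# Illustrative sketch only: the staging-table swap pattern for reliable,
# rerunnable batch loads. sqlite3 stands in for a real warehouse;
# table/column names are hypothetical.

def reload_table(conn: sqlite3.Connection, rows):
    """Replace the contents of `events` atomically from `rows`."""
    conn.execute("CREATE TABLE IF NOT EXISTS events (user_id INTEGER, event TEXT)")
    conn.execute("DROP TABLE IF EXISTS events_staging")
    conn.execute("CREATE TABLE events_staging (user_id INTEGER, event TEXT)")
    conn.executemany("INSERT INTO events_staging VALUES (?, ?)", rows)
    # Swap inside one transaction: readers see either the old data or the
    # new data, never a half-loaded table, and reruns are idempotent.
    with conn:
        conn.execute("DROP TABLE events")
        conn.execute("ALTER TABLE events_staging RENAME TO events")
```

The design choice here is idempotency: because each run rebuilds the staging table from scratch and commits the swap in a single transaction, a crashed or repeated run cannot duplicate or truncate the live table.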
- A self-driven attitude toward identifying issues in the current data platform.
- Problem-solving skills backed by logical reasoning and task prioritisation.
- Good communication skills for working closely with other teams.