Foundational Data Reliability Engineer
Requirements
• Hands-on engineer with strong proficiency in Python and SQL.
• Experienced in profiling and optimizing ETL systems and large-scale data pipelines.
• Strong experience with IT infrastructure, including hands-on work in a datacenter or on a cloud platform (AWS, Azure, etc.).
• Comfortable working on distributed systems and big data projects (SMB clusters, Palantir Foundry, Databricks, etc. are a plus).
• Experience with monitoring and observability tools (Datadog or similar).
• Strong problem-solver with the ability to dive deep into complex technical issues.
• Experience in financial services or other data-intensive domains is preferred.
• Degree in Computer Science, Engineering, or equivalent practical experience.

Only selected candidates will be contacted for interviews. We appreciate your understanding. Thank you for considering a career with us.

Rimes is committed to promoting the values of diversity and inclusion throughout the business. Whether it’s through recruitment, retention, career progression or training and development, we are committed to improving opportunities for people regardless of their background or circumstances.

Visit our Careers page to see our complete listings.
Responsibilities
• Own and maintain the core data infrastructure supporting global data delivery.
• Maintain and optimize the health of servers supporting data workloads, including monitoring file systems, permissions, logs, and system I/O.
• Profile, debug, and optimize large-scale ETL and data processing pipelines for performance and reliability.
• Write and maintain Python and SQL code to support data workflows, modules, and archiving processes.
• Use Datadog and other observability tools to proactively monitor, detect, and resolve system bottlenecks.
• Collaborate with Data Developers, SRE, and Infrastructure teams to ensure system scalability and disaster recovery readiness.
• Contribute to projects involving big data platforms, SMB clusters, and Palantir Foundry.
• Continuously improve automation, processes, and system resilience.
• Troubleshoot performance issues across compute, storage, and network layers, escalating as needed and driving permanent fixes.
• Support the development, maintenance, and reliability of ETL/ELT data pipelines in collaboration with Data Engineering teams.