Azure Data Engineer (Azure Databricks & PySpark)
MediaRadar
Job Details
Contract
Full Job Description
12-month contract: Azure Data Engineer (Azure Databricks & PySpark)
Job Description
We’re hiring a talented Azure Data Engineer to work on our platform and help
ensure that our data quality is flawless. As a company, we take in millions of new data
points every day. You will work with a passionate team of engineers to solve
challenging problems and ensure that we deliver the best data to our customers,
on time. You will use the latest cloud data lake technology to build robust and
reliable data pipelines.
Job Responsibilities
● Develop expertise in the different upstream data stores and systems across the
company.
● Design, develop, and maintain data integration pipelines for the organization’s growing
data sets and product offerings.
● Build unit testing and QA plans for data processes.
● Build data validation testing frameworks to ensure high data quality and integrity.
● Write and maintain documentation on data processes.
● Develop and maintain data models and schemas.
● Apply strong analytical database skills, including writing complex queries, query
optimization, debugging, user-defined functions, views, and indexes.
● Write code that adheres to coding standards, procedures, and techniques. Maintain
the integrity of existing program logic according to specifications.
● Actively participate in the code review process to ensure development work adheres
to standards and specifications (including peer review and code review external to
team).
● Respond to all inquiries and issues in a timely manner as developed code
moves through the testing process.
● Participate in scrum, sprints, and backlog grooming meetings.
● Evaluate interrelationships between applications to determine whether a change in
one part of a project would impact or cause undesirable results in related applications
and design for effective interfaces between interrelated applications.
● Improve the health of system assets by identifying enhancements to performance,
reliability, and resource consumption through tuning and monitoring.
● Perform root-cause analysis for production issues and system failures;
determine corrective action(s) and propose improvements to prevent their
recurrence.
● Maintain up-to-date business domain knowledge and technical skills in software
development technologies and methodologies.
● Provide input on the selection, implementation, and use of development tools and best
practices.
Requirements
Technical:
● BS or MS in Computer Science or equivalent experience.
● 4+ years of experience with Databricks/Apache Spark and Azure data storage
solutions, handling large datasets.
● Expert in SQL and Spark SQL, including advanced analytical queries.
● Proficiency in Python (data structures, algorithms, object-oriented programming,
using APIs) and familiarity with PySpark.
● Experience working with Databricks Delta tables, Unity Catalog, and the DataFrame
API, including reading from and writing to various data sources and data formats.
● Experience with both batch and streaming data pipelines.
● Knowledge of Azure Data Factory, Azure Data Lake, Azure SQL DW, Azure SQL is a
plus.
Nice to have:
● Understanding of PostgreSQL and MS SQL.
● Experience working in a fast-paced environment.
● Experience in an Agile software development environment.
● Ability to work with large datasets and perform data analysis.
● Experience with migration projects to build a unified data platform.
● Experience working with JIRA and Azure DevOps CI/CD pipelines.
Benefits
This is a 12-month contract position at 40 working hours per week and is not benefits-eligible. The base salary range for this position is $100,000-$120,000. A final compensation offer will ultimately be based on the candidate's location, skill level, and experience.
If you need assistance or an accommodation, you may contact us at [email protected]