Position Details:
Data Engineer
Job Duties:
- Design, build, and maintain scalable data pipelines for ingesting, processing, and transforming large volumes of data from various sources.
- Implement ETL (Extract, Transform, Load) processes using Apache Spark, Apache Flink, or Apache Beam to ensure data quality and consistency.
- Utilize streaming frameworks, including Apache Kafka or Amazon Kinesis, for real-time data processing and analysis.
- Perform data analysis, profiling, cleansing, data quality checks, and validation.
- Prepare and build ingestion, STTM, data lineage, and data flow documents for the new target system, Azure Data Lake, using Azure Databricks and Azure Data Factory.
- Define data transformation and ETL processes to prepare data for analysis and reporting.
- Prepare data mapping, data flow, data lineage, and entity relationship diagrams from source to target through BI reporting.
- Capture and document data-related tasks within JIRA, Confluence, and Bitbucket, ensuring that all data engineering activities are effectively tracked, documented, and shared across the organization.
- Develop and optimize data models for storage and retrieval in data warehouses such as Amazon Redshift, Google BigQuery, or Snowflake.
- Implement data partitioning, indexing, and optimization techniques to improve query performance and reduce latency.
- Ensure data security and compliance by implementing access controls, encryption, and auditing mechanisms.
- Administer and maintain databases, including SQL and NoSQL databases.
- Perform database tuning, monitoring, and troubleshooting to optimize performance and reliability.
- Automate routine database tasks using scripting languages such as Python or Bash.
- Integrate data from internal and external sources through RESTful APIs, web scraping, or data connectors.
- Develop and maintain data ingestion processes using Apache NiFi, Talend, or Informatica.
- Implement data quality checks and validation rules to ensure accuracy, completeness, and consistency of data.
- Establish data governance policies, metadata management, and data lineage tracking to maintain data integrity and compliance.

Will work in Glastonbury, CT and/or various unanticipated client sites throughout the U.S. Must be willing to travel and/or relocate.