Position Details:

Data Engineer
Job Duties:

Design, build, and maintain scalable data pipelines for ingesting, processing, and transforming large volumes of data from various sources. Implement ETL (Extract, Transform, Load) processes using Apache Spark, Apache Flink, or Apache Beam to ensure data quality and consistency. Use streaming frameworks such as Apache Kafka or Amazon Kinesis for real-time data processing and analysis.

Perform data analysis, profiling, cleansing, data quality checks, and validation. Prepare ingestion, source-to-target mapping (STTM), data lineage, and data flow documents for the new target system, Azure Data Lake, using Azure Databricks and Azure Data Factory. Define data transformation and ETL processes to prepare data for analysis and reporting. Prepare data mapping, data flow, data lineage, and entity relationship diagrams from source to target through BI reporting. Capture and document data-related tasks in JIRA, Confluence, and Bitbucket, ensuring that all data engineering activities are effectively tracked, documented, and shared across the organization.

Develop and optimize data models for storage and retrieval in data warehouses such as Amazon Redshift, Google BigQuery, or Snowflake. Implement data partitioning, indexing, and optimization techniques to improve query performance and reduce latency. Ensure data security and compliance by implementing access controls, encryption, and auditing mechanisms.

Administer and maintain SQL and NoSQL databases. Perform database tuning, monitoring, and troubleshooting to optimize performance and reliability. Automate routine database tasks using scripting languages such as Python or Bash.

Integrate data from internal and external sources through RESTful APIs, web scraping, or data connectors. Develop and maintain data ingestion processes using Apache NiFi, Talend, or Informatica. Implement data quality checks and validation rules to ensure accuracy, completeness, and consistency of data. Establish data governance policies, metadata management, and data lineage tracking to maintain data integrity and compliance.

Will work in Glastonbury, CT and/or at various unanticipated client sites throughout the U.S. Must be willing to travel and/or relocate.
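To illustrate the data quality checks and validation rules mentioned in the duties above, the following is a minimal Python sketch (Python being one of the scripting languages named in this posting). The rule names, fields, and dataset shape are hypothetical examples, not part of any specific system described here.

```python
# Minimal sketch of row-level data quality checks: completeness,
# accuracy (range), and consistency rules over a list-of-dicts dataset.
# Field names ("id", "amount", "start_date", "end_date") are hypothetical.

def run_quality_checks(rows):
    """Return a list of (row_index, rule, message) violations."""
    violations = []
    for i, row in enumerate(rows):
        # Completeness: required fields must be present and non-empty.
        for field in ("id", "amount"):
            if row.get(field) in (None, ""):
                violations.append((i, "completeness", f"missing {field}"))
        # Accuracy: amount must be a non-negative number.
        amount = row.get("amount")
        if isinstance(amount, (int, float)) and amount < 0:
            violations.append((i, "accuracy", "negative amount"))
        # Consistency: end_date must not precede start_date
        # (ISO-8601 strings compare correctly lexicographically).
        start, end = row.get("start_date"), row.get("end_date")
        if start and end and end < start:
            violations.append((i, "consistency", "end_date before start_date"))
    return violations
```

In practice, checks like these would typically run inside a pipeline step (e.g., a Spark job or an Azure Data Factory activity) and feed a quarantine table or alerting mechanism rather than returning a plain list.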