This hybrid Data Solutions Engineer role at Citi offers the chance to shape next-generation data platforms while working on cloud migration initiatives with cross-functional teams. The role is based in Irving, Texas, with the position also listed in Jacksonville, Florida, and offers a competitive salary range of USD 107,120 to 160,680 per year. Citi provides a comprehensive benefits package that supports health, retirement, wellness, and work-life balance. It includes discretionary and formulaic incentive programs, medical, dental, and vision coverage, a 401(k), life and disability insurance, wellness programs, and paid time off.

Responsibilities

As a core member of the Data Engineering team, design and build scalable Big Data solutions.
Collaborate with domain experts, product managers, analysts, and data scientists to create robust pipelines in Hadoop or Snowflake environments.
Deliver a data-as-a-service framework to enable data accessibility and governance across the organization.
Lead and execute the migration of all legacy workloads to cloud platforms, coordinating across stakeholders.
Engage with stakeholders to elicit and document requirements, including detailed data flow specifications.
Assess solution options and work with cross-functional teams to drive optimal implementations.
Partner with data scientists to build pipelines from heterogeneous data sources and provide engineering services for data science applications.
Research and evaluate open-source technologies, recommending and integrating suitable components into designs.
Serve as a technical expert, mentoring teammates on Big Data and Cloud technology stacks.
Define requirements for maintainability, testability, performance, security, quality, and usability across the data platform.
Drive the implementation of consistent patterns, reusable components, and coding standards across data engineering processes.
Convert SAS-based pipelines into modern languages such as PySpark and Scala for Hadoop and non-Hadoop ecosystems.
Optimize Big Data applications on Hadoop and non-Hadoop platforms for peak performance.
Evaluate new IT developments and evolving business needs, recommending system enhancements aligned with industry standards.
Assess risk and ensure compliance in decision-making, safeguarding Citi, its clients, and its assets while escalating and addressing control issues transparently.

Requirements

5+ years of experience with Hadoop and Big Data technologies.
Proficiency in Python, PySpark, and Scala, including hands-on experience with fundamental machine learning libraries.
Experience building robust data solutions on Google Cloud or AWS; relevant certifications preferred.
Experience with SAS.
Experience with containerization and related technologies such as Docker and Kubernetes.
Comprehensive understanding of software engineering and data analytics.
Hands-on knowledge of the Hadoop ecosystem and Big Data technologies (HDFS, MapReduce, Hive, Pig, Impala, Kafka, Kudu, Solr).
Knowledge of Agile (Scrum) development methodologies.
Strong development and automation skills.
System-level understanding of data structures, algorithms, distributed storage, and compute.
A proactive approach to solving complex business problems, with strong interpersonal and teamwork skills.
Bachelor’s degree in a related field.

Technologies

Hadoop
Snowflake
Python
PySpark
Scala
Google Cloud
AWS
SAS
Docker
Kubernetes
HDFS
MapReduce
Hive
Pig
Impala
Kafka
Kudu
Solr
Java
Apache Beam

