Codoxo is seeking a Data Engineer to design, build, and maintain scalable data pipelines that power analytics, reporting, and machine learning initiatives. This onsite role in Duluth, GA offers an opportunity to work under the guidance of senior engineers and help shape the data infrastructure supporting the company’s analytics needs.

Responsibilities

Contribute to the design, construction, and upkeep of scalable ETL and ELT pipelines.
Build and optimize batch and streaming workflows using AWS Glue, Spark, and Airflow.
Facilitate data integration across multiple structured and unstructured sources.
Deliver clean, efficient code in Python, PySpark, and SQL.
Monitor, troubleshoot, and enhance pipeline reliability and performance.
Improve database performance with an emphasis on PostgreSQL and cloud environments.
Maintain and support AWS-based infrastructure (EC2, S3, Glue, and related services).
Implement data validation, quality checks, and monitoring processes.
Ensure adherence to data governance, security, and regulatory standards.
Collaborate with data scientists and analysts to translate data requirements into scalable solutions.
Document data flows, architecture decisions, and technical processes.
Use AI-assisted development tools to accelerate work, improve test coverage, and enhance code quality.

Requirements

Bachelor’s degree in Computer Science, Data Engineering, Information Systems, or a related technical field, or equivalent practical experience.
2+ years of experience in data engineering, software engineering, or related roles (internships included).
Proficiency in Python, PySpark, and SQL.
Familiarity with ETL/ELT concepts and data pipeline architecture.
Experience with relational databases, particularly PostgreSQL.
Basic understanding of cloud computing concepts, preferably AWS.
Exposure to distributed data processing frameworks such as Spark.
Experience working in Linux environments and basic shell scripting.
Strong analytical and problem-solving abilities.
Ability to collaborate effectively in a team under mentorship.
Strong written and verbal communication skills.

Technologies

Python
PySpark
SQL
AWS
AWS Glue
Spark
Airflow
PostgreSQL
Linux
Shell scripting
Git

Benefits

Health, dental, and vision insurance with 100% employee premium coverage starting day one
Unlimited PTO
Annual professional development stipend
Annual home office stipend
401(k) match after 90 days

Preferred Qualifications

Experience working with medical claims data is strongly preferred.
Hands-on experience with AWS services such as EC2, S3, Glue, and IAM.
Experience with workflow orchestration tools like Apache Airflow.
Exposure to data warehousing concepts and dimensional modeling.
Familiarity with CI/CD pipelines and version control (e.g., Git).
Understanding of data security, governance, and compliance best practices.
Experience supporting machine learning pipelines or analytics platforms.
Demonstrated use of AI tools to improve development efficiency.

Physical Requirements

Work is performed in an office environment (office or remote) and requires the ability to work at a desk, use a computer, and operate standard office equipment.
