Posted Jun 8, 2026

Data Engineering Intern (Spring/Summer 2026)

Description:
• Support the development and maintenance of data pipelines using Databricks, Spark, and similar technologies.
• Write and optimize SQL and Python scripts for data transformation, integration, and automation tasks.
• Develop automation scripts that populate metadata and comments across Databricks tables using structured definitions such as CSV files.
• Assist in building a proof-of-concept for an automated data dictionary maintained with existing Databricks metadata.
• Contribute to prototyping an AI-powered knowledge agent that uses internal data and documentation to answer common questions.
• Collaborate with team members to improve data quality, cataloging, and metadata management across the ecosystem.
• Participate in code reviews, design discussions, and sprint ceremonies to learn engineering best practices.
• Document findings, workflows, and automation processes for future reuse.
• Perform other duties as assigned.

Requirements:
• Actively pursuing a Bachelor’s or Master’s degree in Computer Science, Software Engineering, Information Systems, or a related technical field.
• Foundational knowledge of Python and SQL for data manipulation and analysis.
• Familiarity with ETL concepts and structured data formats such as CSV, JSON, and Parquet.
• Interest in cloud-based data platforms, with Azure preferred.
• Strong analytical and problem-solving skills with an eagerness to learn.
• Effective communication and teamwork skills.
• Exposure to Databricks, Apache Spark, or other distributed data frameworks is preferred.
• Familiarity with Git or version control practices is preferred.
• Interest in AI/LLM-based automation, data documentation, or metadata management is preferred.
• Prior project or internship experience in data engineering or cloud technologies is preferred.