Data Engineering/Data Scientist Summer Intern
About our group:
Our Longmont facility serves as the base of operations for storage product development in hard disk drive and datacenter solutions markets. The Field Quality Analytics team is focused on delivering data-based solutions to improve decision making for product quality and manufacturing issues. The team protects Seagate’s customers from quality issues by applying cutting-edge machine learning (ML) tools and analytics to solve complex Big Data problems, detect and prevent quality issues, and enable data-driven decisions. We are an energetic, fast-paced team of computer scientists, mathematicians, and engineers working in a creative environment.
About the role - you will:
- Work closely with our team of Data Scientists and Engineers to help create, improve, deploy, and maintain the workflows that generate data for our machine learning processes
- Design and build processes for deploying and monitoring Spark ML jobs
- Design queries and workflows for assembling large data sets in Hive and Presto
- Help build scripts for automating general data engineering workflows
- Participate in code reviews and analysis
- Help design and develop general-purpose Python packages
- Translate business needs into software requirements and execute on them
- Gain real-world experience in software engineering, machine learning engineering, cloud computing, data engineering, and big data
Your experience includes:
- Currently pursuing a Bachelor’s or Master’s degree in Computer Science, Software Engineering, Computer Engineering, Math, Physics, or a related field, and enrolled in Fall 2021 courses
- 3 years of programming or database experience or coursework
- Interest in, and some experience with, statistics or applied mathematics, including machine learning concepts
- Understanding of basic software engineering concepts such as algorithmic runtime analysis, functional programming, concurrent programming, database design, scripting, and data structures
- Demonstrated skill in at least one of the following programming languages: Python, SQL, Scala, Bash, or another scripting language
- Demonstrated skill in at least one of the following software frameworks and tools: Spark, Spark SQL, Presto, Hive, Spark MLlib, H2O, or Apache Airflow
- Experience with ETL, EL, and ELT pipelines
- Strong interpersonal and communication skills in order to contribute effectively to technical teams and present to a variety of technical and business personnel
Our Longmont product-design campus is nestled against the foothills with exceptional views of the Rocky Mountains. Here at work, you can grab breakfast and lunch in the on-site cafeteria or get an afternoon espresso prepared by a professional barista. Our 1,500+ employees enjoy an active on-site experience, from sporting activities (get in a few laps at lunch on our 1-mile walking path around campus, play ping-pong or volleyball, or stop in our 24-hour fitness center for a group or individual workout) to community service and many employee resource groups, including Pride!, Women’s Leadership Network, and a Young Professionals Network.
Location: Remote US Colorado