Big Data Engineer
Raleigh, NC Posted: 9/1/2022
Location: Raleigh, NC
Type: Direct Hire, in the office at least 2 days a week (Hybrid)
Responsibilities of the Big Data Engineer
- Build distributed, high-volume data pipelines to power new solutions based on analytics and machine learning
- Develop models and perform ad-hoc data analysis to identify patterns and insights
- Optimize performance of the company's big data ecosystem
- Write complex queries to ensure data is easy to access and manage
- Work all over the stack, moving fluidly between programming languages
- Collaborate with new, cross-functional teams on accelerated projects to scale data architecture, build digital products, and execute data science solutions
- Design and develop new Big Data applications that evaluate data for actionable insights
- Mentor other team members and participate in cross-training.
- Apply current project planning techniques to break complex projects into manageable tasks and scope, ensuring deliverables are achieved within deadlines.
- Stay current on technology trends in Big Data and analytics
- Ensure conceptual and architectural platform integrity
- Take on large-scale, multi-tier big data projects
- Design and build improved architectural models for scalable data processing along with scalable data storage
- Contribute to collective software and system design and advancement of the new platform.
- Perform all other duties and special projects as assigned.
Required for the Big Data Engineer
- Bachelor's degree or MS/PhD in computer science, software engineering, applied mathematics, statistics, or a related field.
- Experience structuring databases and working with Big Data technologies and tools, along with the ability to communicate ideas within a team.
- Hands-on experience with big data frameworks and tools (e.g., Hadoop, Spark, MapReduce, Hive, Pig, Luigi/Airflow, Kafka, data streaming, NoSQL, SQL)
- US citizen or green card holder (due to compliance requirements)
Preferred for the Big Data Engineer
- Ability to build and operate data pipelines and reusable data ingestion code packages
- Experience building applications on cloud computing clusters (e.g., Azure, AWS, GCP)
- Strong knowledge of commercial/manufacturing IT infrastructures, including networking, storage, security, and systems management, with awareness of industry standards and trends
- Expertise in massively parallel processing (MPP) databases, along with strong conceptual and analytical thinking
- Competence in multiple scripting and programming languages (Python, Java, shell, Perl, Scala, Go, C++, etc.)
- Hands-on experience deploying and running NoSQL solutions at scale, such as MongoDB, CouchDB, HBase, Hive, and Cassandra
- Expertise in open-source big data technologies in the Hadoop ecosystem (Spark with Scala, Impala, HBase, MapReduce, streaming, Pig)
- Experience with commercial Big Data platforms from providers such as IBM (Netezza), Cloudera, or Oracle (Exadata)
- Proficiency with relational SQL databases (e.g., MySQL, Oracle, SQL Server), analytics platforms (e.g., DataStage or similar), and OLAP technologies