Big Data / Hadoop Developer
Roles and Responsibilities
- Design and develop code for a large-scale Hadoop migration project from IBM BigInsights to a Hortonworks cluster
- Design and develop applications using Spark Core and Spark SQL and their core abstraction-layer APIs, RDD and DataFrame (see the first sketch after this list)
- Develop code to analyze large volumes of streaming data using Kafka and Spark Streaming (see the streaming sketch below)
- Design source-to-target data mappings and develop the business rules associated with the ETL processes
- Develop database routines in Snowflake
- Develop complex HQL queries for data analytics over large datasets in Hive
- Develop Sqoop jobs to migrate large volumes of data from relational database systems to HDFS/Hive tables and vice versa (a Spark-based analogue is sketched below)
- Develop routines to integrate NoSQL databases with the Hadoop cluster to store and retrieve large volumes of data
- Develop complex SQL queries against relational databases such as Oracle and MySQL
- Develop applications using AWS services such as S3, EMR, EC2, Step Functions, and CloudWatch
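
A minimal Scala sketch of the Spark Core (RDD) and Spark SQL (DataFrame/HQL-style) work above, assuming a plain-text input; the path, view name, and column names are placeholders, not project details.

```scala
import org.apache.spark.sql.SparkSession

object BatchJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("batch-analytics")
      .getOrCreate()

    // RDD abstraction: word count over a text file (path is a placeholder)
    val counts = spark.sparkContext
      .textFile("hdfs:///data/events.txt")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1L))
      .reduceByKey(_ + _)

    // DataFrame abstraction: the same data through Spark SQL
    import spark.implicits._
    val df = counts.toDF("word", "count")
    df.createOrReplaceTempView("word_counts")

    // HQL-style analytics query executed by Spark SQL
    spark.sql(
      "SELECT word, count FROM word_counts ORDER BY count DESC LIMIT 10"
    ).show()

    spark.stop()
  }
}
```

On EMR the same job can read `s3://` paths in place of `hdfs:///` ones, which is typically how the S3/EMR work listed above connects to code like this.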
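A comparable sketch of the Kafka analytics, written here against Spark's Structured Streaming Kafka source rather than the older DStream API; the broker address, topic, and console sink are assumptions, and running it requires the spark-sql-kafka connector on the classpath.

```scala
import org.apache.spark.sql.SparkSession

object StreamJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-stream-analytics")
      .getOrCreate()

    // Read a stream from Kafka; broker address and topic are placeholders
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()
      .selectExpr("CAST(value AS STRING) AS value")

    // Simple running count per message value
    val counts = events.groupBy("value").count()

    // Emit updated counts to the console (sink choice is illustrative)
    val query = counts.writeStream
      .outputMode("update")
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```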
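Sqoop jobs are defined at the command line rather than in Scala, so as a same-language analogue of the relational-to-Hive data movement they perform, here is a hedged Spark JDBC sketch; the JDBC URL, table names, and credential handling are placeholders.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object JdbcTransfer {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdbms-to-hive")
      .enableHiveSupport()
      .getOrCreate()

    // Pull a table from a relational database over JDBC
    // (URL, table, and credentials are placeholders)
    val orders = spark.read
      .format("jdbc")
      .option("url", "jdbc:mysql://dbhost:3306/sales")
      .option("dbtable", "orders")
      .option("user", "etl_user")
      .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
      .load()

    // Land the data as a Hive table; the reverse direction reads the
    // Hive table and writes back out with DataFrameWriter.jdbc(...)
    orders.write
      .mode(SaveMode.Overwrite)
      .saveAsTable("staging.orders")

    spark.stop()
  }
}
```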