Desired Industry: Information Technology
SpiderID: 84890
Desired Job Location: Huber Heights, Ohio
Date Posted: 2/20/2025
Type of Position: Contractor
Availability Date: 02/24/2025
Desired Wage:
U.S. Work Authorization: Yes
Job Level: Experienced with over 2 years experience
Willing to Travel: Yes, More Than 75%
Highest Degree Attained: Masters
Willing to Relocate: Yes
Objective: Highly skilled and results-driven Data Engineer with over 8 years of experience designing, building, and optimizing scalable data pipelines, big data architectures, and cloud-based solutions. Proficient in Python, Spark, Scala, and SQL, with expertise in tools such as Kubernetes, Airflow, Databricks, and Snowflake. Adept at managing large-scale data processing systems and leveraging AWS, Azure, and Google Cloud platforms for efficient data storage, transformation, and analysis. Demonstrated ability to implement advanced machine learning models, perform feature engineering, and create insightful visualizations using Tableau and Matplotlib. Strong problem-solving skills and a proven track record of improving data infrastructure, ensuring data quality, and delivering actionable insights that drive business decisions.
- Experience validating data from source to target (ETL process), checking completeness and accuracy with SQL-based testing techniques.
- Experienced with big data technologies such as Hive, Sqoop, HDFS, and Spark, with good exposure to cloud services including AWS S3, EC2, and Redshift.
- Strong programming experience in Scala, Python, and SQL.
- Experience developing data pipelines that use Kafka to land data into HDFS (see the sketch after this list).
- Good knowledge of loading structured data from Oracle and MySQL databases into HDFS using Sqoop.
- Experience working with NoSQL databases such as HBase, Cassandra, and MongoDB.
- Experience developing custom UDFs in Python to extend Hive and Pig Latin functionality.
- Extensive Hadoop development experience across the big data ecosystem, including Spark Streaming, HDFS, MapReduce, Hive, HBase, Storm, Kafka, Flume, Sqoop, ZooKeeper, and Oozie.
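A minimal sketch of the kind of Kafka-to-HDFS pipeline referenced above, written with PySpark Structured Streaming; the broker address, topic name, and HDFS paths are illustrative placeholders, not details taken from any engagement listed here.

# Minimal sketch: stream events from a Kafka topic into HDFS as Parquet.
# Assumes Spark is packaged with the spark-sql-kafka connector; broker,
# topic, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

# Read the raw Kafka stream; key/value arrive as binary columns.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .option("startingOffsets", "latest")
    .load()
    .select(col("value").cast("string").alias("payload"), col("timestamp"))
)

# Continuously append micro-batches to HDFS; the checkpoint directory
# lets the job recover its Kafka offsets after a restart.
query = (
    events.writeStream.format("parquet")
    .option("path", "hdfs:///data/raw/events")
    .option("checkpointLocation", "hdfs:///checkpoints/events")
    .start()
)
query.awaitTermination()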
Experience:
CLIENT 1: STANLEY BLACK AND DECKER, MISSION, TEXAS (AUG 2023 - PRESENT)
ROLE: DATA ENGINEER
Responsibilities:
- Supported the eBusiness team by enabling easy access to data through web scraping and data mining, and helped design a content-based recommendation system to predict product recommendations.
- Collaborated with external partners to collect product data using Python.
- Worked on big data architecture projects for over two years, focusing on the design and implementation of scalable, efficient data processing pipelines.
CLIENT 2: DEUTSCHE BANK, CARY, NORTH CAROLINA (JUN 2021 - AUG 2023)
ROLE: BIG DATA ENGINEER
Responsibilities:
- Implemented partitioning, dynamic partitions, and buckets in Hive (see the sketch after this section).
- Developed database management systems for easy access, storage, and retrieval of data.
- Performed database activities such as indexing, performance tuning, and backup and restore.
CLIENT 3: EARLY WARNING SERVICES, LLC, SCOTTSDALE, ARIZONA (FEB 2019 - MAY 2021)
ROLE: DATA ENGINEER
Responsibilities:
- Developed Talend big data jobs to load high volumes of data into an S3 data lake and then into Snowflake.
- Developed Snowpipes for continuous ingestion of data, triggered by event notifications from an AWS S3 bucket.
CLIENT 4: HUMANA, LOUISVILLE, KENTUCKY (AUG 2017 - FEB 2019)
ROLE: DATA ENGINEER
Responsibilities:
- Architected and deployed an enterprise-grade data lake, enabling advanced analytics, processing, storage, and reporting on high-velocity, large-scale data sets.
- Ensured the integrity and quality of reference data within source systems through cleaning and transformation, working closely with stakeholders and the solution architect.
CLIENT 5: AMGEN, NEW ALBANY, OHIO (JUN 2016 - AUG 2017)
ROLE: SOFTWARE DEVELOPER
Responsibilities:
- Developed SQL scripts for automation purposes.
- Performed data conversions and data loads from various databases and file structures.
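A minimal sketch of the Hive partitioning and bucketing pattern mentioned under the Deutsche Bank engagement, expressed through the PySpark DataFrameWriter rather than raw HiveQL; the database, table, and column names are hypothetical, not actual project objects.

# Minimal sketch: create a Hive table partitioned by order_date and bucketed
# by order_id. All database, table, and column names are illustrative.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("hive-partition-bucket")
    .enableHiveSupport()          # read/write tables through the Hive metastore
    .getOrCreate()
)

orders = spark.table("staging.orders")   # hypothetical source table

(
    orders.write.mode("overwrite")
    .partitionBy("order_date")            # one Hive partition per order_date value
    .bucketBy(16, "order_id")             # 16 buckets hashed on order_id
    .sortBy("order_id")                   # keep rows sorted within each bucket
    .format("parquet")
    .saveAsTable("analytics.orders_partitioned")
)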
Education: University of Central Missouri, USA, Master's in Computer Science.
Skills:
Cloud Management: AWS (EC2, EMR, S3, Redshift, Lambda, Snowball, Athena, Glue, DynamoDB, RDS, Aurora, IAM, Firehose), Azure (Databricks, Data Explorer, Data Lake Storage Gen2, Data Factory, Airflow, Blob Storage, File Storage, SQL DB, Synapse Analytics, App Service, Kubernetes Service), Google Cloud Platform (GCP)
Big Data Technologies: Hadoop Distributed File System (HDFS), MapReduce, Apache Spark, Spark Streaming, Kafka, Hive, Pig, Impala, HBase, Sqoop, Flume, Oozie, ZooKeeper, YARN, Snowflake, Cassandra
Databases: MySQL, PostgreSQL, SQL Server, Oracle, MongoDB, DynamoDB, Redis, Cassandra, Azure SQL DB, Azure Synapse, Teradata
Programming Languages: SQL, PL/SQL, HiveQL, Python, PySpark, Scala, R, Shell Scripting, Java, Regular Expressions
Version Control: Git, GitHub
Operating Systems: Windows (XP/7/8/10), UNIX, Linux, Ubuntu
Containerization Tools: Kubernetes, Docker, Docker Swarm
APIs and Web Frameworks: Django, third-party API integrations
Development Environments and IDEs: Eclipse, Visual Studio
Methodologies: Agile, JIRA, Waterfall