Data Engineer OR Data Analyst Resume


Desired Industry: Computer Software/Programming
SpiderID: 84360
Desired Job Location: Jacksonville, Florida
Date Posted: 10/11/2023
Type of Position: Contractor
Availability Date:
Desired Wage:
U.S. Work Authorization: Yes
Job Level: Experienced with over 2 years experience
Willing to Travel: Yes, Less Than 25%
Highest Degree Attained: Masters
Willing to Relocate: Yes


Experience:
Data Analytics Engineer, American Tire Distributors, Inc., Huntersville, NC June 2019 – Present
● Built data pipelines to load data from MS SQL Server via Cloud Storage buckets into BigQuery on Google Cloud Platform (Python, MS SQL, Cloud Storage, BigQuery); see the sketch after this list
● Built a framework to automate data quality testing of flat files (Python)
● Built a third-party file monitoring and loading framework to load data into BigQuery (Python)
● Built a scheduler framework using Airflow and deployed jobs through a CI/CD pipeline (Python, Git, Airflow)
● Built a framework to stream data from Oracle databases (GoldenGate) into BigQuery using Google Cloud technologies such as Confluent Cloud Kafka, Google Pub/Sub, and Cloud Functions
● Built a scalable framework to stream real-time data from Oracle databases into Postgres and BigQuery using Kafka deployed on managed Kubernetes clusters
● Automated the code generation and testing process for ingesting data into new BigQuery tables (Python)
● Performed ad-hoc analytics to analyze and predict customer behavior using unsupervised learning (k-means clustering)
● Built a framework for on-demand Docker image runs using Google Container Registry (Docker, Python, cloud instances)
● Built a framework to download, harvest, and scrape data from 2,000+ sources
● Built a framework to extract data from different source types such as CSV, Excel, PDF, and web tables
● Built a workflow management tool on Temporal to execute jobs
● Built a framework to process 2,000+ unstructured PDFs and stored the data in Elasticsearch to make search and extraction more efficient
● Built internal dashboard websites using Appsmith
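A minimal illustrative sketch of the Cloud Storage-to-BigQuery load step referenced in the first bullet above, using the google-cloud-bigquery client; the bucket, project, dataset, and table names are placeholder assumptions, not the production framework.

# Load a CSV file landed in a Cloud Storage bucket into a BigQuery table.
# Hypothetical names throughout; requires `pip install google-cloud-bigquery`.
from google.cloud import bigquery

def load_csv_to_bigquery(uri: str, table_id: str) -> None:
    """Load a CSV file from a gs:// URI into a BigQuery table."""
    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,   # skip the header row
        autodetect=True,       # let BigQuery infer the schema
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    load_job.result()          # block until the load job finishes
    table = client.get_table(table_id)
    print(f"Loaded {table.num_rows} rows into {table_id}")

if __name__ == "__main__":
    load_csv_to_bigquery("gs://example-landing/sales/2023-10-01.csv",
                         "example-project.sales.daily_orders")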
Sr. Data Analyst/Engineer, Bluestem Brands, Inc., Minneapolis, MN Jan 2017 – June 2019
● Analyzed credit reporting data, then architected and built a scalable application that decreased manual effort and improved productivity by over 200%, helping cut costs (Python, Selenium)
● Investigated data flow across various internal and external applications, predicted yearly and monthly credit bureau dispute volumes, and recommended process improvements for cost savings
● Analyzed collections data, then architected and built data pipelines and batch jobs using HiveQL, Python, HDFS, and ActiveBatch
● Analyzed collections data and delivered daily/weekly/monthly reports that help the business monitor key metrics and underperforming areas (Python, MySQL, HiveQL)
● Analyzed dialer results and decreased cost per credit application by making strategic changes (Python, MySQL)
● Analyzed data and made strategic changes to the new customer acquisition process, helping cut costs (Python, SQL)
● Analyzed fraud data (peak season vs. non-peak) from different samples, provided insights on various fraud-related reports, and recommended areas of improvement that reduced average delinquency rates
● Recommended traditional and nontraditional methods of investigating fraud (web and phone orders) to fraud investigators
● Analyzed retail, credit, web, and payments data and recommended modifications to the fraud detection and monitoring systems (HiveQL, MySQL, HDFS, Python)
● Built a fraud monitoring system (proof of concept) that predicted potentially fraudulent activity by analyzing customer behavior to further improve customer satisfaction (Python, SQL, HiveQL, Spark, scikit-learn: k-means, SVM, Isolation Forest); see the sketch after this list
● Partnered with the product owner and coordinated and assisted with the Collections, Credit IT, and Credit Bureau Reporting teams
● Partnered with the product owner on analysis and automated the reporting process
● Built and monitored technical programs for the Collections, Credit Bureau Reporting, and Fraud teams
● Served as a liaison between technical and non-technical teams
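A minimal illustrative sketch of anomaly scoring in the spirit of the fraud-monitoring proof of concept above, using scikit-learn's IsolationForest; the feature columns, values, and contamination rate are placeholder assumptions.

# Score orders for potential fraud with an Isolation Forest.
# The features below are invented for illustration only.
import pandas as pd
from sklearn.ensemble import IsolationForest

orders = pd.DataFrame({
    "order_amount":     [42.0, 55.5, 980.0, 61.2, 1500.0],
    "orders_last_24h":  [1, 2, 9, 1, 14],
    "account_age_days": [400, 120, 3, 250, 1],
})

model = IsolationForest(contamination=0.2, random_state=42)
model.fit(orders)

# predict() returns -1 for likely anomalies (potential fraud) and 1 for inliers.
orders["flag"] = model.predict(orders)
print(orders[orders["flag"] == -1])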

Data Science Intern, GoFind.ai, Berkeley, CA May 2016 – Sept. 2016
● Designed and built a scalable Spark architecture to distribute data across clusters using Spark's ML and MLlib libraries, and deployed batch jobs on Amazon EMR clusters and EC2 instances
● Designed workflows to process 3 million images, implemented feature extraction and dimensionality reduction techniques, and improved classification accuracy for retail categories using TensorFlow machine learning models; a Spark sketch follows this entry
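A minimal illustrative sketch of dimensionality reduction on image feature vectors with Spark ML's PCA, approximating the workflow above; the vectors and component count are toy placeholders.

# Reduce per-image feature vectors to a few principal components with Spark ML.
from pyspark.sql import SparkSession
from pyspark.ml.feature import PCA
from pyspark.ml.linalg import Vectors

spark = SparkSession.builder.appName("image-feature-pca").getOrCreate()

# Pretend each row is a feature vector extracted from one image.
rows = [
    (Vectors.dense([0.1, 0.7, 0.3, 0.9]),),
    (Vectors.dense([0.2, 0.6, 0.4, 0.8]),),
    (Vectors.dense([0.9, 0.1, 0.8, 0.2]),),
]
df = spark.createDataFrame(rows, ["features"])

# Project the 4-dimensional features down to 2 principal components.
pca = PCA(k=2, inputCol="features", outputCol="pca_features")
model = pca.fit(df)
model.transform(df).select("pca_features").show(truncate=False)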
Assistant Systems Engineer, Tata Consultancy Services, Bangalore, India Oct. 2013 – Dec. 2014
● Developed machine learning algorithms to improve product search results
● Developed a web admin tool for the client to retrieve data related to HP's printers and personal systems
● Analyzed real-time data from Google APIs, reported it into MongoDB using Java, and scheduled batch jobs
● Developed MapReduce programs to segregate live streaming messages and pipelined the output data into HDFS through Apache Kafka; a Python sketch of the consumer step follows below
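A minimal illustrative Python sketch of consuming and segregating streaming messages from Kafka (the original jobs were Java MapReduce writing to HDFS), using the kafka-python client; the topic name, broker address, and message schema are placeholder assumptions.

# Consume JSON messages from a Kafka topic and route them to per-type files.
# Requires `pip install kafka-python`; names are hypothetical.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "live-messages",                       # hypothetical topic name
    bootstrap_servers=["localhost:9092"],
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    record = message.value
    # Append each message to a file named after its type
    # (the original pipeline wrote the segregated output to HDFS).
    with open(f"{record.get('type', 'unknown')}.jsonl", "a") as out:
        out.write(json.dumps(record) + "\n")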
Transaction Risk Analyst, Amazon.com, Bangalore, India Jul. 2013 – Sep. 2013
● Analyzed real-time, order-level credit card, prepaid card, and gift card transactions using Amazon's analytics tools to detect risk-related patterns, and documented operational patterns
● Made decisions on accounts using traditional and nontraditional methods based on customers' historical web activity, banking information, e-commerce, and social media data
● Coordinated with machine learning teams in building robust fraud detection applications


Education:
South Dakota State University, Brookings, SD 2015 – 2016
Master of Science, Data Science [GPA: 3.7]
Courses: Big Data Analytics, SAS Programming, Data Warehousing/ Data Mining, Modern Applied Statistics, Statistical Programming, Predictive Analytics, Nonparametric Statistics, Programming-Data Analytics
SRM University, Chennai, India 2009 – 2013
Bachelor of Technology, Mechatronics



Skills:
Machine Learning: Predictive Modeling, Data Pre-Processing, Natural Language Processing (NLP), Neural Networks/Deep Learning
Statistics: Hypothesis Testing, Confidence Intervals, Inference, Bayesian Modeling, Markov Chain Monte Carlo, Time Series
Programming: R, Python (Pandas, Scikit-learn, SciPy, NumPy), MATLAB, SAS, Hive, Apache Spark, TensorFlow
Databases: MongoDB, Cassandra, MySQL, MS SQL
Tools & Applications: Hortonworks, Cloudera, SAS Enterprise Miner, SAS Enterprise Guide, SAP Business One, Tableau

