Imran – ML Engineer – NLP/ML/Deep Learning – 10 Years Exp – Current Location: Humble, TX – (Willing to relocate anywhere in the USA)
Consultant's Details:
Consultant Name: Imran
Current Location: Bay Area, CA
Employer Details:
Employer: NextGen Technologies Inc
Contact Person: Kushal
Email: kushal.desai@nextgentechinc.com
Note: Please call between 09:30 AM and 06:00 PM PST
Phone: +1 (413) 424-0484
Imran's Resume
Sr. Data Scientist / ML Engineer
PROFESSIONAL SUMMARY
- Results-driven Machine Learning Engineer / Data Scientist with 10 years of experience in Data Analysis, Data Mining, and Machine Learning with large data sets of structured, semi-structured, and unstructured data for data acquisition, validation, predictive modeling, and visualization using Python, Spark MLlib, SQL, and Scikit-Learn.
- Participated actively in all areas of the data science project life cycle, including data extraction, cleaning, visualization, and model development and maintenance.
- Excellent knowledge of machine learning techniques and algorithms such as logistic regression, SVM, random forests, and deep learning.
- Strong experience with the Software Development Life Cycle (SDLC), including requirements gathering, analysis, design specification, and testing in both Waterfall and Agile methodologies.
- Extensive expertise with Python Visualization packages such as Seaborn, Matplotlib, and Plotly to apply Machine Learning solutions to a wide range of business problems.
- Experience building Machine Learning models with Python packages such as Pandas, NumPy, SciPy, Seaborn, Matplotlib, and Scikit-learn, and NLP libraries such as NLTK and spaCy.
- Comprehensive understanding of statistical prediction modeling, model testing, issue analysis, machine learning, validation, classification techniques, and economic forecasting approaches.
- Excellent understanding of analytics concepts and Supervised Machine Learning algorithms such as Logistic Regression, Linear Regression, K-Nearest Neighbors, Naïve Bayes, Support Vector Machines, Decision Trees, and ensemble models (Random Forests, Gradient Boosted Decision Trees, Stacking Models).
- Experienced in visualizing and reporting real-time insights using Tableau, Grafana, ggplot2, and Matplotlib to increase project visibility and ensure better business decisions.
- Working knowledge of relational databases (Teradata, Oracle) and advanced SQL programming, text analytics, data visualization using Python, NoSQL databases like MongoDB, and dashboard development using platforms like Tableau.
- Ability to create advanced statistical and predictive models such as generalized linear, decision tree, clustering, neural network models, ensemble models, Support Vector Machines (SVM), and Random Forest.
- Experience with a wide range of NLP approaches for information extraction, topic modeling, parsing, and relationship extraction, as well as designing, implementing, and maintaining scalable production NLP models.
- Technical expertise in data modeling and building online applications, as well as experience as a solution lead architecting Data Warehouse and Business Intelligence applications.
- Extensive experience converting business requirements into analytical models, inventing algorithms, building models, and producing data mining and reporting solutions that scale across massive amounts of structured and unstructured data.
- Skilled in data processing, modification, and preparation techniques such as describing data contents, computing descriptive statistics, regex parsing, splitting and combining, remapping, reindexing, merging, subsetting, and reshaping data.
TECHNICAL SKILLSET
Programming Languages: Python, SQL, Java, C#, Shell Scripting, JavaScript
Machine Learning & Deep Learning: Scikit-learn, TensorFlow, Keras, PyTorch, Spark MLlib, NLP (spaCy, NLTK), OpenAI, Hugging Face
Big Data & Databases: Hadoop, Spark, Hive, Teradata, Oracle, PostgreSQL, MongoDB, Cassandra, SQL Server
Cloud & DevOps: AWS (S3, Redshift, SageMaker, Lambda, Glue), Azure, Snowflake, Google Analytics, Git, GitHub
Visualization & Reporting: Tableau, Power BI, Matplotlib, Seaborn, Plotly, ggplot2, Excel
Stats Analysis & Predictive Modeling: Regression Analysis, Decision Trees, Random Forest, XGBoost, PCA, A/B Testing, ANOVA, Hypothesis Testing
Key expertise includes:
- End-to-end ML lifecycle: data extraction, cleaning, visualization, model development, deployment, and maintenance.
- Machine learning & statistical modeling: Logistic Regression, SVM, Random Forests, Gradient Boosted Trees, Deep Learning (CNN, LSTM, RNN).
- Big Data & Cloud: Hands-on experience with AWS, Azure, Snowflake, Hadoop, Spark, Hive, and PySpark.
- NLP & LLMs: Specialization in transformer-based models (BERT, GPT-4, LLaMA 2), fine-tuning, and prompt engineering.
- Visualization & Analytics: Proficient in Tableau, Grafana, Matplotlib, Seaborn, and Plotly.
- Agile & SDLC: Strong understanding of Agile (Scrum, Kanban) and Waterfall methodologies for software development.
EDUCATION DETAILS
JNTUH University, India 2010 – 2014
Bachelor of Technology, Computer Science and Information Technology
Relevant courses: IoT, Data Structures, Cloud Computing, Object-Oriented Analysis and Design, and DBMS.
PROFESSIONAL EXPERIENCE
Intuit, Mountain View Sep 2024 – Present
ML Engineer – NLP/ML/Deep Learning
Responsibilities:
- Designed and implemented core software features for financial products using Python, JavaScript, and TypeScript, with a focus on enhancing user experience and performance.
- Integrated machine learning models for fraud detection using Scikit-Learn and XGBoost, improving detection accuracy by 18%.
- Collaborated with DevOps teams to implement automated testing and continuous integration pipelines using Jenkins, Docker, and Kubernetes, improving feature release cycles by 30%.
- Developed and maintained RESTful APIs using Flask, FastAPI, and Django, facilitating smooth communication between frontend and backend services.
- Designed and optimized relational databases (PostgreSQL, MySQL) and NoSQL databases (MongoDB) to accommodate large-scale financial data.
- Utilized a variety of tools and frameworks such as TensorFlow, scikit-learn, Azure ML, Google Cloud, and Hadoop for data analysis and model development.
- Engineered custom data pipelines using Pandas and integrated AI-driven components to automate complex data pre-processing tasks.
- Developed and supervised classification models using algorithms such as Stochastic Gradient Descent, Logistic Regression, Random Forest, and SVM.
- Evaluated model performance using appropriate metrics and applied advanced hyperparameter tuning techniques to enhance accuracy and generalization.
Environment: Machine Learning, Python, NumPy, NLTK, Pandas, SciPy, SQL, Tableau, HDFS, AWS SageMaker, AWS Lambda, AWS Glue, DynamoDB, MongoDB, SQL Server, and ETL.
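The fraud-detection modeling described above can be sketched as follows. This is a minimal, hypothetical illustration on synthetic data, using scikit-learn's GradientBoostingClassifier as a stand-in for the XGBoost model mentioned; the data, features, and parameters are assumptions, not the production system.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for transactional data: ~5% positive (fraud) class
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.95],
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

# Gradient-boosted trees, conceptually similar to the XGBoost classifier cited
clf = GradientBoostingClassifier(n_estimators=100, max_depth=3, random_state=42)
clf.fit(X_train, y_train)

# Rank fraud risk by predicted probability; ROC AUC suits the class imbalance
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
```

In practice the same pattern applies with real transaction features and an XGBoost estimator; ROC AUC is preferred over raw accuracy because the fraud class is rare.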
Tech Mahindra, India Aug 2023 – July 2024
ML Engineer
Responsibilities:
- Applied ML modeling, trend analysis, and statistical data analysis techniques to understand customer behavior, segmentation, text analytics, and product-level analysis.
- Developed and deployed ML models using AWS EC2, S3, and SageMaker for training and deployment.
- Designed and implemented diagnostic tools using Python, Pandas, and Plotly.
- Automated diagnostic tests for sensor interface connectivity in Python to verify sensor status.
- Automated data preprocessing using AWS Lambda and optimized ETL workflows with AWS Glue.
- Managed data warehousing and analytics using AWS Redshift.
- Conducted data cleaning using Backward-Forward filling methods and performed data preparation in Azure ML with historical models.
- Designed and implemented advanced analytical tools and modeling methodologies to enhance decision-making.
- Built a recommendation system using content-based and collaborative filtering techniques, including NMF, SVD, Siamese ANN deep learning, and LightFM.
- Forecasted time series data using ARIMA, SARIMA, and SARIMAX models in Python.
- Designed and maintained Hadoop clusters for high availability and reliability.
- Integrated Matplotlib with Python libraries like NumPy and Pandas for data visualization.
- Applied ML algorithms such as decision trees, NLP, regression models, neural networks, deep learning, clustering, and SVM using scikit-learn and MATLAB.
- Optimized ML algorithms using stochastic gradient descent and fine-tuned parameters via Bayesian Optimization.
- Developed Power BI and Tableau dashboards backed by complex SQL queries for data visualization.
- Utilized Python libraries (pandas, NumPy, SciPy, scikit-learn, TensorFlow, Keras) for ML model development, including neural networks, NLP, and ensemble learning.
- Participated in Agile workflows, including sprint planning, stand-ups, and client communication.
- Implemented MLlib and Spark ML for binary classification, regression, clustering, and collaborative filtering.
- Conducted sentiment analysis, word cloud creation, and text analytics using Twitter and other social media data.
Environment: Python, Machine Learning, AWS (Redshift, EC2, S3, Glue, Lambda), SQL, Tableau, Hadoop, DynamoDB, MongoDB, Spark ML, NLP, TensorFlow, Keras, MATLAB.
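The time-series forecasting above used ARIMA-family models; as a minimal, numpy-only sketch, the autoregressive (AR) component underlying ARIMA can be fit by ordinary least squares. The series here is synthetic and the lag order is an assumption for illustration only.

```python
import numpy as np

# Synthetic monthly series with trend + yearly seasonality (stand-in for real data)
rng = np.random.default_rng(0)
t = np.arange(120)
series = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, 120)

# Fit an AR(p) model by least squares; p spans one seasonal period
p, n = 12, len(series)
X = np.column_stack([series[p - k:n - k] for k in range(1, p + 1)])  # lagged values
y = series[p:]
A = np.column_stack([np.ones(len(y)), X])  # add intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# One-step-ahead forecast from the most recent p observations
last = series[::-1][:p]  # y_{n-1}, y_{n-2}, ..., y_{n-p}
forecast = coef[0] + coef[1:] @ last
```

A production version would use statsmodels' SARIMAX (as the bullet notes) to add differencing, moving-average terms, and seasonal components; the least-squares AR fit above is only the core idea.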
eBay, India May 2021 – June 2023
Senior Data Analyst
Responsibilities:
- Analyzed customer behavior, marketplace trends, and transactional data using SQL and Python to drive insights for eBay’s e-commerce platform.
- Extracted and processed data from multiple sources, including Oracle, SQL Server, and MySQL, to support business intelligence initiatives.
- Built predictive models using machine learning techniques such as regression, classification, and clustering (K-Means, DBSCAN) to enhance recommendation engines and fraud detection systems.
- Designed and optimized SQL queries using BTEQ (Basic Teradata Query) for advanced analytics and reporting.
- Created dashboards and data visualizations using Tableau to provide insights to senior management and improve decision-making.
- Conducted NLP-based text analytics on customer reviews, search queries, and seller feedback to improve search ranking algorithms.
- Collaborated with business teams to define key performance metrics and implement data governance best practices.
- Performed exploratory data analysis (EDA) and data cleaning techniques such as imputation to ensure high data quality.
- Utilized TensorFlow and Scikit-learn for deep learning applications, including customer sentiment analysis and demand forecasting.
- Developed ETL pipelines and leveraged Spark Streaming to process real-time data from Kafka, optimizing marketplace operations.
- Documented data mapping, transformation logic, and business rules to ensure consistency and accuracy in analytics projects.
Environment: MySQL, Python, SQL, Tableau, Pandas, NumPy, Scikit-learn, TensorFlow, Hadoop, Spark, NLP, BTEQ, Teradata, Data Governance.
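The clustering-based segmentation mentioned above (K-Means) can be sketched as follows. The customer features and cluster shapes are invented for illustration; only the scale-then-cluster pattern reflects the described work.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customer features: [orders per year, average order value]
rng = np.random.default_rng(1)
customers = np.vstack([
    rng.normal([5, 20], [1, 5], (100, 2)),     # occasional, low-spend
    rng.normal([50, 35], [5, 8], (100, 2)),    # frequent, mid-spend
    rng.normal([20, 120], [4, 15], (100, 2)),  # infrequent, high-spend
])

# Standardize so both features contribute equally to Euclidean distances
scaled = StandardScaler().fit_transform(customers)
labels = KMeans(n_clusters=3, n_init=10, random_state=1).fit_predict(scaled)
```

Scaling matters here: without it, the dollar-valued feature would dominate the distance metric and the order-count dimension would be effectively ignored.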
Accenture, India Oct 2018 – Apr 2021
Data Analyst
Responsibilities:
- Used BTEQ (Basic Teradata Query) and Oracle to submit complex SQL statements, import and export data, perform text analytics, and generate Teradata reports.
- Extracted data from a range of database sources, including Oracle, SQL Server, and MySQL, and regularly used JIRA and other internal issue trackers for project development.
- Used Python visualization libraries such as Pandas, Seaborn, NumPy, Matplotlib, SciPy, Scikit-learn, Plotly, and NLTK to perform data imputation.
- Worked with unsupervised (K-Means, DBSCAN) and supervised (Regression, Classification) learning approaches for feature engineering, as well as Principal Component Analysis for dimensionality reduction.
- Worked with machine learning algorithms such as Naïve Bayes, Linear Regression, Logistic Regression, SVM, Neural Networks, and the K-Means clustering algorithm.
- Worked with senior management to identify potential Machine Learning use cases and to establish a server-side development strategy and approach for addressing business challenges.
- Used visualization technologies such as Tableau to develop business reports and data visualizations for top management to gain insights and enhance business processes.
- Worked with graphical models and deep learning models, as well as deep learning frameworks such as the TensorFlow library and TensorFlow API.
- Interacted with different departments to understand and identify data needs and requirements for defining sales and service data business rules.
- Used SQL and Python to analyze customer help data, contact volumes, and other operational data to deliver insights that improve help content and customer experience.
- Responsible for developing data standards and data governance to specify the key identifiers for each mapping/interface.
- Used Spark-Streaming APIs to perform on-the-fly transformations and actions for developing the common learner data model, which receives data from Kafka in near real time.
- Performed exploratory data analysis (EDA) to understand the data and handled missing values using imputation techniques.
- Documented new data to aid in source-to-target mapping and updated documentation for old data to aid in data cleaning and validation.
Environment: MySQL, Python, SQL, Statistical Modeling, Pandas, Tableau, NumPy, Hadoop
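The missing-value imputation mentioned above can be sketched with pandas. The frame, column names, and the choice of median imputation are assumptions for illustration; the bullet only specifies that imputation was used.

```python
import numpy as np
import pandas as pd

# Toy frame with missing values (stand-in for extracted operational data)
df = pd.DataFrame({
    "contact_volume": [120.0, np.nan, 95.0, 130.0, np.nan, 110.0],
    "handle_time":    [4.2, 3.8, np.nan, 5.1, 4.6, 4.0],
})

# EDA step: quantify missingness per column before imputing
missing_before = df.isna().sum()

# Median imputation: robust to outliers, one common choice among the techniques
df_imputed = df.fillna(df.median(numeric_only=True))
```

Backward/forward filling (also mentioned earlier in this resume) would instead use `df.bfill()` / `df.ffill()`, which suits ordered time-series data rather than cross-sectional records.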
Giant Eagle GCC – India Sep 2014 – Oct 2017
Data Analyst
Responsibilities:
- Collaborated with data engineers and the operations team to implement the ETL process; wrote and optimized SQL queries to perform data extraction to fit the analytical requirements.
- Performed data analysis using Hive to retrieve data from the Hadoop cluster and SQL to retrieve data from Redshift.
- Explored and analyzed the customer specific features by using Spark SQL. Performed univariate and multivariate analysis on the data to identify any underlying pattern in the data and associations between the variables.
- Performed data imputation using Scikit-learn package in Python.
- Participated in feature engineering such as feature-intersection generation, feature normalization, and label encoding with Scikit-learn preprocessing.
- Used Python 3.x (NumPy, SciPy, Pandas, Scikit-learn, Seaborn) and Spark 2.0 (PySpark, MLlib) to develop a variety of models and algorithms for analytic purposes.
- Developed and implemented predictive models using machine learning algorithms such as linear regression, classification, multivariate regression, Naïve Bayes, Random Forests, K-Means clustering, KNN, PCA, and regularization for data analysis.
- Conducted analysis assessing customer consumption behavior and discovered customer value with RFM (Recency, Frequency, Monetary) analysis; applied customer segmentation with clustering algorithms such as K-Means Clustering and Hierarchical Clustering.
- Built regression models, including Lasso, Ridge, SVR, and XGBoost, to predict Customer Lifetime Value.
- Built classification models, including Logistic Regression, SVM, Decision Tree, and Random Forest, to predict Customer Churn Rate.
- Used F-Score, AUC/ROC, Confusion Matrix, MAE, and RMSE to evaluate model performance.
- Designed and implemented recommender systems that utilized collaborative filtering techniques to recommend courses to different customers, and deployed them to an AWS EMR cluster.
- Utilized natural language processing (NLP) techniques to optimize customer satisfaction.
- Designed rich data visualizations to model data into human-readable form with Tableau and Matplotlib.
Environment: AWS Redshift, EC2, EMR, Hadoop Framework, S3, HDFS, Spark (PySpark, MLlib, Spark SQL), Python 3.x (Scikit-learn/SciPy/NumPy/Pandas/NLTK/Matplotlib/Seaborn), Tableau Desktop (9.x/10.x), Tableau Server (9.x/10.x), Machine Learning (Regressions, KNN, SVM, Decision Tree, Random Forest, XGBoost, LightGBM, Collaborative Filtering, Ensemble), NLP, Teradata, Git 2.x, Agile/SCRUM
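The RFM (Recency, Frequency, Monetary) analysis mentioned above can be sketched with a pandas groupby. The transaction log and snapshot date below are invented for illustration; only the three-metric aggregation pattern reflects the described work.

```python
import pandas as pd

# Hypothetical transaction log: one row per purchase
tx = pd.DataFrame({
    "customer": ["a", "a", "b", "c", "c", "c"],
    "date": pd.to_datetime(["2017-01-05", "2017-03-01", "2016-11-20",
                            "2017-02-10", "2017-02-25", "2017-03-05"]),
    "amount": [20.0, 35.0, 12.0, 50.0, 45.0, 60.0],
})
snapshot = pd.Timestamp("2017-03-31")  # analysis cut-off date

# Recency (days since last purchase), Frequency (purchase count),
# Monetary (total spend) per customer
rfm = tx.groupby("customer").agg(
    recency=("date", lambda d: (snapshot - d.max()).days),
    frequency=("date", "size"),
    monetary=("amount", "sum"),
)
```

The resulting table is the typical input to the segmentation step: each metric is binned into quantile scores, and customers are clustered or ranked on the combined RFM score.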
Kushal
NextGen Technologies Inc
1735 N 1st St., Suite 308, San Jose, CA 95112
Email: kushal.desai@nextgentechinc.com | Website: www.nextgentechinc.com | +1 (413) 424-0484