Ishani – Data Scientist/ML Engineer (Ex-PayPal, Lucid) – U.S. Citizen – 6+ Years Exp – Current Location: Richardson, TX (Willing to relocate anywhere in the USA)
Consultant's Details:
Consultant Name: Ishani
Current Location: Richardson, TX
Work Authorization: U.S. Citizen
Employer Details:
Employer: NextGen Technologies Inc
Contact Person: Kushal
Email: kushal.desai@nextgentechinc.com
Note: Please call between 09:30 AM PST and 06:00 PM PST
Phone: +1 (413) 424-0484
Ishani's Resume
PROFESSIONAL SUMMARY
Highly experienced Machine Learning Engineer with 6+ years of expertise in designing, developing, and deploying machine learning models across various industries. Proficient in building and optimizing large-scale ML systems, with a strong background in data science, software engineering, and statistical analysis. Skilled in leveraging cutting-edge machine learning frameworks and tools to solve complex business problems, and adept at working in cross-functional teams to deliver scalable, production-ready models that drive business outcomes. Proven track record of deploying, consuming, and fine-tuning NLP models and LLMs such as Azure OpenAI, Llama 2/3, and Hugging Face models. Strong proficiency in Python, TensorFlow, PyTorch, and scikit-learn, coupled with hands-on experience deploying AI solutions on cloud platforms such as Azure and AWS. Exceptional communication and leadership skills with a history of successful collaboration across cross-functional teams.
TECHNICAL SKILLS
Languages: Python, R, SQL, JavaScript
ML/AI: TensorFlow, Keras, scikit-learn, Prophet, PySpark, Pandas, OpenCV
Databases: MySQL, SQL Server, PostgreSQL, MongoDB
Reporting Tools: Tableau, Power BI, Wavefront
Predictive and Machine Learning: Regression (Linear, Logistic, Bayesian, Polynomial, Ridge, Lasso), Classification (Logistic, Two/Multiclass, Boosted Decision Tree, Random Forest, Decision Tree, Naïve Bayes, Support Vector Machines, KNN, Neural Network), Clustering (K-means, Hierarchical), Anomaly Detection
Cloud: Google Cloud Platform, Azure, AWS
Cloud Resources: Azure Databricks, AWS Glue, GCP BigQuery, Cloud Composer, Dataflow
Frameworks: Flask, Django, Falcon
Tools: Jupyter, Git, Jira, Docker
Operating Systems: Linux, Windows, macOS
EDUCATION
University of Texas at Dallas, USA 2017 – 2019
MS, Information Technology and Management
Dhirubhai Ambani University, India 2012 – 2016
BTech, Information Communication Technology
PROFESSIONAL EXPERIENCE
PayPal, Austin, TX Oct 2024 – Present
Data Scientist
Responsibilities:
- Identified business problems and management objectives that could be addressed through data analysis and proposed creative solutions and strategies for existing business challenges.
- Analyzed, manipulated, and processed large volumes of data using statistical software to discover trends, patterns, and insights via Jupyter, scikit-learn, and Tableau.
- Automated the end-to-end data collection pipeline by identifying valuable data sources, using ETL tools such as Apache NiFi and Apache Beam.
- Applied feature selection techniques such as ANOVA (analysis of variance) and decision trees using PySpark, and tuned hyperparameters to predict the outcomes of interest.
- Developed and maintained Tableau dashboards used by the Transaction Monitoring and FIU departments for reporting essential Anti-Money Laundering (AML) transactional metrics, improving workstream efficiency by 17%.
- Migrated PostgreSQL/SQL queries to Hive, increasing processing speed by 40% and reducing deployment time by three weeks.
- Reviewed 120 AML data quality rules for consistency with the Business Requirement Documents (BRD) and Functional Requirement Documents (FRD) and made updates based on upstream rule change requirements.
- Handled large datasets during the ingestion process using partitions, Spark in-memory capabilities, broadcasts in Spark, and effective and efficient joins and transformations.
- Automated data governance for asset management data using Python scripts, reducing manual effort by 60%.
- Optimized SQL queries for AML transaction monitoring rules by automating data cleansing, extraction, and analysis processes, improving performance and data accuracy by 50%.
- Identified close to 100k fallouts in the data quality report and executed Hive and SQL queries on the Hadoop platform to determine the root cause of each defect and the underlying transaction and reference data.
Environment: SQL, Python, JavaScript, TensorFlow, PyTorch, MongoDB, MySQL, Azure (Databricks)
Lucid Motors, Newark, CA Aug 2023 – Aug 2024
Machine Learning Engineer
Responsibilities:
- Developed chatbots that use Generative AI (GenAI) and Large Language Models (LLMs) to deliver personalized, context-aware responses, enhancing user engagement and satisfaction.
- Applied NLP techniques using transformer-based models for natural language understanding and generation, enhancing the chatbot's ability to process and generate contextually accurate responses based on user input and domain-specific language.
- Developed and implemented custom web scraping pipelines using Python libraries to gather domain-specific data from websites and APIs, ensuring the chatbot model was continuously fed high-quality, relevant data to enhance its understanding and response capabilities.
- Utilized vector databases to store and retrieve embeddings efficiently, enabling quick and accurate similarity searches for improved user query handling.
- Fine-tuned LLMs on domain-specific datasets to optimize the chatbot's performance, ensuring high accuracy and relevance in conversations.
- Integrated Hugging Face Transformers to leverage pre-trained LLMs, streamlining fine-tuning and deployment for domain-specific applications and enhancing response quality and model adaptability.
- Deployed the chatbot on AWS using Bedrock for scalable model training and inference, ensuring reliable and efficient performance under varying load conditions.
- Developed and deployed scalable RESTful APIs with FastAPI, enabling real-time communication between the chatbot and external systems for data retrieval and model inference.
- Designed and managed scalable infrastructure using Amazon EKS for Kubernetes-based container orchestration, ensuring high availability and efficient resource utilization, and used Git for CI/CD pipeline management.
- Coordinated cross-functional efforts to integrate domain-specific knowledge into Generative AI models, ensuring the system met both technical and business requirements.
Environment: Python, Tableau, Power BI, Machine Learning (Keras, PyTorch), Generative AI, Deep Learning, Natural Language Processing, Cognitive Search, Data Analysis (Pandas, NumPy), Vertex AI, Agile Methodologies, SCRUM Process, GCP, GitLab, Databricks, PySpark, BigQuery, Dataflow
The Walt Disney Company, Newark, WA, USA Jan 2022 – June 2023
Data Scientist
Responsibilities:
- Explored and analyzed customer-specific features and performed data imputation using the scikit-learn package in Python.
- Participated in feature engineering, including feature generation, feature normalization, and label encoding with scikit-learn preprocessing.
- Used Python (NumPy, SciPy, pandas, scikit-learn, seaborn) and R to develop a variety of models and algorithms for analytic purposes.
- Experimented with ensemble methods, applying different bagging and boosting techniques to increase the accuracy of the training model, and deployed the model on AWS.
- Designed and implemented a recommendation system that leveraged Google Analytics data and machine learning models, using collaborative filtering techniques to recommend courses to different customers.
- Designed rich data visualizations to present data in human-readable form with Tableau and Matplotlib.
- Built machine learning models to identify whether a user is legitimate using real-time data analysis and to prevent fraudulent transactions using supervised learning on customer transaction history.
- Involved in all phases of data acquisition, data collection, data cleaning, model development, model validation, and visualization to deliver solutions.
- Performed data cleaning, including transforming variables and handling missing values, and ensured data quality, consistency, and integrity using Pandas and NumPy.
- Developed various machine learning models such as logistic regression, KNN, and gradient boosting with Pandas, NumPy, Matplotlib, and scikit-learn in Python.
Environment: Python, R, Linux, Spark, Tableau Desktop, Microsoft Excel, MATLAB, Spark SQL
GEICO, Richardson, TX, USA Sep 2019 – Dec 2021
Machine Learning Engineer
Responsibilities:
- Collaborated closely with data engineering teams to streamline data ingestion workflows, reducing data processing time by 25%.
- Developed ML models in AWS SageMaker for fraud detection and risk scoring, reducing fraudulent claims by 12%.
- Led the implementation of containerized ML models using Docker and Kubernetes, ensuring scalability and efficient resource utilization across the organization.
- Worked with cross-functional teams to optimize and validate ML models, contributing to a 20% increase in overall model performance.
- Developed and maintained CI/CD pipelines for ML model deployment using Jenkins, reducing deployment time by 30%.
- Implemented edge computing solutions for ML inference, reducing latency by 35% and improving user experience in real-time applications.
- Conducted regular model reviews and A/B testing, leading to the successful deployment of models with a 95% accuracy rate.
- Ensured security and compliance in data handling processes, adhering to industry standards and reducing potential vulnerabilities by 40%.
- Optimized resource allocation and cloud services, cutting infrastructure costs by 25% while maintaining performance.
- Documented best practices and processes for ML model deployment, contributing to a shared knowledge base that improved team efficiency by 15%.
- Optimized ELT workloads against the Hadoop file system by implementing Hive SQL for transformation and applying performance tuning methodology to SQL, ETL mappings, and Hive-managed/ORC tables.
- Used version control systems like Git and collaboration platforms like Jira to facilitate seamless collaboration with cross-functional teams and track project progress efficiently.
Environment: Hadoop, Jenkins, Docker, scikit-learn, Python, Tableau, Hive SQL, SQL, Microsoft Excel, Business Analysis, Feature Engineering, Git, Jira
Kushal
NextGen Technologies Inc
1735 N 1st St., Suite 102 | San Jose, CA 95112
Email: kushal.desai@nextgentechinc.com | Website: www.nextgentechinc.com | +1 (413) 424-0484