Sayali – Senior Data Scientist [Skills: Python, Vertex AI, LLM] – 10 years Exp – Our own H1B – Local to Bay Area, CA – Willing to relocate anywhere in USA
Consultant's Details: | Employer Details: |
Consultant Name: Sayali M | Employer Name: Nextgen Technologies Inc |
Work Visa: Our own H1B | Contact Person: Kushal Desai |
Location: Bay Area, CA | Email: kushal.desai@nextgentechinc.com |
Relocation: Yes – Willing to relocate anywhere in USA | Phone: +1 (413) 424-0484 Note: Please call after 09:00 AM PST |
Sayali's Resume
SUMMARY:
Data Scientist & ML Ops Engineer: Results-driven professional with around 10 years of experience bridging data science model development and ML Ops engineering across FinTech, SaaS, Utilities, and Energy. Expert in Python, SQL, TensorFlow, PyTorch, Scikit-learn, and XGBoost, with proven success designing end-to-end ML pipelines (MLflow, Kubeflow, Vertex AI), deploying models via Docker/Kubernetes, and automating workflows using Airflow/Composer. Skilled in AutoML frameworks (Vertex AI AutoML, H2O Driverless AI), CI/CD, and observability for model governance and performance monitoring. Adept at collaborating with Finance, Risk, and Engineering teams to deliver scalable, compliant, and production-grade AI solutions that reduce fraud, improve operational efficiency, and drive business insights.
AREAS OF EXPERTISE:
Business Applications & ERP: Salesforce, Workday, Oracle ERP, Jenzabar, ICIMS, SAP BW, HCSS, CRM, Jira, Confluence.
Data Platforms & Tools: SQL, Python, BigQuery, Snowflake, Spark, PySpark, Anaplan, CI/CD Workflows (GitHub, GitLab), Cloud (AWS/GCP), AWS, AWS Glue, EMR, Datalakes, Airflow, Redshift, Delta Tables, Azure ML
BI & Reporting: Power BI, Qlik Sense, Tableau, Executive Reporting, FP&A Dashboards, Sigma, Crystal Reports
Integration & Architecture: Data Modeling, ETL Pipelines, System Integration, Application Architecture, Digital Transformation Strategy, Data Governance Frameworks, Data Warehousing, ETL/ELT Orchestration & Automation, Model Monitoring & Explainability
Analytics & Automation: Forecasting Models, Supervised & Unsupervised ML, Fraud Detection & Risk Modeling, Anomaly Detection, ML Pipelines, Scenario Modeling, A/B Testing, Model Monitoring & KPI Tracking, Funnel Optimization, Streaming Data Processing (Kafka, Spark), Cloud Logging, Cloud Monitoring, Escalation Handling, SOX, GAAP, MRM, Data Quality & Governance, Audit Compliance
Cloud Data Warehousing: Amazon Redshift (ETL, schema design, performance tuning, workload management)
PROFESSIONAL EXPERIENCE:
Intuit Mountain View, CA
Senior Data Scientist/ Data Engineer 2 Feb 25 – Present
Improving and advancing enterprise SaaS strategy by building scalable AI/ML models, data pipelines, and financial forecasting systems integrated into planning and compliance frameworks.
Achievements/Tasks:
- Developed governed data pipelines in datalakes to integrate Workday and ICIMS datasets, enhancing data reliability by 25% across HR and finance workflows. This integration powered a Workforce Management Dashboard for the entire fintech division (TurboTax and QuickBooks products), enabling improved decision-making for executive teams, PMOs, and other stakeholders. The dashboard was showcased and acknowledged at Intuit Hackerrank.
- Integrated LLM-based prompt workflows with planning dashboards to automatically generate executive summaries using Python and Power BI, reducing manual preparation time by 50% and improving decision-making speed for leadership.
- Designed and deployed machine learning models (XGBoost, TensorFlow, PyTorch, Scikit-learn) that improved pricing optimization, anomaly detection, and revenue forecasting by 15%, enabling finance leadership to reduce uncertainty in quarterly planning cycles.
- Built distributed ML pipelines with Kubeflow, MLflow, and Vertex AI, orchestrating automated training, version control, and deployment into real-time APIs, which reduced deployment errors by 30% and provided reproducible, audit-ready ML workflows.
- Automated end-to-end ETL/ELT pipelines using Python, SQL, Spark, Snowflake, GCP BigQuery, and AWS Glue, transforming raw ERP and workforce data into ML-ready datasets, which reduced reporting latency by 40% and supported higher ML model accuracy.
- Partnered with Risk and Compliance teams to implement fraud anomaly detection frameworks within financial close pipelines, embedding Spark- and SQL-based anomaly rules that reduced audit-identified discrepancies by 25% and improved GAAP/SOX compliance.
- Containerized ML models using Docker and Kubernetes, enabling scalable SaaS deployment and ensuring production-grade inference performance with minimal downtime across high-volume applications.
- Created observability dashboards in Tableau, Qlik, and Sigma integrated with MLflow and Kubeflow logs, which proactively monitored data drift, retraining needs, and KPI anomalies, ensuring adherence to Model Risk Management (MRM) frameworks.
- Designed Delta and Iceberg table structures in data lakehouse environments, enabling efficient storage, schema evolution, and query optimization, which accelerated ML feature extraction pipelines by 35%.
- Automated financial close and compliance reporting with Python, SQL, and AWS Glue workflows, reducing manual effort by 25% and providing auditors with transparent traceability logs aligned with CI/CD governance.
- Applied Spark and PySpark transformations on 100M+ financial transactions to engineer ML-ready features for fraud detection models, which enhanced prediction recall and scalability of classification pipelines.
- Partnered with GTM, RevOps, and Finance leaders to integrate ML-driven predictive pricing and forecasting models into Annual Operating Plan (AOP) cycles, cutting the planning timeline by 30% while improving margin analysis.
- Conducted root cause investigations of ERP and dashboard inconsistencies, embedding ML-based anomaly detection and SQL audit controls that reduced pricing and reporting risk by 20%.
- Delivered scenario-based forecasting models using Python, SQL, and TensorFlow to simulate multiple investment, cost, and revenue scenarios, providing executives with better decision-making visibility during strategic planning.
- Integrated Workday and ICIMS workforce datasets into predictive workforce ML models, improving headcount and capacity planning accuracy by 10% while identifying risks of over/under-utilization.
- Deployed CI/CD-enabled ML pipelines with GitHub Actions and Jenkins, embedding automated testing and audit checkpoints, which improved reproducibility and cut deployment cycle time by 20%.
- Mentored a team of analysts on machine learning best practices, SQL optimization, anomaly detection frameworks, and model interpretability, improving overall team productivity and elevating ML solution adoption across departments.
Charge EPC Concord, CA
Data Scientist/ Project Controls Operations Jan 23 – Jan 25
Served as the lead data scientist for multiple multimillion-dollar California utility infrastructure projects, delivering machine learning, predictive analytics, and data engineering solutions that directly supported the CFO and project executives with financial forecasting, compliance, and profitability insights.
Achievements/Tasks:
- Designed predictive maintenance models using XGBoost, Scikit-learn, and NLP applied to IoT telemetry and maintenance logs, reducing unplanned downtime by 12% and improving fleet utilization across multiple projects.
- Automated ETL/ELT pipelines with Python, SQL, Spark, AWS Glue, and EMR, integrating ERP, IoT, and project datasets into a centralized data warehouse, which improved compliance reporting and accelerated anomaly detection across asset data.
- Developed Power BI dashboards with complex DAX measures for cost, revenue, and fleet performance (Snowflake, Oracle ERP, HCSS), identifying $350K in cost variance and improving executive visibility into project profitability. Also developed an automated Power BI workbook using DAX to address overbilling and underbilling of jobs, designed to be user-friendly for both technical and non-technical staff.
- Applied star and snowflake schema dimensional modeling on ERP and IoT datasets to deliver ML-ready structures that reduced reporting discrepancies and improved training dataset accuracy.
- Implemented fraud and anomaly detection pipelines with Spark and PySpark to identify billing irregularities and project cost anomalies, cutting financial discrepancies by 20% and improving audit compliance.
- Automated pipeline orchestration with Airflow DAGs, AWS Step Functions, and Lambda, improving data ingestion reliability and reducing pipeline failures by 35%.
- Partnered with the CFO to create financial forecasting and capital allocation ML models using Python and SQL, which enhanced cash flow projections and enabled data-driven investment decisions.
- Built predictive cost forecasting models with Spark and SQL that improved demand forecasting by 10%, reducing project budget overruns.
- Developed geospatial ML models on telematics data to optimize route planning and asset rotation, which cut operational costs by 12%.
- Created regulatory compliance dashboards aligned with SOX and GAAP standards, embedding ML-based anomaly alerts that reduced audit review time by 30%.
- Partnered with executives to implement AI-driven cost and demand forecasting dashboards, modernizing project tracking tools and improving financial planning efficiency by 20%.
- Developed NLP-based classification models on maintenance logs to categorize risk events and streamline safety compliance reporting.
- Converted SAS datasets into Snowflake and Power BI pipelines to enable advanced ML training and improve cost/variance reporting accuracy.
- Deployed GitHub CI/CD workflows for ML pipelines, embedding testing and governance to improve reproducibility and reduce deployment errors by 25%.
- Delivered interactive Tableau and Power BI dashboards integrating ML-powered predictions, which improved executive decision-making speed and operational transparency across project portfolios.
B2U Storage Solutions Santa Monica, CA
Data Scientist/ Jr. Data Engineer Apr 22 – Dec 22
Delivered analytics solutions for second-life EV battery and solar energy storage operations, optimizing performance, lifecycle value, and financial planning. Partnered with operations, supply chain, and finance teams to improve asset uptime, fleet utilization, and profitability through automation, BI dashboards, and predictive analytics.
- Built interactive Power BI and Qlik Sense dashboards and automation tools integrating IoT battery sensors, ERP inventory data, and maintenance logs — improved uptime by 25% and reduced unplanned maintenance costs using MLflow tracking for reproducibility and retraining workflows in GCP Composer.
- Migrated forecasting and asset performance workflows to AWS Glue/EMR, improving scalability of analytics ETL pipelines.
- Leveraged Pub/Sub + Cloud Storage for near-real-time ingestion of IoT sensor and ERP data.
- Developed complex Power BI dashboards using DAX, NLP, Vertex AI and ML to track EV battery health, solar energy throughput, and fleet utilization — reduced breakdown costs by 25% and enhanced operational visibility.
- Developed orchestration workflows with Apache Airflow and Spark (AWS EMR) to automate ingestion and transformation of operational data into AWS Redshift, improving reporting reliability and enabling real-time Power BI dashboards.
- Developed NLP-based predictive maintenance alerts and AI-driven Power BI automation for real-time reporting and anomaly detection.
- Developed financial forecasting models (Advanced Excel, SQL, Python) for energy production revenue, maintenance cost projections, and procurement planning — integrated final outputs into Power BI dashboards with what-if parameters.
- Automated ETL pipeline deployments via GitHub CI/CD, reducing downtime and improving release management.
- Created a rule-based inventory ticketing system with Power BI, DAX and Power Automate — streamlined work order tracking and improved process efficiency by 15%.
- Created predictive maintenance and lifecycle models leveraging TensorFlow, PyTorch, and Scikit-learn to optimize EV battery performance, extending asset life and reducing breakdown costs by 25%.
- Designed anomaly detection and risk analytics workflows (TensorFlow, PyTorch, XGBoost) integrated with Qlik Sense dashboards to provide executives with real-time insights.
- Built predictive maintenance and lifecycle models deployed on AWS EMR Spark clusters for scalability, with results visualized in Power BI using trend lines, conditional formatting, and custom tooltips.
- Defined operational KPIs for asset lifecycle management in collaboration with supply chain leadership — built performance scorecards and alerts via Power BI.
- Audited SAP/Oracle order workflows and built SOX-compliant dashboards — used Power BI row-level security and workspace controls to ensure governance.
- Developed security-aware Power BI dashboards integrated with SharePoint, controlling access to high-impact asset performance visuals and ensuring MRM compliance.
- Collaborated with executive teams to streamline supply chain operations — insights derived from automated Power BI reports and slicer-driven executive dashboards.
- Built SharePoint-integrated dashboards and automated alerts using Power Automate — reduced manual tracking effort by 40% and supported incident management processes.
- Used DAX to build KPI indicators, time intelligence measures, and data quality checks for business continuity reporting.
- Incorporated custom visuals, matrix drill-downs, and filters in Power BI to support in-depth root cause analysis and weekly leadership reviews.
- Designed rule-based Power Automate inventory ticketing workflows integrated with Power BI, cutting manual processing effort by 40% and ensuring operational continuity.
California State University, East Bay Hayward, CA
Data Scientist (Part-Time) Nov 17 – Apr 20
Supported university administrative operations with data analytics, reporting, and process automation while completing graduate studies.
- Automated student enrollment and financial aid reporting pipelines using Python, SQL, and Excel, reducing manual data entry errors and cutting reporting cycle time by 20%.
- Built Power BI dashboards to monitor student performance metrics, financial tracking, and administrative KPIs, which improved decision-making for program managers.
- Applied predictive analytics techniques to identify enrollment and retention trends, enabling administrators to proactively address risk factors affecting student success.
- Partnered with HR and admissions to integrate Workday and PeopleSoft data sources into reporting systems, ensuring compliance with university standards and state reporting requirements.
Intellect Design Arena Ltd. India
Associate Data Scientist/ Product Data Engineer Dec 17 – Dec 20
Developed ML and data engineering solutions for banking and insurance SaaS products, with a focus on fraud detection, payment optimization, and compliance reporting.
- Gathered and documented business requirements (BRDs, FRDs, user stories) for insurance policy, claims, and billing workflows; conducted stakeholder interviews and facilitated requirement workshops.
- Built fraud detection models and dashboards using SQL, Python, and PySpark, reducing false positives while increasing fraud catch rates by 20% across high-volume transactions.
- Automated ETL pipelines with Python, SQL, AWS Glue, and Airflow to unify policy, billing, and claims datasets, accelerating fraud reporting speed by 35% and improving ML model readiness.
- Designed regulatory reporting dashboards with Python and Tableau to automate ISO 20022 and SOX reporting, cutting compliance errors by 30%.
- Created A/B testing frameworks with SQL and Python to optimize payment approval rates, increasing approval efficiency by 5% while reducing false rejections.
- Partnered with SMEs and QA teams to integrate Guidewire PolicyCenter/ClaimCenter data into ML workflows, reducing claims processing cycle times by 20%.
- Migrated traditional ETL processes to GCP Dataflow and Redshift pipelines, enabling scalable feature extraction and ML training on larger datasets.
- Embedded ML-based anomaly detection in UAT and test processes, improving defect triage speed and ensuring higher accuracy before production releases.
- Authored data governance documentation, BRDs/FRDs, and ML process flows, ensuring end-to-end audit readiness and compliance in insurance product deployments.
Globuzz Media Pvt. Ltd. India
Data Scientist Jan 16 – Dec 17
Worked as an analyst at a marketing startup, performing varied duties including data, marketing analytics, and CRM.
Achievements/Tasks:
- Designed interactive BI dashboards in Power BI and Tableau to track campaign ROI, customer engagement, and retention metrics, improving executive visibility.
- Conducted A/B testing and customer segmentation with SQL, Python, and Excel, providing insights that optimized targeting strategies and increased campaign conversion rates.
- Developed machine learning classification models in Python to recommend products based on customer behavior, achieving an 88% F1 score.
- Automated digital reporting pipelines with Python and SQL, reducing manual reporting effort by 50% and enabling daily client updates.
- Integrated cross-platform digital engagement data (Google Analytics, Facebook Insights, Spredfast) into centralized data marts, improving consistency of analytics reporting.
- Built customer churn prediction models with Scikit-learn to identify at-risk segments, helping design targeted retention campaigns that reduced churn rates.
- Partnered with business and marketing teams to apply Python-based funnel analysis that uncovered optimization opportunities, increasing ROI across digital campaigns.
PROJECTS:
- Study of Breakthrough Therapies in the US and their coverage – Capstone Project – Link
Capstone project for Healthcare Analytics, analyzing FDA approvals using NLP techniques to identify trends and forecast adoption based on disease area and clinical trial metadata.
- Time Series Analysis – Hobby and Game Store Retail Sales & Inventory – Link
Developed seasonal ARIMA models and Prophet-based forecasts for a multi-store retail chain using Python. Focused on inventory-level signal optimization and revenue planning.
- Wine Analysis – Data Classification – Link
Built classification models using scikit-learn to categorize wines by quality. Applied logistic regression and decision trees, achieving an 88% F1 score.
EDUCATION:
- Master’s in Business & Data Analytics | California State University, East Bay, USA 2022
- Post Graduate Diploma in Advanced Computing (CDAC) | University of Pune, Pune, India 2017
- Bachelor of Engineering, Computer Engineering | University of Pune, Pune, India 2016
RESEARCH PAPER PUBLICATIONS:
- The research paper for the project 'Data Hiding in Encrypted H.264/AVC Video Streams by Codeword Substitution' was published in an international conference journal, IFERP.
Note: Please call between 09:00 AM and 06:00 PM PST
Kushal Desai
1735 N 1st St., Suite 102 | San Jose, CA 95112
NextGen Technologies Inc
Email: kushal.desai@nextgentechinc.com. Website: www.nextgentechinc.com | +1 (413) 424-0484 |