Location: Atlanta, GA
Key Responsibilities
· Design and execute test strategies for GenAI systems (LLMs, RAG pipelines, AI agents, copilots)
· Validate accuracy, relevance, consistency, and factuality of AI-generated outputs
· Perform prompt testing, adversarial testing, and edge-case validation
· Test for hallucinations, bias, toxicity, and harmful content
· Validate retrieval quality in RAG-based systems (chunking, embeddings, relevance)
· Conduct regression testing for model updates, prompt changes, and data refreshes
· Collaborate with data scientists, ML engineers, product managers, and compliance teams
· Define and track AI quality metrics (BLEU, ROUGE, faithfulness, groundedness, latency)
· Automate test cases for GenAI workflows where feasible
Required Skills
· 8+ years of experience in QA, testing, or quality engineering, with hands-on exposure to AI/ML or Generative AI systems
· Experience testing chatbots, NLP-based applications, and GenAI solutions, including prompt engineering and optimization
· Strong understanding of AI evaluation techniques, including hallucination detection, factual accuracy, bias, and output consistency
· Knowledge of Responsible AI principles, including fairness, transparency, and explainability
· Experience validating data quality, with a basic understanding of statistics and AI performance metrics
· Proficient in API testing (REST, JSON) and testing AI model endpoints
· Hands-on experience with test automation tools and scripting (Python preferred)
· Familiarity with the ML lifecycle, model versioning, and regression testing for AI systems
· Exposure to cloud-based AI platforms such as Azure OpenAI, Amazon Bedrock, or Google Vertex AI
· Strong foundation in software testing methodologies, including exploratory, negative, and adversarial testing
· Ability to design test cases for non-deterministic AI systems
· Strong analytical and critical-thinking skills, with the ability to objectively assess subjective AI outputs
· Excellent documentation, communication, and collaboration skills, with experience working in Agile/DevOps environments and cross-functional AI teams