About DevRev
At DevRev, we're building the future of work with Computer – your AI teammate. Unlike traditional tools, Computer unifies all your data sources, tools, and workflows into a single AI-ready platform, giving employees real-time insights, proactive suggestions, and powerful agentic actions. It extends your existing software with AI-native apps and agents that work alongside your teams and customers – updating workflows, coordinating across teams, and eliminating repetitive work. We call this Team Intelligence: human-AI collaboration that breaks down silos, brings people back together, and frees you to solve bigger problems. Backed by Khosla Ventures and Mayfield with $150M+ raised, DevRev is trusted by global companies across industries.
About the Role
As our Lead AI Test Automation Specialist, you'll develop testing strategies, evaluation frameworks, and quality metrics specifically designed for LLM-powered applications. This role requires a unique blend of QA expertise, understanding of GenAI behaviour, and automation skills to ensure our AI features are reliable, accurate, and trustworthy.
Key Responsibilities
- Design and implement comprehensive testing strategies for GenAI features, including conversational AI, agentic systems, and LLM-powered workflows
- Develop automated test suites for prompt testing, including regression tests that detect unintended changes in model behaviour
- Create evaluation frameworks to measure GenAI quality across multiple dimensions (accuracy, relevance, safety, consistency, latency)
- Build and maintain test datasets and golden examples that represent diverse user scenarios and edge cases
- Implement monitoring and alerting systems to detect quality degradation in production GenAI features
- Perform adversarial testing to identify potential failures, hallucinations, biases, or security vulnerabilities in AI systems
- Collaborate with engineers to define acceptance criteria and quality gates for AI feature releases
- Develop tools and frameworks that make it easy for engineers to test their GenAI implementations
- Conduct user acceptance testing and gather feedback on AI feature performance from internal users
- Document testing procedures, known issues, and quality metrics in clear, accessible formats
- Partner with Product and Design teams to ensure AI features meet user experience standards
- Stay current with GenAI testing methodologies, tools, and industry best practices
Your Qualifications
- PRE or test engineering experience, preferably with AI/ML systems
- Strong understanding of GenAI technologies including LLMs, prompt engineering, and AI application patterns
- Experience with test automation frameworks and scripting (Python, JavaScript, Selenium, Pytest)
- Knowledge of software testing methodologies (functional, integration, regression, performance, security testing)
- Ability to design test cases and evaluation criteria for non-deterministic systems
- Strong analytical and problem-solving skills with attention to detail
- Experience with API testing tools (Postman, REST Assured) and backend testing
- Familiarity with CI/CD pipelines and automated testing integration
- Excellent communication skills for documenting issues and collaborating across teams
Preferred Qualifications
- Experience testing conversational AI, chatbots, or agentic systems
- Knowledge of ML model evaluation metrics and techniques
- Familiarity with LLM evaluation frameworks (LangSmith, PromptFoo, Ragas)
- Experience with performance testing and load testing AI APIs
- Understanding of responsible AI principles, including fairness, transparency, and safety testing
- Background in enterprise software or SaaS QA
- Experience with test management tools (TestRail, Zephyr, Jira)
- Knowledge of security testing methodologies for AI systems
- Scripting experience with Python, including working with LLM APIs
What Makes This Role Exciting
- Define quality practices for GenAI applications
- Work on cutting-edge AI technologies and help ensure they're reliable and trustworthy
- Shape quality standards that will impact millions of enterprise users
- Collaborate closely with engineers, data scientists, and product teams
- Grow expertise in a highly specialized and increasingly important domain
- Influence the entire AI product development lifecycle from design to release
- Join a team that values quality as a first-class concern, not an afterthought
Join us in innovating our testing processes and ensuring the delivery of high-quality software products through advanced automation techniques.
DevRev is an equal opportunity employer and does not discriminate on the basis of race, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition, or any other basis protected by law.