What if your quality assurance (QA) system understood user stories and executed end-to-end tests without writing a single script? That’s exactly what we built: an AI-native QA framework powered by Gemini. The framework automates test generation, UI workflows, and API validation in an integrated suite of utilities. In this blog, we’ll explore how it works and what makes it fundamentally different from traditional QA approaches.
Reimagine test automation with Niveus
82% of QA teams still rely on manual testing, and more than half struggle with time constraints. This reinforces a familiar challenge: QA pipelines remain fragmented, with slow test case generation and even slower execution across the UI and API layers. To overcome this, Niveus Solutions, part of NTT DATA, developed an AI-native, three-phase QA framework on Vertex AI Gemini that connects these stages into a structured, automated pipeline.
How the Framework Works: The Big Picture
The framework packages three core applications, and each module addresses a distinct layer of the QA problem. You set up one project, point it at a Drive folder, and data then flows modularly across the three phases:
| Module | What It Does | AI Engine |
|---|---|---|
| Module 1: Test Case Generator | Transforms user stories into structured, export-ready test cases | Vertex AI Gemini |
| Module 2: UI Test Execution | Executes test cases through real browser interactions, with no selectors and no scripts | Vertex AI Gemini |
| Module 3: API Testing | Uses OpenAPI specifications or Postman collections to test and validate API endpoints | Vertex AI Gemini (multi-agent) |
Each module deploys independently, runs in Docker, and integrates deeply with Google Cloud: Vertex AI for reasoning, Google Drive for report delivery, and Google Sheets as the connective tissue between phases.
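The framework’s internals aren’t published, but as a rough illustration of that connective tissue, a phase could hand its output to the next through the Sheets API. A minimal sketch, assuming the standard Google API Python client; the spreadsheet ID, tab name, and column layout are hypothetical:

```python
# Illustrative hand-off between phases via Google Sheets; the tab name
# and row schema are hypothetical, not the framework's real contract.
from google.oauth2 import service_account
from googleapiclient.discovery import build

creds = service_account.Credentials.from_service_account_file(
    "service-account.json",
    scopes=["https://www.googleapis.com/auth/spreadsheets"],
)
sheets = build("sheets", "v4", credentials=creds)

def publish_test_cases(spreadsheet_id: str, rows: list[list[str]]) -> None:
    """Append generated test cases so the next phase can pick them up."""
    sheets.spreadsheets().values().append(
        spreadsheetId=spreadsheet_id,
        range="TestCases!A:D",          # hypothetical tab and columns
        valueInputOption="RAW",
        body={"values": rows},
    ).execute()
```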
Module 1 – Test Case Generation: From User Stories to Structured QA
The first module addresses the most universal QA pain point: writing test cases from scratch. Instead of a QA engineer spending hours translating a user story into a spreadsheet, they simply upload the story document and let the AI do it.
What Goes In
The system takes a user story document as input: a plain-language description of the feature, such as authentication requirements, form fields, validation rules, and expected behaviours. It accepts .docx, .pdf, and image files, meaning teams don’t need to reformat their existing documentation.
How the Gemini Model Processes It
The system supports multiple Gemini models. It reads the user story, segments it into logical sections, and generates structured test cases. The system then merges and deduplicates them into a final output.
The system maintains conversation context, enabling QA teams to iteratively refine test cases, such as adding edge cases or negative scenarios, without losing prior inputs.
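This flow maps naturally onto chat sessions in the Vertex AI SDK, which carry prior turns forward. A minimal sketch, assuming the public `vertexai` Python SDK; the project, model name, file URI, and prompts are illustrative, not the framework’s actual internals:

```python
# Sketch of iterative test-case generation with conversation context.
import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="my-project", location="us-central1")
model = GenerativeModel("gemini-1.5-pro")

# A chat session keeps prior turns, so refinements build on earlier output.
chat = model.start_chat()
story = Part.from_uri("gs://qa-inputs/login-story.pdf",
                      mime_type="application/pdf")
first = chat.send_message([
    "Generate structured test cases (ID, preconditions, steps, expected "
    "result) for this user story.",
    story,
])
refined = chat.send_message("Add negative and edge-case scenarios "
                            "without repeating the cases above.")
print(refined.text)
```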
Why it Matters
This produces structured, immediately usable output, enabling teams to transition directly from requirements to execution with minimal effort.
Prompt Customisation as a Safety Net
If the output requires refinement, QA engineers can adjust the prompt and regenerate test cases. The default prompt is optimized for most scenarios, but this flexibility ensures test generation can be iteratively improved without restarting the process.
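Continuing the session sketched above, prompt customisation can be as simple as a tuned default that an individual run overrides; the names here are hypothetical:

```python
# A tuned default prompt that a run can override without restarting
# the process (names hypothetical).
DEFAULT_PROMPT = (
    "Generate structured test cases (ID, preconditions, steps, expected "
    "result) covering positive, negative, and boundary scenarios."
)

def generate_cases(chat, story_part, prompt: str = DEFAULT_PROMPT) -> str:
    """Regenerate test cases with a custom prompt, keeping chat context."""
    return chat.send_message([prompt, story_part]).text
```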
Module 2 – UI Testing: Automating Real User Flows Without Scripts
The Old Way
Traditional UI automation relies on scripts that depend on element selectors such as CSS or XPath. This makes tests fragile, as any UI change, such as a renamed class or layout update, can break the test and require rework.
The New Way
This module executes Module 1 test cases directly in the browser. There are no selectors and no scripts. The AI reads each step in natural language and performs real user interactions to validate the application.
Built on Playwright and powered by Vertex AI Gemini, the system adapts dynamically to UI changes, eliminating the need for constant script maintenance.
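Niveus hasn’t published this module’s code, but the pattern is straightforward to sketch: Gemini translates each natural-language step into a concrete action against the visible page, and Playwright performs it. The JSON action contract, model name, and URLs below are assumptions:

```python
# Sketch of selector-free execution: Gemini maps each step onto the
# current page, Playwright acts on the result. The JSON schema is an
# assumed contract, not the framework's actual one.
import json
import vertexai
from playwright.sync_api import sync_playwright
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")
chat = GenerativeModel("gemini-1.5-pro").start_chat()

def plan_action(step: str, page_text: str) -> dict:
    """Ask Gemini to map a natural-language step onto the visible page."""
    reply = chat.send_message(
        f"Step: {step}\nVisible page text:\n{page_text}\n"
        'Answer with JSON only: {"action": "fill|click", '
        '"target": "<visible label>", "value": "<text, if any>"}'
    )
    return json.loads(reply.text)  # assumes the model returns bare JSON

with sync_playwright() as p:
    page = p.chromium.launch(headless=False).new_page()
    page.goto("https://example.com/login")  # illustrative URL
    for step in ["Type user@example.com into the email field",
                 "Click the Sign in button"]:
        act = plan_action(step, page.inner_text("body"))
        if act["action"] == "fill":
            page.get_by_label(act["target"]).fill(act["value"])
        elif act["action"] == "click":
            page.get_by_role("button", name=act["target"]).click()
```

Because the model targets visible labels and roles rather than brittle selectors, a renamed CSS class or restructured DOM doesn’t invalidate the test.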
Execution Environments: Local vs. Cloud
- Local Chrome (Chromium): Runs on the tester’s machine for quick validation and fast iteration.
- LambdaTest Cloud: Enables cross-platform and cross-browser testing, supports responsive layouts, and provides session recordings for full visibility.
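Switching between the two environments is a one-line decision in Playwright. In this sketch, the endpoint and capability keys follow LambdaTest’s published Playwright integration, but treat the exact values as assumptions:

```python
# Sketch of local vs. cloud execution targets (capability values
# illustrative; check LambdaTest's docs for the current schema).
import json
import os
import urllib.parse
from playwright.sync_api import sync_playwright

def get_browser(p, target: str = "local"):
    if target == "local":
        return p.chromium.launch(headless=False)   # fast local iteration
    capabilities = {
        "browserName": "Chrome",
        "browserVersion": "latest",
        "LT:Options": {
            "platform": "Windows 11",
            "user": os.environ["LT_USERNAME"],
            "accessKey": os.environ["LT_ACCESS_KEY"],
            "video": True,                         # session recordings
        },
    }
    return p.chromium.connect(
        "wss://cdp.lambdatest.com/playwright?capabilities="
        + urllib.parse.quote(json.dumps(capabilities))
    )

with sync_playwright() as p:
    page = get_browser(p, target="local").new_page()
```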
Seamless Authentication Handling
The system handles authentication once and reuses it across all test cases:
- A single login captures session state
- The system reuses sessions across executions
- The system runs tests only after validating authentication
The AI identifies input fields and completes login flows contextually, without relying on element IDs.
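A minimal sketch of this login-once pattern using Playwright’s `storage_state`; the URLs and state file are illustrative, and the AI-driven credential filling is elided:

```python
# Login once, persist the session, reuse it for every test case.
from pathlib import Path
from playwright.sync_api import sync_playwright

STATE = Path("auth-state.json")

with sync_playwright() as p:
    browser = p.chromium.launch()
    if not STATE.exists():
        ctx = browser.new_context()
        page = ctx.new_page()
        page.goto("https://example.com/login")
        # ... AI locates the fields contextually and signs in here ...
        ctx.storage_state(path=str(STATE))  # capture cookies + local storage
        ctx.close()
    # Every subsequent test case reuses the saved session.
    ctx = browser.new_context(storage_state=str(STATE))
    page = ctx.new_page()
    page.goto("https://example.com/dashboard")
    assert "login" not in page.url          # validate auth before testing
```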
Comprehensive Reporting
After execution, the system generates a comprehensive report that includes:
- Pass/fail summary across test cases
- Step-by-step execution logs
- Visual evidence (GIF recordings)
- AI-generated insights explaining outcomes
The system automatically stores all reports in the project’s Google Drive folder.
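As a rough illustration of that last step, a report could be pushed through the Drive v3 API. A sketch assuming the standard Google API Python client; the folder ID, report path, and MIME type are illustrative:

```python
# Sketch of the report hand-off to the project's Drive folder.
from google.oauth2 import service_account
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload

creds = service_account.Credentials.from_service_account_file(
    "service-account.json",
    scopes=["https://www.googleapis.com/auth/drive.file"],
)
drive = build("drive", "v3", credentials=creds)

def upload_report(path: str, folder_id: str) -> str:
    """Push a generated report into the project's Drive folder."""
    meta = {"name": path.rsplit("/", 1)[-1], "parents": [folder_id]}
    media = MediaFileUpload(path, mimetype="text/html")
    created = drive.files().create(
        body=meta, media_body=media, fields="id"
    ).execute()
    return created["id"]
```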
Module 3 – API Testing: OpenAPI Specs or Postman Collections In, Full Reports Out
| Traditional API Testing | AI-Driven API Testing (This Framework) |
|---|---|
| Teams manually document endpoints (e.g., using Postman) | OpenAPI specifications (YAML) or Postman collections serve as the centralized source of truth |
| Engineers craft test inputs manually | The AI generates test inputs automatically |
| Teams need a deep understanding of every endpoint | The system extracts and understands API metadata automatically |
| Test design, execution, and validation are handled separately | A unified pipeline covers generation, execution, and validation |
| Coverage is subjective and inconsistent | Testing is objective-driven and structured |
| Teams validate responses manually | The system validates responses automatically and generates structured reports |
The Pipeline
A structured pipeline validates inputs, interprets the API specification, executes tests, and generates reports, ensuring consistency across the entire process.
| Stage | What Happens |
|---|---|
| 1. Objective Validation | The system screens the objective for malicious intent, blocking DoS attacks, vulnerability scanning, or credential brute-forcing before any API call is made |
| 2. Spec Analysis | The system parses the OpenAPI YAML or Postman collection; endpoints, methods, schemas, and auth requirements are extracted and structured |
| 3. Test Execution | The AI generates appropriate test inputs, fires real HTTP requests, evaluates responses against the spec, and records results |
| 4. Report Generation | The system creates a detailed report and pushes it to Google Drive |
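Condensed into code, the four stages might look like the sketch below. The keyword screen in stage 1 stands in for the framework’s AI-based check, and the spec path, base URL, and validation depth are all illustrative; the multi-agent internals are not published.

```python
# Condensed sketch of the four pipeline stages against an OpenAPI spec.
import requests
import yaml

BLOCKED = ("denial of service", "brute force", "vulnerability scan")

def validate_objective(objective: str) -> None:      # 1. Objective Validation
    if any(term in objective.lower() for term in BLOCKED):
        raise ValueError("Objective rejected before any API call is made")

validate_objective("Verify CRUD behaviour of the orders API")

spec = yaml.safe_load(open("openapi.yaml"))          # 2. Spec Analysis
base = spec["servers"][0]["url"]

results = []
for path, ops in spec["paths"].items():              # 3. Test Execution
    for method, op in ops.items():
        if method not in {"get", "post", "put", "patch", "delete"}:
            continue  # skip path-level keys such as "parameters"
        if "{" in path:
            continue  # parametrised paths need generated inputs; elided here
        resp = requests.request(method, base + path, timeout=10)
        results.append({
            "endpoint": f"{method.upper()} {path}",
            "status": resp.status_code,
            "matches_spec": str(resp.status_code) in op.get("responses", {}),
            "latency_ms": round(resp.elapsed.total_seconds() * 1000),
        })

# 4. Report Generation: serialise `results` and push to Drive as in Module 2.
```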
What Insights Are Generated
The system produces actionable insights, including recurring failure patterns, unstable or inconsistent test behaviors, and areas with limited coverage. It also highlights performance characteristics such as response times and endpoint reliability, providing a clearer view of where the system is most vulnerable.
How These Insights Drive Action
These insights enable QA and engineering teams to move beyond surface-level results and focus on what matters most. Teams can prioritize fixes based on impact, address unstable areas of the application, and refine test coverage where gaps exist. This shifts QA from a reactive process to a more strategic function, improving both release confidence and overall product quality.
Real-World Impact: What This Looks Like in Practice
The framework is already in active use internally at Niveus, demonstrating measurable improvements across the QA lifecycle.
Across all three modules, the impact is consistent and tangible:
| Testing Layer | Before | After |
|---|---|---|
| Test Case Writing | 2–3 hours per feature, manual spreadsheet effort | Under 3 minutes, AI-generated and auto-exported |
| UI Test Execution | Days of scripting; breaks with UI changes | Natural-language execution, no selectors |
| API Testing | Manual setup, handcrafted inputs, fragmented reporting | OpenAPI specifications or Postman collections as input, with automated end-to-end API validation |
However, the real value lies in the compounding effect. A user story created at the start of the day can move seamlessly through test case generation, UI execution, and API validation within hours, without requiring a single line of test automation code. This fundamentally changes how QA operates, shifting it from a reactive bottleneck to a continuous, intelligent system that scales with the application.
Conclusion
Modern QA stands at an inflection point. As applications grow more complex and release cycles accelerate, traditional approaches built on manual effort and script-based automation are no longer sufficient to keep pace.
AI-driven testing introduces a fundamentally different model, one where test generation, execution, and validation are interconnected, adaptive, and continuously improving. By shifting from rigid, maintenance-heavy processes to intent-driven systems, organizations can achieve greater consistency, broader coverage, and faster feedback loops.
This does not simply evolve existing practices; it fundamentally changes how QA operates and how quality is engineered. QA moves from a reactive checkpoint to an integrated, intelligent function that scales with the product, enabling teams to deliver with greater confidence and speed.