Well said. GenAI testing does feel a lot like UX testing, even more so for features that allow free-form input from users.
Another thing I have observed is how leaders interpret these human testing results. Low test scores on the first try don't mean that the feature has no potential. In fact, the better the testing setup, the worse the first-cut results are likely to be. With good, real test data, it becomes a lot easier to improve the feature.