To validate an LLM’s performance, navigate to the Run Task view. You can access this directly from the Task view (where you land after creating a task).
In the Run Task view, the left panel displays your task configuration—including the user prompt, system prompt, chosen model, and defined outputs.
Next, enter or upload the input data for the task. Remember that the task runs on dynamic input variables (e.g., {customer_review}): the LLM executes your prompts against whatever input you provide and returns outputs in the required format.
<aside> 🧠
Dynamic inputs can be text, images, URLs, or a combination, and each task can have as many dynamic input variables as required. For example, you might have a task that compares text to images, or one that summarises a range of numerical and text data. You could also have a task that scrapes a URL and analyses the data returned.
</aside>
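Conceptually, a dynamic input variable works like a placeholder in a prompt template: at run time, the value you enter is substituted wherever the variable appears. A minimal sketch of this idea (this is an illustration only, not the platform's actual implementation or API):

```python
def render_prompt(template: str, **inputs: str) -> str:
    """Fill {variable} placeholders in a prompt template with run-time inputs."""
    return template.format(**inputs)

# Hypothetical template resembling the demo task's user prompt.
template = "Classify the sentiment and key issues in this review: {customer_review}"

prompt = render_prompt(template, customer_review="My toaster exploded during breakfast!")
print(prompt)
# The {customer_review} placeholder is replaced by the supplied review text.
```

The same substitution happens for every dynamic variable a task defines, which is why one task configuration can be reused across many different inputs.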
For this demo, enter the following dummy {customer_review} in the input field:
"My toaster exploded during breakfast, sending flaming bread across the kitchen! 😱 On the bright side, I've discovered a new way to heat up the whole house. But seriously folks, this isn't just a hot topic - it's a fire hazard! The warranty card didn't mention anything about impromptu fireworks displays. 🎆"
We’re also going to add an image of a burnt toaster as evidence.
Click Run Task (or press ⌘↩️) to execute the task. Within seconds, the model’s response appears in the right panel.
Understanding the Response