A spreadsheet-style editor for running prompts and models across multiple test cases. Import testsets to easily test, evaluate, and optimize your LLM outputs.

Prerequisites


Run an experiment

1

Create a prompt with variables

Use {{variable}} to define variables in your prompt. Configure parameters in the side panel, then commit the version.
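As a rough illustration of how {{variable}} placeholders behave, the sketch below extracts variable names from a prompt and fills them in locally. It is a minimal sketch, not the product's implementation: the regular expression, the function names find_variables and render, and the example prompt are all assumptions made for illustration.

```python
import re

# Assumed placeholder syntax: {{name}}, optionally with spaces inside the braces.
VARIABLE_PATTERN = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def find_variables(template: str) -> list[str]:
    """Return variable names in order of first appearance, without duplicates."""
    seen: list[str] = []
    for name in VARIABLE_PATTERN.findall(template):
        if name not in seen:
            seen.append(name)
    return seen


def render(template: str, values: dict[str, str]) -> str:
    """Replace every {{name}} with values[name]; raises KeyError if a value is missing."""
    return VARIABLE_PATTERN.sub(lambda match: values[match.group(1)], template)


# Hypothetical example prompt.
prompt = "Write a {{tone}} product description for {{product}}."

print(find_variables(prompt))
print(render(prompt, {"tone": "friendly", "product": "kettle"}))
```

Listing the variables this way is a quick check that a testset will have one column per placeholder before you commit the version.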
2

Create a testset

Every column in the CSV maps to a variable in your prompt. Add an ideal_output column to hold the expected output for each row.
To import an existing file, click Import from CSV.
To build a testset by hand, click Create empty, then add rows with the + Add row button.
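The CSV layout described above can be generated with a short script. This is a minimal sketch using Python's standard csv module; the function name build_testset_csv, the variable names, and the sample row are hypothetical, and only the ideal_output column name comes from this page.

```python
import csv
import io


def build_testset_csv(variables: list[str], rows: list[dict[str, str]]) -> str:
    """Build CSV text with one column per prompt variable plus ideal_output."""
    fieldnames = list(variables) + ["ideal_output"]
    buffer = io.StringIO()
    # DictWriter raises ValueError if a row contains a key that is not a column.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical test case for a prompt that uses {{product}} and {{tone}}.
rows = [
    {
        "product": "kettle",
        "tone": "friendly",
        "ideal_output": "A cheerful kettle blurb.",
    },
]

csv_text = build_testset_csv(["product", "tone"], rows)
print(csv_text)
```

Write csv_text to a .csv file to get something you can pass to Import from CSV.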
3

Create the experiment

Click + NEW experiment, select the prompt and versions to test, then import a testset or add test cases manually.
4

Compare results

Scroll down to compare results across prompt versions.
5

Evaluate with LLM

Run LLM evaluators to automatically score results.

Experiments with images

Run experiments using image-based datasets to test visual prompts.
1

Create a dataset with images

Prepare a CSV whose column headers match your prompt variables, and put the image URLs in the image column. After uploading, change the image column’s data type to Image.
Alternatively, add test cases manually: set a column’s type to Image, then enter image URLs or upload images directly.
2

Configure and run

Attach your image testset to the experiment. Ensure variable names in your prompt match the testset column headers exactly.
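Before attaching a testset, you can check locally that its headers line up with the prompt's variables. The sketch below is an illustration under stated assumptions: it treats matching as case-sensitive (one reading of "exactly"), ignores the ideal_output column because it holds expected outputs rather than a variable, and uses a hypothetical function name check_headers with made-up sample data.

```python
import csv
import io
import re


def check_headers(prompt: str, csv_text: str) -> dict[str, list[str]]:
    """Compare {{variable}} names in the prompt with the CSV header row."""
    variables = set(re.findall(r"\{\{\s*(\w+)\s*\}\}", prompt))
    header_row = next(csv.reader(io.StringIO(csv_text)), [])
    headers = set(header_row)
    headers.discard("ideal_output")  # expected outputs, not a prompt variable
    return {
        "missing_columns": sorted(variables - headers),
        "unused_columns": sorted(headers - variables),
    }


# Hypothetical prompt and testset; note the capitalised "Image" header.
prompt = "Describe {{image}} for an audience of {{audience}}."
csv_text = (
    "Image,audience,ideal_output\n"
    "https://example.com/cat.png,kids,A playful cat.\n"
)

print(check_headers(prompt, csv_text))
# {'missing_columns': ['image'], 'unused_columns': ['Image']}
```

Both lists come back empty when every variable has a column with an identical name.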