Data Quixote

Test your LLM prompts on local data. Run them through multiple models at once. Compare outputs, check costs, export what works.

Download for macOS · GitHub

1.1.1 · macOS Sonoma or later · Free

Quixote running SEO prompts across an ecommerce CSV with OpenAI and GPT-5.4 mini output columns.

Load a local CSV, write prompts that reference any column, and run them through OpenAI, Gemini, Ollama, LM Studio, or any OpenAI-compatible API. The table, raw responses, token costs, and timing all stay in one window. Your files and results stay on your machine; with a local server such as Ollama or LM Studio, nothing leaves it.

Providers and models

Use hosted APIs, local model servers, or any OpenAI-compatible endpoint.

Provider          Supported models
OpenAI            gpt-5.4, gpt-4.1, gpt-4o
Gemini            gemini-2.5-pro, gemini-2.5-flash
Ollama            llama3.2, qwen2.5, mistral, gemma3
LM Studio         Gemma, Llama, Qwen, local downloads
Compatible API    Any exposed OpenAI-compatible model ID

Models

Multi-provider

OpenAI, Gemini, Ollama, LM Studio, and any OpenAI-compatible endpoint.

Model refresh

Pull model lists from providers that expose /v1/models.
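
For context, OpenAI-style servers usually answer a GET on /v1/models with a JSON object whose "data" array holds one entry per model, each with an "id" field. The sketch below shows how such a response could be reduced to a list of model IDs. The function name model_ids and the sample payload are illustrative assumptions, not Quixote's own code.

```python
import json

def model_ids(response_body: str) -> list[str]:
    """Extract model IDs from an OpenAI-style /v1/models response body."""
    payload = json.loads(response_body)
    # The list endpoint wraps models in a "data" array; each entry carries an "id".
    entries = payload.get("data", [])
    return sorted(entry["id"] for entry in entries if "id" in entry)

# Hypothetical response body, shaped like the OpenAI list-models format.
sample = (
    '{"object": "list", "data": ['
    '{"id": "gpt-4o", "object": "model"}, '
    '{"id": "gpt-4.1", "object": "model"}]}'
)
print(model_ids(sample))
```

A server that returns no "data" array yields an empty list here, which is the case where adding model IDs by hand becomes necessary.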

Manual models

Add model IDs for local servers or gateways that cannot list models.

Side-by-side

Run two or three models on the same rows and compare their outputs column by column.

Data

File support

Open CSV, TSV (tab-delimited), JSON array, or Excel files.

Column variables

Reference any column with {{column_name}} inside a prompt.
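
As an illustration of how {{column_name}} placeholders could be filled from a row, here is a minimal sketch. The function render_prompt, the tolerance for spaces inside the braces, and the choice to raise an error on an unknown column are all assumptions made for the example; the app's actual behavior may differ.

```python
import re

# Matches {{ name }} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")

def render_prompt(template: str, row: dict[str, str]) -> str:
    """Replace each {{column_name}} in the template with that column's value."""
    def substitute(match: re.Match) -> str:
        column = match.group(1)
        if column not in row:
            # Assumed behavior: fail loudly rather than send a half-filled prompt.
            raise KeyError(f"Unknown column in prompt: {column}")
        return str(row[column])
    return PLACEHOLDER.sub(substitute, template)

# Hypothetical row from an ecommerce CSV.
row = {"title": "Trail Running Shoes", "category": "Footwear"}
prompt = render_prompt("Write a meta description for {{title}} in {{category}}.", row)
print(prompt)
```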

One file, many prompts

Attach multiple prompt tabs to one dataset and test variants without re-importing.

Export

Download the completed table as CSV when the run finishes.

Prompts

Prompt variants

Compare different instructions against the same data.

System message

Write a system prompt once; every row in the run uses it.

Parameters

Tune temperature, top-p, max tokens, and reasoning level per model.

Run limits

Test all rows or sample the first 10, 100, or 1000 rows.

Runs

Concurrent processing

Process multiple rows at once instead of waiting for each response in sequence.
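
One common way to process rows concurrently while capping the number of in-flight requests is a semaphore around each call. The sketch below assumes this pattern for illustration only; fake_model_call stands in for a real provider request, and the limit of 4 is an arbitrary placeholder.

```python
import asyncio

async def fake_model_call(prompt: str) -> str:
    # Stand-in for a real provider request; sleeps instead of using the network.
    await asyncio.sleep(0.01)
    return f"response to: {prompt}"

async def run_rows(prompts: list[str], max_concurrent: int = 4) -> list[str]:
    """Run all prompts with at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(prompt: str) -> str:
        async with semaphore:
            return await fake_model_call(prompt)

    # gather returns results in the same order as the input prompts.
    return await asyncio.gather(*(run_one(p) for p in prompts))

results = asyncio.run(run_rows([f"row {i}" for i in range(10)]))
print(len(results), results[0])
```

Keeping the results in input order means each response can be written back to the row it came from without extra bookkeeping.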

Pause and resume

Stop a run and continue later from the saved queue state.

Retry failed rows

Re-run only the outputs that errored.
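
Retrying only failed rows amounts to filtering results by status and re-running just those entries. The following sketch assumes a hypothetical result shape (a dict per row with "status" and "output" keys) and a caller-supplied run_row function; neither is taken from the app.

```python
from typing import Callable

def retry_failed(results: dict[int, dict], run_row: Callable[[int], str]) -> list[int]:
    """Re-run rows whose status is "error"; return the indices that were retried."""
    retried = []
    # Copy the items so entries can be replaced safely while looping.
    for row_index, result in list(results.items()):
        if result["status"] != "error":
            continue  # successful rows are left untouched
        try:
            results[row_index] = {"status": "ok", "output": run_row(row_index)}
        except Exception as exc:
            # Still failing: keep it marked as an error with the new message.
            results[row_index] = {"status": "error", "output": str(exc)}
        retried.append(row_index)
    return retried

# Hypothetical results from a first pass: row 1 errored.
first_pass = {
    0: {"status": "ok", "output": "A"},
    1: {"status": "error", "output": "timeout"},
    2: {"status": "ok", "output": "C"},
}
retried = retry_failed(first_pass, lambda i: f"output for row {i}")
print(retried, first_pass[1])
```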

Raw response viewer

Inspect and copy the model response behind any cell.

Stats

Token usage

Track input and output tokens by run, row, and model.

Cost estimate

See what each model is costing you, per row and in total.
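
A cost estimate of this kind is typically tokens multiplied by a per-token price, with input and output priced separately. The sketch below works through that arithmetic; the prices (0.50 and 2.00 per million tokens) and the token counts are placeholders, not any provider's real rates.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Cost of one request, given prices quoted per million tokens."""
    return (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000

# Hypothetical (input_tokens, output_tokens) pairs for two rows.
rows = [(1200, 300), (800, 250)]
per_row = [estimate_cost(i, o, 0.50, 2.00) for i, o in rows]
total = sum(per_row)
print([round(c, 6) for c in per_row], round(total, 6))
```

Local models served through Ollama or LM Studio have no per-token price, so the same formula with both prices set to zero gives a cost of zero.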

Latency

Compare response time and throughput while a run is active.

Quality signals

Review similarity and ROUGE metrics when references are available.
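
To make the ROUGE idea concrete: ROUGE-1 counts the unigrams a model output shares with a reference text and reports precision, recall, and F1. The sketch below is a simplified version that assumes lowercase whitespace tokenization with no stemming; it is not necessarily the variant Quixote computes.

```python
from collections import Counter

def rouge_1(candidate: str, reference: str) -> dict[str, float]:
    """Unigram-overlap precision, recall, and F1 between two texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}

scores = rouge_1("the cat sat on the mat", "the cat is on the mat")
print(scores)
```

In this example five of the six words overlap in each direction, so precision, recall, and F1 all come out to 5/6.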