Get eval results

GET /v1/eval/:id/results

curl https://api.runtype.com/v1/eval/<id>/results \
     -H "Authorization: Bearer <apiKey>" \
     -H "Content-Type: application/json"

{
  "batchExecutionId": "b7f3c9d2-4a1e-4f6a-9c3e-2d5a7b8e9f01",
  "failedRecords": 2,
  "flowId": "flow-9a8b7c6d",
  "processedRecords": 50,
  "resultsByRecord": {
    "record-001": [
      {
        "completionTokens": 10,
        "durationMs": 120,
        "errorMessage": null,
        "errorStack": null,
        "executedAt": "2024-06-10T16:40:00Z",
        "executionSessionId": "sess-5678efgh",
        "messageHistory": [
          {
            "role": "user",
            "content": "I want to check my order status."
          }
        ],
        "modelUsed": "gpt-4-chat",
        "output": "Order Status Inquiry",
        "promptTokens": 15,
        "reasoningContent": "The query explicitly asks about order status, indicating intent to check order progress.",
        "reasoningTokens": 20,
        "recordName": "Customer Query #001",
        "resolvedPrompt": "Classify the intent of the customer query: 'I want to check my order status.'",
        "retryCount": 0,
        "stepId": "step-01",
        "stepName": "Intent Classification",
        "stepResultId": "stepres-1234abcd",
        "stepType": "classification",
        "toolCalls": [
          {
            "errorMessage": null,
            "executedAt": "2024-06-10T16:40:00Z",
            "executionTimeMs": 100,
            "model": "intent-model-v2",
            "status": "success",
            "toolDescription": "Classifies user intents from text input",
            "toolExecutionId": "toolexec-7890ijkl",
            "toolId": "tool-nlp-01",
            "toolName": "IntentClassifier",
            "inputParameters": {
              "text": "I want to check my order status."
            },
            "outputResult": {
              "intent": "Order Status Inquiry",
              "confidence": 0.95
            }
          }
        ],
        "totalCost": 0.0025
      }
    ]
  },
  "startedAt": "2024-06-10T16:30:00Z",
  "status": "completed",
  "totalRecords": 50,
  "completedAt": "2024-06-10T16:45:30Z",
  "evalGroupId": "evalgroup-20240610",
  "evalName": "Customer Support Chatbot Accuracy Test"
}

Get detailed results for a specific eval batch, including step results and tool calls.
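As a rough illustration of calling this endpoint from Python, the sketch below builds the same request as the curl example using only the standard library. The function names `build_results_request` and `fetch_results` are hypothetical, and sending the key with the `Bearer` scheme is an assumption based on the Authentication section.

```python
import json
import urllib.parse
import urllib.request

# Host taken from the curl example on this page.
BASE_URL = "https://api.runtype.com"


def build_results_request(batch_execution_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for one eval batch."""
    # Percent-encode the ID so unexpected characters cannot alter the path.
    safe_id = urllib.parse.quote(batch_execution_id, safe="")
    url = f"{BASE_URL}/v1/eval/{safe_id}/results"
    return urllib.request.Request(
        url,
        method="GET",
        headers={
            # Assumption: the API key (or session token) uses the Bearer scheme.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def fetch_results(batch_execution_id: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_results_request(batch_execution_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from sending keeps the URL and header logic easy to check without network access; `fetch_results` raises `urllib.error.HTTPError` for non-2xx responses.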

Authentication

Authorization: Bearer

API key or Clerk session token

Path parameters

id (string, required)

Batch execution ID

Response

Eval results returned

batchExecutionId: string

failedRecords: integer

flowId: string or null

processedRecords: integer

resultsByRecord: map from strings to lists of objects

startedAt: string

status: string

totalRecords: integer

completedAt: string

evalConfig: any

evalGroupId: string

evalName: string
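Because `resultsByRecord` maps each record ID to a list of step result objects, a client usually has to flatten it before aggregating. The following is a minimal sketch of that, assuming the step objects carry the fields seen in the sample response (`toolCalls`, `totalCost`, `promptTokens`, and so on); the schema above only describes them as objects, so every field is read defensively. The helper name `summarize_results` is hypothetical.

```python
from typing import Any


def summarize_results(payload: dict[str, Any]) -> dict[str, Any]:
    """Aggregate a few totals from a decoded eval results payload."""
    # Flatten the record -> [step results] map into one list of steps.
    steps = [
        step
        for record_steps in payload.get("resultsByRecord", {}).values()
        for step in record_steps
    ]
    total = payload.get("totalRecords", 0)
    failed = payload.get("failedRecords", 0)
    return {
        "status": payload.get("status"),
        # Guard against division by zero for an empty batch.
        "failure_rate": failed / total if total else 0.0,
        "step_count": len(steps),
        "tool_call_count": sum(len(s.get("toolCalls") or []) for s in steps),
        "total_cost": sum(s.get("totalCost") or 0.0 for s in steps),
        # Token counts are kept separate: the page does not say whether
        # reasoning tokens are already included in completion tokens.
        "prompt_tokens": sum(s.get("promptTokens") or 0 for s in steps),
        "completion_tokens": sum(s.get("completionTokens") or 0 for s in steps),
        "reasoning_tokens": sum(s.get("reasoningTokens") or 0 for s in steps),
        # Steps whose errorMessage is set (non-null, non-empty).
        "failed_step_ids": [s.get("stepResultId") for s in steps if s.get("errorMessage")],
    }
```

Passing the decoded sample response from this page would report one step, one tool call, and a failure rate of 2 out of 50 records.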

Errors

401: Unauthorized Error

404: Not Found Error

500: Internal Server Error
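One way a client might handle the three documented status codes is sketched below. The `EvalResultsError` class and `classify_http_error` function are hypothetical, and treating only 5xx responses as retryable is an assumption of this sketch rather than documented behavior.

```python
# Status codes and names exactly as listed in the Errors section.
DOCUMENTED_ERRORS = {
    401: "Unauthorized Error",
    404: "Not Found Error",
    500: "Internal Server Error",
}


class EvalResultsError(Exception):
    """Error raised by a client for a non-success response (hypothetical)."""

    def __init__(self, status: int, name: str, retryable: bool) -> None:
        super().__init__(f"{status} {name}")
        self.status = status
        self.name = name
        self.retryable = retryable


def classify_http_error(status: int) -> EvalResultsError:
    """Map an HTTP status code to a client-side error object."""
    name = DOCUMENTED_ERRORS.get(status, "Unexpected Error")
    # Assumption: a 401 or 404 will not succeed on retry (bad credentials,
    # unknown batch execution ID), while a server-side failure might.
    return EvalResultsError(status, name, retryable=status >= 500)
```

With the standard library, this could be used by catching `urllib.error.HTTPError` around the request and raising `classify_http_error(exc.code)` from it.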