Chat

Get chat completions

POST https://api.markprompt.com/chat

Creates a chat completion for the provided prompt, taking into account the content you have trained your project on.

Request body

Each parameter is listed as key: type, followed by its description.

messages: ChatCompletionMessageParam[]

A list of user and assistant messages to which the AI should respond.

projectKey: string

The key associated with your project. In publicly shared code, use the production key from an allowlisted domain, which is set in the project settings. Otherwise, for instance on localhost, use the development key.

assistantId: string

The assistant ID.

assistantVersionId: string

The assistant version ID. If not provided, the default version of the assistant will be used.

model: 'gpt-4' | 'gpt-4o' | 'gpt-4-turbo-preview' | 'gpt-3.5-turbo'

The completions model to use.

systemPrompt: string

System prompt with custom instructions on how the AI should behave when generating a response.

context: object

Values to provide to the system prompt when used with templates.

outputFormat: 'markdown' | 'slack' | 'html'

Output format, e.g. Slack-flavored Markdown. Default: markdown.

jsonOutput: boolean

If true, return a JSON object instead of text. Default: false.

useAllSourcesForRetrieval: boolean

If true, use all connected sources for retrieval. Sources added later are included automatically.

retrievalSourceIds: string[]

A list of source IDs to use specifically for retrieval.

stream: boolean

If true, return the response as a ReadableStream. Otherwise, return it as a plain JSON object. Default: true.

doNotInjectContext: boolean

If true, do not inject the context in the full prompt unless the context tag is present in the template. Default: false.

threadId: string

If provided, the messages will be tracked as part of the same thread in the insights.

temperature: number

The model temperature. Default: 0.1.

topP: number

The model top P. Default: 1.

frequencyPenalty: number

The model frequency penalty. Default: 0.

presencePenalty: number

The model presence penalty. Default: 0.

maxTokens: number

The max number of tokens to include in the response.

maxContextTokens: number

The max number of tokens to include as part of the context. Note that this value will automatically be adjusted to fit within the context window allowed by the model. Default: 10000.

sectionsMatchCount: number

The number of sections to include in the prompt context.

sectionsMatchThreshold: number

The similarity threshold between the input question and selected sections. The higher the threshold, the more relevant the selected sections. If the threshold is too high, some relevant sections may be missed.

tools: ChatCompletionTool[]

A list of tools the model may call. Currently, only functions are supported. Use this to provide a list of functions the model may generate JSON inputs for.

toolChoice: ChatCompletionToolChoiceOption

Controls which (if any) function is called by the model. "none" means the model will not call a function and instead generates a message. "auto" means the model can pick between generating a message or calling a function. Specifying a particular function via {"type": "function", "function": {"name": "my_function"}} forces the model to call that function. "none" is the default when no functions are present. "auto" is the default if functions are present.

excludeFromInsights: boolean

If true, exclude the thread from insights. No message data will be stored. Default: false.

redact: boolean

If true, redact sensitive data from messages. Default: false.

debug: boolean

If true, the response will contain additional metadata for debugging purposes.
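
As an illustration of the tools and toolChoice parameters, the sketch below builds a request body that offers one function and forces the model to call it. The function name get_plan_details and its parameter schema are hypothetical. The tool definition shape (type, function.name, function.description, function.parameters) is an assumption based on the OpenAI-style convention that the ChatCompletionTool type name suggests; only the toolChoice shape is taken from the description above.

```javascript
// Hypothetical tool definition. The shape is ASSUMED to match an
// OpenAI-style ChatCompletionTool; it is not confirmed by this page.
const getPlanDetailsTool = {
  type: 'function',
  function: {
    name: 'get_plan_details', // hypothetical function name
    description: 'Look up the plan details for a user account.',
    parameters: {
      type: 'object',
      properties: { accountId: { type: 'string' } },
      required: ['accountId'],
    },
  },
};

// Build a request body for POST /chat that forces a call to the given tool.
function buildToolRequestBody(question, assistantId, tool) {
  return {
    messages: [{ content: question, role: 'user' }],
    assistantId,
    tools: [tool],
    // Forces the model to call this function, per the toolChoice description.
    toolChoice: { type: 'function', function: { name: tool.function.name } },
  };
}

const toolBody = buildToolRequestBody(
  'What is my plan?',
  'YOUR-ASSISTANT-ID',
  getPlanDetailsTool,
);
console.log(JSON.stringify(toolBody, null, 2));
```

Omitting toolChoice leaves the choice to the model, since "auto" is the default when functions are present.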

Example request

curl https://api.markprompt.com/chat \
  -X POST \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -H "X-Markprompt-API-Version: 2024-05-21" \
  -d '{
    "messages": [
      { "content": "What is Markprompt?", "role": "user" },
      { "content": "Markprompt is ...", "role": "assistant" },
      { "content": "Explain this to me as if I was a 3 year old.", "role": "user" }
    ],
    "assistantId": "YOUR-ASSISTANT-ID"
  }'

Response

By default, the response is returned as a ReadableStream of the form:

So imagine a robot that can answer all the questions you have...

In addition to the stream, the response includes a header named x-markprompt-data. Its value is a JSON object encoded as a Uint8Array and serialized as comma-separated byte values. Once decoded, it has the form:

{
  references: [
    reference1,
    reference2,
    ...
  ],
  threadId: "...",
  messageId: "...",
  debugInfo: { ... }
}

It consists of the following:

  • The references (see below) used to generate the answer.
  • A messageId, a unique ID representing the response message. It can be used to subsequently attach metadata to the message, such as a CSAT score, via the /messages API.
  • A threadId, a unique ID representing a multi-message thread, which can be passed to subsequent requests. It can also be used to attach metadata to the thread, such as user account info, via the /threads API.
  • If the debug parameter is set to true, a debugInfo object containing information about the query, such as the full prompt that was built for the query.
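
To illustrate how the threadId ties requests together, here is a minimal sketch that builds the body of a follow-up request from the decoded header payload of a previous response. The helper name buildFollowUpBody is hypothetical; the payload shape is the one shown above.

```javascript
// Build the body of a follow-up request that continues an existing thread.
// `previousPayload` is the decoded x-markprompt-data object of the prior
// response (or null/undefined for the first message of a conversation).
function buildFollowUpBody(previousPayload, messages, assistantId) {
  const body = { messages, assistantId };
  // Only include threadId when the previous response provided one.
  if (previousPayload && previousPayload.threadId) {
    body.threadId = previousPayload.threadId;
  }
  return body;
}

const followUp = buildFollowUpBody(
  { references: [], threadId: 'THREAD-ID', messageId: 'MESSAGE-ID' },
  [
    { content: 'What is Markprompt?', role: 'user' },
    { content: 'Markprompt is ...', role: 'assistant' },
    { content: 'Explain this to me as if I was a 3 year old.', role: 'user' },
  ],
  'YOUR-ASSISTANT-ID',
);
console.log(followUp.threadId); // prints "THREAD-ID"
```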

The reference object

A reference is an object of the form:

type FileSectionReference = {
  file: {
    title?: string;
    path: string;
    meta?: any;
    source: Source;
  };
  meta?: {
    leadHeading?: {
      id?: string;
      depth?: number;
      value?: string;
      slug?: string;
    };
  };
};

It is meant to give the client enough information to generate descriptive links to cited sources, including section slugs.
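
As a rough sketch of how a client might turn a reference into a link, the hypothetical helper below picks a label and builds an href. It assumes that file.path can be used as a relative URL and that the heading slug works as a URL fragment; both depend on how your content is published. The empty source object is only a placeholder, since the Source type is not described here.

```javascript
// Turn a FileSectionReference into a { label, href } pair for display.
// ASSUMPTIONS: file.path is usable as a relative URL, and the lead heading
// slug is usable as a URL fragment on that page.
function referenceToLink(reference) {
  const heading = reference.meta?.leadHeading;
  // Prefer the section heading, then the file title, then the raw path.
  const label = heading?.value ?? reference.file.title ?? reference.file.path;
  const href = heading?.slug
    ? `${reference.file.path}#${heading.slug}`
    : reference.file.path;
  return { label, href };
}

const link = referenceToLink({
  file: { title: 'Getting started', path: '/docs/getting-started', source: {} },
  meta: { leadHeading: { depth: 2, value: 'Installation', slug: 'installation' } },
});
console.log(link.label); // prints "Installation"
console.log(link.href); // prints "/docs/getting-started#installation"
```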

Parsing the header

Here is some example code in JavaScript to decode the payload, including the references, from the response header:

const res = await fetch('https://api.markprompt.com/chat', {
  /*...*/
});

// JSON payload
const encodedPayload = res.headers.get('x-markprompt-data');
const headerArray = new Uint8Array(encodedPayload.split(',').map(Number));
const decoder = new TextDecoder();
const decodedValue = decoder.decode(headerArray);
const payload = JSON.parse(decodedValue);
// ...
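
The decoding steps can also be wrapped in a small reusable helper that tolerates a missing header. This is a sketch only: decodeMarkpromptData is a hypothetical name, and the round trip at the end uses a locally encoded sample rather than a real response.

```javascript
// Decode the x-markprompt-data header value (comma-separated byte values)
// into a JavaScript object. Returns null when the header is absent.
function decodeMarkpromptData(headerValue) {
  if (!headerValue) {
    return null;
  }
  const bytes = new Uint8Array(headerValue.split(',').map(Number));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Round-trip check with a locally encoded payload (no network needed).
const sample = { references: [], threadId: 't1', messageId: 'm1' };
const encoded = Array.from(
  new TextEncoder().encode(JSON.stringify(sample)),
).join(',');
console.log(decodeMarkpromptData(encoded).threadId); // prints "t1"
console.log(decodeMarkpromptData(null)); // prints null
```

With a real response, the helper would be called as decodeMarkpromptData(res.headers.get('x-markprompt-data')).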

If the stream flag is set to false, the response is returned as a plain JSON object with a text field containing the completion, and a references field containing the list of references used to create the completion:

{
  "text": "Completion response...",
  "references": [reference1, reference2, ...]
}

where references are objects of the form described above.
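
For completeness, here is a minimal sketch of a non-streaming request. The function name getCompletion and the fetchImpl parameter are hypothetical; fetchImpl exists only so the function can be exercised with a stub and is not part of the Markprompt API.

```javascript
// Request a completion with stream: false and return the parsed JSON body.
// `fetchImpl` defaults to the global fetch; passing it in is only a
// convenience for testing.
async function getCompletion(question, projectKey, assistantId, fetchImpl = fetch) {
  const res = await fetchImpl('https://api.markprompt.com/chat', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-Markprompt-API-Version': '2024-05-21',
    },
    body: JSON.stringify({
      messages: [{ content: question, role: 'user' }],
      projectKey,
      assistantId,
      stream: false, // ask for a plain JSON object instead of a stream
    }),
  });
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  // The body has a text field and a references field, as described above.
  const { text, references } = await res.json();
  return { text, references };
}
```

A call such as getCompletion('What is Markprompt?', 'YOUR-PROJECT-KEY', 'YOUR-ASSISTANT-ID') resolves to an object with the text and references fields.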

When querying chat completions, do not use the bearer token if the code is exposed publicly, for instance on a public website. Instead, use the project production key, and make the request from an allowlisted domain. You can obtain the production key and allowlist the domain in the project settings.

Here is a working example of how to consume the stream in JavaScript. Note the use of projectKey and the absence of an authorization header: this code can be shared publicly, and will work from a domain you have allowlisted in the project settings.

async function askMarkprompt() {
  const res = await fetch('https://api.markprompt.com/chat', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-Markprompt-API-Version': '2024-05-21',
    },
    body: JSON.stringify({
      messages: [{ content: 'What is Markprompt?', role: 'user' }],
      projectKey: 'YOUR-PROJECT-KEY',
      assistantId: 'YOUR-ASSISTANT-ID',
      assistantVersionId: 'YOUR-ASSISTANT-VERSION-ID',
    }),
  });

  if (!res.ok || !res.body) {
    console.error('Error:', await res.text());
    return;
  }

  // Decode the JSON payload from the response header
  const encodedPayload = res.headers.get('x-markprompt-data');
  const headerArray = new Uint8Array(encodedPayload.split(',').map(Number));
  const headerDecoder = new TextDecoder();
  const decodedValue = headerDecoder.decode(headerArray);
  const { references } = JSON.parse(decodedValue);

  // Read the streamed answer chunk by chunk
  const reader = res.body.getReader();
  const streamDecoder = new TextDecoder();
  let response = '';

  while (true) {
    const { value, done } = await reader.read();
    if (done) {
      break;
    }
    // stream: true handles multi-byte characters split across chunks
    response += streamDecoder.decode(value, { stream: true });
  }

  console.info('Answer:', response);
  console.info('References:', references);
}

askMarkprompt();
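
The example above accumulates the whole answer before using it. As a variation, the sketch below passes each decoded chunk to a callback, so a UI can render the answer as it arrives. The names readChunks and onChunk are hypothetical; the { stream: true } option is the standard TextDecoder way to handle multi-byte characters that are split across chunks.

```javascript
// Read a ReadableStream of bytes, calling onChunk with each decoded piece
// of text, and return the full text once the stream ends.
async function readChunks(stream, onChunk) {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let fullText = '';
  while (true) {
    const { value, done } = await reader.read();
    if (done) {
      break;
    }
    // stream: true buffers incomplete multi-byte characters until the
    // next chunk arrives.
    const text = decoder.decode(value, { stream: true });
    fullText += text;
    onChunk(text);
  }
  // Flush any bytes still buffered in the decoder.
  const rest = decoder.decode();
  if (rest) {
    fullText += rest;
    onChunk(rest);
  }
  return fullText;
}
```

With a response from the chat endpoint, this could be called as readChunks(res.body, (text) => { /* append text to the UI */ }).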

Using templates

When using assistants, Handlebars templates are supported. They can be used to customize instructions, for instance based on specific user information. Here is an example:

- You are a friendly AI assistant.
- You are helping {{person.firstName}} {{person.lastName}}.
{{#if isHobbyUser}}
- They are on the Hobby plan.
{{else}}
- They are on the Business plan.
{{/if}}

Now we can call the API, providing the values for the template variables via the context property:

curl https://api.markprompt.com/chat \
  -X POST \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -H "X-Markprompt-API-Version: 2024-05-21" \
  -d '{
    "messages": [ { "content": "What is my plan type?", "role": "user" } ],
    "assistantId": "YOUR-ASSISTANT-ID",
    "context": {
      "person": {
        "firstName": "Steve",
        "lastName": "Jobs"
      },
      "isHobbyUser": true
    }
  }'

This will transform the template into the following fully specified instructions:

- You are a friendly AI assistant.
- You are helping Steve Jobs.
- They are on the Hobby plan.

To which Markprompt may reply:

Hi Steve! You are on the Hobby plan, which gives you access to 1 million free credits.
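
In application code, the context is typically derived from data about the signed-in user. The sketch below shows one way to build the request body for the template above. The helper name buildTemplatedBody and the shape of the user object (firstName, lastName, plan) are hypothetical.

```javascript
// Build a /chat request body whose context fills the template variables
// person.firstName, person.lastName, and isHobbyUser.
// The `user` object shape (firstName, lastName, plan) is hypothetical.
function buildTemplatedBody(user, question, assistantId) {
  return {
    messages: [{ content: question, role: 'user' }],
    assistantId,
    context: {
      person: { firstName: user.firstName, lastName: user.lastName },
      isHobbyUser: user.plan === 'hobby',
    },
  };
}

const templatedBody = buildTemplatedBody(
  { firstName: 'Steve', lastName: 'Jobs', plan: 'hobby' },
  'What is my plan type?',
  'YOUR-ASSISTANT-ID',
);
console.log(JSON.stringify(templatedBody.context));
// prints {"person":{"firstName":"Steve","lastName":"Jobs"},"isHobbyUser":true}
```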