The RAG use case inserts candidate documents into an LLM's context to ground the generated response in those documents instead of generating an answer from details stored in the LLM's trained weights. This type of search adds guardrails so that the LLM can search private data collections.
The RAG search can perform queries against external documents passed in as part of the request.
The POST request obtains and indexes prediction information related to the specified use case, and returns a unique `predictionId` and the `status` of the request. The `predictionId` can be used later in the GET request to retrieve the results.
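A minimal sketch of submitting the POST request in Python, assuming a hypothetical base URL, endpoint path, and body schema (the `query` and `documents` fields are illustrative placeholders; the actual paths and field names come from the API reference):

```python
import requests

BASE_URL = "https://api.example.com"   # hypothetical base URL
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"     # authentication and authorization access token

headers = {
    "Authorization": f"Bearer {ACCESS_TOKEN}",
    "Content-Type": "application/json",
}

# Request body: the field names below are illustrative placeholders.
body = {
    "modelId": "6a092bd4-5098-466c-94aa-40bf6829430",
    "query": "What is the refund policy?",
    # External candidate documents passed in as part of the request
    "documents": [{"id": "doc-1", "text": "Refunds are issued within 30 days."}],
}

resp = requests.post(f"{BASE_URL}/rag/predictions", headers=headers, json=body)
resp.raise_for_status()

prediction = resp.json()
print(prediction["predictionId"], prediction["status"])
```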
The authentication and authorization access token.
application/json
"application/json"
Unique identifier for the model.
"6a092bd4-5098-466c-94aa-40bf6829430\""
OK
This is the response to the POST prediction request submitted for a specific `useCase` and `modelId`.
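Continuing the sketch above, the `predictionId` from the POST response can then be polled with a GET request until the status indicates the results are ready (the endpoint path and status values are again assumptions, not the API's documented values):

```python
import time
import requests

BASE_URL = "https://api.example.com"       # hypothetical base URL, as above
headers = {"Authorization": "Bearer YOUR_ACCESS_TOKEN"}
prediction_id = "PREDICTION_ID_FROM_POST"  # predictionId returned by the POST request

# Poll the GET endpoint until the prediction completes.
while True:
    r = requests.get(f"{BASE_URL}/rag/predictions/{prediction_id}", headers=headers)
    r.raise_for_status()
    result = r.json()
    if result["status"] != "pending":      # illustrative status value
        break
    time.sleep(2)

print(result)
```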