This section describes the REST V2 connector configuration `alfresco-v1.json`, including the authentication, data fetched, requests configured (endpoints, query parameters, pagination), and variables needed. Terminology is also provided as a reference.
`alfresco-v1.json` is configured to retrieve the initial folders from the root folder `-root-` (Company Home) and then retrieve the folders, nested folders, and files.
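
The retrieval order described above can be illustrated with a toy model. This is a minimal sketch, not the connector's implementation: the in-memory `TREE`, the `children` helper, and the `crawl` function are hypothetical names, and the sketch assumes that each child request runs once per object found by its parent request, following the request types (INITIAL_FOLDER, FOLDER, FILE, FILE_DOWNLOAD) used in this configuration.

```python
from collections import deque
from typing import Dict, Iterator, List, Tuple

# Hypothetical in-memory repository: node id -> list of (child id, is_folder).
TREE: Dict[str, List[Tuple[str, bool]]] = {
    "-root-": [("sites", True), ("notes.txt", False)],
    "sites": [("sample1", True)],
    "sample1": [("archive", True), ("report.pdf", False)],
    "archive": [("old.pdf", False)],
}


def children(node_id: str, folders: bool) -> List[str]:
    """Stand-in for a children request filtered by isFolder=true or isFile=true."""
    return [cid for cid, is_folder in TREE.get(node_id, []) if is_folder == folders]


def crawl(root: str = "-root-") -> Iterator[Tuple[str, str]]:
    """Yield (object type, node id) pairs in the order this sketch visits them."""
    # Root request: folders directly under the root folder.
    initial = children(root, folders=True)
    for folder in initial:
        yield ("INITIAL_FOLDER", folder)

    # Child request FOLDER (recursive): runs per initial folder,
    # then again for every folder it finds, until none are left.
    found: List[str] = []
    queue = deque(initial)
    while queue:
        parent = queue.popleft()
        for folder in children(parent, folders=True):
            yield ("FOLDER", folder)
            found.append(folder)
            queue.append(folder)

    # Child request FILE: runs per FOLDER object; each file found
    # is then handed to FILE_DOWNLOAD.
    for folder in found:
        for file_id in children(folder, folders=False):
            yield ("FILE", file_id)
            yield ("FILE_DOWNLOAD", file_id)


if __name__ == "__main__":
    for object_type, node_id in crawl():
        print(object_type, node_id)
```

In this sketch, files are only looked up inside folders found by the FOLDER request, because FILE is modeled as a child of FOLDER. Whether the connector visits objects in the same order is not confirmed by this document.
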
The parser is set to `_system` but can be changed to any parser based on index needs.
- `${LW_BATCH_SIZE}` - Used with the pagination feature. Sets the `maxItems` query parameter, which controls the number of entries (folders/files) returned in the response.
- `${LW_INDEX_START}` - Used with the pagination feature. Sets the `skipCount` query parameter, which is used to traverse the pages.
- `${LW_PARENT_DATA_KEY}` - Used with the Child Request Configurations. At crawl time, this variable is replaced with the parent object ID value extracted by setting the property ‘Parent Data Key’. Note: The parent object is retrieved with a previous request (the parent request).

Pagination uses the variables `${LW_INDEX_START}` and `${LW_BATCH_SIZE}`:

- `maxItems=${LW_BATCH_SIZE}`, where `${LW_BATCH_SIZE}` is replaced with the value of the property BatchSize. For more information about `maxItems`, see the Alfresco documentation for limiting-result-items.
- `skipCount=${LW_INDEX_START}`, where `${LW_INDEX_START}` is replaced with the value of the property IndexStart to request the first page, then internally replaced with ‘IndexStart + BatchSize’ to request the next pages. For more information about `skipCount`, see the Alfresco documentation for skipping-result-items.

Requests configured (`requestConfigurations` in the `alfresco-v1.json` file):

Request type | ObjectType | Parent ObjectType | Endpoint | HTTP operation | Query parameters | Description |
---|---|---|---|---|---|---|
Root Request | INITIAL_FOLDER | | /alfresco/api/-default-/public/alfresco/versions/1/nodes/-root-/children | GET | include=path,properties&skipCount=${LW_INDEX_START}&maxItems=${LW_BATCH_SIZE}&where=(isFolder=true) | Returns the folders from the -root- folder (Company Home). |
Child Request | FOLDER | INITIAL_FOLDER | /alfresco/api/-default-/public/alfresco/versions/1/nodes/${LW_PARENT_DATA_KEY}/children | GET | include=path,properties&skipCount=${LW_INDEX_START}&maxItems=${LW_BATCH_SIZE}&where=(isFolder=true) | Returns the child folders of each parent folder retrieved with the previous request ‘INITIAL_FOLDER’. Internally, the variable ${LW_PARENT_DATA_KEY} is replaced with the ‘id’ of the parent folder, which is extracted by setting the property Response Handling → parentDataKey=entry.id. This request enables the property ‘Recursive Request’. |
Child Request | FILE | FOLDER | /alfresco/api/-default-/public/alfresco/versions/1/nodes/${LW_PARENT_DATA_KEY}/children | GET | include=path,properties&skipCount=${LW_INDEX_START}&maxItems=${LW_BATCH_SIZE}&where=(isFile=true) | Returns the child files of each parent folder retrieved with the previous request ‘FOLDER’. Internally, the variable ${LW_PARENT_DATA_KEY} is replaced with the ‘id’ of the parent folder, which is extracted by setting the property Response Handling → parentDataKey=entry.id. This request enables the property ‘Skip Indexation’. |
Child Request | FILE_DOWNLOAD | FILE | /alfresco/api/-default-/public/alfresco/versions/1/nodes/${LW_PARENT_DATA_KEY}/content | GET | | Downloads the content of each file retrieved with the previous request ‘FILE’. Internally, the variable ${LW_PARENT_DATA_KEY} is replaced with the ‘id’ of the file, which is extracted by setting the property Response Handling → parentDataKey=entry.id. |
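
The variable substitution used by the requests above can be sketched in a few lines. This is an illustration only, not the connector's code: `page_urls` and its `pages` argument are hypothetical, Python's `string.Template` merely stands in for whatever substitution the connector performs, and the sketch assumes the start index advances by BatchSize on every page. No URL encoding is applied.

```python
from string import Template
from typing import Iterator

# Endpoint and query parameters copied from the FOLDER child request.
ENDPOINT = (
    "/alfresco/api/-default-/public/alfresco/versions/1/nodes/"
    "${LW_PARENT_DATA_KEY}/children"
)
QUERY = (
    "include=path,properties&skipCount=${LW_INDEX_START}"
    "&maxItems=${LW_BATCH_SIZE}&where=(isFolder=true)"
)


def page_urls(parent_id: str, index_start: int, batch_size: int, pages: int) -> Iterator[str]:
    """Yield the request path and query string for a fixed number of pages.

    Assumption: page n uses skipCount = index_start + n * batch_size.
    The fixed page count is a simplification; the stop condition used by
    the connector is not described in this document.
    """
    for page in range(pages):
        values = {
            "LW_PARENT_DATA_KEY": parent_id,
            "LW_INDEX_START": index_start + page * batch_size,
            "LW_BATCH_SIZE": batch_size,
        }
        path = Template(ENDPOINT).substitute(values)
        query = Template(QUERY).substitute(values)
        yield f"{path}?{query}"


if __name__ == "__main__":
    # "abc123" is a made-up parent folder id.
    for url in page_urls("abc123", index_start=0, batch_size=100, pages=3):
        print(url)
```

With IndexStart=0 and BatchSize=100, the three generated requests use `skipCount` values 0, 100, and 200 while `maxItems` stays at 100.
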
Response handling (`responseConfiguration` in the `alfresco-v1.json` file):
`INITIAL_FOLDER`, `FOLDER`, `FILE`

- Response Handling → Data ID, Data Path, Parent Data Key can be configured to extract certain information from the parsed objects (see the Terminology section for more information).
- Response Handling → Parse Binary Data (`binaryResponse` in the `alfresco-v1.json` file). Sends the whole response to the Fusion parsers. If disabled (default), the response is parsed as a JSON object.

`FILE_DOWNLOAD`

- `FILE`, to retrieve a list of file metadata. This request is needed to discover the IDs of the files to be downloaded in a following request.
- `FILE_DOWNLOAD`, to download the binary content of the files found previously.

For `'/Company Home/Sites/sample1'`, add a key-value pair to the exclusion list:

- `key` = `entry.content.sizeInBytes`
- `minimum`, all sizes below the minimum will be excluded.
- `maximum`, all sizes above the maximum will be excluded (set to -1 when there should be no maximum).

Term | Description |
---|---|
List of Requests Configuration | Configures the list of requests used to extract data from the REST source. Requests are linked hierarchically through the properties ObjectType and ParentObjectType. |
Object Type | The unique name to identify the request. |
Parent Object Type | References an existing Object Type. This creates a parent-child hierarchy in which the current request becomes the child of the specified Parent Object Type. If blank, the current request is considered a Root Request. |
Root Request | The request to retrieve the initial objects. |
Child Request | The type of request that retrieves additional information for the root data objects. Child requests are performed once for each root data object. |
Recursive Request | When enabled, extra requests are performed to retrieve nested objects within the objects found by the current request. For example, the request with ObjectType=FOLDER enables this property, so an extra request is made for each folder found to retrieve its nested folders. This process continues until no more nested folders are found. |
Skip Indexation | When enabled, the response is not indexed. Useful when a request is needed only to discover child objects, without indexing the object itself. |
Response Handling | The `responseConfiguration` defines the mapping between the response and the data objects to be indexed. |
Data Path | The path to a specific data object within a response. For example, to access a list of elements stored under the key `objects`, the Data Path would be `objects`. If not provided, the entire response body is indexed. This property accepts JsonPath expressions, e.g. `objects`, `objects[*]`, or `list.entries` to extract the list of Alfresco objects. |
Data ID | The identifier key for the data objects extracted with ‘Data Path’. This value is used to build the Solr document’s ID. If not provided, a random UUID is used. This property accepts JsonPath expressions, e.g. `entry.id` to extract the ID of the Alfresco file or folder. |
Parent Data Key | Configure only with Child Requests. Sets the key used to extract the ID from the root/parent response; its value replaces the `${LW_PARENT_DATA_KEY}` variable in the child request configuration (endpoint, query parameters, or body). For example, `/alfresco/api/-default-/public/alfresco/versions/1/nodes/${LW_PARENT_DATA_KEY}/content`. |
Parse Binary Data | Enable to send the whole response to the Fusion Parsers. If enabled, properties Data Path, Data ID will be ignored and pagination will not happen. |
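
The Data Path, Data ID, and size-exclusion settings described above can be illustrated with a small sketch. It is an assumption-laden illustration rather than the connector's logic: `get_path`, `extract_objects`, and `excluded_by_size` are hypothetical helpers, the sample response is made up, only plain dotted keys are resolved (not full JsonPath), and a `maximum` of -1 is treated as "no upper bound".

```python
import uuid
from typing import Any, Dict, List

# Made-up response shaped like a node listing (list.entries[].entry).
RESPONSE: Dict[str, Any] = {
    "list": {
        "entries": [
            {"entry": {"id": "a1", "name": "report.pdf", "content": {"sizeInBytes": 2048}}},
            {"entry": {"id": "b2", "name": "tiny.txt", "content": {"sizeInBytes": 10}}},
            {"entry": {"name": "no-id.bin", "content": {"sizeInBytes": 500}}},
        ]
    }
}


def get_path(obj: Any, dotted: str) -> Any:
    """Resolve a plain dotted key such as 'entry.id'; return None if missing."""
    current = obj
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def extract_objects(response: Dict[str, Any], data_path: str, data_id: str) -> List[Dict[str, Any]]:
    """Pick the objects at data_path and assign each one a document ID."""
    items = get_path(response, data_path) or []
    documents = []
    for item in items:
        found_id = get_path(item, data_id)
        # A random UUID is used when the Data ID cannot be resolved.
        doc_id = str(found_id) if found_id is not None else str(uuid.uuid4())
        documents.append({"id": doc_id, "data": item})
    return documents


def excluded_by_size(item: Dict[str, Any], minimum: int, maximum: int,
                     key: str = "entry.content.sizeInBytes") -> bool:
    """Return True when the item's size falls outside [minimum, maximum].

    Assumptions: maximum == -1 disables the upper bound, and items
    without a size (for example folders) are never excluded.
    """
    size = get_path(item, key)
    if size is None:
        return False
    if size < minimum:
        return True
    return maximum != -1 and size > maximum


if __name__ == "__main__":
    docs = extract_objects(RESPONSE, data_path="list.entries", data_id="entry.id")
    kept = [d["id"] for d in docs if not excluded_by_size(d["data"], minimum=100, maximum=-1)]
    print(kept)
```

Here `list.entries` plays the role of Data Path and `entry.id` the role of Data ID; the same `entry.id` key is what the child requests use as Parent Data Key.
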