GitHub

The GitHub recipe retrieves data from a single GitHub repository via the GitHub REST API. In addition to this page, the configuration details and JSON recipe are available in the public REST configuration repository on GitHub.

This recipe uses hierarchical requests and requires version 1.1.0 or later of the REST V2 connector.

GitHub REST configuration

The JSON template file github-repo.json crawls a single specific repository via /repos/{owner}/{repository}. To crawl multiple repositories, create one datasource per repository. The GitHub REST configuration indexes each GitHub object listed below as a separate Solr document:

Repositories
Issues
Pull requests
Branches
Commits (per-branch via BRANCH parent)
Commit diffs (per-file change details for each commit)
Tags
Milestones
Collaborators
Releases
Comments (issues and PR comments)
Commit comments
Content (root directory listing)
Folders (recursive directory traversal)
Blobs (file content via Contents API binary parsing)

The configuration uses the GitHub REST API. The endpoints do not explicitly specify a version; they default to 2022-11-28. See API Versions in the GitHub documentation for details. The configuration was tested with GitHub Cloud. For GitHub Enterprise, update the serviceURL property to point to the enterprise instance API URL, such as https://github.example.com/api/v3.

Authentication methods

The GitHub REST recipe supports basic authentication using the GitHub username and a Personal Access Token (PAT) as the password. For more information, see Managing your personal access tokens in the GitHub documentation.
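As an illustration of what basic authentication with a PAT amounts to on the wire, the following minimal sketch builds the HTTP Authorization header from a username and token. The function name and the credential values are placeholders chosen for this example; the connector builds this header itself from the user and password properties.

```python
import base64


def basic_auth_header(username: str, token: str) -> dict:
    """Build an HTTP Basic Authorization header from a username and a PAT."""
    # Basic auth is the base64 encoding of "username:password".
    credentials = f"{username}:{token}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


# Placeholder values for illustration only, not real credentials.
headers = basic_auth_header("example-user", "example-token")
print(headers["Authorization"])
```

Because the token travels in every request header, treat the datasource configuration as a secret.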

Classic personal access token (PAT)

For public repositories only, most endpoints require no additional scopes. However, crawling private repositories or organization-private repositories requires the following scopes:

repo: Full control of private repositories (grants access to all repository data).
read:org: Read organization membership (required for organization-private repositories and the collaborators endpoint).

Minimum recommended scopes:

Public repositories only: public_repo and read:org
Private repositories: repo and read:org

Fine-grained personal access token

The GitHub recipe supports authentication through fine-grained personal access tokens. Set the token’s repository access to target the desired repositories, then grant the following read-only permissions:

Metadata (read): Required for repository listing, collaborators, tags, and commit comments.
Contents (read): Required for commits, branches, releases, and file contents.
Issues (read): Required for issues, milestones, and issue comments.
Pull requests (read): Required for pull requests and PR review comments.

Permissions by endpoint

The following table shows the exact permissions required per endpoint for each token type:

Endpoint	Fine-Grained Permission	Classic PAT (Public Repos)	Classic PAT (Private Repos)
`/repos/{owner}/{repo}`	Metadata: read	No scope needed	`repo`
`/repos/{o}/{r}/issues`	Issues: read	No scope needed	`repo`
`/repos/{o}/{r}/pulls`	Pull requests: read	No scope needed	`repo`
`/repos/{o}/{r}/commits`	Contents: read	No scope needed	`repo`
`/repos/{o}/{r}/commits/{sha}`	Contents: read	No scope needed	`repo`
`/repos/{o}/{r}/branches`	Contents: read	No scope needed	`repo`
`/repos/{o}/{r}/tags`	Metadata: read	No scope needed	`repo`
`/repos/{o}/{r}/milestones`	Issues: read	No scope needed	`repo`
`/repos/{o}/{r}/collaborators`	Metadata: read	`repo` + `read:org`	`repo` + `read:org`
`/repos/{o}/{r}/releases`	Contents: read	No scope needed	`repo`
`/repos/{o}/{r}/issues/comments`	Issues: read	No scope needed	`repo`
`/repos/{o}/{r}/comments`	Metadata: read	No scope needed	`repo`
`/repos/{o}/{r}/contents/{path}`	Contents: read	No scope needed	`repo`

In addition to the scopes listed in the preceding table, the authenticated user must have push (write), maintain, or admin access to the repository to use the /repos/{o}/{r}/collaborators endpoint. Without this level of access, the endpoint returns an HTTP 403 error regardless of token scopes. If the crawl account does not have write access to all repositories, consider removing the collaborator request configuration from the JSON recipe to avoid 403 errors.

Draft releases are only visible to users with push (write) access to the repository.
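When some requests are removed from the recipe, the permissions table can be used to work out the smallest set of fine-grained permissions the token still needs. The sketch below encodes the Fine-Grained Permission column as a dictionary; the dictionary and function names are hypothetical and exist only for this example.

```python
# Fine-grained permission (read) per endpoint, copied from the table above.
FINE_GRAINED_PERMISSION = {
    "/repos/{owner}/{repo}": "Metadata",
    "/repos/{o}/{r}/issues": "Issues",
    "/repos/{o}/{r}/pulls": "Pull requests",
    "/repos/{o}/{r}/commits": "Contents",
    "/repos/{o}/{r}/commits/{sha}": "Contents",
    "/repos/{o}/{r}/branches": "Contents",
    "/repos/{o}/{r}/tags": "Metadata",
    "/repos/{o}/{r}/milestones": "Issues",
    "/repos/{o}/{r}/collaborators": "Metadata",
    "/repos/{o}/{r}/releases": "Contents",
    "/repos/{o}/{r}/issues/comments": "Issues",
    "/repos/{o}/{r}/comments": "Metadata",
    "/repos/{o}/{r}/contents/{path}": "Contents",
}


def required_permissions(endpoints) -> list:
    """Return the sorted set of read permissions needed for the given endpoints."""
    return sorted({FINE_GRAINED_PERMISSION[endpoint] for endpoint in endpoints})


# Example: a recipe trimmed down to issues, milestones, and tags.
print(required_permissions([
    "/repos/{o}/{r}/issues",
    "/repos/{o}/{r}/milestones",
    "/repos/{o}/{r}/tags",
]))
```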

Supported crawl options

A full crawl fetches all content from the source. A re-crawl also retrieves all content, as if it were a full crawl. Orphan objects (objects deleted in GitHub and therefore not retrieved by the current crawl) are removed from the index by stray content deletion, which runs after the crawl finishes.

Rate limiting

GitHub enforces a rate limit of 5,000 requests per hour for authenticated requests and 60 requests per hour for unauthenticated requests. Always use an authenticated token to avoid rate limiting during crawls. Retry properties (retryCount, maxDelayTime) can be configured under the datasource’s retry settings to provide resilience against transient errors and rate limit responses (HTTP 403 or 429), though the default values are typically sufficient.

For repositories with large amounts of data, consider the total request volume: each repository triggers up to 15 child requests (one per entity type), and each paginated child request retrieves 100 items per page. Additionally, COMMIT is crawled per branch (as a child of BRANCH), so the total number of commit API calls scales with the number of branches. For GitHub Enterprise instances, rate limits may differ; consult your administrator.
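To get a rough sense of how quickly a crawl approaches the hourly limit, the request volume of the paginated endpoints can be estimated with simple arithmetic. The following is a minimal sketch under two assumptions that are not stated by the connector documentation: every paginated request fetches one extra, empty page to detect the stop condition, and the item counts are known in advance. The function names are hypothetical.

```python
import math

PER_PAGE = 100  # matches the per_page=100 query parameter in the recipe


def paginated_requests(item_count: int, per_page: int = PER_PAGE) -> int:
    """Requests needed to page through item_count items.

    Assumes one additional request for the final empty page that
    triggers the stop condition.
    """
    return math.ceil(item_count / per_page) + 1


def commit_requests(commits_per_branch: list) -> int:
    """COMMIT is crawled once per branch, so the cost is summed over branches."""
    return sum(paginated_requests(count) for count in commits_per_branch)


# Example: two branches with 250 and 40 reachable commits.
print(paginated_requests(250))
print(commit_requests([250, 40]))
```

COMMIT_DIFF adds one more request per indexed commit, so repositories with long histories and many branches are usually dominated by commit traffic.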

The CONTENT, FOLDER, and BLOB requests use the GitHub Contents API (/repos/{owner}/{repo}/contents/{path}) to crawl repository file content. Each directory level requires a separate API call (one request per directory). File downloads via /contents/{path} with the Accept: application/vnd.github.raw+json header also count against the API rate limit. For large repositories with many directories and files, these requests can exhaust the 5,000 requests/hour rate limit. Consider removing the CONTENT, FOLDER, and BLOB request configurations from the JSON if file content indexing is not needed.
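The cost of file crawling can be approximated as one request per directory plus one request per file. The sketch below computes that upper bound for a repository layout described as a nested dictionary, where a nested dictionary is a directory and None is a file. Both the representation and the function name are assumptions made for this example.

```python
def count_contents_requests(tree: dict) -> int:
    """Upper-bound estimate of Contents API requests for a directory tree.

    One request lists the directory itself, one request downloads each file,
    and each subdirectory is counted recursively.
    """
    requests = 1  # directory listing
    for child in tree.values():
        if isinstance(child, dict):
            requests += count_contents_requests(child)
        else:
            requests += 1  # file download
    return requests


# Hypothetical repository layout: 3 directories (including root) and 3 files.
layout = {
    "README.md": None,
    "src": {
        "main.py": None,
        "util": {"io.py": None},
    },
}
print(count_contents_requests(layout))
```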

Pagination setup

Pagination by Batch Size is configured on each child request that returns a paginated array, using a page-number approach. The following child requests use pagination: ISSUE, PULL_REQUEST, COMMIT, BRANCH, TAG, MILESTONE, COLLABORATOR, RELEASE, COMMENT, COMMIT_COMMENT. The GitHub REST API uses page-based pagination with the page and per_page query parameters. These endpoints return bare JSON arrays; when there are no more results, the API returns an empty array [].

Configure the pagination by batch size properties

IndexStart = 1: GitHub pages are 1-indexed. The first page is page 1.
BatchSize = 1: Used to increment the page number by 1 each iteration. The ${LW_INDEX_START} variable produces values 1, 2, 3, etc.
Stop Condition Key = $: References the root response (bare JSON array).
Stop Condition Value = []: Pagination stops when the response is an empty array.

Configure query parameters

per_page=100: The maximum number of items GitHub returns per page.
page=${LW_INDEX_START}: The current page number, auto-incremented by the connector.

The batchSize=1 setting is a technique to generate sequential page numbers (1, 2, 3…) from the ${LW_INDEX_START} variable, since GitHub uses page-number pagination rather than offset-based pagination. The actual number of items per page is controlled by the fixed per_page=100 query parameter.
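The pagination behavior described in this section can be sketched as a simple loop: start at indexStart, request a page, stop on an empty array, otherwise advance by batchSize. This is an illustration of the technique, not the connector's implementation; fetch_page is a hypothetical in-memory stand-in for an HTTP call to GitHub.

```python
def crawl_pages(fetch_page, index_start=1, batch_size=1, per_page=100):
    """Collect items page by page until the response is an empty array."""
    items = []
    requested_pages = []
    page = index_start  # plays the role of ${LW_INDEX_START}
    while True:
        requested_pages.append(page)
        response = fetch_page(page, per_page)
        if response == []:  # stop condition: key "$", value "[]"
            break
        items.extend(response)
        page += batch_size  # batchSize=1 yields pages 1, 2, 3, ...
    return items, requested_pages


# In-memory stand-in for a paginated endpoint holding 250 items.
DATA = list(range(250))


def fake_fetch(page, per_page):
    start = (page - 1) * per_page  # pages are 1-indexed
    return DATA[start:start + per_page]


items, pages = crawl_pages(fake_fetch)
print(len(items), pages)
```

With 250 items the loop requests pages 1 through 4; the fourth page is empty and ends the pagination.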

Variables used

The GitHub REST configuration variables used are:

${LW_INDEX_START}: Used with the pagination feature. This variable sets the page query parameter, which is the page number to retrieve. GitHub pagination is 1-indexed. The connector increases this value by batchSize (1) after each page request, producing page numbers 1, 2, 3, etc.
${LW_PARENT_DATA_KEY}: Used with Child Request Configuration. This variable is replaced with the value of the parentIdKey field from the parent object’s response.
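The effect of these two variables can be illustrated with plain string substitution. The sketch below uses Python's string.Template, which happens to share the ${NAME} syntax; the render function is hypothetical, and the connector's own substitution logic may differ in details such as URL encoding.

```python
from string import Template


def render(template: str, index_start=None, parent_data_key=None) -> str:
    """Replace ${LW_INDEX_START} and ${LW_PARENT_DATA_KEY} in a template string."""
    values = {}
    if index_start is not None:
        values["LW_INDEX_START"] = str(index_start)
    if parent_data_key is not None:
        values["LW_PARENT_DATA_KEY"] = str(parent_data_key)
    # safe_substitute leaves any variable without a value untouched.
    return Template(template).safe_substitute(values)


# Third page of a paginated request.
print(render("per_page=100&page=${LW_INDEX_START}", index_start=3))

# COMMIT_DIFF endpoint for a parent commit with a placeholder SHA.
print(render("/repos/owner/repo/commits/${LW_PARENT_DATA_KEY}",
             parent_data_key="abc123"))
```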

Endpoints configuration

The following table describes the GitHub REST endpoints used by the recipe and how they are configured with the REST connector. Each request is configured under the property List of Requests Configuration (requestConfigurations in the JSON files).

Request type	ObjectType	Parent ObjectType	Endpoint	Query parameters	Description
Root Request	REPOSITORY		GET `/repos/{owner}/{repo-name}`	(none)	Returns a single repository object. Replace `{owner}` and `{repo-name}` with the target repository. No pagination is needed since the endpoint returns a single JSON object. To crawl multiple repositories, create one datasource per repository.
Child Request	ISSUE	REPOSITORY	GET `/repos/{owner}/{repo}/issues`	`per_page=100&page=${LW_INDEX_START}&state=all&filter=all`	Returns all issues (open and closed) for the repository. Note: GitHub’s issues endpoint also returns pull requests since every PR is an issue; PR objects can be identified by the presence of a `pull_request` field.
Child Request	PULL_REQUEST	REPOSITORY	GET `/repos/{owner}/{repo}/pulls`	`per_page=100&page=${LW_INDEX_START}&state=all`	Returns all pull requests (open, closed, and merged) for the repository. Provides PR-specific fields such as `diff_url`, `merge_commit_sha`, `draft`, `head`, and `base`.
Child Request	BRANCH	REPOSITORY	GET `/repos/{owner}/{repo}/branches`	`per_page=100&page=${LW_INDEX_START}`	Returns all branches for the repository. Uses `name` as the Data ID since branches do not have an `html_url` in the list response. Sets `parentIdKey=name` so its COMMIT child receives the branch name via `${LW_PARENT_DATA_KEY}` in the `sha` query parameter.
Child Request	COMMIT	BRANCH	GET `/repos/{owner}/{repo}/commits`	`sha=${LW_PARENT_DATA_KEY}&per_page=100&page=${LW_INDEX_START}`	Returns commits per branch. The `sha` query parameter receives the branch name from the parent BRANCH entity via `${LW_PARENT_DATA_KEY}` (parentIdKey=name). Note: commits reachable from multiple branches will be indexed once per branch.
Child Request	COMMIT_DIFF	COMMIT	GET `/repos/{owner}/{repo}/commits/${LW_PARENT_DATA_KEY}`	(none)	Fetches the single-commit detail and indexes all the modified files. Uses `dataPath=files` to extract the `files` array, creating a separate Solr document for each file entry with fields such as `filename`, `status`, `additions`, `deletions`, `changes`, `patch`, `blob_url`, `raw_url`, and `contents_url`. Uses `sha` as the Data ID. The `${LW_PARENT_DATA_KEY}` is replaced with the commit `sha` from the parent COMMIT object.
Child Request	TAG	REPOSITORY	GET `/repos/{owner}/{repo}/tags`	`per_page=100&page=${LW_INDEX_START}`	Returns all tags for the repository. Uses `name` as the Data ID since tags do not have an `html_url` in the list response.
Child Request	MILESTONE	REPOSITORY	GET `/repos/{owner}/{repo}/milestones`	`per_page=100&page=${LW_INDEX_START}&state=all`	Returns all milestones (open and closed) for the repository.
Child Request	COLLABORATOR	REPOSITORY	GET `/repos/{owner}/{repo}/collaborators`	`per_page=100&page=${LW_INDEX_START}`	Returns all collaborators for the repository. Requires the authenticated user to have push (write), maintain, or admin access to the repository; otherwise returns HTTP 403.
Child Request	RELEASE	REPOSITORY	GET `/repos/{owner}/{repo}/releases`	`per_page=100&page=${LW_INDEX_START}`	Returns all releases for the repository. Draft releases are included only when the authenticated user has push (write) access.
Child Request	COMMENT	REPOSITORY	GET `/repos/{owner}/{repo}/issues/comments`	`per_page=100&page=${LW_INDEX_START}`	Returns all comments on all issues (and pull requests) for the entire repository. Each comment includes an `issue_url` field linking back to the parent issue. Uses the repo-level endpoint to avoid nested parent key requirements.
Child Request	COMMIT_COMMENT	REPOSITORY	GET `/repos/{owner}/{repo}/comments`	`per_page=100&page=${LW_INDEX_START}`	Returns all comments on all commits for the entire repository. Each comment includes `commit_id` linking back to the parent commit. Uses the repo-level endpoint.
Child Request	CONTENT	REPOSITORY	GET `/repos/{owner}/{repo}/contents`	(none)	Lists the root directory entries of the repository’s default branch using the Contents API. Each entry includes `name`, `path`, `type` (file or dir), `size`, `sha`, and `html_url`. Uses `skipIndexation=true` — exists only for discovery.
Child Request	FOLDER	CONTENT	GET `/repos/{owner}/{repo}/contents/${LW_PARENT_DATA_KEY}`	(none)	Recursively walks subdirectories. Sets `parentIdKey=path` to extract the `path` field from the parent CONTENT object; `${LW_PARENT_DATA_KEY}` resolves to this value in the endpoint. Uses `recursiveRequest=true` to traverse all directory levels. Uses `skipIndexation=true` — exists only for discovery.
Child Request	BLOB	FOLDER	GET `/repos/{owner}/{repo}/contents/${LW_PARENT_DATA_KEY}`	(none)	Downloads raw file content for each file discovered by FOLDER. Sets `parentIdKey=path` to extract the `path` field from the parent FOLDER object; `${LW_PARENT_DATA_KEY}` resolves to this value in the endpoint. Uses the `Accept: application/vnd.github.raw+json` header and `binaryResponse=true` to download the binary content. Uses `path` as the Data ID.
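Because the ISSUE request also returns pull requests, downstream consumers may want to tell the two apart. The following sketch separates a list of issue objects by the presence of the pull_request field, as described in the ISSUE row. The function name and the sample objects are made up for this example and contain only the fields needed here.

```python
def split_issues_and_pull_requests(items: list) -> tuple:
    """Split issue-endpoint results into plain issues and pull requests."""
    issues = [item for item in items if "pull_request" not in item]
    pull_requests = [item for item in items if "pull_request" in item]
    return issues, pull_requests


# Trimmed-down sample objects; real responses carry many more fields.
sample = [
    {"number": 1, "title": "Bug report"},
    {"number": 2, "title": "Fix bug", "pull_request": {"url": "..."}},
    {"number": 3, "title": "Feature request"},
]

issues, pull_requests = split_issues_and_pull_requests(sample)
print([i["number"] for i in issues], [p["number"] for p in pull_requests])
```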

Notes

The requests are linked hierarchically using the ObjectType and ParentObjectType properties.
When objects are indexed, the field _lw_rest_parent_object_ss keeps the list of parents related to an object.
Comment endpoints use repository-level listing (/issues/comments, /comments) rather than per-issue, per-pull request, or per-commit endpoints. This design avoids the need for nested parent key substitution. The parent entity can be identified using fields within each comment:
- Issue comments: issue_url field
- Commit comments: commit_id field
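Since comments are crawled at the repository level, the link back to the parent issue has to be derived from the comment itself. A minimal sketch, assuming issue_url ends with the issue number as its last path segment (the usual form of GitHub API issue URLs); the function names and sample data are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def issue_number_from_comment(comment: dict) -> int:
    """Extract the parent issue number from a comment's issue_url field."""
    path = urlparse(comment["issue_url"]).path
    return int(path.rstrip("/").rsplit("/", 1)[-1])


def group_comments_by_issue(comments: list) -> dict:
    """Group comment IDs under the number of the issue they belong to."""
    grouped = defaultdict(list)
    for comment in comments:
        grouped[issue_number_from_comment(comment)].append(comment["id"])
    return dict(grouped)


# Sample comments with placeholder IDs and URLs.
comments = [
    {"id": 101, "issue_url": "https://api.github.com/repos/owner/repo/issues/42"},
    {"id": 102, "issue_url": "https://api.github.com/repos/owner/repo/issues/7"},
    {"id": 103, "issue_url": "https://api.github.com/repos/owner/repo/issues/42"},
]
print(group_comments_by_issue(comments))
```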

Response parsing configuration

Per request, configure the Response Handling property to specify how to parse the response. This field is responseConfiguration in the JSON recipe.

Plugin parsing

This parsing happens by default. The responses are parsed as a JSON object structure using JsonPath.
Plugin parsing applies to all the requests listed in the endpoints configuration table except the BLOB request, which uses binary parsing. The CONTENT and FOLDER requests also use plugin parsing, but with skipIndexation=true: they parse the JSON response to discover files and directories without creating Solr documents.
The Response Handling -> Data ID properties are configured to extract unique identifiers from the objects parsed. For most entities, html_url provides a globally unique, human-readable URL.
- For branches and tags, name is used since these entities lack html_url in list responses.
- For COMMIT_DIFF, blob_url is used since dataPath=files extracts per-file entries that each have a unique blob_url.
The Response Handling -> Parent Data Key property is required by the connector on all child requests. It specifies which field to extract from the parent response object; that value replaces ${LW_PARENT_DATA_KEY} in the child request’s endpoint or query parameters.
- The COMMIT request uses parentIdKey=name to extract the branch name from its parent BRANCH object, passing it as ${LW_PARENT_DATA_KEY} in the sha query parameter.
- The COMMIT_DIFF request uses parentIdKey=sha to extract the commit SHA from its parent COMMIT object, passing it as ${LW_PARENT_DATA_KEY} in the endpoint.
- The FOLDER request uses parentIdKey=path to extract the directory path from its parent CONTENT object.
- The BLOB request uses parentIdKey=path to extract the file path from its parent FOLDER object.
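The interplay of Data Path and Parent Data Key can be sketched with two small helpers: one selects the objects to index from a response, the other reads the value handed down to child requests. This is a simplification for illustration; the helper names are hypothetical, and only an empty path or a single top-level key is handled, whereas the real property accepts JsonPath expressions.

```python
def extract_objects(response, data_path: str = "") -> list:
    """Select the objects to index from a parsed JSON response.

    An empty data_path means the response itself (bare array or single object);
    otherwise data_path is treated as a single top-level key.
    """
    target = response[data_path] if data_path else response
    return target if isinstance(target, list) else [target]


def parent_key_value(parent_object: dict, parent_id_key: str) -> str:
    """Value that replaces ${LW_PARENT_DATA_KEY} in the child request."""
    return parent_object[parent_id_key]


# BRANCH -> COMMIT: the branch name is passed down (parentIdKey=name).
branch = {"name": "main"}
print(parent_key_value(branch, "name"))

# COMMIT_DIFF: dataPath=files yields one object per changed file.
commit_detail = {
    "sha": "abc123",
    "files": [
        {"filename": "a.py", "status": "modified"},
        {"filename": "b.py", "status": "added"},
    ],
}
print([f["filename"] for f in extract_objects(commit_detail, "files")])
```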

Binary parsing

The BLOB request uses binaryResponse=true to enable binary parsing. The request includes the Accept: application/vnd.github.raw+json header so the GitHub Contents API endpoint (/repos/{owner}/{repo}/contents/{path}) returns raw binary file content instead of a base64-encoded JSON object. With binary response enabled, the connector downloads the raw content and sends it to Fusion’s parser stages.
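For comparison, this is what a client would have to do without the raw media type: the default JSON response from the Contents API carries the file in a base64-encoded content field alongside an encoding field. A minimal sketch of that decoding step, with a hypothetical function name and a hand-made sample payload:

```python
import base64


def decode_contents_payload(payload: dict) -> bytes:
    """Decode the file bytes from a default (JSON) Contents API response."""
    if payload.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {payload.get('encoding')!r}")
    # The encoded content may contain line breaks; b64decode ignores them
    # by default because they are outside the base64 alphabet.
    return base64.b64decode(payload["content"])


# Hand-made sample payload containing the text "hello world".
sample_payload = {"encoding": "base64", "content": "aGVsbG8g\nd29ybGQ=\n"}
print(decode_contents_payload(sample_payload))
```

Requesting the raw media type avoids this step and lets the parser stages receive the file bytes directly.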

Terminology

The following terms are provided as a reference.

Term	Description
List of Requests Configuration	Configure List of Requests to extract data from the REST source. Requests are linked hierarchically using the properties Parent-Child Request Link -> ObjectType and ParentObjectType.
Object Type	The unique name to identify the request.
Parent Object Type	References an existing Object Type and creates a parent-child hierarchy in which the current request becomes the child of the specified Parent Object Type. If blank, the current request is considered a Root Request.
Root Request	The type of request configuration that retrieves the initial parent objects.
Child Request	The type of request configuration that retrieves child objects for each parent object. A child request can be a parent of another child request.
Response Handling	The responseConfiguration defines the mapping between the response and data objects to be indexed.
Data Path	The path to access a specific data object within a response. For GitHub endpoints that return bare JSON arrays, set to an empty string. The COMMIT_DIFF request uses `dataPath=files` to extract the `files` array from the single-commit detail response, creating one document per changed file. This property accepts JsonPath expressions such as `results`, `items[*]`.
Data ID	The identifier key for the data objects extracted with ‘Data Path’. This value is used to build the Solr document’s ID. If not provided, a random UUID is used. This property accepts JsonPath expressions such as `html_url` to extract the unique URL of an object.
Parent Data Key	Required for all Child Requests. Map to a key from the parent object, whose value is used to replace the `${LW_PARENT_DATA_KEY}` variable in the child request configuration (endpoint, query parameters, or body). In the repo template, most children set `parentIdKey=full_name` (required by the connector) but hardcode the `{owner}/{repo}` path in the endpoint.
_lw_rest_object_type_s	All objects index this field, which contains the ‘ObjectType’ of the request that retrieved the object, such as `REPOSITORY`, `PULL_REQUEST`, `COMMIT`, `BRANCH`.
_lw_rest_object_s	All objects index this field, which contains the object ID extracted with the data ID. For example, for a repository, indexes `_lw_rest_object_s: "https://github.com/owner/repo"`. For a pull request, indexes `_lw_rest_object_s: "https://github.com/owner/repo/pull/42"`.
_lw_rest_parent_object_ss	All objects index this field, which contains a list of the object IDs inherited from all their parents, and the object IDs from the object itself. For example, for a pull request, indexes `_lw_rest_parent_object_ss: ["https://github.com/owner/repo", "https://github.com/owner/repo/pull/42"]`.
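The three indexed fields can be pictured as follows. This sketch only mirrors the examples given in the table; the function is hypothetical and does not reproduce how the connector assembles documents internally.

```python
def rest_index_fields(object_type: str, object_id: str, parent_ids: list) -> dict:
    """Assemble the REST connector bookkeeping fields for one object."""
    return {
        "_lw_rest_object_type_s": object_type,
        "_lw_rest_object_s": object_id,
        # IDs inherited from all parents, followed by the object's own ID.
        "_lw_rest_parent_object_ss": [*parent_ids, object_id],
    }


# A pull request whose only parent is the repository.
fields = rest_index_fields(
    "PULL_REQUEST",
    "https://github.com/owner/repo/pull/42",
    ["https://github.com/owner/repo"],
)
print(fields["_lw_rest_parent_object_ss"])
```

Filtering on _lw_rest_parent_object_ss with a repository ID would then match every object crawled beneath that repository.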

Recipe

Replace the following values in the recipe:

pipeline with your Fusion pipeline
collection with your Fusion collection
id with the name of a Fusion datasource if you want to use a different name than the one provided
password with your GitHub personal access token
user with your GitHub username
{add owner here} with the GitHub owner name
{add repo name here} with the GitHub repository name

{
  "parserId": "_system",
  "pipeline": "{add pipeline here}",
  "connector": "lucidworks.rest",
  "coreProperties": {},
  "id": "rest-github-repo",
  "type": "lucidworks.rest",
  "properties": {
    "collection": "{add collection here}",
    "serviceURL": "https://api.github.com",
    "authenticationMode": {
      "basicAuth": {
        "password": "{add personal access token here}",
        "user": "{add github username here}"
      }
    },
    "requestConfigurations": [
      {
        "request": {
          "recursiveRequest": false,
          "linkRequest": {
            "objectType": "REPOSITORY"
          },
          "skipIndexation": false,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}",
            "httpMethod": "GET"
          },
          "responseConfiguration": {
            "dataId": "name",
            "binaryResponse": false,
            "dataPath": ""
          }
        }
      },
      {
        "request": {
          "recursiveRequest": false,
          "linkRequest": {
            "parentObjectType": "REPOSITORY",
            "objectType": "ISSUE"
          },
          "skipIndexation": false,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}/issues",
            "pagination": {
              "paginationByBatchSize": {
                "paginationStopConditionValue": "[]",
                "paginationStopConditionKey": "$",
                "batchSize": 1,
                "indexStart": 1
              }
            },
            "httpMethod": "GET",
            "queries": [
              {
                "queryKey": "per_page",
                "queryValue": "100"
              },
              {
                "queryKey": "page",
                "queryValue": "${LW_INDEX_START}"
              },
              {
                "queryKey": "state",
                "queryValue": "all"
              },
              {
                "queryKey": "filter",
                "queryValue": "all"
              }
            ]
          },
          "responseConfiguration": {
            "dataId": "number",
            "binaryResponse": false,
            "dataPath": "",
            "parentIdKey": "full_name"
          }
        }
      },
      {
        "request": {
          "recursiveRequest": false,
          "linkRequest": {
            "parentObjectType": "REPOSITORY",
            "objectType": "PULL_REQUEST"
          },
          "skipIndexation": false,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}/pulls",
            "pagination": {
              "paginationByBatchSize": {
                "paginationStopConditionValue": "[]",
                "paginationStopConditionKey": "$",
                "batchSize": 1,
                "indexStart": 1
              }
            },
            "httpMethod": "GET",
            "queries": [
              {
                "queryKey": "per_page",
                "queryValue": "100"
              },
              {
                "queryKey": "page",
                "queryValue": "${LW_INDEX_START}"
              },
              {
                "queryKey": "state",
                "queryValue": "all"
              }
            ]
          },
          "responseConfiguration": {
            "dataId": "number",
            "binaryResponse": false,
            "dataPath": "",
            "parentIdKey": "full_name"
          }
        }
      },
      {
        "request": {
          "recursiveRequest": false,
          "linkRequest": {
            "parentObjectType": "REPOSITORY",
            "objectType": "BRANCH"
          },
          "skipIndexation": false,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}/branches",
            "pagination": {
              "paginationByBatchSize": {
                "paginationStopConditionValue": "[]",
                "paginationStopConditionKey": "$",
                "batchSize": 1,
                "indexStart": 1
              }
            },
            "httpMethod": "GET",
            "queries": [
              {
                "queryKey": "per_page",
                "queryValue": "100"
              },
              {
                "queryKey": "page",
                "queryValue": "${LW_INDEX_START}"
              }
            ]
          },
          "responseConfiguration": {
            "dataId": "name",
            "binaryResponse": false,
            "dataPath": "",
            "parentIdKey": "full_name"
          }
        }
      },
      {
        "request": {
          "recursiveRequest": false,
          "linkRequest": {
            "parentObjectType": "BRANCH",
            "objectType": "COMMIT"
          },
          "skipIndexation": false,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}/commits",
            "pagination": {
              "paginationByBatchSize": {
                "paginationStopConditionValue": "[]",
                "paginationStopConditionKey": "$",
                "batchSize": 1,
                "indexStart": 1
              }
            },
            "httpMethod": "GET",
            "queries": [
              {
                "queryKey": "sha",
                "queryValue": "${LW_PARENT_DATA_KEY}"
              },
              {
                "queryKey": "per_page",
                "queryValue": "100"
              },
              {
                "queryKey": "page",
                "queryValue": "${LW_INDEX_START}"
              }
            ]
          },
          "responseConfiguration": {
            "dataId": "sha",
            "binaryResponse": false,
            "dataPath": "",
            "parentIdKey": "name"
          }
        }
      },
      {
        "request": {
          "recursiveRequest": false,
          "linkRequest": {
            "parentObjectType": "COMMIT",
            "objectType": "COMMIT_DIFF"
          },
          "skipIndexation": false,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}/commits/${LW_PARENT_DATA_KEY}",
            "httpMethod": "GET"
          },
          "responseConfiguration": {
            "dataId": "sha",
            "binaryResponse": false,
            "dataPath": "files",
            "parentIdKey": "sha"
          }
        }
      },
      {
        "request": {
          "recursiveRequest": false,
          "linkRequest": {
            "parentObjectType": "REPOSITORY",
            "objectType": "TAG"
          },
          "skipIndexation": false,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}/tags",
            "pagination": {
              "paginationByBatchSize": {
                "paginationStopConditionValue": "[]",
                "paginationStopConditionKey": "$",
                "batchSize": 1,
                "indexStart": 1
              }
            },
            "httpMethod": "GET",
            "queries": [
              {
                "queryKey": "per_page",
                "queryValue": "100"
              },
              {
                "queryKey": "page",
                "queryValue": "${LW_INDEX_START}"
              }
            ]
          },
          "responseConfiguration": {
            "dataId": "name",
            "binaryResponse": false,
            "dataPath": "",
            "parentIdKey": "full_name"
          }
        }
      },
      {
        "request": {
          "recursiveRequest": false,
          "linkRequest": {
            "parentObjectType": "REPOSITORY",
            "objectType": "MILESTONE"
          },
          "skipIndexation": false,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}/milestones",
            "pagination": {
              "paginationByBatchSize": {
                "paginationStopConditionValue": "[]",
                "paginationStopConditionKey": "$",
                "batchSize": 1,
                "indexStart": 1
              }
            },
            "httpMethod": "GET",
            "queries": [
              {
                "queryKey": "per_page",
                "queryValue": "100"
              },
              {
                "queryKey": "page",
                "queryValue": "${LW_INDEX_START}"
              },
              {
                "queryKey": "state",
                "queryValue": "all"
              }
            ]
          },
          "responseConfiguration": {
            "dataId": "title",
            "binaryResponse": false,
            "dataPath": "",
            "parentIdKey": "full_name"
          }
        }
      },
      {
        "request": {
          "recursiveRequest": false,
          "linkRequest": {
            "parentObjectType": "REPOSITORY",
            "objectType": "COLLABORATOR"
          },
          "skipIndexation": false,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}/collaborators",
            "pagination": {
              "paginationByBatchSize": {
                "paginationStopConditionValue": "[]",
                "paginationStopConditionKey": "$",
                "batchSize": 1,
                "indexStart": 1
              }
            },
            "httpMethod": "GET",
            "queries": [
              {
                "queryKey": "per_page",
                "queryValue": "100"
              },
              {
                "queryKey": "page",
                "queryValue": "${LW_INDEX_START}"
              }
            ]
          },
          "responseConfiguration": {
            "dataId": "login",
            "binaryResponse": false,
            "dataPath": "",
            "parentIdKey": "full_name"
          }
        }
      },
      {
        "request": {
          "recursiveRequest": false,
          "linkRequest": {
            "parentObjectType": "REPOSITORY",
            "objectType": "RELEASE"
          },
          "skipIndexation": false,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}/releases",
            "pagination": {
              "paginationByBatchSize": {
                "paginationStopConditionValue": "[]",
                "paginationStopConditionKey": "$",
                "batchSize": 1,
                "indexStart": 1
              }
            },
            "httpMethod": "GET",
            "queries": [
              {
                "queryKey": "per_page",
                "queryValue": "100"
              },
              {
                "queryKey": "page",
                "queryValue": "${LW_INDEX_START}"
              }
            ]
          },
          "responseConfiguration": {
            "dataId": "tag_name",
            "binaryResponse": false,
            "dataPath": "",
            "parentIdKey": "full_name"
          }
        }
      },
      {
        "request": {
          "recursiveRequest": false,
          "linkRequest": {
            "parentObjectType": "REPOSITORY",
            "objectType": "COMMENT"
          },
          "skipIndexation": false,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}/issues/comments",
            "pagination": {
              "paginationByBatchSize": {
                "paginationStopConditionValue": "[]",
                "paginationStopConditionKey": "$",
                "batchSize": 1,
                "indexStart": 1
              }
            },
            "httpMethod": "GET",
            "queries": [
              {
                "queryKey": "per_page",
                "queryValue": "100"
              },
              {
                "queryKey": "page",
                "queryValue": "${LW_INDEX_START}"
              }
            ]
          },
          "responseConfiguration": {
            "dataId": "id",
            "binaryResponse": false,
            "dataPath": "",
            "parentIdKey": "full_name"
          }
        }
      },
      {
        "request": {
          "recursiveRequest": false,
          "linkRequest": {
            "parentObjectType": "REPOSITORY",
            "objectType": "COMMIT_COMMENT"
          },
          "skipIndexation": false,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}/comments",
            "pagination": {
              "paginationByBatchSize": {
                "paginationStopConditionValue": "[]",
                "paginationStopConditionKey": "$",
                "batchSize": 1,
                "indexStart": 1
              }
            },
            "httpMethod": "GET",
            "queries": [
              {
                "queryKey": "per_page",
                "queryValue": "100"
              },
              {
                "queryKey": "page",
                "queryValue": "${LW_INDEX_START}"
              }
            ]
          },
          "responseConfiguration": {
            "dataId": "id",
            "binaryResponse": false,
            "dataPath": "",
            "parentIdKey": "full_name"
          }
        }
      },
      {
        "request": {
          "recursiveRequest": false,
          "linkRequest": {
            "parentObjectType": "REPOSITORY",
            "objectType": "CONTENT"
          },
          "skipIndexation": true,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}/contents",
            "httpMethod": "GET"
          },
          "responseConfiguration": {
            "dataId": "name",
            "binaryResponse": false,
            "dataPath": "",
            "parentIdKey": "path"
          }
        }
      },
      {
        "request": {
          "recursiveRequest": true,
          "linkRequest": {
            "parentObjectType": "CONTENT",
            "objectType": "FOLDER"
          },
          "skipIndexation": true,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}/contents/${LW_PARENT_DATA_KEY}",
            "httpMethod": "GET"
          },
          "responseConfiguration": {
            "dataId": "path",
            "binaryResponse": false,
            "dataPath": "",
            "parentIdKey": "path"
          }
        }
      },
      {
        "request": {
          "recursiveRequest": false,
          "linkRequest": {
            "parentObjectType": "FOLDER",
            "objectType": "BLOB"
          },
          "skipIndexation": false,
          "requestConfiguration": {
            "endpoint": "/repos/{add owner here}/{add repo name here}/contents/${LW_PARENT_DATA_KEY}",
            "httpMethod": "GET",
            "headers": [
              {
                "headerKey": "Accept",
                "headerValue": "application/vnd.github.raw+json"
              }
            ]
          },
          "responseConfiguration": {
            "dataId": "path",
            "binaryResponse": true,
            "dataPath": "",
            "parentIdKey": "path"
          }
        }
      }
    ]
  }
}
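Filling in the placeholders by hand is error-prone because the owner and repository name appear in every endpoint. The following sketch shows one way to apply the replacements programmatically: it parses the recipe first and then substitutes placeholders inside string values, so that special characters in a value cannot break the JSON. The function name, the shortened template, and all replacement values are assumptions made for this example.

```python
import json


def fill_placeholders(node, values: dict):
    """Recursively replace placeholder substrings in every string of a JSON tree."""
    if isinstance(node, dict):
        return {key: fill_placeholders(value, values) for key, value in node.items()}
    if isinstance(node, list):
        return [fill_placeholders(item, values) for item in node]
    if isinstance(node, str):
        for placeholder, replacement in values.items():
            node = node.replace(placeholder, replacement)
        return node
    return node  # numbers, booleans, and null are returned unchanged


# Shortened stand-in for the recipe; the full recipe works the same way.
template = json.loads("""
{
  "pipeline": "{add pipeline here}",
  "properties": {
    "collection": "{add collection here}",
    "requestConfigurations": [
      {"endpoint": "/repos/{add owner here}/{add repo name here}/issues"}
    ]
  }
}
""")

# Example values only.
recipe = fill_placeholders(template, {
    "{add pipeline here}": "my-pipeline",
    "{add collection here}": "my-collection",
    "{add owner here}": "example-owner",
    "{add repo name here}": "example-repo",
})
print(json.dumps(recipe, indent=2))
```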
