Pagination

With version 2020.12 Corteza moved away from offset pagination to the key-based pagination.

Offset-based pagination

From 2020.12 on, this is no longer supported.

Pagination is done with two parameters: page and per page (sometimes offset and limit). When a specific page is requested, a simple formula is used (per page * page) to calculate the offset parameter. This parameter is then used to fetch results from the database.

The database always starts with the 1st row, checks it and if can be returned (offset=0) it skips it. If the offset is 100, this step is repeated 100-times; if the offset is 100k, this step is repeated 100k-times.

Key-based pagination

Also known as pagination with relative cursors.

Cursor values are opaque; changing internal values could lead to anomalies, such as empty results.

We are working with limit, next page, and prev page cursors. The cursor gets calculated from the first and the last item in the returned set of resources. The cursor receives some additional metadata (used indexes and sorting direction).

When fetching the next page, only the items with greater values are returned; when fetching the previous page, only the items with lesser values are returned.

For example, if we sort the data by column-one and id, the cursor for the next page contains values for both fields from the last item; the cursor for the previous page includes values for both fields from the first item.

When we request a limited set of resources (we provide a limit parameter), the system collects as many items as possible to fulfil that limit.

The page cursors are provided in the response:

If the resulting set has fewer items then the requested limit, the next page cursor is omitted.
If the page cursor parameter is not provided, we are accessing the first page, so the previous page cursor is omitted.

In general, all resources are sorted by the primary-key, ascending (this also provides the most optimal performance).

Performance

With key-based pagination (when index are used optimally) response times (for most cases) remain consistent — the access time for the n-th page is the same as the access time for the 1st page.

DevNote provide some references, benchmarks, …

Collecting resources

Resource collecting code is written in the way that it can perform multiple fetches to the database until it satisfies the page. This is needed in cases when application-level filtering (security) removes items from the result set.

This allows us to implement additional security logic on the application level when accessing resources.

Changed approach to page navigation and total item count

With the described changes it’s not possible to jump on an arbitrary page without a known cursor. Page navigation need to be generated and cursors for each page collected. Same for total item count.

We’ve extended Corteza REST API and added two new flags: include page navigation (incPageNavgitation) and include total number of items (incTotal).

When any of these two is enabled, Corteza walks through entire matching dataset of records counts it and/or collects pointers to pages.

Use this carefully on a bigger datasets as it can hurt performance of the entire deployment.