Syncing data

Data sync is the process of updating data on the destination node based on changes to the data on the origin node. The destination node determines which field of the origin node maps to which of its own fields.

Data sync operates directly on the compose services and storage layers, but it is designed so that it can be decoupled and moved out of the main Corteza service.

The two nodes must be paired prior to this. See [rfc:federation:node-pair].

Syncing data

After the two nodes are paired and the structure syncing has finished, we can proceed with the data sync.

Data syncing uses the already established compose services along with their storage layer, which removes the need for any additional storage layer modifications.

Figure 1. Outline of the data sync process on the destination node.

Figure 2. Outline of the data sync process on the origin node.

Destination node requests the information about any data changes: The origin node provides a set of endpoints that the destination node can use to fetch newly created, updated, and deleted data. The destination node provides additional filtering parameters, such as the last sync timestamp, to exclude any unchanged data.

$TOKEN_B is the token generated during the handshake. The destination node uses it to authenticate the user on the origin node (the node that shares the data).

Used variables

# Base url for the federation api
$BASE_URL

# JWT of the user
$JWT

# Node id of the destination node (?exposed) or the origin node (?shared)
$NODE_ID

# Federation module id
$MODULE_ID

# Node B auth token
$TOKEN_B

Example request

curl -X GET "$BASE_URL/federation/exposed/records?after=$AFTER_TIMESTAMP" \
  -H "Authorization: Bearer $TOKEN_B";

Example response

{
    "response": {
        "filter": {
            "query": "after=1600109447",
            "page": 1,
            "perPage": 20,
            "count": 97,
            "deleted": 0
        },
        "set": [
            {
                "type": "GET",
                "rel": "Lead",
                "href": "$BASE_URL/federation/exposed/records/$MODULE_ID?after=1600109447"
            },
            {
                "type": "GET",
                "rel": "Contact",
                "href": "$BASE_URL/federation/exposed/records/$MODULE_ID?after=1600109447"
            }
        ]
    }
}
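The response above lists one endpoint per module. As a rough sketch of how a destination node might read it, the snippet below collects the `href` of every link, keyed by its `rel`. It assumes the payload has exactly the shape shown in the example; the function name `changed_endpoints` is hypothetical and not part of the federation API.

```python
import json


def changed_endpoints(payload):
    """Map each module label (`rel`) to the URL (`href`) that serves its changes."""
    body = json.loads(payload)
    endpoints = {}
    for link in body["response"]["set"]:
        # The example only contains GET links; anything else is skipped defensively.
        if link.get("type") != "GET":
            continue
        endpoints[link["rel"]] = link["href"]
    return endpoints


# Sample payload copied from the example response (placeholders left as-is).
sample = """
{
    "response": {
        "filter": {"query": "after=1600109447", "page": 1, "perPage": 20, "count": 97, "deleted": 0},
        "set": [
            {"type": "GET", "rel": "Lead",
             "href": "$BASE_URL/federation/exposed/records/$MODULE_ID?after=1600109447"},
            {"type": "GET", "rel": "Contact",
             "href": "$BASE_URL/federation/exposed/records/$MODULE_ID?after=1600109447"}
        ]
    }
}
"""

for rel, href in changed_endpoints(sample).items():
    print(rel, href)
```

Keying by `rel` is only a convenience for this sketch; if two modules could share a label, a list of links would be the safer structure.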

Destination node requests changed data: The destination node fetches the data on a per-module basis from the set of endpoints provided above.

$TOKEN_B is the token generated during the handshake. The destination node uses it to authenticate the user on the origin node (the node that shares the data).

Used variables

# Base url for the federation api
$BASE_URL

# JWT of the user
$JWT

# Node id of the destination node (?exposed) or the origin node (?shared)
$NODE_ID

# Federation module id
$MODULE_ID

# Node B auth token
$TOKEN_B

Example request

curl -X GET "$BASE_URL/exposed/modules/$MODULE_ID/records?after=$AFTER_TIMESTAMP" \
  -H "Authorization: Bearer $TOKEN_B";

Example response

{
    "response": {
        "filter": {
            "moduleID": "132954639472525355",
            "query": "",
            "sort": "createdAt DESC",
            "page": 1,
            "perPage": 20,
            "count": 97,
            "deleted": 0
        },
        "set": [
            {
                "recordID": "161135411379307175",
                "moduleID": "132954639472525355",
                "values": [
                    {
                        "name": "name",
                        "value": "John"
                    },
                    {
                        "name": "surname",
                        "value": "Doe"
                    }
                ],
                "createdAt": "2020-09-08T19:56:14Z",
                "updatedAt": "2020-09-09T18:05:33Z"
            },
            {
                "recordID": "161134990657061543",
                "moduleID": "132954639472525355",
                "values": [
                    {
                        "name": "name",
                        "value": "Walter"
                    },
                    {
                        "name": "surname",
                        "value": "White"
                    }
                ],
                "createdAt": "2020-09-08T19:52:03Z",
                "updatedAt": "2020-09-11T19:44:03Z"
            }
        ]
    }
}
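To make the response above easier to work with, the following sketch flattens each record's `values` list into a plain dictionary and estimates how many pages need to be fetched. It rests on two assumptions that the example does not confirm: that `count` is the total number of matching records, and that each field name appears at most once per record. The helper names `flatten_values` and `total_pages` are hypothetical.

```python
import math


def flatten_values(record):
    """Turn the list of {"name", "value"} pairs into a {name: value} dict."""
    # Assumption: a field name occurs once per record; a repeated name
    # would overwrite the earlier value here.
    return {v["name"]: v["value"] for v in record["values"]}


def total_pages(filter_):
    """Number of pages to request, assuming `count` is the total match count."""
    per_page = filter_["perPage"]
    if per_page <= 0:
        return 1
    return max(1, math.ceil(filter_["count"] / per_page))


# Trimmed copy of the example response.
response = {
    "filter": {"page": 1, "perPage": 20, "count": 97, "deleted": 0},
    "set": [
        {
            "recordID": "161135411379307175",
            "values": [
                {"name": "name", "value": "John"},
                {"name": "surname", "value": "Doe"},
            ],
        },
    ],
}

rows = {r["recordID"]: flatten_values(r) for r in response["set"]}
print(rows)                              # {'161135411379307175': {'name': 'John', 'surname': 'Doe'}}
print(total_pages(response["filter"]))   # 5
```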

Origin node provides changed data for the requested structure: The origin node provides a set of changes based on the base filter parameters, such as the timestamp, and on the requested structure, which determines what data the destination node would like to receive. The filtered data, along with the federated structure definitions, is then passed into internal data manipulation systems that transform it into the desired output, such as ActivityPub or JSON (this also removes any fields that are not exposed by the origin node). The response also includes any additional filtering and pagination parameters so the data can be fetched in chunks, or re-fetched if any issues occurred.

Destination node transforms the provided data into internal resource structures: The destination node uses data manipulation systems, along with module mapping definitions (see [rfc:federation:structure-sync:field-mapping]), to transform the provided data set into internal resource structures. These are then used to update the destination node’s storage.

Destination node updates the node’s status: The node’s status is updated to indicate when the last successful data sync occurred.
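The last three steps can be sketched end to end as follows. This is an illustration under stated assumptions, not Corteza's implementation: in-memory lists and dictionaries stand in for the storage layers, and the function names, the `exposed` field set, the `mapping` dictionary, and the `lastSync` key are all hypothetical.

```python
from datetime import datetime, timezone


def parse_ts(value):
    # Timestamps in the example responses look like "2020-09-08T19:56:14Z".
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def origin_changes(records, after, exposed):
    """Origin side: keep records updated after `after`, drop unexposed fields."""
    changes = []
    for rec in records:
        if parse_ts(rec["updatedAt"]) <= after:
            continue  # unchanged since the last sync
        values = [v for v in rec["values"] if v["name"] in exposed]
        changes.append({**rec, "values": values})
    return changes


def destination_apply(changes, mapping, store, status):
    """Destination side: rename fields via `mapping`, upsert, record the sync time."""
    for rec in changes:
        row = {mapping[v["name"]]: v["value"]
               for v in rec["values"] if v["name"] in mapping}
        store[rec["recordID"]] = row
    # Reached only if every record above was applied without raising.
    status["lastSync"] = datetime.now(timezone.utc)


origin_records = [
    {"recordID": "161135411379307175", "updatedAt": "2020-09-09T18:05:33Z",
     "values": [{"name": "name", "value": "John"}, {"name": "surname", "value": "Doe"}]},
    {"recordID": "161134990657061543", "updatedAt": "2020-09-11T19:44:03Z",
     "values": [{"name": "name", "value": "Walter"}, {"name": "surname", "value": "White"}]},
]

last_sync = datetime(2020, 9, 10, tzinfo=timezone.utc)
changes = origin_changes(origin_records, last_sync, exposed={"name"})

store, status = {}, {}
destination_apply(changes, mapping={"name": "first_name"}, store=store, status=status)
print(store)  # {'161134990657061543': {'first_name': 'Walter'}}
```

Updating the status only after all records are applied means a failed sync leaves the previous timestamp in place, so the same window is requested again on the next run.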