DataHub is built on the concept of connections. An authenticated connection is an account that was verified with the cloud-based or network-based platform when the connection was created. A connection can be made using a specific user account (typically an email address) and its corresponding password or, where supported, using OAuth 2.0 authorization, in which a token is generated when the user logs in and grants authorization to DataHub. This authorization allows DataHub to access the user's drive information (files and folders) on the platform.
Connections “connect” to a platform as a specific user account. The user account needs the permissions on the platform to read, write, update, and delete content, according to the actions the DataHub job is to perform. The connector user account should also be set up so that its password does not expire; otherwise, the connection will lose access to the platform until it has been refreshed with the new password.
A connection is made to the source platform, and another connection is made to the destination platform. Next, a job is created to tie the two platforms together. When DataHub connects to a content platform, it does so by using the publicly available Application Programming Interface (API) for the specific platform. This ensures that DataHub is “playing by the rules” for each platform.
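The relationship between connections and jobs can be pictured with a small data model. This is only an illustrative sketch, not DataHub's actual schema or API: the names `Connection`, `Job`, `permissions`, `source_actions`, `destination_actions`, and `missing_permissions` are all hypothetical. It shows a job tying a source connection to a destination connection and checking that each connection's account has the permissions the job's actions require.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Connection:
    """Hypothetical model of a connection: one account on one platform."""
    platform: str
    account: str
    permissions: FrozenSet[str]  # e.g. {"read", "write", "update", "delete"}


@dataclass(frozen=True)
class Job:
    """Hypothetical model of a job tying a source to a destination."""
    name: str
    source: Connection
    destination: Connection
    source_actions: FrozenSet[str]       # actions the job performs on the source
    destination_actions: FrozenSet[str]  # actions the job performs on the destination

    def missing_permissions(self) -> Dict[str, FrozenSet[str]]:
        """Return the actions each side needs but its account lacks."""
        return {
            "source": self.source_actions - self.source.permissions,
            "destination": self.destination_actions - self.destination.permissions,
        }


# Example with made-up platforms and accounts.
source = Connection("platform-a", "reader@example.com", frozenset({"read"}))
destination = Connection("platform-b", "writer@example.com", frozenset({"read"}))
job = Job(
    name="copy-job",
    source=source,
    destination=destination,
    source_actions=frozenset({"read"}),
    destination_actions=frozenset({"write"}),
)
gaps = job.missing_permissions()  # destination account lacks "write"
```

In this sketch, an empty set on both sides means the accounts behind the two connections have everything the job needs.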
Creating a Connection
Creating a connection in the DataHub Platform user interface is easy! Simply add a connection, select your platform, and enter the requested information. DataHub will securely validate your credentials and connect to your source content.
When transferring data between a source and a destination, a number of factors can limit the transfer speed. Most cloud providers enforce rate limits that reduce the transfer rate. If those limits are account-based and the platform supports impersonation, DataHub can create a pool of accounts and issue commands in round-robin order across all the accounts connected to the pool. Any modifications to the connection pool will be used on the next job run.
For example, if a connection pool has two accounts, all commands will be alternated between them. If a third account is added to the pool, the next run of the job will use all three accounts.
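The round-robin behavior described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not DataHub's implementation: `ConnectionPool`, `add_account`, `start_job_run`, and `assign_commands` are hypothetical names, and accounts are represented as plain strings. The sketch assumes each job run takes a snapshot of the pool when it starts, which is one way to get the "changes apply on the next job run" behavior.

```python
from itertools import cycle
from typing import Iterator, List, Sequence, Tuple


class ConnectionPool:
    """Hypothetical pool of accounts used in round-robin order."""

    def __init__(self, accounts: Sequence[str]) -> None:
        if not accounts:
            raise ValueError("a pool needs at least one account")
        self._accounts: List[str] = list(accounts)

    def add_account(self, account: str) -> None:
        # Takes effect only for job runs started after this call.
        self._accounts.append(account)

    def start_job_run(self) -> Iterator[str]:
        # Snapshot the current accounts so later pool changes
        # do not affect a run that is already in progress.
        return cycle(tuple(self._accounts))


def assign_commands(pool: ConnectionPool, commands: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair each command with the next account in the rotation."""
    rotation = pool.start_job_run()
    return [(command, next(rotation)) for command in commands]


pool = ConnectionPool(["account-1", "account-2"])
first_run = assign_commands(pool, ["cmd-1", "cmd-2", "cmd-3"])

pool.add_account("account-3")
second_run = assign_commands(pool, ["cmd-1", "cmd-2", "cmd-3"])
```

With two accounts, `first_run` alternates between them; after the third account is added, `second_run` spreads the same commands across all three.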
|Connection pooling is not supported with "My Computer" and Network File Share (NFS) connections.|