
Box to Syncplicity

DataHub is an enterprise data integration platform that enables organizations to maximize business value and productivity from their content.  

It connects disparate storage platforms and business applications together, allowing organizations to move, copy, synchronize, gather and organize files as well as their related data across any system. 

DataHub empowers your users with unified access to the most relevant, complete and up-to-date content – no matter where it resides.    

DataHub delivers a user-friendly web-based experience that is optimized for PC, tablet and mobile phone interfaces—so you can monitor and control your file transfers anywhere, from any device.   

DataHub’s true bi-directional hybrid/sync capabilities enable organizations to leverage and preserve content across on-premises systems and any cloud service. Seamless to users, new files/file changes from either system are automatically reflected in the other.  

How does DataHub Work? 

Cloud storage and collaboration platforms continue to be the driving force of digital transformation within the enterprise. However, users also need ready access to the content that resides within your existing network file systems, ECM, and other storage platforms so they can be productive wherever they are. DataHub is purpose-built to provide boundless enterprise content integration possibilities: the platform is 100% open and provides a highly scalable architecture that enables enterprises to easily meet evolving technology and user demands, no matter how complex.

The DataHub platform provides: 

  • A low risk approach to moving content to the cloud while maintaining on-premises systems

  • No impact to users, IT staff, business operations or existing storage integrations

  • Ability to extend cloud storage anywhere/any device capabilities to locally-stored content

  • Easy integration of newly acquired business storage platforms into existing infrastructures

The Engine 

DataHub’s bi-directional synchronization engine enables your enterprise to fully integrate and synchronize your existing on-premises platforms with any cloud service.

It empowers your users to freely access the content they need while IT staff maintains full governance and control. DataHub integrates with each system's published Application Programming Interface (API) at the deepest level—optimizing transfer speeds and preserving all file attributes.

Security 

DataHub’s 100 percent security-neutral model does not incorporate or use any type of proxy cloud service or other intermediary presence point. All content and related data are streamed directly via HTTPS (256-bit encryption) from the origin to the destination system(s). Additionally, DataHub works with native database encryption.

Analyzer - Simulation Mode

The DataHub analyzer is a powerful enterprise file transfer simulation that eliminates the guesswork. It gives you granular insight into your entire content landscape, including its structure, how your files are used, how old they are, what types they are, what their metadata contains, and more, no matter where the files are located—whether in local storage, remote offices or on user desktops.

Simulation mode allows you to create a job with all desired configuration options set and execute it as a dry-run. In this mode, no data will actually transfer, no permissions will be set, no changes will be made to either the source or the destination. This can be useful in answering several questions about your content prior to actually running any jobs against your content.

Features and Functionality 

The DataHub Platform provides complete integration and control over:

  • User accounts 

  • User networked home drives 

  • User and group permissions 

  • Document types, notes, and file attributes 

  • Timestamps 

  • Versions 

  • Departmental, project, and team folders 

  • Defined and custom metadata 

Architecture and Performance 

The DataHub platform is built upon a pluggable, content-streaming architecture that enables highly automated file/data transfer and synchronization capabilities for up to billions of files. File bytes stream in from one connection endpoint (defined by the administrator), across a customer-owned and operated network and/or cloud service, and then stream out to a second connection endpoint. Content can also flow bidirectionally across two endpoints, rather than solely from an "origin" to a "destination".
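The following minimal sketch illustrates the streaming model described above: bytes are read from the origin endpoint in fixed-size chunks and written straight through to the destination endpoint, never landing on intermediate storage. The endpoint objects are illustrative stand-ins for platform connectors, not DataHub APIs.

    # Minimal sketch of chunked pass-through streaming (illustrative only).
    from typing import BinaryIO

    CHUNK = 8 * 1024 * 1024  # stream in 8 MiB chunks

    def stream(origin: BinaryIO, destination: BinaryIO) -> None:
        # Read from the origin until EOF, writing each chunk straight out.
        for chunk in iter(lambda: origin.read(CHUNK), b""):
            destination.write(chunk)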

Supported Features

The DataHub Platform Comparison tool allows you to compare platform features and technical details to determine which are supported for your transfer scenario.

Viewing the Platform Comparison results for your integration displays a list of each platform's features and provides insight, early in the integration planning process, into which details may need further investigation.

The Platform Comparison tool is available via the Connections > Platforms menu options.

Connection Setup

DataHub is built on a concept of connections.  A connection is made to the source platform and then another connection is made to the destination platform. A job is created to tie the two platforms together. 

When DataHub connects to a content platform, it does so by using the publicly available Application Programming Interface (API) for the specific platform.  This ensures that DataHub is “playing by the rules” for each platform. 

Connections “connect” to a platform as a specific user account. The user account requires the proper permissions to the platform to read/write/update/delete the content, according to what actions the DataHub job is to perform. 

The connection user account should also be set up so that the password does not expire, otherwise the connection will no longer be able to access the platform until the connection has been refreshed with the new password. 

Most connections require a specific user account and its corresponding password.  The user account is typically an email address. 

Authenticated Connections

Authenticated connections are accounts that have been verified with the cloud-based or network-based platform when created. The connection can be user/password-based or established through an OAuth2 flow, where a token is generated when a user logs in and grants authorization to DataHub. This authorization allows DataHub to access the user's drive information (files and folders) on the platform. These connections are used as the source or the destination authentication to transfer your content.

OAuth2 Interactive (Web) Flow

  • Connectors such as Box, Google Drive and Dropbox use the OAuth2 interactive (or web) flow

OAuth2 Client Credentials Flow

  • Connections such as Syncplicity and GSuite use the OAuth2 client credentials flow
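As a rough illustration of the client credentials flow, the sketch below requests a token directly with an app key and secret, with no interactive login. The token URL and parameter names are generic OAuth2 placeholders, not a specific platform's endpoint; consult each platform's API documentation for the actual values.

    # Hedged sketch of an OAuth2 client credentials token request.
    import requests

    TOKEN_URL = "https://example.com/oauth/token"  # placeholder endpoint

    def get_access_token(client_id: str, client_secret: str) -> str:
        resp = requests.post(
            TOKEN_URL,
            data={
                "grant_type": "client_credentials",
                "client_id": client_id,          # the app key
                "client_secret": client_secret,  # the app secret
            },
        )
        resp.raise_for_status()
        return resp.json()["access_token"]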

SharePoint

  • SharePoint (all versions, CSOM) uses a custom username/password authentication model

OAuth2 Interactive (Web) Flow

You will need the following information when creating a connection to Network File System, Box, Dropbox and Dropbox for Business:

  • A name for the connection

  • The account User ID, such as jsmith@company.com

  • The password for the User ID


Creating a Connection

Creating a connection in the DataHub Platform user interface is easy! Simply add a connection, select your platform and enter the requested information. DataHub will securely validate your credentials and connect to your source content.

 

Create a Box Connection

DataHub connections to the Box platform can be made by using a standard account or with a Box Service Account. There are several ways to transfer content; this document outlines each of those situations and how to configure them.

Please refer to Box Service Account for information regarding how to create a Box Service Account connection.

Create Connection - DataHub Application User-Interface

  1. Select Connections > Add connection.

  2. Select Box as the platform on the Add connection modal.

  3. Enter the connection information. Reference the table below for details about each field.

  4. Select Sign in with Box and enter the Email Address and Password required to log on to the Box account. 

  5. Select Grant access to Box when prompted to authorize the connection.

  6. You will see a "Connection test succeeded" message on the Add connection modal. (If you don't see this message, repeat the sign in and authorization steps above.)

  7. Select Done to finish creating the connection. 

Field reference:

  • Display as (Optional): Enter the display name for the connection. If you will be creating multiple connections, ensure the name readily identifies the connection. The name displays in the application, and you can use it to search for the connection and filter lists. If you do not add a display name, the connection will automatically be named using the account owner's name, for example, Box (John Doe). If it will be useful for you to reference the connection by account, you should use the default name.

  • User type (Required): Choose one of the following:

      - Standard user: Select this option if the Box account credentials you will supply provide access to one individual account.

      - Account administrator: Select this option if the Box account credentials you will supply are for an administrator (or co-administrator) that has access to all accounts in the organization. This option is often used along with impersonation to simplify transferring multiple user accounts.

  • Platform API client credentials (Required): Choose one of the following:

      - Use the system default client credentials: Select this option to use the default DataHub client application.

      - Use custom client credentials: Select this option to use custom client credentials provided by your administrator. When selected, two additional fields will be available to enter the credentials. Your administrator can obtain the credentials by following Box Documentation - Setting up a JWT app.

  • Client ID (Optional): Displays only when you select Use custom client credentials. This value will be provided by your administrator.

  • Client Secret (Optional): Displays only when you select Use custom client credentials. This value will be provided by your administrator.

Features and Limitations 

Platforms all have unique features and limitations. DataHub’s transfer engine manages these differences between platforms and allows you to configure actions based on Job Policies and Behaviors. Utilize the Platform Comparison tool to see how your integration platforms may interact regarding features and limitations. 

Files and Folders

Below is a list of Box's supported and unsupported features as well as additional file/folder restrictions.

Supported Features

  • Version preservation

  • Timestamp preservation

  • Author/Owner preservation

  • File lock propagation

  • Mirror lock ownership

  • Account map

  • Group map

  • Permission preservation

  • User impersonation

  • Metadata map

  • Tags map

Unsupported Features

  • Path length maximum

  • Restricted types

Other Features/Limitations

  • Invalid characters: \  /

  • File size maximum: 15 GB

  • Segment path length: 255

  • No leading spaces in file names or folder names

  • No trailing spaces in folder names, file names, or file extensions

  • No non-printable characters (DataHub does not filter out non-printable ASCII characters)

  • Box has download limitations for the number of folders and files contained in one folder. Please consult Box documentation for further details.

  • Box accounts that do not have administrator-level access cannot remove group permissions on files during a job transfer.

  • Google document types natively created on Box can be moved and will maintain formatting. However, they will have the native Google file extensions (.gdoc, .gsheet, etc.).

  • The maximum tag size in Box is 255 characters. You can enter more characters than the maximum, but Box will truncate the tag to 255 characters.

Box Notes

When transferring information from Box to SharePoint Online, Box Note files will transfer; however, edits to Box Notes will not be picked up, and manual intervention is required. Below is an outline of how Box Notes are handled on the initial and subsequent job runs. 

First Job Run

  • Box Note files transfer to SharePoint Online as expected, preserving the extension.

  • The file version history transferred to the destination reflects every edit that occurred on the Box Note file in the Box source application.

Second Job Run

  • If changes are made to the Box Note file on the source in between job runs, the new version is not transferred to the destination.

Resolution

  • Perform a soft job reset and run the job again. This will re-evaluate the new edits/versions that occurred on the source Box Note file. (See Job | Reset Options for information on how to reset jobs.)

Box Enterprise Plus

Box Enterprise Plus offers a maximum file size upload limit of 32 GB. DataHub requests the maximum file size limit from Box; there are no artificial limits placed by DataHub. When creating any Box connection, DataHub will evaluate the user profile and retrieve the max_upload_size parameter.
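For reference, the sketch below shows how a client can ask Box for the account's maximum upload size, which is the user-profile value DataHub evaluates. It assumes a valid Box access token; the endpoint and field follow Box's published API, but treat the details as illustrative rather than DataHub's internal implementation.

    # Sketch: query Box for the current user's maximum upload size.
    import requests

    def box_max_upload_size(access_token: str) -> int:
        resp = requests.get(
            "https://api.box.com/2.0/users/me",
            params={"fields": "max_upload_size"},
            headers={"Authorization": f"Bearer {access_token}"},
        )
        resp.raise_for_status()
        return resp.json()["max_upload_size"]  # value in bytes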

Owner Permissions

DataHub doesn’t expose owner permissions when migrating from Box. When the account running the job is the owner of the content but the user map between that account and the destination account doesn't match, DataHub won't grant privileges to the audit trail creator, so the owner will not be able to access the content.

Server Time

When installing DataHub, you must ensure the time on the server running DataHub is set to the same time as the Box platform or, preferably, a minute or two behind. The Box platform uses UNIX time; you can find the current UNIX timestamp by visiting https://www.unixtimestamp.com/ and compare it with the time on the DataHub server.

Your Box connection from DataHub will fail to make a successful connection if the DataHub server time is ahead of the Box platform time because the access token will be expired by the time it is returned from Box.
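A quick way to check for this condition is to compare the DataHub server's clock against a trusted UNIX time source. The sketch below uses worldtimeapi.org as one such source; any trusted reference (an NTP server, for example) works equally well, and the zero-skew threshold is an assumption.

    # Sketch: detect whether the local server clock is ahead of UNIX time.
    import time
    import requests

    def clock_skew_seconds() -> float:
        resp = requests.get("https://worldtimeapi.org/api/timezone/Etc/UTC")
        resp.raise_for_status()
        reference = resp.json()["unixtime"]   # current UNIX time
        return time.time() - reference        # positive => local clock ahead

    if clock_skew_seconds() > 0:
        print("Warning: server clock is ahead; Box tokens may arrive expired.")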

 

Create a Syncplicity Connection

The Syncplicity connector in DataHub allows you to analyze, migrate, copy, and synchronize files from your Syncplicity service to cloud storage repositories and on-premises network file shares. DataHub connections to Syncplicity require OAuth 2.0 access. In order to create a connection from DataHub to Syncplicity, you will need to complete configuration on the Syncplicity side and provide several pieces of authentication information.

Create a Syncplicity Connection

  1. Select Connections > Add connection.

  2. Select Syncplicity as the platform on the Add connection modal.

  3. Enter the connection information. Reference the table below for details about each field.

  4. Test the connection to ensure DataHub can connect using the information entered.

  5. Select Done.

Field reference:

  • Display as (Required): User-defined text field. Enter the display name for the connection. If you will be creating multiple connections, ensure the name readily identifies the connection. The name displays in the application, and you can use it to search for the connection and filter lists.

  • Application token (Required): Provided by your Syncplicity administrator. Each user can provision a personal application token, which may be used to authenticate in UI-less scenarios via API. This is especially useful for account administration tasks that run in a headless session. If provisioned, an application token is the only information required to log in a user using the OAuth 2.0 resource owner credentials grant. You should protect this token.

  • App key (Required): Provided by your Syncplicity administrator. The identifier of the third-party application as defined by OAuth 2.0.

  • App secret (Required): Provided by your Syncplicity administrator. The secret (password) of the third-party application as defined by OAuth 2.0. Used with an app key to authenticate a third-party application.

  • New SyncPoint type (Optional): Syncpoint type choice. This option instructs DataHub as to what type of folder should be created when a top-level folder is created through a DataHub process.

Features and Limitations

Platforms all have unique features and limitations. DataHub’s transfer engine manages these differences between platforms and allows you to configure actions based on Job Policies and Behaviors. Utilize the Platform Comparison tool to see how your integration platforms may interact regarding features and limitations. 

Supported Features

  • Version preservation

  • Timestamp preservation

  • Author/Owner preservation

  • Account map

  • Group map

  • Permission preservation

  • User impersonation

Unsupported Features

  • File lock propagation

  • Mirror lock ownership

  • File size maximum

  • Path length maximum

  • Restricted types

  • Metadata map

  • Tags map

Other Features/Limitations

  • Segment path length: 260

  • No leading spaces in file names or folder names

  • No trailing spaces before or after file extensions

  • No non-printable ASCII characters

  • Invalid characters: \  /  <  >

  • Only syncpoints can be shared with other users and have permissions persist.

  • Users with a large number of syncpoints are not supported by Syncplicity.

If you are creating a new impersonation job with a Syncplicity connection and the source or destination location appears empty, the user you are impersonating has too many syncpoints. You will need to delete syncpoints before you can create the job.

Connection Pooling

When transferring data between a source and destination, there are a number of factors that can limit the transfer speed. Most cloud providers have rate limitations that reduce the transfer rate, but if those limits are account-based and the platform supports impersonation, DataHub can create a pool of accounts and issue commands in a round-robin format across all of the accounts connected to the pool. Any modifications to the connection pool will be used on the next job run.

For example, if a connection pool has two accounts, all commands will be alternated between them. If a third account is added to the pool, the next run of the job will use all three accounts.
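The sketch below illustrates the round-robin behavior described above: each command rotates to the next account in the pool. The account names and command strings are placeholders, not DataHub APIs.

    # Sketch of round-robin command distribution across pooled accounts.
    from itertools import cycle

    accounts = ["svc-account-1", "svc-account-2", "svc-account-3"]
    pool = cycle(accounts)  # endless round-robin iterator

    def issue_command(command: str) -> None:
        account = next(pool)  # each call rotates to the next account
        print(f"{command} issued as {account}")

    for cmd in ["upload a.txt", "upload b.txt", "upload c.txt", "upload d.txt"]:
        issue_command(cmd)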

Not Supported:

  • "My Computer" and Network File Share (NFS) connections are not supported with Connection Pooling.

User & Group Maps 

A user account or group map provides the ability to explicitly associate users and groups for the purposes of setting ownership and permissions on items transferred.  These mappings can happen automatically using rules or explicitly using an exception.  Accounts or groups can be excluded by specifying an exclusion, and unmapped users can be defaulted to a known user.  

Here are a few things to consider when creating an account or group map: 

  • A source and destination connection are required and need to match the source and destination of the job that will be referencing the user or group map. 

  • A map can be created before or during the creation of the job. 

  • A map can be used across multiple jobs. 

  • If a map is updated, the updates will not be reapplied to content that has already been transferred. 

User & Group Map Import Templates

Please see Account Map / Group Map | CSV File Guidelines for map templates and sample downloads.

User & Group Map Exceptions

A user or group map exception provides the ability to explicitly map a specific user from one platform to another.  These are exceptions to the automatic account or group mapping policies specified.  User account or group map exceptions can be defined during the creation of the map or can be imported from a comma-separated values (CSV) file. 

User & Group Map Exclusions

A user or group map exclusion provides the ability to explicitly exclude an account or group from owner or permissions preservation.  User account or group map exclusions can be defined during the creation of the map or can be imported from a comma-separated values (CSV) file.  
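As a rough illustration, the sketch below reads a CSV of explicit user-map exceptions into a source-to-destination lookup. The two-column layout (source, destination) is an assumption made for illustration; see Account Map / Group Map | CSV File Guidelines for the actual templates.

    # Sketch: load user-map exceptions from a CSV file (columns assumed).
    import csv

    def load_user_map(path: str) -> dict:
        # Example file contents (hypothetical):
        #   source,destination
        #   jsmith@company.com,john.smith@company.com
        with open(path, newline="") as f:
            return {row["source"]: row["destination"] for row in csv.DictReader(f)}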

Transfer Planner 

 At the start of a project, it is common to begin planning with questions like "How long should I expect this to take?" 

Transfer Planner allows you to outline the basic assumptions of any integration, primarily around the initial content copy at the beginning of a migration or first synchronization.  It uses basic assumptions to begin visualization of the process, without requiring any setup of connections or jobs. 

The tool estimates and graphs a time line to complete the transfer based on the information entered in the Assumptions area. The time line assumes a start date of today and uses the values in the Assumptions section to model the content transfer.  

The Transfer Planner automatically recalculates the predicted time line if you change any of the values, making simple “what if?” scenario evaluations easy. Press Reset to restore the tool's default values.

The window displays projected Total Transfer in dark blue and Daily Transfer Rate in light blue. Hovering the mouse pointer over the graph displays estimated transfer details for that day. 

You can see the impact on the project timeline by changing the values in the Assumptions area. The graph will redraw to reflect your new values. 

Note that the Transfer Planner is primarily driven by the amount of data needing to be processed. DataHub has various tools for transferring versions of files (if the platform supports this feature), which can increase the size of your data set. It also has the ability to filter out specific files by their type or by other rules you set. At this stage, a rough estimate of total size is recommended, as it can be refined later using Simulation Mode.
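At its core, the planner's estimate is simple arithmetic: total data volume divided by the assumed daily transfer rate. The sketch below shows that calculation with example assumption values only.

    # Sketch of the Transfer Planner arithmetic (example values only).
    total_gb = 5_000        # assumed size of the content set
    daily_rate_gb = 250     # assumed transfer throughput per day

    days = total_gb / daily_rate_gb
    print(f"Estimated initial copy: about {days:.0f} days")  # ~20 days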

Simulation Mode

Simulation mode allows you to create a job with all desired configuration options set and execute it as a dry run. In this mode, no data will actually transfer, no permissions will be set, and no changes will be made to either the source or the destination.

This can be useful in answering several questions about your content prior to actually running any jobs against your content.

How much content do I have?

  • An important first step in any migration is to determine how much content you actually have, as this can help in determining how long a migration will take.

What kinds of content do I have?

  • Another important step in any migration is to determine what kinds of content you actually have.

  • Many organizations have accumulated a lot of content and some of that may not be useful on the desired destination platform.

  • The results of a simulation mode job can help you determine if you should introduce any filter rules to narrow the scope of the job.

  • An example would be excluding executable files (.exe or .bat files) or files more than 3 years old.

What kinds of issues should I expect to run into?

  • During the course of a migration, there are many things to consider and unknown issues that can arise, many of which will only present themselves once you start doing something with the source and destination.

  • Running a job in simulation mode can help you identify some of those issues before you actually start transferring content.

Examples can include:

  • Are my user mappings configured correctly?

  • Does the scope of the job capture everything that I expected it to capture?

  • Do I have files that are too large for the destination platform?

  • Do I have permissions that are incompatible with the destination platform (e.g., ACL vs. waterfall)?

  • Do I have files or folders that are too long or contain invalid characters that the destination platform will not accept?

Create a Simulation Job

The last stage of the job creation workflow, just before the job is created, includes an option to enable simulation mode.

When a job is in simulation mode, it can be run and scheduled like any other job, but no data will be transferred. 

Transition a Simulation Job to Transfer Content

After review, a simulation job can be transitioned to a live job that will begin to transfer your content to the destination platform.

Create a Job


DataHub uses jobs to perform specific actions between the source and destination platforms. The most common job types are copy and sync; please see Create New Job | Transfer Direction for more information.

All jobs can be configured to run manually or on a defined schedule. This option will be presented as the last configuration step.

To create a job, select the Jobs option from the left menu and click on Create Job. DataHub will lead you through a wizard to select all the applicable options for your scenario.

The main job creation steps include:

  • Selecting a Job Type

  • Configuring Locations

  • Defining Transfer Policies

  • Defining Job Transfer Behaviors

  • Advanced Options

  • Summary | Review, Create Job, and Schedule

Job Type 

Job type defines the kind of job and the actions the job will perform with the content. There are two main job types available: Basic Transfer and Folder Mapping.

Basic Transfer - Transfer items between one connection and another

This will copy all content (files and folders) from the source to the destination. Each job run will detect any new content on the source and copy it to the destination.

For more information, please see Create New Job | Transfer Direction.

Define Source & Destination Locations

All platform connections made in the DataHub Platform application will be available in the locations drop-down lists when creating a job. 

  • If your connections were created with Administrative privileges, you may also have the ability to impersonate another user within your organization.

  • Source defines the location of your current content you wish to transfer.

  • Destination defines the location of where you would like your content to go.

Configuring Your Locations - Impersonation

Impersonation allows a site admin access to all the folders on the site, including those that belong to other users. With DataHub, a job can be set up using the username and password of the site admin to sync/migrate/copy files to or from a different user's account without ever having the username or password of that user.


To configure impersonation:

  1. Enable Run as user...

  2. Choose the source user.

Job Category

The category function allows for the logical grouping of jobs for reporting and filtering purposes.  The category is optional and does not alter the job function in any way.

DataHub comes with two default job categories:

Maintenance: DataHub maintenance jobs only. This category allows you to view the report of background maintenance jobs and is not intended for newly created transfer jobs.

Default: When a category is not defined during job creation, the job will automatically be given the Default category. This option allows you to create a report for all jobs to which a custom category was not assigned.

Create Job Category

Enable the feature, then select from existing job categories or create a new category. From the jobs grid, you can filter by category.

Job Policies

Define what should happen once items have been successfully transferred and set up rules around how to deal with content as it is updated on your resources while the job is running.

  • DataHub works on the concept of “deltas,” where the transfer engine only transfers files after they have been updated (see the sketch after this list).

  • File version conflicts occur when the same file on the source and destination platforms have been updated in between job executions.

  • Policies define how DataHub handles file version conflicts and whether or not it persists a detected file deletion. 

  • Each job has its own policies defined and the settings are NOT global across all jobs.
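As a minimal sketch of the delta concept, the snippet below transfers a file only when its modified time is newer than what the previous run recorded. The state store and field names are illustrative, not DataHub internals.

    # Sketch: transfer only items that changed since the last run.
    last_run_state = {"report.docx": 1700000000.0}  # path -> mtime at last run

    def needs_transfer(path: str, mtime: float) -> bool:
        # New or updated items have an mtime newer than the recorded one.
        return mtime > last_run_state.get(path, 0.0)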

Conflict Policy - File Version Conflicts

When a conflict is detected on either the source or the destination, Conflict Policy determines how DataHub will behave.

For more information, please see Conflict Policy.

Delete Policy - Deleted Items

When a delete is detected on either the Source or the Destination, Delete Policy determines how DataHub will behave. 

For more information, please see Delete Policy.

Behaviors

Behaviors determine how this job should execute and what course of action to take in different scenarios. All behaviors are enabled by default as recommended settings to ensure content is transferred successfully to the destination.

Zip Unsupported Files / Restricted Content

Enabling this behavior allows DataHub to compress any file that is not supported on the destination into a .zip archive before it is transferred. This is done instead of flagging the item for manual remediation and halting the transfer of the file.

For example, if you attempt to transfer the file "db123.cmd" from a Network File Share to SharePoint, DataHub will compress the file to "db123.zip" before transferring it over, avoiding an error message. 
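A minimal sketch of that behavior follows, using the file names from the example above; the renaming rule shown is an assumption for illustration.

    # Sketch: compress an unsupported file into a .zip before transfer.
    import zipfile

    def zip_for_transfer(path: str) -> str:
        archive = path.rsplit(".", 1)[0] + ".zip"  # db123.cmd -> db123.zip
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            zf.write(path)
        return archive  # transfer this instead of the original file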

Allow unsupported file names to be changed

The Segment Transformation policy controls whether DataHub can change folder and file names to comply with the destination platform's restrictions.

Enabling this behavior allows DataHub to change the names of folders and files that contain characters that are not supported by the destination before transferring the file. This will be done instead of flagging the file for manual remediation and preventing it from being transferred.

When this occurs, the unsupported character will be transformed into an underscore.

For example, if you attempted to transfer the file "Congrats!.txt" from Box to NFS, it would be transformed to "Congrats_.txt" and appear that way on the destination.
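The sketch below shows the kind of substitution described here. The set of invalid characters is illustrative (it includes the '!' from the example); each destination platform defines its own restricted set.

    # Sketch: replace characters the destination rejects with underscores.
    import re

    INVALID = re.compile(r'[\\/:*?"<>|!]')  # illustrative character set

    def transform_segment(name: str) -> str:
        return INVALID.sub("_", name)

    print(transform_segment("Congrats!.txt"))  # -> Congrats_.txt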

Preserve file versioning between locations

DataHub will preserve and transfer all versions of a file on supported platforms.

Advanced

These optional job configurations determine what features you want to preserve, filter or add during your content transfer. 

Filtering

Filtering defines rules for determining which items are included or excluded during transfer. For more information, please see Job Filters; a sketch of how such rules combine follows the list below.

Job Filters | Filter By Name Pattern

Job Filters | Filter By Extensions or Type

Job Filters | Filter By Size

Job Filters | Filter By Date Range or Age

Job Filters | Filter by Metadata

Job Filters | Metadata Conjunctions
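To make the filter concepts concrete, the sketch below combines name-pattern, extension, size, and age rules into a single include/exclude decision. The item fields and thresholds are illustrative assumptions, not DataHub configuration.

    # Sketch: an item transfers only if it passes every enabled filter rule.
    import fnmatch
    import time

    THREE_YEARS = 3 * 365 * 24 * 3600  # age threshold in seconds

    def included(item: dict) -> bool:
        if fnmatch.fnmatch(item["name"], "~$*"):             # name pattern
            return False
        if item["name"].lower().endswith((".exe", ".bat")):  # extension/type
            return False
        if item["size"] > 15 * 1024**3:                      # size cap
            return False
        if time.time() - item["modified"] > THREE_YEARS:     # date range/age
            return False
        return True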

Permission Preservation

This setting enables DataHub to determine how permissions are transferred across platforms.

Permissions | Author / Owner Preservation

Permissions | Permissions Preservation

Permissions | Permissions Import

Permissions | Preserve Shared Links

Metadata Mapping

Metadata mapping allows you to document your source metadata in CSV format and map how you want it applied to the destination. Enabling this feature will offer the ability to import the CSV file and apply it during job creation.

For more information, please see Metadata Import.

Scripting

Some DataHub features are not yet available in the user interface. The scripting feature allows the advanced DataHub user to access advanced transfer features by inserting JSON-formatted job controls.

Enabling this option will allow you to leverage these features and apply them during job creation.
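For illustration only, the sketch below composes a JSON job-control payload in Python. The keys shown are hypothetical placeholders, not documented DataHub controls; consult DataHub support for the controls that apply to your scenario.

    # Sketch: compose JSON-formatted job controls (keys are hypothetical).
    import json

    job_controls = {
        "transfer": {
            "batch_size": 500,          # hypothetical control
            "preserve_versions": True,  # hypothetical control
        },
    }
    print(json.dumps(job_controls, indent=2))  # paste into the Scripting option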

Job Summary - Review your job configuration

Before you create your job, review all your configurations and adjust as needed. Modifying your job after creation is not supported; however, the option to duplicate your current job will allow you to make any adjustments without starting from the beginning. 

  • The Edit option will take you directly to the configuration to make changes.

Define Job Schedule

During job creation, the final step is to define when the job will run and what criteria will define when it stops. 

  • Save job will launch the job scheduler.

  • Save job and run it right now will trigger the job to start immediately. It will run every 15 minutes after the last execution completes.

Schedule Stop Policies 

Stop policies determine when a job should stop running.  If none of the stop policies are enabled, a scheduled job will continue to run until it is manually stopped or removed. 

The options for the stop policy are: 

Stop after a number of total runs

The number of total executions before the job will move to "complete" status

Stop after a number of runs with no changes

The job has run and detected no further changes; all content has transferred successfully. 

If new content is added to the source and the job runs again, this will not increment your stop policy count. However, job executions that detect no changes do not need to be consecutive to increment your stop policy count.

Stop after a number of failures

Most failures are resolved through automatic retries. If the retries fail to resolve the failures, then manual intervention is required. This policy takes the job out of rotation so that the issue can be investigated. 

Job executions that detect failures do not need to be consecutive to increment your stop policy count.

Stop after a specific date

The job will "complete" on the date defined

Reports

Reporting is paramount with the DataHub Platform. Whether you choose to utilize the DataHub manager application, CLI, or ReST API, reporting options are available to help you manage and surface data about your content in real time.

 Out-of-the-box reports include: 

  • Dashboard: Provides an overview of what is happening across all your content

  • Job Overview: Detailed job information including source, destination, schedule and current status

  • Flagged Items: Content that did not transfer and requires attention

  • Content Insights: Breakdown of your transferred data

  • Sharing Insights: Breakdown of all permissions associated based on your source content

  • User Mappings: The permission associations of your content

  • Item Report: Information on each item that transferred

  • Validation: At any time, you may run a validation run, which will trigger a full inspection of all content relating to the option you select for the next run only.

Job Overview Report

This report provides detailed transfer information for the individual job.

Schedule: Provides information on how many times the job has executed, when the job will run again and progress towards meeting the job stop policy defined

Transfer Details | Identified Chart: Reflects content identified on the source platform and the status summary for items

Transfer Details | Revised Chart: Reflects content that DataHub revised during transfer to meet destination requirements and user-defined job configurations

Transfer Details | Flagged Chart: Reflects content that DataHub could not transfer; manual remediation is required

Run Breakdown Report: Provides job history information for each execution for the given job

  • Note: Last Activity in the Run Breakdown will only appear during the job execution.

In some circumstances, bytes on the destination can be higher than listed on the source. This discrepancy is caused by property promotion on Word documents. For more information, see Report Values | Potential Differences due to Post Processing.

Values in the run breakdown may differ from values presented in the charts. This is because the run breakdown tracks each individual occurrence, whereas an item can only exist in a single chart category.

Example: When an item is both truncated AND ignored, it would not show up in the "Revised" chart but would show up in the "Revised" run breakdown

The run breakdown also shows both files and folder values. The charts display files and folder values separately, with the "Transfer Details" dropdown available to switch between display values.

Job Content Insights Report

This report provides detailed content information for the individual job.

Use the drop-down options to change the chart views.

Job Sharing Insights Report

This report offers a breakdown of all permissions associated to your content. The values presented are based on the source content.

On the Shared Insights tab for a job, the value "Not Shared" represents both items that have no permissions as well as content shared by inheritance from the parent folder. At this time, DataHub only tracks permissions applied during transfer, not permissions that result from inheritance within the hierarchy.

Job User Mappings Report

The User Mappings report for a given job presents the permission breakdown of your content. 

If any of the following features are enabled, the User Mappings report will populate:

Job Validation Report

Control the level of tracking and reporting for content that exists on both the source and destination platform, including content that has been configured to be excluded from transfer and content that existed on the destination prior to the initial transfer. 

Items that have been ignored / skipped by policy or not shown because they already existed on the destination can now be seen on reports with the defined categories.

The default validation option is inspect none. This option does not need to be configured in the application user-interface or through the ReST API; it is the system default.

This configuration does not track every item, but it offers additional tracking with performance in mind. Inspect none will track all items on the source at all levels of the hierarchy, excluding those configured to be ignored/skipped through policy. For the destination, all content in the root (files and folders) that existed prior to the initial transfer will be tracked as destination-only items and reported as ignored/skipped.

This option has the following features:

Source: All content (files and folders) at all levels in the hierarchy, but not including those configured to be ignored/skipped through policy. However, if the connection does not have access to a given folder in the hierarchy, we cannot track and report these items.

Destination: All content in the root (files and folders) that existed prior to the initial transfer will be tracked as destination only items and reported as ignored/skipped.

Destination: All content (files and folders) at lower depths of the directory (sub-folders) that existed prior to the initial transfer will not be tracked.

If the connection does not have access to a given folder in the hierarchy, we cannot track and report these items.

Job Reports - Validation tab: At any time, you may run a validation run, which will trigger a full inspection of all content relating to the option you select for the next run only.

Generate Job Reports

DataHub Reports provide several options to combine many jobs into a single report for review. Reports are generated by category, individually selected jobs or by convention job parent (user account mapping, network home drive mapping or folder mapping job types). 

Reports are separated into two tabs so you can clearly distinguish between jobs that are actively transferring content and simulation jobs that imitate transfer.

If no category is defined during job creation, it will be assigned to the default job category. 

Generate Report

Select Report Type

Define what the report will contain

  • Category: Defined during job creation

  • Parent jobs: Relating to convention jobs such as user account mapping, network home drive mapping or folder mapping job types

  • Manually select jobs: Choose each job individually that you want in your report

Remediation

Items that were unable to be transferred by the DataHub Platform will be flagged for manual remediation. Items can be flagged for many reasons, and in some cases, still transferred to the destination platform. Each item is a package, consisting of the media itself, version history, author, sharing and any other metadata. DataHub ensures all pieces of the item package are transferred to the destination to preserve data integrity. When an item is flagged, DataHub is indicating that all or some portion of this failed to migrate.

All migrations require some amount of manual intervention by the client to move content that fails to transfer automatically.

  • Note that one of the uses of simulation mode is to understand, prior to a live transfer, how many files might fail to transfer and why.

  • This can be used to adjust the job parameters to achieve a higher number of automatic remediation successes.

General Reasons Content does not Transfer

Errors from Source & Destination Platforms

This is a broad error category that indicates DataHub was prevented from reading, downloading, uploading or writing content during content transfer by either the source or destination platform provider. Each situation is dependent on the storage provider rejection reason and will require manual investigation to resolve.

Insufficient Permissions

Many platforms may require additional permissions in order to perform certain functions, even for site administrator accounts. These permissions typically require a special request from the storage provider. For example, content that has been locked, hidden or has been flagged to disable download may require this special permission request from your storage provider.

Scenario-Specific Configuration

Content on your source storage platform is diverse, and users across your business will structure their data in a wide variety of ways. A single one-size-fits-all project configuration may not be suitable and can result in some content not transferring to the destination platform. DataHub will assist in assessing these situations to help provide custom, scenario-specific configuration that may work around the issue that is preventing the transfer.

Disparate Platform Features

Each platform provider has a given set of features that are generally shared concepts in the storage industry. However, within each storage platform there can be behavioral or rule differences within these features, and aligning these discrepancies can be challenging. Features such as permission levels (edit, view, view+upload, etc.) may not align exactly with the destination platform, and file size restrictions or file names may need to be altered to conform to the destination platform's policies. DataHub will attempt to accommodate these restrictions through configurations in the system; however, not all scenarios can be covered in a diverse data set.

Interruption in Service

DataHub must maintain a connection to the database at all times during the transfer process. If there is an interruption in service, DataHub will fail the transfer, as it is unable to track or write to the database.

How do I validate my content transferred successfully?

Verify the destination

DataHub will report all content that has transferred to the destination. Log into your destination platform and verify the content is located as expected.

DataHub is reporting items in "pending" or "retrying" status, what are my next steps?

Run the job again

DataHub defaults to retrying the job 3 times to reconcile items that are in pending/retry status. Depending on your job configuration, this may occur with the defined schedule or you can start the job manually.
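Conceptually, the retry behavior looks like the sketch below: reattempt an item a fixed number of times (three by default) before flagging it for manual remediation. The transfer callable is a placeholder for the actual transfer operation, not a DataHub API.

    # Sketch: retry an item up to three times, then flag it.
    def transfer_with_retries(transfer, item, attempts: int = 3) -> str:
        for attempt in range(1, attempts + 1):
            try:
                transfer(item)
                return "transferred"
            except Exception as exc:
                print(f"attempt {attempt} failed: {exc}")
        return "flagged"  # manual remediation required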

Review the log message

DataHub logs a reason why the item is in pending/retry status. On the job "Overview" tab, click on the Transfer Details breakdown status "retrying". This will direct you to the filtered "Items" list. Select the item then click the "View item history" link on the right toolbox.

DataHub is reporting items in "Flagged" status, what are my next steps?

When an item is in "flagged" status, this means DataHub has made all attempts to transfer the file without success, and it requires manual remediation. 

Review the log message

DataHub logs a reason why the item has been flagged. On the job "Overview" tab, click on the Transfer Details breakdown status "flagged". This will direct you to the filtered "Items" list. Select the item and click the "View item history" link on the right toolbox.

Review the message and determine if you can resolve on the source platform.

Review all flagged items

These are the recommended ways to view all flagged items: export the flagged item report or review the "Flagged Items" page. 

Export report:

  • Job Report → Items tab → Filter by Status: Flagged

  • Click "Export this report" → Save CSV file for review

Review "Flagged Items" page:

  • Retry or Ignore individual items

  • View Item History for individual items

  • Link back to the job the flagged item is associated with

  • Export all Flagged Items report

 
