DataHub uses jobs to perform specific actions between the source and destination platforms. The most common job types are copy and sync.
Copy - Transfer in one direction from a Source to a Destination
This will copy all content (files, folders) from the Source to the Destination. Each Job run will detect any new content on the Source and copy it to the Destination.
Using the transfer type "Copy", we will never modify any content on the Source. We simply evaluate the Source content and create matching content on the Destination.
When the Copy Job is run again, we will evaluate the Source for any new content. Only this new content will be copied to the Destination.
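The re-run behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not DataHub's implementation: the function name `copy_run` is hypothetical, each platform is represented as a plain dictionary of path to content, and "new content" is assumed to mean paths that do not yet exist on the Destination.

```python
def copy_run(source: dict[str, bytes], destination: dict[str, bytes]) -> list[str]:
    """Model one Copy run: add Source items missing from the Destination.

    The Source dictionary is only read, never modified.
    """
    copied = []
    for path, data in source.items():
        if path not in destination:  # assumption: "new" means "not on the Destination yet"
            destination[path] = data
            copied.append(path)
    return copied


source = {"a.txt": b"1", "docs/b.txt": b"2"}
destination: dict[str, bytes] = {}

first = copy_run(source, destination)   # copies both items
second = copy_run(source, destination)  # nothing new, so nothing is copied
source["c.txt"] = b"3"
third = copy_run(source, destination)   # copies only c.txt
```

In this model the second run transfers nothing, and the third run transfers only the item added to the Source after the first run.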
When a failure occurs during a Copy transfer, the Job will continue and flag the failures at the end for review. When a failure occurs at the folder level, we will not traverse this folder's content. It will be skipped and the next folder will be evaluated for Copy transfer.
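The folder-level failure handling described above can be pictured with a small traversal sketch. Everything here is hypothetical and for illustration only: the tree is a nested dictionary (a dictionary value is a folder, anything else is a file), and the `unreadable` set stands in for whatever causes a folder-level failure in practice.

```python
def walk(tree: dict, unreadable: set[str], prefix: str = "") -> tuple[list[str], list[str]]:
    """Traverse a folder tree, skipping any folder that fails.

    Returns (transferred file paths, failed folder paths). A failed folder
    is recorded for review and its content is never visited.
    """
    transferred: list[str] = []
    failures: list[str] = []
    for name, node in tree.items():
        path = prefix + name
        if isinstance(node, dict):      # folder
            if path in unreadable:
                failures.append(path)   # flag for review, do not descend
                continue
            sub_transferred, sub_failures = walk(node, unreadable, path + "/")
            transferred += sub_transferred
            failures += sub_failures
        else:                           # file
            transferred.append(path)
    return transferred, failures


tree = {
    "a": {"x.txt": b"", "sub": {"y.txt": b""}},
    "b": {"z.txt": b""},
}
transferred, failures = walk(tree, unreadable={"a"})
# transferred == ["b/z.txt"], failures == ["a"]
```

Because folder `a` fails, neither `a/x.txt` nor anything under `a/sub` is evaluated. The run continues with folder `b` and reports `a` at the end.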
"transfer_type": "copy"
Synchronize - Transfer in both directions, syncing between two locations
This will ensure all content (files, folders) on the Source and the Destination is the same. Each Job run will verify that they are in sync.
Using the transfer type "Synchronize", DataHub will modify content on both the Source and the Destination so that each side mirrors the other.
When the Sync Job is run again, we will evaluate both the Source and Destination for any new content. Only the new content will transfer, to either side, to ensure the Source and the Destination are consistent.
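As a rough illustration of the two-way behavior, the sketch below models each side as a dictionary of path to content and transfers whatever is missing in either direction. The name `sync_run` is hypothetical, and the model deliberately ignores conflicts (the same path with different content on both sides), which the text above does not describe.

```python
def sync_run(source: dict[str, bytes], destination: dict[str, bytes]) -> tuple[list[str], list[str]]:
    """Model one Synchronize run: fill in whatever each side is missing.

    Returns (paths sent to the Destination, paths sent to the Source).
    """
    to_destination = sorted(set(source) - set(destination))
    to_source = sorted(set(destination) - set(source))
    for path in to_destination:
        destination[path] = source[path]
    for path in to_source:
        source[path] = destination[path]
    return to_destination, to_source


source = {"a.txt": b"1"}
destination = {"b.txt": b"2"}
sent = sync_run(source, destination)
# Both dictionaries now hold a.txt and b.txt; a second run transfers nothing.
```

Unlike a Copy run, this model writes to the Source as well, which matches the statement that Synchronize modifies both sides.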
When a failure occurs during a Synchronize transfer, the Job will continue and flag the failures at the end for review. When a failure occurs at the folder level, we will not traverse this folder's content. It will be skipped and the next folder will be evaluated for synchronization.
"transfer_type": "sync"
Move - Transfer in one direction from a Source to a Destination; delete Source content after all content has been moved
This will copy all content (files, folders) from the Source to the Destination. After verifying that everything was moved to the Destination successfully, DataHub will delete all content from the Source.
Using the transfer type "Move", any files that were not successfully copied to the Destination in the initial transfer will not be deleted from the Source.
When the Move Job is run again, we will evaluate the Source for any new content. The new content will be copied to the Destination. After this transfer is successful, the files on the Source will be deleted.
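The copy-then-delete ordering can be sketched as follows. This is a simplified model under stated assumptions: `move_run` and its `failing` parameter are hypothetical names, platforms are dictionaries of path to content, and "verified" is approximated by comparing the copied content with the original.

```python
def move_run(
    source: dict[str, bytes],
    destination: dict[str, bytes],
    failing: frozenset[str] = frozenset(),
) -> tuple[list[str], list[str]]:
    """Model one Move run: copy first, then delete only what was verified.

    `failing` simulates paths whose transfer fails. Returns (moved, failed).
    """
    moved: list[str] = []
    failed: list[str] = []
    for path in sorted(source):
        if path in failing:
            failed.append(path)  # flagged for review, stays on the Source
            continue
        destination[path] = source[path]
        if destination.get(path) == source[path]:  # stand-in for verification
            moved.append(path)
    for path in moved:
        del source[path]  # deletion happens only after a verified copy
    return moved, failed


source = {"a.txt": b"1", "b.txt": b"2"}
destination: dict[str, bytes] = {}
moved, failed = move_run(source, destination, failing=frozenset({"b.txt"}))
# a.txt now exists only on the Destination; b.txt is still on the Source.
```

A later run without the simulated failure would pick up `b.txt`, copy it, and then remove it from the Source.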
When a failure occurs during a Move transfer, the Job will continue and flag the failures at the end for review. When a failure occurs at the folder level, we will not traverse this folder's content. It will be skipped and the next folder will be evaluated for Move transfer.
"transfer_type": "move"
Migrate - Transfer in one direction from a Source to a Destination; do not pick up any changes that occur on the Source on subsequent runs
This will copy all content (files, folders) from the Source to the Destination. Subsequent Job runs will not pick up any new content; they will only retry errors encountered in earlier runs. This is ideal for large migration jobs.
Using the transfer type "Migrate", we will never modify any content on the Source. We simply evaluate the Source content and create matching content on the Destination.
Selecting Migrate as the transfer type is recommended for large transfer Jobs where the content on the Source is not expected to change. Migrate is more efficient because subsequent Job runs do not evaluate the entire structure again; they only retry errors from the previous run.
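The difference between the first run and later runs can be modeled with a small stateful sketch. The class `MigrateJob`, its attributes, and the `failing` parameter are hypothetical and exist only to illustrate the behavior described above. How DataHub actually tracks errors between runs is not specified here.

```python
class MigrateJob:
    """Model of a Migrate job: full scan once, then retry failures only."""

    def __init__(self) -> None:
        self.has_run = False
        self.pending: list[str] = []  # paths that failed in the previous run

    def run(
        self,
        source: dict[str, bytes],
        destination: dict[str, bytes],
        failing: frozenset[str] = frozenset(),
    ) -> list[str]:
        # First run: evaluate everything. Later runs: only earlier failures.
        candidates = self.pending if self.has_run else sorted(source)
        self.has_run = True
        transferred: list[str] = []
        still_failing: list[str] = []
        for path in candidates:
            if path in failing:
                still_failing.append(path)
            else:
                destination[path] = source[path]
                transferred.append(path)
        self.pending = still_failing
        return transferred


job = MigrateJob()
source = {"a.txt": b"1", "b.txt": b"2"}
destination: dict[str, bytes] = {}
job.run(source, destination, failing=frozenset({"b.txt"}))  # transfers a.txt
source["c.txt"] = b"3"                                      # added after the first run
job.run(source, destination)                                # retries b.txt only
```

After the second run, `c.txt` is still absent from the Destination, because the job never rescans the Source.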
When a failure occurs during a Migrate transfer, the Job will continue and flag the failures at the end for review. When a failure occurs at the folder level, we will not traverse this folder's content. It will be skipped and the next folder will be evaluated for Migrate transfer.
"transfer_type": "migrate"
Publish - Transfer in one direction from a Source to a Destination; new content identified on the Source will be transferred, and Destination content that does not match the Source will be deleted
This will ensure everything on the Destination mirrors the Source. Content on the Destination that does not match the Source will be deleted. New content on the Source will be transferred to the Destination.
Using the transfer type "Publish", we will never modify any content on the Source. We simply evaluate the Source content and create matching content on the Destination.
When a Publish Job is run again, we will evaluate the Source for any new content. The new Source content will be copied to the Destination. We will also evaluate the Destination for new content. If new content exists on the Destination but is not present on the Source, we will delete this content from the Destination. Similarly, if content on the Destination is newer than the Source, this content will be reverted to match the Source.
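The mirroring rules can be summarized in a short sketch. As with any simplified model, the details are assumptions: `publish_run` is a hypothetical name, platforms are dictionaries of path to content, and "newer on the Destination" is approximated as "content differs from the Source", since timestamps are not modeled.

```python
def publish_run(source: dict[str, bytes], destination: dict[str, bytes]) -> tuple[list[str], list[str]]:
    """Model one Publish run: make the Destination mirror the Source.

    Returns (paths written to the Destination, paths deleted from it).
    The Source dictionary is only read, never modified.
    """
    # New on the Source, or changed on the Destination: (re)write from the Source.
    written = sorted(p for p, data in source.items() if destination.get(p) != data)
    # Present only on the Destination: delete.
    deleted = sorted(set(destination) - set(source))
    for path in written:
        destination[path] = source[path]
    for path in deleted:
        del destination[path]
    return written, deleted


source = {"a.txt": b"1", "b.txt": b"2"}
destination = {"b.txt": b"edited", "c.txt": b"3"}
written, deleted = publish_run(source, destination)
# a.txt is added, b.txt is reverted, c.txt is deleted; the Source is untouched.
```

This is the only transfer type in this model that deletes Destination content, so it is worth confirming that nothing on the Destination needs to be kept before running such a job.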
When a failure occurs during a Publish transfer, the Job will continue and flag the failures at the end for review. When a failure occurs at the folder level, we will not traverse this folder's content. It will be skipped and the next folder will be evaluated for Publish transfer.
"transfer_type": "publish"
Taxonomy - Replicate the folder structure of a Source onto a Destination
This will copy only the folder structure from the Source to the Destination. No files will be transferred.
Using the transfer type "Copy Folder Structure" (the Taxonomy transfer type), we will never modify any content on the Source or the Destination. We simply evaluate the folder hierarchy on the Source and create a matching structure on the Destination.
When a Copy Folder Structure Job is run again, we will evaluate the Source for any new folders and/or sub-folders. Only new empty directories will be created on the Destination.
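A folder-only transfer can be sketched as below. The representation is an assumption made for illustration: the Source is a list of entries in which folders end with `/`, the Destination is a set of folder paths, and `taxonomy_run` is a hypothetical name.

```python
def taxonomy_run(source_entries: list[str], destination_folders: set[str]) -> list[str]:
    """Model one Copy Folder Structure run: create missing folders only.

    Entries ending in "/" are folders; everything else is a file and is ignored.
    Returns the folders created on the Destination in this run.
    """
    folders = sorted(entry.rstrip("/") for entry in source_entries if entry.endswith("/"))
    created = [folder for folder in folders if folder not in destination_folders]
    destination_folders.update(created)
    return created


source_entries = ["reports/", "reports/2024/", "reports/2024/q1.pdf", "empty/"]
destination_folders: set[str] = set()

taxonomy_run(source_entries, destination_folders)  # creates empty, reports, reports/2024
source_entries.append("reports/2025/")
taxonomy_run(source_entries, destination_folders)  # creates reports/2025 only
```

The file `reports/2024/q1.pdf` never reaches the Destination, and the second run creates only the folder that was added to the Source after the first run.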
When a failure occurs during a Copy Folder Structure transfer, the Job will continue and flag the failures at the end for review. When a failure occurs at the parent folder level, we will not traverse this folder's sub-folders. It will be skipped and the next folder will be evaluated for Copy Folder Structure transfer.
"transfer_type": "taxonomy"