Storage Connector health metrics definitions

The Syncplicity Storage Connector generates several different events and metrics related to the performance and overall health of the Storage Connector. With the v.2.7.0.5 release, these events and metrics are available for consumption by third-party monitoring tools.

The following glossary lists every metric and its definition, grouped by category, which can be helpful in building graphs, thresholds, and alerts for system health and performance monitoring. Storage Connector metrics are provided in JMX format. For all time-based metrics, the unit of measure is milliseconds (msec) unless stated otherwise.

Storage Metrics 

This class of metrics is primarily used to monitor the performance, capacity utilization, and operational health of the Storage Connector.

syncRequests.concurrentRequestsCounter

The number of concurrent requests that can be processed by the Storage Connector node. Each Storage Connector throttles the total number of upload and download requests. The throttling limit is set using the configuration value syncplicity.request.limit in the configuration file. 

syncRequests.concurrentRequestsCounterMin

This counter represents the minimum of the remaining capacity of the Storage Connector to process concurrent requests against the throttling limit. It is calculated based on the highest sum of concurrent upload and download requests during the last sampling period.

syncRequests.concurrentRequestsCounterMax

This counter represents the maximum of the remaining capacity of the Storage Connector to process concurrent requests against the throttling limit. It is calculated based on the lowest sum of concurrent upload and download requests during the last sampling period.
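
Because concurrentRequestsCounterMin is described as the remaining capacity at the busiest moment of the sampling period, the peak load can be recovered by subtracting it from the throttling limit. The following is a minimal sketch of that arithmetic, not part of the product. The function name and parameters are hypothetical, and it assumes the caller supplies the configured syncplicity.request.limit value.

```python
def peak_utilization(request_limit: int, remaining_capacity_min: int) -> float:
    """Return the peak fraction of the throttling limit used in one sampling period.

    request_limit: the configured syncplicity.request.limit value (assumed known).
    remaining_capacity_min: one syncRequests.concurrentRequestsCounterMin sample.
    """
    if request_limit <= 0:
        raise ValueError("request_limit must be positive")
    # The lowest remaining capacity corresponds to the highest concurrent load.
    peak_concurrent = request_limit - remaining_capacity_min
    return peak_concurrent / request_limit


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(peak_utilization(request_limit=100, remaining_capacity_min=25))
```

A value that stays close to 1.0 suggests the node is regularly running near its throttling limit.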

syncRequests.currentDownloadRequestsCounter

The current number of download requests at the end of a sampling interval.

syncRequests.currentUploadRequestsCounter

The current number of upload requests at the end of a sampling interval.

syncRequests.downloadRequestsForDurationCounter

The total number of download requests that were processed during the last sampling interval.

syncRequests.uploadRequestsForDurationCounter

The total number of upload requests that were processed during the last sampling interval.

cleanup.files

The number of files that were successfully processed for cleanup during the last sampling interval. The Storage Connector periodically cleans up partial files from storage; these are typically the result of incomplete file uploads.

transfer.in.bytes

The number of bytes transferred from clients to storage during the last sampling interval.

transfer.in.msec

The amount of time the Storage Connector spent transferring bytes from clients to storage over the last sampling interval.

transfer.out.bytes

The number of bytes transferred from storage to clients during the last sampling interval.

transfer.out.msec

The amount of time the Storage Connector spent transferring bytes from storage to clients over the last sampling interval.
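
The transfer byte and time counters can be combined into a throughput figure for graphing. The sketch below is illustrative only: the function name is hypothetical, and it assumes both samples come from the same sampling interval. If the time counter sums the duration of overlapping transfers, the result is closer to an average per-transfer rate than to aggregate node throughput.

```python
from typing import Optional


def throughput_bytes_per_sec(transferred_bytes: int, transfer_msec: int) -> Optional[float]:
    """Derive bytes per second from a transfer.*.bytes / transfer.*.msec pair.

    Returns None when no transfer time was recorded in the interval.
    """
    if transfer_msec <= 0:
        return None
    # Metric times are in milliseconds, so convert to seconds first.
    return transferred_bytes / (transfer_msec / 1000.0)


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(throughput_bytes_per_sec(5_000_000, 2_000))
```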

storage.delete.count

The number of DELETE requests made by the Storage Connector to storage.

storage.delete.time

The sum of the response times of DELETE requests made by the Storage Connector to storage that occurred during the last sampling interval.

storage.get.count

The number of GET requests made by the Storage Connector to storage.

storage.get.time

The sum of the response times of GET requests made by the Storage Connector to storage that occurred during the last sampling interval.

storage.move.count

Deprecated

storage.move.time

Deprecated

storage.put.count

The number of PUT requests made by the Storage Connector to storage.

storage.put.time

The sum of the response times of PUT requests made by the Storage Connector to storage that occurred during the last sampling interval.
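
Each storage.*.time counter is a sum, so dividing it by the matching storage.*.count gives an average response time per request. The following sketch assumes that both counters cover the same sampling interval, which the definitions above do not state explicitly for the count metrics. The function name is hypothetical.

```python
from typing import Optional


def average_latency_msec(total_time_msec: float, request_count: int) -> Optional[float]:
    """Average storage response time for one operation type (DELETE, GET, or PUT).

    total_time_msec: a storage.<op>.time sample.
    request_count: the storage.<op>.count sample for the same interval (assumed).
    """
    if request_count <= 0:
        # No requests in the interval, so there is no meaningful average.
        return None
    return total_time_msec / request_count


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(average_latency_msec(1200, 40))
```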

Storage Services Metrics

This class of metrics is useful in monitoring the health of the Storage Connector and its communication with client endpoints and storage services such as rights management APIs.

File Download

The following metrics are used to monitor file download health. They are among the most important metrics for the storage administrator, and are most useful when assessing Storage Connector file download performance or investigating overall file download health. The most common errors are fileDownload.error.404, which indicates that a file is not present in storage, and fileDownload.error.500, which means the file could not be downloaded from storage, typically because the storage layer or Syncplicity Orchestration is unavailable. All counters in this section show the number of events that occurred during the last sampling interval.

fileDownload.entire

The number of full file download requests completed.

fileDownload.ranged

The number of completed download requests for a range of the file as specified by the request header.

fileDownload.error.400

The number of failed file download requests due to Bad Request errors. The most common causes for this error include missing headers or arguments which are required, use of an invalid offset, or the file is not present in storage.

fileDownload.error.402

Deprecated

fileDownload.error.403

The number of failed file download requests related to rights management protection. Possible causes include a misconfigured rights management server, a requested file that cannot be encrypted, or network issues communicating with the rights management server.

fileDownload.error.404

The number of failed file download requests due to Not Found errors. This indicates that the file requested for download does not exist in storage.

fileDownload.error.500

The number of failed file download requests, typically due to the unavailability of the storage layer or Syncplicity Orchestration. Other potential causes include a file that was marked as corrupt (no longer present in storage), an invalid access key, or unexpected internal errors.

fileDownload.success.200

The number of full file download requests that completed successfully. 
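
For thresholds and alerts, raw error counts are often easier to interpret as a share of all outcomes in the interval. The sketch below computes such a ratio from one interval's counters. It is an illustration under assumptions: the function name is hypothetical, the counters are assumed to be available as a name-to-value mapping, and the denominator is assumed to be the sum of the *.error.* and *.success.* counters for the given prefix.

```python
from typing import Mapping, Optional


def error_rate(counters: Mapping[str, int], prefix: str) -> Optional[float]:
    """Share of failed requests for one metric family, e.g. "fileDownload".

    Only counters named <prefix>.error.* and <prefix>.success.* are considered.
    Returns None when the interval recorded no outcomes at all.
    """
    errors = sum(v for k, v in counters.items() if k.startswith(prefix + ".error."))
    successes = sum(v for k, v in counters.items() if k.startswith(prefix + ".success."))
    total = errors + successes
    if total == 0:
        return None
    return errors / total


if __name__ == "__main__":
    # Example values are made up for illustration.
    sample = {
        "fileDownload.success.200": 95,
        "fileDownload.error.404": 3,
        "fileDownload.error.500": 2,
    }
    print(error_rate(sample, "fileDownload"))
```

The same function can be pointed at other families in this glossary (for example "fileUpload" or "link") by changing the prefix.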

File Upload

The following metrics are used to monitor file upload health. They are among the most important metrics for the storage administrator, and are most useful when assessing Storage Connector file upload performance or investigating overall file upload health. The most common error is fileUpload.error.500, which indicates a loss of connectivity to, or availability of, the storage layer or Syncplicity Orchestration. All counters in this section show the number of events that occurred during the last sampling interval.

fileUpload.error.400

The number of failed file upload requests due to Bad Request errors. The most common causes for this error are a missing directory ID or pathname, invalid form data, missing headers or arguments which are required, or the use of an invalid upload offset.

fileUpload.error.402

The number of failed file upload requests due to exceeding quota limits. An out of quota error occurs when the user does not have enough space to upload the file requested.

fileUpload.error.403

Deprecated

fileUpload.error.500

The number of failed file upload requests, typically due to the unavailability of the storage layer or Syncplicity Orchestration. Other potential causes include an invalid access key or unexpected internal errors.

fileUpload.initiated

The number of initiated file upload requests. For files that are larger than 5 MB, this counter is incremented only for the first chunk.

fileUpload.resumed

For files larger than 5 MB, the number of chunks uploaded for a file after the first chunk. The first chunk is counted by fileUpload.initiated.

fileUpload.success.200

The number of full file upload requests that completed successfully. For example, when a client initiates an upload of a 15 MB file, it uploads the file in three 5 MB chunks (the default chunk size). Once all chunks are uploaded, this counter is incremented by 1.

fileUpload.success.308

The number of chunk file upload requests that completed successfully. For example, when a client initiates an upload of an 18 MB file, it uploads the file in four chunks of up to 5 MB each (the default chunk size): three 5 MB chunks and one 3 MB chunk. This counter is incremented by 1 for each chunk that is successfully uploaded. It applies to Syncplicity desktop and mobile clients and does not apply to the web online file browser.
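
The chunk examples above can be worked through in code. The following sketch estimates how a single successful upload would move the fileUpload.initiated, fileUpload.resumed, and fileUpload.success.200 counters. The function and constant names are hypothetical, and it assumes that 1 MB means 1024 * 1024 bytes and that the default 5 MB chunk size is in effect. fileUpload.success.308 is left out because the definitions do not say how the final chunk is counted.

```python
MB = 1024 * 1024             # assumption: binary megabytes
DEFAULT_CHUNK_SIZE = 5 * MB  # default chunk size named in the definitions


def expected_upload_counters(file_size_bytes: int, chunk_size: int = DEFAULT_CHUNK_SIZE) -> dict:
    """Expected counter increments for one fully successful upload."""
    if file_size_bytes <= 0 or chunk_size <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: a trailing partial chunk still counts as a chunk.
    chunks = (file_size_bytes + chunk_size - 1) // chunk_size
    return {
        "fileUpload.initiated": 1,         # first chunk only
        "fileUpload.resumed": chunks - 1,  # every chunk after the first
        "fileUpload.success.200": 1,       # once, when all chunks are uploaded
    }


if __name__ == "__main__":
    print(expected_upload_counters(15 * MB))
    print(expected_upload_counters(18 * MB))
```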

Device Authorization

These metrics are deprecated.

device.error.400

device.error.403

device.error.404

device.error.500

device.success.200

Token Authorization 

The deviceToken metrics are useful in monitoring the token authorization service health. Each client must authenticate itself to Syncplicity before it can access files in storage. The primary method of authentication is with a short-lived security token, which is commonly performed when the client begins an upload or download request. Upon receiving this request, the Storage Connector responds with a security token that the client must include in all subsequent requests it makes.

When deviceToken.error.40x or deviceToken.error.500 counts appear frequently or in high volume, this indicates an issue with one or more client endpoints. The most common cause is that Syncplicity Orchestration cannot generate a device token because of missing required headers or parameters, or because access was denied.

deviceToken.error.400

deviceToken.error.403 - Deprecated

deviceToken.error.404 - Deprecated

deviceToken.error.500 - Deprecated

deviceToken.success.200

Rights Management

Syncplicity is integrated with the EMC IRM plugin to provide protection to secured content, enabling organizations to maintain control of information rights beyond the firewall.

When irm.error.403 or irm.error.500 counts appear frequently or in high volume, this indicates an issue with the IRM RMS server: typically IRM is misconfigured, the file cannot be protected, or the RMS server is unavailable.

irm.cached

irm.error.403

irm.error.500

irm.protected

irm.success.200

Shared Link 

The following metrics are used to measure the health of shared links. The Shared Link API provides a method to authenticate user-initiated file shares issued by Syncplicity.

When link.error.40x or link.error.500 counts appear frequently or in high volume, this indicates issues with shared link generation. Typical causes include the unavailability of Syncplicity Orchestration, missing required headers or parameters, or unauthorized access.

link.error.400

link.error.401

link.error.403

link.error.500

link.success.200


Web Preview

Syncplicity Web Preview is available only in the Syncplicity public cloud. The following metrics are not used for on-premises Storage Connector deployments.

preview.doc.attributes.count

preview.doc.attributes.time

preview.doc.bytes

preview.doc.get.count

preview.doc.get.time

preview.doc.put.count

preview.doc.put.time

preview.error.403

preview.error.480

preview.error.500

preview.page.attributes.count

preview.page.attributes.time

preview.page.bytes

preview.page.content.count

preview.page.content.time

preview.success.200

Storage Authorization

These metrics are used to monitor storage authorization health. The Storage Authorization API validates the authenticity of a Storage Connector instance.

When storageAuthorization.error.40x or storageAuthorization.error.500 counts appear frequently or in high volume, this indicates missing required headers or parameters, an invalid storage token, an invalid or restricted email, or that Syncplicity Orchestration is unavailable.

storageAuthorization.error.400

storageAuthorization.error.401

storageAuthorization.error.403

storageAuthorization.error.500

storageAuthorization.success.200

Storage Password

These metrics are useful to monitor storage password health. The Storage Password Service is used for client authentication when second layer authentication or SSO authentication is enabled.

When storagePassword.error.40x or storagePassword.error.500 counts appear frequently or in high volume, this indicates an issue at the second layer authentication or SSO level. For example, there may be missing required headers or parameters, an invalid or missing storage password, or Syncplicity Orchestration may be unavailable.

storagePassword.error.400

storagePassword.error.403

storagePassword.error.404

storagePassword.error.500

storagePassword.success.200

Thumbnail

The following metrics are used to monitor thumbnail health. The thumbnail API is called when a Syncplicity client issues a request to retrieve an image thumbnail or generate a thumbnail.

When thumbnail.error.40x or thumbnail.error.500 counts appear frequently or in high volume, this indicates an error during thumbnail generation, a requested thumbnail that does not exist, or that either storage or Syncplicity Orchestration was unavailable to service the request.

thumbnail.error.400

thumbnail.error.403

thumbnail.error.404

thumbnail.error.500

thumbnail.success.200

Syncplicity Services Metrics

The following metrics may appear in your data stream but should be ignored at this time as they are for future integrations. They measure performance of Storage Connector calls to other Syncplicity web services.

orchestration.authTokenService.count

orchestration.authTokenService.time

orchestration.chunkService.count

orchestration.chunkService.time

orchestration.cleanupService.count

orchestration.cleanupService.time

orchestration.deviceService.count

orchestration.deviceService.time

orchestration.deviceTokenService.count

orchestration.deviceTokenService.time

orchestration.fileService.count

orchestration.fileService.time

orchestration.irmService.count

orchestration.irmService.time

orchestration.linkService.count

orchestration.linkService.time

orchestration.storageAuthorizationService.count

orchestration.storageAuthorizationService.time

orchestration.storagePasswordService.count

orchestration.storagePasswordService.time

orchestration.thumbnailService.count

orchestration.thumbnailService.time

JVM Metrics

jvm.heapUsed

JVM total heap used

jvm.threadCount

The total number of active threads in the JVM

jvm.uptime

JVM uptime

system.cpu.current.load.value

The current CPU load

system.process.cpu.load.value

The elapsed CPU time since the last screen update, expressed as a percentage of total CPU time.

system.process.cpu.time.value

The JVM share of the elapsed CPU time

system.freePhysicalMemory

The total free physical memory available

system.load

The system load average for the last minute multiplied by 100
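
Because system.load is reported multiplied by 100, it needs to be scaled back before it is compared with conventional load average figures. A minimal sketch follows; the function names are hypothetical, and the number of CPU cores is assumed to be known from elsewhere, since this glossary does not list a metric for it.

```python
def load_average(system_load_sample: float) -> float:
    """Convert a system.load sample back to a conventional 1-minute load average."""
    return system_load_sample / 100.0


def load_per_core(system_load_sample: float, cpu_cores: int) -> float:
    """Load average normalized by the number of CPU cores (supplied by the caller)."""
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be positive")
    return load_average(system_load_sample) / cpu_cores


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(load_average(250))
    print(load_per_core(250, 4))
```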

system.totalPhysicalMemory

The total physical memory

jvm.classloading.loaded.value

The number of loaded classes

jvm.classloading.unloaded.value

The number of unloaded classes

jvm.memory.total.init.value

The total initial memory

jvm.memory.total.used.value

The total memory used

jvm.memory.total.max.value

The maximum available JVM memory

jvm.memory.total.commited.value

The total memory committed

jvm.memory.heap.init.value

The initial heap memory

jvm.memory.heap.used.value

Total heap memory used

jvm.memory.heap.max.value

The maximum available heap memory

jvm.memory.heap.commited.value

The total heap memory committed
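
For alerting, heap usage is usually more useful as a fraction of the maximum than as an absolute number. The sketch below derives that fraction from one jvm.memory.heap.used.value sample and one jvm.memory.heap.max.value sample. The function name is hypothetical. It treats a non-positive maximum as "undefined", since a JVM can report -1 when no maximum is set.

```python
from typing import Optional


def heap_utilization(heap_used: int, heap_max: int) -> Optional[float]:
    """Fraction of the maximum heap currently in use.

    heap_used: a jvm.memory.heap.used.value sample.
    heap_max: a jvm.memory.heap.max.value sample; values <= 0 are treated as undefined.
    """
    if heap_max <= 0:
        return None
    return heap_used / heap_max


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(heap_utilization(512, 2048))
```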

jvm.memory.non-heap.init.value

The initial non-heap memory

jvm.memory.non-heap.used.value

The total non-heap memory used

jvm.memory.non-heap.max.value

The maximum available non-heap memory

jvm.memory.non-heap.commited.value

The total non-heap memory committed

system.file.descriptors.open.value

The total number of open OS file descriptors

system.file.descriptors.max.value

The maximum number of open OS file descriptors available to the system

jvm.thread.state.%state name%.value

The number of threads in a specific state

jvm.gc.%gc name%.count.value

The number of invocations of a garbage collector

jvm.gc.%gc name%.time.value

The total time spent in a garbage collector
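
The %gc name% placeholder means the actual metric names depend on which garbage collectors the JVM is running. The sketch below groups such metrics by collector so that count and time can be read together. It is illustrative only: the function name is hypothetical, the samples are assumed to be available as a name-to-value mapping, and the collector name used in the example ("PS-Scavenge") is made up to show the shape of the names.

```python
from typing import Dict, Mapping

GC_PREFIX = "jvm.gc."


def group_gc_metrics(samples: Mapping[str, float]) -> Dict[str, Dict[str, float]]:
    """Group jvm.gc.<name>.count.value and jvm.gc.<name>.time.value by collector name."""
    grouped: Dict[str, Dict[str, float]] = {}
    for name, value in samples.items():
        if not name.startswith(GC_PREFIX):
            continue  # not a garbage collector metric
        body = name[len(GC_PREFIX):]
        for kind in ("count", "time"):
            suffix = "." + kind + ".value"
            if body.endswith(suffix):
                # Whatever sits between the prefix and the suffix is the collector name.
                gc_name = body[: -len(suffix)]
                grouped.setdefault(gc_name, {})[kind] = value
    return grouped


if __name__ == "__main__":
    # Example names and values are made up for illustration.
    sample = {
        "jvm.gc.PS-Scavenge.count.value": 10,
        "jvm.gc.PS-Scavenge.time.value": 250,
        "jvm.heapUsed": 123456,
    }
    print(group_gc_metrics(sample))
```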
