Info
Access logs for instances containing PHI are maintained via infrastructure, application, and operating system logging mechanisms. Monitoring, audit controls, and system activity review are documented and comply with `45 CFR 164.308(a)(5)(ii)(C)`, `45 CFR 164.312(b)`, and `45 CFR 164.308(a)(1)(ii)(D)`.

Uptime, monitoring, and alerts

Alerting and Status

Tidepool integrates multiple internal alerting mechanisms for notification of problems and issues via ChatOps, email, and cell phone/SMS. For public status information, please see:

Atlassian StatusPage - public alerting to anyone interested in Tidepool system status

An "on-call" rotation schedule for engineers is maintained to ensure that there is always a primary and multiple backup employees to respond to potential issues, 24x7.

Logging

Tidepool implements remote logging to a HIPAA-compliant service for all application, security, audit, and compliance logs.

Monitoring

Tidepool monitors systems proactively for the following concerns, though this is not an exhaustive list. Tidepool continuously evaluates environment and risk criteria and updates monitoring and alerting based on risk-based analysis.

Network Performance - latency, response time, errors
System Performance - CPU, memory, disk, network usage
Application Performance - latency, errors, critical conditions
Security - anomalous connections, suspicious connections, intrusion detection, admin activity, logins/logouts/lockouts, audit, policy changes, logging
Capacity - system resource usage, overhead, disk usage, failover and redundancy
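
As a rough illustration of how concerns like those listed above can be turned into automated checks, the following Python sketch compares sampled metrics against thresholds. The metric names, threshold values, and the `evaluate` helper are hypothetical and are not taken from Tidepool's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str     # hypothetical metric name
    maximum: float  # alert when the sampled value exceeds this
    category: str   # one of the monitoring concerns listed above


# Hypothetical thresholds, for illustration only.
THRESHOLDS = [
    Threshold("api_latency_ms", 500.0, "Application Performance"),
    Threshold("cpu_percent", 90.0, "System Performance"),
    Threshold("disk_used_percent", 80.0, "Capacity"),
    Threshold("failed_logins_per_min", 20.0, "Security"),
]


def evaluate(samples: dict[str, float]) -> list[str]:
    """Return one alert message per threshold exceeded by the samples."""
    alerts = []
    for t in THRESHOLDS:
        value = samples.get(t.metric)
        # Metrics missing from the sample are skipped rather than alerted on.
        if value is not None and value > t.maximum:
            alerts.append(f"[{t.category}] {t.metric}={value} exceeds {t.maximum}")
    return alerts
```

In practice, thresholds like these would live in the monitoring tools themselves and be revised as the risk-based analysis changes.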

Monitoring tools and services

DataDog
- service availability/uptime
- system availability metrics
MongoDB Atlas
- performance - slow queries and application performance
- security monitoring - access changes
- availability - cluster operations and performance
Prometheus/Alert Manager and Grafana (Kubernetes)
- logging/metrics/system health
- custom alerting
Sumo Logic
- aggregate and archive logs
- logging/metrics/system health
- custom alerting
AWS Services
CloudTrail - captures application, access, audit and activity logs
CloudWatch - alerting on events
SNS - distribute notifications to services and humans (email, web hooks)
SQS - distribute notifications via queueing mechanisms
Config - monitor and detect changes in configuration or posture
GuardDuty - threat detection
Inspector - automated security analysis of network, configuration, security posture
SecurityHub - monitors and aggregates posture and threat information from these AWS services:
- GuardDuty
- Inspector
- Identity and Access Management (IAM) Access Analyzer
- Firewall Manager

An "on-call" rotation schedule is maintained to ensure that there is always a primary and multiple backup employees to respond to potential issues, 24x7.

Tip
Tidepool is a fully distributed and remote company, employing engineers in multiple Time Zones. As a result, an engineer is always available.

Based on metrics from our monitoring tools, Pingdom (legacy uptime monitoring platform), DataDog, and Statuspage, Tidepool has maintained 100% user-facing uptime of its production environment over the last year, and over 99.9% uptime since inception.
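
To put these percentages in perspective, the arithmetic relating an uptime percentage to a downtime budget can be sketched as below. The function names are illustrative, and the calculation assumes a 365-day year; it is not a statement of how the monitoring tools compute their figures.

```python
def max_downtime_minutes(uptime_percent: float, period_days: float = 365) -> float:
    """Downtime budget, in minutes, implied by an uptime percentage."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


def uptime_percent(downtime_minutes: float, period_days: float = 365) -> float:
    """Uptime percentage implied by the downtime observed over a period."""
    total_minutes = period_days * 24 * 60
    return 100 * (1 - downtime_minutes / total_minutes)
```

For example, `max_downtime_minutes(99.9)` comes to roughly 525.6 minutes (about 8.76 hours) per year, while 100% uptime leaves no downtime budget at all.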

Individual instances are taken down only momentarily for rolling software installations, so software/system updates cause no downtime.

Under normal circumstances, all user application and API requests continue to be fulfilled by redundant instances, and updates are rolled back via automation in case of deployment problems.
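
The rolling update and automated rollback pattern described above can be sketched in simplified form as follows. This is a minimal model, not Tidepool's deployment tooling: the `rolling_update` function, the instance map, and the `is_healthy` probe are all hypothetical.

```python
from typing import Callable


def rolling_update(
    instances: dict[str, str],
    new_version: str,
    is_healthy: Callable[[str, str], bool],
) -> bool:
    """Update instances one at a time; roll everything back on first failure.

    `instances` maps an instance name to its running version.
    `is_healthy(name, version)` is a hypothetical health probe.
    Returns True if every instance ends up on `new_version`.
    """
    previous = dict(instances)  # snapshot used for rollback
    for name in list(instances):
        # Only this instance is out of rotation; the others keep serving.
        instances[name] = new_version
        if not is_healthy(name, new_version):
            instances.update(previous)  # automated rollback to the snapshot
            return False
    return True
```

Because only one instance changes at a time, the remaining instances can keep fulfilling requests, and a failed health check restores every instance to its previous version.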

Alerting

Tidepool integrates multiple alerting mechanisms for notification of problems and issues via ChatOps, email, and cell phone/SMS:

PagerDuty - On-call scheduling, alerting, and incident tracking
Sumo Logic - log aggregation, metrics, and alerts
AWS - metrics, usage, and alerting
DataDog - MongoDB Atlas performance and security monitoring
Prometheus/AlertManager - Kubernetes alerts
Slack - Alert delivery/notification from PagerDuty, Sumo Logic, DataDog, MongoDB
Atlassian StatusPage - public alerting to anyone interested in Tidepool system status.
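
One way to picture how alerts fan out across the channels above is a severity-based routing table. The sketch below is an assumption for illustration only: the severity levels, channel labels, and `route` function are hypothetical and do not reflect the actual PagerDuty, Slack, or StatusPage configuration.

```python
from enum import Enum


class Severity(Enum):
    INFO = 1
    WARNING = 2
    CRITICAL = 3


# Hypothetical routing table; channel names are labels, not real API targets.
ROUTES = {
    Severity.INFO: ["chatops"],
    Severity.WARNING: ["chatops", "email"],
    Severity.CRITICAL: ["chatops", "email", "sms", "on_call_page"],
}


def route(severity: Severity, public_impact: bool = False) -> list[str]:
    """Return the channels an alert of the given severity is delivered to."""
    channels = list(ROUTES[severity])  # copy so the table is never mutated
    if public_impact:
        # User-visible incidents are also surfaced on the public status page.
        channels.append("public_status_page")
    return channels
```

For instance, `route(Severity.CRITICAL, public_impact=True)` would notify every internal channel, page the on-call engineer, and flag the public status page.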

In accordance with legal, statutory, and regulatory compliance obligations, system availability, quality, capacity, and resources are planned, prepared, and measured to deliver the required system performance.
