Skip to main content

System Monitoring

HyperStudy provides comprehensive monitoring tools for administrators to track system usage, experiment activity, and platform health. This guide covers the monitoring dashboard and its features.

Accessing the Monitoring Dashboard

Admin Access

To access monitoring features:

  1. Log in with an administrator account
  2. Navigate to the Admin Dashboard
  3. Click on "Monitoring" or "System Status"
  4. Select the monitoring view you need

Note: Monitoring features are only available to users with administrator privileges.

Real-Time Monitoring

Active Experiments Dashboard

Monitor currently running experiments:

  1. Live Experiment List

    • Shows all active experiments
    • Number of participants in each
    • Current state/phase
    • Start time and duration
    • Resource usage
  2. Participant Status

    • Total participants online
    • Distribution across experiments
    • Connection quality indicators
    • Geographic distribution
    • Device/browser breakdown
  3. System Metrics

    • Server load and performance
    • WebSocket connections
    • API request rates
    • Database queries per second
    • Media streaming bandwidth

Live Activity Feed

Real-time stream of system events:

[14:23:45] New experiment started: "Social Decision Study"
[14:23:52] 3 participants joined waiting room
[14:24:01] Experiment session began with 4 participants
[14:24:15] Video recording started for session
[14:25:30] Participant disconnected and reconnected
[14:26:45] Experiment state transition: intro → main_task

Experiment Monitoring

Experiment Overview

For each experiment, monitor:

  1. Participation Metrics

    • Total participants enrolled
    • Active participants
    • Completed sessions
    • Dropout rate
    • Average session duration
  2. Technical Performance

    • Page load times
    • API response times
    • Error rates
    • Media loading success
    • WebRTC connection quality
  3. Data Collection

    • Responses collected
    • Data quality metrics
    • Storage usage
    • Export history

Session Monitoring

Detailed view of individual sessions:

  1. Session Timeline

    • Start and end times
    • State transitions
    • Participant actions
    • System events
    • Error occurrences
  2. Participant Details

    • Anonymous ID
    • Role assignment
    • Connection quality
    • Browser/device info
    • Response rate
  3. Technical Diagnostics

    • Network latency
    • Bandwidth usage
    • CPU/memory usage
    • WebSocket stability
    • Media stream quality

Performance Monitoring

System Performance Metrics

Track overall system health:

  1. Server Metrics

    • CPU utilization
    • Memory usage
    • Disk I/O
    • Network traffic
    • Process health
  2. Application Metrics

    • Request processing times
    • Database query performance
    • Cache hit rates
    • Queue lengths
    • Error rates
  3. Infrastructure Status

    • Service availability
    • Database connections
    • Redis cache status
    • Media server status
    • Storage capacity

Performance Alerts

Configure alerts for:

  • High CPU usage (>80%)
  • Memory pressure
  • Slow response times
  • High error rates
  • Service failures
  • Storage limits

User Activity Monitoring

User Analytics

Track platform usage:

  1. User Statistics

    • Total registered users
    • Active users (daily/weekly/monthly)
    • New registrations
    • User retention
    • Geographic distribution
  2. Experiment Creation

    • New experiments created
    • Active experiments
    • Completed experiments
    • Average experiment duration
    • Popular component types
  3. Resource Usage

    • Storage per user
    • Bandwidth consumption
    • API usage
    • Computation time

Audit Logs

Comprehensive audit trail:

2024-01-15 10:30:45 | User: admin@example.com | Action: Created experiment "Memory Study"
2024-01-15 10:31:12 | User: admin@example.com | Action: Modified experiment settings
2024-01-15 10:45:30 | User: researcher@uni.edu | Action: Exported data
2024-01-15 11:00:00 | System | Action: Automated backup completed

Error Monitoring

Error Tracking

Monitor and diagnose issues:

  1. Error Dashboard

    • Error frequency graph
    • Error types breakdown
    • Affected users count
    • Error trends
    • Critical error alerts
  2. Error Details

    • Full error stack traces
    • User context
    • Browser/device information
    • Reproduction steps
    • Related logs
  3. Common Error Patterns

    • WebSocket disconnections
    • Media loading failures
    • API timeout errors
    • Database connection issues
    • Authentication problems

Error Resolution

Tools for addressing issues:

  1. Quick Actions

    • Restart services
    • Clear caches
    • Reset connections
    • Notify affected users
  2. Diagnostic Tools

    • Log searcher
    • Database query analyzer
    • Network trace viewer
    • Performance profiler

Resource Monitoring

Storage Management

Monitor storage usage:

  1. Storage Metrics

    • Total capacity
    • Used space
    • Growth rate
    • File count
    • Largest files
  2. Storage Breakdown

    • Experiment data
    • Media files
    • Recordings
    • Backups
    • Temporary files
  3. Cleanup Tools

    • Identify old data
    • Archive completed experiments
    • Remove temporary files
    • Compress recordings

Bandwidth Monitoring

Track network usage:

  1. Bandwidth Metrics

    • Current usage
    • Peak usage times
    • Geographic distribution
    • Protocol breakdown
  2. Optimization Opportunities

    • Compress media files
    • Enable caching
    • CDN utilization
    • Connection pooling

Monitoring Best Practices

Regular Monitoring

  1. Daily Checks

    • Review error logs
    • Check active experiments
    • Monitor resource usage
    • Verify backups
  2. Weekly Reviews

    • Analyze trends
    • Review performance metrics
    • Check user feedback
    • Plan maintenance
  3. Monthly Analysis

    • Usage reports
    • Capacity planning
    • Cost analysis
    • Security review

Alert Configuration

  1. Priority Levels

    • Critical: Immediate attention required
    • Warning: Investigate soon
    • Info: Awareness only
  2. Notification Channels

    • Email alerts
    • SMS for critical issues
    • Slack integration
    • Dashboard notifications

Documentation

  1. Incident Logs

    • Record all incidents
    • Document resolution steps
    • Note prevention measures
    • Track recurring issues
  2. Performance Baselines

    • Establish normal metrics
    • Document expected ranges
    • Set threshold values
    • Review regularly

Troubleshooting Common Issues

High Resource Usage

Symptoms: Slow performance, timeouts Check:

  • Active experiment count
  • Participant numbers
  • Media streaming load
  • Database queries

Actions:

  • Scale resources if needed
  • Optimize queries
  • Enable caching
  • Limit concurrent sessions

Connection Issues

Symptoms: Participant disconnections Check:

  • WebSocket server status
  • Network latency
  • Firewall rules
  • SSL certificates

Actions:

  • Restart WebSocket service
  • Check network configuration
  • Review security settings
  • Update certificates

Data Inconsistencies

Symptoms: Missing or duplicate data Check:

  • Database replication
  • Transaction logs
  • API error rates
  • Client-side errors

Actions:

  • Verify data integrity
  • Review transaction logs
  • Fix synchronization issues
  • Implement data validation

Advanced Monitoring

Custom Dashboards

Create specialized monitoring views:

  1. Select metrics to display
  2. Choose visualization types
  3. Set refresh intervals
  4. Save dashboard configuration
  5. Share with team members

API Monitoring

For programmatic access:

GET /api/admin/monitoring/status
Authorization: Bearer ADMIN_TOKEN

Response:
{
"status": "healthy",
"activeExperiments": 12,
"onlineParticipants": 47,
"systemLoad": 0.65,
"errorRate": 0.02
}

Integration with External Tools

Connect to monitoring services:

  • Datadog
  • New Relic
  • Prometheus
  • Grafana
  • CloudWatch

Security Monitoring

Access Monitoring

Track system access:

  1. Login Activity

    • Successful logins
    • Failed attempts
    • Suspicious patterns
    • Geographic anomalies
  2. Permission Changes

    • Role modifications
    • Access grants
    • Permission revocations
    • Admin actions
  3. Data Access

    • Export activities
    • API usage
    • Bulk operations
    • Sensitive data access

Next Steps