System Monitoring
HyperStudy provides comprehensive monitoring tools for administrators to track system usage, experiment activity, and platform health. This guide covers the monitoring dashboard and its features.
Accessing the Monitoring Dashboard
Admin Access
To access monitoring features:
- Log in with an administrator account
- Navigate to the Admin Dashboard
- Click on "Monitoring" or "System Status"
- Select the monitoring view you need
Note: Monitoring features are only available to users with administrator privileges.
Real-Time Monitoring
Active Experiments Dashboard
Monitor currently running experiments:
- Live Experiment List
  - Shows all active experiments
  - Number of participants in each
  - Current state/phase
  - Start time and duration
  - Resource usage
- Participant Status
  - Total participants online
  - Distribution across experiments
  - Connection quality indicators
  - Geographic distribution
  - Device/browser breakdown
- System Metrics
  - Server load and performance
  - WebSocket connections
  - API request rates
  - Database queries per second
  - Media streaming bandwidth
Live Activity Feed
Real-time stream of system events:
[14:23:45] New experiment started: "Social Decision Study"
[14:23:52] 3 participants joined waiting room
[14:24:01] Experiment session began with 4 participants
[14:24:15] Video recording started for session
[14:25:30] Participant disconnected and reconnected
[14:26:45] Experiment state transition: intro → main_task
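If the feed is exported or copied as text, each line can be split into a time and a message. The following is a minimal sketch that assumes the `[HH:MM:SS] message` layout shown in the sample above; `FeedEvent` and `parse_feed_line` are hypothetical names, not part of HyperStudy.

```python
import re
from datetime import time
from typing import NamedTuple, Optional

# Assumed layout, taken from the sample feed: "[HH:MM:SS] message"
FEED_LINE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s+(.*)$")


class FeedEvent(NamedTuple):
    at: time
    message: str


def parse_feed_line(line: str) -> Optional[FeedEvent]:
    """Return a FeedEvent, or None if the line does not match the assumed layout."""
    match = FEED_LINE.match(line.strip())
    if match is None:
        return None
    hour, minute, second, message = match.groups()
    try:
        stamp = time(int(hour), int(minute), int(second))
    except ValueError:
        # Two digits that are not a valid clock time, e.g. "99:00:00"
        return None
    return FeedEvent(stamp, message)
```

Filtering the parsed events by keyword (for example "disconnected") is then a one-line list comprehension.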
Experiment Monitoring
Experiment Overview
For each experiment, monitor:
- Participation Metrics
  - Total participants enrolled
  - Active participants
  - Completed sessions
  - Dropout rate
  - Average session duration
- Technical Performance
  - Page load times
  - API response times
  - Error rates
  - Media loading success
  - WebRTC connection quality
- Data Collection
  - Responses collected
  - Data quality metrics
  - Storage usage
  - Export history
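This guide does not define how the dropout rate is calculated. As an illustration only, the sketch below assumes that a dropout is an enrolled participant who is neither currently active nor counted as completed; the function name and that definition are assumptions, so check them against the figures the dashboard reports.

```python
def dropout_rate(enrolled: int, active: int, completed: int) -> float:
    """Fraction of enrolled participants who left without completing.

    Assumed definition: dropped = enrolled - active - completed.
    """
    if enrolled <= 0:
        return 0.0
    dropped = enrolled - active - completed
    if dropped < 0:
        raise ValueError("active + completed exceeds enrolled")
    return dropped / enrolled
```

For example, with 40 enrolled, 4 active, and 30 completed, 6 of 40 participants count as dropouts under this definition.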
Session Monitoring
Detailed view of individual sessions:
- Session Timeline
  - Start and end times
  - State transitions
  - Participant actions
  - System events
  - Error occurrences
- Participant Details
  - Anonymous ID
  - Role assignment
  - Connection quality
  - Browser/device info
  - Response rate
- Technical Diagnostics
  - Network latency
  - Bandwidth usage
  - CPU/memory usage
  - WebSocket stability
  - Media stream quality
Performance Monitoring
System Performance Metrics
Track overall system health:
- Server Metrics
  - CPU utilization
  - Memory usage
  - Disk I/O
  - Network traffic
  - Process health
- Application Metrics
  - Request processing times
  - Database query performance
  - Cache hit rates
  - Queue lengths
  - Error rates
- Infrastructure Status
  - Service availability
  - Database connections
  - Redis cache status
  - Media server status
  - Storage capacity
Performance Alerts
Configure alerts for:
- High CPU usage (>80%)
- Memory pressure
- Slow response times
- High error rates
- Service failures
- Storage limits
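Conceptually, each alert compares a metric against a limit. The sketch below illustrates that check; only the 80% CPU limit comes from this guide, while the metric keys and every other limit are placeholder assumptions to replace with your own values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str   # key in the metrics sample (hypothetical names)
    limit: float  # alert fires when the value is strictly above this
    label: str    # alert name from the list above


# Only the CPU limit (80%) is stated in the guide; the rest are placeholders.
THRESHOLDS = [
    Threshold("cpu_percent", 80.0, "High CPU usage"),
    Threshold("memory_percent", 90.0, "Memory pressure"),
    Threshold("response_ms", 2000.0, "Slow response times"),
    Threshold("error_rate", 0.05, "High error rates"),
    Threshold("storage_percent", 85.0, "Storage limits"),
]


def triggered_alerts(sample: dict[str, float]) -> list[str]:
    """Return the label of every threshold the sample exceeds."""
    return [t.label for t in THRESHOLDS if sample.get(t.metric, 0.0) > t.limit]
```

Metrics missing from a sample are treated as zero here, so they never fire; a stricter version might treat a missing metric as a service failure.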
User Activity Monitoring
User Analytics
Track platform usage:
- User Statistics
  - Total registered users
  - Active users (daily/weekly/monthly)
  - New registrations
  - User retention
  - Geographic distribution
- Experiment Creation
  - New experiments created
  - Active experiments
  - Completed experiments
  - Average experiment duration
  - Popular component types
- Resource Usage
  - Storage per user
  - Bandwidth consumption
  - API usage
  - Computation time
Audit Logs
Comprehensive audit trail:
2024-01-15 10:30:45 | User: admin@example.com | Action: Created experiment "Memory Study"
2024-01-15 10:31:12 | User: admin@example.com | Action: Modified experiment settings
2024-01-15 10:45:30 | User: researcher@uni.edu | Action: Exported data
2024-01-15 11:00:00 | System | Action: Automated backup completed
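For offline review, entries in this layout can be split into structured fields. This is a minimal sketch that assumes the pipe-separated format of the sample entries above; `AuditEntry` and `parse_audit_line` are hypothetical names, and a real export may use a different layout.

```python
from datetime import datetime
from typing import NamedTuple


class AuditEntry(NamedTuple):
    timestamp: datetime
    actor: str   # e-mail address, or "System" for automated actions
    action: str


def parse_audit_line(line: str) -> AuditEntry:
    """Parse 'YYYY-MM-DD HH:MM:SS | User: x | Action: y' (assumed layout).

    Raises ValueError if the line does not have three pipe-separated fields
    or the timestamp does not match.
    """
    stamp, actor, action = (part.strip() for part in line.split(" | ", 2))
    timestamp = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    # str.removeprefix requires Python 3.9 or later
    return AuditEntry(
        timestamp,
        actor.removeprefix("User: "),
        action.removeprefix("Action: "),
    )
```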
Error Monitoring
Error Tracking
Monitor and diagnose issues:
- Error Dashboard
  - Error frequency graph
  - Error types breakdown
  - Affected users count
  - Error trends
  - Critical error alerts
- Error Details
  - Full error stack traces
  - User context
  - Browser/device information
  - Reproduction steps
  - Related logs
- Common Error Patterns
  - WebSocket disconnections
  - Media loading failures
  - API timeout errors
  - Database connection issues
  - Authentication problems
Error Resolution
Tools for addressing issues:
- Quick Actions
  - Restart services
  - Clear caches
  - Reset connections
  - Notify affected users
- Diagnostic Tools
  - Log searcher
  - Database query analyzer
  - Network trace viewer
  - Performance profiler
Resource Monitoring
Storage Management
Monitor storage usage:
- Storage Metrics
  - Total capacity
  - Used space
  - Growth rate
  - File count
  - Largest files
- Storage Breakdown
  - Experiment data
  - Media files
  - Recordings
  - Backups
  - Temporary files
- Cleanup Tools
  - Identify old data
  - Archive completed experiments
  - Remove temporary files
  - Compress recordings
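Total capacity, used space, and growth rate together give a rough estimate of when storage will run out. The sketch below assumes linear growth and gigabyte units; the function name is hypothetical.

```python
from typing import Optional


def days_until_full(
    total_gb: float, used_gb: float, growth_gb_per_day: float
) -> Optional[float]:
    """Estimate the days of storage left, assuming growth stays linear.

    Returns None when usage is flat or shrinking, since no fill date exists.
    """
    if growth_gb_per_day <= 0:
        return None
    remaining = max(total_gb - used_gb, 0.0)
    return remaining / growth_gb_per_day
```

Real growth is rarely linear (large studies arrive in bursts), so treat the result as a prompt for capacity planning rather than a precise deadline.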
Bandwidth Monitoring
Track network usage:
- Bandwidth Metrics
  - Current usage
  - Peak usage times
  - Geographic distribution
  - Protocol breakdown
- Optimization Opportunities
  - Compress media files
  - Enable caching
  - Use a CDN
  - Use connection pooling
Monitoring Best Practices
Regular Monitoring
- Daily Checks
  - Review error logs
  - Check active experiments
  - Monitor resource usage
  - Verify backups
- Weekly Reviews
  - Analyze trends
  - Review performance metrics
  - Check user feedback
  - Plan maintenance
- Monthly Analysis
  - Usage reports
  - Capacity planning
  - Cost analysis
  - Security review
Alert Configuration
- Priority Levels
  - Critical: Immediate attention required
  - Warning: Investigate soon
  - Info: Awareness only
- Notification Channels
  - Email alerts
  - SMS for critical issues
  - Slack integration
  - Dashboard notifications
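One way to combine priority levels with channels is a simple routing table. In this sketch, only "SMS for critical issues" is taken from the list above; the rest of the mapping and all names are assumptions to adapt to your own escalation policy.

```python
from enum import Enum


class Priority(Enum):
    CRITICAL = "critical"
    WARNING = "warning"
    INFO = "info"


# Assumed routing: SMS is reserved for critical alerts, as the guide suggests;
# the remaining assignments are illustrative.
ROUTES = {
    Priority.CRITICAL: ("sms", "email", "slack", "dashboard"),
    Priority.WARNING: ("email", "slack", "dashboard"),
    Priority.INFO: ("dashboard",),
}


def channels_for(priority: Priority) -> tuple[str, ...]:
    """Return the notification channels an alert of this priority goes to."""
    return ROUTES[priority]
```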
Documentation
- Incident Logs
  - Record all incidents
  - Document resolution steps
  - Note prevention measures
  - Track recurring issues
- Performance Baselines
  - Establish normal metrics
  - Document expected ranges
  - Set threshold values
  - Review regularly
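A baseline can be derived from historical samples of a metric. The sketch below uses the mean plus or minus `k` standard deviations as the expected range; that rule and the default `k = 3` are common conventions, not something HyperStudy prescribes, and the function names are hypothetical.

```python
from statistics import mean, stdev


def baseline_range(samples: list[float], k: float = 3.0) -> tuple[float, float]:
    """Return (low, high) as mean -/+ k sample standard deviations."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate spread")
    center = mean(samples)
    spread = stdev(samples)
    return center - k * spread, center + k * spread


def is_anomalous(value: float, samples: list[float], k: float = 3.0) -> bool:
    """True when value falls outside the baseline range of the samples."""
    low, high = baseline_range(samples, k)
    return value < low or value > high
```

Metrics with skewed distributions, such as response times, are often better served by percentile-based ranges than by standard deviations.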
Troubleshooting Common Issues
High Resource Usage
Symptoms: Slow performance, timeouts
Check:
- Active experiment count
- Participant numbers
- Media streaming load
- Database queries
Actions:
- Scale resources if needed
- Optimize queries
- Enable caching
- Limit concurrent sessions
Connection Issues
Symptoms: Participant disconnections
Check:
- WebSocket server status
- Network latency
- Firewall rules
- SSL certificates
Actions:
- Restart WebSocket service
- Check network configuration
- Review security settings
- Update certificates
Data Inconsistencies
Symptoms: Missing or duplicate data
Check:
- Database replication
- Transaction logs
- API error rates
- Client-side errors
Actions:
- Verify data integrity
- Review transaction logs
- Fix synchronization issues
- Implement data validation
Advanced Monitoring
Custom Dashboards
Create specialized monitoring views:
- Select metrics to display
- Choose visualization types
- Set refresh intervals
- Save dashboard configuration
- Share with team members
API Monitoring
For programmatic access:
GET /api/admin/monitoring/status
Authorization: Bearer ADMIN_TOKEN
Response:
{
  "status": "healthy",
  "activeExperiments": 12,
  "onlineParticipants": 47,
  "systemLoad": 0.65,
  "errorRate": 0.02
}
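The request above can be issued from a script using only the Python standard library. This is a minimal sketch: the base URL is a placeholder, the helper names are hypothetical, and treating `errorRate` as a fraction with a 0.05 ceiling is an assumption rather than documented behavior.

```python
import json
import urllib.request

STATUS_PATH = "/api/admin/monitoring/status"  # endpoint shown above


def build_status_request(base_url: str, token: str) -> urllib.request.Request:
    """Build the authenticated GET request for the status endpoint."""
    url = base_url.rstrip("/") + STATUS_PATH
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}, method="GET"
    )


def fetch_status(base_url: str, token: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON body (performs network I/O)."""
    request = build_status_request(base_url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def is_healthy(status: dict, max_error_rate: float = 0.05) -> bool:
    """Assumed rule: status is "healthy" and errorRate is within the ceiling."""
    return (
        status.get("status") == "healthy"
        and status.get("errorRate", 0.0) <= max_error_rate
    )
```

A scheduled job could call `fetch_status` with your deployment's URL and an admin token, then raise an alert whenever `is_healthy` returns `False`.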
Integration with External Tools
Connect to monitoring services:
- Datadog
- New Relic
- Prometheus
- Grafana
- CloudWatch
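How each integration is configured is not covered here. As one illustration, fields from the status response could be rendered in the Prometheus text exposition format for scraping. The metric names below are invented for this sketch, and it assumes you serve the resulting text from your own endpoint.

```python
# Map status-response fields to hypothetical Prometheus metric names.
GAUGES = {
    "activeExperiments": "hyperstudy_active_experiments",
    "onlineParticipants": "hyperstudy_online_participants",
    "systemLoad": "hyperstudy_system_load",
    "errorRate": "hyperstudy_error_rate",
}


def to_prometheus_text(status: dict) -> str:
    """Render known numeric fields as gauges in the text exposition format."""
    lines = []
    for field, metric in GAUGES.items():
        if field in status:
            lines.append(f"# TYPE {metric} gauge")
            lines.append(f"{metric} {float(status[field])}")
    return "\n".join(lines) + "\n"
```

Grafana can then chart these gauges through a Prometheus data source.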
Security Monitoring
Access Monitoring
Track system access:
- Login Activity
  - Successful logins
  - Failed attempts
  - Suspicious patterns
  - Geographic anomalies
- Permission Changes
  - Role modifications
  - Access grants
  - Permission revocations
  - Admin actions
- Data Access
  - Export activities
  - API usage
  - Bulk operations
  - Sensitive data access
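One example of a suspicious login pattern is a burst of failed attempts on a single account. The sketch below flags accounts with too many failures inside a sliding time window; the input shape, the limit of 5 failures, and the 10-minute window are all assumptions, not HyperStudy defaults.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta
from typing import Iterable


def flag_failed_login_bursts(
    attempts: Iterable[tuple[datetime, str, bool]],
    max_failures: int = 5,
    window: timedelta = timedelta(minutes=10),
) -> set[str]:
    """Return accounts with at least max_failures failed logins within window.

    attempts: (timestamp, account, succeeded) tuples, sorted by timestamp.
    """
    recent: dict[str, deque] = defaultdict(deque)
    flagged: set[str] = set()
    for when, account, succeeded in attempts:
        if succeeded:
            continue
        failures = recent[account]
        failures.append(when)
        # Drop failures that have slid out of the window
        while failures and when - failures[0] > window:
            failures.popleft()
        if len(failures) >= max_failures:
            flagged.add(account)
    return flagged
```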