What is Argus?
Argus is your intelligent assistant for trace analysis. It can:
- Explain Anomalies - Identify and explain unusual patterns or behaviors in your traces
- Analyze Metrics Failures - Help you understand why specific evaluations or metrics failed
- Deep Trace Analysis - Provide detailed insights into individual traces or groups of traces
- Compare Traces - Analyze differences between successful and failed traces
- Application-Wide Analysis - Examine patterns across all traces from one or multiple applications
Key Features
Natural Language Queries
Ask questions in plain English, such as:
- “Why did this trace fail the hallucination check?”
- “What’s causing the spike in latency for my chatbot application?”
- “Explain the anomalies in traces from the last hour”
- “Compare successful and failed traces for the recommendation system”
- “What are the common patterns in traces with high token usage?”
Comprehensive Analysis
Argus can analyze:
- Single Traces - Deep dive into a specific trace’s execution, spans, and evaluation results
- Multiple Traces - Identify patterns across groups of traces
- All Application Traces - Analyze trends and issues across your entire application
- Multi-Application Analysis - Compare behavior across different applications
Contextual Understanding
Argus understands:
- Span hierarchies and execution flows
- Evaluation results and failure reasons
- System metadata and health indicators
- Token usage, costs, and performance metrics
- Session-level conversational patterns
Getting Started with Argus
Chat with Argus
- Navigate to your Trusys Traces tab or trace details
- Click Ask Argus to start a new analysis session
- Argus will greet you and ask what you’d like to analyze
- Type your question or describe what you want to investigate
- Argus will analyze your traces and provide insights
Example Conversations
Analyzing Failed Evaluations:
Managing Your Chats
Access all your previous conversations with Argus:
- Navigate to the Argus section on the header
- Your chat history appears
- Click on any chat from your history
- The full conversation loads
- Continue asking questions in the same context
- Argus remembers the previous discussion and traces analyzed
Searching Chat History
- Use the Search bar at the top of the chat history
- Enter keywords, trace IDs, or topics
- Argus filters the history to show only chats containing your search terms
- Click on relevant results to open that conversation
Example search terms:
- “hallucination failures”
- “trace abc-123”
- “latency spike”
- “production deployment”
- “token usage analysis”
What Can Argus Analyze?
Trace-Level Analysis
- Span Hierarchies - Understand execution flow and dependencies
- Performance Metrics - Identify bottlenecks and slow operations
- Evaluation Results - Explain why traces passed or failed checks
- Token Usage - Analyze token consumption and costs
- Error Patterns - Identify common failure modes
Application-Level Insights
- Trend Analysis - Spot patterns over time
- Anomaly Detection - Highlight unusual behaviors
- Comparison Analysis - Compare different time periods or versions
- Resource Utilization - Track token usage, costs, and API calls
- Quality Metrics - Evaluate overall application performance
Multi-Application Analysis
- Cross-Application Patterns - Find common issues across applications
- Performance Comparison - Compare metrics between applications
- Resource Distribution - Understand resource usage across your portfolio
- Best Practices - Identify what works well in one app to apply elsewhere
Use Cases
Debugging Failed Evaluations
When traces fail quality checks:
- Ask Argus to explain the failures
- Identify common patterns in failed traces
- Get recommendations for fixes
- Compare failed vs. successful traces
- Identify latency bottlenecks
- Analyze token usage patterns
- Find opportunities to reduce costs
- Optimize prompt engineering based on insights
- Investigate sudden changes in metrics
- Explain anomalies in real time
- Analyze deployment impacts
- Track quality trends over time
- Trace problems to their source
- Understand cascading failures
- Identify external dependencies causing issues
- Get actionable recommendations
- Analyze evaluation coverage
- Identify edge cases and gaps
- Compare test vs. production behavior
- Validate improvements after changes
Tips for Effective Use
Be Specific
Instead of: “Why are things failing?”
Try: “Why are traces from my chatbot application failing the relevance check in the last 24 hours?”
Provide Context
Help Argus understand what you’re investigating:
- Mention specific applications or trace IDs
- Include time ranges when relevant
- Specify which metrics or evaluations you care about
- Note any recent changes (deployments, config updates)
- Drill deeper into specific findings
- Ask for examples or evidence
- Request different analysis angles
- Explore recommendations in detail
- Copy the Trace, Span, or Session ID from the Trace or Session details section
- Paste it in your question to Argus
- Get a detailed breakdown of that specific execution
- Start with broad questions
- Narrow down based on initial insights
- Ask Argus to compare different scenarios
- Request actionable next steps
- Chat Privacy - Your chats are private to your account
- Data Security - All conversations are encrypted
- Trace Access - Argus only accesses traces you have permission to view
- No Data Sharing - Your analysis sessions are not shared with other users
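
The “Be Specific” and “Provide Context” tips can be made concrete with a small helper that assembles a question from the details you already know. This is only an illustrative sketch: Argus takes plain-English text typed into the chat, and the `QuestionContext` class, its field names, and `build_question` are hypothetical names invented here, not part of any Argus or Trusys API.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionContext:
    """Details worth stating up front. All field names are illustrative."""
    application: str
    evaluation: str
    time_range: str = ""
    trace_ids: list[str] = field(default_factory=list)
    recent_changes: str = ""


def build_question(ctx: QuestionContext) -> str:
    """Compose one specific, plain-English question from the given context."""
    question = (
        f"Why are traces from my {ctx.application} application "
        f"failing the {ctx.evaluation} check"
    )
    if ctx.time_range:
        question += f" in {ctx.time_range}"
    question += "?"

    # Optional context is appended as short follow-on sentences.
    extras = []
    if ctx.trace_ids:
        extras.append("Trace IDs: " + ", ".join(ctx.trace_ids) + ".")
    if ctx.recent_changes:
        extras.append(f"Recent changes: {ctx.recent_changes}.")
    return " ".join([question, *extras])


print(build_question(QuestionContext(
    application="chatbot",
    evaluation="relevance",
    time_range="the last 24 hours",
)))
```

The printed text is what you would paste into the chat. Keeping the application, evaluation, and time range as separate fields makes it harder to leave one out by accident.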