Argus is an AI chatbot built into Trusys that helps you analyze and understand your application traces through natural conversation. Instead of manually reviewing traces and metrics, you can ask Argus questions in plain language and get instant insights about your AI application’s performance, failures, and anomalies.

What is Argus?

Argus is your intelligent assistant for trace analysis. It can:
  • Explain Anomalies - Identify and explain unusual patterns or behaviors in your traces
  • Analyze Metric Failures - Help you understand why specific evaluations or metrics failed
  • Deep Trace Analysis - Provide detailed insights into individual traces or groups of traces
  • Compare Traces - Analyze differences between successful and failed traces
  • Application-Wide Analysis - Examine patterns across all traces from one or multiple applications

Key Features

Natural Language Queries

Ask questions in plain English, such as:
  • “Why did this trace fail the hallucination check?”
  • “What’s causing the spike in latency for my chatbot application?”
  • “Explain the anomalies in traces from the last hour”
  • “Compare successful and failed traces for the recommendation system”
  • “What are the common patterns in traces with high token usage?”

Comprehensive Analysis

Argus can analyze:
  • Single Traces - Deep dive into a specific trace’s execution, spans, and evaluation results
  • Multiple Traces - Identify patterns across groups of traces
  • All Application Traces - Analyze trends and issues across your entire application
  • Multi-Application Analysis - Compare behavior across different applications

Contextual Understanding

Argus understands:
  • Span hierarchies and execution flows
  • Evaluation results and failure reasons
  • System metadata and health indicators
  • Token usage, costs, and performance metrics
  • Session-level conversational patterns

Getting Started with Argus

Chat with Argus

  1. Navigate to the Traces tab or a trace details page in Trusys
  2. Click Ask Argus to start a new analysis session
  3. Argus will greet you and ask what you’d like to analyze
  4. Type your question or describe what you want to investigate
  5. Argus will analyze your traces and provide insights

Example Conversations

Analyzing Failed Evaluations:
You: Why are my traces failing the relevance check?

Argus: I've analyzed your recent traces and found that 15 out of 23 
traces failed the relevance evaluation. The main issues are:

1. Context mismatch: 60% of failures show the retrieved context 
   doesn't match the user query topic
2. Incomplete responses: 30% are missing key information requested
3. Off-topic responses: 10% deviated from the original question

The pattern started 3 hours ago after deployment v2.3.1. Would you 
like me to show specific examples?

Investigating Anomalies:
You: Explain the anomalies in my application traces from yesterday

Argus: I've identified 3 anomalies in your traces from yesterday:

1. Latency Spike (2:15 PM - 2:45 PM): Average response time jumped 
   from 800ms to 3.2s. Affected 127 traces. Root cause: Vector 
   database query timeout increased.

2. High Token Usage (5:30 PM - 6:00 PM): Token consumption 3x higher 
   than baseline. 45 traces affected. Pattern shows unusually long 
   context windows being passed to the LLM.

3. Evaluation Failures (8:00 PM - 9:00 PM): Sudden increase in PII 
   detection failures. 23 traces flagged. User queries contained 
   more personal information than usual.

Would you like me to dive deeper into any of these?

Deep Trace Analysis:
You: Analyze trace ID abc-123-xyz in detail

Argus: Here's a detailed analysis of trace abc-123-xyz:

Overview:
- Total Duration: 2.3 seconds
- Status: Failed (Hallucination Check)
- Token Usage: 1,247 tokens ($0.0062)
- Spans: 7 total (5 LLM calls, 2 vector DB queries)

Execution Flow:
1. User query received: "What are the benefits of exercise?"
2. Vector DB retrieval: Found 3 relevant documents
3. LLM call 1: Context processing (340ms)
4. LLM call 2: Response generation (1.8s)
5. Hallucination check: FAILED

Failure Reason:
The response included a claim that "exercise cures diabetes" which 
is not supported by the retrieved context. The context only mentions 
exercise helps "manage" diabetes, not cure it.

Recommendation:
Add stricter grounding instructions in your prompt to ensure claims 
are directly supported by provided context.

Managing Your Chats

Access all your previous conversations with Argus:
  1. Navigate to the Argus section in the header
  2. Your chat history appears
  3. Click on any chat from your history
  4. The full conversation loads
  5. Continue asking questions in the same context
  6. Argus remembers the previous discussion and traces analyzed

Searching Chat History

  1. Use the Search bar at the top of the chat history
  2. Enter keywords, trace IDs, or topics
  3. Argus filters chats containing your search terms
  4. Click on relevant results to open that conversation

Search Examples:
  • “hallucination failures”
  • “trace abc-123”
  • “latency spike”
  • “production deployment”
  • “token usage analysis”

What Can Argus Analyze?

Trace-Level Analysis

  • Span Hierarchies - Understand execution flow and dependencies
  • Performance Metrics - Identify bottlenecks and slow operations
  • Evaluation Results - Explain why traces passed or failed checks
  • Token Usage - Analyze token consumption and costs
  • Error Patterns - Identify common failure modes
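
To make "identify bottlenecks" concrete, here is a minimal sketch of the kind of check you could run yourself on span timings. It is not how Argus works internally; the `Span` record, its field names, and the span names are hypothetical, and the two durations are borrowed from the deep trace example earlier on this page.

```python
from typing import List, Tuple, TypedDict


class Span(TypedDict):
    # Hypothetical span record; these field names are assumptions,
    # not a documented Trusys schema.
    name: str
    duration_ms: float


def slowest_span(spans: List[Span]) -> Tuple[str, float]:
    """Return the slowest span's name and its share of total span time."""
    total = sum(s["duration_ms"] for s in spans)
    if total <= 0:
        raise ValueError("spans must have a positive total duration")
    worst = max(spans, key=lambda s: s["duration_ms"])
    return worst["name"], worst["duration_ms"] / total


# Durations taken from the example conversation (340ms and 1.8s).
spans: List[Span] = [
    {"name": "context_processing", "duration_ms": 340.0},
    {"name": "response_generation", "duration_ms": 1800.0},
]
name, share = slowest_span(spans)
print(f"{name}: {share:.0%} of span time")  # response_generation: 84% of span time
```

A result like this is a useful starting point for a follow-up question, for example asking Argus why the response generation step is slow in that trace.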

Application-Level Insights

  • Trend Analysis - Spot patterns over time
  • Anomaly Detection - Highlight unusual behaviors
  • Comparison Analysis - Compare different time periods or versions
  • Resource Utilization - Track token usage, costs, and API calls
  • Quality Metrics - Evaluate overall application performance
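
As an illustration of what "higher than baseline" can mean in practice, the sketch below flags time windows whose value is at least three times the median of all windows. This is one simple heuristic chosen for the example, not a description of Argus's detection method; the function name, the 3x threshold, and the sample numbers are all assumptions.

```python
from statistics import median
from typing import List


def flag_anomalies(values: List[float], factor: float = 3.0) -> List[int]:
    """Return indexes of windows whose value is >= factor * median."""
    if not values:
        return []
    baseline = median(values)
    if baseline <= 0:
        return []  # no meaningful baseline to compare against
    return [i for i, v in enumerate(values) if v >= factor * baseline]


# Hypothetical average tokens per trace for consecutive 30-minute windows.
tokens_per_window = [1200.0, 1150.0, 1300.0, 3900.0, 1250.0]
print(flag_anomalies(tokens_per_window))  # [3]
```

The median is used as the baseline here so that a single extreme window does not inflate the reference value it is being compared against.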

Multi-Application Analysis

  • Cross-Application Patterns - Find common issues across applications
  • Performance Comparison - Compare metrics between applications
  • Resource Distribution - Understand resource usage across your portfolio
  • Best Practices - Identify what works well in one app to apply elsewhere

Use Cases

Debugging Failed Evaluations

When traces fail quality checks:
  • Ask Argus to explain the failures
  • Identify common patterns in failed traces
  • Get recommendations for fixes
  • Compare failed vs. successful traces

Performance Optimization

To improve application performance:
  • Identify latency bottlenecks
  • Analyze token usage patterns
  • Find opportunities to reduce costs
  • Optimize prompt engineering based on insights

Production Monitoring

For ongoing monitoring:
  • Investigate sudden changes in metrics
  • Explain anomalies in real time
  • Analyze deployment impacts
  • Track quality trends over time

Root Cause Analysis

When issues occur:
  • Trace problems to their source
  • Understand cascading failures
  • Identify external dependencies causing issues
  • Get actionable recommendations

Quality Assurance

For testing and validation:
  • Analyze evaluation coverage
  • Identify edge cases and gaps
  • Compare test vs. production behavior
  • Validate improvements after changes

Tips for Effective Use

Be Specific

Instead of: “Why are things failing?”
Try: “Why are traces from my chatbot application failing the relevance check in the last 24 hours?”

Provide Context

Help Argus understand what you’re investigating:
  • Mention specific applications or trace IDs
  • Include time ranges when relevant
  • Specify which metrics or evaluations you care about
  • Note any recent changes (deployments, config updates)
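
The tips above amount to a simple template: name the application, the evaluation, and the time range, and mention recent changes when you know of any. The sketch below shows that template as a small helper for drafting a question to paste into the chat. The function `build_question` and its parameters are hypothetical and only produce text; nothing here calls Argus or any Trusys API.

```python
from typing import Optional


def build_question(
    application: str,
    evaluation: str,
    time_range: str,
    recent_change: Optional[str] = None,
) -> str:
    """Compose a specific, context-rich question from its parts."""
    question = (
        f"Why are traces from my {application} application "
        f"failing the {evaluation} check in {time_range}?"
    )
    if recent_change:
        # Optional context such as a deployment or config update.
        question += f" Recent change: {recent_change}."
    return question


print(build_question("chatbot", "relevance", "the last 24 hours"))
```

With these arguments the helper reproduces the "Try" question shown under Be Specific; passing a fourth argument such as "deployment v2.3.1" appends the recent change as extra context.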

Ask Follow-Up Questions

Argus maintains conversation context:
  • Drill deeper into specific findings
  • Ask for examples or evidence
  • Request different analysis angles
  • Explore recommendations in detail

Use Trace, Span, or Session IDs

For specific analysis:
  • Copy the Trace, Span, or Session ID from the Trace or Session details section
  • Paste it in your question to Argus
  • Get a detailed breakdown of that specific execution

Iterate and Refine

  • Start with broad questions
  • Narrow down based on initial insights
  • Ask Argus to compare different scenarios
  • Request actionable next steps

Privacy and Security

  • Chat Privacy - Your chats are private to your account
  • Data Security - All conversations are encrypted
  • Trace Access - Argus only accesses traces you have permission to view
  • No Data Sharing - Your analysis sessions are not shared with other users