Media AI Analysis Guide¶
Introduction to AI Media Analysis¶
The Wickson API's file analysis capabilities let you analyze media files with AI and ask specific questions about their content, without storing the files as vector embeddings. This suits one-time analysis, extracting specific insights, or exploring content before deciding to store it permanently.
Key Concepts¶
Analysis vs. Storage¶
- Analysis: One-time operation that extracts insights without permanent storage
- Storage: Processes media and stores vector embeddings for future search and retrieval
The analyze endpoint is ideal when you:
- Need a quick answer about file content
- Want to explore content before committing to storage
- Have a specific question about a document, image, 3D model, video, or audio file
- Don't need to search through the content later
AI-Powered Analysis¶
Our system employs advanced AI models to understand and analyze your media:
- Extract key insights from documents
- Identify objects and scenes in images
- Comprehend spoken content in audio and video
- Process and understand 3D models
- Answer specific questions about almost any media type
File Handling During Analysis¶
- Your uploaded file is used only for the analysis and is not stored
- Only the analysis results and file metadata are returned
- No vectors, content, or any other data from your file are stored as a result of using the media analysis endpoint
Working with File Analysis¶
Complete Analysis Example¶
import requests

# Configuration
api_key = "YOUR_API_KEY"
file_path = "document.pdf"
query = "What are the key arguments in this document?"

# Perform analysis
with open(file_path, "rb") as file:
    response = requests.post(
        "https://api.wickson.ai/v1/ai/analyze",
        headers={"X-Api-Key": api_key},
        files={"file": file},
        data={
            "query": query,
            "response_format": "json",
            "system_prompt": "Analyze this as a policy document with attention to key proposals."
        }
    )

# Process response
if response.status_code == 200:
    data = response.json()["data"]

    # Print file and processing information
    print(f"Analyzed: {data['media_info']['filename']} ({data['media_info']['media_type']})")
    print(f"Processing time: {data['processing_info']['processing_time_ms']}ms")
    print(f"Cost: ${data['processing_info']['cost']}")

    # Print analysis results
    print("\nANALYSIS RESULTS:")
    print(f"Summary: {data['analysis']['summary']}")
    print("\nKey elements:")
    for element in data['analysis']['key_elements']:
        print(f"- {element}")

    # Print answer to query
    print(f"\nQUERY: {query}")
    print(f"ANSWER: {data['answer']['main_response']}")

    # Print insights
    if "insights" in data["answer"]:
        print("\nKEY INSIGHTS:")
        for insight in data["answer"]["insights"]:
            print(f"Observation: {insight['observation']}")
            print(f"Analysis: {insight['analysis']}")
            print(f"Implication: {insight['implication']}")
            print()
else:
    print(f"Error {response.status_code}: {response.text}")
cURL Example¶
curl -X POST https://api.wickson.ai/v1/ai/analyze \
  -H "X-Api-Key: YOUR_API_KEY" \
  -F "file=@document.pdf" \
  -F "query=What are the key arguments in this document?" \
  -F "system_prompt=Analyze this as a policy document with attention to key proposals." \
  -F "response_format=json"
Reusable Analysis Function¶
import requests

def analyze_file(file_path, query, system_prompt=None, response_format="text"):
    """
    Analyze a media file using the Wickson API.

    Parameters:
        file_path (str): Path to the media file
        query (str): Question or query about the media content
        system_prompt (str, optional): Instructions for the AI model
        response_format (str, optional): Format for the response ('text' or 'json')

    Returns:
        dict: The analysis results from the API
    """
    url = "https://api.wickson.ai/v1/ai/analyze"
    headers = {"X-Api-Key": "YOUR_API_KEY"}
    data = {
        "query": query,
        "response_format": response_format
    }
    if system_prompt:
        data["system_prompt"] = system_prompt

    # Use context manager to ensure file is properly closed
    with open(file_path, "rb") as file_obj:
        files = {"file": file_obj}
        response = requests.post(url, headers=headers, data=data, files=files)

    if response.status_code == 200:
        return response.json()
    else:
        raise Exception(f"Error {response.status_code}: {response.text}")

# Example usage with error handling
try:
    result = analyze_file(
        "quarterly_report.pdf",
        "What are the key financial metrics and how did they change from previous quarters?",
        system_prompt="Focus on revenue, profit margins, and growth trends."
    )
    # Print the main response
    print(f"Analysis of {result['data']['media_info']['filename']}:")
    print(result["data"]["answer"]["main_response"])
except Exception as e:
    print(f"Analysis failed: {e}")
Customizing Analysis with System Prompts¶
System prompts guide the AI's approach to analysis:
# Legal analysis
legal_analysis = analyze_file(
    "contract.pdf",
    "What potential liabilities should I be aware of?",
    system_prompt="Analyze this from a legal perspective, identifying potential risks, liabilities, and obligations. Focus on unclear terms and potential loopholes."
)

# Technical analysis
technical_analysis = analyze_file(
    "schematic.jpg",
    "Explain how this circuit works",
    system_prompt="Analyze this as an electrical engineer would, focusing on component functionality, signal flow, and potential design considerations."
)

# Financial analysis
financial_analysis = analyze_file(
    "annual_report.pdf",
    "What are the company's growth prospects?",
    system_prompt="Analyze this as a financial analyst, focusing on revenue trends, market position, competitive advantages, and financial health indicators."
)
Getting Structured Responses¶
For programmatic use, request JSON-formatted responses:
try:
    structured_result = analyze_file(
        "research_paper.pdf",
        "Summarize the methodology and key findings",
        response_format="json"
    )

    # Access structured components
    data = structured_result["data"]
    media_info = data["media_info"]
    main_response = data["answer"]["main_response"]
    insights = data["answer"]["insights"]

    # Process insights programmatically
    print(f"Analysis of {media_info['filename']} ({media_info['format']}):")
    print(f"\nMain response: {main_response}")
    print("\nInsights:")
    for insight in insights:
        print(f"Observation: {insight['observation']}")
        print(f"Analysis: {insight['analysis']}")
        print(f"Implication: {insight['implication']}")
        print()

    # Extract and use content summary
    if "analysis" in data:
        summary = data["analysis"]["summary"]
        print(f"Summary: {summary}")
        if "key_elements" in data["analysis"]:
            print("\nKey elements:")
            for element in data["analysis"]["key_elements"]:
                print(f"- {element}")
except Exception as e:
    print(f"Error processing analysis: {e}")
Supported Media Types and Limits¶
| Media Type | Formats | Size Limits | Content Limits |
|---|---|---|---|
| Documents | PDF, DOCX, TXT, MD, etc. | 20MB | 75k tokens |
| Images | JPG, PNG, GIF, WEBP, etc. | 20MB | - |
| Video | MP4, MOV, AVI, etc. | 100MB | 10 minutes |
| Audio | MP3, WAV, FLAC, etc. | 50MB | 10 minutes |
| 3D Models | GLB, GLTF, OBJ, etc. | 50MB | - |
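The limits in the table can be checked locally before uploading, which avoids spending a request on a file that would be rejected. The following is a minimal sketch, not part of the Wickson API: `MEDIA_LIMITS` and `check_upload` are hypothetical names, only the formats named explicitly in the table are listed (the "etc." entries are not guessed), and "MB" is assumed to mean 1024 × 1024 bytes. Content limits (tokens, duration) are not checked here.

```python
from pathlib import Path

MB = 1024 * 1024  # assumption: the table's "MB" is treated as 2**20 bytes

# Size limits copied from the table above. The extension sets contain only
# the formats the table names explicitly; other accepted formats are unknown.
MEDIA_LIMITS = {
    "document": ({".pdf", ".docx", ".txt", ".md"}, 20 * MB),
    "image": ({".jpg", ".png", ".gif", ".webp"}, 20 * MB),
    "video": ({".mp4", ".mov", ".avi"}, 100 * MB),
    "audio": ({".mp3", ".wav", ".flac"}, 50 * MB),
    "3d_model": ({".glb", ".gltf", ".obj"}, 50 * MB),
}


def check_upload(filename, size_bytes):
    """Return (ok, detail): detail is the media type or a rejection reason."""
    ext = Path(filename).suffix.lower()
    for media_type, (extensions, max_bytes) in MEDIA_LIMITS.items():
        if ext in extensions:
            if size_bytes > max_bytes:
                return False, f"{media_type} exceeds {max_bytes // MB}MB limit"
            return True, media_type
    return False, f"unrecognized extension '{ext}'"
```

A typical call would pass `os.path.getsize(path)` as `size_bytes`. An "unrecognized extension" result only means the format is not named in the table; the API may still accept it.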
Best Practices¶
Crafting Effective Queries¶
Be Specific:
- Ask clear, focused questions
- Less effective: "Tell me about this document"
- More effective: "What are the three main arguments presented in this research paper?"
Use Context-Appropriate Questions:
- Tailor questions to the media type
- Documents: Ask about content, arguments, structure
- Images: Ask about objects, scenes, compositions
- Video/Audio: Ask about spoken content, scenes, timeline
- 3D Models: Ask about structure, components, spatial relationships
Use Multi-Part Questions Sparingly:
- Although the system handles complex queries, breaking them into separate, focused questions often works better
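Splitting a multi-part question can be sketched as one request per focused question. In this illustration `ask_separately` is a hypothetical helper, and `analyze` stands for any callable with the shape `analyze(file_path, query, system_prompt=None)`, such as the `analyze_file` function defined earlier in this guide. Each question is a separate analysis request, so each one is billed as its own query.

```python
def ask_separately(analyze, file_path, questions, system_prompt=None):
    """Send each question as its own analysis request.

    `analyze` is any callable taking (file_path, query, system_prompt=None);
    it is passed in so this helper does not depend on a specific client.
    Returns a dict mapping each question to whatever `analyze` returned.
    """
    answers = {}
    for question in questions:
        answers[question] = analyze(file_path, question, system_prompt=system_prompt)
    return answers
```

For example, instead of asking "What are the risks, who are the parties, and when does it end?" in one query, pass the three questions as a list and read each answer from the returned dict.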
Optimizing System Prompts¶
System prompts can dramatically improve analysis quality by providing:
- Domain Context: "Analyze this as a financial advisor would..."
- Analysis Focus: "Focus on risk factors and mitigation strategies..."
- Output Preferences: "Provide a detailed technical breakdown with numerical assessments..."
- Specialized Knowledge Application: "Apply principles of architectural design when analyzing this blueprint..."
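The four ingredients listed above can be combined into a single `system_prompt` string. This is a small sketch under the assumption that the API treats the system prompt as free text; `build_system_prompt` and its parameter names are hypothetical and exist only for illustration.

```python
def build_system_prompt(domain=None, focus=None, output=None, knowledge=None):
    """Join the provided prompt parts into one sentence-per-part string.

    Empty or missing parts are skipped. Trailing periods are stripped from
    each part so the joined result does not contain doubled punctuation.
    """
    parts = [domain, focus, output, knowledge]
    cleaned = [p.strip().rstrip(".") for p in parts if p and p.strip()]
    if not cleaned:
        return ""
    return ". ".join(cleaned) + "."


prompt = build_system_prompt(
    domain="Analyze this as a financial advisor would",
    focus="Focus on risk factors and mitigation strategies",
)
# prompt == "Analyze this as a financial advisor would. Focus on risk factors and mitigation strategies."
```

Keeping the parts separate makes it easy to reuse one domain context across several files while varying only the focus.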
Media-Specific Tips¶
Documents¶
- Ensure scanned PDFs contain a properly OCR'd text layer
- For long documents, ask about specific sections
- Use system prompts to guide focus to relevant areas
Images¶
- Higher resolution images enable better detail analysis
- Consider image context when asking questions
- For diagrams/charts, specifically request data interpretation
Video/Audio¶
- Shorter clips get more thorough analysis
- Consider mentioning specific timeframes in questions
- For multilingual content, specify language if needed
3D Models¶
- Simpler models get more accurate analysis
- Ask about specific components or areas
- Request measurements or spatial relationships explicitly
Understanding Analysis Results¶
Analysis responses include several components:
Media Information:
- Filename, format, size, etc.
Processing Information:
- AI model used, processing time, status
Analysis Content:
- Summary: Concise overall summary
- Key Elements: Important components identified
- Relevant Details: Specific information related to your query
Answer:
- Main Response: Direct answer to your query
- Insights: Deeper observations with analysis and implications
- Additional Considerations: Related points worth noting
Example Response Structure¶
{
  "success": true,
  "message": "Media analysis completed successfully",
  "data": {
    "media_info": {
      "filename": "contract.pdf",
      "format": ".pdf",
      "media_type": "document",
      "processed_at": "2024-12-19T15:30:00Z",
      "size": 1234567
    },
    "processing_info": {
      "model": "gemini",
      "processing_time_ms": 1200,
      "status": "success",
      "cost": 0.03,
      "balance": 9.97
    },
    "query_info": {
      "query": "What are the key takeaways from this document?",
      "response_format": "json",
      "system_prompt": "Analyze this document from a legal perspective..."
    },
    "analysis": {
      "summary": "This contract outlines...",
      "key_elements": ["Clause 1...", "Clause 2..."],
      "relevant_details": "..."
    },
    "answer": {
      "main_response": "The key takeaways are...",
      "insights": [
        {"observation": "...", "analysis": "...", "implication": "..."},
        {"observation": "...", "analysis": "...", "implication": "..."}
      ],
      "additional_considerations": "..."
    }
  }
}
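When consuming this structure in code, it can help to read fields defensively rather than indexing directly. The sketch below assumes that some sections may be absent from a given response (the complete example earlier in this guide checks for `insights` before using it, which suggests this is possible); `extract_answer` is a hypothetical helper, and the field names are taken from the example response above.

```python
def extract_answer(response_json):
    """Collect commonly used fields from an analyze response dict.

    Missing sections yield None (for single values) or an empty list
    (for list fields) instead of raising KeyError.
    """
    data = response_json.get("data") or {}
    media_info = data.get("media_info") or {}
    analysis = data.get("analysis") or {}
    answer = data.get("answer") or {}
    return {
        "filename": media_info.get("filename"),
        "summary": analysis.get("summary"),
        "key_elements": analysis.get("key_elements") or [],
        "main_response": answer.get("main_response"),
        "insights": answer.get("insights") or [],
    }
```

Calling `extract_answer(response.json())` then gives a flat dict that downstream code can use without nested existence checks.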
Error Handling¶
import requests
import time

def analyze_with_error_handling(file_path, query):
    url = "https://api.wickson.ai/v1/ai/analyze"
    headers = {"X-Api-Key": "YOUR_API_KEY"}
    try:
        with open(file_path, "rb") as file:
            response = requests.post(
                url,
                headers=headers,
                files={"file": file},
                data={"query": query},
                timeout=180  # Longer timeout for large files
            )

        # Check for HTTP errors
        response.raise_for_status()

        # Parse response
        result = response.json()

        # Check for API-level errors
        if not result.get("success", False):
            error_msg = result.get("message", "Unknown error")
            error_code = result.get("code", "unknown")
            raise Exception(f"API error ({error_code}): {error_msg}")
        return result["data"]
    except FileNotFoundError:
        print(f"Error: File '{file_path}' not found")
    except requests.exceptions.RequestException as e:
        print(f"Request error: {str(e)}")
        if getattr(e, "response", None) is not None:
            try:
                error_data = e.response.json()
                print(f"API details: {error_data.get('message', 'No details available')}")
            except ValueError:
                # Error body was not valid JSON; fall back to the status code
                print(f"Status code: {e.response.status_code}")
    except Exception as e:
        print(f"Error: {str(e)}")
    return None

# Usage example with retry
def analyze_with_retry(file_path, query, max_retries=3):
    for attempt in range(max_retries):
        result = analyze_with_error_handling(file_path, query)
        if result is not None:
            return result
        if attempt < max_retries - 1:
            print(f"Attempt {attempt + 1} of {max_retries} failed; retrying...")
            time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s, ...
    print(f"Failed after {max_retries} attempts")
    return None
Cost Considerations¶
| Operation | Cost | Notes |
|---|---|---|
| Media Analysis | $0.03 per query | Flat rate regardless of media type |
Unlike media processing for storage, analysis is a one-time operation with no storage costs.
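Because the rate is flat, budgeting is simple arithmetic: the total is the number of queries times $0.03. The sketch below uses `Decimal` to avoid floating-point rounding on currency; `estimate_cost` and `queries_affordable` are hypothetical helpers, and the balance is assumed to be in the same dollar units as the `balance` field shown in the example response.

```python
from decimal import Decimal

COST_PER_QUERY = Decimal("0.03")  # flat rate from the table above


def estimate_cost(num_queries):
    """Total cost in dollars for a number of analysis queries."""
    if num_queries < 0:
        raise ValueError("num_queries must be non-negative")
    return COST_PER_QUERY * num_queries


def queries_affordable(balance):
    """How many whole queries a given balance covers at the flat rate."""
    return int(Decimal(str(balance)) // COST_PER_QUERY)


# 25 queries cost 25 * 0.03 = 0.75 dollars
batch_cost = estimate_cost(25)
```

For instance, with the balance of 9.97 from the example response, `queries_affordable(9.97)` gives 332, since 332 × 0.03 = 9.96.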
Troubleshooting¶
| Issue | Solutions |
|---|---|
| Analysis too general | |
| Missing specific details | |
| Inaccurate analysis | |
| File rejected | |
| HTTP 400 errors | |
| HTTP 401/403 errors | |
| HTTP 429 errors | |
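For the HTTP rows in the table above, client code usually needs to decide whether retrying could help. The following sketch maps status codes to a suggested action based on general HTTP semantics only; it is an assumption, not documented Wickson behavior, and `classify_status` and its return labels are hypothetical.

```python
def classify_status(status_code):
    """Map an HTTP status code to a coarse client-side action.

    Assumption: based on generic HTTP semantics, not on documented
    behavior of the analyze endpoint.
    """
    if 200 <= status_code < 300:
        return "ok"
    if status_code == 429 or 500 <= status_code < 600:
        return "retry"  # rate limited or server-side failure: back off, retry
    if status_code in (401, 403):
        return "check_credentials"  # retrying with the same key will not help
    if status_code == 400:
        return "fix_request"  # inspect the file and query fields first
    return "unknown"
```

A retry loop can then call `classify_status(response.status_code)` and back off only when the result is `"retry"`, rather than repeating requests that are certain to fail again.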
Practical Applications¶
Document Analysis¶
# Extract key business risks
risks_analysis = analyze_file(
    "business_plan.pdf",
    "What are the top five risks to this business model?",
    system_prompt="Analyze as a business strategist, focusing on market risks, operational challenges, competitive threats, financial vulnerabilities, and scaling issues."
)

# Analyze contract terms
contract_analysis = analyze_file(
    "agreement.pdf",
    "What are the key obligations and termination conditions?",
    system_prompt="Analyze this contract from a legal perspective, identifying obligations, liabilities, termination clauses, and potential areas of concern."
)
Image Analysis¶
# Architectural analysis
architecture_analysis = analyze_file(
    "building_plan.jpg",
    "What architectural style is this and what are its key features?",
    system_prompt="Analyze this as an architect would, identifying style, notable features, potential structural considerations, and design principles."
)

# Wildlife analysis
bird_analysis = analyze_file(
    "bird_wildlife_.jpg",
    "What birds are visible in this image?",
    system_prompt="Analyze this image and attempt to accurately identify any bird species present."
)