
Vector Search API

The Wickson API's vector search endpoint lets you search your stored media semantically using natural language. It supports both basic semantic search for fast, direct results and advanced deep search for more comprehensive exploration of your data, with flexible options for targeting collections.

Endpoint

POST https://api.wickson.ai/v1/search

Authentication

Include your API key in the X-Api-Key header of every request:

X-Api-Key: YOUR_API_KEY
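As a minimal sketch in Python (with a placeholder key), the header block shared by the examples later in this document looks like this:

```python
# Placeholder key: substitute the key from your account dashboard.
API_KEY = "YOUR_API_KEY"

# Every request carries the key in the X-Api-Key header; request bodies are JSON.
headers = {
    "X-Api-Key": API_KEY,
    "Content-Type": "application/json",
}
```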

Request Structure

The search endpoint accepts a JSON object with the following structure:

{
  "query": "your search query",
  "type": "basic",              // Type of search: 'basic' or 'advanced'

  // Collection targeting (choose ONE form for "collections"):
  //   "default"            Option 1: single collection as string
  //   ["docs", "images"]   Option 2: multiple collections as array
  //   "all"                Option 3: search across all collections
  "collections": "default",

  // Common parameters:
  "max_results": 10,            // Maximum number of results to return
  "min_score": 0.6,             // Minimum similarity score (0.0-1.0)
  "include_vectors": false,     // Include vector embeddings in results

  // Additional configuration if needed:
  "config": {
    "context_depth": 3,         // For advanced search: exploration depth (0-5)
    "lexical_weight": 0.35,     // Weight for lexical vs. semantic (0.0-1.0)
    "expansion_factor": 0.5     // For advanced search: breadth of exploration (0.1-1.0)
  },

  // Modality-specific configuration:
  "modality": {
    "type": "document",         // document, image, video, audio, model
    "weight": 0.7,              // 0.0-1.0, how much to prioritize this modality
    "features": ["content", "quality"] // modality-specific features to score
  },

  // Filter criteria:
  "filters": {
    "media_type": "document",
    "created_after": "2023-01-01T00:00:00Z"
  }
}

Core Parameters

Required Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| query | string | The text query to search for (required) |

Search Type

The type parameter determines the search algorithm used:

| Value | Description | Use Case |
|-------|-------------|----------|
| basic | Fast semantic search optimized for speed and direct matches | When you need quick results with high relevance for straightforward queries |
| advanced | Deep contextual search using our R3F technology to find related content and explore connections | When you need comprehensive exploration, discovery of content relationships, and cross-modal connections |

Collection Targeting

The collections parameter offers flexible ways to specify which collections to search:

| Value Format | Description | Example |
|--------------|-------------|---------|
| String (single collection) | Search within a single specified collection | "collections": "research" |
| Array of strings | Search across multiple specific collections | "collections": ["research", "reports"] |
| Special value "all" | Search across all your collections | "collections": "all" |
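For illustration, the three targeting forms can be sketched as request payloads in Python (query and collection names are placeholders):

```python
# Shared fields; "collections" below picks exactly one targeting form.
base = {"query": "coral reef conservation", "type": "basic"}

single = dict(base, collections="research")                # one collection
subset = dict(base, collections=["research", "reports"])   # specific subset
everything = dict(base, collections="all")                 # entire library
```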

Common Parameters

These parameters can be specified at the top level of the request:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| max_results | integer | 10 | Maximum number of results to return (1-1000) |
| min_score | float | 0.6 | Minimum similarity score (0.0-1.0) for including results |
| include_vectors | boolean | false | Whether to include vector embeddings in results |

Modality Configuration

The modality object lets you fine-tune search for specific media types:

"modality": {
  "type": "document", // document, image, video, audio, model
  "weight": 0.7,      // 0.0-1.0, how much to prioritize this modality
  "features": ["content", "quality"] // modality-specific features to score
}

Modality-Specific Features

Each modality supports specific features for targeted search emphasis:

Document Features:

  • content - Prioritizes document content matching
  • structure - Emphasizes document structure and organization
  • quality - Boosts high-quality documents with better formatting

Image Features:

  • visual - Emphasizes visual attributes and composition
  • scene - Prioritizes scene understanding and context
  • quality - Boosts high-quality, clear images

Video Features:

  • visual - Emphasizes visual content
  • audio - Emphasizes audio track content
  • temporal - Emphasizes time-based relationships and sequences

Audio Features:

  • speech - Emphasizes spoken content
  • acoustic - Emphasizes non-speech sounds and music
  • quality - Boosts high-quality audio with clarity

Model Features:

  • geometry - Emphasizes 3D model structure and shape
  • materials - Emphasizes material and texture quality
  • technical - Emphasizes technical aspects like dimensions

Advanced Configuration Options

Additional options can be specified in the config object:

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| context_depth | integer | 3 | For advanced search only: how deep to explore contextual relationships (0-5) |
| expansion_factor | float | 0.5 | For advanced search only: controls breadth of exploration (0.1-1.0) |
| lexical_weight | float | 0.35 | Weight given to lexical matching vs. semantic matching (0.0-1.0) |

Filtering Results

The filters object allows you to narrow down results based on metadata:

"filters": {
  "media_type": "document",  // Filter by media type
  "created_after": "2023-01-01T00:00:00Z", // Filter by creation date
  "topics": ["AI", "machine learning"] // Filter by topics
}

Common filter fields include:

| Field | Description | Example |
|-------|-------------|---------|
| media_type | Type of media | "document", "image", "video", "audio", "model" |
| created_after | Items created after date | "2023-01-01T00:00:00Z" |
| created_before | Items created before date | "2023-12-31T23:59:59Z" |
| topics | Semantic topics | ["AI", "machine learning"] |
| entities | Detected entities | {"people": ["Albert Einstein"]} |
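As a sketch, a filters object combining several of the fields above might be built like this (all values are illustrative):

```python
from datetime import datetime, timezone

# Restrict results to documents about AI created since the start of 2023.
since = datetime(2023, 1, 1, tzinfo=timezone.utc)

filters = {
    "media_type": "document",
    "created_after": since.strftime("%Y-%m-%dT%H:%M:%SZ"),  # ISO 8601 UTC
    "topics": ["AI", "machine learning"],
}
```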

Response Structure

Success Response

For successful searches, you'll receive a JSON response with a structure that varies based on the search type.

Basic Search Response

{
  "success": true,
  "message": "Search completed successfully",
  "data": {
    "meta": {
      "query": "machine learning applications",
      "stats": {
        "total_results": 5,
        "query_time_ms": 138,
        "includes_vectors": false,
        "collection_count": 1
      },
      "collections": {
        "searched": ["default"],
        "result_distribution": {"default": 5},
        "scope_type": "single"
      }
    },
    "results": [
      {
        "id": "vec-abc123",
        "score": 0.92,
        "metadata": {
          "media_type": "document",
          "file_info": {
            "filename": "research-paper.pdf",
            "size_bytes": 1240000,
            "mime_type": "application/pdf"
          },
          "search_metadata": {
            "summary": "Research on generative AI applications",
            "description": "Academic paper exploring practical applications...",
            "semantic_markers": {
              "topics": ["AI", "machine learning", "research"],
              "keywords": ["generative models", "applications", "analysis"]
            },
            "entities": {
              "people": ["John Smith"],
              "organizations": ["AI Research Institute"]
            },
            "quality_metrics": {
              "clarity": 0.85,
              "completeness": 0.92
            }
          }
        },
        "collection": "default",
        "source_store": "hd",
        "relevance_explanation": "Strong match (92.0% confidence) | Matches topics: AI, machine learning"
      }
      // Additional results...
    ],
    "cost": 0.00
  },
  "metadata": {
    "search_type": "basic",
    "api_version": "v1",
    "cost": 0.00,
    "query": "machine learning applications"
  }
}

Advanced Search Response

Advanced search returns a more comprehensive response with additional relationship and cross-modality data:

{
  "success": true,
  "message": "R3F search completed successfully",
  "data": {
    "meta": {
      "query": "machine learning applications",
      "version": "v1",
      "cost": {
        "amount": 0.05,
        "operation_count": 342,
        "depth_reached": 3,
        "rate_limit_remaining": 978
      },
      "stats": {
        "total_results": 12,
        "time_ms": 432,
        "found_types": ["📄 Document (8)", "🖼 Image (3)", "🔊 Audio (1)"],
        "depth_reached": 3,
        "includes_vectors": false,
        "collection_count": 2,
        "cross_modal_connections": 7
      },
      "collections": {
        "searched": ["research", "projects"],
        "result_distribution": {"research": 7, "projects": 5},
        "scope_type": "multiple"
      }
    },
    "results": [
      {
        "id": "vec-abc123",
        "score": 0.92,
        "metadata": {
          // Same metadata structure as basic search...
        },
        "collection": "research",
        "context_depth": 0,
        "expansion_path": [],
        "depth_score": 0.92,
        "diversity_score": 1.0,
        "cross_modal_connections": [],
        "relevance_explanation": "Strong match (92.0% confidence) | Matches topics: AI, machine learning"
      },
      {
        "id": "vec-def456",
        "score": 0.87,
        "metadata": {
          // Metadata for second result...
        },
        "collection": "projects",
        "context_depth": 1,
        "expansion_path": ["vec-abc123"],
        "depth_score": 0.81,
        "diversity_score": 1.2,
        "cross_modal_connections": [
          {
            "target_id": "vec-abc123",
            "relation_type": "semantic_expansion",
            "strength": 0.89,
            "target_modality": "document",
            "source_modality": "image",
            "symbol": "━━",
            "relationship_type": "→"
          }
        ],
        "relevance_explanation": "Strong match (87.0% confidence) | Found through contextual connection",
        "lexical_match": {
          "score": 0.65,
          "semantic_score": 0.92,
          "combined_weight": 0.35
        }
      }
      // Additional results...
    ],
    "relationships": {
      "graph": {
        "nodes": {
          "vec-abc123": {
            "id": "vec-abc123",
            "modality": "document",
            "score": 0.92,
            "depth": 0
          },
          "vec-def456": {
            "id": "vec-def456",
            "modality": "image",
            "score": 0.87,
            "depth": 1
          }
          // Additional nodes...
        },
        "edges": [
          {
            "source": "vec-def456",
            "target": "vec-abc123",
            "type": "semantic_expansion",
            "strength": 0.89,
            "symbol": "━━",
            "relationship_type": "→"
          }
          // Additional edges...
        ]
      },
      "clusters": {
        "clusters": {
          "document": ["vec-abc123", "vec-ghi789"],
          "image": ["vec-def456"]
        },
        "metadata": {
          "document": {
            "modality": "document",
            "avg_score": 0.89,
            "member_count": 2,
            "depth_distribution": {"0": 1, "1": 1}
          },
          "image": {
            "modality": "image",
            "avg_score": 0.87,
            "member_count": 1,
            "depth_distribution": {"1": 1}
          }
        }
      }
    }
  },
  "metadata": {
    "search_type": "advanced",
    "api_version": "v1",
    "query": "machine learning applications"
  }
}

Understanding Result Fields

Each result contains:

Core Fields:

  • id: Unique identifier for the media item
  • score: Relevance score (0.0-1.0)
  • metadata: Detailed information about the media item
  • media_type: Type of media (document, image, video, audio, model)
  • file_info: File metadata (name, size, type)
  • search_metadata: Rich semantic metadata
    • summary: Concise content summary
    • description: Detailed content description
    • semantic_markers: Topics, keywords, categories
    • entities: People, organizations, locations detected
    • quality_metrics: Content quality scores
  • modality_metadata: Media type-specific metadata
  • collection: The collection this result belongs to
  • relevance_explanation: Human-readable explanation of why this result matched

The relevance_explanation field provides a detailed assessment of why the result matched your query. For example:

Strong match (84.5% confidence) | Matches topics: renewable energy, sustainability | Quality scores - clarity: 90%, completeness: 85%

This tells you that the result:

  • Is a strong match with 84.5% confidence
  • Matches the specific topics "renewable energy" and "sustainability"
  • Has high clarity (90%) and completeness (85%) quality metrics

Advanced Search Fields:

  • context_depth: Level of contextual exploration (0 = direct match)
  • expansion_path: Chain of connections leading to this result
  • depth_score: Similarity score adjusted for depth
  • diversity_score: Score reflecting content diversity value
  • cross_modal_connections: Links to related media in different modalities
  • lexical_match: Information about lexical (keyword) matching components

Understanding Relationship Graphs

In advanced search, the relationships object provides a comprehensive view of how your search results are connected:

  1. Graph Structure:
     • nodes: Each media item as a node in the graph
     • edges: Connections between media items, with relationship types and strengths

  2. Connection Types:
     • Strong connection: Highly related content
     • Similar content: Moderately related content
     • Related content: Loosely related content

  3. Symbols indicate connection strength:
     • Very strong connection (0.9+)
     • ━━: Strong connection (0.8-0.9)
     • ─→: Moderate connection (0.7-0.8)
     • ⋯→: Weak connection (0.6-0.7)
     • ⋅⋅: Very weak connection (<0.6)
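The graph structure above can be walked with a small helper; `relationships` here is a hand-built stand-in for the field returned by advanced search, mirroring the response example earlier in this document:

```python
# Stand-in for response["data"]["relationships"] from an advanced search.
relationships = {
    "graph": {
        "nodes": {
            "vec-abc123": {"id": "vec-abc123", "modality": "document",
                           "score": 0.92, "depth": 0},
            "vec-def456": {"id": "vec-def456", "modality": "image",
                           "score": 0.87, "depth": 1},
        },
        "edges": [
            {"source": "vec-def456", "target": "vec-abc123",
             "type": "semantic_expansion", "strength": 0.89},
        ],
    }
}

def strong_edges(rel, threshold=0.8):
    """Return (source, target, strength) for edges at or above the threshold."""
    return [(e["source"], e["target"], e["strength"])
            for e in rel["graph"]["edges"]
            if e["strength"] >= threshold]

links = strong_edges(relationships)
```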

Error Handling

In case of errors, you'll receive a response like:

{
  "success": false,
  "message": "Invalid search configuration",
  "code": "INVALID_PARAMETER",
  "details": {
    "parameter": "context_depth",
    "error": "context_depth cannot exceed 5"
  },
  "suggestion": "Set context_depth to a value between 0 and 5"
}

Common error codes:

| Code | Description |
|------|-------------|
| MISSING_PARAMETER | A required parameter is missing |
| INVALID_PARAMETER | A parameter has an invalid value |
| UNAUTHORIZED | Invalid or missing API key |
| INSUFFICIENT_BALANCE | Account balance too low for operation |
| RATE_LIMIT_EXCEEDED | Too many requests in time period |
| SEARCH_ERROR | Error during search execution |
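One way to handle RATE_LIMIT_EXCEEDED client-side is exponential backoff; this sketch takes `do_search`, any callable returning the parsed JSON envelope shown above, rather than assuming a particular HTTP client:

```python
import time

def search_with_backoff(do_search, max_attempts=4, base_delay=1.0):
    """Retry a search while it fails with RATE_LIMIT_EXCEEDED."""
    body = {}
    for attempt in range(max_attempts):
        body = do_search()
        if body.get("success") or body.get("code") != "RATE_LIMIT_EXCEEDED":
            return body
        time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    return body  # still rate-limited after all attempts
```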

Code Examples

Basic Search in a Single Collection (Python)

import requests

headers = {
    "X-Api-Key": "YOUR_API_KEY",
    "Content-Type": "application/json"
}

data = {
    "query": "machine learning applications in healthcare",
    "type": "basic",                   # Using simplified type parameter
    "collections": "research-papers",  # Single collection as string
    "min_score": 0.7,                  # Using top-level parameter
    "max_results": 20                  # Using top-level parameter
}

response = requests.post(
    "https://api.wickson.ai/v1/search", 
    headers=headers, 
    json=data
)

if response.status_code == 200:
    results = response.json()["data"]["results"]
    for result in results:
        print(f"Score: {result['score']}")
        print(f"Collection: {result['collection']}")
        print(f"Title: {result['metadata']['file_info']['filename']}")
        print(f"Summary: {result['metadata']['search_metadata']['summary']}")
        print(f"Relevance: {result['relevance_explanation']}")
        print("-" * 40)
else:
    print(f"Error: {response.status_code} - {response.text}")

Search Across Multiple Collections (JavaScript)

const searchMultipleCollections = async () => {
  const headers = {
    "X-Api-Key": "YOUR_API_KEY",
    "Content-Type": "application/json"
  };

  const data = {
    "query": "coral reef conservation",
    "type": "basic",
    "collections": ["marine-research", "environmental-projects", "educational-content"],
    "max_results": 15,
    "config": {
      "lexical_weight": 0.35  // Balance between semantic and lexical search
    }
  };

  try {
    const response = await fetch("https://api.wickson.ai/v1/search", {
      method: "POST",
      headers: headers,
      body: JSON.stringify(data)
    });

    if (!response.ok) {
      throw new Error(`Error: ${response.status} - ${response.statusText}`);
    }

    const result = await response.json();

    console.log(`Found ${result.data.meta.stats.total_results} results across ${result.data.meta.collections.searched.length} collections`);

    // Group results by collection
    const resultsByCollection = {};
    result.data.results.forEach(item => {
      if (!resultsByCollection[item.collection]) {
        resultsByCollection[item.collection] = [];
      }
      resultsByCollection[item.collection].push(item);
    });

    // Display results organized by collection
    for (const [collection, items] of Object.entries(resultsByCollection)) {
      console.log(`\nCollection: ${collection} (${items.length} results)`);
      items.forEach(item => {
        console.log(`- ${item.metadata.file_info.filename}: ${item.score.toFixed(2)}`);
        console.log(`  ${item.metadata.search_metadata.summary}`);
        console.log(`  ${item.relevance_explanation}`);
      });
    }
  } catch (error) {
    console.error("Search failed:", error);
  }
};

searchMultipleCollections();

Advanced Search Across All Collections (Python)

import requests

headers = {
    "X-Api-Key": "YOUR_API_KEY",
    "Content-Type": "application/json"
}

data = {
    "query": "impact of climate change on coral reefs",
    "type": "advanced",
    "collections": "all",     # Search all collections with special "all" value
    "max_results": 20,        # Top-level parameter
    "config": {
        "context_depth": 2,   # Advanced search with 2 levels of exploration
        "lexical_weight": 0.35   # Balanced lexical vs semantic search
    },
    "filters": {
        "topics": ["climate change", "marine biology"]
    }
}

response = requests.post(
    "https://api.wickson.ai/v1/search", 
    headers=headers, 
    json=data
)

if response.status_code == 200:
    result_data = response.json()["data"]
    meta = result_data["meta"]

    print(f"Found {meta['stats']['total_results']} results across {len(meta['collections']['searched'])} collections")
    print(f"Collections searched: {', '.join(meta['collections']['searched'])}")
    print(f"Search depth reached: {meta['stats']['depth_reached']}")
    print(f"Cross-modal connections found: {meta['stats'].get('cross_modal_connections', 0)}")

    # Process results
    for item in result_data["results"]:
        print(f"\nResult: {item['metadata']['file_info']['filename']}")
        print(f"Collection: {item['collection']}")
        print(f"Score: {item['score']:.2f} (Depth: {item['context_depth']})")
        print(f"Summary: {item['metadata']['search_metadata']['summary']}")
        print(f"Explanation: {item['relevance_explanation']}")

        # Show connections for results found through exploration
        if item['cross_modal_connections']:
            print("Connected to:")
            for conn in item['cross_modal_connections']:
                print(f"  - {conn['target_id']} ({conn['relation_type']}, strength: {conn['strength']:.2f})")
                print(f"    {conn['source_modality']} -> {conn['target_modality']}, symbol: {conn['symbol']}")
else:
    print(f"Error: {response.status_code} - {response.text}")

Modality-Specific Search Example

import requests

# Configuration for image-focused search
search_request = {
    "query": "sunset over mountains landscape photography",
    "type": "basic",
    "collections": "photography",

    # Prioritize image content with modality configuration
    "modality": {
        "type": "image",          # Focus on images
        "weight": 0.9,            # Strong emphasis on image modality
        "features": ["visual", "scene", "quality"]  # Consider these image aspects
    },

    "max_results": 10,
    "min_score": 0.65,

    "config": {
        "lexical_weight": 0.2        # Lower weight favors semantic understanding for visual content
    }
}

# Execute search
response = requests.post(
    "https://api.wickson.ai/v1/search",
    headers={
        "X-Api-Key": "YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json=search_request
)

# Process results to focus on image attributes
if response.status_code == 200:
    data = response.json()["data"]

    for result in data["results"]:
        # Get image-specific metadata
        image_metadata = result["metadata"].get("modality_metadata", {}).get("visual", {})

        print(f"\nImage: {result['metadata']['file_info']['filename']} (Score: {result['score']:.2f})")
        print(f"Relevance: {result['relevance_explanation']}")

        # Display image-specific attributes
        if image_metadata:
            print("Image attributes:")
            if "scene_type" in image_metadata:
                print(f"  - Scene type: {image_metadata['scene_type']}")
            if "composition" in image_metadata:
                print(f"  - Composition: {image_metadata['composition']}")
            if "visual_attributes" in image_metadata:
                print(f"  - Visual elements: {', '.join(image_metadata['visual_attributes'])}")
else:
    print(f"Error: {response.status_code} - {response.text}")

Rate Limits & Costs

Rate Limits

Rate limits are applied based on your API tier:

| Tier | Limit | Reset Period |
|------|-------|--------------|
| Basic/Standard | 1,000 requests | Per hour |

Rate limit information is included in response headers:

  • X-RateLimit-Limit: Maximum requests allowed per hour
  • X-RateLimit-Remaining: Remaining requests in current window
  • X-RateLimit-Reset: UTC timestamp for limit reset
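A sketch of parsing those headers into integers; `headers` here stands in for the `response.headers` mapping of whatever HTTP client you use, and the values are illustrative:

```python
def rate_limit_status(headers):
    """Parse the documented rate-limit headers into integer fields."""
    return {
        "limit": int(headers["X-RateLimit-Limit"]),
        "remaining": int(headers["X-RateLimit-Remaining"]),
        "reset_at": int(headers["X-RateLimit-Reset"]),  # UTC timestamp
    }

status = rate_limit_status({
    "X-RateLimit-Limit": "1000",
    "X-RateLimit-Remaining": "978",
    "X-RateLimit-Reset": "1735689600",
})
```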

Cost Structure

| Operation | Cost |
|-----------|------|
| Basic Search | FREE |
| Advanced Search | $0.01 base + $0.01 per depth level |

Example costs for advanced search:

  • Depth 1: $0.02 ($0.01 + $0.01)
  • Depth 3: $0.04 ($0.01 + $0.03)

Note: The cost is the same regardless of how many collections you search!
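The pricing above reduces to a simple formula, sketched here (basic search is free; advanced search is $0.01 base plus $0.01 per depth level):

```python
def search_cost(search_type, depth=0):
    """Estimated cost in dollars for a single search request."""
    if search_type == "basic":
        return 0.0
    return round(0.01 + 0.01 * depth, 2)  # advanced: base + per-depth
```

So depth 1 costs $0.02 and depth 3 costs $0.04, matching the examples above.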

Best Practices

Collection Organization Strategies

Collections provide a powerful way to organize your content for more efficient search:

  1. Content-Type Collections: Organize by media type (documents, images, videos)

    collections: ["documents", "images", "videos"]
    

  2. Project-Based Collections: Group by project or content purpose

    collections: "project-apollo"
    

  3. Topic-Based Collections: Organize by subject matter

    collections: "ai-research"
    

  4. User-Based Collections: For multi-tenant applications, create collections per user

    collections: "user-1234-library"
    

  5. Temporal Collections: Organize by time period

    collections: "2023-q2-data"
    

When to Use Each Collection Search Approach

Single Collection (String):

  • When all relevant content is in one collection
  • For user-specific content in multi-tenant applications
  • When you need the fastest possible search performance

Multiple Collections (Array):

  • When content is organized across logical groups
  • For searching across related but separate projects
  • When you need to search a specific subset of your content

All Collections ("all"):

  • For comprehensive research across your entire content library
  • When looking for rare or unusual connections
  • For administrative tasks that need a global view of content
  • When you're not sure where specific content is stored

When to Use Each Search Type

Basic Search is ideal for:

  • Direct queries where you expect exact matches
  • Speed-critical applications
  • Simple keyword or concept matching
  • Cost-sensitive operations

Advanced Search is ideal for:

  • Research and exploration
  • Finding conceptually related content
  • Discovering connections between different media types
  • Comprehensive information gathering

Understanding the lexical_weight Parameter

The lexical_weight is an important parameter that controls the balance between lexical (keyword) matching and semantic (meaning) matching:

| Value | Behavior | Use Case |
|-------|----------|----------|
| 0.1 | 10% lexical, 90% semantic | Conceptual exploration, finding related ideas without exact matches |
| 0.35 | 35% lexical, 65% semantic | Default balanced approach, good for most queries |
| 0.7 | 70% lexical, 30% semantic | Technical documentation, when exact terminology matters more |

Search Optimization Tips

Use descriptive queries

  • More specific queries yield more precise results
  • Include key concepts directly in your query

Leverage collections

  • Organize your media into logical collections
  • Searching within a specific collection improves relevance and speed

Apply appropriate filters

  • Use filters to narrow results by media type, date, topics, etc.
  • Combine multiple filters for precision

Balance depth in advanced search

  • Higher depths reveal more connections but increase cost and processing time
  • Start with depth=1 and increase as needed

Understand modality configuration

  • Use the modality parameter to emphasize specific media types
  • Utilize modality-specific features for targeted results

Monitor and analyze results

  • Use the relevance_explanation field to understand why results matched
  • Track which collections yield the most relevant results
  • Refine your queries based on result patterns

Performance Considerations

  • Basic search is significantly faster than advanced search
  • Searching a single collection is faster than searching multiple collections
  • Large collections may take longer to search
  • Consider pagination for handling large result sets
  • Set appropriate timeouts for advanced searches (recommended: 10-30 seconds)

Troubleshooting

| Issue | Potential Solution |
|-------|--------------------|
| No results returned | Try relaxing filters, lowering min_score, or using more general query terms |
| Results missing from known collections | Check that you're targeting the correct collections, or try collections: "all" |
| Too many irrelevant results | Use more specific queries, add filters, or increase min_score |
| Search is slow | Use basic search instead of advanced, reduce context_depth, search fewer collections, or apply more specific filters |
| Rate limit errors | Implement request queueing with exponential backoff, or upgrade your API tier |
| Unexpected collection distribution | Verify your collection organization and content tagging methodology |
| Cross-modal connections not appearing | Increase context_depth (minimum 2) in advanced search and ensure you have multiple media types in your collections |