Today, I want to dive deep into an impressive implementation that leverages vector embeddings for intelligent issue management. This system, built as a plugin for UbiquityOS, combines modern NLP techniques with robust data storage to create a sophisticated issue tracking and deduplication system.

Architecture Overview

The system is built as a plugin that processes GitHub issues and comments through a series of specialized handlers. At its core, it uses two main services:

  1. Voyage AI for generating text embeddings
  2. Supabase for storing and querying vector embeddings

The plugin architecture is elegantly structured to handle various GitHub events:

if (isIssueCommentEvent(context)) {
  switch (eventName) {
    case "issue_comment.created":
      return await addComments(context);
    case "issue_comment.deleted":
      return await deleteComment(context);
    case "issue_comment.edited":
      return await updateComment(context);
  }
} else if (isIssueEvent(context)) {
  switch (eventName) {
    case "issues.opened":
      await addIssue(context);
      await issueMatching(context);
      return await issueChecker(context);
    // ... other issue events
  }
}
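The guards isIssueCommentEvent and isIssueEvent are not shown in the post. As a minimal sketch of how such guards could work, assuming a simplified, hypothetical Context shape and a reduced set of event names (the plugin's real types are likely richer), TypeScript type predicates can narrow the context by event name:

```typescript
// Hypothetical, simplified types: the plugin's real Context type is not shown in the post.
type IssueCommentEvent =
  | "issue_comment.created"
  | "issue_comment.deleted"
  | "issue_comment.edited";
type IssueEvent = "issues.opened";
type EventName = IssueCommentEvent | IssueEvent;

interface Context<T extends EventName = EventName> {
  eventName: T;
  payload: Record<string, unknown>;
}

// Type predicate: true when the event name belongs to the issue_comment family.
function isIssueCommentEvent(context: Context): context is Context<IssueCommentEvent> {
  return context.eventName.startsWith("issue_comment.");
}

// Type predicate: true when the event name belongs to the issues family.
function isIssueEvent(context: Context): context is Context<IssueEvent> {
  return context.eventName.startsWith("issues.");
}

// Usage: route a context to a handler family based on the guards.
function describe(context: Context): string {
  if (isIssueCommentEvent(context)) {
    return "comment handler";
  } else if (isIssueEvent(context)) {
    return "issue handler";
  }
  return "ignored";
}
```

Inside each branch the compiler knows which family of events it is dealing with, so a switch over eventName only needs to cover that family's cases.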

Vector Embeddings: The Core Technology

The most fascinating aspect of this system is its use of vector embeddings to understand and process text. The implementation calls Voyage AI’s embedding service with the voyage-large-2-instruct model:

async createEmbedding(text: string | null, inputType: EmbedRequestInputType = "document"): Promise<number[]> {
  if (text === null) {
    throw new Error("Text is null");
  } else {
    const response = await this.client.embed({
      input: text,
      model: "voyage-large-2-instruct",
      inputType,
    });
    return (response.data && response.data[0]?.embedding) || [];
  }
}

This converts text into high-dimensional vectors that capture semantic meaning, allowing for sophisticated similarity comparisons between issues.
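To make "similarity comparison" concrete, here is a small sketch of cosine similarity, a common way to compare embedding vectors. The post does not say which metric the system uses, so treat cosine as an assumption. The three-dimensional vectors are toy values for illustration, not real embeddings.

```typescript
// Cosine similarity: the dot product of two vectors divided by the product of their lengths.
// Values near 1 mean the vectors point in nearly the same direction.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; report 0 rather than dividing by zero.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors standing in for embeddings of three issue texts.
const loginBug = [0.9, 0.1, 0.0];
const signInBug = [0.8, 0.2, 0.1];
const docsTypo = [0.0, 0.1, 0.9];

console.log(cosineSimilarity(loginBug, signInBug)); // close to 1: similar direction
console.log(cosineSimilarity(loginBug, docsTypo)); // close to 0: nearly orthogonal
```

Two issues that describe the same problem in different words should produce vectors with a high score, which is what makes embedding-based deduplication possible.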

Intelligent Issue Management

The system implements several advanced features for issue management:

1. Issue Deduplication

One of the most powerful features is the ability to find similar issues using vector similarity search. The implementation uses Supabase’s vector similarity capabilities:

async findSimilarIssues({ markdown, currentId, threshold }: FindSimilarIssuesParams): Promise<IssueSimilaritySearchResult[] | null> {
  const embedding = await this.context.adapters.voyage.embedding.createEmbedding(markdown);
  const { data, error } = await this.supabase.rpc("find_similar_issues", {
    query_embedding: embedding,
    current_id: currentId,
    threshold,
    top_k: 5,
  });
  // ... error handling
  return data;
}

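The find_similar_issues call returns at most five candidate matches (top_k: 5) for the issue text, filtered by the supplied threshold, which the system can then treat as potential duplicates.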
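The body of the find_similar_issues database function is not shown in the post. The following in-memory sketch illustrates what a threshold-plus-top-k search of this kind typically does. It rests on several assumptions: cosine similarity as the metric, threshold as a minimum similarity, and current_id used to exclude the issue being checked. The names StoredIssue, SimilarIssue, and findSimilarInMemory are hypothetical.

```typescript
// Hypothetical record shapes for illustration only.
interface StoredIssue {
  id: string;
  embedding: number[];
}

interface SimilarIssue {
  id: string;
  similarity: number;
}

// Cosine similarity between two equal-length vectors (assumed metric).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored issue against the query, drop the current issue and
// anything below the threshold, then keep the topK best matches.
function findSimilarInMemory(
  queryEmbedding: number[],
  issues: StoredIssue[],
  currentId: string,
  threshold: number,
  topK: number = 5
): SimilarIssue[] {
  return issues
    .filter((issue) => issue.id !== currentId)
    .map((issue) => ({
      id: issue.id,
      similarity: cosineSimilarity(queryEmbedding, issue.embedding),
    }))
    .filter((result) => result.similarity >= threshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, topK);
}

// Toy data: two-dimensional vectors standing in for real embeddings.
const storedIssues: StoredIssue[] = [
  { id: "a", embedding: [1, 0] },
  { id: "b", embedding: [0.9, 0.1] },
  { id: "c", embedding: [0, 1] },
  { id: "self", embedding: [1, 0] },
];

const matches = findSimilarInMemory([1, 0], storedIssues, "self", 0.8);
console.log(matches.map((m) => m.id));
```

A real deployment would push this work into the database so that the full set of embeddings never has to be loaded into application memory, which is the role the RPC call plays here.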

2. Privacy-Aware Storage

The system implements privacy-conscious storage of issue data: