Today, I want to dive deep into an impressive implementation that leverages vector embeddings for intelligent issue management. This system, built as a plugin for UbiquityOS, combines modern NLP techniques with robust data storage to create a sophisticated issue tracking and deduplication system.
The plugin processes GitHub issues and comments through a series of specialized handlers. At its core, it uses two main services: Voyage AI for generating embeddings and Supabase for storage and vector similarity search.
The plugin architecture is elegantly structured to handle various GitHub events:
if (isIssueCommentEvent(context)) {
  switch (eventName) {
    case "issue_comment.created":
      return await addComments(context);
    case "issue_comment.deleted":
      return await deleteComment(context);
    case "issue_comment.edited":
      return await updateComment(context);
  }
} else if (isIssueEvent(context)) {
  switch (eventName) {
    case "issues.opened":
      await addIssue(context);
      await issueMatching(context);
      return await issueChecker(context);
    // ... other issue events
  }
}
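The post does not show how `isIssueCommentEvent` and `isIssueEvent` are written, so here is a minimal sketch of what such type guards could look like. The `Context` shape and the prefix check on `eventName` are my assumptions for illustration, not the plugin's actual definitions.

```typescript
// Hypothetical, simplified shapes; the plugin's real Context type is richer.
type IssueCommentEventName = `issue_comment.${string}`;
type IssueEventName = `issues.${string}`;

interface Context<T extends string = string> {
  eventName: T;
  payload: unknown;
}

// Narrows the context to comment events by checking the event name prefix.
function isIssueCommentEvent(context: Context): context is Context<IssueCommentEventName> {
  return context.eventName.startsWith("issue_comment.");
}

// Narrows the context to issue events by checking the event name prefix.
function isIssueEvent(context: Context): context is Context<IssueEventName> {
  return context.eventName.startsWith("issues.");
}
```

With guards like these, TypeScript narrows `context` inside each branch, so a handler such as `addComments` can only ever be called with a comment event.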
The most fascinating aspect of this system is its use of vector embeddings to understand and process text. The implementation uses Voyage AI’s embedding service with their large instruction model:
async createEmbedding(text: string | null, inputType: EmbedRequestInputType = "document"): Promise<number[]> {
  if (text === null) {
    throw new Error("Text is null");
  } else {
    const response = await this.client.embed({
      input: text,
      model: "voyage-large-2-instruct",
      inputType,
    });
    return (response.data && response.data[0]?.embedding) || [];
  }
}
This converts text into high-dimensional vectors that capture semantic meaning, allowing for sophisticated similarity comparisons between issues.
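To make "similarity comparison" concrete, here is a small sketch of cosine similarity, a common way to compare embedding vectors. The post does not say which metric the plugin's database function uses, so treat cosine as an assumption; the function name and the toy vectors are mine.

```typescript
// Cosine similarity: the dot product of two vectors divided by the product
// of their lengths. Values near 1 mean the vectors point the same way.
// Assumption: the plugin may use a different metric inside the database.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so report no similarity.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional "embeddings"; real ones have many more dimensions.
const loginBug = [0.9, 0.1, 0.0];
const signInBug = [0.8, 0.2, 0.1];
const docsTypo = [0.0, 0.1, 0.9];

const related = cosineSimilarity(loginBug, signInBug);
const unrelated = cosineSimilarity(loginBug, docsTypo);
console.log(related > unrelated); // the two login issues score higher
```

Two issues that describe the same problem in different words should land close together in the embedding space, which is what makes a score like this useful for deduplication.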
The system implements several advanced features for issue management:
One of the most powerful features is the ability to find similar issues using vector similarity search. The implementation uses Supabase’s vector similarity capabilities:
async findSimilarIssues({ markdown, currentId, threshold }: FindSimilarIssuesParams): Promise<IssueSimilaritySearchResult[] | null> {
  const embedding = await this.context.adapters.voyage.embedding.createEmbedding(markdown);
  const { data, error } = await this.supabase.rpc("find_similar_issues", {
    query_embedding: embedding,
    current_id: currentId,
    threshold,
    top_k: 5,
  });
  // ... error handling
  return data;
}
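The body of the `find_similar_issues` database function is not shown in the post. As a rough mental model only, here is an in-memory sketch of what a search with those four parameters could do: skip the current issue, score the rest, keep scores above the threshold, and return the best `top_k`. The types, the helper names, the choice of cosine similarity, and the strict `>` comparison are all assumptions.

```typescript
// Hypothetical shapes for illustration; not the plugin's actual types.
interface StoredIssue {
  id: string;
  embedding: number[];
}

interface SimilarIssue {
  id: string;
  similarity: number;
}

// Cosine similarity between two equal-length vectors (assumed metric).
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// In-memory stand-in for the RPC: mirrors query_embedding, current_id,
// threshold, and top_k from the call above.
function findSimilarInMemory(
  queryEmbedding: number[],
  issues: StoredIssue[],
  currentId: string,
  threshold: number,
  topK = 5
): SimilarIssue[] {
  return issues
    .filter((issue) => issue.id !== currentId) // never match an issue with itself
    .map((issue) => ({ id: issue.id, similarity: cosine(queryEmbedding, issue.embedding) }))
    .filter((result) => result.similarity > threshold)
    .sort((x, y) => y.similarity - x.similarity) // most similar first
    .slice(0, topK);
}
```

In the plugin itself this work is delegated to the database through the `find_similar_issues` RPC, which avoids pulling every stored embedding into the plugin's memory.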
This allows the system to:
The system implements privacy-conscious storage of issue data: