Hi, guys, any plans to support vector clustering? It looks like this could be done pretty easily given the index architecture
Last active 4 months ago
29 replies
14 views
- DM
Hi, guys, any plans to support vector clustering? It looks like this could be done pretty easily given the index architecture
- AN
Hi @Dmitry! Could you please elaborate on how you see clustering working in the engine?
- DM
maybe, get k cluster centers
- AN
perform clustering on request?
- DM
yes
- AN
but clustering is a very time-consuming operation
- DM
I believe the index is a tree, so maybe I would be interested in accessing the tree nodes
- AN
The HNSW index is a graph, so there is no well-defined center. Also, the index is separated into multiple segments, so each segment would have its own cluster.
- AN
Also which clustering algorithm do you have in mind?
- DM
Right now I take a tiny random subset of the data, compute pairwise distances, and perform hierarchical/agglomerative clustering
- DM
then I take the mean vector of each cluster to get approximate cluster centers
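
For reference, the workflow described above (pairwise distances over a small subset, agglomerative clustering, then one mean vector per cluster) could be sketched roughly as follows. This is only an illustration in plain Python: the function names, the Euclidean metric, and the average-linkage merge rule are assumptions, since the thread does not say which metric or linkage is actually used.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative_centers(vectors, k):
    """Merge the closest clusters (average linkage) until k remain,
    then return the mean vector of each cluster."""
    n = len(vectors)
    # Pairwise distance matrix over the subset.
    dist = [[euclidean(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]

    # Start with every point in its own cluster (clusters hold point indices).
    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average linkage: mean distance over all cross-cluster pairs.
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # b > a, so removing cluster b does not shift the position of cluster a.
        clusters[a].extend(clusters.pop(b))

    dim = len(vectors[0])
    return [
        [sum(vectors[i][d] for i in members) / len(members) for d in range(dim)]
        for members in clusters
    ]
```

The naive merge loop is far from efficient, which is acceptable only because the subset is tiny; for anything larger a library implementation would be the more sensible choice.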
- AN
what Qdrant can do is calculate the distance matrix for this subset in one API call, which should be pretty fast
- AN
you can use a recommendation batch request with a filter on the subset point IDs
- DM
I am taking the random subset from Postgres
- AN
taking a random subset and calculating distances somewhere else might be slower, because you need to transfer the vectors over the network.
- DM
is there a way to get random things from qdrant?
- AN
I don't think so. You would need to generate the IDs of the subset externally
- DM
and query ids as a payload filter?
- AN
yes
- DM
smart
- DM
but how would the recommendation API get me a distance matrix?
- AN
you would need to make a recommendation batch request
- AN
so it would be multiple recommendation requests in one api call
- AN
all with same filter
- AN
but a different positive ID
- DM
I am following you
- AN
https://qdrant.github.io/qdrant/redoc/index.html#tag/points/operation/recommendbatchpoints
- AN
internally, we have an optimization to group requests with the same filter, so the filter will be reused between requests inside the batch
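
To make the suggestion above concrete, here is a rough sketch of how such a batch body could be assembled: one recommend request per subset point, all sharing the same ID filter, each with a different positive ID. The helper names are hypothetical, and the field names (`searches`, `positive`, `filter`, `has_id`, `limit`) and the response shape (one list of `{"id", "score"}` hits per request) reflect my reading of the linked API reference, so they should be checked against it before use.

```python
def build_recommend_batch(subset_ids):
    """Build a batch body: one recommend request per point in the subset.

    Every request reuses the same filter (restricting results to the subset)
    and differs only in its positive example.
    """
    ids = list(subset_ids)
    shared_filter = {"must": [{"has_id": ids}]}
    return {
        "searches": [
            {"positive": [point_id], "filter": shared_filter, "limit": len(ids)}
            for point_id in ids
        ]
    }


def scores_to_matrix(subset_ids, batch_result):
    """Arrange per-request hits into a square score matrix.

    Row i holds the scores returned for positive ID subset_ids[i].
    Cells with no returned hit stay None; this sketch assumes the positive
    point itself is not returned, so the diagonal is expected to be None.
    """
    ids = list(subset_ids)
    index = {point_id: i for i, point_id in enumerate(ids)}
    matrix = [[None] * len(ids) for _ in ids]
    for row_id, hits in zip(ids, batch_result):
        for hit in hits:
            matrix[index[row_id]][index[hit["id"]]] = hit["score"]
    return matrix
```

The body would presumably be sent as a POST to the recommend batch endpoint of the collection. Note that the returned values are similarity scores in the collection's metric, so whether and how they convert to distances depends on that metric.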
- DM
I will dig into it, thank you