
I am on 3.8 conda, will try to install separately to see if that's the issue

Last active 5 months ago

75 replies

10 views

  • MA

    I am on 3.8 conda, will try to install separately to see if that's the issue

  • AN

    Hey, were you able to resolve the issue?

  • MA

    not yet, I installed scipy separately using brew but poetry install was still giving the same error. will try to remove it from poetry and try again

  • AN

    Weird. Ok, let us know.

  • MA

    Hey, I couldn't do this until now. It still doesn't work and gets stuck during poetry install.

    I tried to use docker-compose as well, and get this error:

     => [4/7] COPY poetry.lock pyproject.toml /code/                                                                                                            0.1s
     => ERROR [5/7] RUN poetry config virtualenvs.create false   && poetry install --no-dev --no-interaction --no-ansi                                          0.8s
    ------
     > [5/7] RUN poetry config virtualenvs.create false   && poetry install --no-dev --no-interaction --no-ansi:
    #9 0.713 Traceback (most recent call last):
    #9 0.713   File "/usr/local/bin/poetry", line 5, in <module>
    #9 0.713     from poetry.console import main
    #9 0.713   File "/usr/local/lib/python3.8/site-packages/poetry/console/__init__.py", line 1, in <module>
    #9 0.713     from .application import Application
    #9 0.713   File "/usr/local/lib/python3.8/site-packages/poetry/console/application.py", line 7, in <module>
    #9 0.713     from .commands.about import AboutCommand
    #9 0.713   File "/usr/local/lib/python3.8/site-packages/poetry/console/commands/__init__.py", line 2, in <module>
    #9 0.714     from .add import AddCommand
    #9 0.714   File "/usr/local/lib/python3.8/site-packages/poetry/console/commands/add.py", line 8, in <module>
    #9 0.714     from .init import InitCommand
    #9 0.714   File "/usr/local/lib/python3.8/site-packages/poetry/console/commands/init.py", line 16, in <module>
    #9 0.714     from poetry.core.pyproject import PyProjectException
    #9 0.714 ImportError: cannot import name 'PyProjectException' from 'poetry.core.pyproject' (/usr/local/lib/python3.8/site-packages/poetry/core/pyproject/__init__.py)
    ------
    executor failed running [/bin/sh -c poetry config virtualenvs.create false   && poetry install --no-dev --no-interaction --no-ansi]: exit code: 1
    ERROR: Service 'web' failed to build : Build failed
    
  • MA

    do you know if anyone can help?

  • AN

    @andrey.vasnetsov do you have an idea?

  • MA

    seems to be an issue when using poetry, I installed openblas using brew and then did pip install scipy and pip install sentencepiece then updated the package versions in poetry.lock then poetry install finally worked.

  • AN

    Cool. Hope you can start to explore now.

  • MA

    thanks 🙂

  • MA

    Using sqlite in the search example is just for demo purposes right? If I have a huge text dataset, I can still use Postgres?

  • MA

    index_df method seems to just use to_sql which should work for any sql databases, just wanted to check if I need to look through more on qdrant client on if it changes anything for postgres during search

  • AN

    The text search is just for comparison. You don't have to implement it if you just want to do the neural search. You can put everything into Qdrant.

  • AN

    Here is a tutorial https://qdrant.tech/articles/neural-search-tutorial/

  • MA

    ok I think it makes sense now

  • MA

    thanks

  • MA

    the neural search uploads to locally running qdrant instance right? and not on the qdrant server

  • AN

    if you specify localhost - it would be a local server

  • MA

    ok thanks

  • MA

    I think it works well when I tried on a small dataset.

  • MA

    is there a way to update collection with new records? I see upsert but it takes type.Point as input, so not sure how I can create that?

  • AN

    point is id + vector + payload

  • AN

    usually payload could be just empty
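
    A minimal sketch of "id + vector + payload" as plain Python data, shaped like the JSON body of the REST upsert endpoint. The helper name `build_points` and the sample values are made up for illustration, and the field names should be checked against the OpenAPI spec for the Qdrant version in use.

```python
from typing import Any, Dict, List, Optional, Sequence


def build_points(
    ids: Sequence[int],
    vectors: Sequence[Sequence[float]],
    payloads: Optional[Sequence[Optional[Dict[str, Any]]]] = None,
) -> List[Dict[str, Any]]:
    """Zip ids, vectors and optional payloads into one dict per point."""
    if len(ids) != len(vectors):
        raise ValueError("ids and vectors must have the same length")
    if payloads is None:
        payloads = [None] * len(ids)
    elif len(payloads) != len(ids):
        raise ValueError("payloads must have the same length as ids")

    points = []
    for point_id, vector, payload in zip(ids, vectors, payloads):
        point: Dict[str, Any] = {
            "id": point_id,
            "vector": [float(x) for x in vector],
        }
        if payload:  # the payload is optional and can be left out
            point["payload"] = payload
        points.append(point)
    return points


# Hypothetical new records to add to an existing collection.
new_points = build_points(
    ids=[101, 102],
    vectors=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    payloads=[{"title": "first"}, None],
)

# Assumed request body layout: a list of points under the "points" key.
upsert_body = {"points": new_points}
```

    Upserting a point whose id already exists replaces it, so the same structure covers both new and updated records.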

  • MA

    ok I'll try

  • MA

    sorry one more question:

    which api is to find out if all payload and vector points got uploaded? count of some sort

  • AN

    https://qdrant.github.io/qdrant/redoc/index.html?v=master#tag/collections/operation/get_collection
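
    A rough sketch of reading counts from that endpoint with only the standard library. It assumes the response wraps the collection info in a `result` object and that the counters are the keys ending in `_count`; the sample response and its field names are illustrative, not copied from a real server.

```python
import json
from typing import Any, Dict
from urllib.request import urlopen


def collection_info_url(base_url: str, collection_name: str) -> str:
    """Build the URL of the collection info endpoint."""
    return f"{base_url.rstrip('/')}/collections/{collection_name}"


def extract_counts(response: Dict[str, Any]) -> Dict[str, Any]:
    """Return every '*_count' field found in the 'result' object."""
    result = response.get("result") or {}
    return {key: value for key, value in result.items() if key.endswith("_count")}


def fetch_counts(base_url: str, collection_name: str, timeout: float = 10.0) -> Dict[str, Any]:
    """Call the endpoint and return the counters (needs a running server)."""
    url = collection_info_url(base_url, collection_name)
    with urlopen(url, timeout=timeout) as resp:
        return extract_counts(json.load(resp))


# Illustrative response only; real field names depend on the server version.
sample_response = {
    "result": {"status": "green", "vectors_count": 1000, "segments_count": 2},
    "status": "ok",
    "time": 0.0,
}
counts = extract_counts(sample_response)
```

    Comparing the reported count with the number of rows that were sent is a quick way to confirm the upload finished.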

  • MA

    got it thanks

  • MA

    what is the minimum required VM I can use to deploy the docker image? only for a demo purposes at the moment, will be trying to use helm chart/qdrant cloud later on

  • AN

    you can run it on pretty much anything. a service with an empty collection needs only ~40 MB

  • MA

    amazing

  • MA

    my metadata vectors.npy are about 500MB, is that why the docker image shows to be 125mb?

  • AN

    docker image does not contain data

  • MA

    ok got it, so the data is just persisted on disk

  • MA

    I was able to deploy. while uploading vectors and payloads, I got this

    it worked when I had data on my local and qdrant running locally. This time, I was uploading collection from local to a remote qdrant server. Could that be the issue?

  • AN

    it looks like, there is something malformed in metadata, possibly

  • MA

    mhm, it worked on my mac. I'll check the json file again

  • AN

    it might help for debug to set parallel = 1, batch_size = 1 and see on which step it fails
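
    The same idea as a generic sketch: send records one at a time and stop at the first one that fails. `find_first_failure` and `fake_upload` are hypothetical names; in practice `upload_one` would wrap whatever upload call is being used.

```python
from typing import Any, Callable, Iterable, Optional, Tuple


def find_first_failure(
    records: Iterable[Any],
    upload_one: Callable[[Any], None],
) -> Optional[Tuple[int, Any, Exception]]:
    """Upload records one by one; return (index, record, error) for the first failure."""
    for index, record in enumerate(records):
        try:
            upload_one(record)
        except Exception as exc:  # keep the error so it can be inspected
            return index, record, exc
    return None  # every record went through


def fake_upload(record: dict) -> None:
    """Stand-in uploader that rejects non-integer ids (illustration only)."""
    if not isinstance(record["id"], int):
        raise ValueError(f"bad id: {record['id']!r}")


sample_records = [{"id": 1}, {"id": 2}, {"id": "abc"}, {"id": 4}]
failure = find_first_failure(sample_records, fake_upload)
if failure is not None:
    index, record, error = failure
    print(f"record {index} failed: {record} ({error})")
```

    This is slow on a large dataset, so it is only meant for locating the bad record before going back to batched uploads.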

  • MA

    ok

  • MA

    do we have an example of image similarity using qdrant? I saw the food discovery but couldn't find a notebook code. is it using text or visual similarity?

  • AN

    Visual similarity

  • AN

    Here is a tutorial for image search https://lukawskikacper.medium.com/how-to-implement-a-visual-search-at-no-time-5515270d27e3

    https://github.com/qdrant/demo-hnm the code

  • MA

    oh amazing, thanks

  • MA

    getting this error when uploading:

    UnexpectedResponse: Unexpected Response: 422 (Unprocessable Entity) Raw response content: b'{"result":null,"status":{"error":"Json deserialize error: data did not match any variant of untagged enum ExtendedPointId at line 1 column 68"},"time":0.0}'

  • MA

    422 (Unprocessable Entity)

  • MA

    was trying out the sample code

        collection_name=COLLECTION_NAME,
        wait=True,
        point_insert_operations=PointsBatch(
            batch=Batch(
                ids=ids,
                payloads=payloads,
                vectors=vectors,
            )
        ),
    )
    
  • MA

    all of ids, payloads, and vectors seem to have data as well

  • AN

    what is the point ids?

  • MA

    it's not int, it's a string.

  • MA

    is that a requirement? I

    class Batch(BaseModel):
        ids: List["ExtendedPointId"] = Field(..., description="")
        vectors: List[List[float]] = Field(..., description="")
        payloads: Optional[List["Payload"]] = Field(None, description="")

  • AN

    qdrant only supports int or uuid as ID

  • MA

    ok I see

  • AN

    uuid could be represented as string in python, but the actual restrictions are in OpenAPI https://qdrant.github.io/qdrant/redoc/index.html#tag/points/operation/upsert_points
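
    One possible way to keep string ids, sketched with the standard `uuid` module: derive a deterministic UUID from each string and store the original id in the payload. The namespace choice, the helper name `to_point_id`, and the `original_id` payload key are assumptions for illustration.

```python
import uuid
from typing import Union

# Assumption: any fixed namespace works, as long as it never changes
# between runs, otherwise the same string would map to a different id.
ID_NAMESPACE = uuid.NAMESPACE_URL


def to_point_id(raw_id: Union[int, str]) -> Union[int, str]:
    """Pass non-negative integers through; map anything else to a UUID string."""
    if isinstance(raw_id, bool):
        raise TypeError("bool is not a valid id")
    if isinstance(raw_id, int):
        if raw_id < 0:
            raise ValueError("integer ids must be non-negative")
        return raw_id
    # uuid5 is a hash of namespace + name, so it is stable across runs.
    return str(uuid.uuid5(ID_NAMESPACE, str(raw_id)))


raw_ids = ["sku-123", "sku-456"]
ids = [to_point_id(raw) for raw in raw_ids]
# Keep the original string so results can be mapped back later.
payloads = [{"original_id": raw} for raw in raw_ids]
```

    Because the mapping is deterministic, re-uploading the same record overwrites the existing point instead of creating a duplicate.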

  • MA

    ok makes sense. thanks

  • MA

    seems to be working so far 🙂

  • MA

    it might take a while to generate resnet vector encodings, do you think if I parallelize it in python, it would cause any issues? doesn't seem to be using a lot of GPU

  • AN

    it might be using GPU computational resources instead, not just ram

  • AN

    but if the bottleneck is somewhere else yes, it might make sense

  • MA

    ok makes sense

  • MA

    visual similarity working really good, better than the text search. maybe I need to clean up the search text encoding data and try again

  • MA

    is there a standard way to provide user inputs e.g. if I dislike something, any way we can incorporate that feedback in search results/recommendations?

  • AN

    there is an experimental 'negative' param in the recommendation api
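
    To make the idea concrete, here is a small sketch of folding likes and dislikes into a single query vector. This is a generic illustration and is not taken from the recommendation api itself; the function names and the `weight` parameter are made up.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Element-wise average of equally sized vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same size")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def feedback_query(
    positive: Sequence[Vector],
    negative: Sequence[Vector] = (),
    weight: float = 1.0,
) -> List[float]:
    """Start from the liked average and push it away from the disliked average."""
    pos = mean_vector(positive)
    if not negative:
        return pos
    neg = mean_vector(negative)
    return [p + weight * (p - n) for p, n in zip(pos, neg)]


liked = [[1.0, 0.0]]
disliked = [[0.0, 1.0]]
query = feedback_query(liked, disliked)
```

    The resulting vector could then be sent to the regular search api, which makes it easy to compare against results from the recommendation api.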

  • MA

    ok, currently only using search api. In what instance do you think we should be using recommendation api vs search api for similarity ?

  • MA

    i see, thanks. I'll compare results

  • EI

    Currently getting the same error when trying to create a collection. this is the complete code I'm running:

    from qdrant_client import QdrantClient
    from qdrant_client.http import models
    
    client = QdrantClient(host="localhost", port=6333)
    
    client.recreate_collection(
        collection_name="txt2img",
        vectors_config={
            "image": models.VectorParams(size=512, distance=models.Distance.COSINE),
            "caption": models.VectorParams(size=512, distance=models.Distance.COSINE),
        }
    )
    

    giving the error

    UnexpectedResponse: Unexpected Response: 422 (Unprocessable Entity)
    Raw response content:
    b'{"result":null,"status":{"error":"Json deserialize error: missing field `vector_size` at line 1 column 284"},"time":0.0}'
    

    I'm running qdrant via docker, version 0.9.1. Python client running via conda. Conda says it's version 0.11.5, but qdrant_client.__version__ gives 0.10.0. Is this related to a version mismatch between the docker container and the python client?

  • AN

    qdrant client is outdated, the actual version is 0.11.5 - https://pypi.org/project/qdrant-client/

  • EI

    thanks for the quick response. it looks like there might be some version issues going on?

    my conda install says 0.11.5 is installed, but when I actually import the module it shows 0.10.0.

    I created a new conda env to test, same thing

    I verified the file path is pulling from the correct env, which contains the qdrant_client-0.11.5.dist-info/ artifacts, so it doesn't look like a path issue.

  • AN

    yeah, we didn't update this internal variable. Just removed it from the repo, thanks for noticing!

  • AN

    Once pip shows that it is 0.11.5, it should be fine
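
    A small sketch for checking this from Python instead of pip, using `importlib.metadata` (available from Python 3.8). It compares the version recorded in the installed distribution metadata with whatever the module reports in `__version__`; the helper names are made up.

```python
import importlib
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Version from the installed distribution metadata (what pip reports)."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def module_version(module_name: str) -> Optional[str]:
    """Version from the module's own __version__ attribute, if it has one."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


dist = installed_version("qdrant-client")
reported = module_version("qdrant_client")
if dist is None:
    print("qdrant-client is not installed in this environment")
elif reported is not None and reported != dist:
    print(f"metadata says {dist}, module reports {reported}")
else:
    print(f"qdrant-client {dist}")
```

    When the two disagree, the distribution metadata is the one that reflects what was actually installed.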

  • EI

    it's still throwing the same error on running recreate_collection. this is with the 0.11.5 python client. do I need to update the docker container? curl localhost:6333 returns {"title":"qdrant - vector search engine","version":"0.9.1"}

  • AN

    then you also have an old service

  • EI

    makes sense. is there a way to update the service while maintaining the current collections I have on it?

  • AN

    there is a breaking change in 0.10.1 - https://github.com/qdrant/qdrant/releases/tag/v0.10.1

  • AN

    you can try to update versions one-by-one though

  • AN

    we guarantee backward compatibility only between two consecutive versions
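
    A sketch of planning such a stepwise upgrade. It reads the server version from the JSON returned by the root endpoint and lists the releases to go through in order. The `releases` list here is illustrative and incomplete; the real list, and which releases count as neighbours, should come from the release notes.

```python
import json
from typing import List, Sequence, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.10.1' or 'v0.10.1' into (0, 10, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def server_version(root_response: str) -> str:
    """Extract the version from the JSON body returned by the root endpoint."""
    return json.loads(root_response)["version"]


def upgrade_path(current: str, target: str, releases: Sequence[str]) -> List[str]:
    """Releases to install, in order, to get from current to target."""
    cur, tgt = parse_version(current), parse_version(target)
    if tgt < cur:
        raise ValueError("target is older than the current version")
    ordered = sorted(set(releases), key=parse_version)
    return [r for r in ordered if cur < parse_version(r) <= tgt]


# Response body as returned by `curl localhost:6333` in the message above.
root_body = '{"title":"qdrant - vector search engine","version":"0.9.1"}'

# Illustrative, incomplete release list (assumption).
releases = ["0.9.1", "0.10.0", "0.10.1", "0.11.0"]

steps = upgrade_path(server_version(root_body), "0.11.0", releases)
```

    Numeric parsing matters here: compared as plain strings, "0.9.1" would sort after "0.10.0".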

  • EI

    ah I see. I bumped the python client version because I wanted to create a multi-vector collection and saw that was added to the python client. didn't think about the service. I'll sort through the versions.
    thanks for the quick responses!
