  • SN

    We're seeing more of the broken pipe and socket timeout errors we had noticed before:
    https://discord.com/channels/907569970500743200/928649303898075196/1059893176082645043
    The only difference on our side is that we're using EFS rather than storing data on EBS.

    (Has anyone else noticed these errors?) Now I'm seeing them on Retrofit calls at read time as well, and happening at read time makes it critical.

    Sharing the stack trace from read time.

  • KA

    Hi, could you please provide the details of the request you send to Qdrant? Is there a chance you're setting the timeout on your end?

  • SN

    qdrant.connectTimeout=6000
    qdrant.readTimeout=20000

    These are in ms, but yes, we do have timeouts set up. This is for hitting the recommend API from the Java app as a Retrofit call (for which I have shared the stack trace).

    For writing data to Qdrant from the Python app, it's 300, and it's set like this (qdrant-client version is 0.11.5):

    self.client = QdrantClient(host=app.config['VECTOR_DB_HOST'], port=app.config['VECTOR_DB_PORT'], timeout=Timeout(timeout=app.config['VECTOR_DB_TIMEOUT']))
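
    A self-contained sketch of that setup, with placeholder values standing in for app.config and assuming Timeout here is httpx.Timeout (which that client version's REST transport uses under the hood):

    from httpx import Timeout
    from qdrant_client import QdrantClient

    # Placeholder values standing in for app.config; adjust to your setup.
    VECTOR_DB_HOST = "qdrant.internal"
    VECTOR_DB_PORT = 6333
    VECTOR_DB_TIMEOUT = 300  # seconds

    client = QdrantClient(
        host=VECTOR_DB_HOST,
        port=VECTOR_DB_PORT,
        timeout=Timeout(timeout=VECTOR_DB_TIMEOUT),
    )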

  • KA

    And how about the recommendation API call? How many negative/positive examples do you provide? Is there any additional filtering done? How many results do you expect?
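
    For comparison, in the Python client those parameters map onto the recommend call roughly like this; the collection name, point id, and payload field below are made up purely for illustration:

    from qdrant_client import QdrantClient
    from qdrant_client.http import models

    client = QdrantClient(host="localhost", port=6333)

    # Hypothetical collection, point id, and payload field, only to show where the
    # positive/negative examples, extra filtering, and result count go.
    hits = client.recommend(
        collection_name="items",
        positive=[42],    # point ids used as positive examples
        negative=[],      # optional negative examples
        query_filter=models.Filter(
            must=[models.FieldCondition(key="category", match=models.MatchValue(value="books"))]
        ),
        limit=10,         # how many results to return
    )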

  • AN

    The bottleneck could also be EFS, since it can't provide the same low per-IO disk latency as EBS.

  • SW

    Yeah 💯

  • SN

    Sorry for the late reply. It's only one point in positive. I also suspect EFS 😅 but my concern is how I can prove that, since if I suggest it as the solution the issue might still persist.

  • SN

    Also, EFS has the benefit of sharing, and it's easier to back up using AWS Backup, whereas with EBS the backup would only go to S3, and we would ideally want to take a data backup daily.

  • FA

    You can also take EBS backups daily, and even incrementally. I haven't worked much with EFS, but it is not meant for workloads like databases, and I am not 100% sure the backups would even be consistent. EBS snapshots are convenient to create and can also easily be transferred and recovered.
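
    As a rough sketch, scheduling that from a daily job with boto3 could look like the following; the volume id is a placeholder and AWS credentials/region are assumed to be configured in the environment:

    import boto3  # assumes AWS credentials and region are already configured

    ec2 = boto3.client("ec2")

    # Create an (incremental) snapshot of the data volume; run this from a daily
    # cron job or a scheduled Lambda. The volume id below is a placeholder.
    snapshot = ec2.create_snapshot(
        VolumeId="vol-0123456789abcdef0",
        Description="daily qdrant data backup",
    )
    print(snapshot["SnapshotId"])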

  • FA

    Proving that EFS is the bottleneck is a bit tricky, but you could take a look at iowait, for example.
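
    One rough way to watch that, assuming psutil is available on the Qdrant host (iowait is only reported on Linux):

    import psutil  # assumption: psutil is installed on the host

    # Sample CPU time percentages over one-second windows and print the iowait share.
    # Consistently high iowait while Qdrant is serving reads from EFS points at storage latency.
    for _ in range(10):
        cpu = psutil.cpu_times_percent(interval=1)
        print(f"iowait: {getattr(cpu, 'iowait', 0.0):.1f}%")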
