Overview

This post contains a few musings about vector databases that you may not have considered. I'll keep it brief and to the point.

The obvious scaling pitfall of vector databases

Let's say you're a startup that has just signed a contract with a large enterprise customer, and what this customer wants is retrieval-augmented generation (RAG) over their entire document store.

You say no problem to the following requirements:

... and because you're moving quickly and aren't afraid to be scrappy, you use OpenAI's vector embedding API.

Let's work backwards from how many vectors we think we'll have:

$$N_v = N_{docs} \times N_{pages\ per\ doc} \times N_{paragraphs \ per\ page}$$
$$N_v = 10^5 \times 50 \times 10 = 50,000,000$$

Easy peasy. I think. How much space do I need?

$$S_v = S_{bytes\ per\ vector} \times N_v = 1536 \times 4 \times 50,000,000 \approx 307\ GB$$

"Wait", I hear you say. "Wait, I was just going to run this in Docker and deploy it to ECS".

Nope.

It takes roughly 300 GB of RAM just to hold the raw vectors (1536 dimensions at 4 bytes each) for a database of 100k documents. This is why, when given the choice, you buy an existing solution. Don't try to build this yourself; use Qdrant.
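
If you want to plug in your own numbers, the back-of-the-envelope estimate above can be sketched as a small Python helper. This is a rough sketch, not a sizing tool: the function name and parameters are my own (hypothetical), and it assumes one vector per paragraph, 4-byte float components, and counts only the raw vectors, not index structures or payloads.

```python
def estimate_vector_memory(
    n_docs: int,
    pages_per_doc: int,
    paragraphs_per_page: int,
    dims: int = 1536,        # embedding dimensionality used in the estimate above
    bytes_per_dim: int = 4,  # assumption: 32-bit floats
) -> tuple[int, int]:
    """Return (number of vectors, raw bytes needed to hold them)."""
    # One vector per paragraph: docs x pages per doc x paragraphs per page.
    n_vectors = n_docs * pages_per_doc * paragraphs_per_page
    # Raw storage only: dims x bytes per component for every vector.
    total_bytes = n_vectors * dims * bytes_per_dim
    return n_vectors, total_bytes


# The scenario from this post: 100k documents, 50 pages each, 10 paragraphs per page.
n_vectors, total_bytes = estimate_vector_memory(100_000, 50, 10)
print(f"{n_vectors:,} vectors, {total_bytes / 1e9:,.1f} GB of raw vectors")
```

Treat the result as a lower bound: whatever index sits on top of the vectors needs memory of its own.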

Appendix

Recommendation

I've used Qdrant in the past, and it is excellent.

Band-aids

The following are band-aids. They might sound appealing, but they can often leave you digging a deeper hole.