Overview
This post contains a few musings about vector databases that you may not have considered. I'll keep it brief.
The obvious scaling pitfall of vector databases
Let's say you're a startup that has just signed a contract with a large enterprise customer, and what this customer wants is retrieval-augmented generation (RAG) over their entire document store.
You say no problem to the following requirements:
...and because you're moving quickly and aren't afraid to be scrappy, you use OpenAI's embeddings API.
Let's work backwards from how many vectors we think we'll have:
Easy peasy. I think. How much space do I need?
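Here's a back-of-the-envelope sketch of that calculation. The inputs are my illustrative assumptions, not measured figures: 5,000 chunks per document is a hypothetical value picked so the result lands in the same ballpark as the 3 TB quoted in this post, and 1,536 dimensions stored as 4-byte floats is what you'd typically get from OpenAI's text-embedding-ada-002 or text-embedding-3-small. The function name is made up for this example.

```python
def estimate_raw_vector_bytes(num_documents: int,
                              chunks_per_document: int,
                              dimensions: int,
                              bytes_per_value: int = 4) -> int:
    """Return the bytes needed to hold the raw vectors only (no index, no payload)."""
    num_vectors = num_documents * chunks_per_document
    return num_vectors * dimensions * bytes_per_value


# Hypothetical inputs; swap in your own numbers.
NUM_DOCUMENTS = 100_000        # document count from the scenario
CHUNKS_PER_DOCUMENT = 5_000    # assumption: depends on document length and chunk size
DIMENSIONS = 1_536             # assumption: embedding model output size

total_vectors = NUM_DOCUMENTS * CHUNKS_PER_DOCUMENT
total_bytes = estimate_raw_vector_bytes(NUM_DOCUMENTS, CHUNKS_PER_DOCUMENT, DIMENSIONS)

print(f"vectors: {total_vectors:,}")
print(f"raw size: {total_bytes / 1e12:.2f} TB")
```

Keep in mind this counts only the raw float32 values. Whatever index you build on top (an HNSW graph, for example) and any stored payloads add to the total.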
"Wait", I hear you say. "Wait, I was just going to run this in Docker and deploy it to ECS".
Nope.
It takes 3 TB of RAM to host a vector database for 100k documents. This is why, when given the choice, you buy an existing solution. Don't try to build this yourself; use Qdrant.
Appendix
Recommendation
I've used Qdrant in the past, and it is excellent.
Band-aids
The following are band-aids: they might sound appealing, but they often leave you digging a deeper hole.