I was recently asked by somebody to answer some questions regarding MongoDB. Unfortunately, I have yet to use it in production, but Ara, Zach and I have put it through quite a few paces at this point ...
Nature of Use:
It would be useful if you could mention the nature of the application (e.g. reporting or analytics) you are using MongoDB for.
We use MongoDB for high-volume logging. After what we need is logged, we use Python/PyMongo to transform the data into chunks suitable for Postgres. Postgres is the central data store for our Django application and all its associated models.
What other NoSQL storage solutions were evaluated, and why was MongoDB chosen over the others?
Cassandra was the other one that we got pretty far with. In terms of maturity and scalability, Cassandra appeared to be the winner. However, Cassandra has extremely limited query capabilities that weren't sufficient for us. In addition, MongoDB plans to focus on scalability, which suits our needs fine.
Robustness:
How long have you been running MongoDB in production?
We have not run it in production yet.
Did you encounter any issues on the stability front (any crashes or restarts needed)?
One issue is how best to keep it 'living' without human intervention. So far, the tools have been very straightforward and simpler than solutions for other products. However, we haven't tested the quality of backups under high load, nor have we really pressured the system in the wild. We architected MongoDB into our system so that we could lose it and all we would lose is incoming data while it was down, not historical data or reporting capabilities (which is OK for us for a few hours).
Performance:
What has your experience been on the performance side (queries/sec for the hardware configuration being used)?
We hit 30 inserts per second on a high-CPU (the lowest 64-bit) Amazon EC2 instance. However, the bottleneck was in our Python listener, so we don't know how much higher MongoDB could go. We suspect quite a lot, as the load average was under 0.2 during this test.
Did performance degrade as the data size grew?
We haven't sufficiently tested this yet.
Scalability:
What is the rough data size (number of records, number of collections, size on disk) Mongo is being used for?
The goal is to hit 1k inserts/second with real-time processing (i.e. using their upsert functionality, which is something like INSERT ... ELSE UPDATE) and to hold onto 10M+ records in a collection. If we weren't confident that this was possible, we would not have chosen MongoDB.
Does all the data sit on one MongoDB server, or are you using MongoDB in a clustered environment? If it is being used in a sharded environment, we would like to know your experience, because MongoDB does not support auto-sharding out of the box.
We are using sharding, but again, we have not pushed it to the limit. Although MongoDB does not support auto-sharding, manually setting up a shard is pretty straightforward. Auto-sharding is one of the advantages Cassandra has.
Data Replication/Persistence:
Did you use data replication in Mongo? What has been the general experience with it?
We are planning to use replication but are not doing so yet. As mentioned above, we have the option of losing MongoDB for a few hours without incurring a major business penalty.
Regarding persistence of data, did you encounter any issues given that MongoDB does lazy writes to the file system?
No, but again it has not been pushed enough for me to feel confident that this is a non-issue. We are planning to use XFS, however, which does have journaling to account for problems at the file block level.
Search:
Did your application require text searches on the documents stored in Mongo? Since MongoDB does not support text search out of the box, how did you take care of search?
We aren't using full-text search. Our plan is to set up Sphinx or something similar when we need it. That seems like the right architectural solution.
Support:
For resolving issues related to Mongo, did you rely on the open-source community or sign up for paid support? What has been your experience?
Community.
Client-side tools:
Which libraries did you use to talk to the MongoDB server? Our web app will be running in Python, and there are two libraries available for Python.
PyMongo.
It would be great if you could share pointers to the client-side tools you are using with MongoDB.
The Mongo interface is a bit chunky (the way it uses JSON for everything), so often I just use PyMongo since all of our real code uses that anyway. Our plan is to have only a small number of collections, so any necessary queries would happen through our code, not in an ad hoc way requiring a client GUI or something like that.
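For illustration only, here is a minimal sketch of what the upsert-style processing mentioned under Scalability (the "INSERT ... ELSE UPDATE" idea) might look like through PyMongo. The field names `key`, `count` and `last_seen`, the helper names, and the database and collection names in the usage comment are all hypothetical, chosen just to make the example concrete.

```python
def build_upsert(event):
    """Build the (filter, update) pair for one logged event.

    The field names "key", "count" and "last_seen" are hypothetical.
    """
    query = {"key": event["key"]}
    update = {
        "$inc": {"count": 1},                       # bump the hit counter
        "$set": {"last_seen": event["timestamp"]},  # remember the latest hit
    }
    return query, update


def record_event(collection, event):
    """Insert a document for the event's key, or update the existing one."""
    query, update = build_upsert(event)
    # With upsert=True, MongoDB inserts a new document when nothing matches
    # the filter and otherwise applies the update to the matching document.
    return collection.update_one(query, update, upsert=True)


# Usage against a real server (assumes a local mongod; names are hypothetical):
#   from pymongo import MongoClient
#   events = MongoClient()["logs"]["events"]
#   record_event(events, {"key": "/home", "timestamp": 1700000000})
```

`update_one` is the call in current PyMongo releases; older releases exposed the same behaviour through `update(..., upsert=True)`. Keeping the filter and update construction in a separate function makes that part easy to check without a running server.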