MongoDB is one of the most popular open source NoSQL databases available. It avoids the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas. In this article, Jenny Richards explains why MongoDB is a good NoSQL implementation.
Author: Jenny Richards, Remote DBA Support
What is MongoDB?
MongoDB is a popular distributed, schema-less, document-oriented storage solution. It has a lot in common with CouchDB and Couchbase, and it uses JSON-like documents for representing, modifying, and querying data. Inside MongoDB, data is stored in the binary JSON (BSON) format. Drivers are available for several languages, the most popular being C++, PHP, Java, Python, and Ruby. 10gen has been developing MongoDB since 2007, and it has been an open source project under the AGPL license since 2009.
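To make the idea of JSON-like documents with dynamic schemas concrete, here is a minimal Python sketch. It is an illustration only: the `customers` list and the `field_names` helper are hypothetical names invented for this example, and the standard `json` module stands in for BSON, which is a binary encoding and is not reproduced here.

```python
import json

# Two documents in the same hypothetical "customers" collection.
# They do not share the same set of fields, which a fixed-column
# table would not allow without a schema change.
customers = [
    {"_id": 1, "first_name": "Ada", "last_name": "Lovelace", "phone": "555-0100"},
    {"_id": 2, "first_name": "Alan", "last_name": "Turing",
     "emails": ["alan@example.com"], "address": {"city": "London"}},
]


def field_names(documents):
    """Return the sorted union of top-level field names across all documents."""
    names = set()
    for doc in documents:
        names.update(doc.keys())
    return sorted(names)


if __name__ == "__main__":
    # Each document serializes to JSON text on its own, nested fields included.
    for doc in customers:
        print(json.dumps(doc))
    print(field_names(customers))
```

The second document carries a nested `address` and a list of `emails` that the first one lacks, yet both sit in the same collection.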
What is NoSQL?
To understand MongoDB, you need to understand that NoSQL (often read as “not only SQL”) is an alternative to RDBMSs (relational database management systems). An effective way of doing this is to compare the two.
- Whereas RDBMSs follow ACID rules and are transaction-based, a NoSQL system like MongoDB does not support full ACID rules and has no concept of multi-document transactions; a write is atomic only at the level of a single document.
- In RDBMSs, data is represented in fixed columns in tables. In NoSQL systems, there is no requirement that data be in fixed columns in tables.
- As the name suggests, NoSQL systems do not use the SQL definition and query language.
- In RDBMSs, data does not have to be partitioned by primary key. In NoSQL systems, data is typically accessed through primary keys.
The other reasons why companies like Remote DBA Support offer NoSQL solutions such as MongoDB rather than SQL solutions are that NoSQL allows for horizontal scaling and is simple in design.
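One way to picture horizontal scaling combined with access by primary key is the sketch below. It rests on simplifying assumptions and is not MongoDB's actual sharding mechanism: the helper names (`shard_for`, `distribute`, `find_by_id`) are hypothetical, and a plain hash of the `_id` field decides which in-memory list plays the role of a server.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a primary key to a shard number using a stable hash."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def distribute(documents, shard_count):
    """Group documents into shard_count buckets by their "_id" field."""
    shards = [[] for _ in range(shard_count)]
    for doc in documents:
        shards[shard_for(doc["_id"], shard_count)].append(doc)
    return shards


def find_by_id(shards, key):
    """Look up one document by visiting only the shard that owns its key."""
    for doc in shards[shard_for(key, len(shards))]:
        if doc["_id"] == key:
            return doc
    return None


if __name__ == "__main__":
    docs = [{"_id": i, "n": i * i} for i in range(100)]
    shards = distribute(docs, 4)
    print([len(s) for s in shards])  # how the 100 documents were spread out
    print(find_by_id(shards, 42))
```

Because the key alone determines the shard, a lookup by primary key touches one bucket regardless of how many buckets exist. That is why adding servers can spread the load.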
Why MongoDB?
MongoDB is the industry leader when it comes to NoSQL because it is easy to use and has the right mix of features to qualify as a prototypical NoSQL solution. In other NoSQL solutions, ease of use, especially when it comes to client drivers, seems to be an afterthought. Other benefits that place MongoDB above the rest are:
- MongoDB has a large community. This means any question that you may have about the tool will find a ready answer.
- There is a lot of documentation about MongoDB. Journaling and high availability, which MongoDB provides, are not common among other NoSQL solutions.
- Many of the DevOps tools for MongoDB are free or very affordable.
- MongoDB is built for speed. Memory-mapped files are used to store data, and the operating system's virtual memory manager, a highly optimized function of modern operating systems, is responsible for caching and paging. MongoDB also pads the area surrounding each document, which allows documents to be modified without having to be moved and so avoids costly relocations. The binary format (BSON), instead of a text format (XML, JSON) as is the case with some solutions, speeds up reads and writes.
- MongoDB allows for easy scaling of writes and reads, with replica sets and autosharding.
- The fact that big names like Craigslist, MTV, Disney, Foursquare, LexisNexis, Shutterfly, bit.ly, The New York Times, Forbes, The Guardian, GitHub, SAP, the UK National Archives, and Intuit are using MongoDB is a testament to how good MongoDB is.
- The Query Support feature allows you to query specific ranges of intended fields (range query). You are also able to query using regular expressions.
- The Secondary Index Support feature allows you not only to query the intended fields, but also to define these fields as secondary indexes for better data access.
- Another interesting MongoDB feature is Master-Slave Replication Support. This feature directs read and write operations to different servers, one acting as the master and the other as a slave.
- NoSQL is designed for quick iteration, frequent code pushes, and agile sprints, while relational models are not. This makes NoSQL a good idea for organizations whose data requires a lot of flexibility to manage.
- NoSQL’s architecture is efficient, while the relational model's architecture is monolithic, since schemas must be defined before data can be added. For example, if you want to collect customer data such as phone number, first name, and last name, a SQL database needs to know what you are storing before you can store it.
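The range and regular expression queries mentioned in the list above can be sketched in plain Python. This is a toy stand-in under stated assumptions, not MongoDB's query engine: `matches` and `find` are hypothetical helpers, and the sample `customers` data is invented. The filter documents borrow the `$gte`, `$lt`, and `$regex` operator names from MongoDB's query language so that the shape of a query is recognizable.

```python
import re

# Invented sample data. Note that only one customer has a "phone" field.
customers = [
    {"_id": 1, "first_name": "Ada", "last_name": "Lovelace", "age": 36},
    {"_id": 2, "first_name": "Alan", "last_name": "Turing", "age": 41,
     "phone": "555-0101"},
    {"_id": 3, "first_name": "Grace", "last_name": "Hopper", "age": 85},
]


def matches(doc, query):
    """Return True if doc satisfies every condition in query.

    Supports plain equality plus the $gte, $lt, and $regex operators.
    In this sketch a dict condition is always read as a set of operators.
    """
    for field, condition in query.items():
        if field not in doc:
            return False
        value = doc[field]
        if isinstance(condition, dict):
            for op, operand in condition.items():
                if op == "$gte":
                    ok = value >= operand
                elif op == "$lt":
                    ok = value < operand
                elif op == "$regex":
                    ok = re.search(operand, str(value)) is not None
                else:
                    raise ValueError("unsupported operator: " + op)
                if not ok:
                    return False
        elif value != condition:
            return False
    return True


def find(collection, query):
    """Return every document in collection that matches query."""
    return [doc for doc in collection if matches(doc, query)]


if __name__ == "__main__":
    print(find(customers, {"age": {"$gte": 30, "$lt": 50}}))   # range query
    print(find(customers, {"last_name": {"$regex": "^Ho"}}))  # regex query
```

A query on `phone` simply skips the documents that lack that field, which is the dynamic schema at work: no column had to be declared before the first phone number was stored.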
How to Optimize MongoDB
As is the case with most tools, MongoDB has some engineering tradeoffs. However, most of these challenges have solutions (and the rest are unlikely to affect you negatively).
- MongoDB indexes are not as flexible as those of some other NoSQL solutions, or even Oracle/MySQL/Postgres, because the indexes are B-trees and the order of the indexed fields matters. Real-time queries may also not be as fast as those of other NoSQL solutions (this is particularly so when it comes to array fields).
- You can solve these problems by ensuring MongoDB uses the indexes you have set. You can check this easily with the ‘explain’ function. MongoDB works well as long as you keep queries simple (not a problem) and are willing to do some homework. The large MongoDB community means the tool is constantly improving.
- MongoDB does not have the luxury of a built-in full-text search engine, unlike some NoSQL solutions, some of which use Lucene for indexing. However, it still supports basic text searches, even better than some traditional databases.
- With its first launch, MongoDB attracted criticism for requiring replicas to ensure data safety. 10gen has since addressed this problem by introducing Replica Sets, which improved MongoDB's availability and replication.
- MongoDB has a global read/write lock, which means a write operation blocks other reads and writes on the same server. Replica sets and shards help mitigate this concurrency problem. MongoDB 2.2 introduced locking at the database level, which improves concurrency.
- MongoDB, being relatively new, does not have a code base as mature as those of RDBMSs, and there are not yet many third-party development and management tools. However, MongoDB seems to be addressing these issues and works well for a wide variety of applications.
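The point about index order can be illustrated with a small Python sketch. It is a simplification under stated assumptions: a sorted list searched with `bisect` stands in for a B-tree, the helpers `build_index`, `find_by_prefix`, and `find_by_second_field` are hypothetical, and the "entries examined" count only loosely mirrors the kind of information the ‘explain’ function reports.

```python
import bisect

# Invented sample data.
people = [
    {"last_name": "Hopper", "age": 85},
    {"last_name": "Lovelace", "age": 36},
    {"last_name": "Turing", "age": 41},
    {"last_name": "Hopper", "age": 41},
    {"last_name": "Knuth", "age": 85},
]


def build_index(collection, fields):
    """Build a sorted list of (field values..., position) tuples.

    Entries are ordered by the first field, then the second, and so on,
    which is the ordering a B-tree on the same fields would maintain.
    """
    entries = [tuple(doc[f] for f in fields) + (pos,)
               for pos, doc in enumerate(collection)]
    entries.sort()
    return entries


def find_by_prefix(index, value):
    """Match on the FIRST indexed field using binary search.

    Returns (positions, entries_examined).
    """
    start = bisect.bisect_left(index, (value,))
    positions, examined = [], 0
    for i in range(start, len(index)):
        examined += 1
        if index[i][0] != value:
            break  # sorted order guarantees no later entry can match
        positions.append(index[i][-1])
    return positions, examined


def find_by_second_field(index, value):
    """Match on the SECOND indexed field only.

    The sort order does not help here, so every entry is examined.
    """
    positions = [entry[-1] for entry in index if entry[1] == value]
    return positions, len(index)


index = build_index(people, ["last_name", "age"])

if __name__ == "__main__":
    print(find_by_prefix(index, "Hopper"))   # stops soon after the matches
    print(find_by_second_field(index, 41))   # walks the whole index
```

An index on (last_name, age) answers a query on last_name after examining only a few entries, while a query on age alone has to examine all of them. Checking which case applies to your own queries is the homework the list above refers to.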
Tips for Effective MongoDB implementation
Consider starting small as this minimizes the risk.
Avoid the common temptation of DIY big data governance. This is a relatively new field, and only a person with the right skills and experience should handle it; otherwise, you may retrieve data that will be of no help.
About the author
Jenny Richards is a database and network specialist. She opines that if you are a PHP or Java developer who is new to NoSQL, you should consider MongoDB solutions from Remote DBA Support as a way to start off.