In the age of digital transformation, social media, mobile, cloud, and IoT (Internet of Things) technologies have proliferated over the last couple of years. People and businesses use these technologies to stay connected with others, reach customers, and drive their business. These platforms are always available, deliver a strong customer experience, and support large numbers of concurrent users. Users perform billions of interactions on them, generating enormous volumes of data. This data is usually unstructured and heterogeneous in nature.
Earlier database systems, namely relational database management systems (RDBMS), struggle to store and process such large amounts of heterogeneous data, because businesses want (near) real-time data management, which raises the scalability and speed requirements. This is why users and organizations are turning to NoSQL databases, which provide:
- Better application development productivity through a more flexible data model
- Greater ability to scale dynamically to support more users and data
- Improved performance, both to meet users' expectations of highly responsive applications and to allow more complex processing of data
What is NoSQL?
- Some read it as "Not only SQL", others as "non SQL". The term NoSQL gained popularity in the late 2000s
- It represents a class of non-relational data storage systems
- These databases usually do not require a fixed table schema, and they typically avoid joins
- Most NoSQL offerings relax one or more of the ACID properties (atomicity, consistency, isolation, durability: the famous RDBMS acronym) and instead follow the BASE properties and the trade-offs described by the CAP theorem
BASE properties
- Basically Available: individual faults are possible, but they do not bring down the whole system
- Soft state: copies of a data item may be temporarily inconsistent
- Eventually consistent: copies become consistent at some later time if there are no more updates to that data item
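The soft-state and eventual-consistency behaviour described above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: the `Replica` class, the `anti_entropy` function, and the last-writer-wins rule based on a version number are hypothetical choices made for illustration, not the mechanism of any particular NoSQL product.

```python
class Replica:
    """One copy of the data (hypothetical; not a real database client)."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> (version, value)

    def write(self, key, version, value):
        # Last-writer-wins: keep the entry with the highest version number.
        current = self.store.get(key)
        if current is None or version > current[0]:
            self.store[key] = (version, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None


def anti_entropy(replicas):
    """Push every replica's entries to every other replica."""
    for source in replicas:
        for target in replicas:
            if source is not target:
                for key, (version, value) in list(source.store.items()):
                    target.write(key, version, value)


replicas = [Replica("a"), Replica("b"), Replica("c")]
replicas[0].write("cart:42", 1, ["book"])
anti_entropy(replicas)

# Only replica "b" receives the update: the copies now disagree (soft state).
replicas[1].write("cart:42", 2, ["book", "pen"])
before_sync = [r.read("cart:42") for r in replicas]

# With no further updates, one synchronisation round makes the copies agree.
anti_entropy(replicas)
after_sync = [r.read("cart:42") for r in replicas]

print(before_sync)
print(after_sync)
```

Between the second write and the synchronisation round, a reader may see either the old or the new cart depending on which replica answers. That window is what "soft state" refers to.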
CAP Theorem (Brewer's Theorem)
Computer scientist Eric Brewer proposed that a distributed system can provide at most two of the following three guarantees simultaneously:
- Consistency: all nodes of the system are in a consistent state after an operation executes, so they see the same data at the same time
- Availability: clients can always read and write data, and get a response within a bounded period of time
- Partition Tolerance: the system continues to operate in the presence of network partitions
Most large systems will partition at some point, so the practical decision is mainly between consistency and availability. Traditional databases (RDBMS) favor consistency over availability, whereas most web applications choose availability.
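The consistency-versus-availability choice during a partition can be sketched with a toy two-replica store. Everything here is an assumption made for illustration: the `TwoNodeStore` class, its `mode` parameter ("CP" or "AP"), and the `partitioned` flag are hypothetical names, and real systems handle partitions with far more machinery.

```python
class TwoNodeStore:
    """Two replicas; `partitioned` simulates a broken link between them."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.values = [None, None]  # one value per replica
        self.partitioned = False

    def write(self, node_index, value):
        if not self.partitioned:
            # Healthy network: the write reaches both replicas.
            self.values = [value, value]
            return
        if self.mode == "CP":
            # Consistency first: refuse the write rather than let copies diverge.
            raise RuntimeError("unavailable during partition")
        # Availability first: accept the write locally; copies may now disagree.
        self.values[node_index] = value

    def read(self, node_index):
        return self.values[node_index]


cp, ap = TwoNodeStore("CP"), TwoNodeStore("AP")
for store in (cp, ap):
    store.write(0, "v1")
    store.partitioned = True

try:
    cp.write(0, "v2")
    cp_accepted = True
except RuntimeError:
    cp_accepted = False

ap.write(0, "v2")

cp_view = [cp.read(0), cp.read(1)]  # both replicas still agree on "v1"
ap_view = [ap.read(0), ap.read(1)]  # replica 0 has "v2", replica 1 has "v1"
```

The "CP" store stays consistent but rejects the write, while the "AP" store keeps accepting writes at the cost of replicas that disagree until the partition heals.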
NoSQL - Different Data models
NoSQL databases use different data models depending on the target functionality or use case. Some of the popular ones are:
- Key-Value store: Redis, MemcacheDB, BerkeleyDB
- Column Store: Cassandra, HBase
- Document Store: MongoDB, CouchDB, Terrastore
- Graph Database: OrientDB, Neo4j, InfiniteGraph
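To make the difference between these models more concrete, the sketch below represents the same small piece of data (a user and who they follow) in each of the four shapes, using only plain Python dictionaries. The field names and layouts are assumptions chosen for illustration; they are not the storage formats or client APIs of the products listed above.

```python
import json

# Key-value: the store only sees an opaque value behind a single key.
key_value = {"user:1": json.dumps({"name": "Ada", "city": "London"})}

# Document: a nested, self-describing record; fields can vary per document.
document = {
    "_id": 1,
    "name": "Ada",
    "address": {"city": "London"},
    "follows": [2],
}

# Column store: row key -> column family -> column -> value.
wide_column = {
    "user:1": {
        "profile": {"name": "Ada", "city": "London"},
        "social": {"follows:2": ""},
    }
}

# Graph: nodes plus typed edges between them.
graph = {
    "nodes": {1: {"name": "Ada"}, 2: {"name": "Bob"}},
    "edges": [(1, "FOLLOWS", 2)],
}

# Reading the same facts back from each shape.
city_from_kv = json.loads(key_value["user:1"])["city"]  # caller must decode
city_from_doc = document["address"]["city"]
city_from_columns = wide_column["user:1"]["profile"]["city"]
followed_names = [
    graph["nodes"][dst]["name"]
    for src, relation, dst in graph["edges"]
    if src == 1 and relation == "FOLLOWS"
]
```

The key-value shape leaves all interpretation to the application, the document and column shapes expose fields to the store, and the graph shape makes relationships the primary thing to query.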
Benefits of NoSQL
- Easy to implement
- Can scale horizontally and vertically
- Quickly process large amounts of data
- Flexibility due to schema-less design
- Relaxed data consistency requirements (CAP)
- Can easily handle large, web-scale heterogeneous data
Cons of NoSQL
- Data is generally duplicated, which creates potential for inconsistency
- No standardized schema
- No standard format for queries
- Difficult to impose complicated structures
- Depends on the application layer to enforce data integrity
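The last point can be illustrated with a minimal sketch of integrity checks living in application code rather than in the database. The `REQUIRED_FIELDS` table, the `validate_order` and `save_order` functions, and the dictionary standing in for a schema-less store are all hypothetical, assumed here only to show the idea.

```python
# Hypothetical rules the application enforces because the store will not.
REQUIRED_FIELDS = {"order_id": int, "customer_id": int, "total": float}


def validate_order(doc, known_customers):
    """Return a list of problems; an empty list means the document is valid."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected_type):
            errors.append(f"wrong type for {field}")
    # Referential check that a relational foreign key would normally enforce.
    if "customer_id" in doc and doc["customer_id"] not in known_customers:
        errors.append("unknown customer_id")
    return errors


def save_order(store, doc, known_customers):
    """Write the document only if it passes the application-level checks."""
    errors = validate_order(doc, known_customers)
    if errors:
        raise ValueError("; ".join(errors))
    store[doc["order_id"]] = doc


store = {}          # stands in for a schema-less document store
customers = {7}     # ids of customers known to the application

save_order(store, {"order_id": 1, "customer_id": 7, "total": 19.5}, customers)
problems = validate_order(
    {"order_id": 2, "customer_id": 9, "total": "free"}, customers
)
```

Every code path that writes to the store has to go through checks like these, which is the maintenance cost this drawback refers to.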