NoSQL – What Is It?
NoSQL stands for "Not Only SQL". It refers to a class of databases used in real-time web applications for storing and retrieving data. In NoSQL, data storage is fluid, which helps maintain flexibility.
It rose to prominence in the 2000s and focused on specific needs such as processing large volumes of data and distributing it across computing clusters. Its flexible data schema can change more quickly than the schemas of earlier systems, which is very useful for apps that are updated frequently.
Generally, there are four types of NoSQL databases: document databases, graph stores, key-value stores and wide column stores.
Document databases pair each key with a complex data structure known as a document. Some examples of document databases include MongoDB, MarkLogic and Couchbase Server. DocumentDB and CouchDB are other examples of document databases.
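To make the idea of a flexible document schema concrete, here is a minimal Python sketch that uses plain dictionaries in place of a real server. The `products` collection, its field names and the `find` helper are hypothetical; they only mimic the idea of querying documents by field and are not the API of MongoDB or any other product.

```python
# In-memory stand-in for a document collection (hypothetical data).
products = [
    {"_id": 1, "name": "Laptop", "price": 900, "specs": {"ram_gb": 16}},
    {"_id": 2, "name": "T-shirt", "price": 15, "sizes": ["S", "M", "L"]},
    {"_id": 3, "name": "Phone", "price": 600, "specs": {"ram_gb": 8}},
]


def find(collection, **criteria):
    """Return the documents whose top-level fields match every criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Documents in one collection do not need to share the same fields.
matches = find(products, name="Phone")
print(matches[0]["specs"]["ram_gb"])
```

Note that the T-shirt document carries a `sizes` list while the other two carry nested `specs`; nothing had to be declared in advance for that to work.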
Graph Stores contain information about networks of data. In a store such as Neo4j, the data is held in the form of nodes, which are comparable to records in a relational database, and edges, which represent the connections between the nodes. Because the relationships themselves are stored as data, graph stores offer comparatively richer and finer representations.
Compared to relational databases, graph stores are less dependent on strict schemas. A graph store can evolve over time and with the way it is being used. Graph stores are very helpful in customer relationship management. Examples of Graph Stores include Titan, IBM Graph, AllegroGraph and Neo4j.
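The node-and-edge model can be sketched in a few lines of Python. This is only an in-memory illustration under assumed data: the node names, the `KNOWS` and `WORKS_AT` relationship types and both helper functions are hypothetical and do not reflect the query language of any particular graph store.

```python
from collections import deque

# Nodes are records; each edge is a (source, relationship, target) triple.
nodes = {
    "alice": {"type": "customer"},
    "bob": {"type": "customer"},
    "carol": {"type": "customer"},
    "acme": {"type": "company"},
}
edges = [
    ("alice", "KNOWS", "bob"),
    ("bob", "KNOWS", "carol"),
    ("carol", "WORKS_AT", "acme"),
]


def neighbours(node, relationship):
    """Follow outgoing edges of one relationship type from a node."""
    return [dst for src, rel, dst in edges if src == node and rel == relationship]


def reachable(start, relationship):
    """Breadth-first traversal over edges of the given relationship type."""
    seen, queue = set(), deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours(current, relationship):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(sorted(reachable("alice", "KNOWS")))
```

Adding a new relationship type only means appending another triple to `edges`, which is one way to picture how a graph can evolve with its use.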
In Key-Value Stores, every item has an attribute name (the key) and a value. Key-value stores like Riak and Berkeley DB are comparatively simple, which enables them to perform and scale well. They work especially well for web application caches and for managing sessions. Their performance varies depending on whether they work with disk drives, solid state drives or RAM.
Other examples of key-value stores include Redis, MemcacheDB and Aerospike.
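The session-management use case can be sketched as a tiny in-memory key-value store with expiry. The `SessionStore` class, its method names and the injectable `clock` parameter are assumptions made for this illustration; they are not the interface of Redis, Riak or any other product named above.

```python
import time


class SessionStore:
    """Tiny in-memory key-value store with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock  # injectable so expiry can be tested

    def set(self, key, value, ttl_seconds):
        """Store a value together with the time at which it expires."""
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key, default=None):
        """Return the value for a key, or the default if missing or expired."""
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries
            return default
        return value


store = SessionStore()
store.set("session:42", {"user": "alice"}, ttl_seconds=1800)
print(store.get("session:42"))
```

Every operation is a lookup by a single key, which is the simplicity that lets real key-value stores stay fast.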
Wide Column Stores are better for queries over large datasets and organize data in columns rather than rows. For large volumes of data, they can be faster than a relational database, and they are especially useful for storing data for recommendation engines and fraud detection.
Wide column databases are also used to store catalogs. Some examples of wide column databases are Cassandra and HBase, which are explained in some detail below. Another widely known wide column database is Google Bigtable.
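The difference between a row layout and a column layout can be pictured with a short Python sketch. It is a deliberately simplified view with made-up transaction data; real wide column stores such as Cassandra and HBase group columns into families and add partitioning on top of this idea.

```python
# The same three transactions stored row by row ...
rows = [
    {"id": 1, "user": "alice", "amount": 120.0},
    {"id": 2, "user": "bob", "amount": 75.5},
    {"id": 3, "user": "alice", "amount": 310.0},
]

# ... and column by column.
columns = {
    "id": [1, 2, 3],
    "user": ["alice", "bob", "alice"],
    "amount": [120.0, 75.5, 310.0],
}

# Row layout: every full record is visited to read one field.
total_from_rows = sum(row["amount"] for row in rows)

# Column layout: only the "amount" values are read.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Both totals are the same; the point is that the column layout touches far less data when a query needs only one attribute of many records.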
Why do we need NoSQL?
NoSQL has numerous advantages, such as:
- Ability to handle a higher quantity of data load
- Supports unstructured text and data
- Flexibility and ability to make changes easily
- Minimum maintenance and management
- Highly scalable
- Better performance
- Low latency
Cassandra & HBase Data Model:
The Cassandra and HBase data models are often thought of as identical twins because they are so similar. However, there are a few distinctive aspects, as listed below:
- HBase does not have a query language
- Cassandra works on the Cassandra Query Language (CQL)
- Cassandra’s primary key can contain multiple columns
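The difference in key structure can be sketched with nested dictionaries. This is an illustration only: the table contents, the `#` separator in the row key and all four helper functions are hypothetical, and the code talks to neither Cassandra nor HBase.

```python
# Cassandra-style: the primary key combines several columns.
# Here user_id locates the partition and posted_at orders rows within it.
cassandra_table = {}


def cassandra_insert(user_id, posted_at, body):
    partition = cassandra_table.setdefault(user_id, {})
    partition[posted_at] = {"body": body}


def cassandra_select(user_id):
    """Return a partition's rows ordered by the second key column."""
    partition = cassandra_table.get(user_id, {})
    return [(ts, partition[ts]["body"]) for ts in sorted(partition)]


# HBase-style: a single row key, then column family and qualifier.
hbase_table = {}


def hbase_put(row_key, family, qualifier, value):
    hbase_table.setdefault(row_key, {}).setdefault(family, {})[qualifier] = value


def hbase_get(row_key, family, qualifier):
    return hbase_table.get(row_key, {}).get(family, {}).get(qualifier)


cassandra_insert("alice", "2024-01-02", "second post")
cassandra_insert("alice", "2024-01-01", "first post")
# With one row key available, several values are packed into it.
hbase_put("alice#2024-01-01", "post", "body", "first post")

print(cassandra_select("alice"))
print(hbase_get("alice#2024-01-01", "post", "body"))
```

In the Cassandra-style table the two key columns stay separate, whereas the HBase-style table has to encode both into one string.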
Apache Cassandra Architecture & HBase Architecture:
- The architecture of Cassandra is masterless
- HBase architecture is master based
- HBase's master can be a single point of failure
- A Cassandra cluster is designed to stay available when individual nodes fail
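A masterless design can be pictured as a ring of equal nodes in which each key is owned by several replicas. The sketch below is a simplified illustration and not Cassandra's actual implementation: the `Ring` class, the MD5-based `token` function, the node names and the `can_serve` check are all assumptions made for this example.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a position on the ring (illustrative hash choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, node_names, replication_factor=3):
        self.replication_factor = replication_factor
        self._ring = sorted((token(name), name) for name in node_names)

    def replicas(self, key):
        """Walk clockwise from the key's token and collect the owning nodes."""
        tokens = [t for t, _ in self._ring]
        start = bisect.bisect_right(tokens, token(key))
        count = min(self.replication_factor, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]

    def can_serve(self, key, down=(), required=1):
        """A request succeeds if enough replicas of the key are still up."""
        live = [name for name in self.replicas(key) if name not in down]
        return len(live) >= required


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
owners = ring.replicas("user:42")
print(owners)
print(ring.can_serve("user:42", down={owners[0]}))
```

Because no node is special, losing one replica leaves the others able to answer, which is the intuition behind the availability claim above.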
Performance of Cassandra & HBase:
- Writes in HBase take more effort and are more time-consuming
- HBase has fast and consistent reads
- Cassandra writes to the log and to the in-memory cache at the same time
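The idea of writing to a log and an in-memory cache together can be sketched as follows. This is a toy model under stated assumptions: the `WritePath` class, its attribute names and the `flush_threshold` parameter are hypothetical, and real systems add replication, compaction and durable files that are left out here.

```python
class WritePath:
    """Toy write path: append to a log, update an in-memory table, flush when full."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []  # append-only record used for recovery
        self.memtable = {}    # in-memory cache of recent writes
        self.sstables = []    # immutable, sorted snapshots standing in for disk files
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        # Both structures are updated as part of the same write.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.commit_log = []  # flushed data no longer needs replaying

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest snapshot first
            for stored_key, value in table:
                if stored_key == key:
                    return value
        return None


db = WritePath(flush_threshold=2)
db.write("a", 1)
db.write("b", 2)
db.write("a", 3)
print(db.read("a"), db.read("b"), len(db.sstables))
```

A write here never searches existing data, while a read may have to look through several snapshots, which hints at why such a design favors writes.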
Cassandra & HBase Security:
- Cassandra enables row-level access
- HBase enables cell level access
- Cassandra has a specified set of user roles
- HBase supports visibility labels
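Cell-level access with labels can be illustrated with a small Python sketch. The table contents, the label names and the `read_row` helper are hypothetical, and the rule used here (a reader must hold every label attached to a cell) is a simplification rather than the label expression syntax of HBase.

```python
# Each cell stores a value plus the set of labels required to read it
# (an empty set means the cell is readable by anyone).
table = {
    "emp-1": {
        "name": ("Alice", set()),
        "salary": (5000, {"hr"}),
        "review": ("exceeds", {"hr", "manager"}),
    },
}


def read_row(row_key, user_labels):
    """Return only the cells whose required labels the user holds."""
    row = table.get(row_key, {})
    return {
        column: value
        for column, (value, required) in row.items()
        if required <= set(user_labels)
    }


print(read_row("emp-1", {"hr"}))
```

Two readers asking for the same row can therefore see different sets of cells, which is what distinguishes cell-level from row-level access.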
Examples of Cassandra and HBase Applications:
Cassandra – Twissandra:
Twissandra, created by Eric Florenzano, is a working Twitter clone. It is written in Python and depends on Django and a JSON library.
HBase – Java “Hello World” Application:
The hello world application uses the Cloud Bigtable HBase client library. Through this, one can create tables, add new data, read the available data or even delete the table. The app communicates with Cloud Bigtable by using the HBase API.
Disadvantages of Cassandra:
- Does not support ACID properties
- Latency issues
- Slower reads
- Joins are difficult
- Data duplication issues
- Inefficient memory management
Disadvantages of HBase:
- Single point of failure
- Does not support transactions
- Joins are not supported
- Only key sorting is possible
- Permissions are not enforced by default
- No inbuilt authentication
- Does not support SQL
- Unpredictable latencies
Both Cassandra and HBase have their own pros and cons. Cassandra is comparatively better at managing and storing data, while HBase is better at performing intensive reads. Data in Cassandra is not always strictly consistent, but Cassandra makes up for this with a more efficient write path.
Our expert team at Smart Sight Innovations provides options for database development based on your business requirements, to help you arrive at an informed decision.