Is Postgres NoSQL Better Than MongoDB?

In general, relational database management systems have been considered a "one-size-fits-all solution for data persistence and retrieval" for decades. But the increasing need for scalability and new application requirements have created new challenges for traditional RDBMSs, including some dissatisfaction with this one-size-fits-all approach in some web-scale applications.

The answer has been a new generation of low-cost, high-performance database software designed to challenge the dominance of relational database management systems.

A big reason for the NoSQL movement is that different implementations of web, enterprise, and cloud computing applications have different requirements of their databases.

Example: For high-volume websites such as eBay, Amazon, Twitter, or Facebook, scalability and high availability are essential requirements that cannot be compromised. For these applications, even the slightest outage can have significant financial consequences and can damage customer trust.

So, a complete database solution today often must address not only transactional integrity but also higher orders of data volume, increased data velocity and throughput, and a growing variety of data formats. New technologies have emerged that specialize in optimizing one or two of these aspects while sacrificing the others. Postgres with JSON takes a more holistic approach to user needs, more successfully addressing the majority of NoSQL workloads.

Document/Relational Database Comparison

A smart approach to new technology demands a close evaluation of your needs and the tools available to meet those needs. The table below compares the characteristics of a NoSQL document-oriented database (such as MongoDB) with Postgres' relational/document database characteristics to help you match the right solution to your requirements.

Source: EnterpriseDB Web Site

PostgreSQL stores data in rows of tables, while MongoDB stores data as JSON documents. In MongoDB, a document is automatically enhanced with an "_id" field if one is not present; when you want to retrieve that particular document, you can use this "_id" field, and it behaves exactly like the primary key of a relational table. On one hand MongoDB seems like a great solution, as you can have all the data from different PostgreSQL tables in just one JSON document. This flexibility comes at the cost of having no constraints on the data structure, which can be really appealing at first and really frustrating when you have a huge database and some of the documents contain bad values or are missing fields.

PostgreSQL 9.3 comes with great features that can turn it into a NoSQL database, with full transaction support and the ability to store JSON documents with constraints on their fields.
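
As a tiny, self-contained sketch of that combination (the table name and values here are my own, not from the article), a JSON column lives in an ordinary table, so it gets full transactional behavior:

BEGIN;
CREATE TABLE demo_docs (doc JSON);
INSERT INTO demo_docs (doc) VALUES ('{"id": 1, "name": "raju"}');
ROLLBACK;
-- the table and the document are both discarded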

Simple Example

I will show how to do it using a very simple example of an employee table. Each employee has an id, a name, a description, and a salary.

PostgreSQL Version

The simple table in PostgreSQL can look like:

CREATE TABLE emp (
     id SERIAL PRIMARY KEY,
     name TEXT,
     description TEXT,
     salary DECIMAL(10,2)
);

This table allows us to insert an employee like this:

INSERT INTO emp (name, description, salary) VALUES ('raju', 'HR', 25000.00);

Unfortunately, the above table also allows adding rows that are missing important information:

INSERT INTO emp (name, description, salary) VALUES (NULL, '', -100);

This should be fixed by adding constraints in the database. Assume that we always want a unique, non-empty name, a non-empty description, and a non-negative salary. The table with such constraints is:

CREATE TABLE emp (
    id SERIAL PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    description TEXT NOT NULL,
    salary DECIMAL(10,2) NOT NULL,
    CHECK (length(name) > 0),
    CHECK (description IS NOT NULL AND length(description) > 0),
    CHECK (salary >= 0.0)
);
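
If you already had the first version of the table, the same rules could be added in place instead of recreating it. A quick sketch using ALTER TABLE (the constraint names here are my own):

ALTER TABLE emp ALTER COLUMN name SET NOT NULL;
ALTER TABLE emp ADD CONSTRAINT emp_name_not_empty CHECK (length(name) > 0);
ALTER TABLE emp ADD CONSTRAINT emp_salary_non_negative CHECK (salary >= 0.0);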

Now any operation, such as adding or modifying a row, that violates any of those constraints simply fails. Let's check:

INSERT INTO emp (name, description, salary) VALUES ('raju', 'HR', 25000.00);
--INSERT 0 1
INSERT INTO emp (name, description, salary) VALUES ('raju', 'HR', -1);
--ERROR: new row for relation "emp" violates check constraint "emp_salary_check"
--DETAIL: Failing row contains (2, raju, HR, -1).
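
The UNIQUE constraint on name is enforced the same way. A sketch of a duplicate insert (the exact constraint name is generated by Postgres):

INSERT INTO emp (name, description, salary) VALUES ('raju', 'Sales', 30000.00);
--ERROR: duplicate key value violates unique constraint "emp_name_key"
--DETAIL: Key (name)=(raju) already exists.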

NoSQL Version

In MongoDB, the row inserted above would simply be a JSON document looking like this:

{
    "id": 1,
    "name": "raju",
    "description": "HR,
    "salary": 25000.00
}

Similarly, in PostgreSQL we can store this JSON as text in a row of the emp table:

CREATE TABLE emp (
     data TEXT
);

This works like most NoSQL databases: no checks, no errors about bad fields. As a result, you can modify the data any way you want; the problem begins when your application expects the salary to be a number and instead finds a string there, or no salary at all.
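
A small sketch of that failure mode, assuming the plain TEXT emp table above (the values are my own): the insert succeeds, and the problem only surfaces when the application tries to read the salary as a number.

INSERT INTO emp (data) VALUES ('{"id": 2, "name": "ravi", "salary": "unknown"}');
--INSERT 0 1
SELECT (data::json->>'salary')::numeric FROM emp;
--ERROR: invalid input syntax for type numeric: "unknown"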

Validate JSON

In PostgreSQL 9.2 there is a nice type for exactly that, named JSON. The JSON type can store only valid JSON; the value is validated before it is converted into this type.
Let's change the definition of the table to:

CREATE TABLE emp (
     data JSON
);

We can insert correct JSON into this table:

INSERT INTO emp(data) VALUES('{
    "id": 1,
    "name": "raju",
    "description": "HR",
    "salary": 25000.00
}');
--INSERT 0 1
SELECT * FROM emp;
--          data
-- ------------------------
--  {                      +
--      "id": 1,           +
--      "name": "raju",    +
--      "description": "HR",+
--      "salary": 25000.00 +
--  }
--(1 row)

This works, but inserting invalid JSON ends with an error:

INSERT INTO emp(data) VALUES('{
    "id": 1,
    "name": "raju",
    "description": "HR",
    "price": 25000.00,
}');
--ERROR: invalid input syntax for type json

The formatting problem can be hard to notice (I added a comma after the last field, which JSON does not allow).

Validating Fields

OK, so we now have a solution that looks almost like the first, native PostgreSQL solution: the data is validated as JSON. That doesn't mean the data is sensible, though.
Let's add checks that validate the data itself.
PostgreSQL 9.3 brings some great new features for manipulating JSON values. There are operators defined for the JSON type which give you easy access to the fields and values.
I will use only one operator, '->>', but you can find more in the PostgreSQL documentation.
I also need to validate the types of the fields, including the id field. In the plain relational version Postgres checks this automatically through the column type definitions. This time I am going to use a different syntax for the checks, because I want to name them: when something fails, it is much easier to see which field is the problem than to search through the whole huge JSON document.
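
As a quick illustration of the '->>' operator (this query is my own addition, run against the JSON emp table created above), it returns the value of a field as text, which can then be cast to the expected type:

SELECT data->>'name' AS name,
       (data->>'salary')::numeric AS salary
FROM emp;
-- returns: raju | 25000.00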

The table with the constraints looks like this:

CREATE TABLE emp (
    data JSON,
    CONSTRAINT validate_id CHECK ((data->>'id')::integer >= 1 AND (data->>'id') IS NOT NULL ),
    CONSTRAINT validate_name CHECK (length(data->>'name') > 0 AND (data->>'name') IS NOT NULL )
);

The "->>" operator allows me to get the value of a specific field from JSON, check if it exists and validate it.
Let's add a JSON without a description:

INSERT INTO emp(data) VALUES('{
    "id": 1,
    "name": "", 
    "salary": 1.0
}');

--ERROR: new row for relation "emp" violates check constraint "validate_name"
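
The type check works through the cast in validate_id: if the id field is not a valid integer, the cast itself raises an error (the values below are my own):

INSERT INTO emp(data) VALUES('{
    "id": "abc",
    "name": "ravi",
    "salary": 1.0
}');

--ERROR: invalid input syntax for type integer: "abc"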

There is one more validation left: the id and name fields should be unique. This can easily be done with two unique indexes:

CREATE UNIQUE INDEX ui_emp_id ON emp((data->>'id'));
CREATE UNIQUE INDEX ui_emp_name ON emp((data->>'name'));

Now when you try to add a JSON document whose id already exists in the database, you will get an error like:

--ERROR: duplicate key value violates unique constraint "ui_emp_id"
--DETAIL: Key ((data ->> 'id'::text))=(1) already exists.
--ERROR: current transaction is aborted, commands ignored until end of transaction block
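
With those indexes in place, looking up a document by its id is a simple expression query, and ui_emp_id can serve it much like a primary key lookup (a sketch; the query is my own):

SELECT data FROM emp WHERE data->>'id' = '1';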

Performance

Postgres handles the most demanding needs of the largest insurance companies, banks, brokerages, government agencies, and defense contractors in the world, and has done so for years. Performance improvements for Postgres are continuous with each yearly release, and they include great performance for its unstructured data types as well.

Source: EnterpriseDB White Paper: Using the NoSQL Capabilities in Postgres

To experience Postgres' NoSQL performance for yourself, download pg_nosql_benchmark at GitHub.