dataeng

There are generally two types of databases which have different structure and models.

DBMS

Related: RDS, DynamoDB,

RDMBS (SQL) vs NoSQL


SQL:

  • SQL/RDMBS - interchangeable terms
  • Has predefined structured tables – Rigid Schema
  • Fixed relationship between tables

NoSQL:

  • Document centered rather than table centered.
  • Everything that is not relational.
  • No schema or weak schema
Link to original

ACID vs BASE

It all clocks around CAP theorem - Wikipedia – Consistency, Availability, Partition tolerant aka resilience.

  • Consistency – every read operation, will get most recent write, or it will get error
  • Availability – every request will not return error but without guarantee that it contains is most recent write.
  • Partition tolerance – system is made of multiple network partitions and system continues to operate even when there is a number of dropped messages or errors between these network nodes.
    A DB can have max of two.
ACIDBASE
Atomic, Consistent, Isolated, DurableBasically Available, Soft State, Eventually Consistent
A: Example, bank transaction consisting two parts: 1. Removing 10 to account B ALL or NO components must success or fail.BA: Reads and writes are available, but not consistent. Maybe/kind of. Instead of storing data is durable way it will spread that data across many nodes.
C: Items moves through DB and has one valid state or the other per its rules. Nothing in-between is allowed.S: DB is not consistent and state aware, this is a application’s developer job to ensure that application supports/negates soft state.
I: If multiple transactions execute at the same time, the don’t affect one another. Each executed in full.E: If you wait long enough, data will become consistent.
D: Once transaction is committed and succeeded, it must be durable, withstand power outages or crashed, be record on non-volatile memory.
Generally RDS DB – limits scalingGenerally DynamoDB, high performance

OLAP vs OLTP

Online analytical processing (OLAP) and online transaction processing (OLTP) are data processing systems that help you store and analyze business data. You can collect and store data from multiple sources—such as websites, applications, smart meters, and internal systems. OLAP combines and groups the data so you can analyze it from different points of view. Conversely, OLTP stores and updates transactional data reliably and efficiently in high volumes. OLTP databases can be one among several data sources for an OLAP system.

CriteriaOLAP (Analysis)OLTP (Day-to-day operations)
PurposeOLAP helps you analyze large volumes of data to support decision-making.OLTP helps you manage and process real-time transactions.
QueriesCOMPLEX, aggregate & limited updatessimple transaction & frequent
Data sourceOLAP uses historical and aggregated data from multiple sources.OLTP uses real-time and transactional data from a single source.
Data structureOLAP uses multidimensional (cubes) or relational databases.OLTP uses relational databases.
Data modelOLAP uses star schema, snowflake schema, or other analytical models.OLTP uses normalized or denormalized models.
Volume of dataOLAP has large storage requirements. Think terabytes (TB) and petabytes (PB).OLTP has comparatively smaller storage requirements. Think gigabytes (GB).
Response timeOLAP has longer response times, typically in seconds or minutes.OLTP has shorter response times, typically in milliseconds
Example applicationsOLAP is good for analyzing trends, predicting customer behavior, and identifying profitability.OLTP is good for processing payments, customer data management, and order processing.

OLAP cube

Faster processing via data cube, multidimensional database which works better with multidimensional than traditional DBS and hypercube consists of multiple dimensions