Structured Data
Structured data adheres to a defined data model; examples include spreadsheets and relational databases. Each field can be accessed on its own or together with other fields, which makes information management efficient and flexible.
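As a minimal sketch of accessing fields singly or jointly, the example below uses Python's built-in `sqlite3` module with an in-memory database. The `customers` table, its columns and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ada", "London"), ("Grace", "New York"), ("Alan", "London")],
)

# Access a single field on its own.
names = [row[0] for row in conn.execute("SELECT name FROM customers ORDER BY id")]

# Access fields jointly: filter on one field, return another.
londoners = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY id", ("London",)
    )
]
conn.close()
```

Because every row follows the same model, the same query works for any row without inspecting the data first.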
Unstructured Data
Unstructured data is information without a common data model, typically files of unorganised text whose contents can be non-uniform, repetitive and difficult to interpret. Common examples are word processing files and PDF files.
Some systems are optimised for storing documents and can analyse unstructured data, e.g. file storage with metadata tags.
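To illustrate why unstructured text is harder to work with, the sketch below runs a plain keyword search and then tries to pull dates out of free text using Python's `re` module. The sample text and the date pattern are assumptions made for this example, not part of any particular system.

```python
import re

# Hypothetical free-text note; there is no data model to rely on.
text = (
    "Met with Ada on 2024-03-01 about the invoice. "
    "Follow-up call with Ada planned for 5 March 2024; invoice still unpaid."
)

# A plain textual query: find whole-word occurrences of a keyword.
hits = [m.start() for m in re.finditer(r"\binvoice\b", text)]

# Pulling out a "field" means guessing its format. This pattern only
# recognises ISO-style dates, so the second date is missed.
iso_dates = re.findall(r"\b\d{4}-\d{2}-\d{2}\b", text)
```

The keyword search finds both mentions, but the date extraction returns only one of the two dates because the text is non-uniform.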
Semi-structured Data
Semi-structured data is a form of structured data that has only some elements of a data model. Its hierarchy and tagging provide some efficiency gains over unstructured data. Examples include XML and JSON, which make semi-structured data less complex to analyse than unstructured data.
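The hierarchy and tagging described above can be sketched with a small JSON document parsed by Python's standard `json` module. The record and its field names (`order_id`, `customer`, `items`, `gift_note`) are hypothetical.

```python
import json

# Hypothetical record; the field names are illustrative only.
raw = """
{
  "order_id": 1001,
  "customer": {"name": "Ada", "city": "London"},
  "items": [
    {"sku": "A-1", "qty": 2},
    {"sku": "B-7", "qty": 1, "gift_note": "Happy birthday"}
  ]
}
"""

order = json.loads(raw)

# Tags (keys) and nesting give direct access without parsing free text.
city = order["customer"]["city"]
total_qty = sum(item["qty"] for item in order["items"])

# Items need not share a fixed set of fields: optional keys may be absent.
notes = [item.get("gift_note") for item in order["items"]]
```

Note that the two items carry different sets of keys, which a relational table would not allow without a schema change.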
Metadata
Metadata is data about data. It provides additional information about a specific set of data, e.g. a file's reference and date.
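As one possible illustration, the sketch below reads file metadata with `os.stat` from Python's standard library. It creates a temporary file so the example is self-contained; the keys of the `metadata` dictionary are names chosen for this example.

```python
import os
import tempfile
from datetime import datetime, timezone

# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("hello")
    path = f.name

# The file's content is "hello"; everything below is data about the file.
info = os.stat(path)
metadata = {
    "name": os.path.basename(path),
    "size_bytes": info.st_size,
    "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
}
os.remove(path)
```

None of these values appear in the file's content; they describe the file itself.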
Properties | Structured data | Semi-structured data | Unstructured data |
---|---|---|---|
Technology | Based on relational database tables | Based on XML/RDF (Resource Description Framework) | Based on character and binary data |
Transaction management | Mature transaction management and various concurrency techniques | Transaction management adapted from DBMS, not yet mature | No transaction management and no concurrency |
Version management | Versioning over tuples, rows and tables | Versioning over tuples or graphs is possible | Versioned as a whole |
Flexibility | Schema-dependent and less flexible | More flexible than structured data but less flexible than unstructured data | Most flexible; there is no schema |
Scalability | Very difficult to scale the database schema | Simpler to scale than structured data | Most scalable |
Robustness | Very robust | Newer technology, not yet widespread | — |
Query performance | Structured queries allow complex joins | Queries over anonymous nodes are possible | Only textual queries are possible |