
Classification of data

Data is classified into the following types:

  1. Structured Data
  2. Semi-Structured Data
  3. Unstructured Data

Structured Data

Structured data refers to data that is organized and formatted in a consistent manner, following a specific data model. This type of data is typically stored in databases, spreadsheets, and tables, making it easy to search, retrieve, and analyze.

Key characteristics of structured data include:
  1. Organized Format: Structured data is organized into rows and columns, where each column represents a specific attribute or field, and each row represents a record or entry.
  2. Easy to Query: Due to its well-defined structure, structured data can be queried using database query languages like SQL.
  3. Efficient Storage: Structured data is often stored in relational databases, which optimize storage and retrieval processes for structured information.

Examples of structured data include:
  1. Employee records in an HR database (columns: employee ID, name, position, salary).
  2. Sales transactions in an e-commerce database (columns: order ID, customer ID, product ID, quantity, price).

Employee records

employee ID   name     position             salary
1             Akshay   Software Developer   50,000
2             Manish   Web Developer        50,000
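
To illustrate why a fixed set of columns makes structured data easy to query, here is a minimal sketch that loads the employee records above into an in-memory SQLite database using Python's built-in sqlite3 module. The table name `employees` and the column names are assumptions chosen for this illustration.

```python
import sqlite3

# In-memory database, so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Assumed table and column names, mirroring the employee records table.
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER PRIMARY KEY, name TEXT, position TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "Akshay", "Software Developer", 50000),
        (2, "Manish", "Web Developer", 50000),
    ],
)

# Every row has the same columns, so a query can filter on any of them.
rows = conn.execute(
    "SELECT name, salary FROM employees WHERE position = ? ORDER BY employee_id",
    ("Web Developer",),
).fetchall()
print(rows)  # [('Manish', 50000)]

conn.close()
```

The same SELECT statement would work unchanged on a table with millions of rows, which is the practical benefit of a consistent data model.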

Semi-Structured Data

Semi-structured data is information that does not reside in a relational database or spreadsheet but still has some organizational properties, such as tags or keys that label its elements. With some effort, it can be stored in a relational database. Examples are JSON and XML.

Key characteristics of semi-structured data include:
  1. Flexible Schema: Semi-structured data does not require a predefined schema like structured data. Instead, each data entry can have varying attributes and fields.
  2. Variability: Different data entries in semi-structured data can have different attributes, and the same attribute might not exist in all entries.
  3. Readable by Humans and Machines: Semi-structured data is typically human-readable due to its use of tags and identifiers. However, it can also be processed by machines for data analysis.
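
The flexible schema and variability described above can be sketched with a few Python dictionaries. The records below are hypothetical and exist only to show entries that do not share the same attributes.

```python
# Three hypothetical records; note that they do not share the same keys.
records = [
    {"title": "Introduction to Data Science", "author": "John Smith", "year": 2022},
    {"title": "Untitled Draft", "author": "Jane Doe"},
    {"title": "Conference Notes", "year": 2021, "tags": ["talk", "notes"]},
]

# Collect every attribute that appears in at least one record.
all_fields = sorted({key for record in records for key in record})
print(all_fields)  # ['author', 'tags', 'title', 'year']

# Read an attribute that may be missing, supplying a fallback value.
years = [record.get("year", "unknown") for record in records]
print(years)  # [2022, 'unknown', 2021]
```

Code that reads semi-structured data usually has to handle a missing attribute explicitly, as `get` does here, because no schema guarantees that the attribute exists.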

Examples of semi-structured data formats include:
  1. XML (eXtensible Markup Language): XML uses tags to define data elements. Example:
    
    <book>
        <title>Introduction to Data Science</title>
        <author>John Smith</author>
        <year>2022</year>
    </book>
    
  2. JSON (JavaScript Object Notation): JSON represents data as key-value pairs and supports nesting. It is widely used for APIs and web services. Example:
    
    {
        "book": {
            "title": "Introduction to Data Science",
            "author": "John Smith",
            "year": 2022
        }
    }
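
As a rough sketch of how a program reads these two formats, the following Python code parses the XML and JSON book examples with the standard library modules xml.etree.ElementTree and json, then checks that both yield the same record. The variable names are assumptions made for this illustration.

```python
import json
import xml.etree.ElementTree as ET

xml_text = """
<book>
    <title>Introduction to Data Science</title>
    <author>John Smith</author>
    <year>2022</year>
</book>
"""

json_text = """
{
    "book": {
        "title": "Introduction to Data Science",
        "author": "John Smith",
        "year": 2022
    }
}
"""

# XML: element text is always a string, so the year is converted explicitly.
root = ET.fromstring(xml_text.strip())
book_from_xml = {
    "title": root.findtext("title"),
    "author": root.findtext("author"),
    "year": int(root.findtext("year")),
}

# JSON: numbers and strings keep their types after parsing.
book_from_json = json.loads(json_text)["book"]

print(book_from_xml == book_from_json)  # True
```

Once parsed, the record is an ordinary dictionary, which could then be written to a database table as a row if a relational form is needed.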
    

Unstructured Data

Unstructured data refers to data that lacks a predefined structure or does not fit neatly into a traditional tabular or relational format. This type of data can be more challenging to process and analyze compared to structured or semi-structured data.

Key characteristics of unstructured data include:
  1. Lack of Structure: Unstructured data does not have a predefined schema or consistent format. It can vary widely in terms of content, length, and organization.
  2. Multiple Formats: Unstructured data can take various forms, such as text, images, audio, video, social media posts, emails, documents, and more.
  3. Human-Centric: Unstructured data is usually created by people for other people to read, view, or listen to, rather than being formatted for machine processing.

Examples of unstructured data include:
  1. Textual Data: Emails, social media posts, articles, blogs, and free-form text content.
  2. Images: Photographs, scanned documents, screenshots, and other visual data.
  3. Audio: Voice recordings, podcasts, and sound files.
  4. Video: Recorded videos, live streams, and multimedia content.
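
To give a sense of why unstructured data takes more work to analyze, here is a small sketch that pulls email addresses out of free-form text with Python's re module. The message text is made up for this example, and the pattern is a simplified assumption that will not match every valid email address.

```python
import re

# Hypothetical free-form message; nothing marks where the addresses are.
text = (
    "Hi team, please send the report to priya@example.com by Friday. "
    "If she is away, copy ops@example.org instead. Thanks!"
)

# Simplified pattern: local part, "@", domain, and a top-level domain.
email_pattern = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

emails = email_pattern.findall(text)
print(emails)  # ['priya@example.com', 'ops@example.org']
```

With structured data, the addresses would already sit in an email column. Here the program has to guess at their shape, and a different kind of text would need a different pattern.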