Dbs101_flippedclass6
Topic: Nonrelational Databases
Let’s get started
For this flipped class, we worked in two kinds of groups: expert groups and home groups. For the expert groups, students were divided into 6 groups of 4; for the home groups, they were divided into groups of 6. Each of the 6 expert groups was given one type of non-relational database to explore. The group was expected to understand its advantages, disadvantages, and applications, and to be able to explain it to their home groups. The session was an hour long: 20 minutes for group discussion, 25 minutes to present to the home group what had been discussed in the expert group, and 15 minutes for the Q&A session.
Expected Outcomes
- Explore different types of non-relational databases.
- Understand the advantages and disadvantages of different types of non-relational databases.
What is NoSQL?
NoSQL refers to non-relational databases that store data in non-tabular forms. The name stands for "Not only SQL."
Types of NoSQL Databases
Document-based databases
A document-based database stores data as documents, usually in formats such as JSON, BSON, or XML, rather than in tables as relational databases do.
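To make the idea concrete, here is a minimal sketch in plain Python, assuming a collection is just a list of dictionaries. The `posts` data and the `find` helper are hypothetical and written only for illustration; they are not the API of any real document database.

```python
import json

# Hypothetical in-memory "collection": a list of documents (plain dicts).
posts = [
    {"_id": 1, "title": "Intro to NoSQL", "tags": ["nosql", "basics"]},
    # A second document with a different shape: no tags, but a nested author.
    {"_id": 2, "title": "Graph stores", "author": {"name": "Sample Author", "role": "editor"}},
]


def find(collection, **criteria):
    """Return documents whose top-level fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


match = find(posts, title="Graph stores")
# Documents serialize naturally to JSON, nested fields included.
print(json.dumps(match[0], indent=2))
```

The two documents have different fields, yet both live in the same collection, which is what "no predefined schema" means in practice.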
- Advantage:
- Schema flexibility: Documents can have varying structures without a predefined schema.
- Horizontal scalability: Easily distribute data across multiple servers for high scalability.
- Improved performance: Faster read/write operations, especially for complex data.
- Native support for hierarchical data: Simplifies storage of nested data structures.
- Developer-friendly APIs: Intuitive APIs for seamless integration with applications.
- Disadvantage:
- Weaker ACID guarantees: Many document databases trade strict transactional guarantees for performance, although some now support multi-document transactions.
- Limited support for complex queries: May struggle with complex queries involving multiple documents.
- Potential for data duplication: Denormalization can lead to redundancy and inconsistency.
- Learning curve: Requires adaptation for developers accustomed to relational databases.
- Application:
These databases are designed to handle semi-structured or unstructured data, making them well-suited for various applications, including content management systems, blogging platforms, e-commerce websites, and more.
Key-Value based databases or key-value store
A key-value store is the simplest form of NoSQL database: data is stored as pairs of keys and values. Each element in the database is uniquely identified by its key, so data is retrieved by specifying the corresponding key. Values can range from simple data types such as strings and numbers to more complex objects.
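The key-to-value lookup described above can be sketched with a Python dictionary. The `KeyValueStore` class and its method names are assumptions made for this example, not the interface of a real product.

```python
class KeyValueStore:
    """Toy key-value store backed by a dict (illustrative only)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Writing to an existing key overwrites the old value.
        self._data[key] = value

    def get(self, key, default=None):
        # Lookup is by exact key only; there is no query over values.
        return self._data.get(key, default)

    def delete(self, key):
        # Return True if the key existed and was removed.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("visits", 7)                                          # simple value
store.put("session:42", {"user": "alice", "cart": [101, 102]})  # complex value
print(store.get("session:42"))
```

Every operation needs the key, which is why this model is fast and simple but offers limited querying.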
- Advantage:
- Simplicity: Easy data management, since each record consists only of a key and a value.
- Scalability: Seamless expansion to handle increasing data and workload.
- Speed: Fast read and write operations due to efficient indexing.
- Disadvantage:
- Limited Query Capabilities: Difficulty in complex data analysis and retrieval.
- Data Structure Constraints: Not ideal for complex data relationships.
- Data Consistency: Challenges in maintaining consistency, especially in distributed setups.
- Application:
- Caching Systems: Improve application performance by storing frequently accessed data.
- Session Management: Efficiently handle user session data for web applications.
- Distributed Systems: Serve as a foundational storage layer for distributed setups.
- Content Delivery Networks (CDNs): Store and retrieve cached content across edge servers.
- Real-time Analytics: Support real-time data processing and analysis for insights generation.
Column-oriented Databases
A columnar database is a type of database management system (DBMS) that stores data in columns rather than rows, optimizing performance for analytical queries. In a columnar database, each column is stored separately on disk, allowing for efficient retrieval and processing of specific columns relevant to a query.
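The difference between the two layouts can be sketched as follows. This is a simplified in-memory illustration that assumes every row has the same fields; the sample `rows` and the `to_columns` helper are made up for the example.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "region": "east", "amount": 120.0},
    {"id": 2, "region": "west", "amount": 80.0},
    {"id": 3, "region": "east", "amount": 50.0},
]


def to_columns(records):
    """Pivot row records into one list per column (assumes uniform fields)."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one column reads only that column's list.
total_amount = sum(columns["amount"])

# A filtered aggregate reads two columns and still ignores "id".
east_total = sum(
    amount
    for region, amount in zip(columns["region"], columns["amount"])
    if region == "east"
)
print(total_amount, east_total)
```

Because an analytical query usually needs only a few columns, storing each column separately lets it skip the rest of the data.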
- Advantage:
- Optimized for Analytics: Columnar storage facilitates efficient analytical queries, especially when dealing with a large number of columns.
- Compression: Data is often compressed within each column, reducing storage requirements and improving query performance.
- Scalability: Column-oriented databases can scale horizontally to handle large datasets and high query loads by distributing data across multiple nodes.
- Aggregation Performance: Well-suited for operations like aggregations, filtering, and complex analytical queries due to the structure of columnar data storage.
- Data Retrieval Efficiency: Read operations are faster, especially when accessing only specific columns, reducing disk I/O and improving overall performance.
- Disadvantage:
- Suboptimal for transactional data: Less efficient for transactional operations with frequent updates.
- Complexity: Requires specialized knowledge for optimization and management.
- Higher costs: May involve higher upfront costs compared to traditional relational databases.
- Applications:
- Data warehousing: Storing and analyzing vast amounts of historical data.
- Business intelligence: Powering analytics and reporting for data-driven decision-making.
- OLAP systems: Supporting multidimensional analysis for complex queries.
- Log analytics: Efficiently analyzing large volumes of timestamped data.
- Time series data analysis: Handling time-based data for monitoring and performance analysis.
Graph Databases
Graph databases are a type of NoSQL database that uses graph structures with nodes, edges, and properties to represent and store data. The edges, also called links or relationships, are the connections between the nodes.
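A small sketch of this model, assuming nodes are stored with their properties in one dictionary and edges as adjacency lists in another. The names, the `FOLLOWS` relationship, and the `reachable_within` helper are hypothetical; a real graph database would express this in its own query language.

```python
from collections import deque

# Nodes with properties (illustrative data).
nodes = {
    "alice": {"kind": "person"},
    "bob": {"kind": "person"},
    "carol": {"kind": "person"},
    "dave": {"kind": "person"},
}

# Directed edges: source node -> list of (relationship, target node).
edges = {
    "alice": [("FOLLOWS", "bob")],
    "bob": [("FOLLOWS", "carol")],
    "carol": [("FOLLOWS", "alice"), ("FOLLOWS", "dave")],
    "dave": [],
}


def reachable_within(start, max_hops):
    """Breadth-first traversal: nodes reachable from start in at most max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for _relationship, neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, depth + 1))
    return found


print(reachable_within("alice", 2))
```

Following edges from node to node like this is the kind of query that graph databases are built to answer quickly.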
- Advantage:
- Flexible data modeling with a graph structure.
- Optimized query performance for highly connected data.
- Scalability to handle large and growing datasets.
- Rich query language optimized for graph operations.
- Disadvantage:
- Complex queries when the workload does not fit graph traversal, such as bulk aggregations over all nodes.
- Scalability challenges with dense graphs.
- Learning curve associated with graph data modeling.
- Application:
- Social networks for modeling user relationships.
- Recommendation engines for personalized suggestions.
- Network and IT operations management.
- Fraud detection for identifying suspicious patterns.
- Knowledge graphs for semantic search and question answering.
Vector Databases
Vector databases, also known as vector stores, are a specialized type of NoSQL database designed to efficiently store, query, and manipulate high-dimensional vectors. In the context of databases, a vector is a mathematical representation of a data point in a multi-dimensional space.
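Similarity search over vectors can be sketched with cosine similarity and a brute-force scan. The three-dimensional vectors below are invented toy data, and the `nearest` helper is an assumption for illustration. Real vector databases typically rely on approximate nearest-neighbor indexes rather than comparing the query against every stored vector.

```python
import math

# Toy 3-dimensional vectors (made-up values, assumed non-zero).
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, k=1):
    """Return the k stored names most similar to the query (brute force)."""
    ranked = sorted(
        vectors,
        key=lambda name: cosine_similarity(query, vectors[name]),
        reverse=True,
    )
    return ranked[:k]


print(nearest([1.0, 0.0, 0.0], k=2))
```

The query vector points in nearly the same direction as "cat" and "dog", so those two rank ahead of "car".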
- Advantage:
- Efficient storage and retrieval of high-dimensional vectors.
- Fast similarity search and nearest neighbor queries.
- Support for vector-specific operations and indexing.
- Scalability to handle large volumes of vector data.
- Well-suited for machine learning and recommendation systems.
- Disadvantage:
- Limited support for non-vector data types and complex queries.
- Higher complexity in data modeling and indexing for high-dimensional spaces.
- Learning curve associated with vector-specific query languages and operations.
- Application:
- Anomaly detection and pattern recognition in cybersecurity and fraud detection systems.
- Machine learning, recommendation systems, and similarity search.
Time-series Databases
Time series databases are specialized databases optimized for storing, querying, and analyzing time-stamped data points. They are designed to efficiently handle large volumes of sequential data points collected at regular intervals over time.
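A minimal sketch of a time-based range query, assuming the points are kept sorted by timestamp so that binary search can find the start and end of a window. The readings are invented sample data, and `query_range` and `average` are hypothetical helpers.

```python
from bisect import bisect_left, bisect_right

# Timestamps in seconds, kept in ascending order, with one value per timestamp.
timestamps = [0, 60, 120, 180, 240]
values = [20.0, 21.0, 23.0, 22.0, 24.0]


def query_range(start, end):
    """Return (timestamp, value) pairs with start <= timestamp <= end."""
    lo = bisect_left(timestamps, start)   # first index at or after start
    hi = bisect_right(timestamps, end)    # one past the last index at or before end
    return list(zip(timestamps[lo:hi], values[lo:hi]))


def average(start, end):
    """Mean value inside the window, or None if the window is empty."""
    points = query_range(start, end)
    if not points:
        return None
    return sum(value for _, value in points) / len(points)


print(query_range(60, 180))
print(average(60, 180))
```

Keeping the data ordered by time is what makes window queries like this cheap, and it is the basic idea behind the time-based indexing these databases use.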
- Advantage:
- Efficient retrieval of time-stamped data points.
- Scalable to handle high write throughput and large data volumes.
- Specialized data structures and indexing for fast time-based queries.
- Built-in data compression for storage efficiency.
- Disadvantage:
- Limited support for complex non-time-based queries.
- Risk of data loss with improper retention policies.
- Learning curve for specific query languages and data modeling.
- Application:
- IoT and sensor data monitoring.
- Financial markets and trading analysis.
- Infrastructure monitoring and DevOps.
- Energy and utilities for grid monitoring.
- Healthcare and life sciences for patient monitoring.
Experience
I would rate this flipped class 4 out of 5. I played my part in explaining my topic to the home group, and the other members of my home group also did a very good job of explaining their topics to us.