Data Base system - L1
这篇文章展示了数据库系统的学习记录
1.1 What is Database(DB)?
A non-redundant, persistent(like store in hard disk) collection of logically related records/files that are structured to support various processing and retrieval needs.
1.2 Database Management System
A set of software programs for creating, storing, updating, and accessing the data of a DB.
1.3 Basic concept and terminologies
-
instance (compared to the instance in oop)
- the collection of data (information) stored in the DB at a particular moment (ie, a snapshot)
- The data in the database at a particular moment in time is called a database state or snapshot. It is also called the current set of occurrences or instances in the database.
-
scheme/schema
- the overall structure (design) of the DB – relatively static
1.4 Data Abstraction: 3-level architecture
-
Internal/Physical Level: the complete details of data storage and access paths for the database. (i.e., SSD,Cloud,disk)(how data are actually stored)
- Internal level has an internal schema, which describes the physical storage structure of the database.
-
Conceptual Level: which describes the structure of the whole database. For a 2D array, it will be stored as a “table” in database. It hides the detail of internal level.(what data are actually stored)
- The conceptual schema hides the details of physical storage structures and concentrates on describing entities, data types, relationships, user operations, and constraints.(data model)
-
External Level: It only privides the data which the user wants and hide the rest of the datbase.
Consider a two-dimensional integer array of size n*m
- The physical level would simply be n × m (probably consecutive) storage locations
- The conceptual level is a grid of boxes, each possibly containing an integer, which is n boxes high by m boxes wide.
- There are $2^{n×m}$ possible views (external level). For example, a view might be the entire array, or may be a part of array.
1.5 Data Independence
- The ability to modify a schema definition in one level without affecting a schema in the next higher level
- Physical data independence: capacity to change the internal schema without having to change the conceptual schema
- Logical data independence: the capacity to change the conceptual schema without having to change external schemas or application programs.
DB System Architecture
2 The Entity-Relationship Model(实体关系模型)
2.1 Basic Definition
2.1.1 ER模型的作用
- 数据库设计的第一步是需求分析阶段,也是最重要的一步
- 概念建模也就是ER建模,将需求以ER图的方式进行可视化
- ER图可以描述一个数据库的逻辑结构
2.1.2 ER模型的组成
2.1.2.1 entity
-
entity(实体): a distinguishable object with an independent existence
- 一个entity可以是具体的(如:人,书,CityU), 也可以是抽象的(如:假期)
-
entity set: a set of entities of the same type
- Like: Student Bank……
2.1.2.2 attribute
- attribute(属性): 一个entity用一组attributes来表示,每一个attribute有一个对应的值,比如一个人可以有name,SSN,gender,phone number等attributes
- Different type of Attributes:
- Simple: Each entity has a single atomic value for the attribute. For example SSN or Sex
- Composite: The attribute may consist of several components. For example, Name(FirstName, MiddleName, LastName).
- Multi-valiue: An entity may have multiple values for that attribute. For example, Color of a CAR, denoted as {Color}.
In general, composite and multi-valued attributes may be nested.
- Domain: 一个attribute的domain是这个attribute值的有效范围
- Formally, an attribute A is a function which maps from an entity set E into a domain D: $$A: E\to D$$
- Different type of Attributes:
2.1.2.3 relationship
-
Relationship: the relationship(association) among several entities
An attribite $A:E\to D$ is a “simplified form of a relationshihp”
- if we allow D to be an Entity Set, then A becomes a relationship
A realtionship can carry attribute
- Example: Patrick takes cs2450 with a grade of B+. “takes” is relationship bteween “Patrick” and “CS2450”, “B+” is the attribute of this relationship, since “B+” is neither the attribute of “Patrick” nor “CS2450”, which is “Patrick takes cs2450”
- The degree of a relationship set/type is the number of participating entity sets/types.
- Both MANAGES and WORKS_ON are binary relationships.
- In general, an n-ray relationship is not equialent to n binary relationships.
- The relationship has participation constraint
- partial participation: single line (not all entity participate)
- total participation: double line (every entity participate)
- (min,max) notation for relationship
- Specified on each participationof an entity set E in a relationship set R
- Specifies that each entity e in E participates in at least min and at most max relationship instances in R
- Default (no constraint): min=0, max=n ! Must have: $min<=max,min>=0,max>=1$
(a) is not equivalent to (b) but to (c).
2.3 ER diagram
2.3.1 ER图的基本构成
- Rectangles(正方形): Entity Sets
- Ellipses(椭圆形): Attributes
- Diamonds(菱形): Relationship Sets
- Lines: Attributes to Entity/ Relationship Sets or, Entity Sets to Realtionship Sets
- 不同的键(Key): to distinguish individual entities or relationships
- Superkey (超键): a set of one or more attributes which, taken together, identify uniquely an entity in an entity set
- 在关系中能唯一标识元组的属性集称为关系模式的超键
- Example: 一般来说,学号是标识学生实体的唯一标识,所以该元组的超键就为学号。我们可以将学号和其他属性结合起来,(学号,性别),(学号,年龄)等也为超键。
- Candiaite key(候选键): minimal set of attributes which can identify uniquely an entity in an entity set
- 最小的超键,仅包括独一标识实体所必需得最小属性的数量,即不含多余属性的超键为候选键
- Example: stident ID identify a student, but Name is not a candidate key
- Superkey (超键): a set of one or more attributes which, taken together, identify uniquely an entity in an entity set
候选键是超键的子集
- Primary key(主键): a candidate key chosen by the DB designer to identify an entity in an entity set
- 是从候选码(candidate key)中人为挑选的一个
- Foreign key: FK is a set of attributes of R1, which refers to the PK(primary key(s)) of R2
- Domain of FK should be the same as PK
- t1[FK] in R1 either occurs as a value of t2[PK] in R2 or is NULL.
- 实体间的联系
- Integrity Constraints
- One-to-one(一对一): 例如班级和班长之间就是一对一联系,一个班级只有一个班长
- One-to-many(一对多/多对一): 例如一个班级有一个辅导员,但担任辅导员的老师可以同时担任其他班级的辅导员,所以辅导员和班级之间就是一对多联系
- Many-to-many(多对多): 例如一个学生可以学习多门课程,每门课程也可以有多个学生,那么学生和课程之间的关系就是多对多联系
- Weak Entity Set
- An entity set that does NOT have enough attributes to form a primary/candidate key
学习笔记,仅供参考