2. Compression Technology Features under Different Storage Engine Architectures
Posted: Thu Jan 23, 2025 9:19 am
The previous discussion gave us a preliminary understanding of where database compression solutions are heading. So how is database compression actually implemented? The following picks out several technical points for analysis and, drawing on the principles of the GaussDB storage kernel, lists several use cases where compression is advantageous.
In current mainstream database systems, storage engines organize data in different ways, including heap-organized tables and index-organized tables. They rest on different storage structures, such as the B-tree and the LSM-tree, and on different update strategies, such as append updates and in-place updates. Together these form the core of the storage engine architecture.
How exactly do these technologies combine with compression technology?
For heap-organized tables, the database allocates storage space in pages (for example, 8KB) and stores rows within a page in the order in which the data is written. Multiple pages are associated with a tablespace, so the overall tablespace size remains stable. The index structure, usually a B-tree or B+ tree, is used to locate each row in the heap table; it stores only the index key and the row pointer, which keeps each index entry relatively short. The heap-organized table and the ordinary index structure are shown in Figure 2 below.
Figure 2 Heap-organized table and common index structure
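To make the heap layout concrete, here is a minimal Python sketch of the idea described above. It is only an illustration under stated assumptions, not how GaussDB implements it: the names HeapTable and SortedIndex are hypothetical, a page is modeled as a plain list with a byte budget, and a sorted list stands in for the B-tree.

```python
import bisect

PAGE_SIZE = 8192  # 8KB page, matching the example size in the text


class HeapTable:
    """Toy heap-organized table: rows are appended to pages in arrival order."""

    def __init__(self):
        self.pages = [[]]          # each page is a list of row byte strings
        self.free = [PAGE_SIZE]    # remaining bytes in each page

    def insert(self, row: bytes):
        if len(row) > PAGE_SIZE:
            raise ValueError("row larger than a page")
        # Open a new page when the current (last) page cannot hold the row.
        if self.free[-1] < len(row):
            self.pages.append([])
            self.free.append(PAGE_SIZE)
        self.pages[-1].append(row)
        self.free[-1] -= len(row)
        # The row pointer is (page number, slot number within the page).
        return (len(self.pages) - 1, len(self.pages[-1]) - 1)

    def fetch(self, pointer):
        page_no, slot = pointer
        return self.pages[page_no][slot]


class SortedIndex:
    """Sorted (key, row pointer) entries; a simplified stand-in for a B-tree."""

    def __init__(self):
        self.keys = []
        self.pointers = []

    def add(self, key, pointer):
        pos = bisect.bisect_left(self.keys, key)
        self.keys.insert(pos, key)
        self.pointers.insert(pos, pointer)

    def lookup(self, key):
        pos = bisect.bisect_left(self.keys, key)
        if pos < len(self.keys) and self.keys[pos] == key:
            return self.pointers[pos]
        return None


# Usage: rows land in the heap in write order, while the index stays key-sorted.
table, index = HeapTable(), SortedIndex()
for key in (30, 10, 20):
    index.add(key, table.insert(str(key).encode() * 1500))
```

Note that the index entries carry only a key and a small pointer, while the rows themselves stay wherever they were appended.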
In contrast, the index-organized table stores its data in index order. As shown in Figure 3, the leaf nodes of the tree structure store not only the index key but also the row data itself, which makes each entry relatively long but more query-friendly, since a lookup reaches the row without following a separate row pointer. However, when a large amount of data is inserted, each row must first be addressed to its position in the key order, which generates a large number of random writes and may slow down data writing.
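The contrast with the heap layout can be sketched in a few lines of Python. This is a simplified illustration, assuming the leaf entries hold the row data alongside the key; the class name IndexOrganizedTable is hypothetical, and a sorted list stands in for the leaf level of the tree.

```python
import bisect


class IndexOrganizedTable:
    """Toy index-organized table: rows are kept in key order next to their keys."""

    def __init__(self):
        self.keys = []   # sorted index keys
        self.rows = []   # row data stored at the same position as its key

    def insert(self, key, row):
        # Addressing step: find where the key belongs in the sorted order.
        pos = bisect.bisect_left(self.keys, key)
        self.keys.insert(pos, key)
        self.rows.insert(pos, row)
        return pos  # write position depends on the key, not on arrival order

    def range_scan(self, low, high):
        # Rows for neighboring keys are adjacent, so a range is one slice.
        lo = bisect.bisect_left(self.keys, low)
        hi = bisect.bisect_right(self.keys, high)
        return self.rows[lo:hi]


# Usage: keys arriving out of order are written at scattered positions.
iot = IndexOrganizedTable()
positions = [iot.insert(k, f"row-{k}") for k in (30, 10, 20)]
```

The returned positions show why bulk inserts with unordered keys turn into random writes, while the range scan shows the query-side benefit of keeping rows in key order.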