This is one of the more recent projects I worked on to address an issue I’ve faced in the past with Oracle tablespaces becoming fragmented and difficult to shrink after periods of rapid growth followed by drastic reductions in data retention. The culprits were mostly very large (4+ TB) date partitioned tables.
At one point, I put together a system where historically very large tablespaces (5-10+ TB) were broken up into many much smaller tablespaces, with partitioned tables spread evenly over all of them. I added a process to a daily partition management job that facilitated this by recalculating a subpartition template for each subpartitioned table every day, based on the representative size of each subpartition value, so that the resulting template spread data evenly over a range of tablespaces. While this did successfully even out tablespace sizes across the database, it did not address tablespace fragmentation.
Inevitably, at some point it would be decided that one or more large tables should have their retention radically reduced, or be dropped entirely. This left a large amount of free space scattered across the many small tablespaces. The theoretical advantage of having no single very large tablespace was that the effort to reorganize and shrink any individual tablespace would be small. In practice this wasn't the case, because the large objects were highly partitioned and Oracle does not really provide a way to move/rebuild multiple objects in a single operation (apart from a DataPump export/import, which would mean a lengthy outage). Each small subpartition had to be moved separately, resulting in a huge number of expensive DDL statements (expensive due to the very large library cache footprint of the biggest tables). Ironically, the time spent actually moving a subpartition was far less than the time spent parsing the move statement, not to mention the contention from running the moves in parallel.
What I’ve proposed now is that rather than focusing on keeping tablespace sizes even, partitioned tables instead rotate through a set range of tablespaces such that every few days one of the tablespaces in the range is emptied and becomes easy to shrink. I’ve included some graphs of what this would look like. Basically, partitioned tables would be grouped by similar retention and synced so that the same partition dates across all tables in a group land in the same tablespace. As an example, the large tables below could be grouped together:
[Table: example group of tables, with columns for Total Tablespace Span, Total Assigned Tablespaces, and Maximum Potential Retention]
The tables above are in the same logical group because they share the same tablespace range (TABLESPACE_31 through TABLESPACE_80). This introduces a concept I refer to as tablespace span (there’s probably a less confusing name for it): the number of consecutive date partitions that should be placed in the same tablespace. With a tablespace span of 3, as seen in the above table, three days' partitions in a row share one tablespace before the rotation moves on to the next.
The reasoning behind this is that a tablespace span of 1 would require that there be an available, non-repeated tablespace for each partition in the table. This would require hundreds of tablespaces in some cases which would be a great deal of overhead.
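To make the mapping concrete, here is a small Python sketch of how a tablespace span packs consecutive daily partitions. The real implementation would be PL/SQL; the epoch date is invented, and the range parameters (starting at TABLESPACE_31, 50 tablespaces) are borrowed from the example group above purely for illustration.

```python
from datetime import date, timedelta

def span_tablespace(partition_date, epoch, span, first_ts, ts_count):
    """Map a daily partition date to a tablespace in the assigned range.

    Every `span` consecutive dates share one tablespace, and the index
    wraps around the range so the rotation repeats indefinitely.
    """
    days = (partition_date - epoch).days
    slot = days // span % ts_count       # position within the range
    return "TABLESPACE_%d" % (first_ts + slot)

# Span of 3: three consecutive daily partitions land in the same tablespace
epoch = date(2020, 1, 1)
for i in range(6):
    d = epoch + timedelta(days=i)
    print(d, span_tablespace(d, epoch, span=3, first_ts=31, ts_count=50))
```

The first three dates map to TABLESPACE_31 and the next three to TABLESPACE_32, which is the repeating pattern the rest of this post relies on.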
The total tablespace span of each table is the number of tablespaces that the table can occupy at one time, based on the add offset days (future partitions), drop offset days (retention), and tablespace span. It is calculated as follows:
total_tablespace_span = FLOOR((retention + future_partitions)/tablespace_span) + 1
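Plugging hypothetical numbers into the formula (a table with 30 days of retention that pre-creates 5 future partitions, on a tablespace span of 3) gives:

```python
import math

def total_tablespace_span(retention, future_partitions, tablespace_span):
    # All live partitions (retention) plus pre-created future partitions,
    # packed tablespace_span dates per tablespace; the +1 accounts for
    # the window straddling a tablespace boundary.
    return math.floor((retention + future_partitions) / tablespace_span) + 1

print(total_tablespace_span(30, 5, 3))  # prints 12
```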
The idea here is to keep the total tablespace span less than the total assigned tablespaces in the range. That way there is always at least one tablespace that is emptied on a regular basis as the tables move out of one tablespace and into another. This gap moves through all tablespaces in the range over time, so free storage can be automatically reclaimed from every tablespace without expensive reorganization. The frequency at which a tablespace is emptied equals the tablespace span assigned to the group; in the above example, every third day a tablespace could be shrunk. The reason to assign more tablespaces to each group than strictly required is to leave a buffer in case it is decided that the retention of a table in the range should be increased. The number of additional days of retention a table could absorb and still support this model is calculated as:
maximum_potential_retention = tablespace_span * (total_assigned_tablespaces - 1) - retention
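With hypothetical numbers (span 3, 50 assigned tablespaces, current retention of 100 days) the formula works out as follows; note that as written it subtracts the current retention, so it yields the headroom in days rather than a new retention target:

```python
def maximum_potential_retention(tablespace_span, total_assigned_tablespaces,
                                retention):
    # Headroom before the table's total tablespace span would catch up
    # with the number of tablespaces assigned to the group.
    return tablespace_span * (total_assigned_tablespaces - 1) - retention

print(maximum_potential_retention(3, 50, 100))  # prints 47
```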
Below are several graphs to help visualize this:
Obviously this model only works for date partitioned tables, though the same idea would work for other partitioning schemes that involve regularly adding and dropping partitions from large tables. The assumption is that if a table is not partitioned, or is statically partitioned, it probably isn't large enough to be of concern and can be placed in one of several tablespaces reserved for such tables: mostly reference tables and other small stage and load tables whose total storage is insignificant compared to the heavy hitters in the database.
Most of the work in implementing something like this is carefully and methodically constructing groups of tables with similar size and retention, and accounting for possible changes in retention. The size of the tables in each group should also be taken into account, so that the tablespace range is large enough that no individual tablespace grows out of hand.
The basic implementation is to alter or set up a regularly scheduled partition management job that adds future date partitions to tables and drops date partitions older than the retention threshold. Future partitions would be added such that the tablespace for each partition repeats for <tablespace span> days. I can provide some PL/SQL examples for this here if people are interested.
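To sanity-check the rotation, here is a small Python simulation of such a job. It is illustrative only: the real job would be PL/SQL issuing ALTER TABLE ... ADD/DROP PARTITION, and every parameter value below is invented for the demo. It shows that once the partition window is full, exactly one tablespace in the range sits empty at any time.

```python
from datetime import date, timedelta

# Invented demo parameters: total tablespace span = floor((10+2)/3)+1 = 5,
# so with 6 assigned tablespaces one is always empty at steady state.
RETENTION = 10   # days of history to keep (drop offset)
FUTURE = 2       # future partitions to pre-create (add offset)
SPAN = 3         # tablespace span: dates per tablespace
TS_COUNT = 6     # tablespaces assigned to the group
EPOCH = date(2020, 1, 1)

def ts_for(d):
    """Repeat each tablespace for SPAN days, wrapping around the range."""
    return "TS_%d" % ((d - EPOCH).days // SPAN % TS_COUNT + 1)

partitions = {}  # partition date -> tablespace holding it
for day in range(40):
    today = EPOCH + timedelta(days=day)
    # add today's partition and the pre-created future partitions
    for ahead in range(FUTURE + 1):
        d = today + timedelta(days=ahead)
        partitions.setdefault(d, ts_for(d))
    # drop partitions older than the retention threshold
    for d in [p for p in partitions if (today - p).days > RETENTION]:
        del partitions[d]

in_use = set(partitions.values())
empty = [t for t in ("TS_%d" % i for i in range(1, TS_COUNT + 1))
         if t not in in_use]
print("in use:", sorted(in_use), "empty and shrinkable:", empty)
```

Re-running the loop for more days shows the empty slot advancing through the range every SPAN days, which is the gap the shrink job below would harvest.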
A second job could then be created that finds tablespaces that are “empty” and re-sizes their associated datafiles down to the high water mark. “Empty” would likely be defined by two configurable thresholds (used space below a maximum, and total allocated space above a minimum). This prevents the process from calculating high water marks and re-sizing datafiles for tablespaces that aren’t actually empty, while also preventing it from re-shrinking the same already-shrunk tablespaces every day. An example query that could be included in a PL/SQL process:
SELECT tablespace_name
  FROM (SELECT d.tablespace_name,
               ROUND(d.allocated_mb - NVL(f.free_mb, 0)) used_mb,
               ROUND(d.allocated_mb) allocated_mb
          -- Aggregate each view separately so free-space extents do not
          -- multiply datafile rows in the join.
          FROM (SELECT tablespace_name, SUM(bytes) / 1024 / 1024 allocated_mb
                  FROM dba_data_files
                 GROUP BY tablespace_name) d,
               (SELECT tablespace_name, SUM(bytes) / 1024 / 1024 free_mb
                  FROM dba_free_space
                 GROUP BY tablespace_name) f
         WHERE d.tablespace_name = f.tablespace_name(+))
 WHERE used_mb <= gn_max_used_space_for_shrink
   AND allocated_mb >= gn_min_allocated_for_shrink
   AND REGEXP_LIKE(tablespace_name, gv_valid_tablespace_regex);