|
|
|
![]() |
|
|
|
|
|
Tip of the Week
(Archived) The subject line from the OTN forum was BEST WAY TO DELETE OVER A 75,000,000 RECORDS. Further explanation included the fact that the table contained approximately 113,294,685 records (a very accurate approximation) and the query condition WHERE MONTHS_BETWEEN(SYSDATE, TO_DATE(YRMO_NBR||'01','YYYYMMDD')) > 38 It also seems like this is something that needs to be done on a regular basis. I think there is some discrepancy between the amount of data to be deleted, the need for the delete to be recurring, and the time frame, but taking the statements at face value, the requirement is to remove the majority of rows from a large table based on a date. So what are the options? The first one mentioned in the forum was to use CREATE TABLE AS SELECT (CTAS) to create a copy of the table containing only the rows not to be deleted, since there are far fewer of them, drop the original table, and rename the copy. It would work, but it would not be my choice. There would need to be downtime for the table during the process, since new data would have to be prevented from entering the original table while the CTAS was running. There would also be issues dealing with dependencies and, for a period of time, additional disk space is required to hold a second copy of a large table. Another option is partitioning. If this type of delete would need to be done on a ongoing basis, then partitioning is a terrific solution. The table is partitioned by month, the partition that is 39 months old is dropped each month, and a new partition is created for the upcoming month. It's quick, and as a by product, it is possible that query performance improves as well. There is some application redesign involved, but the biggest drawback is cost. This is a pricey solution if you are not already licensed to use partitioning. Partitioning is only available with the Enterprise version of Oracle (list price $40K per processor), and partitioning is even an extra cost option (list price $10K per processor) on top of that. (If anyone is interested, we offer a nice discount off list.) A third option (my suggestion) is to actually figure out a way to do the delete as efficiently as possible. I put together the following example using bulk collect and delete. First I built a test table and populated it with 1,000,000 rows. Each row contains a column with a single digit from 0 through 9.
CREATE TABLE test_tbl Then I used the following PL/SQL script to delete 70% of the rows in the table, trying to somewhat replicate the original request.
DECLARE The procedure can be tweaked by changing the value of l_delete_buffer, set above to 100,000. A larger value could improve performance, but consume more resources, such as memory and undo space. A smaller value would reduce the resource usage, but also limit performance. How was the thread resolved? The initial response to the above approach was "Awesome!" Then the thread went on (and on) with how to tweak the above and partitioning. Note: This tip was tested on Oracle9i. Suggestions or comments? Click here for the tip feedback page. Thank you. |
|
|
|
|