Sectioning: Shot and Forgot
About partitioning you can find a lot of information , in particular, here you can read about the theory, and then the author develops the idea and offered his solution to quickly add section. I recommend you to familiarize yourself.
After studying the theory, almost everyone comes up with the idea of automating the process of creating sections. Above was one of the options, the second complex option I saw from the creators of the respected, I think, not only me Zabbix.
After a little adaptation, I decided to implement it myself ... Unfortunately, it revealed several shortcomings: when creating a new section, the first record in this section was lost; with a large number of sections, inserting even one record takes too much time (caused by 2 factors: each time a table was calculated where to put the record; using a lot of rules instead of 1 trigger with all conditions). Nevertheless, the guys did an excellent job and I take this opportunity to send them rays of respect.
As a result, I present my decision to your court. To start, an example of how to start partitioning:
Date range:
These commands will start the partitioning by days, weeks and months for the tables test1, test2, test3.
We will analyze the parameters: the
first field ' partitions.test1.cdate ' values separated by a dot:
partitions - the name of the scheme where the sections
test1 will be added - the ancestor table in the current area visibility
cdate - the field by which the table will be divided; the
second field 'date' sets the type of sections the
third field 'day, YYYY_MM_DD' sets the parameters for the
day section - create sections every day
YYYY_MM_DD - the suffix for the table
and finally the fourth now () :: textexample data for creating a section
Another example:
Here, the number will be reduced to 10-bit by adding zeros to the left (if necessary) and all digits to the left of the eighth will be taken for the section name:
1000000000 will go to partitions.test4_10
100000000 to partitions.test4_01
10000000 to partitions.test4_00
10000000000 to partitions .test4_100
Briefly describe how it works:
when you run select create_partition ('partitions.test1.cdate', 'date', 'day, YYYY_MM_DD', now () :: text); First, the "plugin" partition_ date is called. The purpose of this function is to return the section name and the condition of getting into it according to the parameters passed. Further in the main function, a new table is created, inherited from the current one with the necessary conditions, all the ancestor indices are copied.
Then, a query from the system tables pulls the data about the descendant tables and their check condition from this data, creates a trigger that distributes the data into sections.
Now, at any insertion, the trigger will fire and determine the desired table-section, if there is no such table, create_partition is called with the same parameters that were when creating the sections.
For the first example, after a year of filling, the trigger will look like this:
As you can see, it turns out to be a practically classic example from the documentation :)
As you may have guessed, this solution is easy to expand, for example, by partitioning by the first letters of a string or a hash of a string. I suggest writing this “plugin” yourself and posting it in the comments.
I hope someone will find my solution useful.
After studying the theory, almost everyone comes up with the idea of automating the process of creating sections. Above was one of the options, the second complex option I saw from the creators of the respected, I think, not only me Zabbix.
After a little adaptation, I decided to implement it myself ... Unfortunately, it revealed several shortcomings: when creating a new section, the first record in this section was lost; with a large number of sections, inserting even one record takes too much time (caused by 2 factors: each time a table was calculated where to put the record; using a lot of rules instead of 1 trigger with all conditions). Nevertheless, the guys did an excellent job and I take this opportunity to send them rays of respect.
As a result, I present my decision to your court. To start, an example of how to start partitioning:
Date range:
select create_partition('partitions.test1.cdate', 'date', 'day,YYYY_MM_DD', now()::text);
select create_partition('partitions.test2.cdate', 'date', 'week,YYYY_IW', now()::text);
select create_partition('partitions.test3.cdate', 'date', 'month,YYYY_MM', now()::text);
These commands will start the partitioning by days, weeks and months for the tables test1, test2, test3.
We will analyze the parameters: the
first field ' partitions.test1.cdate ' values separated by a dot:
partitions - the name of the scheme where the sections
test1 will be added - the ancestor table in the current area visibility
cdate - the field by which the table will be divided; the
second field 'date' sets the type of sections the
third field 'day, YYYY_MM_DD' sets the parameters for the
day section - create sections every day
YYYY_MM_DD - the suffix for the table
and finally the fourth now () :: textexample data for creating a section
Another example:
select create_partition('partitions.test4.id', 'digits', '10,2', 1::int);
Here, the number will be reduced to 10-bit by adding zeros to the left (if necessary) and all digits to the left of the eighth will be taken for the section name:
1000000000 will go to partitions.test4_10
100000000 to partitions.test4_01
10000000 to partitions.test4_00
10000000000 to partitions .test4_100
Briefly describe how it works:
when you run select create_partition ('partitions.test1.cdate', 'date', 'day, YYYY_MM_DD', now () :: text); First, the "plugin" partition_ date is called. The purpose of this function is to return the section name and the condition of getting into it according to the parameters passed. Further in the main function, a new table is created, inherited from the current one with the necessary conditions, all the ancestor indices are copied.
Then, a query from the system tables pulls the data about the descendant tables and their check condition from this data, creates a trigger that distributes the data into sections.
Now, at any insertion, the trigger will fire and determine the desired table-section, if there is no such table, create_partition is called with the same parameters that were when creating the sections.
For the first example, after a year of filling, the trigger will look like this:
CREATE OR REPLACE FUNCTION trig_partition_test1_cdate() RETURNS TRIGGER AS $BODY2$
declare
child text;
begin
IF ((NEW.cdate >= '2012-09-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-10-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_09 VALUES (NEW.*);
ELSIF ((NEW.cdate >= '2012-08-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-09-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_08 VALUES (NEW.*);
ELSIF ((NEW.cdate >= '2012-07-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-08-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_07 VALUES (NEW.*);
ELSIF ((NEW.cdate >= '2012-06-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-07-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_06 VALUES (NEW.*);
ELSIF ((NEW.cdate >= '2012-05-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-06-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_05 VALUES (NEW.*);
ELSIF ((NEW.cdate >= '2012-04-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-05-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_04 VALUES (NEW.*);
ELSIF ((NEW.cdate >= '2012-03-01 00:00:00-08'::timestamp with time zone) AND (NEW.cdate < '2012-04-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_03 VALUES (NEW.*);
ELSIF ((NEW.cdate >= '2012-02-01 00:00:00-08'::timestamp with time zone) AND (NEW.cdate < '2012-03-01 00:00:00-08'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_02 VALUES (NEW.*);
ELSIF ((NEW.cdate >= '2012-01-01 00:00:00-08'::timestamp with time zone) AND (NEW.cdate < '2012-02-01 00:00:00-08'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_01 VALUES (NEW.*);
ELSE
EXECUTE 'SELECT create_partition(''public.test1.cdate''::text, ''date''::text, ''month,YYYY_MM''::text , '''||NEW.cdate||'''::text)' INTO child;
EXECUTE 'INSERT INTO '|| child || ' SELECT $1.*' USING NEW;
END IF;
RETURN NULL;
end;
$BODY$ LANGUAGE 'plpgsql';
As you can see, it turns out to be a practically classic example from the documentation :)
As you may have guessed, this solution is easy to expand, for example, by partitioning by the first letters of a string or a hash of a string. I suggest writing this “plugin” yourself and posting it in the comments.
I hope someone will find my solution useful.
Helper scripts found on sql.ru forum
CREATE OR REPLACE FUNCTION to_timestamp(timestamp without time zone) RETURNS timestamp with time zone AS $_$
select $1::timestamp with time zone;
$_$ LANGUAGE sql STABLE STRICT;
CREATE OR REPLACE FUNCTION to_timestamp(timestamp with time zone) RETURNS timestamp with time zone AS $_$
select $1::timestamp with time zone;
$_$ LANGUAGE sql STABLE STRICT;
CREATE OR REPLACE FUNCTION to_numeric(text) RETURNS numeric AS $_$
select regexp_replace($1, '\D+', '', 'g')::numeric;
$_$ LANGUAGE sql STABLE STRICT;
CREATE OR REPLACE FUNCTION copy_constraints(srcoid oid, dstoid oid)
RETURNS integer AS
$BODY$
declare
i int4 := 0;
constrs record;
srctable text;
dsttable text;
begin
srctable = srcoid::regclass;
dsttable = dstoid::regclass;
for constrs in
select conname as name, pg_get_constraintdef(oid) as definition
from pg_constraint where conrelid = srcoid loop
begin
execute 'alter table ' || dsttable
|| ' add constraint '
|| replace(replace(constrs.name, srctable, dsttable),'.','_')
|| ' ' || constrs.definition;
i = i + 1;
exception
when duplicate_table then
end;
end loop;
return i;
exception when undefined_table then
return null;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
CREATE OR REPLACE FUNCTION copy_constraints(src text, dst text)
RETURNS integer AS
$BODY$
begin
return copy_constraints(src::regclass::oid, dst::regclass::oid);
end;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
CREATE OR REPLACE FUNCTION copy_indexes(srcoid oid, dstoid oid)
RETURNS integer AS
$BODY$
declare
i int4 := 0;
indexes record;
srctable text;
dsttable text;
script text;
begin
srctable = srcoid::regclass;
dsttable = dstoid::regclass;
for indexes in
select c.relname as name, pg_get_indexdef(idx.indexrelid) as definition
from pg_index idx, pg_class c where idx.indrelid = srcoid and c.oid = idx.indexrelid loop
script = replace (indexes.definition, ' INDEX '
|| indexes.name, ' INDEX '
|| replace(replace(indexes.name, srctable, dsttable),'.','_'));
script = replace (script, ' ON ' || srctable, ' ON ' || dsttable);
begin
execute script;
i = i + 1;
exception
when duplicate_table then
end;
end loop;
return i;
exception when undefined_table then
return null;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
CREATE OR REPLACE FUNCTION copy_indexes(src text, dst text)
RETURNS integer AS
$BODY$
begin
return copy_indexes(src::regclass::oid, dst::regclass::oid);
end;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
CREATE OR REPLACE FUNCTION copy_triggers(srcoid oid, dstoid oid)
RETURNS integer AS
$BODY$
declare
i int4 := 0;
triggers record;
srctable text;
dsttable text;
script text = '';
begin
srctable = srcoid::regclass;
dsttable = dstoid::regclass;
for triggers in
select tgname as name, pg_get_triggerdef(oid) as definition
from pg_trigger where tgrelid = srcoid loop
script =
replace (triggers.definition, ' TRIGGER '
|| triggers.name, ' TRIGGER '
|| replace(replace(triggers.name, srctable, dsttable),'.','_'));
script = replace (script, ' ON ' || srctable, ' ON ' || dsttable);
begin
execute script;
i = i + 1;
exception
when duplicate_table then
end;
end loop;
return i;
exception when undefined_table then
return null;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
CREATE OR REPLACE FUNCTION copy_triggers(src text, dst text)
RETURNS integer AS
$BODY$
begin
return copy_triggers(src::regclass::oid, dst::regclass::oid);
end;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
Main scripts
CREATE OR REPLACE FUNCTION "create_partition" (in tbl varchar, in method varchar, in params varchar, in sample text) RETURNS text AS $BODY$
declare
scheme varchar := split_part(tbl, '.', 1);
parent varchar := split_part(tbl, '.', 2);
field varchar := split_part(tbl, '.', 3);
child varchar;
script text;
trig text;
part text[];
begin
execute 'select partition_'||method||'('''||params||''', '''||field||''', '''||sample||''')' into part;
-- RAISE EXCEPTION 'part %', part;
child = scheme || '.' || parent || '_' || (part[1]::text);
execute 'create table IF NOT EXISTS ' || child || '
(
constraint partition_' || (part[1]) || ' check '
|| (part[2]) || '
)
inherits (' || parent || ')';
perform copy_constraints(parent, child);
perform copy_indexes(parent, child);
-- execute 'GRANT SELECT ON ' || child || ' TO some_other_user';
-- execute 'GRANT ALL ON ' || child || ' TO user';
script = (select string_agg(c, chr(10)||' ELS') from (
select to_numeric(replace(t.table_name, parent||'_','')) as n, 'IF '||replace(left(right(cc.check_clause, -1), -1), c.column_name, 'NEW.'||c.column_name)||' THEN INSERT INTO '||t.table_schema||'.'||t.table_name||' VALUES (NEW.*);' as c
from information_schema.TABLE_CONSTRAINTS t
join information_schema.CONSTRAINT_COLUMN_USAGE c
ON t.constraint_name = c.constraint_name
join information_schema.check_constraints cc
ON t.constraint_name = cc.constraint_name
where constraint_type IN ('CHECK')
and t.table_name like parent||'\_%'
group by t.table_schema, t.table_name, c.column_name, cc.check_clause
order by n desc) t);
trig = 'trig_partition_'||parent||'_'||field;
execute
'CREATE OR REPLACE FUNCTION '||trig||'() RETURNS TRIGGER AS $BODY2$
declare
child text;
begin
'||script||'
ELSE
EXECUTE ''SELECT create_partition('''''||tbl||'''''::text, '''''||method||'''''::text, '''''||params||'''''::text , ''''''||NEW.'||field||'||''''''::text)'' INTO child;
EXECUTE ''INSERT INTO ''|| child || '' SELECT $1.*'' USING NEW;
END IF;
RETURN NULL;
end;
$BODY2$
LANGUAGE ''plpgsql'' VOLATILE;';
execute 'DROP TRIGGER IF EXISTS t_'||trig||' ON "'||parent||'" CASCADE';
execute 'CREATE TRIGGER t_'||trig||' BEFORE INSERT ON "'||parent||'" FOR EACH ROW EXECUTE PROCEDURE '||trig||'();';
return child;
end;
$BODY$ LANGUAGE 'plpgsql';
CREATE OR REPLACE FUNCTION "partition_date" (in params text, in field varchar, in sample timestamp with time zone) RETURNS text[] AS $BODY$
declare
period varchar:= split_part(params, ',', 1);
fmt varchar := split_part(params, ',', 2);
clock timestamp with time zone := to_timestamp(sample);
delta varchar := '1 '||period;
suffix varchar;
check_beg varchar;
check_end varchar;
condition varchar;
begin
-- RAISE EXCEPTION 'period %, fmt %, clock %', period, fmt, clock;
check_beg = date_trunc(period, clock);
check_end = date_trunc(period, clock + delta::interval);
suffix = to_char (clock, fmt);
condition =
'(
' || field || ' >= ' || quote_literal (check_beg) || ' and
' || field || ' < ' || quote_literal (check_end) || '
)';
return ARRAY[suffix, condition];
end;
$BODY$ LANGUAGE 'plpgsql' IMMUTABLE;
CREATE OR REPLACE FUNCTION "partition_digits" (in params text, in field varchar, in sample numeric) RETURNS text[] AS $BODY$
declare
len int := split_part(params, ',', 1)::int;
pref int := split_part(params, ',', 2)::int;
norm text := to_char(sample::numeric, 'FM000000000000000000000');
suffix text := regexp_replace(left(norm, -1*(len-pref)), '0*(?=\d{'||pref::text||'})', '');
check_beg varchar;
check_end varchar;
condition varchar;
begin
check_beg = (trunc(norm::numeric, -1*(len-pref)))::numeric::text;
check_end = (check_beg::numeric+10^(len-pref))::numeric::text;
condition =
'(
' || field || ' >= ' || check_beg || ' and
' || field || ' < ' || check_end || '
)';
return ARRAY[suffix, condition];
end;
$BODY$ LANGUAGE 'plpgsql' IMMUTABLE;
Tests
Date Sections
Number range sections:
DROP TABLE IF EXISTS "public"."test1" CASCADE;
CREATE TABLE "public"."test1" (
"id" serial,
"cdate" timestamp with time zone,
"text" text,
CONSTRAINT "test11_pkey" PRIMARY KEY (id)
) WITH OIDS;
CREATE INDEX test_idx_cdate ON test1 USING btree (cdate);
-- select create_partition('public.test1.cdate', 'date', 'day,YYYY_MM_DD', now()::text);
-- select create_partition('public.test1.cdate', 'date', 'week,YYYY_IW', now()::text);
select create_partition('public.test1.cdate', 'date', 'month,YYYY_MM', now()::text);
-- insert into test1 (cdate, text) SELECT cdate, md5(random()::text) as text FROM generate_series('2012-01-01'::date, now(), '1 day'::interval) cdate;
Number range sections:
DROP TABLE IF EXISTS "public"."test2" CASCADE;
CREATE TABLE "public"."test2" (
"id" bigserial,
"text" text,
CONSTRAINT "test2_pkey" PRIMARY KEY (id)
) WITH OIDS;
select create_partition('public.test2.id', 'digits', '10,2', 1::int);
insert into test2 values(10000000, 'test2');
-- insert into test2 (id, text) SELECT t.id, md5(random()::text) as text FROM generate_series(0, 100000000000, 1000000000) as t(id) limit 200;