Borgius September 25, 2012 at 04:50

Sectioning: Shot and Forgot

About partitioning you can find a lot of information , in particular, here you can read about the theory, and then the author develops the idea and offered his solution to quickly add section. I recommend you to familiarize yourself.
After studying the theory, almost everyone comes up with the idea of automating the process of creating sections. Above was one of the options, the second complex option I saw from the creators of the respected, I think, not only me Zabbix.
After a little adaptation, I decided to implement it myself ... Unfortunately, it revealed several shortcomings: when creating a new section, the first record in this section was lost; with a large number of sections, inserting even one record takes too much time (caused by 2 factors: each time a table was calculated where to put the record; using a lot of rules instead of 1 trigger with all conditions). Nevertheless, the guys did an excellent job and I take this opportunity to send them rays of respect.

As a result, I present my decision to your court. To start, an example of how to start partitioning:
Date range:

select create_partition('partitions.test1.cdate', 'date', 'day,YYYY_MM_DD', now()::text);
select create_partition('partitions.test2.cdate', 'date', 'week,YYYY_IW', now()::text);
select create_partition('partitions.test3.cdate', 'date', 'month,YYYY_MM', now()::text);

These commands will start the partitioning by days, weeks and months for the tables test1, test2, test3.

We will analyze the parameters: the
first field ' partitions.test1.cdate ' values separated by a dot:
partitions - the name of the scheme where the sections
test1 will be added - the ancestor table in the current area visibility
cdate - the field by which the table will be divided; the
second field 'date' sets the type of sections the
third field 'day, YYYY_MM_DD' sets the parameters for the
day section - create sections every day
YYYY_MM_DD - the suffix for the table
and finally the fourth now () :: textexample data for creating a section

Another example:

select create_partition('partitions.test4.id', 'digits', '10,2', 1::int);

Here, the number will be reduced to 10-bit by adding zeros to the left (if necessary) and all digits to the left of the eighth will be taken for the section name:
1000000000 will go to partitions.test4_10
100000000 to partitions.test4_01
10000000 to partitions.test4_00
10000000000 to partitions .test4_100

Briefly describe how it works:
when you run select create_partition ('partitions.test1.cdate', 'date', 'day, YYYY_MM_DD', now () :: text); First, the "plugin" partition_ date is called. The purpose of this function is to return the section name and the condition of getting into it according to the parameters passed. Further in the main function, a new table is created, inherited from the current one with the necessary conditions, all the ancestor indices are copied.
Then, a query from the system tables pulls the data about the descendant tables and their check condition from this data, creates a trigger that distributes the data into sections.
Now, at any insertion, the trigger will fire and determine the desired table-section, if there is no such table, create_partition is called with the same parameters that were when creating the sections.
For the first example, after a year of filling, the trigger will look like this:

CREATE OR REPLACE FUNCTION trig_partition_test1_cdate() RETURNS TRIGGER AS $BODY2$
declare
        child text;
      begin
        IF ((NEW.cdate >= '2012-09-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-10-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_09 VALUES (NEW.*);
        ELSIF ((NEW.cdate >= '2012-08-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-09-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_08 VALUES (NEW.*);
        ELSIF ((NEW.cdate >= '2012-07-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-08-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_07 VALUES (NEW.*);
        ELSIF ((NEW.cdate >= '2012-06-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-07-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_06 VALUES (NEW.*);
        ELSIF ((NEW.cdate >= '2012-05-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-06-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_05 VALUES (NEW.*);
        ELSIF ((NEW.cdate >= '2012-04-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-05-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_04 VALUES (NEW.*);
        ELSIF ((NEW.cdate >= '2012-03-01 00:00:00-08'::timestamp with time zone) AND (NEW.cdate < '2012-04-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_03 VALUES (NEW.*);
        ELSIF ((NEW.cdate >= '2012-02-01 00:00:00-08'::timestamp with time zone) AND (NEW.cdate < '2012-03-01 00:00:00-08'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_02 VALUES (NEW.*);
        ELSIF ((NEW.cdate >= '2012-01-01 00:00:00-08'::timestamp with time zone) AND (NEW.cdate < '2012-02-01 00:00:00-08'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_01 VALUES (NEW.*);
        ELSE
            EXECUTE 'SELECT create_partition(''public.test1.cdate''::text, ''date''::text, ''month,YYYY_MM''::text , '''||NEW.cdate||'''::text)' INTO child;
            EXECUTE 'INSERT INTO '|| child || ' SELECT $1.*' USING NEW;
	    END IF;
	    RETURN NULL;
      end;
$BODY$ LANGUAGE 'plpgsql';

As you can see, it turns out to be a practically classic example from the documentation :)

As you may have guessed, this solution is easy to expand, for example, by partitioning by the first letters of a string or a hash of a string. I suggest writing this “plugin” yourself and posting it in the comments.

I hope someone will find my solution useful.

Helper scripts found on sql.ru forum

CREATE OR REPLACE FUNCTION to_timestamp(timestamp without time zone) RETURNS timestamp with time zone AS $_$
  select $1::timestamp with time zone;
$_$ LANGUAGE sql STABLE STRICT;
CREATE OR REPLACE FUNCTION to_timestamp(timestamp with time zone) RETURNS timestamp with time zone AS $_$
  select $1::timestamp with time zone;
$_$ LANGUAGE sql STABLE STRICT;
CREATE OR REPLACE FUNCTION to_numeric(text) RETURNS numeric AS $_$
  select regexp_replace($1, '\D+', '', 'g')::numeric;
$_$ LANGUAGE sql STABLE STRICT;
CREATE OR REPLACE FUNCTION copy_constraints(srcoid oid, dstoid oid)
  RETURNS integer AS
$BODY$
declare
  i int4 := 0;
  constrs record;
  srctable text;
  dsttable text;
begin
  srctable = srcoid::regclass;
  dsttable = dstoid::regclass;
  for constrs in
  select conname as name, pg_get_constraintdef(oid) as definition
  from pg_constraint where conrelid = srcoid loop
    begin
    execute 'alter table ' || dsttable
      || ' add constraint '
      || replace(replace(constrs.name, srctable, dsttable),'.','_')
      || ' ' || constrs.definition;
    i = i + 1;
    exception
      when duplicate_table then
    end;
  end loop;
  return i;
exception when undefined_table then
  return null;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
CREATE OR REPLACE FUNCTION copy_constraints(src text, dst text)
  RETURNS integer AS
$BODY$
begin
  return copy_constraints(src::regclass::oid, dst::regclass::oid);
end;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
CREATE OR REPLACE FUNCTION copy_indexes(srcoid oid, dstoid oid)
  RETURNS integer AS
$BODY$
declare
  i int4 := 0;
  indexes record;
  srctable text;
  dsttable text;
  script text;
begin
  srctable = srcoid::regclass;
  dsttable = dstoid::regclass;
  for indexes in
  select c.relname as name, pg_get_indexdef(idx.indexrelid) as definition
  from pg_index idx, pg_class c where idx.indrelid = srcoid and c.oid = idx.indexrelid loop
    script = replace (indexes.definition, ' INDEX '
      || indexes.name, ' INDEX '
      || replace(replace(indexes.name, srctable, dsttable),'.','_'));
    script = replace (script, ' ON ' || srctable, ' ON ' || dsttable);
    begin
      execute script;
      i = i + 1;
    exception
      when duplicate_table then
    end;
  end loop;
  return i;
exception when undefined_table then
  return null;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
CREATE OR REPLACE FUNCTION copy_indexes(src text, dst text)
  RETURNS integer AS
$BODY$
begin
  return copy_indexes(src::regclass::oid, dst::regclass::oid);
end;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
CREATE OR REPLACE FUNCTION copy_triggers(srcoid oid, dstoid oid)
  RETURNS integer AS
$BODY$
declare
  i int4 := 0;
  triggers record;
  srctable text;
  dsttable text;
  script text = '';
begin
  srctable = srcoid::regclass;
  dsttable = dstoid::regclass;
  for triggers in
   select tgname as name, pg_get_triggerdef(oid) as definition
   from pg_trigger where tgrelid = srcoid loop
    script =
    replace (triggers.definition, ' TRIGGER '
      || triggers.name, ' TRIGGER '
      || replace(replace(triggers.name, srctable, dsttable),'.','_'));
    script = replace (script, ' ON ' || srctable, ' ON ' || dsttable);
    begin
      execute script;
      i = i + 1;
    exception
      when duplicate_table then
    end;
  end loop;
  return i;
exception when undefined_table then
  return null;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
CREATE OR REPLACE FUNCTION copy_triggers(src text, dst text)
  RETURNS integer AS
$BODY$
begin
  return copy_triggers(src::regclass::oid, dst::regclass::oid);
end;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;

Main scripts

CREATE OR REPLACE FUNCTION "create_partition" (in tbl varchar, in method varchar, in params varchar, in sample text) RETURNS text AS $BODY$
declare
  scheme varchar := split_part(tbl, '.', 1);
  parent varchar := split_part(tbl, '.', 2);
  field varchar := split_part(tbl, '.', 3);
  child  varchar;
  script text;
  trig text;
  part text[];
begin
    execute 'select partition_'||method||'('''||params||''', '''||field||''', '''||sample||''')' into part;
    -- RAISE EXCEPTION 'part %', part;
    child = scheme || '.' || parent || '_' || (part[1]::text);
    execute 'create table IF NOT EXISTS ' || child || '
    (
     constraint partition_' || (part[1]) || ' check '
      || (part[2]) || '
    )
    inherits (' || parent || ')';
    perform copy_constraints(parent, child);
    perform copy_indexes(parent, child);
    -- execute 'GRANT SELECT ON ' || child || ' TO some_other_user';
    -- execute 'GRANT ALL ON ' || child || ' TO user';
    script = (select string_agg(c, chr(10)||'        ELS') from (
        select to_numeric(replace(t.table_name, parent||'_','')) as n, 'IF '||replace(left(right(cc.check_clause, -1), -1), c.column_name, 'NEW.'||c.column_name)||' THEN INSERT INTO '||t.table_schema||'.'||t.table_name||' VALUES (NEW.*);' as c
          from information_schema.TABLE_CONSTRAINTS t
                  join information_schema.CONSTRAINT_COLUMN_USAGE c
            ON t.constraint_name = c.constraint_name
                  join information_schema.check_constraints cc
            ON t.constraint_name = cc.constraint_name
          where constraint_type IN ('CHECK')
                  and t.table_name like parent||'\_%'
          group by t.table_schema, t.table_name, c.column_name, cc.check_clause
        order by n desc) t);
    trig = 'trig_partition_'||parent||'_'||field;
    execute
    'CREATE OR REPLACE FUNCTION '||trig||'() RETURNS TRIGGER AS $BODY2$
      declare
        child text;
      begin
        '||script||'
        ELSE
            EXECUTE ''SELECT create_partition('''''||tbl||'''''::text, '''''||method||'''''::text, '''''||params||'''''::text , ''''''||NEW.'||field||'||''''''::text)'' INTO child;
            EXECUTE ''INSERT INTO ''|| child || '' SELECT $1.*'' USING NEW;
	    END IF;
	    RETURN NULL;
      end;
      $BODY2$
      LANGUAGE ''plpgsql'' VOLATILE;';
    execute 'DROP TRIGGER IF EXISTS t_'||trig||' ON "'||parent||'" CASCADE';
    execute 'CREATE TRIGGER t_'||trig||' BEFORE INSERT ON "'||parent||'" FOR EACH ROW EXECUTE PROCEDURE '||trig||'();';
  return child;
end;
$BODY$ LANGUAGE 'plpgsql';
CREATE OR REPLACE FUNCTION "partition_date" (in params text, in field varchar, in sample timestamp with time zone) RETURNS text[] AS $BODY$
declare
  period varchar:= split_part(params, ',', 1);
  fmt varchar := split_part(params, ',', 2);
  clock timestamp with time zone := to_timestamp(sample);
  delta varchar := '1 '||period;
  suffix varchar;
  check_beg varchar;
  check_end varchar;
  condition varchar;
begin
    -- RAISE EXCEPTION 'period %, fmt %, clock %', period, fmt, clock;
    check_beg = date_trunc(period, clock);
    check_end = date_trunc(period, clock + delta::interval);
    suffix = to_char (clock, fmt);
    condition =
    '(
       ' || field || ' >= ' || quote_literal (check_beg) || ' and
       ' || field || ' < ' || quote_literal (check_end) || '
     )';
    return ARRAY[suffix, condition];
end;
$BODY$ LANGUAGE 'plpgsql' IMMUTABLE;
CREATE OR REPLACE FUNCTION "partition_digits" (in params text, in field varchar, in sample numeric) RETURNS text[] AS $BODY$
declare
  len int := split_part(params, ',', 1)::int;
  pref int := split_part(params, ',', 2)::int;
  norm text := to_char(sample::numeric, 'FM000000000000000000000');
  suffix text := regexp_replace(left(norm, -1*(len-pref)), '0*(?=\d{'||pref::text||'})', '');
  check_beg varchar;
  check_end varchar;
  condition varchar;
begin
    check_beg = (trunc(norm::numeric, -1*(len-pref)))::numeric::text;
    check_end = (check_beg::numeric+10^(len-pref))::numeric::text;
    condition =
    '(
       ' || field || ' >= ' || check_beg || ' and
       ' || field || ' < ' || check_end || '
    )';
    return ARRAY[suffix, condition];
end;
$BODY$ LANGUAGE 'plpgsql' IMMUTABLE;

Tests

Date Sections

DROP TABLE IF EXISTS "public"."test1" CASCADE;
CREATE TABLE "public"."test1" (
  "id" serial,
  "cdate" timestamp with time zone,
  "text" text,
  CONSTRAINT "test11_pkey" PRIMARY KEY (id)
) WITH OIDS;
CREATE INDEX test_idx_cdate ON test1 USING btree (cdate);
-- select create_partition('public.test1.cdate', 'date', 'day,YYYY_MM_DD', now()::text);
-- select create_partition('public.test1.cdate', 'date', 'week,YYYY_IW', now()::text);
select create_partition('public.test1.cdate', 'date', 'month,YYYY_MM', now()::text);
-- insert into test1 (cdate, text) SELECT  cdate, md5(random()::text) as text FROM generate_series('2012-01-01'::date, now(), '1 day'::interval) cdate;

Number range sections:

DROP TABLE IF EXISTS "public"."test2" CASCADE;
CREATE TABLE "public"."test2" (
  "id" bigserial,
  "text" text,
  CONSTRAINT "test2_pkey" PRIMARY KEY (id)
) WITH OIDS;
select create_partition('public.test2.id', 'digits', '10,2', 1::int);
insert into test2 values(10000000, 'test2');
-- insert into test2 (id, text) SELECT  t.id, md5(random()::text) as text FROM generate_series(0, 100000000000, 1000000000) as t(id) limit 200;

Tags:

Sectioning: Shot and Forgot

Also popular now: