Sequencing in MySQL

    Hello! Today there was an interesting problem in data sampling, the solution of which I decided to share.

    So, given:

    Two tables - shops and products

    Roughly speaking - a platform where different stores place their goods.
    And so, there was a need to make the delivery of goods on the main page, but so that the user would not see a bunch of goods from one store. Stores need to be alternated.

    shops:

    • id int not null auto_increment primary key,
    • name varchar (255) null

    products:

    • id int not null auto_increment primary key,
    • shop_id int not null,
    • name varchar (255) null,

    A little google - no sensible solution was found. But the thought came up how to implement a selection with alternating stores.

    Initially, I will describe the algorithm:

    1. You must select all the products and sort them by store.
    2. Next, number each product, starting from 1 with an interval equal to the number of stores.
    3. When numbering, as soon as the goods of one store end, the numbering is reset to zero, shifted by one, and starts again
    4. Select products by sorting them by numbered field

    And all this with MySQL tools. And preferably with a single request.

    With a plan in mind, you can begin to implement it. What do we need?

    1. @i - The counter that will number our products
    2. @cnt - Number of stores
    3. @delta - the delta by which the counter is shifted when numbering the goods of the next store
    4. @cur - id of the current store, to add a delta and reset the counter when numbering a new store

    Declare our variables:

    set @cnt = 0;
    set @i = 0;
    set @delta = 0;
    set @cur = 0;

    Next, assign the initial values ​​(number of stores and id of the first store).

    select @cur:=id from shops order by id limit 1;
    select @cnt:=count(id) from shops;

    Now you can start the selection itself. What do we need?

    • It is necessary to number goods with an interval equal to the number of stores.
    • At the end of the goods of one store - reset the counter, add a delta and change the current store.

    I got the following request:

    select id, shop_id, @i:=@i+@cnt as counter,
                         IF(@cur<>shop_id,@delta:=@delta+1,@delta) as delta,
                         IF(@cur<>shop_id,@i:=@delta,@i) as cur,
                         IF(@cur<>shop_id,@cur:=shop_id,@cur) as curshop
      from t_product order by shop_id

    More information about what is here:

    @i:=@i+@cnt

    In each line, we increase our counter by a number equal to the number of stores. That is, if we have 5 stores, then we will get the following numbering: 0, 5, 10, 15, etc.

    IF(@cur<>shop_id,@delta:=@delta+1,@delta) as delta

    As soon as we have a new store, we increase the shift by one. Those. for the first store, the shift will be 0, for the second - 1, etc.

    IF(@cur<>shop_id,@i:=@delta,@i) as cur,

    When changing the store, we also need to reset our counters and start numbering goods from the beginning, not forgetting to add a shift.

    IF(@cur<>shop_id,@cur:=shop_id,@cur) as curshop

    And in the end - to update the current store, the goods of which we number ...

    As a result, we get a sample of the type:
    idshop_idcounterdeltacurcurshop
    439891100101
    469891200201
    1141721thirty0thirty1
    839891400401
    671721fifty0fifty1
    946722eleven1eleven2
    64892211212
    419892311312
    616722411412
    974893122123

    Here we see that the counter is added correctly, when changing the store it is reset, a shift is added and the numbering starts from the beginning (taking into account the shift).

    Actually, the matter is left to the small. Wrap the resulting selection in a subquery and sort by our counter:

    select id as product_id, shop_id, cur from (
        select id, shop_id, @i:=@i+@cnt as counter,
            IF(@cur<>shop_id,@start:=@delta+1,@delta) as delta,
            IF(@cur<>shop_id,@i:=@delta,@i) as cur,
            IF(@cur<>shop_id,@cur:=shop_id,@cur) as curmag
            from  products order by shop_id
        ) as A order by cur ;

    Voila! We got a selection of goods with alternating stores:
    product_idshop_idcur
    4187110
    74832eleven
    4045312
    90914thirteen
    1457514
    23876fifteen
    8109716
    1445817
    2102918
    92451019
    6744120
    7854221
    2164322

    There is one minus - the goods at each store go in order. Those. at the beginning we will see the very first product of the first store, then the first product of the second store, the third, fourth, etc. Next will go the second store goods, the third and so on.

    In order to get rid of this pattern, we need to mix the goods in the initial sample, wrapping it in another subquery:

    select id as product_id, shop_id, cur from (
        select id, shop_id, @i:=@i+@cnt as counter,
            IF(@cur<>shop_id,@start:=@delta+1,@delta) as delta,
            IF(@cur<>shop_id,@i:=@delta,@i) as cur,
            IF(@cur<>shop_id,@cur:=shop_id,@cur) as curmag
            from  products order by shop_id
        from (select id, shop_id from products order by shop_id, rand()) as A order by shop_id) as B order by cur;

    So our products will already be mixed before they are numbered.

    Actually, this is where the solution to the problem ended. Full request can be viewed under the cut.

    Full interleaved query
    set @cnt = 0;
    set @i = 0;
    set @start = 0;
    set @cur = 0;
    select @cur:=id from shops order by id limit 1;
    select @cnt:=count(id) from shops;
    select id as product_id, shop_id, cur from (
        select id, shop_id, @i:=@i+@cnt as counter,
            IF(@cur<>shop_id,@start:=@delta+1,@delta) as delta,
            IF(@cur<>shop_id,@i:=@delta,@i) as cur,
            IF(@cur<>shop_id,@cur:=shop_id,@cur) as curmag
            from  products order by shop_id
        from (select id, shop_id from products order by shop_id, rand()) as A order by shop_id) as B order by cur;

    For the sake of interest - I looked at the sampling rate. In my opinion, the results were quite good:

    10 stores, 10,000 products - ~ 16ms (0.016s)
    100 stores, 1,000,000 products - ~ 2568ms (2.568s)
    100 stores, 10,000,000 products - 129951ms (2m 9.951s)

    I think these are good results , although, of course, you need to test in combat mode.

    PS For me, only one unclear question remains. Everything is fine, but what to do with pagination? After all, each next page is a new request.

    Accordingly, goods mixed in stores will receive a new serial number and may appear more than once in the sample.

    If you have any thoughts on this, I will be grateful to hear them in the comments.


    Thank you all for your attention)

    Also popular now: