Everyone can find the Higgs boson!


    On May 12, CERN announced the Higgs Boson Machine Learning Challenge, a competition to find the best algorithm for identifying Higgs boson events in an experimental data set. The competition runs until September 15, and the winners will receive cash prizes of $2,000 to $7,000. A successful solution may be integrated into the actual processing pipeline for data from the ATLAS detector. No special knowledge of particle physics is required to take part.

    The Higgs boson is not detected directly at the Large Hadron Collider, but through its decay products. High-energy protons collide in the center of the detector. A collision can produce a Higgs boson, which decays into other particles almost immediately. According to the predictions of the Standard Model, the most probable decay channel is into a bottom quark and a bottom antiquark. The competition focuses on rarer events, in which the Higgs boson decays into a tau lepton and an antitau lepton. Since these leptons also decay quickly through various channels, the detector “sees” only their decay products. However, a similar set of decay products can arise in many other ways, so many events form a background, and to study the Higgs boson it is necessary to distinguish events containing a boson from that background.

    A huge number of collisions occur in the collider, so it is very important to distinguish interesting events from uninteresting ones quickly and accurately, using the data from the detector. This is what the contestants are invited to do.

    Each event is described by thirty numbers: 17 are direct measurements from the detector, and 13 are derived values calculated from the raw data that, according to experts, may be useful for prediction. Among the raw values, for example, PRI_tau_pt is the transverse momentum of the detected hadronic tau (a tau lepton reconstructed via its hadronic decay channel). Among the derived values, for example, DER_mass_MMC is the estimated mass of the Higgs boson most likely to have produced the event (if there was a Higgs boson at all). A full theoretical description of the parameters is given in a dedicated article, although you may prefer not to read it so as to approach the task with fresh eyes.
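
    As a rough illustration of how the raw and derived values can be told apart, the sketch below groups column names by their prefix. Only PRI_tau_pt and DER_mass_MMC come from the description above; the other column names, the sample values, and the helper name split_features are hypothetical placeholders, not the real data file.

```python
import csv
import io

# Hypothetical miniature of the data file: only PRI_tau_pt and DER_mass_MMC
# are named in the article; the other columns and all values are made up.
SAMPLE_CSV = """EventId,DER_mass_MMC,PRI_tau_pt,PRI_example_raw,DER_example_derived,Label
1,125.0,32.6,1.5,0.2,s
2,98.4,41.1,0.7,0.9,b
"""

def split_features(columns):
    """Group column names into raw (PRI_) and derived (DER_) features."""
    raw = [c for c in columns if c.startswith("PRI_")]
    derived = [c for c in columns if c.startswith("DER_")]
    return raw, derived

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
raw_cols, derived_cols = split_features(rows[0].keys())
print("raw:", raw_cols)
print("derived:", derived_cols)
```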

    Participants are given a training set of 250 thousand events, each labeled as signal or background, and are asked to classify 550 thousand test events provided in advance. Results are scored with a formula that takes into account the numbers of correct and incorrect answers. To make it harder to tune a solution to the test data, the exact score is not disclosed: until the end of the contest, scoring is performed on a random 18% subset of the test sample.
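
    The scoring formula is not spelled out here. The challenge documentation defines it as the approximate median significance (AMS), computed from the weighted sum of correctly selected signal events s and the weighted sum of background events wrongly selected as signal b. The sketch below assumes that definition together with its regularization term of 10; the function name ams is only illustrative, and the competition rules remain the authoritative source.

```python
import math

def ams(s, b, b_reg=10.0):
    """Approximate median significance (assumed challenge metric).

    s: weighted sum of signal events correctly selected (true positives)
    b: weighted sum of background events wrongly selected (false positives)
    b_reg: regularization term; 10 is assumed here from the challenge rules
    """
    if s < 0 or b < 0:
        raise ValueError("s and b must be non-negative")
    return math.sqrt(2.0 * ((s + b + b_reg) * math.log(1.0 + s / (b + b_reg)) - s))

# The score grows with selected signal and shrinks with selected background.
print(ams(10.0, 100.0))
print(ams(50.0, 100.0))
print(ams(50.0, 200.0))
```

    Under this definition, selecting no signal at all gives a score of zero, so a classifier has to trade off how many signal events it keeps against how much background it lets through.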

    Participants can form teams of up to four people and submit up to five solutions per day. Approaches to the problem can be discussed on the forum. To have a solution scored, it is enough to submit a file with predictions; the source code can be uploaded later, if you claim a prize.
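
    A prediction file could be produced along the following lines. The column layout (EventId, RankOrder, Class), the ranking convention, the threshold, and the helper name write_submission are assumptions made for illustration, and the event ids and scores are made up; check the competition page for the required format before submitting.

```python
import csv
import io

def write_submission(event_ids, scores, threshold, out):
    """Write one row per event: id, rank by score, and predicted class."""
    # Assumed convention: rank 1 is the least signal-like event, N the most.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank = {idx: r + 1 for r, idx in enumerate(order)}
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["EventId", "RankOrder", "Class"])
    for i, event_id in enumerate(event_ids):
        # "s" marks predicted signal, "b" predicted background.
        label = "s" if scores[i] >= threshold else "b"
        writer.writerow([event_id, rank[i], label])

# Made-up event ids and classifier scores, written to an in-memory buffer.
buf = io.StringIO()
write_submission([350000, 350001, 350002], [0.91, 0.12, 0.55], 0.5, buf)
print(buf.getvalue())
```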

    The authors of the three best solutions will receive cash prizes of $7,000, $4,000 and $2,000. The ATLAS collaboration will also select a winning team whose solution is best suited for use in the experiment (taking into account performance, reliability and other criteria). This team will be invited to CERN to meet the ATLAS collaboration, with travel expenses covered.
