Treat all content as spam?

    The constantly growing number of topics in my RSS reader prompted these thoughts. Articles also keep appearing on Habr in the style of "I don't like topics about XXX on the main page, let me filter them out." As a solution, they suggest filtering out disliked words with a regexp in Yahoo Pipes.

    The Bayesian algorithm has been used successfully to filter spam. It is simple, trainable, and effective (it cuts off up to 95–97% of spam). So why not use it to filter the flow of information?

    Suppose we treat every topic in the reader as spam. The user's behavior stays the same: they read topic after topic, noting what they liked and what they did not. We add one feature to the reader: a way to mark the topics you like, which trains the Bayesian filter. After a training period, the filter will be able to pick out the topics the user is likely to enjoy and put them, for example, in a "Read First" section.
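
    To make the idea concrete, here is a minimal sketch of such a filter in Python. It is only an illustration under my own assumptions: the class name `TopicFilter`, the methods `train` and `probability_liked`, the word-level tokenizer, and the add-one smoothing are all hypothetical choices, not part of any existing reader.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


class TopicFilter:
    """Hypothetical naive Bayes filter with two classes: 'liked' and 'other'."""

    def __init__(self):
        self.word_counts = {"liked": Counter(), "other": Counter()}
        self.topic_counts = {"liked": 0, "other": 0}

    def train(self, text, liked):
        """Record one topic the user has read, marked as liked or not."""
        label = "liked" if liked else "other"
        self.topic_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def _log_score(self, tokens, label, vocab_size):
        """Log of (class prior * product of word likelihoods), add-one smoothed."""
        total_topics = sum(self.topic_counts.values())
        score = math.log((self.topic_counts[label] + 1) / (total_topics + 2))
        total_words = sum(self.word_counts[label].values())
        for token in tokens:
            count = self.word_counts[label][token]
            score += math.log((count + 1) / (total_words + vocab_size))
        return score

    def probability_liked(self, text):
        """Estimated probability that the user will like this topic."""
        tokens = tokenize(text)
        vocab = set(self.word_counts["liked"]) | set(self.word_counts["other"])
        vocab_size = max(len(vocab), 1)
        liked = self._log_score(tokens, "liked", vocab_size)
        other = self._log_score(tokens, "other", vocab_size)
        diff = other - liked
        if diff > 700:  # avoid overflow in math.exp for extreme scores
            return 0.0
        return 1.0 / (1.0 + math.exp(diff))


if __name__ == "__main__":
    topic_filter = TopicFilter()
    topic_filter.train("Bayesian spam filter in Python", liked=True)
    topic_filter.train("Training a classifier on RSS feeds", liked=True)
    topic_filter.train("Celebrity gossip of the week", liked=False)
    topic_filter.train("Top ten gadget deals", liked=False)

    new_topics = ["Python classifier for RSS", "Gadget gossip deals"]
    # Topics scoring above an assumed 0.5 threshold go to "Read First".
    read_first = [t for t in new_topics if topic_filter.probability_liked(t) > 0.5]
    print(read_first)
```

    The 0.5 threshold is arbitrary; a real reader would probably tune it, and would need far more than four marked topics before the scores mean anything.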

    You could go further and filter information with a browser plug-in.

    Maybe this topic should be moved to the "I'm crazy" blog?
