A pipeline will save you from duplicates in your feed

Super Mario, the great pipeliner

Hi. Andorro recently wrote about the annoying duplicate posts that are likely to show up in your feed if you subscribe to both Habr and GT. There is a great way to solve this problem with RSS and Yahoo and, while you are at it, to subscribe to overlapping hubs.

Yahoo Pipes

Pipes is a powerful composition tool to aggregate, manipulate, and mashup content from around the web.

Pipes is a service that takes something as input, processes it internally, and returns the result. The input can be CSV, RSS, XML, or even the output of another pipe. The output can be RSS, JSON, email, or a widget.

For example, you can take an RSS feed from Tumblr, use a regular expression to replace the link to the small picture with a link to the large one in every img, and return the resulting RSS.
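To make the regular expression step concrete, here is a minimal sketch of the same idea outside Pipes. The `_500` and `_1280` size suffixes and the `enlarge_images` helper are assumptions made up for illustration; check how the image URLs in your own feed are actually named.

```python
import re

# Assumption: small pictures end in "_500.<ext>" and large ones in "_1280.<ext>".
# The pattern only touches the src attribute of <img> tags.
SMALL_IMG = re.compile(r'(<img\b[^>]*\bsrc="[^"]*)_500(\.(?:jpg|png|gif)")')


def enlarge_images(html: str) -> str:
    """Point every matching <img> src at the large variant of the picture."""
    return SMALL_IMG.sub(r"\g<1>_1280\g<2>", html)


sample = '<p>Hi</p><img src="https://example.com/pic_500.jpg" alt="x">'
print(enlarge_images(sample))
```

Images whose URLs do not match the assumed suffix are left untouched.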

Building a pipeline

We register with Yahoo (if necessary), go to Pipes, and open the constructor.

Screenshot 1 Constructor

On the left is a list of blocks, in the center is the constructor itself, and at the bottom is a debugger. Each block comes with a description and a usage example. There are many blocks to choose from, but we need Fetch Feed.

Add it and, for example, give it the feeds of the Windows and Laptops hubs from GT and the Development hub from Habr.

Screenshot 2 Fetch Feed
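For readers who want to see what a Fetch Feed step amounts to, here is a rough Python sketch that merges the items of several RSS 2.0 documents. The two inline feeds and the function names are made up for illustration; in practice the XML would be downloaded from the hub feed URLs, for example with `urllib.request.urlopen`.

```python
import xml.etree.ElementTree as ET

# Made-up sample feeds; real ones would be downloaded from the hub URLs.
FEED_A = """<rss version="2.0"><channel><title>Hub A</title>
<item><title>Post one</title><link>https://example.com/1</link>
<pubDate>Mon, 06 Jul 2015 10:00:00 GMT</pubDate></item>
<item><title>Post two</title><link>https://example.com/2</link>
<pubDate>Tue, 07 Jul 2015 09:30:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Hub B</title>
<item><title>Post one</title><link>https://example.com/1</link>
<pubDate>Mon, 06 Jul 2015 10:00:00 GMT</pubDate></item>
</channel></rss>"""


def parse_items(rss_xml: str) -> list[dict]:
    """Extract title, link and pubDate from every <item> of an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "pubDate": item.findtext("pubDate", default=""),
        }
        for item in root.iter("item")
    ]


def fetch_feeds(documents: list[str]) -> list[dict]:
    """Concatenate the items of several feeds into one list."""
    items = []
    for doc in documents:
        items.extend(parse_items(doc))
    return items


items = fetch_feeds([FEED_A, FEED_B])
print(len(items), "items fetched")
```

Note that "Post one" appears in both sample feeds, which is exactly the duplicate problem the rest of the pipe deals with.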

The debugger shows that the data is coming in. Now you can sort it by date (Sort block).

Screenshot 3 Sort
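The sorting step can be sketched in a few lines of Python. This assumes each item carries an RFC 822 `pubDate` string with a time zone, as RSS 2.0 items normally do; the sample items and the newest-first default are my own choices, not something the Sort block dictates.

```python
from email.utils import parsedate_to_datetime


def sort_by_date(items, newest_first=True):
    """Order feed items by their RFC 822 pubDate, honoring time zone offsets."""
    return sorted(
        items,
        key=lambda item: parsedate_to_datetime(item["pubDate"]),
        reverse=newest_first,
    )


# Made-up items; real ones would come from the fetched feeds.
items = [
    {"title": "A", "pubDate": "Mon, 06 Jul 2015 10:00:00 GMT"},
    {"title": "B", "pubDate": "Tue, 07 Jul 2015 09:30:00 +0300"},
    {"title": "C", "pubDate": "Mon, 06 Jul 2015 12:00:00 +0300"},
]
print([item["title"] for item in sort_by_date(items)])
```

Parsing the dates rather than comparing the raw strings matters here: item C looks later than A on the clock, but its +0300 offset makes it an hour earlier.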

After sorting, it becomes clear that posts from the Windows and Laptops hubs repeat quite often. This is easily solved by filtering on titles (Unique block).

Screenshot 4 Unique
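Filtering on titles boils down to keeping the first item seen for each title. Here is a minimal sketch of that idea; trimming whitespace and ignoring case before comparing is an assumption of mine and may be looser than what the Unique block itself does.

```python
def unique_by_title(items):
    """Keep only the first item for each title, preserving the input order."""
    seen = set()
    result = []
    for item in items:
        # Assumption: titles differing only in case or surrounding spaces are duplicates.
        key = item["title"].strip().lower()
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


# Made-up items standing in for posts cross-listed in two hubs.
items = [
    {"title": "Windows 10 released", "hub": "Windows"},
    {"title": "windows 10 released ", "hub": "Laptops"},
    {"title": "New laptop review", "hub": "Laptops"},
]
print([item["hub"] for item in unique_by_title(items)])
```

Because the first occurrence wins, running this after the date sort keeps whichever copy sorted first.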

We connect the Unique block to the pipe output, save, and the pipe is ready to run.

Screenshot 5 Finish
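The last step, handing the result back as RSS, could look roughly like this in Python. The channel title, link, and description are placeholders I made up, and the item fields are limited to the three used in this sketch; a real feed may carry more.

```python
import xml.etree.ElementTree as ET


def to_rss(items, title="Merged feed", link="https://example.com/"):
    """Serialize a list of item dicts into a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Deduplicated items from several feeds"
    for item in items:
        node = ET.SubElement(channel, "item")
        for field in ("title", "link", "pubDate"):
            ET.SubElement(node, field).text = item.get(field, "")
    return ET.tostring(rss, encoding="unicode")


# Made-up item; in the full pipe this would be the filtered, sorted list.
items = [
    {
        "title": "Post one",
        "link": "https://example.com/1",
        "pubDate": "Mon, 06 Jul 2015 10:00:00 GMT",
    }
]
print(to_rss(items))
```

Any feed reader that accepts RSS 2.0 should be able to subscribe to a document shaped like this.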

I hope you find this recipe useful.

Yahoo Pipes
Demo Example
