Re: [Feed2imap] Dupes

Posted by Lucas Nussbaum on November 30, 2010 - 22:16:
On 30/11/10 at 21:14 +0100, Michael Welle wrote:
> > To do that, feed2imap hashes the whole item, so changes in the item's
> > body trigger an update of the email (not a duplicate: feed2imap replaces
> > the current email with the new version). Some feeds do stupid things, so
> > there's a (per-feed) option to change that: ignore-hash.
> The hash calculation explains the observed behaviour pretty well. What
> are the side effects of ignoring the hash? Does it mean that every
> item is regarded as new?

No, it means that all updates to existing items are ignored.
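For context, a per-feed entry with that option might look like the sketch below; the feed name, URL, and target path are hypothetical, only the ignore-hash key is from the discussion above:

```yaml
feeds:
  # hypothetical feed entry; only ignore-hash is the option discussed here
  - name: example
    url: http://example.org/feed.xml
    target: maildir:///home/user/Maildir/.feeds.example
    ignore-hash: true
```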

Another way to avoid that issue is to filter the feeds to remove the
content that changes constantly. There are two ways to do that:
 - the execurl option specifies a command that outputs the feed content
   on stdout, so you can run wget -O - http://foo | grep -v bar,
 - the filter option specifies a command that will receive the
   downloaded feed on stdin, and output the modified feed on stdout.
   For example, my slashdot definition is:
      name: "Slashdot:"
      target: maildir:/home/lucas/Maildir/.z.newsinfo.slashdot
      filter: "ruby -p -e '$_ = $_.gsub(/\\/Slashdot\//i, 
   This is because the capitalization of "Slashdot" in the slashdot feed
   changes constantly (or used to, at some point).
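To illustrate what such a filter command does, here is a small ruby sketch of the same idea: the filter receives the downloaded feed on stdin and must print the modified feed on stdout. The method name and the sample normalization are hypothetical, modeled on the slashdot example above:

```ruby
# Hypothetical filter logic for feed2imap's filter option: normalize the
# capitalization of "/Slashdot/" path segments so the item hash stays stable.
def normalize_slashdot(line)
  # case-insensitive match on the path segment, replaced with lowercase
  line.gsub(%r{/Slashdot/}i, "/slashdot/")
end

# As a filter command you would pipe every line of the feed through it:
#   STDIN.each_line { |line| puts normalize_slashdot(line) }
```

Anything the filter strips or rewrites disappears from both the hash and the generated email, which is why this approach only works for content you do not want to keep.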
> At first glance, the filter looks a little bit like the proposed per-feed
> comparison function. But I guess the output of the filter is not
> used only as an input for the hashing function? So I can't strip
> everything except the guid of a feed item and expect that feed2imap
> still works as expected?

No, it's also used to generate the content of the email.

I've just added an option that does "do not re-upload posts that were
already deleted by the user even if their hash changes", see

- Lucas


Powered by MHonArc, Updated Tue Nov 30 22:20:14 2010