From e6a7285b64fc5030fc759ba1bafc4071034b83fc Mon Sep 17 00:00:00 2001
From: ewen <ewen@web>
Date: Tue, 21 Mar 2017 08:48:05 +0000
Subject: Added a comment: Track GUIDs to avoid duplicate downloads

---
 .../comment_25_211b8f829070021e977c6de9eebf829f._comment       | 10 ++++++++++
 1 file changed, 10 insertions(+)
 create mode 100644 doc/tips/downloading_podcasts/comment_25_211b8f829070021e977c6de9eebf829f._comment

diff --git a/doc/tips/downloading_podcasts/comment_25_211b8f829070021e977c6de9eebf829f._comment b/doc/tips/downloading_podcasts/comment_25_211b8f829070021e977c6de9eebf829f._comment
new file mode 100644
index 000000000..7588598d6
--- /dev/null
+++ b/doc/tips/downloading_podcasts/comment_25_211b8f829070021e977c6de9eebf829f._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="ewen"
+ avatar="http://cdn.libravatar.org/avatar/605b2981cb52b4af268455dee7a4f64e"
+ subject="Track GUIDs to avoid duplicate downloads"
+ date="2017-03-21T08:48:04Z"
+ content="""
+While tracking podcast media URLs *usually* works to avoid duplicate downloads, when it fails it tends to fail spectacularly.  In particular, if a podcast feed decides to update *all* the URLs (for old and new episodes) to use a different URL scheme, then suddenly that looks like a huge volume of new URLs, and all of them get downloaded -- even if the content has actually already been retrieved from a different URL.  For instance, the `acast.com` service has changed their URL scheme a couple of times in the last 1-2 years, rewriting all the historical URLs, so I have three copies of many of the episodes of podcasts on their service :-(  (Many downloaded; some skipped once I caught the bulk download and stopped it/reran with `--fast` or `--relaxed` to make placeholders instead.  `acast.com` seem to have managed to cause even more confusion by rewriting many of the older `mp3` files with new `id3)
+
+Some (all?) podcast feeds also have a `guid` field, which specifies what should be a unique per-episode
+"""]]
-- 
cgit v1.2.3