TumblThree - A Tumblr Backup Application

TumblThree is the code rewrite of TumblTwo, a free and open source Tumblr blog backup application, using C# with WPF and the MVVM pattern. It uses the Win Application Framework (WAF). It downloads photo, video, audio and text posts from a given tumblr blog.

Screenshots:

[Screenshot: TumblThree - A Tumblr Backup Application]

Features:

  • Source code on GitHub (written in C# using WPF and MVVM).
  • Multiple concurrent downloads of a single blog.
  • Multiple concurrent downloads of different blogs.
  • Internationalization support (currently available: en, zh, ru, de, fr).
  • A download queue.
  • Autosave of the queuelist.
  • Save, clear and restore the queuelist.
  • A clipboard monitor that detects blogname.tumblr.com urls in the clipboard (copy and paste) and automatically adds the blog to the bloglist.
  • A settings panel (change download location, turn preview off/on, define number of concurrent downloads, set the imagesize of downloaded pictures, set download defaults, enable portable mode, etc.).
  • Uses Windows proxy settings.
  • A bandwidth throttler.
  • An option to download a URL list instead of the actual files.
  • Set a start time for an automatic download (e.g. at night).
  • An option to skip the download of a file if it has already been downloaded before in any currently added blog.
  • Uses SSL connections.
  • Preview of photos & videos.
  • Taskbar buttons and key bindings.

Blog backup/download:

  • Download of photo, video (only tumblr.com hosted), text, audio, quote, conversation, link and question posts.
  • Download meta information for photo, video and audio posts.
  • Downloads inlined photos and videos (e.g. photos embedded in question&answer posts).
  • Download of _raw image files (original/higher resolution pictures).
  • Support for downloading Imgur, Gfycat, Webmshare, Mixtape, Lolisafe, Uguu, Catbox and SafeMoe linked files in tumblr posts.
  • Download of safe mode/NSFW blogs.
  • Can download only the blog's original content and skip reblogged posts.
  • Can download only tagged posts.
  • Can download only specific blog pages instead of the whole blog.
  • Can download blog posts from a defined time span.
  • Can download hidden blogs (login required / dashboard blogs).
  • Can download password-protected blogs (non-hidden blogs only).

Liked/by backup/download:

  • A downloader for downloading "liked by" photos and videos instead of a tumblr blog (e.g. https://www.tumblr.com/liked/by/wallpaperfx/) (login required).
  • Download of _raw image files (original/higher resolution pictures).
  • Can download posts from a defined time span.

Tumblr search backup/download:

  • A downloader for downloading photos and videos from the tumblr search (e.g. http://www.tumblr.com/search/my+keywords).
  • Download of _raw image files (original/higher resolution pictures).
  • Can download only specific blog pages instead of the whole blog.

Tumblr tag search backup/download:

  • A downloader for downloading photos and videos from the tumblr tag search (e.g. http://www.tumblr.com/tagged/my+keywords) (login required).
  • Download of _raw image files (original/higher resolution pictures).
  • Can download posts from a defined time span.

Program Usage:

  • Extract the .zip file and run the application by double clicking TumblThree.exe.
  • Copy the url of any tumblr.com blog you want to back up into the textbox at the bottom left. Afterwards, click on 'Add Blog' to the right of it.
  • Alternatively, if you copy (ctrl-c) a tumblr.com blog url from the address bar or a text file, TumblThree's clipboard monitor will detect it and automatically add the blog.
  • To start the download process, click on 'Crawl'. The application will regularly check for (new) blogs in the queue and process them until you press 'Stop'. You can either add blogs to the queue first (via 'Add to Queue', double click or drag'n'drop) and then click 'Crawl', or start the download process first and add blogs to the queue afterwards.
  • A light blue bar to the left of a blog in the queue indicates that the blog is actively downloading.
  • The blog manager on the left side also indicates the state of each blog. A red background shows an offline blog, a green background an actively crawling blog and a purple background an enqueued blog.
  • You can change the download location, the number of concurrent connections, the default backup settings for each newly added blog and various other settings in 'Settings'.
  • In the Details window you can view statistics of your blog and set blog specific options. Here you can choose which post types (photo, video, audio, text, conversation, quote, link) to download.
  • To download only tagged posts, follow these steps:
    1. Add the blog url.
    2. Open the blog in the details tab and enter the tags in the Tags textbox as a comma separated list without the leading hash (#) sign. For example, great big car,bears matches posts that are tagged with great big car, with bears, or with both.
  • To download password protected blogs, follow these steps:
    1. Add the blog url.
    2. Open the blog in the details tab and enter the password in the Password textbox.
  • To download hidden blogs (login required blogs), follow these steps:
    1. Go to Settings, click on the Connection tab and fill in your tumblr email address (login) and password, then click the Authenticate button. If the login was successful, the label will change and display your email address. The email address and password are not stored locally on disk, but cookies are generated and saved in %LOCALAPPDATA%\TumblThree in json format.
    2. Add the blog url.
  • To download liked photos and videos, follow these steps:
    1. Go to Settings, click on the Connection tab and fill in your tumblr email address (login) and password, then click the Authenticate button. If the login was successful, the label will change and display your email address. The email address and password are not stored locally on disk, but cookies are generated and saved in %LOCALAPPDATA%\TumblThree in json format.
    2. Add the blog url including the liked/by string in the url (e.g. https://www.tumblr.com/liked/by/wallpaperfx/).
    3. For downloading your own likes, make sure you've (temporarily) enabled the following options in your blogs settings (i.e. https://www.tumblr.com/settings/blog/yourblogname):
      1. Likes -> Share posts you like (to enable the publicly visible liked/by page)
      2. Visibility -> blog is explicit (to see/download NSFW likes)
  • To download photos and videos from the tumblr search, follow this step:
    1. Add the search url including your key words separated by plus signs (+) in the url (e.g. https://www.tumblr.com/search/my+special+tags).
  • To download photos and videos from the tumblr tag search, follow these steps:
    1. Go to Settings, click on the Connection tab and fill in your tumblr email address (login) and password, then click the Authenticate button. If the login was successful, the label will change and display your email address. The email address and password are not stored locally on disk, but cookies are generated and saved in %LOCALAPPDATA%\TumblThree in json format.
    2. Add the search url including your tags separated by plus signs (+) in the url (e.g. https://www.tumblr.com/tagged/my+special+tags).
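
The url shapes used in the steps above (blog, liked/by, search, tagged, and the dashboard url for hidden blogs) can be told apart by host and path. The following Python sketch only illustrates that idea; it is not TumblThree's actual (C#) detection code, and the function name classify_tumblr_url and the returned labels are hypothetical.

```python
import re
from urllib.parse import urlparse


def classify_tumblr_url(url: str) -> str:
    """Return a (hypothetical) label for the kind of Tumblr url given."""
    # Accept urls pasted without a protocol, e.g. "wallpaperfx.tumblr.com".
    if "://" not in url:
        url = "https://" + url
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path

    if host in ("tumblr.com", "www.tumblr.com"):
        if path.startswith("/liked/by/"):
            return "liked-by"
        if path.startswith("/search/"):
            return "search"
        if path.startswith("/tagged/"):
            return "tag-search"
        if path.startswith("/dashboard/blog/"):
            return "hidden-blog"
        return "unknown"

    # Anything of the form blogname.tumblr.com is treated as a regular blog.
    if re.fullmatch(r"[a-z0-9-]+\.tumblr\.com", host):
        return "blog"
    return "unknown"
```

For example, classify_tumblr_url("www.tumblr.com/search/cars") yields "search", and classify_tumblr_url("wallpaperfx.tumblr.com") yields "blog".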

Key Mappings:

  • Currently mapped keys:
    • double click on a blog adds it to the queue
    • drag and drop of blogs from the manager (left side) to the queue
    • space -- start crawl
    • ctrl-space -- pause crawl
    • shift-space -- stop crawl
    • del -- remove blog from queuelist
    • shift-del -- remove blog from blogmanager
    • ctrl-shift-g -- manually trigger the garbage collection

Getting Started:

The default settings should cover most users. You should only have to change the download location and the kind of posts you want to download. For this, in the Settings (click on the Settings button in the lower panel of the main user interface) you might want to change:

  • General -> Download location: Specifies where to download the files. The default is a folder named Blogs next to TumblThree.exe.
  • Blog -> Settings applied to each blog upon addition:
    • Here you can set which posts newly added blogs will download by default. To change what each blog downloads, click on a blog in the main interface, select the Details tab on the right and change the settings. This separation lets you download different kinds of posts for different blogs. You can change the download settings for multiple existing blogs by selecting them with shift+left click for a range or ctrl-a for all of them.
    • Note: You might want to always select:
      • Download Reblogged posts: Downloads reblogs, not just original content of the blog author.

Settings you might want to change if the download speed is not satisfactory:

  • Connection -> Concurrent connections: Specifies the number of connections used for downloading posts. The number is shared between all actively downloading blogs.
  • Connection -> Concurrent video connections: Specifies the number of connections used for downloading tumblr video posts. The vt.tumblr.com host regularly closes connections if the number is too high. Thus, the maximum number of vt.tumblr.com connections can be specified here independently.
  • Connection -> Concurrent blogs: Number of blogs to download in parallel.

Most likely you don't have to change any of the other connection settings. In particular, settings you should never change, unless you're sure you know what you are doing:

  • Connection -> Limit Tumblr Api Connections: Leave this checkbox checked and do not change the corresponding values of 90 connections per 60 seconds. If you still change them, you might end up with offline blogs or missing downloads.
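
To make the limit of 90 connections per 60 seconds more concrete, here is a minimal sliding-window limiter in Python. It is only a sketch of the general technique, not TumblThree's implementation; the class name, method names and the injectable clock are assumptions made for illustration.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any window of `period` seconds."""

    def __init__(self, max_calls=90, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.calls = deque()  # timestamps of the most recent calls

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if none)."""
        now = self.clock()
        # Forget calls that have already left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        # The window is full: wait until the oldest call expires.
        return self.period - (now - self.calls[0])

    def record(self) -> None:
        """Register that a call is being made now."""
        self.calls.append(self.clock())
```

A caller would sleep for wait_time() seconds, then call record() and issue the request. Raising the limits means more requests reach the server per minute, which is why the text above warns about offline blogs and missing downloads.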

Further Insights:

  • Note: All the following files are stored in json format and can be opened in any editor.
  • Application settings are stored in C:\Users\Username\AppData\Local\TumblThree\.
  • You can use the portable mode (settings->general) to store the application settings in the same folder as the executable.
  • For each blog there is also a database (serialized class) file in the Index folder of the download location, named blogname.tumblr. Blog-related information is stored here, such as which files have been downloaded, the url of the blog and when it was added. This allows you to move your downloaded files (photos, videos, audio files) to a different location without interfering with the download process.
  • Some settings aren't hooked up to the graphical user interface. You can view all TumblThree settings by opening settings.json, located in C:\Users\Username\AppData\Local\TumblThree\, in any editor. Their names should be self-explanatory. Some notable settings to further fine-tune the application include:
    • BufferSize: Sets the buffer size for downloading binary files (photos, videos) in multiples of 4KB. The default is 2MB, thus BufferSize has a value of 512. Increasing this value reduces disk fragmentation, as more of the file is kept in memory before it is written to disk, but increases memory usage.
    • MaxNumberOfRetries: Sets the maximum number of retries if a tumblr server forcefully closes the connection. This might regularly happen on the tumblr video host (vt.tumblr.com) if too many connections were opened in parallel. After the limit is exhausted, the file is left truncated and is not registered as a successful download. Thus, the download can be resumed in the next crawl.
    • TumblrHosts: Contains a list of hosts that are tried for downloading _raw photos if the photo size is set to raw. If none of the hosts has the _raw version, the originally scanned host is tried with the next lower resolution (1280).
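
The BufferSize arithmetic can be checked with a few lines of Python. This is a sketch that assumes BufferSize is a top-level key in settings.json; the helper name buffer_bytes is hypothetical.

```python
import json

CHUNK_BYTES = 4 * 1024  # BufferSize is counted in multiples of 4KB


def buffer_bytes(settings_json: str) -> int:
    """Return the download buffer size in bytes for a settings.json text.

    Assumes "BufferSize" is a top-level key of the JSON document.
    """
    settings = json.loads(settings_json)
    return settings["BufferSize"] * CHUNK_BYTES


# The default value of 512 corresponds to 512 * 4KB = 2MB.
default_size = buffer_bytes('{"BufferSize": 512}')
```

With the default of 512, default_size is 2097152 bytes, i.e. 2MB.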

Changelog:

2018-07-05:

  • Implements the Tumblr login process and cookie handling in code instead of relying on the Internet Explorer for the Tumblr login process.

2018-06-09:

  • Fixes hidden Tumblr blog download problems caused by the new Tumblr ToS.

2018-05-20:

  • Programmatically agrees to new ToS and GDPR.
  • Implements SVC authentication changes. The SVC service is used to display the dashboard blogs (i.e. hidden tumblr blogs). Changes in this internal Tumblr api blocked TumblThree's access.
  • Saves the last post id in successful hidden tumblr downloads.
  • Improves the text parser of the tumblr api and tumblr svc data models. Separated the slug from the url as the data models are inconsistent. Separated the photoset urls from the photo urls. Moved the date information into a separate column.
  • Minor text changes of some user interface elements.

2018-04-18:

  • Updates the tumblr blog crawler and the hidden tumblr datamodel to reflect tumblr api changes that break blog download of previous TumblThree versions.

2018-02-28:

  • Can download only specific pages of hidden Tumblr blogs and in the tumblr search.
  • Improves the proxy settings. TumblThree now uses the default Windows (Internet Explorer) settings if not overridden within TumblThree.
  • Changes the behavior of the timeout value (Settings->Connection->Timeout). The timeout now applies to each 4KB file chunk instead of the whole file download, so it should better detect a stalled download or a dropped connection without canceling active downloads of larger files (e.g. videos).
  • Changes default timeout value (for new users) from 600s to 30s.
  • Fixes possible download of the same photo but with different resolutions. This happened if the _raw file download was interrupted (the timeout hit), then the same photo was queued for download with the _1280 resolution. If the blog was then subsequently queued again, the _raw file was downloaded next to the _1280 file.
  • Fixes reblog/original post detection in the tumblr hidden crawler.
  • Fixes the "check blog status during startup" option.
  • Fixes download of password protected tumblr blogs.
  • Adds Mixtape, Lolisafe, Uguu, Catbox and SafeMoe parser (thanks to bun-dev).

2017-12-31:

  • Fixes a bug that released the video connection semaphore too often. That means the slider in the settings for limiting the video downloads didn't work at all. It should properly limit the connections to the vt.tumblr.com host and prevent incomplete video downloads now.
  • Includes a rewrite of the blog detection during blog addition. It should reduce latency if you mass add blogs by copying urls into the clipboard (ctrl-c). Offline blogs aren't added anymore.
  • Notifies the user when a connection timeout has occurred. The message states whether the timeout has occurred during downloading or crawling. If it happened during crawling, you might want to re-queue the blog at some point to grab missing posts. A connection timeout should only happen if your connection is wonky. You can decrease/increase the timeout in the settings (settings->connection).
  • You can now specify in the Details panel for each blog where its files should be downloaded. If the text box is empty, the files are downloaded as in previous releases to the folder specified in the global download location (settings->general), plus the blog's name.
  • Imgur.com linked albums in tumblr posts are now entirely downloaded if enabled (details panel->external->download imgur). Previously, only directly linked images were detected.
  • Adds an option to load all blog databases into memory and compare each to-download binary file to all databases across TumblThree before downloading. If the file has already been downloaded in any blog before, the file is skipped and will not be counted as downloaded. You can enable this in the settings (settings->global).
  • Hidden tumblr blogs can now be added using the dashboard url (i.e. https://www.tumblr.com/dashboard/blog/blogtobackup).
  • All blog types can now be added without the protocol prefix (i.e. wallpaperfx.tumblr.com, www.tumblr.com/search/cars).
  • Adds an option to enable a confirmation dialog before removing blogs (#186, #130, #98). It's off by default.
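
The cross-blog duplicate check described in this release can be pictured roughly as follows. This Python sketch assumes that each blog database boils down to a set of already downloaded file names and that files are compared by name only; both the data layout and the function names are hypothetical and not taken from TumblThree's code.

```python
import posixpath
from urllib.parse import urlparse


def filename_from_url(url: str) -> str:
    """Return the last path segment of a url, ignoring host and query."""
    return posixpath.basename(urlparse(url).path)


def already_downloaded(url: str, blog_indexes: dict) -> bool:
    """True if any blog's index already contains the file behind `url`.

    `blog_indexes` maps a blog name to the set of file names it has
    downloaded (an assumed, simplified stand-in for the real databases).
    """
    name = filename_from_url(url)
    return any(name in files for files in blog_indexes.values())
```

Because only the file name is compared, the same file offered by a different host or by a different blog is skipped.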

2017-11-17:

  • Adds support for downloading Imgur.com, Gfycat.com and Webmshare.com linked files in tumblr posts.
  • Improves downloading of tumblr liked/by photos and videos.

2017-10-20:

  • Restores bandwidth limiter functionality.

2017-10-13:

  • Changes the default _raw photo host.

2017-10-09:

  • Fixes crawler stop in hidden tumblr blog downloads.
  • Adds options to set the default blog settings for the download from time, download to time and tags in the settings menu.
  • Adds some (ar, el, es, fa, fi, he, hi, it, ja, ko, no, pa, pl, pt, th, tr and vi) google translate translations.

2017-09-08:

  • Can download password protected blogs of non-hidden blogs.
  • Minor UI updates.

2017-08-22:

2017-08-21:

  • French, Spanish and simplified Chinese translations.
  • Removes user interface lag during blog addition.
  • The buffer size for downloading binary files can now be set in the settings.json in multiples of 4KB. The variable is called BufferSize. The new default is 2MB, thus BufferSize has a value of 512. Previously it was set to 4KB, but apparently Windows does not do any useful caching on NTFS if multiple writes are concurrent and async. Thus, this should reduce disk fragmentation.
  • Uses .NET Framework 4.6 now as it should be available for all supported Windows versions (Windows Vista and above).
  • Improved the selection handling in the details panel. If multiple blogs are selected, old values are now kept if they are the same for all blogs and changes are immediately reflected.
  • Audio file download support for tumblr and hidden tumblr blogs.
  • More code refactoring.

2017-07-03:

  • Can download hidden (login required/dash board) blogs.

2017-06-30:

  • Improved performance and bugfixes.

2017-06-20:

  • Downloads high resolution (_raw) images.
  • Updated translations (German and Russian).
  • Applies changed settings immediately.

2017-06-04:

  • Sets the file's modified date (as shown in the Explorer) to the post's time.
  • Can download single blog pages or ranges of blog pages.
  • Full screen media preview.
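
Setting a file's modified date to the time of its post takes a single call in most languages. A minimal Python sketch, assuming the post time is available as a Unix timestamp (the function name is hypothetical):

```python
import os


def set_modified_time(path: str, post_timestamp: float) -> None:
    """Set the file's modification time to the post's Unix timestamp."""
    # Keep the current access time and replace only the modification time.
    access_time = os.stat(path).st_atime
    os.utime(path, (access_time, post_timestamp))
```

Sorting the download folder by date modified then orders the files by when they were posted rather than when they were downloaded.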

2017-05-20:

  • Option to skip reblogged posts.
  • Improves detection of inlined photos and videos in text posts (e.g. in answer posts).

2017-05-14:

  • Portable mode.
  • Downloads liked photos and videos.

2017-04-18:

  • Code refactoring.
  • Uses async/await in most of the code instead of tasks from the threadpool.
  • Uses a consumer producer pattern for grabbing and downloading as the Tumblr api v1 is now rate limited.
  • Downloads are now resumable.
  • Data files are now saved as json instead of binary.
  • Reduced memory usage by separating out the downloaded file list and loading it only when needed.
  • Improves ui responsiveness.
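
The producer/consumer pattern mentioned in this release separates finding urls (which is rate limited) from downloading them. The following Python sketch shows the general shape with asyncio; it is not TumblThree's C# code, and the function names, queue size and stop marker are assumptions made for illustration.

```python
import asyncio


async def crawl(post_urls, queue, n_consumers):
    """Producer: queue each discovered url, then one stop marker per consumer."""
    for url in post_urls:
        await queue.put(url)
    for _ in range(n_consumers):
        await queue.put(None)


async def download(queue, done):
    """Consumer: take urls from the queue until a stop marker arrives."""
    while True:
        url = await queue.get()
        if url is None:
            break
        await asyncio.sleep(0)  # placeholder for the actual download
        done.append(url)


async def run(post_urls, n_consumers=2):
    # A bounded queue keeps the crawler from running far ahead of the downloads.
    queue = asyncio.Queue(maxsize=10)
    done = []
    await asyncio.gather(
        crawl(post_urls, queue, n_consumers),
        *(download(queue, done) for _ in range(n_consumers)),
    )
    return done
```

Calling asyncio.run(run(urls, 3)) processes every url exactly once, with the crawler and three downloaders running concurrently.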

2017-01-08:

  • Improves the speed of the network code.
  • Adds an option to use a http proxy.
  • Downloads inline images of tumblr posts.
  • Added Russian translation.

2016-12-13:

  • Improves the ui scaling of the main window for smaller resolutions.
  • Prevents crawling of offline blogs.
  • If the same blog is in the queue multiple times and is already active, any other free crawler task will skip and remove the duplicate entry and proceed to the next inactive blog in the queue.
  • Improved German translation.

2016-12-10:

  • The check for already downloaded files is now independent of the actual host and based entirely on the filename. It looks like the host/mirror does actually vary, which would result in the file being downloaded again since its url changed.
  • Add scrollbars to the settings window if the controls do not fit.
  • Safely replaces blog indexes. If there is an error (e.g. no disk space left) during the update of the index file, the old state should not be corrupted anymore.
  • Changes some colors and adds an alternate color for the blog manager.
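
Safely replacing an index file is commonly done by writing to a temporary file first and swapping it in only after the write has succeeded. The Python sketch below shows that general technique; whether TumblThree does it exactly this way is not stated above, and the function name and the use of JSON here are assumptions.

```python
import json
import os
import tempfile


def save_index_atomically(path: str, index: dict) -> None:
    """Write `index` to `path` without ever leaving a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same folder so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(index, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # swaps the new file in as one step
    except BaseException:
        # Writing failed (e.g. disk full): drop the partial file and
        # leave the old index untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the write fails partway, the previous index file is still intact and can be used on the next start.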

2016-11-23:

  • Fixes application crashes which occurred by adding tumblr blogs without title or description.
  • Decreases determination time of already downloaded files for large blogs (>100,000 posts) by at least three orders of magnitude.

2016-11-22:

  • Creates more meta information (post id, reblog key, timestamp, tags, slug, title) of the posts, including image, video and audio types.
  • Fixes the progress calculation by adding the found duplicates to the progress. Also states them in the details window.
  • Fixes a locking issue for the meta files (*.txt) which resulted in incomplete downloads.
  • Updates the details and settings view for a better understanding on how to use the application.

2016-11-20:

  • Fixes proper counting of downloaded files.
  • Fully implements the details window (context menus, etc.).

2016-11-18:

  • Fixes the initial automatic queue restore function.
  • Fixes the autodownload function.

2016-11-16:

  • Picture and video preview in the details window.
  • Allows the download of text, audio, quote, conversation and link type posts.
  • Download of text, audio, quote, conversation and link posts and of .gif images is now configured per blog instead of globally and can be turned on/off in the details view. The settings in the settings window are used as a template for newly added blogs.
  • Modified .tumblr index files are now always saved upon application exit regardless of the crawler's state. Previously, if the application was closed during an active crawl, the index wasn't updated.
  • Inlined the WAF code under lib for easier project setup for newcomers who want to contribute code.
  • Bugfixes, UI and memory enhancements.

2016-10-15:

  • Bandwidth throttling.
  • Connection timeout settings.
  • Auto queue and start download function.
  • Saves states of the UI (column size and order).
  • Download of hidden blogs.
  • Fixes proper saving of the ratings and tags.

2016-06-11:

  • Added German translation.

2016-06-10:

  • Support for tumblr.com hosted videos. Check the settings window to enable video download (default: off).

2016-06-08:

  • Tag crawling now properly working. Also it's case-insensitive now.
  • Fixed a crash caused by blogs with a zero image count in the queue list (e.g. the blog is offline, or the tag search didn't yield any images).
  • Fixed randomly occurring crash in the clipboard monitor.
  • Changed icons (requested by the TumblOne creator).

2016-04-12:

  • Now with progress output in the Queue tab (during url crawling for imageurls -- the number of posts evaluated; during downloading -- the current image url).
  • Added missing resume button in the taskbar control.

2016-04-11:

  • Support for urls starting with https:
  • Fixes application crashes upon pressing the stop button due to improper exception handling.
  • Now always saves the index file. Previously the application would exit while crawling processes were still active, without waiting for them to finish and save their state. Now there is a grace period for the tasks to finish. The same was true if the crawl was paused and the application was then closed.

Download:

Comments

zab
Sun, 14/01/2018 - 11:26

Thanks for the suggestion.

That's actually the next bigger thing I'll personally add if no one is going to do it before me. I'm just lacking time right now to code at TumblThree, but you can see my suggestion post on how to possibly implement it here.

Mon, 22/01/2018 - 21:11

I love this application. Soon as I get it down good you are going to see some money come your way.

I successfully downloaded all images today but I only got 2 text postings. I always put text with my images when I post. But when I look at what was downloaded all I see are the images. Is there any way to get the text which was posted with the images to back up using TumblThree?

Thanks again for a great application. I searched high and low before finding this. Long time coming.!!

Mickey

zab
Mon, 22/01/2018 - 21:39

Yes, it should be possible.

If you enable the checkbox "Download image meta" then you'll get a file called images.txt with details from your posted photo posts. You can select between two modes by changing the "metadata format" combobox. The option text is probably what you want. The other option is more detailed but saves the posts information in json format that can be used for further automated processing. You can simply enable the checkbox, requeue your blog and TumblThree will only update the images.txt but not redownload your already downloaded content, thus it should be relatively quick.

Alternatively you can enable the checkbox "Dump crawler data". It will simply dump everything TumblThree sees as json, thus also detailed information about photo posts, but that might be too detailed/large/noisy if you don't plan any further processing.

The above mentioned text output could potentially be extended or updated though, but someone has to do it.

Daniel (not verified)
Fri, 26/01/2018 - 10:46

I love your application, I use it a lot but I have a problem, I would like to download the post with multiple photos in a separate folder.

I leave you with an image that I hope will help you understand
https://prnt.sc/i5titw

I hope you understand my idea

Greg (not verified)
Sat, 24/02/2018 - 03:22

Why does TumblrThree keep "evaluating" blog files instead of actually downloading them? It has only actually downloaded images in one session and never since - just keeps evaluating each blog! How do I stop it from "evaluating" and make it actually download the images?

zab
Sat, 24/02/2018 - 07:51

Just read the manual I've written on the main page. Untick the "force rescan"-option or be more specific to when this happens. If you re-queue an already downloaded blog?

Greg (not verified)
Fri, 09/03/2018 - 18:10

I have read the "manual" and it doesn't answer my question. How do I stop it "evaluating" and make it actually download the files? Where is the box to untick the "force rescan" feature?
Why does this program download files from one tumblr blog and only "evaluate" others (and they are not blogs that have been downloaded previously)?
This looks like it could be a very useful program if only it would work.

zab
Fri, 09/03/2018 - 18:18

What blog does --- in your opinion --- only evaluate and not download? If you don't state the name/url, there is no way to help you (check if it does indeed not work). Hence, I'm not sure why you are even posting here.

Everything else is already said. Read the text. Force rescan is three times mentioned there, and also how to properly use the application.

Greg (not verified)
Sat, 10/03/2018 - 10:57

1. This is the only mention of "force rescan" and it doesn't say where to find the box to tick or untick :

"You might want to always select:
Download Reblogged posts: Downloads reblogs, not just original content of the blog author.
Force Rescan: Force Rescan always crawls the whole blog and not just new posts which were added after the last successful crawl. The statistics of a blog (total posts, number of posts, number of duplicates) currently can only be updated if the whole blog is crawled. Thus, disabling this might result in downloading "more" posts than displayed in TumblThree. If you don't care about the displayed blog statistics, turning Force Rescan off will decrease the scanning time since already downloaded posts are skipped in the scanning."

Where is it?

2. Which blogs won't download? - actual figures - 81 out of 93 blogs I have tried.

Maybe I am doing something wrong. Maybe my computer is not saving the files - everytime I start the program, a different set of previously loaded blog addresses appears. The automatic adding of urls on the clipboard has only worked once. I would like to start again - remove the program and load it again but I have spent hours checking and adding the blogs one by one ...

3. How do I switch it from evaluating to actual downloading? It's a simple question.

zab
Sun, 11/03/2018 - 08:12

>3. How do I switch it from evaluating to actual downloading? It's a simple question.

.. That is not simple to answer with no information. If you'd use the default settings, this behavior would not happen. That's why I made them the default. That would also fit your description on why it "worked" in the first session, and then never again. So, post screenshots of your blog settings (i.e. a blog selected that doesn't work with the details tab on the right hand side as seen in the screenshot of TumblThree's description) and connection settings page in the Settings window?

Based on your new description, that under 2., I'm guessing you've not enabled the "download reblogged post"-option. Thus, TumblThree scans the blogs, but since the majority of blogs only contain reblogs from some other blogs, nothing is actually downloaded. TumblThree simply does its job by scanning the blog, but your options are set in the way that nothing is downloaded.

>1.
>Where is it?

Look in the screenshot!

Honestly, how to properly use the application is written in the text (with a screenshot showing the options) and explained in tool tips over the options within the application, exactly so that I don't have to answer the same questions over and over again. And I'm sure I've answered this question (TumblThree does not download, only evaluate) already multiple times. That's why I keep telling you this.

>2.
>I would like to start again - remove the program and load it again [..]

You can do that. The blog settings are saved separately from the application settings. It's written in the text. You can remove the application settings (%LOCALAPPDATA%\TumblThree) while TumblThree is closed.
But you might also have to adjust all your blog settings and add the "download reblogged posts"-option. You can simply do that by selecting all blogs, then changing the settings in the Details panel. The settings will be changed for all blogs. It's also mentioned in the text.

Michael (not verified)
Mon, 12/03/2018 - 18:30

Dear Johan,
tumblr just got banned in my country. how to use tumblthree to download with proxy setting? currently it won't even authenticate and detect even though I've filled the proxy setting?

thanks in advance
Michael

zab
Mon, 12/03/2018 - 18:36

Use the newest version (v1.0.8.44), then you only have to set the proxy settings within Windows. See the release notes, it's even mentioned there.

I've talked to two people from China, and they told me it works in the newest release using the Windows proxy settings only. Thus, I've removed the proxy settings from TumblThree as they are superfluous now.

Michael (not verified)
Tue, 13/03/2018 - 15:57

woww that is soo much easier. I've tinkered with 8.20 without success. Thank you very much :) I appreciate it

Seurat (not verified)
Tue, 20/03/2018 - 23:15

Hi,

First of all, many thanks for all the work that goes into the program.
But where can I find the download for version 1.0.4.31 so I can split my old .tumblr files? Or alternatively, what should I do with the source code, where does it go?

Kind regards

zab
Fri, 06/04/2018 - 08:06

That never really worked 100%, and since it's been a while, I've removed all downloads prior to version v1.0.4.47.

I would definitely switch to the current version, since quite a few bugs have been fixed since then. If all downloaded files are still present, TumblThree won't download them again, only new posts (or _raw photos instead of _1280 photos, if _raw is selected as the photo size).
The only effort would then be to add all blogs again. That's probably easiest with the clipboard monitor.

Geoff (not verified)
Wed, 28/03/2018 - 00:39

Hello, your Authenticate feature apparently relies on the (outmoded) Internet Explorer to view the Tumblr site and execute the account login. Unfortunately, IE 11 is now broken on Tumblr and doesn't show the login fields. It's broken in your authenticate window and in IE itself.

I am using IE 11 version 11.1884.14393.0 (Update Version 11.0.48) running on Windows 10 64-bit version 1607.

Is there a workaround?

zab
Wed, 28/03/2018 - 00:58

> Unfortunately, IE 11 is now broken on Tumblr and doesn't show the login fields. It's broken in your authenticate window and in IE itself.

I've just tested it, no issues. Stock Windows 10 Pro, 1709. Did you test it more than once? Maybe the webpage just didn't completely load for you. Did you modify IE?

Proof (exactly as I've expected it):
Authentication window screenshot

And I doubt that it fails on Windows 10 1607, Windows 8 or Windows 7 either. I'm actually even sure it works, since I tested it on Windows 7 one week ago. The login page is also not a fancy, heavily JavaScript-based web app that requires the newest browsers. And lastly, a whole lot more people would probably complain if the login were broken.

If it still doesn't work for you, maybe you can try to log in to Yahoo in Internet Explorer.

anonymous (not verified)
Sat, 31/03/2018 - 19:54

Actually, small suggestion:

Galleries like that already have a link to a .zip file. Why not allow for the downloading of that zip file when it comes to albums?

zab
Sun, 01/04/2018 - 07:41

Doesn't really make sense.

I've seen that option, but all other files (photos) from the blog aren't zipped, so it's inconsistent. And you'll eventually have to unzip them for viewing anyway, so why download them as a zip in the first place? It would only add even more buttons to the already overfull details tab, because the next person will come around wanting the exact opposite. For sure.

But you can add it (for yourself) though. It's open source after all, and only a few lines of code.

zab
Sat, 31/03/2018 - 11:01

You can already specify a time span in blog downloads.

Otherwise, repeat your suggestion, preferably with more than one sentence.

Wolfgang (not verified)
Thu, 05/04/2018 - 23:10

Hi there!

It's a great program, if only it hadn't been causing some problems lately. For your information: it simply stops on some blogs, although further ones are still processed.
Some blogs certainly contain well over 100 entries (that much is easy to check by hand). In some cases a large number is displayed as well, and TumblThree still stops after, for example, 15. A rescan doesn't change this either.
Blogs that are added and initially show a count lose this entry when you close and reopen the program. Then there is just a blank field. Only by deleting the blog, re-adding it and starting immediately is a download performed ...

Maybe you could please take a look at this, because the idea is fantastic!!

Thanks!!

zab
Thu, 05/04/2018 - 23:20

That's not the case for me, so I can't look into anything.

Is that the case with all blogs? If not, please post the URL.
If you've been using TumblThree for a while, it might help to delete the settings once. TumblThree should be closed before you delete the files under %LOCALAPPDATA%\TumblThree (see also the description above).

Edit: Are you using the newest version?
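
For anyone who would rather not delete the settings outright, here is a minimal sketch that moves the folder aside instead. It assumes the settings live in a `TumblThree` folder under `%LOCALAPPDATA%`, as described above; the function name `move_settings_aside` and the `.bak-<timestamp>` naming are made up for illustration and are not part of TumblThree. Close TumblThree before running it.

```python
import os
import shutil
import time


def move_settings_aside(local_appdata, folder_name="TumblThree"):
    """Rename <local_appdata>/<folder_name> to a timestamped backup folder.

    Returns the backup path, or None if there was no settings folder.
    The helper and the backup naming are illustrative assumptions.
    """
    settings_dir = os.path.join(local_appdata, folder_name)
    if not os.path.isdir(settings_dir):
        return None  # nothing to reset
    backup_dir = settings_dir + ".bak-" + time.strftime("%Y%m%d-%H%M%S")
    shutil.move(settings_dir, backup_dir)
    return backup_dir


# Example call on Windows (uncomment to use):
# print(move_settings_aside(os.environ["LOCALAPPDATA"]))
```

If the reset doesn't help, the backup folder can simply be renamed back to `TumblThree`.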

Jesse (not verified)
Mon, 09/04/2018 - 06:17

I'm sure I'm missing something... but a lot of the blogs I follow will post a photo and then add text/words to that photo. For example, maybe they post a picture of a butterfly and comment along with the picture, "what a pretty butterfly!", but when downloading the blog using this program, all I get is the picture of the butterfly and not the corresponding text/comment?

bonnysn (not verified)
Tue, 24/04/2018 - 01:46

Hey zab!

Is there any way to bring index files from any version over to the newest version? I hate the manual task of having to copy and paste links. I can copy and paste the names of the tumblrs from the index, but then I'd have to add a ".tumblr.com" to every one of them. I'm not looking to get the data from any of them; just starting over from scratch is fine.

Also, some of the links have 10k+ media files, could you add a max pictures criteria?

Thank you for adding the third party media downloader(mega, imgur, etc).

zab
Tue, 24/04/2018 - 02:48

To re-add all your blogs: In the settings there is an option to export all blog URLs into a .txt file. Do this, then open the file in an editor and press Ctrl-A and Ctrl-C while a fresh/different TumblThree session is running to select and add them all at once.
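
If all you have is a list of bare blog names (as in the question above), a minimal sketch like the following can append ".tumblr.com" to each of them. The function name `names_to_urls` is hypothetical, and it assumes every name is a plain tumblr subdomain unless it already contains a dot.

```python
def names_to_urls(names):
    """Turn bare blog names into https://<name>.tumblr.com/ URLs."""
    urls = []
    for raw in names:
        name = raw.strip()
        # Drop a scheme if the entry is already a URL.
        for prefix in ("https://", "http://"):
            if name.startswith(prefix):
                name = name[len(prefix):]
        name = name.rstrip("/")
        if not name:
            continue  # skip blank lines
        # Assumption: anything containing a dot is already a full hostname.
        host = name if "." in name else name + ".tumblr.com"
        urls.append("https://" + host + "/")
    return urls


if __name__ == "__main__":
    for url in names_to_urls(["wallpaperfx", "", "https://example-blog.tumblr.com/"]):
        print(url)
```

The printed lines can then be copied into TumblThree the same way as the exported .txt file.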

anonymous (not verified)
Sat, 19/05/2018 - 15:52

I'm trying my best to troubleshoot an issue I'm having right now.

It may be that Tumblr has changed something with its cookies or authentication, but I can't seem to get TumblThree to crawl certain blogs. All of them are of the NSFW variety, mind you, and are likely tagged as such.

I know I'm authenticated, I've doublechecked, triple checked even. So I'm certain it's not something like "I haven't properly authenticated my tumblr"

I don't have any real leads as to what could be causing this issue, though, aside from Tumblr shenanigans.

zab
Sat, 19/05/2018 - 15:55

Those blogs are all hidden tumblr blogs?

NSFW blogs don't require authentication unless they are also hidden blogs. This means you can crawl non-hidden NSFW blogs in TumblThree without authentication.

Do you use the latest release? There was a change made by Tumblr a month ago that broke TumblThree.

For fixing authentication issues (if TumblThree keeps stating that you need to login to download) it might help to remove the cookies once by deleting them in the Internet Explorer->settings wheel->internet options->browsing history->delete->delete cookies. Make sure TumblThree is closed while doing this. Then reopen TumblThree and reauthenticate.

What's the actual error message/behavior though?

Edit: Looks like tumblr changed something again. Thanks for the notification!

JeanP (not verified)
Wed, 30/05/2018 - 17:05

Hi
Need help.
Downloaded and installed, set some basic settings and then tried to crawl but nothing happens.
Have I overlooked something?

zab
Wed, 30/05/2018 - 17:10

Probably..

Based on your description I cannot say much, except that TumblThree does work with the default settings. Thus, undo your modifications and try again. Is there any error message at all? Does the green bar appear next to the item in the queue? Have you added anything to the queue at all?

One guess is that you might have nothing selected in the "Details" (right hand side after selecting a blog) and thus nothing is downloaded. Simply read the website. I've written the instructions on how to use the application for the purpose of not having to support questions like these.

anonymous (not verified)
Thu, 07/06/2018 - 15:05

I'm having problems with this Tumblr blog. It says it's offline, but the blog is obviously not. I don't know what's going on.

zab
Thu, 07/06/2018 - 21:35

I don't know why, but redo your authentication steps; it will most likely work then. To do this, remove your cookies using Internet Explorer while TumblThree is closed.

Some people already told me the same, but I couldn't find the actual cause of it yet. Currently, I don't really have time for this. All I can say is that I've noticed that the response from Tumblr varies based on something I couldn't figure out yet.

anonymous (not verified)
Fri, 08/06/2018 - 04:46

Yeah, I just tried clearing the cookies and still nada. It's weird and I can't think of what's causing it. I don't think Tumblr's changed anything

zab
Fri, 08/06/2018 - 04:56

> I don't think Tumblr's changed anything

I think they did. That issue only appeared about 3-4 days ago (maybe with the GDPR/ToS changes), and no one noticed it in the more than a year that the feature has been implemented in TumblThree.

Here is a screenshot of TumblThree using my account downloading that particular blog with zero issues.

Yet I've seen a video showing the issue you mentioned: the exact same steps I've done, and it fails to download.

anonymous (not verified)
Sat, 09/06/2018 - 08:28

How positively bizarre and frustrating. The most I can tell you is that the last time i was able to do a crawl was on 6/5/2018 @ 9:26:47 PM [EST]

So all I can figure is that something has changed in the last few days

I decided to see if any other blog was coming up with this issue, and I found two blogs that were offline that I knew were still online, but just dragging and dropping them into queue let them be scanned without a problem with no further tinkering

zab
Sat, 09/06/2018 - 08:33

Yes, I've already figured out that it has something to do with the cookie handling. That's why most likely the issue started with the ToS change a few weeks ago.

I've written more about my current insights about this issue on github.

Edit: The new releases should fix the issue.

anonymous (not verified)
Fri, 08/06/2018 - 02:28

So I'm an aspiring programmer, and I like your software here, and I probably use it far more than you do. I see that in a lot of cases you run into grief and frustration when dealing with certain issues and suggestions, choosing to basically say "It's open source, look at it yourself" in some cases.

Though I'm not here to criticize you for that, I do wish to take steps to maybe be able to assist in working on this project as well. Do you have any pointers as to what I'd need to understand first to have a basic understanding as to how and what TumblThree is doing? If possible, would you be interested in making a short blog post detailing your knowledge?

Thank you

trav (not verified)
Thu, 28/06/2018 - 23:07

Hello!

Is it possible to scrape a tumblr with a nonstandard URL like http://xtheo.ca for example? Putting the link in the download box does not add it to the list. I have successfully crawled normal .tumblr.com URLs today.

Thanks for your time!

zab
Fri, 29/06/2018 - 00:25

Yes, it's possible:

On the main page of the tumblr blog you want to download, right click and open the source code of the page. In the source of the page, search for the "blogname=" string and you will find the tumblr subdomain that you can crawl that corresponds to the blog.

For example, if the string is "blogname=wallpaperfx", then you'd have to add https://wallpaperfx.tumblr.com/

For your mentioned blog above, that would be https://theodorexnicholson.tumblr.com as the source code contains blogName=theodorexnicholson.
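
The lookup described above can be sketched in a few lines. This is only an illustration, not how TumblThree does it: the function name `tumblr_url_from_source` is made up, it expects the page source as a string you have already saved or fetched, and it matches "blogname=" case-insensitively since both spellings appear above.

```python
import re

# Matches blogname=<name> in any capitalisation (e.g. blogName=...).
# The character class for blog names is an assumption.
BLOGNAME_RE = re.compile(r"blogname=([A-Za-z0-9-]+)", re.IGNORECASE)


def tumblr_url_from_source(page_source):
    """Return the https://<name>.tumblr.com/ URL found in the page source, or None."""
    match = BLOGNAME_RE.search(page_source)
    if match is None:
        return None
    return "https://" + match.group(1) + ".tumblr.com/"


if __name__ == "__main__":
    sample = '<iframe src="https://example.invalid/frame?blogName=theodorexnicholson&x=1">'
    print(tumblr_url_from_source(sample))
```

If nothing is found, the page is probably not a Tumblr blog, or the string is written differently in its source.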

Justin (not verified)
Sat, 30/06/2018 - 19:02

The window opens, and there are links at the top right to sign up or log in, but the log in text is the same color as the background, and clicking it just takes you to a page with a tumblr background. There is no ability to log in whatsoever.

Please help me to be able to log in.

zab
Sat, 30/06/2018 - 19:57

I'm not exactly sure what you're talking about -- "clicking it takes you to a page" -- there is no reason to click any link. But I've just tested it, it works perfectly fine.

In the window that opens, in the "Email"-field, enter your email address, click on the "Use password to login"-button, enter your password. Done.

If that doesn't work for you for some reason, you can achieve the same using Internet Explorer (not the new Edge browser). Just log in to Tumblr using the regular Internet Explorer if that's easier for you. TumblThree will use the same cookies.

Justin (not verified)
Sun, 01/07/2018 - 01:36

Thank you, I will try that.

Again, I click authenticate, and the dialog window that appears has the tumblr background and the login/sign up links in the top-right, but the rest of the page is simply the tumblr background of the day.

Thank you for your time and effort on this program, it is awesome.

anonymous (not verified)
Tue, 03/07/2018 - 04:54

Bit of a chin-rubbing situation, because all I can say is that I did wipe the Internet Explorer cookies and log in again, and it still persists on my end.

TumblThree reports the error as a 404.

After writing all this, though, I found a fix in the dumbest way: just deleting the blog and then adding it again. Any idea what that's about?

Jürgen (not verified)
Mon, 16/07/2018 - 10:27

Hi,

first of all: Great work!!!
I will switch to TumblThree, as the program I currently use seems to be buggy.

Would you consider a command line version of the program?
The background is that I want to run it on a headless Linux box (on 24/7) with no GUI.

Thanks and best regards from Munich to Munich :)

Juergen
