Nepomuk is becoming a great tool, but it still has its drawbacks.
Genealogy has been a hobby of mine for over twenty-five years now. Over that time I’ve acquired a lot of documents and scans of documents, not to mention photos, web snippets, text notes, PDF files and other such things.
As an ardent KDE user, the natural way for me to keep track of all these files, and to be able to find them again, is to tag them for Nepomuk. With Dolphin I give them a tag or two, add a comment, and I should have no trouble finding the file in the future. While that would hold true for many users, for my usage (and, I suspect, that of many other users) there’s still a problem with relying solely on Nepomuk: its tags and comments don’t transfer to the cloud or to another computer. Nepomuk stores all those tags and comments in its own database and not in the file itself, so they stay behind when the file moves. I sync all my research files with Dropbox, but when I access them from my laptop out in the field, none of those tags or comments are there. That’s a serious handicap to my research.
A habit I developed years ago was to use some software (like GIMP or digiKam) to add comments and tags directly to the file’s metadata. When Nepomuk used Strigi it could read that metadata, but since the change to its own engine, that’s currently not the case. After trying three distro versions of KDE 4.10, I haven’t been able to read any of my metadata tags or comments with Nepomuk. So now what?
I went searching for an alternative and found a great one: Recoll. It’s a file indexing program started years ago by Jean-Francois Dockes and still in active development. It has a native Qt interface, is blazingly fast in its indexing, is highly configurable, and, as an added bonus to us KDE users, it now has a kio-slave and a KRunner plug-in. You can search Recoll’s database right from within Dolphin or Konqueror, and also from the trusty Alt+F2 keyboard shortcut.
Recoll has few dependencies, mostly its Xapian database backend and the small programs it uses to parse information from files for its database. If you want to try it, I suggest letting your package manager pull in all the “recommends” it lists. They’re small, don’t stay in memory when not in use, and make Recoll truly versatile. When you start it for the first time after installing, it presents you with a setup dialog. Here you can set which folders you do and don’t want indexed, any file types you want excluded, what language you want its dictionary to use, and so on. You also choose whether you want it running in the background all the time or as a cron job, running only at the times you specify.
By default you’ll notice only one entry in the list of folders to index, “~”, meaning your home folder. Unless you really want it to index every single file in your user’s home, delete that entry. Just add the folders you really want to index. Note that every folder you add to the list is indexed recursively, so every sub-folder in that folder will also be indexed. If you don’t want all the subfolders indexed, you can place the ones you don’t want in the excluded list.
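If you’d rather see what the dialog is writing for you, here is a rough sketch in Python that builds the two matching lines of Recoll’s configuration file. As far as I know the settings are called topdirs and skippedPaths and live in ~/.recoll/recoll.conf, with entries separated by spaces and double quotes around anything containing a space, but check the Recoll manual for your version. The folder names and the function names are made up for illustration.

```python
def quote(path):
    """Wrap a path in double quotes if it contains whitespace."""
    return f'"{path}"' if any(c.isspace() for c in path) else path


def render_index_settings(include, exclude=()):
    """Build recoll.conf-style lines for the folders to index and to skip.

    Assumption: the setting names 'topdirs' and 'skippedPaths' match your
    Recoll version. The paths passed in are hypothetical examples.
    """
    if not include:
        raise ValueError("at least one folder to index is required")
    lines = ["topdirs = " + " ".join(quote(p) for p in include)]
    if exclude:
        lines.append("skippedPaths = " + " ".join(quote(p) for p in exclude))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Index two research folders, but skip one sub-folder of the first.
    print(render_index_settings(
        ["~/genealogy", "~/Dropbox/family research"],
        ["~/genealogy/scratch"],
    ), end="")
```

The point is only that the “~” entry gets replaced by the short list of folders you actually care about, and that sub-folders you don’t want go into the skip list rather than being left out of the first one.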
Recoll can do its indexing in one of two ways: either running in the background and indexing files as they are added or modified, or running at certain times set by the user. For my use I opted for the cron job indexing. At three in the morning (my computer runs 24/7) Recoll is run by cron and updates its index. It also means Recoll isn’t running all the time on my machine, only when I want or need it. You can of course add more runs to the cron schedule if you need indexing more often but still don’t want it running in the background all the time. For those not familiar with setting up cron runs, Recoll’s own dialog has short, clear explanations of how to do it.
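For the curious, a nightly run like mine boils down to a single crontab line. Here is a small Python sketch that builds one. It assumes the indexer command is recollindex and that it is on cron’s PATH; the function name is my own invention, and Recoll’s dialog may write the entry a little differently.

```python
def index_schedule(hours=(3,), minute=0, command="recollindex"):
    """Return a crontab line that runs the indexer at the given hours.

    Assumption: 'recollindex' is the indexer command on your system.
    Several hours are joined with commas, which cron accepts as a list.
    """
    if not hours:
        raise ValueError("at least one hour is required")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be between 0 and 59")
    for hour in hours:
        if not 0 <= hour <= 23:
            raise ValueError("hours must be between 0 and 23")
    hour_field = ",".join(str(h) for h in sorted(set(hours)))
    # Fields: minute, hour, day of month, month, day of week, command.
    return f"{minute} {hour_field} * * * {command}"


if __name__ == "__main__":
    print(index_schedule())               # once a night at 3:00
    print(index_schedule(hours=(3, 12)))  # an extra run at noon
```

Adding a second run is just a matter of adding another hour to the list, which is all “adding more runs to the cron” really means.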
Once I had it all set up, it took Recoll about a minute and a half to completely index my folders of nearly 4 GB of miscellaneous file types. It’s fast! And then, because I have set the indexing to run only at night, it exits. I used KDE’s keyboard settings to create a shortcut, and now all I have to do to search my files is press a key combination. Recoll loads in a second and is ready for my search. Close it and it’s gone, with nothing hanging around in memory.
That’s shiny enough, but where it gets really useful to me is with the kio-slave. Open Dolphin, click on the location bar to enter input mode, and type “recoll:searchterm” to have your search pop right up in Dolphin. If you find you search for the same things fairly often, you can also click on the little folder icon to the left of your search term and drag it to your “Places” to create a virtual folder. Then any time you want to re-run the search, all you have to do is click on the virtual folder.
If you find you’re creating a lot of searches, you could do as I did and create a parent folder called “Searches” or something like that, and just drag and drop the virtual folders there. You’ll notice it creates “.desktop” files of your searches. Just click on any one of them in Dolphin to pop those search results back up. When I say Recoll’s fast, I mean it’s fast! Results are almost instantaneous.
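To give an idea of what those saved searches amount to, here is a Python sketch that writes out a minimal link-style “.desktop” entry pointing at a “recoll:” search. I’m assuming the standard desktop entry keys (Type=Link, Name, URL, Icon); the exact file Dolphin creates when you drag a search may contain different or additional keys, and the function name and search term are just examples.

```python
def saved_search_entry(name, term):
    """Return the text of a link-style .desktop entry for a Recoll search.

    Assumption: the search is reachable as 'recoll:<term>', the same
    string typed into Dolphin's location bar. The term is used as-is;
    no URL encoding is attempted in this sketch.
    """
    if not term or "\n" in term or "\n" in name:
        raise ValueError("name and term must be single-line, term non-empty")
    lines = [
        "[Desktop Entry]",
        "Type=Link",
        f"Name={name}",
        f"URL=recoll:{term}",
        "Icon=folder",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A hypothetical saved search for a family surname.
    print(saved_search_entry("Surname search", "Hendricks"), end="")
```

Saving that text as, say, “Surname search.desktop” in a “Searches” folder is roughly the do-it-by-hand version of the drag and drop, though dragging from Dolphin is a lot less typing.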
So if you need a way to search for files and Nepomuk still isn’t doing quite what you need it to, or if you’re like me and need to search your embedded metadata across the cloud and several computers, give Recoll a try. At the time of this writing the kio-slave still wasn’t available for Kubuntu’s Raring, but it is for earlier versions. For openSUSE users it’s available for all supported versions. Users of other distros will need to check with their respective package managers.
If there’s one drawback to Recoll, it’s that it requires you to embed the metadata in the files themselves, as opposed to adding it with the file manager. With mixed file types like text and image files, tagging multiple files at once is easier with a file manager, but I think that’s a small inconvenience for the power and speed Recoll offers.