Author Archives: thameera

Bookmarklets for bulk deleting items from Kindle

I got my Kindle 4 a few days back. Besides reading books, it’s ideal for reading long articles and blog posts you come across on the web. You can even direct RSS feeds to the Kindle. All items you send to the device (other than by transferring directly from the PC) get stored in your Amazon account. You can view these documents on the Manage Your Kindle page. As days go by, this list of items becomes too long, and you want to delete some of them.

Amazon doesn’t let you select several items or all of them at once. You have to go through the tedious process of clicking Actions, then clicking Delete from Library and answering ‘Yes’ for each item.

Luckily, there are bookmarklets to help you avoid the mess. For example, the Check and Delete bookmarklet adds a checkbox in front of each item, so you can tick the items you want to delete and press Delete. There’s also the MDKI bookmarklet, which lets you specify the titles of the items to delete. The former option is more convenient in most cases. It should also be possible to modify the MDKI bookmarklet to select all items from a given author (i.e., source) and delete them.

Beware though, if you delete any item from the library, it’s gone. Even purchased items.

Keeping the Git directory and the working tree separate

When you clone/init a repo, the .git directory is created inside your working tree. A colleague of mine wanted to check the source code in to a legacy version control system other than Git. Normally the .git directory would get checked in as well, since it sits inside the working tree.

Normal structure

But he didn’t want to push the .git dir as well. He wanted something like this:

Required structure

If you look at the git-clone man page, there’s this option called --separate-git-dir which lets you assign a custom directory as the .git dir. The syntax is:

git clone git://path.to.repo newdir --separate-git-dir=somedir

But in this case, my colleague didn’t want to clone again. He had local branches galore and all. So I tried out the git-clone mentioned above and checked what really happens. It turns out that git creates a file called .git inside the working tree that looks like this:

gitdir: /home/thameera/path/to/git/dir

Gotcha! Now what we need to do is move my friend’s .git directory to a separate location and create a file called .git that has the path to the git dir as above. It works!
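The move described above can be sketched in a few lines of Python. This is only an illustration, not the exact steps we ran: the function name `separate_git_dir` and its arguments are made up for the example, and it assumes the target location does not exist yet.

```python
import os
import shutil


def separate_git_dir(worktree, new_git_dir):
    """Move worktree/.git to new_git_dir and leave a .git pointer file behind.

    Hypothetical helper for illustration; both arguments are plain paths.
    """
    old_git_dir = os.path.join(worktree, ".git")
    if not os.path.isdir(old_git_dir):
        raise ValueError("%s is not a directory" % old_git_dir)

    new_git_dir = os.path.abspath(new_git_dir)
    if os.path.exists(new_git_dir):
        raise ValueError("%s already exists" % new_git_dir)

    # Move the whole repository database out of the working tree.
    shutil.move(old_git_dir, new_git_dir)

    # Replace it with a one-line text file pointing at the new location.
    with open(os.path.join(worktree, ".git"), "w") as pointer:
        pointer.write("gitdir: %s\n" % new_git_dir)
    return new_git_dir
```

Calling `separate_git_dir("/path/to/project", "/path/to/git/dir")` would leave the working tree with just the small pointer file. Git's own `git init --separate-git-dir` is documented to perform a similar move on an existing repository, so that may be the safer route if your Git version supports it.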

ShortcutFoo

As you may have guessed from the name, ShortcutFoo lets you master keyboard shortcuts in various apps including Vim, Emacs, Eclipse, Git, Excel, Visual Studio and many more. It first lets you learn the shortcuts, then presents drills and a practice mode to master them. So you actually use the shortcuts, instead of just memorizing them. Clever idea.

The downside is, only a few shortcut bundles in each app are free. To unlock the rest, you’ll have to pay a one-time fee of $8.99. That’s too high. I may pay up one day if I ever happen to learn Emacs or some crazy text editor other than Vim. But no thanks for now.

Why the ‘Foo’ anyway? Shouldn’t it have been ‘Fu’?

We don’t want any more Twitter clients – Twitter

The Next Web has an article about the Windows 8 Twitter client Tweetro. Tweetro has become so popular among the users that it has had to face the wrath of Twitter.

Thank you for reaching out to get clarification on our developer policies. As you know, we discourage developers from building apps that replicate our core user experience (aka “Twitter clients”).

It’s like when you introduce someone to your circle of friends and then that someone becomes more popular than you in the circle, so you become jealous and want that someone out. Ugh.

Gmail with Thunderbird

I’d been dealing with my work-related mail with Mozilla’s Thunderbird for a few weeks. For the personal Gmail account, it was the good old webapp. For a change, I hooked the Gmail account up to Thunderbird to see how things would go. It had trouble authenticating the Google account, but then I remembered the 2-step verification and created an application-specific password to make it work.

The experience is better than I’d imagined. It’s quite handy being able to open mail in tabs. There are hundreds of add-ons to choose from. The Conversations add-on in particular is a must. It’s smarter than Gmail’s conversation view. For example, if you get code broken into separate git patches like [PATCH 1/7], [PATCH 2/7] etc., they appear as separate threads in Gmail. But this add-on groups all related patches into a single tree of conversation with branches. Clever! Guess I’ll be sticking with this for the foreseeable future.

Processing music tags with Python

I was looking for the optimal way to traverse a directory tree with Python and found out that os.walk is the answer. To put this into practice, I wrote the following script, which traverses the music library and finds out which MP3s haven’t been properly tagged. In other words, it prints the full paths of all MP3s that have any of the artist, album or title tags missing. In my case, 502 of the 10466 songs had bad tags.

[sourcecode language="python"]
#!/usr/bin/python
# Report MP3s under rootdir whose artist, album or title tag is missing.

import os
import id3reader

rootdir = "/home/thameera/Music"
processed = 0
bad = 0

for thisDir, subdirList, fileList in os.walk(rootdir):
    for fname in fileList:
        # Match on the extension, ignoring case (.mp3, .MP3, ...)
        if fname.lower().endswith('.mp3'):
            processed += 1
            fullPath = os.path.join(thisDir, fname)

            try:
                id3r = id3reader.Reader(fullPath)
            except Exception:
                # Files whose tag data cannot be read count as bad too
                bad += 1
                print("Error extracting id3 info from %s" % fullPath)
                continue

            # getValue returns None when a tag is absent
            if None in (id3r.getValue('performer'), id3r.getValue('album'), id3r.getValue('title')):
                bad += 1
                print(fullPath)

print("%d MP3s processed. Found bad tags in %d of them" % (processed, bad))
[/sourcecode]

It uses the id3reader library to parse the tags. I don’t think it’ll work if you have files with other formats, like wma or flac.

This script can be further expanded to ignore certain directories, output only one entry per directory that has bad tags, check for other tags as well, check whether the artist tag matches the name of the parent directory, and so on.
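Two of the expansions mentioned above, skipping certain directories and reporting each directory only once, could look roughly like this. It is a sketch rather than a finished tool: `find_mp3s`, `dirs_with_bad_tags` and the `is_bad` callback are names invented for the example, and `is_bad` stands in for whatever tag check you want to run (the id3reader test from the script, for instance).

```python
import os


def find_mp3s(rootdir, ignore_dirs=()):
    """Yield (directory, [mp3 filenames]) pairs, skipping ignored directory names."""
    for this_dir, subdirs, files in os.walk(rootdir):
        # Editing subdirs in place tells os.walk not to descend into them.
        subdirs[:] = [d for d in subdirs if d not in ignore_dirs]
        mp3s = sorted(f for f in files if f.lower().endswith(".mp3"))
        if mp3s:
            yield this_dir, mp3s


def dirs_with_bad_tags(rootdir, is_bad, ignore_dirs=()):
    """Return each directory once if any of its MP3s fails the is_bad check.

    is_bad is a hypothetical callable taking a full path and returning True
    when the file's tags are considered bad.
    """
    bad_dirs = []
    for this_dir, mp3s in find_mp3s(rootdir, ignore_dirs):
        if any(is_bad(os.path.join(this_dir, f)) for f in mp3s):
            bad_dirs.append(this_dir)
    return bad_dirs
```

The in-place slice assignment on `subdirs` is what prunes the walk; rebinding the name with a plain `subdirs = [...]` would have no effect on os.walk.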

Fixing bad sectors

Something went wrong and my work laptop running Windows 7 turned off without suspending when the lid was closed. I turned it back on to find that everything was darn slow and the hard disk light was lit all the time. Shutting it down and restarting several times gave the same result. Booting from a CD or a USB drive wasn’t an option since this is a locked machine. Some googling suggested that chkdsk would be a good idea.

So I typed cmd into the Start menu, right-clicked it and chose Run as administrator. Even though I don’t have admin access, it’s possible to run an app ‘as administrator’. Then I typed chkdsk /r at the prompt and pressed Enter. The /r flag makes chkdsk locate bad sectors and try to recover readable information. It couldn’t run the scan then and there because Windows was running from that same hard disk, but it offered to schedule the check for the next time the system restarts. Restart the machine and chkdsk will run for several hours, depending on the hard disk size and speed. It took nearly 4 hours in my case and several bad sectors were recovered. After this, Windows booted without a hitch.

The probability of SHA-1 collisions

Git uses SHA-1 hashes to identify its objects (commits, trees, blobs, etc.). It assumes that if the SHA-1 hashes of two objects are equal, then the two objects themselves are equal, hence there’s no need to store both in the repo. In other words, if the hashes of two different objects turned out to be equal at some point, Git wouldn’t be able to handle that!

But what are the chances of such a SHA-1 collision? This was discussed on the Git mailing list today, and a recent analysis of the problem was brought to light. The analysis uses the famous Birthday Paradox to come up with a formula for the probability. To quote:

“Applying the formula for 160bit SHA-1 you need 1.7e23 objects to get a 1% chance of collision. The current Linus kernel repository has 2.7 million objects. So to get a collision you’d need a repository that’s 6e16 times larger. That should be plenty.

For some wacky perspective that’s 10 million kernel sized contributions for every man woman and child on earth together in a single repository. It would seem git will reach plenty of other bottlenecks before SHA-1 becomes a problem…”

The probability is, of course, still non-zero. But don’t ever expect to witness such a collision before you die.
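To get a feel for where the quoted numbers come from, here is a small sketch using the standard birthday approximation p ≈ 1 − exp(−n² / 2N), solved for n. I’m assuming this is close to the formula the analysis used; the function name `objects_for_collision` is my own.

```python
import math


def objects_for_collision(probability, hash_bits=160):
    """Approximate how many random hashes give the stated collision probability.

    Uses the birthday approximation p ~= 1 - exp(-n^2 / (2 * N)),
    solved for n, where N = 2**hash_bits possible hash values.
    """
    space = 2.0 ** hash_bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - probability)))


n = objects_for_collision(0.01)
kernel_objects = 2.7e6  # object count quoted in the analysis
print("objects needed for a 1%% chance: %.2g" % n)
print("times larger than the kernel repo: %.2g" % (n / kernel_objects))
```

For a 1% chance this works out to roughly 1.7e23 objects, in line with the figure in the quote.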

Most popular Twitter clients

I was playing with Twitter’s streaming API and thought of finding out what the most popular Twitter clients are. So I listened to Twitter’s stream of public tweets for half an hour and tallied the tweet sources. The sample contained 100,000 random public tweets. The top ten clients were:

  1. web – 24697
  2. Twitter for iPhone – 17564
  3. Twitter for Android – 13007
  4. Twitter for BlackBerry® – 10592
  5. Mobile Web – 3521
  6. UberSocial for BlackBerry – 1849
  7. TweetDeck – 1776
  8. Twitter for iPad – 1411
  9. Facebook – 1322
  10. Echofon – 1320

As expected, the Twitter web interface is the source of almost a quarter of the tweets, while the official clients for iPhone, Android and BlackBerry take the next slots. Then comes the Mobile Web, and Twitter for iPad sits in 8th place. In other words, more than 70% of the tweets in this sample were posted from official clients.

The complete list of results is here.
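The tallying step can be sketched as below. This is not the script I actually ran: it assumes each tweet has already been parsed into a dict with a `source` field (usually the client name wrapped in an HTML link), leaves out the connection to the stream entirely, and uses made-up sample data. The name `count_sources` is just for illustration.

```python
import re
from collections import Counter

TAG = re.compile(r"<[^>]+>")


def count_sources(tweets):
    """Count tweets per client, given dicts that carry a 'source' field."""
    counts = Counter()
    for tweet in tweets:
        source = tweet.get("source")
        if source:
            # Sources usually arrive wrapped in an <a> tag; keep only the label.
            counts[TAG.sub("", source).strip()] += 1
    return counts


# Made-up sample; a real run would feed in messages read from the stream.
sample = [
    {"source": "web"},
    {"source": '<a href="http://example.com/iphone" rel="nofollow">Twitter for iPhone</a>'},
    {"source": "web"},
    {"delete": {"status": {"id": 1}}},  # non-tweet message, no source
]
for client, count in count_sources(sample).most_common(10):
    print("%s - %d" % (client, count))
```

`Counter.most_common(10)` gives the top ten directly, so no separate sort is needed.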

Activating IRC on Empathy

As it turns out, Empathy doesn’t come with IRC account support in Ubuntu 12.10. In other words, the new Online Accounts section in Quantal lets you connect to Google, Twitter, Jabber and a few other accounts, but not IRC. I have no clue what made the packagers take this decision, but it’s a poor move, considering that the plugin weighs only 11.6 kB. Anyway, you can simply install the IRC plugin with the command,

sudo apt-get install account-plugin-irc

There are a few other plugins that do not come pre-installed. You can view them using,

apt-cache search account-plugin