Browser fingerprinting

I learned about the Electronic Frontier Foundation study titled “Web Browsers Leave ‘Fingerprints’ Behind as You Surf the Net” through Miguel Almeida, but I have since seen several opinions about it in several different places.

Although I think the results of the study are indeed worrying, I think people are exaggerating quite a bit.

When I ran the test available here, I got the following results:

Your browser fingerprint appears to be unique among the 1,763 tested so far.
Currently, we estimate that your browser has a fingerprint that conveys at least 10.78 bits of identifying information.

Browser Characteristic | bits of identifying information | one in x browsers have this value | value
User Agent | -0.84 | 0.56 | Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3
HTTP_ACCEPT Headers | -6.98 | 0.01 | text/html, */* ISO-8859-1,utf-8;q=0.7,*;q=0.7 gzip,deflate en-us,en;q=0.5
Browser Plugin Details | 10.78+ | 1763 | A bunch of them
Time Zone | -7.13 | 0.01 | -60
Screen Size and Color Depth | -5.38 | 0.02 | 1280x800x24
System Fonts | 6.26 | 76.65 | A bunch of them
Are Cookies Enabled? | -8.89 | 0 | Yes
Limited supercookie test | -8.03 | 0 | DOM localStorage: Yes, DOM sessionStorage: Yes, IE userData: No

Going through the results line by line, we can see that the only parameter that is truly unique is “Browser Plugin Details”, followed by “System Fonts”, where 1 in every 76 browsers tested has the same fonts as I do. In other words, my browser is unique among the 1,763 browsers tested, but only because of my list of plugins. With only 1,763 browsers tested, I don't think the problem is as serious as it is made out to be, all the more so since it is vulnerable to plausible deniability.

As you can see, there is a certain exaggeration in the opinions I have seen about the study. In practice, although it does show that our browser is far from common, and therefore far from guaranteeing any kind of anonymity, it is also not as distinctive as it is made out to be.
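The two numeric columns in the results are tied together: the "bits of identifying information" figure is the base-2 logarithm of the "one in x browsers" figure. A minimal sketch that reproduces the plugin and font rows (the function name is just for illustration):

```python
import math

def identifying_bits(one_in_x):
    """Bits of identifying information for a value shared by one in `one_in_x` browsers."""
    return math.log2(one_in_x)

print(round(identifying_bits(1763), 2))   # Browser Plugin Details row
print(round(identifying_bits(76.65), 2))  # System Fonts row
```

This prints 10.78 and 6.26, matching the table. It also shows why the plugin figure cannot go any higher with this sample: being unique among 1,763 browsers is worth at most log2(1763) bits.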

Just my 2 cents.

PrintScreen RSS File – Long live Prt.sc… sort of

Recently, one of my favorite blog aggregators died, but it left us with a nice OPML file containing the RSS feeds of all the aggregator's authors.

Yahoo! Pipes couldn’t parse the OPML file, so I wrote a Python script to do it and hosted it on my server.

The RSS File is updated every hour and contains all RSS entries from all Prt.Sc authors not older than seven days.

You can get the RSS file at http://simaom.com/prtsc.xml.

So Prt.sc lives… well… sort of… I won’t be adding any authors to the OPML file, so we only get RSS entries from the authors included in the last version of Prt.sc.

Let me know what you think.

opml2rss.py – An opml to rss converter

I just uploaded another script to my GitHub repository.

It’s a Python script that parses an OPML file and generates an RSS file with the entries from all the RSS feeds listed in the OPML file that are not older than a certain number of days.

You can get the script here: http://github.com/…/opml2rss.py

The script has a few configuration parameters that are pretty self-explanatory, so setting it up should be easy.
You also need to install a few Python modules: feedparser, OPML and PyRSS2Gen.
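For a rough idea of the two steps involved (pulling the feed URLs out of the OPML file, then keeping only recent entries), here is a minimal sketch using only the standard library. It is not the actual opml2rss.py: the function names, the sample OPML and the entry dictionaries with a "published" key are all made up for illustration, and fetching the feeds and writing the RSS output are left out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

def feed_urls(opml_text):
    """Return the xmlUrl of every <outline> element that has one."""
    root = ET.fromstring(opml_text)
    return [o.get("xmlUrl") for o in root.iter("outline") if o.get("xmlUrl")]

def recent_entries(entries, max_age_days, now=None):
    """Keep entries whose 'published' datetime is within max_age_days of now."""
    now = now or datetime.now()
    cutoff = now - timedelta(days=max_age_days)
    return [e for e in entries if e["published"] >= cutoff]

# Hypothetical sample data, only to show the calls
SAMPLE_OPML = """<opml version="1.0"><body>
  <outline text="Group">
    <outline text="Blog A" type="rss" xmlUrl="http://example.com/a.xml"/>
    <outline text="Blog B" type="rss" xmlUrl="http://example.com/b.xml"/>
  </outline>
</body></opml>"""

NOW = datetime(2010, 5, 20)
ENTRIES = [
    {"title": "fresh", "published": datetime(2010, 5, 18)},
    {"title": "stale", "published": datetime(2010, 4, 1)},
]

print(feed_urls(SAMPLE_OPML))
print([e["title"] for e in recent_entries(ENTRIES, 7, now=NOW)])
```

Nested outlines are handled because `iter` walks the whole tree, and group outlines without an `xmlUrl` attribute are skipped.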

New github repository

I just set up a GitHub repository to hold my code.

Here’s the link: http://github.com/simao/mycode

Currently, the repository contains only the code of my latest Python script, rssTorrents.py.

Automatically download torrent files from a RSS feed

I was looking for a way to parse an RSS feed I built using Yahoo! Pipes and add new torrents to Transmission so they download automatically.
I couldn’t find anything useful, so I wrote a Python script to do just that.
If you want to use it, you’ll need to configure the first lines of this file to suit your needs.
The script is pretty self explanatory.

#!/usr/bin/python2.6
#
# Python script to parse a RSS feed containing torrent files urls and
# add new torrent files to transmission based on the time of the last added
# torrent. The script stores the RSS item date of the last added
# torrent and adds a new torrent to the BT client if the RSS contains
# one or more items with a later date
#
# Sat Jan 23 21:30:00 WET 2010
#

DATEFILE = "rsstorrents.pid"
RSSFILE = "http://pipes.yahoo.com/pipes/......" # Change RSSFILE to point to your RSS file URL
TORRENTCOMMAND = "transmission-remote -n username:password -a "

import feedparser
import pickle
from datetime import datetime
from datetime import timedelta
import time
import commands

# Read the date, download shows from last 3 weeks if we can't read any date
try:
    with open(DATEFILE, "rb") as f:
        lastdate = pickle.load(f)
        print "Read date %s" % lastdate
except Exception:
    lastdate = datetime.now() - timedelta(weeks=3)
    print "Could not read date of last feed, using last 3 weeks %s" % lastdate

# Fetch RSS File
feedInfo = feedparser.parse(RSSFILE)

# Walk the items (newest first) until one is not later than the stored date
# Add each new torrent to Transmission
n = 0
for entry in feedInfo.entries:
    feedDate = datetime.fromtimestamp(time.mktime(entry.modified_parsed))

    if feedDate > lastdate:
        torrentURL = entry.enclosures[0]['href']
        print "Adding torrent %s" % entry.title

        outputstatus = commands.getstatusoutput(TORRENTCOMMAND + torrentURL)
        if outputstatus[0] != 0:
            print "Error adding torrent: %s" % outputstatus[1]
        else:
            n = n + 1
    else:
        break

# Set the last date to the date of the most recent item of the RSS feed
# (keep the previous date if the feed came back empty)
if feedInfo.entries:
    lastdate = datetime.fromtimestamp(time.mktime(feedInfo.entries[0].modified_parsed))

try:
    with open(DATEFILE, "wb") as f:
        pickle.dump(lastdate, f)
        print "Saved date %s" % lastdate
except Exception:
    print "Could not save date of last feed"

# Feedback user
print "Finished. Added %d torrents" % n

Disclaimer:
I use this script to download legal torrents ;)

Emacs easy window switching

I just found out about another cool Emacs package: WindMove

This package lets you switch windows with Shift plus the arrow keys, without using C-x o.

WindMove is included with Emacs; just add the following code to your .emacs:

(when (fboundp 'windmove-default-keybindings)
  (windmove-default-keybindings))

More info @ http://www.emacswiki.org/emacs/WindMove

Why Jungle Disk is so slow

I recently signed up for an account at Jungle Disk, http://www.jungledisk.com.

I’m paranoid about backups: I use Time Machine for a full weekly backup and Jungle Disk as an off-site backup solution. It seemed the cheapest option, since you only pay for what you upload, and although I have a full 160 GB hard drive, my sensitive files only total about 10 GB. At $0.18 per GB per month, that’s $1.80/month plus bandwidth.

Jungle Disk uploads files to Amazon S3 storage, in my case located in Europe. I chose to pay an extra $0.03/month plus bandwidth for that location because I thought latency would influence the speed of my backups; that’s also why I went with Amazon over Rackspace. I’ll probably migrate anyway once Jungle Disk offers a migration tool for this.

I have access to a symmetric 100 Mbit/s internet connection, so I was surprised when I noticed Jungle Disk was only using about 40 KB/s. After thinking about it, it actually makes sense.

The backup is slow for several reasons.

Jungle Disk uploads individual files, not one big compressed file containing everything to be backed up.

In practice, this means JD does not have a continuous stream of bytes to upload, as it would with one big file. Instead, it has to stop sending data while it prepares the next file (including encrypting it, see the next item). Besides, it needs to set up the S3 file system to receive the new file. During this time, the TCP connection is almost idle. By the time JD starts uploading the next file, the TCP connection has already lost all its speed. That’s why the speed is only high when JD is uploading big files: the TCP connection has enough time to adjust and discover the speed of the link.

Jungle Disk encrypts the files using AES before sending them over the internet through an SSL connection. (UPDATE: This is wrong, see comments)

This means JD has to stop sending files until it finishes encrypting an entire file. This could be solved if JD encrypted the next file while sending the previous one, which would reduce the time the TCP connection is idle.

Uploading only differences is not really an improvement.

While it’s true that uploading only new or changed files is a big improvement, uploading only the differences within the files themselves is not that big of an improvement. When you think about it, what files have you edited lately with small differences? Text files? So you only send 10 KB instead of 50 KB? You save 40 KB? That’s nothing at today’s connection speeds. When you edit a big file, like a big image, you most likely changed a big part of it and still have to upload most of its bytes.
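To put rough numbers on the per-file overhead argument from the first point, here is a minimal sketch of a toy model. It assumes a fixed pause before each file (the 1 second used here is a made-up figure, not a measurement) and ignores TCP ramp-up entirely, so it only illustrates the shape of the effect. The function name is hypothetical.

```python
def effective_rate(file_bytes, overhead_s, link_bytes_per_s):
    """Average upload rate for one file when each file costs a fixed pause."""
    transfer_s = file_bytes / link_bytes_per_s
    return file_bytes / (overhead_s + transfer_s)

LINK = 100e6 / 8  # 100 Mbit/s link expressed in bytes per second
OVERHEAD = 1.0    # assumed pause per file, in seconds (hypothetical)

for size in (50000, 5000000, 1000000000):
    rate = effective_rate(size, OVERHEAD, LINK)
    print("%12d bytes -> %8.1f KB/s" % (size, rate / 1e3))
```

With these assumed numbers, a 50 KB file averages roughly 50 KB/s while a 1 GB file gets close to the full link speed, which is the same pattern as small files crawling and big files flying.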

Jungle Disk is a really nice service, and I think it’s the best you can get. There probably isn’t an off-site backup solution that isn’t slow. Besides, it’s only slow the first time, when you have to upload 10 GB in one go; after that you only have to upload new or changed files, which is about 1 GB per backup in my case.
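As a quick check on what "slow" means here, a small sketch that turns the observed 40 KB/s into upload times for the initial 10 GB and for a typical 1 GB follow-up backup. It uses decimal units (1 GB = 10^9 bytes, 1 KB = 10^3 bytes) and assumes the rate stays constant; the function name is just for illustration.

```python
def upload_hours(gigabytes, kb_per_s):
    """Hours needed to upload `gigabytes` at `kb_per_s` (decimal units)."""
    seconds = (gigabytes * 1e9) / (kb_per_s * 1e3)
    return seconds / 3600

print(round(upload_hours(10, 40), 1))  # initial 10 GB backup
print(round(upload_hours(1, 40), 1))   # typical 1 GB follow-up backup
```

This prints 69.4 and 6.9: close to three days for the first backup, and about seven hours for each later one.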

If you don’t have an off-site backup solution yet, Jungle Disk is the way to go. And you DO need an off-site backup solution, right?

Clean Macports

Here’s a nice post on how to clean up MacPorts:

http://simenhag.blogspot.com/2008/11/cleaning-up-macports.html

I do this cleanup from time to time.

How to easily encrypt/decrypt files using GPG in Emacs

The latest Emacs pretest version (23.0.95) includes EasyPG, which makes encrypting and decrypting files almost transparent.

You just C-x C-f a file with a .gpg extension and C-x C-s it, and EasyPG takes care of the rest for you.

How to fix Time Machine stopping

I had my Time Machine backup working and I used to do my backups every week or so.

After I ran out of space, I bought a new external drive to use as a backup volume, so I formatted it using Disk Utility and set it to use the GUID partition scheme and HFS+.

I started Time Machine and it began a new backup, but after some time the backup always stopped.

I retried several times, reformatted my external drive multiple times, and tried several other solutions I found while googling, but Time Machine always stopped during the backup.

I saw this knowledge base article from Apple, where they advise users to use GUID, so I didn’t try anything else. Until I did, and it worked.

So my solution for fixing Time Machine when it stops repeatedly during a backup is to use MBR and not GUID (as Apple suggests). I think the SATA controller of my disk doesn’t like GUID.