As a former Mercurial developer, this feels like an admission of defeat. Most
of hg's user interface still seems superior to Git's, even if Git was quicker
to get the branching model right. The Mercurial code base, in many ways, is a
testament to how approachable a Python application can be, and the extension
possibilities stemming from writing a few Python functions seem far more
attractive than Git's apparent hodge-podge of C, shell and Perl. It's good
that people at Mozilla and Facebook are starting to talk more about
hg's advantages, though.
While I wanted to learn Git sooner, the lack of usability made me mostly avoid
it until about 8 months ago, when I became a CouchDB committer and thus could
no longer escape. Two months ago, I also got a new job where Git is the primary
VCS, so I've been diving in. Obviously, it's a pretty great VCS, but some
aspects of the (command-line) user interface are still baffling to me. This has
been written about in plenty of places, so I won't go point by point
here. And I'll have to admit that many commands are starting to be ingrained in
muscle memory, to the point that I sometimes use Git-like commands in places
where I still use hg.
However, now that I have basic usage down such that my lack of
experience with Git is no longer a limiting factor, the network effects
of Git (and GitHub, specifically) outweigh my usability concerns.
The GitHub UX feels more polished (and seems to receive more attention)
than Bitbucket's, and makes me quite happy to use it. I also feel that the
community on GitHub is quite a bit larger than on Bitbucket, which could make
my projects more accessible (see also this account from Eli Bendersky).
I've already gathered some stars (mostly for Persona-TOTP, so far) over the
past six weeks; I hope that's just the start.
I wanted to update the user icon/picture for my OS X user (which may
include the iCloud/Apple ID picture as well), but it turned out to be
harder than I thought. Here's to hoping this post may help others who
run into the same problem. tl;dr: use iCloud's web app to upload
the new picture for your own Contacts entry.
Update (2013-11-02): on Twitter, both Christopher Lenz and Justin Mayer
pointed out that you can just drag and drop an image onto the System
Preferences panel. I thought I'd tried that, but apparently not! Still, I
wonder if that UI is sufficiently ingrained that discoverability is not
Update (2013-11-25): Hugh Hosman, via email, points out that you can also
drop an image into /Library/User Images if you have super user privileges.
Like any person who values 0-day upgrades more than their system's
stability, I recently upgraded to OS X Mavericks. Going into the Users &
Groups preferences panel, double-clicking my current picture provided
me with 6 possible options:
Defaults: a sample of pictures provided by Apple
Recents: contains the current picture, but no others
iCloud: is apparently connected to my iCloud Photo Stream
Faces: a selection based on the iCloud Photo Stream
Camera: take a new picture from my laptop camera
Linked: appears to have something to do with my Contacts
In other words, there was no way here to simply link in a JPEG. Apparently,
the way to get pictures into the Photo Stream is either through an iOS
device (probably through the Camera app) or via Apple's iPhoto or Aperture
photo software, neither of which I own (though iPhoto is apparently free
for everyone who buys a new machine from now on). I did some Googling, which
yielded precisely zero useful results; apparently, using a JPEG was still
supported under Mountain Lion, and no one had documented this problem yet.
(One of the more promising venues appeared to be the Apple StackExchange
site Ask Different.)
Update (2013-10-17): slides and video are now available.
For Software Freedom Day 2013, which is on Wednesday, the 18th of September, I
will give a presentation about nanomsg at the Centrum Wiskunde &
Informatica (the Center for Mathematics & Computer Science) at the Amsterdam
Science Park. If you're in the neighborhood and/or interested in nanomsg, come
nanomsg: simple smart sockets
nanomsg is a socket library that provides several common communication patterns
to help build distributed systems. It aims to make the networking layer fast,
scalable, and easy to use. Implemented in C, it works on a wide range of
operating systems with no further dependencies.
This talk will give a short history of the nanomsg project, an explanation of
the value provided by nanomsg in building distributed systems, and a
demonstration of some key features.
Gervase Markham kindly wrote about it on his blog after I recommended it for
a Firefox development meeting, which made me think I should write about it here.
What it is
AWMY is a tool to communicate event (meeting) times to geographically
dispersed and therefore timezone-challenged audiences. This means it displays
date/time values in (a) an original timezone, (b) the UTC timezone and (c) the
user's local timezone, with a title or description and a countdown timer.
Critically, it supports recurring meetings in a way that a single URL will
show the next meeting in the series no matter when it's loaded into the
browser. This makes it a good fit for use in automatically generated meeting
announcements. Currently, the only supported repeating modes are weekly and bi-weekly.
One of the design goals is to have nice-looking URLs; ideally, you can
understand the meeting date/time from the URL even without clicking the link.
For now, hacking the URL is the only way to create a new event page; this
should be easy in most cases. I hope to add a form to make it even easier
Timezone support is based on the venerable Olson timezone database. I've put
some thought into handling events near daylight savings transitions and tried
to put in some warnings, but it's probably not perfect yet. At least weekend
events close to daylight savings transitions should be somewhat rare.
The domain name was chosen because it fits in with a Mozilla meme (e.g.
fast, pretty, small, popular, flash and probably others); I
couldn't come up with a better alternative that was also still available. This
one will hopefully be memorable at least for some part of the intended audience.
How to use it
In the current iteration, the page accepts a maximum of 5 arguments:
A timezone: a subset of Olson timezones are accepted and can be referenced in
a few different forms. Only the continent timezones are accepted (e.g.
"America/Los_Angeles", "Europe/Amsterdam"), plus the "UTC" timezone. The
continent is optional (and left out in the canonical versions). A space can
be used where underscores are used in timezone names.
A date: an ISO 8601-formatted date, like "2013-08-26". A three-letter weekday
abbreviation also works here (like "Mon"), but it will emit a warning if
used without the weekly repeating mode.
A time: ISO 8601-formatted 24-hour time, like "15:30".
A repeating mode: currently "w" for weekly or "b" for bi-weekly.
A title: any text.
If no timezone is provided, it's assumed to be UTC. Some examples:
I got started based on some discussion on the mozilla-governance mailing
list. Most Mozilla meetings are coordinated based on the timezone for the
Mozilla HQ, in California. For many non-US participants, it's easier if
meeting times are communicated in UTC, because they know their own UTC offset.
However, this would change actual local meeting times based on daylight
savings, which is a bit of a pain for recurring meetings. Therefore, it
makes more sense to keep the reference meeting time in a timezone that has
daylight savings, on the premise that most people live in zones that use
mostly similar daylight savings schedules.
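
The trade-off can be checked with a small sketch. It assumes Python 3.9+ with the standard `zoneinfo` module and an available tz database; the `views` helper and the 11:00 California meeting are made up for illustration and are not taken from AWMY's code.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+; needs the system tz database or the tzdata package


def views(date, time, origin, local):
    """Return the meeting's wall-clock time in the origin zone, UTC and a local zone."""
    start = datetime.fromisoformat(f"{date}T{time}").replace(tzinfo=ZoneInfo(origin))
    zones = (ZoneInfo(origin), ZoneInfo("UTC"), ZoneInfo(local))
    return tuple(start.astimezone(zone).strftime("%H:%M") for zone in zones)


# Hypothetical weekly meeting at 11:00 in California, seen from Amsterdam.
for day in ("2013-08-26", "2013-10-28", "2013-12-02"):
    print(day, views(day, "11:00", "America/Los_Angeles", "Europe/Amsterdam"))
```

The UTC time moves from 18:00 in August to 19:00 in December, while the Amsterdam time is 20:00 in both cases. Only in the week around 2013-10-28, when Europe had already left daylight savings and the US had not, does the Amsterdam time shift to 19:00.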
Some tools exist: for example, here's a timeanddate.com link used for a
Firefox developer meeting. Although timeanddate.com has most of the information
available from AWMY, it's provided in a much more cluttered fashion.
Personally, I find it quite hard to visually parse that page to find the data
I need. Of course, it does provide other useful features that AWMY does not
I've also seen everytimezone.com used for this kind of thing; here's an
example. It does provide the user with a sense of context, which is probably
useful when you want to see what meeting times make sense in timezones you
care about. For the purpose of communicating a single meeting time, it feels
The user experience for these tools doesn't work well for this use case, so I
thought I might be able to do better. On top of that, the other tools don't
appear to handle recurring meetings. Having a stable URL for a series of
events is useful when you want to point to a meeting time from many different
places, because having to update each pointer every week is kind of a drag. Thus
was born AWMY.
At the top of my to do list is a feature to combine event series. This is
mostly inspired by CouchDB meetings, which take place at alternating 13:00
UTC and 19:00 UTC times to accommodate people in different timezones. My current
implementation strategy is to have a "merge" flag that signals another meeting
series, such that two bi-weekly events series can be joined together.
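
One way the merging idea could work is sketched below. This is not AWMY's implementation: the function names and the two anchor dates are assumptions made for illustration. Each bi-weekly series reports its next occurrence, and the merged series picks the earliest one.

```python
from datetime import datetime, timedelta, timezone


def next_occurrence(anchor, period, now):
    """Return the first occurrence at or after `now` of a series starting at `anchor`."""
    if now <= anchor:
        return anchor
    steps = -((anchor - now) // period)  # ceiling of (now - anchor) / period
    return anchor + steps * period


def next_merged(anchors, period, now):
    """Return the earliest upcoming occurrence across several series."""
    return min(next_occurrence(anchor, period, now) for anchor in anchors)


UTC = timezone.utc
FORTNIGHT = timedelta(days=14)
# Hypothetical anchors: two bi-weekly series, one week apart, at 13:00 and 19:00 UTC.
SERIES = [
    datetime(2013, 8, 7, 13, 0, tzinfo=UTC),
    datetime(2013, 8, 14, 19, 0, tzinfo=UTC),
]

print(next_merged(SERIES, FORTNIGHT, datetime(2013, 8, 20, 12, 0, tzinfo=UTC)))
```

Because each series is still described by one anchor and one period, this approach would keep all state in the URL.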
As mentioned before, a friendlier UI to build new events is one of my other
priorities. A few form elements could go a long way, though I probably want a
slightly more polished experience. I'll also have to figure out how to make
dealing with series easy, in particular when working with the merging feature.
It would make sense to add a few other repeating modes, in particular "3rd
Wednesday of each month"-like functionality. Offering ICS downloads would be
nice. I would like each page to show the next meeting instance, if only as an
indication that you're dealing with a recurrent event.
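
A repeating mode like "3rd Wednesday of each month" comes down to a little date arithmetic. The sketch below shows one way to compute it; `nth_weekday` is a hypothetical helper, not part of AWMY.

```python
from datetime import date, timedelta


def nth_weekday(year, month, weekday, n):
    """Return the n-th given weekday (Monday=0 ... Sunday=6) of a month."""
    first = date(year, month, 1)
    # Days from the 1st of the month to the first matching weekday.
    offset = (weekday - first.weekday()) % 7
    day = first + timedelta(days=offset + 7 * (n - 1))
    if day.month != month:
        raise ValueError("that month has no such weekday")
    return day


WEDNESDAY = 2
print(nth_weekday(2013, 9, WEDNESDAY, 3))  # 2013-09-18
```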
Because there's no server side component, I really want to keep all state in
the URLs. On the other hand, I also want readable URLs. These goals don't
always align well, so balancing them is an interesting act. I'm thinking about
a way to generate alternative URLs that aren't as readable, but significantly
I hope this will be a useful tool for the open source community (and anybody
else who has a use for it). I'd be interested to hear your thoughts on what
features would be most useful to add. If you want to contribute some code,
that would be even better; check it out via the Bitbucket project.
All feedback is welcome!
I have recently been contributing to Mozilla's Persona project, which is an
awesome way to make authentication easier for sites and their users. They
kindly published an interview with me, which I reproduce here in full
for archival purposes.
Over the past year, Dirkjan Ochtman has been a consistent, constructive voice
in the Persona community. His involvement has helped ensure that we stay true
to Mozilla’s mission of open, transparent, and participatory innovation.
We hope this interview highlights his contributions and inspires others to get
From the rest of us at Mozilla, thank you.
Who are you?
I’m Dirkjan Ochtman, a 30-year-old software developer living in Amsterdam. I
work for a financial startup by day; in my free time, I contribute to a bunch
of open source projects, like Mercurial, Python, Gentoo Linux and Apache
CouchDB. I also started a few things of my own.
Have you contributed to Mozilla projects in the past? How did you get involved in Persona?
I started using Firefox almost ten years ago, and I’d been watching Mozilla
before that. The Mozilla mission of an open Internet resonates with me, so I
tend to try and find stuff around the edges of the project where I can help.
This year, I also became a Mozilla Rep.
I find BrowserID/Persona compelling because I hate having to register on
different sites and make up passwords that fit (often inane) security
requirements. And you just know that many sites store passwords insecurely,
leaking sensitive information when they get hacked. Persona allows me to
authenticate with my email address and a single password; no more guessing
which username I used. I trust Mozilla’s password storage to be much more
secure than the average Internet site, and because Persona is open source, I
can verify that it is.
In addition to setting up Persona sign in on a small community site I run,
I’ve also implemented my own Python-based Identity Provider. This means that
when I use Persona, I control my own login experience. My Identity Provider
uses Google Authenticator, so now I don’t have to remember any passwords at
The documentation for building an Identity Provider was scattered and
incomplete, so I helped improve that. From that work, I got to know some of
the great people who work on Identity at Mozilla.
What have you hacked on recently?
There has been a long-standing issue that the Persona dialog contained too
much Mozilla branding and did not sufficiently emphasize the individual
websites that users were signing into. There was an issue about this on
GitHub, but I seem to remember complaints on the mailing list from even
Of course, I prefer to use Persona over Facebook Connect or Twitter, so I
decided to see if I could fix some of these issues. Luckily one of the Persona
developers, Shane Tomlinson, was available to work on this at roughly the
To improve the branding balance, we first de-emphasized the Persona branding.
I focused on allowing websites to specify a background color for the Persona
dialog. This is important because it can make the dialog feel much more “at
home” on a site. We had to work out some tricks to ensure that text stayed
readable regardless of the background color specified.
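
The interview does not say which tricks Persona uses. One common approach, sketched here purely as an assumption, is to pick black or white text depending on which has the higher WCAG contrast ratio against the background. The function names are made up for this example.

```python
def relative_luminance(hex_color):
    """WCAG 2.x relative luminance of an sRGB color given as '#rrggbb'."""
    digits = hex_color.lstrip("#")
    channels = [int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    r, g, b = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(first, second):
    """WCAG contrast ratio between two colors, from 1 (none) to 21 (maximum)."""
    lighter, darker = sorted(
        (relative_luminance(first), relative_luminance(second)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def text_color_for(background):
    """Pick black or white text, whichever contrasts more with the background."""
    return max(("#000000", "#ffffff"), key=lambda text: contrast_ratio(text, background))


print(text_color_for("#ffff00"), text_color_for("#0000ff"))
```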
What was that experience like?
It was great. I had no previous experience with Node.js, but getting the
application up and running was easy. I got basic backgroundColor support
working in a few hours, but it took a few nights to tweak things and write
tests. Fortunately, Shane is also based in Europe, so we could easily work
together. When Shane showed our work on the mailing list, response from the
other developers was very positive.
It would be really great if this helps drive Persona adoption amongst large
Any plans for future contributions?
I’ll probably stay involved for the foreseeable future. Now that I know what
I’m doing with the dialog, I would like to help out with further improvements
to the login flow and website API. I’m also very interested in stabilization
and/or standardization of the Identity Provider API.
Two weeks ago, I posted a graphic showing a visualization of 2.5 years of my
location data to my social media feeds. I wanted to jot down a few notes on
how the plan to create this image came together.
As I remember it, I saw similar locative art from some artist a few years
ago. It was like the mapping work from Daniel Belasco Rogers, but I
think it was done by a Dutch guy, with very sparse white maps with red lines,
who had mapped several European cities (including Amsterdam). I spent a few
fruitless hours last week trying to find the "originals" I remember; Mr.
Rogers' is the closest analogue to the other guy's work I could find.
There's something about these maps that resonated with me: the patterns of
a familiar city combined with the paved cow paths of a person's routine,
seen from above, from an entirely different perspective. I soon decided I would
like to build one of these from my own paths, but I didn't own a GPS device,
and some idea of an "art project" certainly wasn't reason enough to buy one.
Fast forward through time, and we get smartphones with location sensors and
Google's Latitude service, which added location history and a limited API (last
30 days of history only) soon after launching. The API was announced in May
2010; it might not be a coincidence that my Latitude history starts on
2010-05-20. Finally, Google's Data Liberation Front posted a short blog
post last week announcing the availability of data dumps containing all
So after a few years of gathering data while waiting for devices and software
to align, I could get to work. Drawing little dots onto an SVG canvas is
actually very easy: the hard part was making up some heuristics to create a
sensible bounding box. If the bounding box is too large, you get a large white
space with a few clumps of dots in different places; if it's too small, you
get a view of your home town with a whole lot of dots in it. I ended up
implementing a slightly convoluted algorithm to measure the ratio of required
surface versus the number of points in it and taking a derivative from that
line. That gave satisfactory results on my own data set, but I have no
clue as to how robust the algorithm is.
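
To give a feel for the problem, here is a minimal sketch. It does not reproduce the ratio-and-derivative algorithm described above; it swaps in a much simpler stand-in that drops a fixed fraction of extreme points on each axis before drawing. The function names and the `trim` parameter are assumptions, and it expects the trimmed points to span more than a single latitude or longitude.

```python
def trimmed_bbox(points, trim=0.05):
    """Bounding box (south, west, north, east) ignoring the outer `trim` fraction per side."""
    lats = sorted(lat for lat, lon in points)
    lons = sorted(lon for lat, lon in points)
    cut = round(len(points) * trim)
    return lats[cut], lons[cut], lats[-1 - cut], lons[-1 - cut]


def to_svg(points, bbox, width=400, height=400, radius=1):
    """Draw every point inside the bounding box as a small circle."""
    south, west, north, east = bbox
    circles = []
    for lat, lon in points:
        if not (south <= lat <= north and west <= lon <= east):
            continue  # outside the chosen view
        x = (lon - west) / (east - west) * width
        y = (north - lat) / (north - south) * height  # SVG y grows downward
        circles.append(f'<circle cx="{x:.1f}" cy="{y:.1f}" r="{radius}"/>')
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        + "".join(circles)
        + "</svg>"
    )


# Made-up data: a cluster of points plus two far-away outliers.
points = [(52.3 + i * 0.01, 4.8 + i * 0.01) for i in range(18)]
points += [(40.0, -74.0), (60.0, 25.0)]
svg = to_svg(points, trimmed_bbox(points))
```

A fixed trim fraction is easy to reason about, but it has the same weakness the post describes: the right value depends on how the data is distributed.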
Implementing an idea that's been kicking around for a few years is an (oddly?)
satisfying experience. If you happen to be interested in this kind of thing,
my code just needs a Python 2.7 environment. I'm not sure the resulting
images would qualify as "art", but I'm happy with how this turned out.
Two weeks ago, I completed the Compilers course on Coursera; it was a very
worthwhile experience. From the amount of discussion about e-learning, it seems
a pretty hot topic. So far, I haven't seen any posts about what it's like to
actually participate in a Massive Open Online Course (MOOC), so I figured I'd
write up some thoughts on my experience. It's pretty detailed, so:
Taking this course was a great way to learn more about compilers and fill a
hole in my CS curriculum. Professor Alex Aiken is a great instructor and
covers a good amount of material. I learned a lot about compiler construction
despite having toyed with my own compiler before starting the course. The
programming assignments were particularly tough, giving me useful experience
in building compilers and a great sense of achievement. Coursera seems a
nicely designed platform, and I'd like to try some other courses next year.
I heard about the course via a Google+ post at the end of April. I'd
been playing with writing my own compiler for a language I'm experimenting
with, and I figured this would be a good way to learn a few things about what
works in compiler design. For my own project, I had gotten started in Python,
with a custom regex-based lexer, a Pratt parser and doing fairly basic code
generation by writing out LLVM IR. At some point, the code generation code
grew unwieldy and I split it up into a "flow" stage, doing what I now know to
be called semantic analysis and turning the AST into a control flow graph, and
an actual code generation stage to translate the CFG to IR.
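
For readers who haven't seen the front-end combination mentioned above, the following sketch pairs a regex-based lexer with a Pratt parser. It handles only integer arithmetic and is not taken from the compiler described in this post; the token names and binding powers are assumptions.

```python
import re

TOKEN = re.compile(r"\s*(?:(\d+)|(\S))")
BINDING = {"+": 10, "-": 10, "*": 20, "/": 20}  # higher binds tighter


def lex(source):
    """Yield ('num', int) and ('op', char) tokens, followed by an ('end', None) marker."""
    for number, op in TOKEN.findall(source):
        yield ("num", int(number)) if number else ("op", op)
    yield ("end", None)


class Parser:
    def __init__(self, source):
        self.tokens = lex(source)
        self.current = next(self.tokens)

    def advance(self):
        token = self.current
        self.current = next(self.tokens, ("end", None))
        return token

    def expression(self, min_power=0):
        kind, value = self.advance()
        if kind == "num":
            left = value
        elif (kind, value) == ("op", "("):
            left = self.expression()
            if self.advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
        elif (kind, value) == ("op", "-"):
            left = ("neg", self.expression(30))  # prefix minus binds tightest
        else:
            raise SyntaxError(f"unexpected token {value!r}")
        # Keep consuming infix operators that bind tighter than our caller's.
        while self.current[0] == "op" and BINDING.get(self.current[1], 0) > min_power:
            _, op = self.advance()
            left = (op, left, self.expression(BINDING[op]))
        return left


def parse(source):
    """Parse an arithmetic expression into a nested-tuple AST."""
    parser = Parser(source)
    tree = parser.expression()
    if parser.current[0] != "end":
        raise SyntaxError(f"unexpected token {parser.current[1]!r}")
    return tree


print(parse("1 + 2 * 3"))
```

The whole precedence table lives in `BINDING`, which is a large part of why Pratt parsers stay compact for expression-heavy grammars.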
There was no compilers class for my CS program in university, and I had
substituted another programming class for the assembler programming class they
offered, but one of Steve Yegge's rants had stuck with me. So I figured
that, with my master's in CS and some experience writing a basic compiler "from
scratch", I would be able to handle the course next to my full-time day job.
This compilers course is originally from Stanford, and supervised by professor
Alex Aiken and some staff via Coursera. There is a similar class on Udacity,
which I didn't find until after I'd already started at Coursera. The Coursera
version is comprised of lectures, quizzes, proof assignments, mid-term and
final tests and programming assignments, with the programming assignments being
graded separately so that there are two levels of participation. This
installment (I think it will run again in the future) ran from April 19,
when the first lectures were available, to July 6, when the final test was due.
Lectures were posted each week. There were roughly 90 minutes of video
per week at minimum, though some weeks ran up to 160 minutes. The videos were
split into pieces of about 5 to 25 minutes, which made the viewing significantly more
manageable. I used HTML5 video inside the browser (which worked great even on
3G internet), but you can also download each video separately. For the
in-browser viewer, there's an option to view at speeds from 0.5x to 2x, but
it's hidden in the user preferences, so I didn't find it until after I was
done with the course.
Each video starts with a short introduction where you can see the professor
talking; after that, you see the slides, which the professor scribbles notes
on as he goes along explaining the material. The slides can be downloaded
separately as PDFs for later review, in two versions: the pristine version and
the one with the notes scribbled on them by the professor.
I didn't like lectures much in university, but found that I actually liked
watching these. The pacing is pretty good, although of course it's sometimes
a little slow and sometimes a little fast, but the professor was engaging and
the scribbling on the sheet makes it feel a little more interactive than your
standard slides plus narration. It also helped me that the videos were generally
pretty short, so you can watch one, do something else for a bit, then watch
Quizzes & tests
There were 6 quizzes, spaced out in time throughout the course, each covering
the material in the lectures posted since the last quiz. There were two
deadlines on each quiz: the early deadline, something like a week after the
quiz was posted, and the end of the course, for half credit. Each quiz could
be taken as many times as you wanted, the highest score would count, and you
got to see the correct answers as soon as you submitted the quiz.
The questions were pretty challenging. In the first few quizzes, I just
clicked through and didn't end up getting very good scores. If I felt the score
was too low, I'd take it again and see if I could do better after studying
the correct answers for a bit, but it didn't always get much better. I was
treating it like a small exam on what I'd learned from the last set of
At some point halfway through, I changed my strategy and started seeing the
quizzes more as additional material. I took extra time, started noting things
down on paper and really working through the problems, as many times as
necessary to get a perfect score. I feel this was a much better way to do it,
because I actually learned things by checking my answers.
I found some of the questions annoying because I felt they required very
detailed reasoning through an example DFA or generating MIPS assembly code by
hand. I don't mind building up assembly code for the programming assignment
(that much -- see below) but I was really hoping more for questions that
tested my understanding of the underlying theory rather than my reproduction
skills of detailed algorithms laid out in the lectures. Of course, taking
more time for the quizzes helped with that, too, but I'm still inclined to
dislike those kinds of questions.
The tests were more or less like the quizzes, though slightly harder. I didn't
do that well on the mid-term, where I got about 50%; I got 75% on the final,
which was much more satisfying. There was some grumbling in the forums about
some of the questions being ambiguous or even just wrong, but this wasn't a big
issue for me.
There were 6 proof assignments, run via a small web app called DeduceIt. The
web app is a little rough around the edges, and so these assignments could
just get you extra credit. In all of these assignments, some part of the week's
material was represented in a proof assignment, where we were supplied with a
number of given statements, a goal statement and a few production rules. We
had to apply the rules to the given statements in a particular number of
steps to derive the goal statement.
This would all be fine if it wasn't for the representation used in the
assignments. To require students to carefully apply the rules to given
statements themselves, both rules and statements are given as LaTeX-like text
expressions. These are hard to read and very tedious to reproduce, making the
assignments more about understanding the representation than actually going
through the proof.
On the other hand, the proof assignments are a good way to go through the
algorithms that were covered in the lectures without having to dive into the
details of actual code implementing the algorithm. I liked doing them for this
reason, but felt the representation got in the way more and more as the
assignments got more complicated towards the end of the course.
The real meat of the course, for me, was in the programming assignments,
implementing a compiler for the Classroom Object-Oriented Language (COOL).
Each assignment corresponded to one of the essential compiler stages: a lexer,
a parser, a semantic analyzer and a code generator. All assignments were
available in C++ or Java flavors, with the first two being built around
the flex/JLex and bison/CUP tools respectively. Using another language was
also allowed, with the caveat that it required
reimplementing some of the support code that was made available.
I decided to go with the Java variants of the first two assignments, on the
premise that it would be educational to learn the usage of tools like JLex
and CUP rather than building my own lexer and parser like I'd done for my own
compiler. Getting started with JLex was fairly frustrating, so I almost dumped
it in favor of doing it in Python, but once I got the hang of it it turned out
(I've since been wondering whether students could learn more if they
implemented a lexer or parser from scratch versus using something like flex
or bison. At least from my own experience, writing a lexer/parser for a
realistic language is not that hard. Of course flex results in a highly
optimized lexer, but writing up a lexing algorithm from scratch would seem more
instructive. On the other hand, perhaps using the generator tool allows you to
focus more on the characteristics of the grammar you're developing, instead of
it potentially getting lost in procedural code.)
The reference compiler is implemented in the same four stages, with a simple
text format used to communicate the relevant data through UNIX pipes: the
lexer consumes program code and spits out a simple line-based list of nodes,
which the parser consumes and turns into an indentation-based AST
representation, the semantic analyzer augments the AST with more typing
information and finally the code generator outputs MIPS assembler code.
Grading was done via Perl scripts that fuzzily compare the output of your
code to that of the reference component, running over a list of sample
programs (both valid and invalid) to score your program. This worked quite
nicely and made it straightforward to find the remaining issues. In fact, the
main way I wrote my code was by starting from a small program, trying to
generate some reasonable output, then using diff -u to compare it to the
output from the reference program and see where my code failed. I often find
that it's easier to stay motivated when generating small successes with some
regularity, and this way of working made it harder to get stuck. On the other
hand, it might have been better to do a little more design up front, forcing
me to think through issues rather than hacking away at problems that come up.
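
The same compare-against-the-reference loop can be scripted. The sketch below uses Python's standard `difflib` to produce a unified diff like `diff -u` does; the helper name and the token lines are made up for illustration and are not the real output of the reference lexer.

```python
import difflib


def diff_against_reference(mine, reference):
    """Return a unified diff from the reference output to ours ('' if they match)."""
    return "".join(
        difflib.unified_diff(
            reference.splitlines(keepends=True),
            mine.splitlines(keepends=True),
            fromfile="reference",
            tofile="mine",
        )
    )


# Hypothetical lexer output; one token kind differs on line 2.
reference = "#1 CLASS\n#2 TYPEID Main\n"
mine = "#1 CLASS\n#2 OBJECTID Main\n"
print(diff_against_reference(mine, reference))
```

An empty result means the stage matches the reference for that sample program, which makes it easy to loop over a directory of samples and stop at the first mismatch.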
I ended up doing the last two assignments in Python. I wrote up a small
parser for the textual representation of the AST used and some infrastructure
to easily write passes over the AST, then got started with the actual
assignment. This worked great; Python is the language I use most, and not
having to think so much about typing or memory management made it easier to
get the assignments done. In my opinion, Python is a much better language for
most teaching purposes than either Java or C++, because it enables more
focus on the actual algorithm (versus the "details" of programming).
Apparently the Udacity compilers course uses Python, so I might have gone
there if I'd known about it before starting with Coursera.
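
As an impression of what such pass infrastructure can look like, here is a minimal sketch. It is not the code written for the assignments: the node kinds, class names and the toy type rule are all assumptions. Each pass dispatches on the node kind and falls back to visiting the children.

```python
class Node:
    """A generic AST node: a kind, child nodes, an optional value and a type slot."""

    def __init__(self, kind, *children, value=None):
        self.kind = kind
        self.children = list(children)
        self.value = value
        self.type = None


class Pass:
    """Base class: calls visit_<kind> if defined, otherwise walks the children."""

    def visit(self, node):
        method = getattr(self, "visit_" + node.kind, self.generic_visit)
        return method(node)

    def generic_visit(self, node):
        for child in node.children:
            self.visit(child)


class TypePass(Pass):
    """Toy semantic analysis: annotate literals and check operands of 'plus'."""

    def __init__(self):
        self.errors = []

    def visit_int(self, node):
        node.type = "Int"

    def visit_string(self, node):
        node.type = "String"

    def visit_plus(self, node):
        self.generic_visit(node)  # annotate the operands first
        if any(child.type != "Int" for child in node.children):
            self.errors.append("plus needs Int operands")
        node.type = "Int"


tree = Node("plus", Node("int", value=1), Node("string", value="two"))
checker = TypePass()
checker.visit(tree)
print(checker.errors)
```

With dynamic dispatch on the node kind, adding a new pass only means writing the `visit_` methods that pass cares about.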
The programming assignments took a lot of time. With my full-time job, I
found it pretty hard to find 10-25 hours per assignment of focused time reading
documentation, writing code and testing against sample programs. However,
these assignments were also the most instructive and rewarding for me, so I
wouldn't want to have missed them, and having some deadlines also helped with
not procrastinating so much.
As a result, I ended up giving up on the last 20% of the code generation
assignment. Assembly is pretty verbose and hard to read, and working with
MIPS assembly certainly made me appreciate LLVM IR more (even if x86 assembly
is uglier still). Debugging assembly is also pretty painful, so I got a little
frustrated as the deadline was getting closer. The simulator used to run MIPS,
SPIM, was also a little limited and buggy in places, which certainly didn't
make things easier. In short, I'll be happy to return to LLVM and its suite
Part of why this course worked for me was definitely the community. The
deadlines can be harsh if you're not a full-time student, but being able to
get some help from staff, confer about particularly hairy parts of the
semantic analyzer or get some extra explanation about a quiz question makes
it much more fun to do. It should also make it less likely that you hit a wall
and have to give up, and it gives some extra incentive to finish the
whole course. This made the forums an integral part of the experience for me.
If you similarly lack any education in compilers and find that Steve Yegge's
aforementioned rant has you inspired, Coursera's course is a great way to learn
a lot about compilers. I'm sure the Udacity course is nice, too, but the
Coursera environment seems more attractive. I also found Coursera's relation
to renowned universities appealing, though that might just be good marketing.
In any case, I look forward to taking another course through their site.
I used to have a weblog. I took the posts offline 4 years ago, when I realized
that some posts weren't fit for the public internet. Since then, I've used
Twitter to vent random thoughts and links, but over the past year I've started
to miss having an outlet for longer articles (and practicing my writing).
Iteration 2 will be mostly about technology; programming languages, version
control systems, web technology and software engineering at large are just some
of the topics I like to think about. If that seems a broad selection, I would
agree: there are too many things I want to do. We'll see what's on here in a
year or so.
So, consider this a blog reboot. I have a few topics lined up, but it's going
to take some time to think through turning them into something presentable.