Dirkjan Ochtman: writing

Single-source Python 2/3 doctests

Sometime in 2009, I took over maintenance of CouchDB-Python from Christopher Lenz. While maintenance has slowed down over the years, since the core libraries work well and the CouchDB API has been quite stable, I still feel responsible for the project (I also still use it in a bunch of places). This being a Python project, it always felt like it would have to be ported to Python 3 sooner or later. Since it's working with a fairly deep HTTP API (as in, it uses a large subset of the protocol, with extensive hacking of httplib/http.client), the changes needed in string/bytes handling are quite involved.

My first serious attempt started in November of 2012, as evidenced by some old patches that I have lying around in mq repositories. I picked it back up again about a year later, until I had most of the tests passing, save for one specific category: the doctests. Specifically, the problem I had was with unicode literals (like u'str'). For Python 2.7 doctests, I needed the unicode annotation to pass the test. In Python 3, all strings are unicode; while unicode literals can be used in source code in Python 3.3 and later, the repr() of a string always lacks the unicode annotation. This resulted in lots of test failures like this:

======================================================================
FAIL: client (couchdb)
Doctest: couchdb.client
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.3/doctest.py", line 2154, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for couchdb.client
  File "./couchdb/client.py", line 8, in client

----------------------------------------------------------------------
File "./couchdb/client.py", line 15, in couchdb.client
Failed example:
    doc['type']
Expected:
    u'Person'
Got:
    'Person'
----------------------------------------------------------------------
File "./couchdb/client.py", line 17, in couchdb.client
Failed example:
    doc['name']
Expected:
    u'John Doe'
Got:
    'John Doe'

While these simple cases might have been easy to fix some other way (e.g. by printing the value instead of just asking for the representation), other cases would be significantly harder to fix that way. Here's one example:

----------------------------------------------------------------------
File "./couchdb/mapping.py", line 343, in couchdb.mapping.Document.items
Failed example:
    sorted(post.items())
Expected:
    [('_id', 'foo-bar'), ('author', u'Joe'), ('title', u'Foo bar')]
Got:
    [('_id', 'foo-bar'), ('author', 'Joe'), ('title', 'Foo bar')]

After asking around on the Python 3 porting mailing list, Lennart Regebro (the author of the Porting to Python 3 book) kindly pointed me to the relevant section of his book, but it didn't contain any great suggestions for this particular problem. It took me a few months to get back into it, but I started looking into the doctest APIs yesterday, and managed to figure out a fairly clean solution:

import doctest
import re
import sys

class Py23DocChecker(doctest.OutputChecker):
  def check_output(self, want, got, optionflags):
    if sys.version_info[0] > 2:
      # On Python 3, strip the u prefix from expected string literals.
      want = re.sub(r"u'(.*?)'", r"'\1'", want)
      want = re.sub(r'u"(.*?)"', r'"\1"', want)
    return doctest.OutputChecker.check_output(self, want, got, optionflags)

As it turns out, the doctest API is pretty well-designed, so it allows you to pass in your own OutputChecker object. As its name indicates, this is the bit of code that compares the actual output and the expected output of a given example. By slightly processing the expected value when running on Python 3, we can make sure that actual and expected output match on both versions. Use it like this:

doctest.DocTestSuite(mod, checker=Py23DocChecker())
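
To see the checker at work outside a full test suite, here is a minimal sketch that calls check_output directly and then runs one Python 2 style example through doctest.DocTestRunner. The SAMPLE string and the run_sample helper are made up for illustration; they are not part of CouchDB-Python.

```python
import doctest
import re
import sys


class Py23DocChecker(doctest.OutputChecker):
    def check_output(self, want, got, optionflags):
        if sys.version_info[0] > 2:
            # On Python 3, strip the u prefix from expected string literals.
            want = re.sub(r"u'(.*?)'", r"'\1'", want)
            want = re.sub(r'u"(.*?)"', r'"\1"', want)
        return doctest.OutputChecker.check_output(self, want, got, optionflags)


# Hypothetical doctest text, written the way a Python 2 docstring would be.
SAMPLE = """
>>> sorted({'author': u'Joe'}.items())
[('author', u'Joe')]
"""


def run_sample():
    """Run SAMPLE with the custom checker and return the doctest results."""
    test = doctest.DocTestParser().get_doctest(SAMPLE, {}, "sample", None, 0)
    runner = doctest.DocTestRunner(checker=Py23DocChecker(), verbose=False)
    return runner.run(test)


results = run_sample()
print(results.failed, results.attempted)
```

With the default doctest.OutputChecker, the same example fails on Python 3, because the expected output still carries the u prefix.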

Fixing these test failures has cleared the way (along with some other fixes) for a Python 3-compatible CouchDB-Python release soon. I hope this will enable other projects to start moving in the direction of 3.x; at the very least, it should significantly lower the barrier for my own projects to start using Python 3.

Are We Meeting Yet?

For a few months now, I've worked on a little single-file web thingy: Are We Meeting Yet? (AWMY for short). Here are two example URLs:

Gervase Markham kindly wrote about it on his blog after I recommended it for a Firefox development meeting, which made me think I should write about it here.

What it is

AWMY is a tool to communicate event (meeting) times to geographically dispersed and therefore timezone-challenged audiences. This means it displays date/time values in (a) an original timezone, (b) the UTC timezone and (c) the user's local timezone, with a title or description and a countdown timer.

Critically, it supports recurring meetings in a way that a single URL will show the next meeting in the series no matter when it's loaded into the browser. This makes it a good fit for use in automatically generated meeting announcements. Currently, the only supported repeating modes are weekly and bi-weekly.
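
As a rough illustration of the recurring-meeting idea (not AWMY's actual code, which is not shown here), the next meeting in a series can be found by rounding the elapsed time up to a whole number of weeks. The sketch below assumes all times are in UTC and ignores daylight savings; the function name next_occurrence is hypothetical, while the mode letters "w" and "b" are the ones the page accepts.

```python
from datetime import datetime, timedelta, timezone


def next_occurrence(first, now, mode):
    """Return the first meeting in the series that starts at or after `now`.

    `first` is any known meeting of the series, `mode` is "w" (weekly)
    or "b" (bi-weekly). Both datetimes are assumed to be in UTC.
    """
    if mode not in ("w", "b"):
        raise ValueError("unknown repeating mode: %r" % (mode,))
    step = timedelta(weeks=1 if mode == "w" else 2)
    if now <= first:
        return first
    elapsed = now - first
    # Ceiling division: number of whole steps needed to reach or pass `now`.
    steps = -(-elapsed // step)
    return first + steps * step


first = datetime(2013, 8, 26, 15, 30, tzinfo=timezone.utc)
now = datetime(2013, 9, 1, 0, 0, tzinfo=timezone.utc)
print(next_occurrence(first, now, "w"))  # 2013-09-02 15:30:00+00:00
print(next_occurrence(first, now, "b"))  # 2013-09-09 15:30:00+00:00
```

Because the result depends only on the series definition and the current time, the same URL keeps pointing at the upcoming meeting.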

One of the design goals is to have nice-looking URLs; ideally, you can understand the meeting date/time from the URL even without clicking the link. For now, hacking the URL is the only way to create a new event page; this should be easy in most cases. I hope to add a form to make it even easier sometime soon.

Timezone support is based on the venerable Olson timezone database. I've put some thought into handling events near daylight savings transitions and tried to put in some warnings, but it's probably not perfect yet. At least weekend events close to daylight savings transitions should be somewhat rare.

The domain name was chosen because it fits in with a Mozilla meme (e.g. fast, pretty, small, popular, flash and probably others); I couldn't come up with a better alternative that was also still available. This one will hopefully be memorable at least for some part of the intended audience.

How to use it

In the current iteration, the page accepts a maximum of 5 arguments:

  • A timezone: a subset of Olson timezones is accepted and can be referenced in a few different forms. Only the continent timezones are accepted (e.g. "America/Los_Angeles", "Europe/Amsterdam"), plus the "UTC" timezone. The continent is optional (and left out in the canonical versions). A space can be used where underscores are used in timezone names.
  • A date: an ISO 8601-formatted date, like "2013-08-26". A three-letter weekday abbreviation also works here (like "Mon"), but it will emit a warning if used without the weekly repeating mode.
  • A time: ISO 8601-formatted 24-hour time, like "15:30".
  • A repeating mode: currently "w" for weekly or "b" for bi-weekly.
  • A title: any text.
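
The rules in the list above can be sketched in a few lines of Python. This is only an illustration of how the individual values might be recognized once they have been split out of the URL; the function names, the three-entry zone table and the choice to return a weekday index are all assumptions, not AWMY's real implementation.

```python
from datetime import date, time

# Tiny illustrative sample; the real tool accepts many more zones.
SAMPLE_ZONES = ("America/Los_Angeles", "Europe/Amsterdam", "UTC")
WEEKDAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")


def normalize_zone(name):
    """Map "Los Angeles" or "America/Los_Angeles" to the full Olson name."""
    wanted = name.replace(" ", "_")
    for zone in SAMPLE_ZONES:
        # Accept the full name or just the part after the continent.
        if wanted in (zone, zone.split("/")[-1]):
            return zone
    return None


def parse_date(text):
    """Return a date for "2013-08-26", or a weekday index (Mon=0) for "Mon"."""
    if text in WEEKDAYS:
        return WEEKDAYS.index(text)
    return date.fromisoformat(text)


def parse_time(text):
    """Return a time for a 24-hour "HH:MM" string such as "15:30"."""
    return time.fromisoformat(text)


print(normalize_zone("Los Angeles"))  # America/Los_Angeles
print(parse_date("2013-08-26"))       # 2013-08-26
print(parse_time("15:30"))            # 15:30:00
```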

If no timezone is provided, it's assumed to be UTC. Some examples:

Why

I got started based on some discussion on the mozilla-governance mailing list. Most Mozilla meetings are coordinated based on the timezone for the Mozilla HQ, in California. For many non-US participants, it's easier if meeting times are communicated in UTC, because they know their own UTC offset. However, this would change actual local meeting times based on daylight savings, which is a bit of a pain for recurring meetings. Therefore, it makes more sense to keep the reference meeting time in a timezone that has daylight savings, on the premise that most people live in zones that use mostly similar daylight savings schedules.
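
A small worked example of that trade-off, assuming a hypothetical meeting at 11:00 California time. The two UTC offsets are hard-coded to keep the sketch self-contained; a real implementation would look them up in the Olson database.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for US Pacific time, hard-coded for illustration only.
PDT = timezone(timedelta(hours=-7), "PDT")  # summer (daylight savings)
PST = timezone(timedelta(hours=-8), "PST")  # winter (standard time)


def utc_hour(local_meeting):
    """Return the UTC hour at which an aware local meeting time falls."""
    return local_meeting.astimezone(timezone.utc).hour


summer = datetime(2013, 8, 26, 11, 0, tzinfo=PDT)
winter = datetime(2013, 12, 2, 11, 0, tzinfo=PST)
print(utc_hour(summer))  # 18
print(utc_hour(winter))  # 19
```

The local time stays at 11:00 all year, while the UTC time moves by an hour. Announcing the meeting in UTC would instead pin the UTC hour and shift the local time for everyone in a zone that observes daylight savings.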

Some tools exist: for example, here's a timeanddate.com link used for a Firefox developer meeting. Although timeanddate.com has most of the information available from AWMY, it's provided in a much more cluttered fashion. Personally, I find it quite hard to visually parse that page to find the data I need. Of course, it does provide other useful features that AWMY does not currently offer.

I've also seen everytimezone.com used for this kind of thing; here's an example. It does provide the user with a sense of context, which is probably useful when you want to see what meeting times make sense in timezones you care about. For the purpose of communicating a single meeting time, it feels rather unfocused.

The user experience for these tools doesn't work well for this use case, so I thought I might be able to do better. On top of that, the other tools don't appear to handle recurring meetings. Having a stable URL for a series of events is useful when you want to point to a meeting time from many different places, but having to update each pointer every week is kind of a drag. Thus was born AWMY.

Future plans

At the top of my to-do list is a feature to combine event series. This is mostly inspired by CouchDB meetings, which take place at alternating 13:00 UTC and 19:00 UTC times to accommodate people in different timezones. My current implementation strategy is to have a "merge" flag that signals another meeting series, such that two bi-weekly event series can be joined together.

As mentioned before, a friendlier UI to build new events is one of my other priorities. A few form elements could go a long way, though I probably want a slightly more polished experience. I'll also have to figure out how to make dealing with series easy, in particular when working with the merging feature.

It would make sense to add a few other repeating modes, in particular "3rd Wednesday of each month"-like functionality. Offering ICS downloads would be nice. I would like each page to show the next meeting instance, if only as an indication that you're dealing with a recurrent event.

Because there's no server-side component, I really want to keep all state in the URLs. On the other hand, I also want readable URLs. These goals don't always align well, so balancing them is an interesting act. I'm thinking about a way to generate alternative URLs that aren't as readable, but significantly shorter.

Wrapping up

I hope this will be a useful tool for the open source community (and anybody else who has a use for it). I'd be interested to hear your thoughts on what features would be most useful to add. If you want to contribute some code, that would be even better; check it out via the Bitbucket project. All feedback is welcome!