This is a list of projects I am working on, or have worked on. All are open source.
- Current work
- Small current projects
- Retired projects:
- An ongoing experiment in sharing and resharing content on the web. What we share now is URLs, pointers, but this doesn’t give the sharer much control over what people will experience. PageShot tries to add that control by giving sharers tools to copy exact experiences, annotate and collect them, summarize and improve on that content.
This is available as an alpha-level demo at pageshot.dev.mozaws.net.
- This took some of the ideas of TogetherJS, some of the tech from Browser Mirror, and thinking about them all in the context of a browser session. It’s implemented as a Firefox addon. The model we came to is one where a group of people share everything they do in a specific browser, with mutual awareness of what each person is doing (both in general, and with in-page feedback using TogetherJS), and people can join and present to each other in this context.
Our longer-term goal was for people to be able to understand what the others are doing: which pages they are interacting with, but also how they are interacting with those pages, similar to how you can understand what a person is doing when they are doing physical work.
Another goal was to take the same data we capture to share a person’s actions with everyone else and use it to create a richer record of what people are doing. We wanted knowledge capture and transfer between group members that can encompass not just one site or document, but a session, discussion, and research.
I narrated a screencast that we made to explain some of the concepts and things we learned from the experiment.
- TogetherJS (formerly TowTruck)
- My previous project at Mozilla. This is a service to enable real-time collaboration on any site.
- An experiment in extracting structured data from web sites. Works as a Firefox addon. This is basically furloughed (as of April 2013) pending progress on Web Activities. The addon allows custom scrapers to be run on sites, with several levels of separation so that scraping scripts run safely and are isolated from the site receiving the data. There is some more information on this site.
- The Cut-Out
Small (but current) Stuff
- Misc recipes
- Sometimes when I want to try out an idea, I throw it in this directory. There’s a grab bag of experiments here.
- Browser Commander
- An incomplete prototype of a kind of file manager built in the browser. Includes a small server and client library to provide Node.js APIs in the browser (which are proxied over WebSockets to the server).
- A small script to use git more like rsync for the purpose of deployment. The motivation is explained somewhat in this post.
- Not very active right now. This is a library to do client verification of Open Web App receipts. It’s not very exciting, but it does do a heck of a lot of error handling. It was an exercise in thoroughness. Maintained by other people now.
- I wanted to compile Christmas carols that I liked, without any fluff. Also it’s an example of an Open Web App.
- An attempt to turn ambiguous touch events into modal selections, as a client library.
- Browser Mirror
- This project has largely been superseded by TogetherJS. This is like a screensharing system, except it works with the DOM instead of pixels - the page you are viewing is transmitted to the other party, but not a “live” page, literally just the things you see. Things like clicks are transmitted back to the original browser. Just like screensharing…? Started out as a “this can’t possibly work” project, but then it kind of worked.
- An offshoot of BrowserMirror and related ideas, taking a snapshot of the DOM and saving it, then allowing annotation of that static page. Like inline commenting without any of the tricky web overlay ideas.
I don’t currently work on these Python projects, but I authored some important (and now core) parts of the Python packaging ecosystem.
- The Python Installer Program. A very popular installation tool for Python. I wrote this as a kind of response to `easy_install`, not because I hated `easy_install`, but because I knew it should be better and that it had relatively small problems that made people hate it. Pip solved problems people wanted solved, and it’s been very popular since then.
- This is an environment isolation tool for Python. The functionality is now going to be built more directly into Python itself, but virtualenv remains very popular for managing projects. workingenv was an early attempt at the same concept, but virtualenv stuck because it works really well. It’s just the right level of hack to get everything to work consistently and well.
### Python Web
There’s a bunch of projects that I’ve worked on in the web space, either authoring or making major contributions. I’m not actively working on any of these myself any more. Some of the projects in this list are popular, some I think were influential. Several of them I *wish* were influential.
WebOb is a Python request and response library (along with some other miscellaneous pieces).
WebOb was an extraction of the ideas in Paste (as seen below), but with a bit more opinion. It’s also the result of letting a bunch of ideas gel for a while. WebOb has a very specific scope, and implements everything within that scope in an incredibly thorough way. Its completeness exceeds that of any other Python project I’m aware of (and of other languages, to the degree I’m aware of them). WebOb understands HTTP really well - and it knows how to both read and write HTTP, client and server. It’s primarily focused on being a library for servers, but it can generate requests and parse responses as well, making it uniquely appropriate for middle layers.
WebOb is the basis of many Python web frameworks, Pyramid/Pylons probably foremost among them. (Django and Werkzeug/Flask being the notable exceptions.)
- Paste Core
Paste, as well as Paste Deploy and Paste Script, were part of a general effort to write a web framework toolkit. Each contained low-level routines for different tasks.
Paste core is primarily concerned with a set of independent tools for use with WSGI. This includes things like simple routing, static file serving (later turned into `webob.static`), exception reporting and debugging (later turned into WebError), validation (would become `wsgiref.validate`), logging, testing (would turn into WebTest), CGI routing, and other stuff.
Where possible much of this has been moved into WebOb or other packages built on WebOb.
The context in which Paste was created was one with a lot of competing Python web frameworks, where there was little overlap in the technology those frameworks used. Paste tried to create a basic foundation for sharing technology, while still allowing diversity in the actual web frameworks themselves. It succeeded to a degree.
Paste contained the first web-based interactive debugger for Python (if or when similar tools were built for other languages, I don’t know). Other tools are better now, but Paste pioneered the idea.
- Paste Deploy
- Part of the Paste project, this is a configuration system for web apps. It has been used primarily for two things: configuration during actual deployment, and application composition.
- Paste Script
- Also part of the Paste project, this is a kind of framework for building command-line tools to build applications, and some useful commands to go with it. It includes things like a web server container (does that make sense?), and by far the most popular piece has been `paster create`, which helps set up boilerplate files for new projects.
- Silver Lining
- This was an attempt to create a general web application hosting framework. Vaguely like Heroku; you’d lay your application out in a certain way, and using Silver Lining you could create a new server instance and upload and update your application. It supported both Python and PHP, with the potential for other languages in the future. I thought it was really cool, but no one else really got on board. Maybe once you decide to give up this level of control you would rather just pay for the service instead of using an open source product. I’m not sure. This was kind of my last hurrah for server-side web projects.
- This is a WebOb rewrite of `paste.fixture`. WebTest is a tool to make functional tests of your WSGI-using web application (not specifically WebOb applications). It makes it easy to create artificial but full HTTP requests, send them to your application, and provides a bunch of helpers to inspect the results. One feature I think is notable (not challenging, just notable) is that it always contains an implicit “this request should work” assertion. Other similar frameworks force you to write `assertEqual(resp.code, 200)` all the time; what nonsense! In WebTest this is implied unless you say otherwise. It does form parsing and cookie storage and other stuff too.
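The implicit assertion can be illustrated with a minimal, stdlib-only sketch. This is not WebTest's API: `call_app`, its `status` parameter, and `demo_app` are hypothetical names chosen for this example, and the "2xx or 3xx counts as working" rule is an assumption of the sketch.

```python
from wsgiref.util import setup_testing_defaults


def call_app(app, path="/", status=None):
    """Call a WSGI app once and return (status_code, headers, body).

    Unless `status` names the expected code, any response that is not
    2xx or 3xx raises AssertionError: "this request should work" is implied.
    """
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fill in the rest of a valid environ
    captured = {}

    def start_response(status_line, headers, exc_info=None):
        captured["code"] = int(status_line.split(" ", 1)[0])
        captured["headers"] = headers
        return lambda data: None  # legacy write() callable, unused here

    body = b"".join(app(environ, start_response))
    code = captured["code"]
    if status is None:
        assert 200 <= code < 400, f"unexpected status {code} for {path}"
    else:
        assert code == status, f"expected {status}, got {code} for {path}"
    return code, captured["headers"], body


def demo_app(environ, start_response):
    """Hypothetical app: '/' works, everything else is a 404."""
    if environ["PATH_INFO"] == "/":
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"hello"]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"missing"]
```

With this helper, `call_app(demo_app, "/")` needs no explicit status check, while a test that expects a failure has to say so: `call_app(demo_app, "/nope", status=404)`.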
- A form validation and conversion toolkit for web applications. It’s built on the idea of a two-way validation/conversion pipeline, structuring and destructuring the content of HTML forms. It also includes `formencode.htmlfill`, a library to rewrite forms to insert form values; a novel approach, one that I think deserved to be used more, but like much of FormEncode probably obsolete now as client-side validation is more appropriate. Like many Hard Problems of web development in the previous decade, this hard problem is best avoided instead of solved.
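A two-way validation/conversion pipeline can be sketched in a few lines. The method names `to_python` and `from_python` echo FormEncode's vocabulary, but the classes below (`Invalid`, `Int`, `Text`, `Schema`) are hypothetical, stdlib-only stand-ins written for this example, not FormEncode's implementation.

```python
class Invalid(Exception):
    """Raised when a submitted value cannot be converted."""


class Int:
    def to_python(self, value):
        # Form data arrives as text; convert it or report why we cannot.
        try:
            return int(value.strip())
        except ValueError:
            raise Invalid(f"not an integer: {value!r}") from None

    def from_python(self, value):
        return str(value)


class Text:
    def to_python(self, value):
        return value.strip()

    def from_python(self, value):
        return str(value)


class Schema:
    """Applies one converter per field, in both directions."""

    def __init__(self, **fields):
        self.fields = fields

    def to_python(self, form):
        result, errors = {}, {}
        for name, converter in self.fields.items():
            try:
                result[name] = converter.to_python(form.get(name, ""))
            except Invalid as exc:
                errors[name] = str(exc)
        if errors:
            raise Invalid(errors)  # one error message per failing field
        return result

    def from_python(self, data):
        # The reverse direction: Python values back to form strings.
        return {
            name: converter.from_python(data[name])
            for name, converter in self.fields.items()
        }
```

The reverse direction is what makes refilling a form possible: the same schema that parsed `{"age": " 42 "}` into `{"age": 42}` can turn stored values back into the strings a form expects.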
- lxml.html (and cssselect)
I didn’t write lxml (the general XML and HTML Python library, based on libxml2). But I did write the HTML-specific wrapper in lxml, which exposes HTML-specific semantics. This adds knowledge of links to lxml, and forms, and some other stuff - all of which has made lxml an excellent screen scraping library.
I also wrote `lxml.cssselect`. libxml2 contains an XPath parser, and cssselect translates CSS3 selectors to XPath expressions. If you are curious what this looks like, css2xpath is a little webapp to do those same translations. I hope you will agree that XPath is ugly.
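To give a flavor of the translation, here is a toy translator for a tiny subset of CSS (tag names, `.class`, `#id`, and the descendant combinator). It is a sketch written for this page, not cssselect's implementation, and the XPath it produces is not claimed to match cssselect's exact output; `simple_css_to_xpath` is a hypothetical name.

```python
import re

# One compound selector: optional tag (or *), then any number of .class / #id.
_COMPOUND = re.compile(r"^(?P<tag>[a-zA-Z][\w-]*|\*)?(?P<rest>(?:[.#][\w-]+)*)$")
_PART = re.compile(r"([.#])([\w-]+)")


def simple_css_to_xpath(selector):
    """Translate e.g. 'div.content a' into an equivalent XPath expression."""
    steps = []
    for compound in selector.split():  # whitespace = descendant combinator
        match = _COMPOUND.match(compound)
        if match is None:
            raise ValueError(f"unsupported selector: {compound!r}")
        step = match.group("tag") or "*"
        for kind, name in _PART.findall(match.group("rest")):
            if kind == "#":
                step += f"[@id='{name}']"
            else:
                # Class matching must tolerate multiple space-separated classes.
                step += (
                    "[contains(concat(' ', normalize-space(@class), ' '), "
                    f"' {name} ')]"
                )
        steps.append(step)
    if not steps:
        raise ValueError("empty selector")
    return "//" + "//".join(steps)
```

Even in this subset, the class test shows why the XPath side gets ugly: `.content` becomes a `contains(concat(...))` expression because XPath 1.0 has no notion of a space-separated token list.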
- An experiment in commenting on web pages. Worked as an intermediary to the web page, which turned out to be extremely fragile. Also this was before we had good HTML parsing available, so it had to use the Old Terrible Tools (like an HTML SAX parser). I think Rietveld’s UI was inspired by this.
- It’s been a long time since I worked on this. This is one of my first major open source projects, and one that actually had pickup. SQLObject is an ORM (Object Relational Mapper). It pioneered many metaprogramming techniques in Python, allowing a declarative class statement to be mapped to SQL, including SQL expressions. (NumPy championed some of these features, but I believe SQLObject was the first to use these features for SQL expressions.) The project is still active, but not very active. But SQLObject was an inspiration for features in frameworks like SQLAlchemy.
- A very small templating language. I sometimes wanted to generate strings where simple string substitution was too limiting, and I wanted a no-frills, no-complications template language. I think Tempita is really pretty slick. It also avoids the unnecessary punctuation other template languages have. It also supports structured content; I think sqltemplate is really cool, though I stopped using SQL before I wrote it.
- An App Engine tool that uses the Google Analytics API to get a comprehensive list of referrers. You could use this to watch for every referrer that came to your site, and check them off as you checked each one out. No one ever cared, but I found it useful (until it bit rotted).
- This project is now pretty much gone. It was an HTTP proxy that would rewrite all the outgoing requests to apply styles to those requests. The idea was to allow a diverse set of applications to be styled in a consistent way, to make up a “site”.
- A script testing framework. Lets you run a script in a subprocess, and inspect the results of that script: what its output was, stderr, error return codes, and any file changes.
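The idea can be sketched with the standard library alone. This is not ScriptTest's API: `snapshot`, `run_script`, `demo`, and the result keys are hypothetical names chosen for this example.

```python
import os
import subprocess
import sys
import tempfile


def snapshot(directory):
    """Map each file's relative path to its (size, mtime_ns)."""
    state = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            stat = os.stat(path)
            state[os.path.relpath(path, directory)] = (stat.st_size, stat.st_mtime_ns)
    return state


def run_script(args, cwd):
    """Run a command in `cwd` and report output, exit code, and file changes."""
    before = snapshot(cwd)
    proc = subprocess.run(args, cwd=cwd, capture_output=True, text=True)
    after = snapshot(cwd)
    return {
        "returncode": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "files_created": sorted(set(after) - set(before)),
        "files_deleted": sorted(set(before) - set(after)),
        "files_updated": sorted(
            path for path in after if path in before and after[path] != before[path]
        ),
    }


def demo():
    """Run a throwaway script in a temporary directory and report on it."""
    code = "import sys; open('out.txt', 'w').write('hi'); print('done'); sys.exit(3)"
    with tempfile.TemporaryDirectory() as tmp:
        return run_script([sys.executable, "-c", code], cwd=tmp)
```

Comparing directory snapshots before and after the run is the simplest way to detect file changes; it will miss a file that is rewritten with identical size and timestamp.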
- A mock library specifically for use with doctest. Using doctest, I found it was possible to make a pretty good mock library in almost no code. (It’s gotten a little bigger since then.)
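Why so little code is needed: doctest already compares printed output, so a mock only has to print what was called. The sketch below shows that idea with hypothetical names (`PrintingMock`, `notify`, `run_doctest`); it is not the library's actual API.

```python
import doctest


class PrintingMock:
    """Prints every call so a doctest can verify it as expected output."""

    def __init__(self, name, returns=None):
        self.name = name
        self.returns = returns

    def __call__(self, *args, **kwargs):
        parts = [repr(arg) for arg in args]
        parts += [f"{key}={value!r}" for key, value in sorted(kwargs.items())]
        print(f"Called {self.name}({', '.join(parts)})")
        return self.returns


def notify(send, user):
    """Code under test: sends a greeting through the given callable.

    >>> send = PrintingMock("send", returns=True)
    >>> notify(send, "ada")
    Called send('ada', subject='Hello')
    True
    """
    return send(user, subject="Hello")


def run_doctest():
    """Run the doctest in notify's docstring; returns (failed, attempted)."""
    globs = {"PrintingMock": PrintingMock, "notify": notify}
    test = doctest.DocTestParser().get_doctest(notify.__doc__, globs, "notify", None, 0)
    results = doctest.DocTestRunner(verbose=False).run(test)
    return results.failed, results.attempted
```

The expected call appears in the docstring as plain output, so a changed argument shows up as an ordinary doctest diff.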
- Mentioned above
- A Logo interpreter written in Python. A pretty complete implementation of the language, with a lot of support for calling into and out of Logo to Python.
- A very incomplete design of an “Application Package” for Python. Some of the motivation is described here. The spec is probably the most interesting part.
- An attempt to extract a best-practice format for locating Python objects with a string, something that Paste Deploy did in a kind of half-assed way.
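As an illustration of the problem, here is a minimal resolver for one possible string format. The `package.module:attr.subattr` syntax is an assumption made for this sketch (it resembles the form used by setuptools entry points) and is not necessarily the format this project settled on; `resolve` is a hypothetical name.

```python
import importlib


def resolve(spec):
    """Resolve 'package.module:attr.subattr' to a Python object.

    Everything before the colon is imported as a module; everything after
    it is looked up attribute by attribute. With no colon, the module
    itself is returned.
    """
    module_name, _sep, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    if attr_path:
        for attr in attr_path.split("."):
            obj = getattr(obj, attr)
    return obj
```

Separating the module from the attribute path with a colon avoids guessing where the importable part of a dotted name ends.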
- Check for bad URLs, why not?
- A little WSGI debugging tool to show HTTP headers.
- This was intended to be a little framework to make it easier to build high-quality command line utilities.
- This was a general build tool I built at The Open Planning Project. A dead end I suppose, but I liked the architecture. Now mostly I try to avoid building anything.
- A parser toolkit for .ini files. Much nicer parser than ConfigParser.
- This was an extraction of `paste.evalexception`, the interactive debugger in Paste.
- A small WSGI tool to try to protect “developer” backdoor access to web applications, in a generalized way.
- A small application to email a page to a person, kind of “share by email”. Uses lxml to capture a page in a way that can be embedded in an email.
- An Atompub server library. Atompub was supposed to be the culmination of REST thinking. This was my attempt to assimilate those ideas. No one ever actually cared about Atompub though, go figure? This was an interesting experiment, but never useful.
- The Object HTTP Mapper. An attempt to create an ORM-like layer for exposing objects over HTTP. ORMs are a little suspicious, OHM was even more suspicious. These days I’m pretty over REST, mostly because attempts like this to embrace REST were so unsatisfying.
- TaggerClient and TaggerStore
- An attempt to create a generalized tagging library. Part of a concept of postmodern, ad hoc, and eclectic components making up a web site.
- My best project name ever. This was a little piece of WSGI middleware (a wrapper you could apply around an application) to give live referrer statistics, so you could obsess over who came to your site.
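The middleware pattern itself is small. Below is a minimal sketch of a referrer-counting wrapper, assuming in-memory counts and no thread safety; `ReferrerCounter`, `hello_app`, and `fake_request` are hypothetical names, not this project's code.

```python
from collections import Counter
from wsgiref.util import setup_testing_defaults


class ReferrerCounter:
    """WSGI middleware: counts Referer headers, then delegates to the app."""

    def __init__(self, app):
        self.app = app
        self.counts = Counter()  # in-memory only; not thread-safe

    def __call__(self, environ, start_response):
        referrer = environ.get("HTTP_REFERER")  # WSGI name of the Referer header
        if referrer:
            self.counts[referrer] += 1
        return self.app(environ, start_response)


def hello_app(environ, start_response):
    """Hypothetical wrapped application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hi"]


def fake_request(app, referrer=None):
    """Send one synthetic request through `app` and return the body."""
    environ = {}
    setup_testing_defaults(environ)
    if referrer is not None:
        environ["HTTP_REFERER"] = referrer
    return b"".join(app(environ, lambda status, headers, exc_info=None: None))
```

Wrapping is just `app = ReferrerCounter(hello_app)`; the wrapped application never knows the counter is there, which is the appeal of doing this as middleware.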
- A silly hack of a WSGI middleware. This would spawn threads so that a request that took a really long time could progress, while still serving up some status information to the user. People use Celery queues and stuff like that these days, but for a little while some admin tasks could take a couple minutes, causing timeouts that would then abort the task.
- An experiment with HTML Overlays, kind of a microformat-style way of applying a template to a site, using simple directives to lay the content of a page over a template. Deliverance was a far more complete implementation of the idea, and one that didn’t require application cooperation, but this is the more refined and simple implementation of the idea.
- This would become `webob.client`. This is a way of taking a WSGI request and sending it to another server.
- Embeds PHP in Python, as though a PHP app is another Python WSGI resource. Basically a hack to send a FastCGI request to an embedded PHP process.
- An overly ambitious attempt to embed structured data in web requests, in a way that they could be extracted “efficiently” in certain server layouts. Kind of optimizing an HTTP request into a function call when both endpoints were on the same server.
- Another framework! A clone of the CherryPy interfaces. It was kind of an attempt to bully CherryPy into using WSGI better.
- An attempt to wrap CherryPy in Paste-like semantics.
- An attempt to apply Paste containerization to Django.
- Trac with Paste Deploy configuration.
- A framework I wrote! Just for the heck of it.
- Another silly little framework.
- A little App Engine wiki. wikistorage is the underlying storage library for App Engine. I wanted to play around with the sandboxing App Engine provides, to allow more promiscuous code execution on a wiki.
- This would take a really large page and try to split it up into smaller pages.
- This never really got anywhere, but it was my aborted attempt to create a scraping framework for Python. SeeItSaveIt would be my current thinking on the matter.
- This is a web application that lets you transclude content, by basically providing an endpoint that will fetch a given page and parse and rewrite it so it can be included inline in a page.
- My most terrible monkeypatching hack ever! It allows you to do `from dtopt import ELLIPSIS` inside a running doctest, and adds the global ELLIPSIS option based on that. It actually looks up the stack frame and injects bytecode into a function to accomplish this.
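For comparison, the conventional way to turn on ELLIPSIS globally is to pass it from outside the doctest, via `optionflags` on the runner. The sketch below shows only that standard mechanism; it does not reproduce the stack-frame trick, and `SAMPLE` and `run_sample` are hypothetical names.

```python
import doctest

# A doctest whose expected output only matches when "..." is a wildcard.
SAMPLE = """
>>> print(list(range(20)))
[0, 1, ..., 19]
"""


def run_sample(optionflags=0):
    """Run SAMPLE with the given doctest flags; return the failure count."""
    test = doctest.DocTestParser().get_doctest(SAMPLE, {}, "sample", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    results = runner.run(test, out=lambda text: None)  # discard the failure report
    return results.failed
```

Here `run_sample()` reports one failure, while `run_sample(doctest.ELLIPSIS)` reports none. The inconvenience is that the flag lives in whatever code launches the tests rather than in the doctest file itself.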
- Remember FIT? Nah, no one does. This was my attempt to create something like that for Python. It’s a silly idea all around.
- When App Engine first started it had some things that just felt needlessly different from “normal” Python. Probably foremost among these was that it had its own HTTP client library. This library expressed the normal HTTP libraries in terms of this App Engine specific library. That code was eventually integrated into App Engine itself. (App Engine’s own framework, webapp and now webapp2, also happen to use WebOb.)
- This was an Open Plans project to give a small REST service to map locations to areas. It used PostgreSQL, had some import tools, did its thing. My first and last real geospatial project.
- Twill is an old timey functional testing DSL for web applications. This was an attempt to use the same DSL in the browser for functional testing.
- An attempt to provide instructional overlays, so multi-step instructions on how to use a web page could be created.
- This did client-side transclusion using the microformat `<a href="resource" rel="include">content</a>`.
Like packaging, I love and hate testing. As a result I’ve tried a bunch of tools.
Small Python Projects
In addition to those bigger projects, there’s a bunch of small stuff I did that is no longer active.