<p>Ian Bicking: a blog - mozilla (https://ianbicking.org/)</p>
<h1>TogetherJS as a Postmodern Programming Tool</h1>
<p>Ian Bicking, 2013-10-31</p>
<p>One of the papers that I continue to refer to in my own thinking about technology is <a href="http://www.mcs.vuw.ac.nz/comp/Publications/CS-TR-02-9.abs.html">Notes on Postmodern Programming</a>. Martin Fowler has a <a href="http://martinfowler.com/bliki/PostModernProgramming.html">short summary</a>:</p>
<blockquote>
<p>The essence of it (at least for me) is that software development has long had a modernist viewpoint that admirable software systems are composed of uniform components, composed in a uniform and simple way. (Smalltalk and Lisp are good examples of this kind of thinking.) A post-modern view is that software is all sorts of different very different stuff glued together in all sorts of different ways (think Perl and Unix), and this style of software (big bucket of glue) isn’t a bad thing.</p>
</blockquote>
<p>In the built world I think of roads and rail as an interesting analog, roads as postmodern and rail as modern. Rail has a lot of cool properties. It’s <a href="http://www.csx.com/index.cfm/about-csx/projects-and-partnerships/fuel-efficiency/">really efficient</a>, one driver can handle a hundred cars, you can feel confident about where the train will and won’t go, and it can be a pretty smooth ride. But roads have some great properties too. You can ride all kinds of vehicles down them. It’s easy to build a driveway to access a road. If someone stops on the road you can go around them.</p>
<p>Or another perspective: postmodern developments accept the world as it is, while modern developments imagine a new better world. A cohesive modern technology can be great, completist, robust; and without those kinds of systems we couldn’t build what we have, it would all fall down as the geometrically cumulative nature of failure and multiple components would doom our systems to constant collapse. But modern systems are also more apt to fail during their development, to solve the wrong problem, to demand tolerances that are too low, or to be too ambitious.</p>
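<p>To put rough numbers on that "geometrically cumulative" failure: if a system only works when every one of its components works, the individual reliabilities multiply. The sketch below assumes independent failures and a made-up 99% reliability per component; the function name is mine, purely for illustration.</p>

```python
def chain_reliability(component_reliability: float, components: int) -> float:
    """Probability that every component works, assuming failures are independent."""
    return component_reliability ** components


# Assumed figure: each component works 99% of the time.
for count in (1, 10, 50, 100):
    print(count, round(chain_reliability(0.99, count), 3))
```

<p>With a hundred such components the whole system works only about a third of the time, which is why the robust, cohesive pieces matter so much.</p>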
<p>During the design of <a href="https://togetherjs.com">TogetherJS</a> — a realtime collaboration and co-browsing library — I’ve approached it as a postmodern component.</p>
<p>The basic integration for TogetherJS isn’t an <span class="caps">API</span>, or a data model: it’s the <span class="caps">DOM</span>. It involves scanning the page, looking for changes and events that are the exposed artifacts an application can’t hide even if it wanted to.</p>
<p>There’s a lot we can do with the standard <span class="caps">DOM</span>, but we aren’t above integrating with other components. We scan for <a href="http://codemirror.net/">CodeMirror</a> and <a href="http://codemirror.net/"><span class="caps">ACE</span></a> editors by looking for the attributes they attach to elements. If you opt in to YouTube support, we load their libraries to interact with Flash elements. We can’t support <em>everything</em>, but we’re not taking any principled stance on how much we’re willing to poke directly into other projects’ artifacts.</p>
<p>From the perspective of code isolation, we try to insulate TogetherJS from the page. There’s only one exposed object (<code>TogetherJS</code>), and it has a specific set of methods (our public interface). We bundle jQuery but do not use any version of jQuery already on the page, nor do we encourage anyone to use our version of jQuery. We try to avoid getting caught up in any messiness of the host application, but we do not judge you for your messiness.</p>
<p>But we also expose TogetherJS. You can import the modules, and we are not shy about exposing “private” parts of TogetherJS. You can see all the messages that go back and forth between clients, regardless of what part of TogetherJS produces or consumes those messages.</p>
<p>This creates potential fragility, but we try to mitigate that by making it very easy to make your own static, stable, frozen copy of the client code. Once you get that working, it’s working. And this is why we work hard to keep the server as simple as possible. Any smarts go in the client unless it is <em>absolutely</em> necessary that they go in the server. (Arguably our server is modernist in design.)</p>
<h2>Why?</h2>
<p>I’m not always sure whether to say “I” or “we” — the architectural choices I’m describing are things that we came to consensus on as a team. But exactly <em>why</em> we consensed on them I am not sure, I mostly know my own motivations for the architecture.</p>
<p>We have built TogetherJS with an eye towards the breadth of the web. We considered creating a collaborative editor (with some other special features), but we didn’t really want to create a cool site, we wanted to create something that could magnify <em>other people’s</em> cool sites. What cool sites? We didn’t know! There has been, and continues to be, a tension among all of us on the team as to how generic this tool can or should be: do we want to create an awesome tool for education, or support, or collaboration, or presentation, or…?</p>
<p>But we are not domain experts, and we have not created a domain-specific tool. This itself is a tension I see on the web in general: developers create the best tools for the domain they understand, development. Developing something like TogetherJS requires some considerable expertise in lots of technical areas (the <span class="caps">DOM</span>, browser security models, Javascript code organization, server integration, and so on). The domain experts are experts in other things. We want to empower those experts, and empower them to make tools, and not just use the tools that computer experts have made. We hope TogetherJS can meet them halfway.</p>
<p>So we’ve created a tool that does a bunch of stuff by default, and doesn’t care too much about how you went about building your site or app. With just a little integration you get lots of functionality. I think it is relatively hackable.</p>
<p>But we also want to enable really good collaboration, not just okay default collaboration. This is why we build ways to customize the tool, and we’re always open to new ways to do that customization. We also require integration for the client-side dynamic parts of your application, as you can’t seamlessly make everything work. You <em>kind of</em> can make everything work (which I’ll talk about below), but we want a collaboration experience that is high-fidelity and context-aware.</p>
<p>Some examples where the context depends on the tools: you might want to let two people edit one document, but only one person can save it. You may want to allow two people to draw together, but simultaneously using different tools. You might want to run a game where you score people separately… or maybe you want to score them together. You might want a save to fork a document to some personal space, or you might want to keep the two people in sync. I could go on about the choices an application might encounter for a long time.</p>
<p>We also want something like progressive enhancement: you start out right away with something that works, and then improve upon that. The tool itself serves as an introduction to the tool.</p>
<h2>Roads Not Taken</h2>
<p>Our basic task/intention with TogetherJS has been to enable real-time collaboration, co-browsing, co-presence. Given that goal there are two other viable approaches that I see, and that we didn’t choose to pursue. I’d classify each as more modernist than what we’ve done.</p>
<h3>The Realtime Database</h3>
<p>One path is to create a modernist realtime collaborative foundation for your application’s data. In this category are <a href="http://firebase.com/">Firebase</a>, <a href="http://www.meteor.com/">Meteor</a>, the <a href="https://developers.google.com/drive/realtime/">Google Realtime <span class="caps">API</span></a>, and <a href="http://sharejs.org/">ShareJS</a>. With these tools you synchronize your Javascript models across all clients, and the rest of the collaboration flows from that.</p>
<p>Obviously this approach has a strong appeal, as the area is quite active. The technique is robust, and the tools can make a lot of guarantees about the models and consistency. To the degree you create reasonable, deterministic, predictable views on your models you can be assured of some consistency in the experience for all participants. Because the tools are low-level you can create a variety of collaboration experiences based on their core functionality.</p>
<p>There are three reasons we didn’t want to take this approach:</p>
<ol>
<li>
<p>These tools work best with greenfield development. You have to make all your models aware of these external data and event sources. Some of the greatest benefits come when you rely on these tools for much of your persistence.</p>
</li>
<li>
<p>The tools don’t apply very well to traditional websites. Sites that use <span class="caps">HTTP</span> requests and responses and dynamically generate <span class="caps">HTML</span> don’t have browser-accessible models to be synchronized (they have server-side models, but those don’t need to be synchronized and synchronizing them doesn’t itself provide a realtime experience).</p>
</li>
<li>
<p>Database tools are not <span class="caps">UI</span>-aware. A good collaboration experience doesn’t just involve synchronized state, it requires that people <em>understand</em> what is happening, and understand when other people are invoking actions.</p>
</li>
</ol>
<p>TogetherJS isn’t incompatible with realtime databases. In fact I think they should be very complementary: if your models are synchronized through a backend database you don’t have to synchronize through TogetherJS, removing a lot of the more challenging integration work with Javascript-heavy applications.</p>
<h3>The Screenshare</h3>
<p>Traditional screensharing happens at the pixel level, but I’m going to refer to something I’ll call <span class="caps">DOM</span> Screensharing. In this model the current/live state of the <span class="caps">DOM</span> is transferred from one browser to another. I’ll call the browser that starts the screensharing the “source” and the browser that receives the <span class="caps">DOM</span> the “viewer”.</p>
<p>Examples of this are <a href="http://usefirefly.com/">Firefly</a> or a now-dormant project of mine, <a href="https://github.com/mozilla/browsermirror">Browser Mirror</a>. I believe some other customer support tools also use this technique, but in general the technique seems fairly obscure.</p>
<p>In this approach you look at all the elements on the source browser, scrub out any scripts or event handlers, maybe scrub hidden elements (or don’t). Then you serialize this and send it to the viewer. The page is then recreated at a new <span class="caps">URL</span>, visually hard to distinguish from the original page, but it’s “dead”. Like with video you send diffs to save effort. You’ll want special handling for form elements. There are some other corner cases, but it’s all relatively doable. When I first got Browser Mirror working it surprised me how feasible it is, though it’s certainly a 90/10 kind of problem.</p>
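<p>The scrubbing step can be sketched roughly as follows. This is not Browser Mirror's code: Browser Mirror walked the live <span class="caps">DOM</span> in the browser, while this illustration works on an <span class="caps">HTML</span> string using Python's standard <code>html.parser</code>. The class and function names are hypothetical, and diffs, hidden elements and form state are left out.</p>

```python
from html import escape
from html.parser import HTMLParser

# Elements that never have a closing tag (a partial list, enough for a sketch).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class DeadPageSerializer(HTMLParser):
    """Re-serializes HTML while dropping <script> elements and on* attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.script_depth = 0  # greater than 0 while inside a <script> element

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_depth += 1
            return
        if self.script_depth:
            return
        # Crude assumption: any attribute starting with "on" is an event handler.
        kept = [(name, value) for name, value in attrs if not name.startswith("on")]
        rendered = "".join(
            f' {name}="{escape(value, quote=True)}"' if value is not None else f" {name}"
            for name, value in kept
        )
        self.parts.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag == "script":
            self.script_depth = max(0, self.script_depth - 1)
            return
        if not self.script_depth and tag not in VOID_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.script_depth:
            self.parts.append(escape(data, quote=False))


def scrub(html_text: str) -> str:
    """Return a 'dead' copy of the markup: same structure, no scripts or handlers."""
    parser = DeadPageSerializer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


live = '<div id="app"><button onclick="save()">Save</button><script>init()</script></div>'
print(scrub(live))
```

<p>The result looks the same as the source markup but can no longer do anything, which is exactly the "dead" page the viewer receives.</p>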
<p>The benefit of this approach is that It Just Works in a lot of cases. It works behind authentication, works when pages are personalized, and interacts pretty well with dynamic pages (which ultimately display dynamic elements in the browser through the <span class="caps">DOM</span>). It is robust and broadly applicable. It is modernist while being almost the polar opposite of the realtime database approach.</p>
<p>Because the viewer has a dead version of the application, it’s not exactly easy to interact with. Form fields are easy. Rich controls though are <em>really hard</em>. You <em>can</em> pass events back to the source browser. In Browser Mirror you could click on anything on the viewer, and that click would be transmitted back to the source browser. This actually worked in lots of cases, though with a lot of latency. In some cases it couldn’t realistically work — can you sync mousedown, mousemove, or hover events? Unfortunately it’s also not possible to detect the presence of listeners on the <span class="caps">DOM</span> to detect what events are interesting, but even if you could those events are still too low-level.</p>
<p>Also the screensharing technique <em>because</em> of its broad applicability is not contextually aware. Do you want to give the viewer access to the application as though they are the same user as the person using the source browser? You can allow things like both people being scrolled to a different part of the page, but beyond that parallel work processes across a site or using overlapping tools is not really feasible.</p>
<p>Given these restrictions, I was glad to learn from Browser Mirror, but in TogetherJS I wanted to pursue something with the potential for a higher-quality experience.</p>
<p>Still I’ve considered resurrecting some portion of Browser Mirror within TogetherJS. Embracing a diversity of techniques only makes the tool more postmodern ;)</p>
<h2>An Additive Approach</h2>
<p>This acceptance of postmodern approaches means TogetherJS is being developed using an additive approach. More stuff. More configuration. More flags. More use cases.</p>
<p>The additive approach nearly always produces better results with each addition. But we all know where else it leads: unwieldy complexity, lack of focus, unreliable combinations. The approach is perilous. Still we must learn from the past without overlearning from the past. As a project we have invested, and continue to invest, time in code organization, and we remind ourselves of the dangers. We’re trying our best to engage with the inherent complexity rather than denying or avoiding it.</p>
<p>We’ll see where it goes…</p>
<h1>Nouning the Verb of Browsing</h1>
<p>Ian Bicking, 2013-11-05</p>
<p>I was talking for a while with <a href="https://twitter.com/gregglind">Gregg Lind</a> about <a href="https://togetherjs.com">TogetherJS</a> and about all the ways it <em>could</em> and <em>should</em> be cool, if we keep building out this idea: not just TogetherJS itself, but also the general area of <a href="http://en.wikipedia.org/wiki/Cobrowsing">cobrowsing</a> (cobrowsing is where two or more people can browse the web together, each from their own device).</p>
<p>In the course of the discussion Gregg had an idea that I’m becoming increasingly excited about. Can we use this to give users something new to own? Specifically: their actions.</p>
<p>A side-effect of any cobrowsing tool is that you send information about what you are doing to the other person. That’s how people see what each other are doing. Those messages are the “noun” I am referring to in the title: all the actions you make become a set of messages that form a record of your actions. There’s a half-baked feature in TogetherJS where you can type <code>/record</code> in the chat window and it will pop up a window where you see a record of what you do, as a sequence of <span class="caps">JSON</span> messages.</p>
<p><center><img style="width: 100%" src="/static/media/togetherjs-record-screenshot.png"></center></p>
<p>There’s a lot of chatter in that log, but still it’s a relatively high-level log of actions, one that you could compress (e.g., by combining adjacent edits), filter, search, replay.</p>
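<p>As a rough illustration of compressing and filtering such a log, here is a small sketch. The message shape (<code>t</code>, <code>type</code>, <code>element</code>, <code>value</code>) and the event type names are invented for the example; they are not TogetherJS's actual message format.</p>

```python
# Hypothetical event types where only the latest of a run of adjacent events matters.
MERGEABLE = {"edit", "mousemove"}


def compress(log):
    """Collapse each run of adjacent, same-type, same-element events into the last one."""
    compressed = []
    for event in log:
        previous = compressed[-1] if compressed else None
        if (
            previous is not None
            and event["type"] in MERGEABLE
            and previous["type"] == event["type"]
            and previous.get("element") == event.get("element")
        ):
            compressed[-1] = event  # the later event supersedes the earlier one
        else:
            compressed.append(event)
    return compressed


def without(log, event_type):
    """Filter out one kind of event, e.g. to drop mouse chatter."""
    return [event for event in log if event["type"] != event_type]


log = [
    {"t": 0.0, "type": "load", "url": "http://example.com"},
    {"t": 0.5, "type": "mousemove", "element": "#content", "offset": [50, 18]},
    {"t": 0.7, "type": "mousemove", "element": "#content", "offset": [52, 20]},
    {"t": 1.6, "type": "edit", "element": "#title", "value": "H"},
    {"t": 1.7, "type": "edit", "element": "#title", "value": "He"},
    {"t": 1.9, "type": "edit", "element": "#title", "value": "Hey"},
    {"t": 2.4, "type": "click", "element": "#save"},
]

short_log = compress(log)
for event in without(short_log, "mousemove"):
    print(event["t"], event["type"])
```

<p>Keeping only the last event of a run is the simplest possible merge rule; a real tool might want to keep the first timestamp as well, or merge edits more carefully.</p>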
<p>What we’re really talking about is a series of events. Not quite what TogetherJS produces, but the kind of document I’m talking about looks kind of like this:</p>
<style>
.user-log td {border: 1px #666 solid; padding: 0.4em;}
.user-log {margin-bottom: 1em;}
</style>
<table class="user-log">
<tr>
<th>Date</th>
<th>Type</th>
<th>Data</th>
</tr>
<tr>
<td>T+0</td>
<td><code>load</code></td>
<td>url: http://example.com</td>
</tr>
<tr>
<td>T+0.5</td>
<td><code>mousemove</code></td>
<td>element: #content:nth-child(2):nth-child(1); offset: 50%, 18%</td>
</tr>
<tr>
<td>T+1.6</td>
<td><code>click</code></td>
<td>element: #content:nth-child(2):nth-child(1)</td>
</tr>
</table>
<p>I.e., a list of actions. We try to anchor actions to elements instead of absolute coordinates. We give elements names; here I am using <span class="caps">CSS</span> notation, using <code>nth-child()</code> to handle elements that don’t have ids. It’s fairly simple.</p>
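<p>Naming an element this way amounts to walking up the tree until you reach something with an id. Here is a minimal sketch over a toy tree; the <code>Node</code> class and <code>selector_for</code> are hypothetical stand-ins for real <span class="caps">DOM</span> nodes, and unlike the shorthand in the table above, the sketch writes out child combinators so the result is a valid <span class="caps">CSS</span> selector.</p>

```python
class Node:
    """A toy stand-in for a DOM element (hypothetical, for illustration only)."""

    def __init__(self, tag, id=None):
        self.tag = tag
        self.id = id
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child


def selector_for(node):
    """Build a selector from the nearest ancestor with an id down to the node."""
    steps = []
    while node is not None:
        if node.id:
            steps.append(f"#{node.id}")
            break
        if node.parent is None:
            steps.append(node.tag)  # reached the root without finding an id
            break
        position = node.parent.children.index(node) + 1  # nth-child() is 1-based
        steps.append(f":nth-child({position})")
        node = node.parent
    return " > ".join(reversed(steps))


body = Node("body")
content = body.append(Node("div", id="content"))
content.append(Node("h1"))
items = content.append(Node("ul"))
first_item = items.append(Node("li"))
items.append(Node("li"))

print(selector_for(first_item))
```

<p>Anchoring to the nearest id keeps the name short and makes it more likely to survive unrelated changes elsewhere on the page.</p>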
<h2>But Anyone Can Do This!</h2>
<p>Actually collecting that information and creating a bunch of <span class="caps">JSON</span> messages isn’t that hard. More importantly it’s not <em>new</em>. And yet a whole new category of development using this description of a person’s actions has not emerged. Why would this be any different in the context of cobrowsing?</p>
<h3>Information needs to be used</h3>
<p>One of the principles of <a href="http://microformats.org/about">Microformats</a> that I’ve most appreciated is the principle that data needs to be visible in order to be accurate. That is, if you have an address in the body of a page, and then a hidden <a href="http://en.wikipedia.org/wiki/Resource_Description_Framework"><span class="caps">RDF</span></a> address alongside it, everyone will be proofreading only the visible address.</p>
<p><em>Visibility</em> is not exactly what you need: you need someone to be using data in order to ensure that the data is accurate. Someone has to <em>care</em>. You don’t view the <code>src</code> attribute on images, but you can tell when it’s broken (though there’s lots of things you can’t tell, like when you are accidentally pointing to an offsite image – those problems are much more likely to slip through). Real visibility is nice, though, because in addition to detecting problems it also tends to make it much easier to fix problems. But I digress…</p>
<p>Cobrowsing means that something is built to actually <em>consume</em> that data. That means we’re checking the data and it means we’re selecting the data that actually means something to someone else.</p>
<h3>Free the data from the browser and page</h3>
<p>Browsers actually have all kinds of great information; normal developers just can’t <em>get</em> any of that information.</p>
<p>As I mentioned, I like the concept of Microformats. But the reality of Microformats has been disappointing. You can’t <em>do</em> anything with them. They are just stuck in a page on the browser. The creative remixing of that data is possible with effort, but apparently never enough reward.</p>
<p>Cobrowsing means always exporting that data, at least to your collaborator. That means a really big barrier is automatically overcome.</p>
<p>The recorder I mention above is actually just a mock collaborator, that instead of interacting just remembers everything that happens.</p>
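<p>The shape of that idea, a recorder that is just another peer, can be sketched like this. The <code>Session</code> hub and its <code>join</code>/<code>send</code>/<code>receive</code> methods are assumptions made for the example; this is not how TogetherJS's channel code is actually organized.</p>

```python
import json


class Session:
    """A toy message hub: whatever is sent gets delivered to every joined peer."""

    def __init__(self):
        self.peers = []

    def join(self, peer):
        self.peers.append(peer)

    def send(self, message):
        for peer in self.peers:
            peer.receive(message)


class Recorder:
    """A mock collaborator: it never replies, it only remembers what it was sent."""

    def __init__(self):
        self.messages = []

    def receive(self, message):
        # Store the serialized form so the log is ready to save or replay.
        self.messages.append(json.dumps(message, sort_keys=True))


session = Session()
recorder = Recorder()
session.join(recorder)

session.send({"type": "load", "url": "http://example.com"})
session.send({"type": "click", "element": "#save"})

for line in recorder.messages:
    print(line)
```

<p>Because the recorder uses the same interface as a real collaborator, anything a collaborator would see ends up in the log, and nothing else does.</p>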
<h2>But It Makes Me Afraid!</h2>
<p>When I talk about <a href="https://togetherjs.com">TogetherJS</a> people frequently comment (with a sly wink): <em>this would be a great tool to spy on people with, wouldn’t it?</em></p>
<p>And in a sense, yes. But TogetherJS doesn’t do anything that a website can’t do already, and which many sites actually do right now. Because TogetherJS runs in content, at the behest of the site owner, it doesn’t really change what’s <em>possible</em>.</p>
<p>With more expansive cobrowsing this starts to change. Give people a new thing to own, and you also create a new thing that can be stolen. This might also be a justification: the harm will be in proportion to the good.</p>
<p>One benefit to the concrete nature of cobrowsing is that it builds awareness in people about what exactly is being exported. When you watch your collaborator while cobrowsing, you see their mouse, inputs and edits and backspaces, new URLs they go to, etc. It becomes clear that these are the things you are sharing.</p>
<p>This does not keep the tool from transmitting hidden information, which is why by principle we must implement these tools exposing only what information we need to: not because of simple conservatism, but because the information we <em>need</em> to export is also the information we present to the collaborator, and so it is the information that a user understands is being exposed. The more thoroughly we utilize information the better the user’s understanding of the scope of that information.</p>
<h2>And Why Is This Cool?</h2>
<p>What, do I have to spell everything out for you? <strong>I’m writing this so other people come up with cool ideas.</strong></p>
<p>But anyway, a few thoughts:</p>
<ul>
<li>
<p>Cobrowsing is really cool. And it’s a whole category of interactions, not just a single tool.</p>
</li>
<li>
<p>Provides a kind of high-level recording of an interaction. Like a screencast, but you can parse it as something other than pixels. Replaying later you can add contextual navigation, like automatic detection of “interesting” events.</p>
</li>
<li>
<p>Test recording.</p>
</li>
<li>
<p>Sequencing user actions alongside application state, for understanding bugs or usability issues.</p>
</li>
<li>
<p>Automation: turn a series of actions into a bookmarklet to repeat those actions.</p>
</li>
<li>
<p>Depending on what is exported, it could be a form of data extraction.</p>
</li>
<li>
<p>As we become better at understanding these logs of activity, we can start remixing or editing them before sending them to these other tools.</p>
</li>
<li>
<p>Tools that <em>consume</em> these activity logs become powerable by external sources. For instance, the automation could take the form of an external robot producing events, but what it does would use all the same permission and auditing abilities you’d have for working with other collaborators (who, like a robot, you may only half-trust to do things correctly).</p>
</li>
</ul>
<p>Of course a bunch of these things are being done right now. But they aren’t very accessible, they don’t scale to the uninitiated, they typically lack transparency, they tend to be fragile. And the idea of a <em>log of actions</em> isn’t central to existing techniques – they typically go from capturing events directly to producing whatever the final creation is.</p>
<p>In the short term, given a cobrowsing tool, creating an interesting robot to collaborate with is incredibly easy. This leaves room for people to spend their intellectual effort on doing cool stuff.</p>
<p>This is the kind of data I’d be excited to hack on, and I hope this article gets you thinking the same way. Now we just have to get this cobrowsing thing going…</p>
<h1>Live Programming, Walkabout.js</h1>
<p>Ian Bicking, 2013-11-27</p>
<p>There’s a number of “live programming” environments used for education. <a href="https://www.khanacademy.org/cs">Khan Academy</a> is one example. In it, you write code on the left hand side, and you immediately see the result on the right hand side. You don’t hit “save” or “run” — it’s just always running.</p>
<p><center><img style="width: 100%" src="/static/media/khan-screenshot.png"></center></p>
<p>There are a lot of nice features to this. There’s the feedback cycle: everything always <em>happens</em>. Or, if you get something wrong, it distinctly <em>doesn’t happen</em>. It’s similar to the static analysis we so often use — from the simplest case of syntax highlighting (which often finds syntax errors) to code lint tools, type checking or <a href="http://en.wikipedia.org/wiki/Intelli-sense">Intelli-sense</a>. Live coding takes this further and makes execution itself somewhat static.</p>
<p>One of the nice parts about actually <em>running</em> the code is that you aren’t relying on static analysis, which is always limited. The only thorough analysis is to model the program’s execution by executing the program. Not to mention it allows the programmer to detect bugs that just cause the program to do the wrong thing, or to be incomplete, but not clearly incorrect, not in error. For instance, in the Khan example I make the shapes transparent:</p>
<p><center><img style="width: 100%" src="/static/media/khan-screenshot-transparent.png"></center></p>
<p>No static analysis could tell me that this produces an unattractive picture of a person. Proponents of static analysis tend to have a limited concept of “bug” that doesn’t include this sort of problem.</p>
<p>To imagine what live execution might look like when applied more dramatically, you might want to check out <a href="http://worrydream.com/LearnableProgramming/">Learnable Programming</a> by Bret Victor. Underlying all his mockups is the expectation that the code is being run and analyzed at all times.</p>
<p><center><img style="width: 100%" src="/static/media/learnable-screenshot.png"></center></p>
<p>That’s all cool… except you can’t just <em>run</em> code all the time. It works for code that produces basically the same output every time it is run, that requires no input, that isn’t reactive or interactive. This is all true for <a href="http://processingjs.org/">Processing.js</a> programs which Khan Academy and the other live programming environments I’ve seen use (and Khan Academy even disables random numbers to ensure consistency). Processing.js is focused on drawing pictures, and drawing via code is okay, but… it doesn’t excite me. What excites me about code is its emergent properties, how the execution of the program evolves. When you write interesting code you can enable things you didn’t realize, things that you won’t realize until you explore that same code. What happens when you interact with it in a new order? What happens when you give it new input? When a program always produces the same output it makes me feel like the program could be substituted by its output. Who needs to program a drawing when you can just use a drawing program?</p>
<p>I was thinking about these things when I was looking at <a href="http://waterbearlang.com/">Waterbear</a>, which is a graphical/pluggable-tile programming language (very similar to <a href="http://scratch.mit.edu/">Scratch</a>).</p>
<p><center><img style="width: 100%" src="/static/media/waterbear-screenshot.png"></center></p>
<p>A nice aspect of that sort of language is that you are forced to think in terms of the <a href="http://en.wikipedia.org/wiki/Abstract_syntax_tree"><span class="caps">AST</span></a> instead of text, because all those tiles <em>are</em> the <span class="caps">AST</span>. You also get a menu of everything the language can do, including its primitive functions.</p>
<p><center><img src="/static/media/waterbear-screenshot-list.png"></center></p>
<p>With the language laid out like that, I saw that most of it was nice and static and deterministic. Control structures are deterministic: <code>if COND then IFTRUE else IFFALSE</code> always executes the same code given the same input. Most everything is: appending to a list always produces the same result, adding numbers always produces the same result. The list of the non-deterministic building blocks of a program is <em>really small</em>.</p>
<p>And this is exciting! If you can find all the non-deterministic parts of a program and come up with a range of viable results to plug in (i.e., mock) then you can run more-or-less the entire program. And the more I think about it, the more I realize that the list of non-deterministic parts can be quite small for many programs.</p>
<p>For instance, consider this program:</p>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">random</span>
<span class="k">def</span> <span class="nf">guesser</span><span class="p">():</span>
    <span class="n">number</span> <span class="o">=</span> <span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">10</span><span class="p">)</span>
    <span class="nb">print</span><span class="p">(</span><span class="s2">"I'm thinking of a number between 1 and 10"</span><span class="p">)</span>
    <span class="k">while</span> <span class="kc">True</span><span class="p">:</span>
        <span class="n">i</span> <span class="o">=</span> <span class="nb">input</span><span class="p">()</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="n">i</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>
        <span class="k">except</span> <span class="ne">ValueError</span><span class="p">:</span>
            <span class="nb">print</span><span class="p">(</span><span class="s2">"Please enter a number"</span><span class="p">)</span>
            <span class="k">continue</span>
        <span class="k">if</span> <span class="n">i</span> <span class="o">==</span> <span class="n">number</span><span class="p">:</span>
            <span class="nb">print</span><span class="p">(</span><span class="s2">"You win!"</span><span class="p">)</span>
            <span class="k">break</span>
        <span class="k">elif</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">number</span><span class="p">:</span>
            <span class="nb">print</span><span class="p">(</span><span class="s2">"Too small!"</span><span class="p">)</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="nb">print</span><span class="p">(</span><span class="s2">"Too large!"</span><span class="p">)</span>
</pre></div>
<p>This is a simple program, but it can execute in lots of ways. There’s two non-deterministic parts: <code>random.randint()</code> and <code>input()</code>. The first can be made deterministic by seeding the random number generator with a known value (and the program can be exercised with multiple runs with multiple seeds). The second is trickier. We know <code>input()</code> returns a string that the user inputs, one line long. But if you throw random strings at the program you won’t get something very interesting. So we need just a little more help, a suggestion of what the person might return. E.g., <code>input.suggest_returns = lambda: str(random.randint(-1, 11))</code> — it’s still valid that it can return anything, but we’ll be able to best exercise the program with those inputs. We still don’t have a smart player for our game, but it’s something.</p>
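<p>Here is one way that could look in practice, as a sketch. Instead of attaching <code>suggest_returns</code> to <code>input</code>, this version passes the random source and the input function into the game as parameters; that injection style, and the names <code>exercise</code>, <code>read_input</code> and <code>write</code>, are my assumptions for the example.</p>

```python
import random


def guesser(rng, read_input, write=print):
    """The same game as above, with its non-deterministic parts passed in."""
    number = rng.randint(1, 10)
    write("I'm thinking of a number between 1 and 10")
    while True:
        i = read_input()
        try:
            i = int(i)
        except ValueError:
            write("Please enter a number")
            continue
        if i == number:
            write("You win!")
            break
        elif i < number:
            write("Too small!")
        else:
            write("Too large!")


def exercise(seed, max_turns=500):
    """Run one complete game with a seeded generator and suggested inputs."""
    rng = random.Random(seed)  # known seed, so the run can be repeated exactly
    transcript = []
    turns = 0

    def suggested_input():
        nonlocal turns
        turns += 1
        if turns > max_turns:
            raise RuntimeError("the simulated player never guessed the number")
        # The suggestion from the text: plausible guesses, slightly out of range.
        return str(rng.randint(-1, 11))

    guesser(rng, suggested_input, transcript.append)
    return transcript


for seed in range(3):
    transcript = exercise(seed)
    print(seed, len(transcript), transcript[-1])
```

<p>Each seed gives a different but repeatable game, so a run that goes wrong can be replayed just by remembering its seed.</p>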
<p>This approach to exercising code is exciting because it’s basically automatic: you write your program, and if you are using primitives that have been setup for mocking, then it’s testable. You can build tools around it, the tools can find cases where things go wrong and replay those specific cases for the programmer until they are fixed.</p>
<p>It’s still a challenge to actually get deep into the program: the primitives often don’t express the expectation. For instance in this guessing program it’s valid to enter “one”, but it’s not very <em>interesting</em>. If you are testing something interactive you might have a Cancel button that undoes a bunch of inputs; while it’s worth hitting Cancel every so often, generally it’s not interesting, even anti-interesting.</p>
<p>But with these thoughts in mind I was naturally drawn to the browser. A browser Javascript program is handy because it has a very specific and fairly limited set of primitives. Nearly everything that’s not deterministic would be considered part of the <a href="http://en.wikipedia.org/wiki/Document_Object_Model"><span class="caps">DOM</span></a>, which includes not just the <span class="caps">HTML</span> page but also (at least in the terminology used by browser insiders) all the browser-specific functions exposed to content.</p>
<p>In the case of a browser program, the program tends to be fairly reactive: much of what happens is the program listening for events. This means much of the logic of the program is invoked from the outside. This is helpful because (with some effort) we can detect those listeners, and figure out what events the program is actually interested in (since something like a click can happen <em>anywhere</em>, but usually to no effect). Then you must also filter out handlers that apply to something that is not at the moment possible, for instance a click handler on an element that is not visible.</p>
<p>Trying to exercise a program is not the same as actually confirming the program did the right thing. This testing practice will reward the program that is littered with asserts. Asserts can’t be statically examined, and in that way they are worse than static types, but they can address things that can’t be statically described.</p>
<p>I believe there is a term for this concept: <em>generative testing</em> (for example, <a href="https://github.com/strangeloop/strangeloop2012/blob/master/slides/sessions/SpiewakBedra-PontificatingQuantification.pdf">some slides from a presentation</a>). Most of what I’ve seen under that name involves relatively small examples, with explicitly defined domains of input and output. I’m proposing to do this at the scale of an application, not a routine; to define inputs as any non-deterministic query or listener; and to define failure as some inline assertion error or warning.</p>
<h2>Let’s Do It…?</h2>
<p>With this in mind I created a library: <a href="https://github.com/ianb/walkabout.js">Walkabout.js</a>. This either uses the evidence jQuery leaves about bound event handlers, or it can use source code rewriting to track event handlers (tracking event handlers is harder than I would like). From this list it can create a list of plausible actions that can take place, seeing what elements might be clicked, hovered over, selected, etc., filtering out elements that aren’t visible, and so on. Then it uses a pseudo-random number generator to select an action, while checking for uncaught exceptions or warnings written to the console.</p>
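<p>The loop described above can be sketched in a few lines. This is not Walkabout.js’s actual code (which is Javascript and works against the <span class="caps">DOM</span>); it is a Python illustration in which <code>ToyApp</code>, <code>available_actions()</code>, and <code>perform()</code> are all hypothetical names, and a planted assertion plays the part of a bug waiting to be found.</p>

```python
import random


class ToyApp:
    """Hypothetical stand-in for a page: a dialog that can be opened and closed."""

    def __init__(self):
        self.dialog_open = False
        self.saved = 0

    def available_actions(self):
        # Only offer actions that are possible right now, the way a click
        # handler on an invisible element would be filtered out.
        if self.dialog_open:
            return ["save", "close_dialog"]
        return ["open_dialog"]

    def perform(self, action):
        if action == "open_dialog":
            self.dialog_open = True
        elif action == "close_dialog":
            self.dialog_open = False
        elif action == "save":
            self.saved += 1
            # Planted bug for the walker to find.
            assert self.saved < 3, "saved too many times"


def walk(make_app, seed, steps=50):
    """Take pseudo-random actions; return (history, error) so a failure can be replayed."""
    rng = random.Random(seed)
    app = make_app()
    history = []
    for _ in range(steps):
        actions = app.available_actions()
        if not actions:
            break
        action = rng.choice(actions)
        history.append(action)
        try:
            app.perform(action)
        except Exception as exc:  # stands in for an uncaught exception in the page
            return history, exc
    return history, None
```

<p>Because the walk is driven by a seeded generator, the returned history doubles as a script that reaches the failure again.</p>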
<p>The library isn’t complete in what it mocks out, but that’s just a matter of doing more work. It’s a little harder to mock out server interaction, because there’s no easy way to know what exactly to expect back from the server, though if the server is deterministic (and the server’s state can be reset each run) then it’s okay to use it without mocking. <strong>Nothing deterministic need be mocked</strong>, including external components.</p>
<p>There’s a lot I’d like to change about Walkabout.js’s code itself (my opinions on Javascript have changed since I first wrote it), but I worry I get ahead of myself by doing another round of development on it right now. There’s non-trivial tooling required to use this tool, and I need to find a larger environment where it can make sense. Or at least I <em>want</em> to find that environment, because I think the result will be more compelling.</p>
<p>Another big task to consider is how to actually explore the program in depth. It’s easy to come up with really boring, long, useless sequences of actions. Open dialog, close dialog x 100. Enter text, clear text x 10. Hitting some control that terminates the application is only interesting once. And though computers are <em>fast</em> they aren’t so fast they can spend most of their time doing completely useless things. I want my failures now!</p>
<p>To explore an application in depth we need to effectively <em>search</em> the application, using the range of possible inputs. The first idea for scoring a result that I thought of is code coverage: if you are reaching new code, then you are doing something interesting. Then the tooling becomes even more heavy-weight, you have to do code coverage and constantly track it to find productive avenues. Then a second, simpler idea: look for new sets of available inputs. If there’s a new button to click or new fields to interact with, then we’ve probably accomplished something. Continue to explore from that point forward. This option requires only the tooling we already have!</p>
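<p>A rough sketch of that second idea, under stated assumptions: the application exposes hypothetical <code>available_actions()</code> and <code>perform()</code> methods, state can be rebuilt by replaying a script from scratch, and <code>Wizard</code> is a toy three-page form used only for illustration. Each time the set of available actions is one we haven’t seen before, the script that got there is remembered; when nothing new turns up for a while, the search restarts from one of those remembered scripts.</p>

```python
import random


class Wizard:
    """Hypothetical three-page form used only to exercise explore()."""

    def __init__(self):
        self.page = 0

    def available_actions(self):
        actions = []
        if self.page > 0:
            actions.append("back")
        if self.page < 2:
            actions.append("next")
        else:
            actions.append("submit")
        return actions

    def perform(self, action):
        if action == "next":
            self.page += 1
        elif action == "back":
            self.page -= 1


def replay(make_app, script):
    """Rebuild application state by running a script from a fresh start."""
    app = make_app()
    for action in script:
        app.perform(action)
    return app


def explore(make_app, seed, steps=200, patience=5):
    """Return the scripts that first reached each new set of available actions."""
    rng = random.Random(seed)
    seen = set()      # every distinct set of available actions so far
    frontier = []     # scripts that led somewhere new
    app, script, stale = make_app(), [], 0
    for _ in range(steps):
        actions = sorted(app.available_actions())
        key = frozenset(actions)
        if key not in seen:
            seen.add(key)
            frontier.append(list(script))
            stale = 0
        else:
            stale += 1
        if not actions or stale > patience:
            # Nothing new lately: go back to a script that paid off before.
            script = list(rng.choice(frontier))
            app, stale = replay(make_app, script), 0
            continue
        action = rng.choice(actions)
        app.perform(action)
        script.append(action)
    return frontier
```

<p>Comparing sets of actions is a crude score, and it will miss progress that doesn’t change what is clickable, but it needs nothing beyond the list of actions the tool already collects.</p>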
<h2>Why Are We Doing This Again?</h2>
<p>In addition to just thinking about “live programming” I think this can be a great testing tool in general. And generally I’m suspicious of programming tools that are only applicable to toy/educational programming environments.</p>
<p>A common alternative approach to what I describe is to <em>record</em> user input, and then replay it as a test. It’s like random testing, only instead of a random number generator you have a person. This is basically a refinement of the standard practice of creating a script for a functional test that exercises your full application.</p>
<p>If you’ve used this approach you’ve probably found it annoying. Because it is. When you replay a recording and it doesn’t work, what is more likely: the application is broken, or you deliberately changed the application in a way that affects how the recording replays? In my experience 9 times out of 10 it’s the latter. <strong>We spend too much time fixing test failures that are not bugs.</strong></p>
<p>The beauty of the generative approach is that it responds to your changes. It takes your program as it is, not as you might wish it to be. It runs the actions that are valid with <em>this</em> code, not some past version of your code. And the “tests” aren’t expected input and output, they are assertions, and those assertions live right with the code and stay updated with that code. If we care about testing, why don’t we include testing in the code itself? If you want to entertain various possible inputs why not suggest what you are expecting directly in the code?</p>
<p>Once you are exercising the code, you can also learn a lot more about the code at runtime. What kinds of object are assigned to a particular variable? How are pieces of code linked? What is the <em>temporally</em> related code? Given code coverage, you could isolate patterns that exercise a particular line of code. Having found a bug, you also have a script to reach that bug. Having made a change, you could identify past scripts that reach that changed area, giving you a chance to dive into the effect of that change. Many of these kinds of tools would be valid in a general sense, but require a well-exercised program to be useful. Because most software tooling doesn’t include a “do lots of stuff” option, we’re holding ourselves back when it comes to runtime analysis.</p>
<p>So what do you think?</p>
<p>If you want to give it a really quick/rough try, go <a href="http://ianb.github.io/walkabout.js/">here</a>, grab the bookmarklet, and go to a single-page app and try it out. It might do silly things, or nothing, but maybe it’ll do something interesting?</p>Hubot, Chat, The Web, and Working in the Open2014-02-14T12:26:00-06:002014-02-14T12:26:00-06:00Ian Bickingtag:ianbicking.org,2014-02-14:/blog/2014/02/hubot-chat-web-working-in-the-open.html<p>I was listening to a <a href="http://hanselminutes.com/375/on-culture-and-remoteness-at-github-with-paul-betts-and-justin-spahr-summers">podcast with some people from GitHub</a> and I was struck by <a href="http://hubot.github.com/">Hubot</a>.</p>
<p>My understanding of what they are doing: Hubot is a chat bot — in this case it hangs out in <a href="https://campfirenow.com/">Campfire</a> chat rooms, but it could equally be an <a href="http://en.wikipedia.org/wiki/Internet_Relay_Chat_bot"><span class="caps">IRC</span> bot</a>. It started out …</p><p>I was listening to a <a href="http://hanselminutes.com/375/on-culture-and-remoteness-at-github-with-paul-betts-and-justin-spahr-summers">podcast with some people from GitHub</a> and I was struck by <a href="http://hubot.github.com/">Hubot</a>.</p>
<p>My understanding of what they are doing: Hubot is a chat bot — in this case it hangs out in <a href="https://campfirenow.com/">Campfire</a> chat rooms, but it could equally be an <a href="http://en.wikipedia.org/wiki/Internet_Relay_Chat_bot"><span class="caps">IRC</span> bot</a>. It started out doing silly things, as bots often do, then started offering up status messages. Eventually it got a command language where you could actually <em>do</em> things, like deploy servers.</p>
<p>As described, as Hubot grew new powers it has given people at GitHub new ways to work together with some interesting features:</p>
<ol>
<li>
<p>Everyone else (in your room/group) can see you interacting with Hubot. This gives people awareness of what each other are doing.</p>
</li>
<li>
<p>There’s organic knowledge sharing. When you watch someone doing stuff, you learn how to do it yourself. If you ask a question and someone answers the question by doing stuff <em>in that same channel</em> then the learning is very concrete and natural.</p>
</li>
<li>
<p>You get a history of stuff that was done. In GitHub’s case they have custom logging and search interfaces for their Campfire channels, so there’s a searchable database of everything that happens in chat rooms.</p>
</li>
<li>
<p>What makes search, learnability, and those interactions so useful is that <em>actions</em> are intermixed with <em>discussion</em>. It’s only modestly interesting that you could search back in history to find commands to Hubot. It’s far more interesting if you can see the context of those commands, the intentions or mistakes that led to that command.</p>
</li>
</ol>
<p>This setup has come back to mind repeatedly while I’ve been thinking about the concepts that <a href="http://www.whatthedruck.com/">Aaron</a> and I have been working through with <a href="https://togetherjs.com/">TogetherJS</a>, my older <a href="https://github.com/mozilla/browsermirror">Browser Mirror project</a> and now with <a href="https://github.com/mozilla/hotdish/">Hotdish, our new experiment in browser collaboration</a>.</p>
<p>With each of these I’ve found myself expanding the scope of what we capture and share with the group — a single person’s session (in Browser Mirror), multiple people working in parallel across a site (in TogetherJS), and then multiple people working across a browser session (Hotdish). One motivation for this expansion is to place these individual web interactions in a social, and purposeful, context. In the same way your Hubot interactions are surrounded by a conversation, I want to surround web interactions in a person’s or group’s thought process: to expose some of the <em>why</em> behind those actions.</p>
<p>What would it look like if we could get these features of Hubot, but with a workflow that encompasses any web-based tool? I don’t know, but a few thoughts taken from the previous list:</p>
<ol>
<li>
<p>Expose your team-related browsing to your team. Give other people some sense of what you are doing. <strong>Questions</strong>: should you lead in with an explicit “I am trying to do X”? Or can a well-connected team infer purpose or query you about your purpose given just a set of actions? If you use a task management tool — issue tracker, project management tool, <span class="caps">CRM</span>, etc — is that launching point itself sufficient declaration of intent?</p>
</li>
<li>
<p>Let other people jump in, watching or participating in a session. You might start with an overview of their browsing activity, as it’s just too much information to watch it all flow by, as you might be able to do with Hubot. But then you want to support closer interaction. It might be a little like being the passenger in a <a href="https://bitbucket.org/spooning/">pair programming</a> situation, except instead of watching the other person by literally looking over their shoulder, we can let you opt in to watching remotely, and maybe allow for catching up or summarizing segments of the work, instead of requiring the two people to be linked in real time through the entire process. <strong>Questions</strong>: how do you determine that something is going to be of interest to you? Do the participants stay in well-defined leading/following roles, or do they switch?</p>
</li>
<li>
<p>Record actions. Maybe this means “going on the record” sometimes. Ideally you’d be able to go on the record retroactively, like holding a recording locally and allowing you to put that recording in a global record if you decide it is needed. One can imagine different levels of granularity possible for the recording. A simple list of URLs you visited. A recording of <span class="caps">DOM</span> states. Some applications might be able to expose their own internal states that can be reconstructed, like in an automatically versioned resource like Google Docs the internal version numbers would be sufficient to see the context at that moment. <strong>Questions</strong>: how do you figure out what information is actually useful? Is it possible to save everything and analyze later, or is that too much data (and traffic)? Can we automatically curate?</p>
</li>
<li>
<p>Push enough communication through the browsing context and collaboration tool that there is a context for the actions. This helps identify false starts (both to trim them, but also as an opportunity to help with future similar false starts), underlying purposes, bugs in the communication process itself (“I was trying to ask you to do X, but you thought I meant Y”), and give a resource to match future goals and purposes against past work. <strong>Questions</strong>: does this make voice communication sub-optimal (compared to searchable text chat)? Do we want to identify subtasks? Or is it better to flatten everything to the group’s purpose — in some sense all tasks relate to the purpose?</p>
</li>
</ol>
<p>Now you might ask: why web/browser focused instead of application-focused, or a tool that coordinates all these tasks (Google Docs/Apps? Wave?), or communication-tool-focused (like Hubot and Campfire are)? Mostly because I think that web-based tools encompass <em>enough</em> and will consistently encompass more of our work, and because the web makes these things feasible — it might be a half-assed semantic system, but it’s more semantic than anything else. And of course the web is cloudy, which in this case is important because it means a third party (someone watching, or a recording) has a similar perspective to the person doing the action. Personal computing is challenging because of a huge local state that is hard to identify and communicate to observers.</p>
<p>I think there’s an idea here, and one that doesn’t require recreating every tool individually to embody these ideas, but instead can happen at the platform level (the platform here being the browser).</p>Defaulting To Together2014-02-17T12:31:00-06:002014-02-17T12:31:00-06:00Ian Bickingtag:ianbicking.org,2014-02-17:/blog/2014/02/defaulting-to-together.html<p>I’ve been working on an experiment, <a href="https://github.com/mozilla/hotdish/">Hotdish</a>, for several weeks now with <a href="http://www.whatthedruck.com/">Aaron Druck</a> and <a href="https://twitter.com/gregglind">Gregg Lind</a>. I’m really excited about what we’re doing, and in particular I’m excited about some of the principles we are bringing to the design. Hotdish is an experiment in sharing …</p><p>I’ve been working on an experiment, <a href="https://github.com/mozilla/hotdish/">Hotdish</a>, for several weeks now with <a href="http://www.whatthedruck.com/">Aaron Druck</a> and <a href="https://twitter.com/gregglind">Gregg Lind</a>. I’m really excited about what we’re doing, and in particular I’m excited about some of the principles we are bringing to the design. Hotdish is an experiment in sharing a browser session among a group of peers — you activate Hotdish on one browser window and everyone in your group sees what you do in that window, and we have tools to interact in the context of that session.</p>
<p>We’ve started the design with the expectation that Hotdish will be used with a group of people you want to be working with, and we expect that you trust this group of people. We’re not building this as an internet-wide tool, one where people will be trolling each other, or one where some people will be an order of magnitude more noisy than everyone else. So when we take a regular internet collaboration/cooperation idea and rephrase it in the context of Hotdish we think about how we can change default behaviors to make use of that trust.</p>
<p>Instead of using the tool to restrict people from bothering each other, we want to create a tool that <strong>enables powerful new ways for one person to bother another</strong> in the group. If that’s a problem, we expect you to deal with that socially rather than building something into the tool. (<a href="http://www.youtube.com/watch?v=GtrSn8WwCa4">Use your words!</a>)</p>
<h3>Pushing tabs instead of posting links</h3>
<p>In Hotdish instead of <em>asking</em> someone to go to a link, you <em>push</em> a link to everyone. The normal flow is you copy the <span class="caps">URL</span>, you go to your communication medium (instant message, chat room, shared document), you paste the <span class="caps">URL</span>, and then you beg everyone to please click it. Then you ask if everyone is really there yet? Then you make sure everyone knows it wasn’t the second-to-last link, but the very last link you pasted. Just this minute. Wait, no, not the link that other person pasted in, though I suppose you should all go there too. <span class="caps">OK</span>, are we all on the same page now? Oh wait, I just made a change, can everyone reload?</p>
<p>No: with Hotdish you just push the link, <em>make</em> it open for everyone, and once you’ve done it you’ll even get a second confirmation because we show who is on a page. The only problem is right now we open a background tab (because forcing a tab switch is too jarring), but I want to figure out how to push even harder, to let people be more assertive if they choose. Like maybe if you push twice in succession everyone gets a big “Alice really wants you to see: [page]” notification that you can’t really ignore, or if you explicitly ignore it then Alice knows you decided to ignore it.</p>
<h3>Presenting and peeking in</h3>
<p>Another example where we’re being aggressive: in Hotdish you can <em>present</em> a page to someone, showing them exactly what you see (including details that might not be the same for them if they went to the same <span class="caps">URL</span>). But you can also <em>view</em> someone else’s page. When you “view” a page you are viewing the page exactly as the other person sees it (as opposed to simply visiting the same <span class="caps">URL</span> as the other person). It’s like the ability to peek over anyone’s shoulder.</p>
<h3>Audio/video <span class="caps">CB</span>/walkie-talkie mode</h3>
<p>We haven’t added any voice tools to Hotdish yet. It didn’t seem like the heart of what we were trying to explore. Of course we knew we’d want to enable communication among the group, but we didn’t think we’d bring anything new, and so the effort didn’t feel like it would bring much. But then we also hadn’t thought about how we might rethink the ideas in this concept; instead we were just borrowing what seemed like the obvious interface.</p>
<p>After some thought a feature I’d like to try is a talk-at-anyone mode. This is a little like <a href="https://www.sqwiggle.com/">Sqwiggle</a>, where you can talk with anyone without confirmation. Still I find the Sqwiggle model a little much, where anyone can watch me of their own volition. Maybe it’s fine, I might be wrong. But I’m more open to anyone talking <em>to</em> me, and then requiring confirmation before they can listen or watch me. Having someone yell at me spontaneously would probably be annoying, but that’s a social problem.</p>
<h3>Sharing a record of your activity</h3>
<p>The core feature of Hotdish is that everyone can see any of the pages you open, and see some of your navigational behavior — like when you go to a new page or change active tabs. Realistically you would not do everything in your Hotdish window (Hotdish exposes only one browser window to the group — everything you do in other windows remains personal), but I hope that Hotdish could allow people to do more things in front of their peers. For instance, from an article <a href="https://source.opennews.org/en-US/learning/making-remote-work-work/"><em>Making Remote Work Work</em></a>:</p>
<blockquote>
<p>It’s my earnest belief that some people will have higher expectations for you because you work remotely. It’s very easy for them to believe you’re in your underwear playing Final Fantasy instead of slogging through the documentation for Django. Not all work has obvious output and when they can’t see you at your desk, it’s tempting to log those blank hours as time wasted.</p>
</blockquote>
<p>One of the hidden parts of work (paid work, school work, or volunteer work) that I want to expose with Hotdish is <strong>research</strong>. Research is slogging through the docs, finding out if some idea you’ve had <em>maybe</em> exists (and perhaps finding out it doesn’t), it’s finding the right term, looking up a date or meeting… it’s all these little things we constantly do. But those things are never the focus of “collaboration”. Research is the stuff you do <em>before</em> you can tell anyone what you’ve learned. It gets seen as a prerequisite to accomplishing real work, instead of a part of what it means to <em>do</em> real work. And yet when asked everyone will defend the value of research: we know we should value this thing, but because we have a hard time seeing it, too often we do not.</p>
<p>By putting your work in front of anyone — even if you aren’t trying to share anything with anyone, or have not come to any determination — I hope we can make research as collaborative as conclusions are.</p>
<h3>Recording history</h3>
<p>A feature we are exploring is called the <em>Activity Log</em>: a persistent record of what happens in the group. Our first foray into this is somewhat comically primitive: we are just pasting the activities into an <a href="http://etherpad.org/">Etherpad</a> document. It’s primitive but I’m going to have a hard time getting myself to replace it because there’s something that just feels really <em>right</em> about using an editor.</p>
<p>After my <a href="http://www.ianbicking.org/blog/2014/02/hubot-chat-web-working-in-the-open.html">last post</a> I got <a href="https://plus.google.com/+IanBicking/posts/NvnBBQ6eCFe">a comment</a> challenging me to consider the “social implications of showing others one’s mistakes”. A fair challenge, and honestly I have <em>tried</em> to ignore that for all the reasons I’m talking about here. But I think in this silly model of recording activities to a text editor there is also a response to this: we keep a record, along with everything else we do, because we want to build a model for a constructive and supportive group to enhance their work together. But a constructive and supportive group is also based on trust. One of the ways we demonstrate trust is with things like using an editable document instead of a strict log: we should trust each other to edit history, and edit out history. We should trust that people use that power well.</p>
<p>In a live environment like Hotdish it’s hard to actually make sure no one saw something. You open a link, it shows up to everyone that moment. You close it, and maybe we can figure out a way to keep it out of history, or allow you to remove it from history, but we can’t remove it from the memories of everyone who saw it. But this is another kind of politeness: we ask that people respect even our retroactive attempts at privacy. This is something that Facebook, for instance, works pretty hard at — they do their best to make deleted content really disappear. Programmer-designed tools tend to be horrible at this. I think because the programmer knows you can’t <em>really</em> delete history, you can’t know what has been recorded on other clients, you can’t erase people’s memory. They don’t put value on politely agreeing to forget. And so programmer-designed tools almost never let you edit history. We will not make this mistake.</p>
<p>So that’s some of what we’re thinking. Some of these ideas aren’t going to work out. Dogfooding will be essential. But we can’t see how far we can go with putting people together unless we go too far and then pull back.</p>
<p>I’m interested in other ideas for somewhat uncomfortably intimate browser-mediated sharing experiences. Have any?</p>Collaboration as a Skeuomorphism for Agents2014-02-21T15:06:00-06:002014-02-21T15:06:00-06:00Ian Bickingtag:ianbicking.org,2014-02-21:/blog/2014/02/collaboration-as-a-skeuomorphism-for-agents.html<p>In concept videos and imaginings about the Future Of Computing we often see <a href="https://en.wikipedia.org/wiki/Software_agent#User_agents_.28personal_agents.29">Intelligent Agents</a>: smart computer programs that work on your behalf.</p>
<p>But to be more specific, I’m interested in agents that don’t work through formal rules. An <a href="https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol#Outgoing_mail_SMTP_server"><span class="caps">SMTP</span> daemon</a> acts on your behalf routing messages to …</p><p>In concept videos and imaginings about the Future Of Computing we often see <a href="https://en.wikipedia.org/wiki/Software_agent#User_agents_.28personal_agents.29">Intelligent Agents</a>: smart computer programs that work on your behalf.</p>
<p>But to be more specific, I’m interested in agents that don’t work through formal rules. An <a href="https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol#Outgoing_mail_SMTP_server"><span class="caps">SMTP</span> daemon</a> acts on your behalf routing messages to your intended destination, but they do so in an entirely formal way, one that is “correct” or “incorrect”. And if such agents act with initiative, it is initiative based on formal rules, and those formal rules ultimately lead back to the specific intentions of whoever wrote the rules, the rules defined in terms of unambiguous inputs.</p>
<p>Progress on intelligent agents seems to be thin. Gmail sorts some stuff for us in a “smart” way. There are some smart command-based interfaces like Siri, but they are mostly smart frontends for formalized backends, and they lack initiative. Maybe we get an intelligent alert or two, but it’s a tiny minority given all the dumb alerts we get.</p>
<p>One explanation is that we don’t have intelligent agents because we haven’t figured out intelligence. But whatever, intelligent is as intelligent does, if this was the only reason then I would expect to see more dumb <em>attempts</em> at intelligent agents.</p>
<p>It seems worth approaching this topic with more <a href="http://johncarlosbaez.wordpress.com/2013/09/29/levels-of-excellence/">mundane attempts</a>. But there are reasons we (“we” being “us technologists”) don’t.</p>
<h3>Intelligent agents will be chronically buggy</h3>
<p>If we want agents to do things where it’s not clear what to do, then sometimes they are going to do the wrong thing. It might be a big-scale wrong thing, like they buy airplane tickets and we wanted them to buy concert tickets. Or a small thing, like we want them to buy airplane tickets and something changed about the interface to buy those tickets and now the agent is just confused.</p>
<p>Intelligent agents will be accepting rules from the people they are working for, from normal people. Then normal users become programmers in a sense. Maybe it’s a hand-holding cute and fuzzy programming language based on natural language, but it is the nature of programming that you will create your own bugs. Only a minority of bugs are created because you expressed yourself incorrectly, most bugs are because you thought it through incorrectly, and no friendly interface can fix that.</p>
<p>How then do we deal with buggy intelligent agents, while also allowing them to do useful things?</p>
<p>There are two things that come to mind: logging and having the agent check before doing something. Both are hard in practice.</p>
<p><strong>Logging</strong>: this lets you figure out who was responsible for a bad action, or the reasoning behind an action.</p>
<p>Programmers do this all the time to understand their programs, but for an intelligent agent the user is also a developer. When you ask your agent to watch for something, or you ask it to act under certain circumstances, then you’ve programmed it, and you may have programmed it wrong. Fixing that doesn’t mean looking at stack traces, but there have to be some techniques.</p>
<p>You don’t want to have to take users into the mind of the person who programmed the agent. So how can you log actions so they are understandable?</p>
<p><strong>Checking</strong>: you’ll want your agent to check in with you before doing some things. Like before actually buying something. Sometimes you’ll want the agent to check in even more often, not because you expect the agent to do something impactful, but because it might do something impactful due to a bug. Or you are just getting to know each other.</p>
<p>Among people this kind of check-in is common, and we have a rich language to describe intentions and to implicitly get support for those intentions. With computer interactions it’s a little less clear: how does an agent talk about what it thinks it <em>should</em> do? How do we know what it says it thinks it should do is what it actually plans to do?</p>
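<p>To make the two ideas concrete, here is a deliberately small sketch. Everything in it is hypothetical (the <code>Agent</code> class, its <code>act()</code> method, the <code>confirm</code> callback); it only shows the shape of an agent that writes down what it plans and why in plain language, and asks before doing anything marked as impactful. It does nothing to answer the harder question of whether the stated plan matches what the agent will really do.</p>

```python
class Agent:
    """Hypothetical agent that logs its reasoning and checks in before impactful actions."""

    def __init__(self, confirm):
        # confirm is a callable(description) -> bool supplied by the user.
        self.confirm = confirm
        self.log = []

    def act(self, description, reason, action, impactful=False):
        # Record the intention in words the user can read later.
        self.log.append(f"plan: {description} (because {reason})")
        if impactful and not self.confirm(description):
            self.log.append(f"skipped: {description} (user declined)")
            return None
        result = action()
        self.log.append(f"done: {description}")
        return result
```

<p>A more cautious setup could pass <code>impactful=True</code> for everything while the user and the agent are still getting to know each other, then relax it over time.</p>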
<h3>Collaboration</h3>
<p>We deal with lots of intelligent agents all the time: each other. We can give each other instructions, and in this way anyone can program another human. We report back to each other about what we did. We can tell each other when we are confused, or unable to complete some operation. We can confirm actions. Confirmation is almost like functional testing, except often it’s the person who receives the instructions who initiates the testing. And all of this is rooted in <em>empathy</em>: understanding what someone else is doing because it’s more-or-less how you would do it.</p>
<p>It’s in these human-to-human interactions we can find the metaphors that can support computer-based intelligent agents.</p>
<p>But there’s a problem: computer-based intelligent agents perform best at computer-mediated tasks. But we usually work alone when we personally perform computer-mediated tasks. When we coordinate these tasks with each other we often resort to low-fidelity check-ins, an email or <span class="caps">IM</span>. We don’t even have ways to delegate except via the wide categories of permission systems. If we want to build intelligent agents on the intellectual framework of person-to-person collaboration, we need much better person-to-person collaboration for our computer-based interactions.</p>
<p>(I will admit that I may be projecting this need onto the topic because I’m very interested in person-to-person collaboration. But then I’m writing this post in the hope I can project the same perspective onto you, the reader.)</p>
<p>My starting point is the kind of collaboration embodied in <a href="https://togetherjs.com">TogetherJS</a> and some follow-on ideas I’m <a href="https://togetherjs.com/hotdish/">experimenting with</a>. In this model we let people see what each other are “doing” — how they interact with a website. This represents a kind of log of activity, and the log is presented as human-like interactions, like a recording.</p>
<p>But I imagine many ways to enter into collaboration: consider a mode for teaching, where one person is trying to tell the other person how to do something. In this model the helper is giving directed instructions (“click here”, “enter this text”). For teaching it’s often better to tell than to do. But this is also an opportunity to check in: if my intelligent agent is instructing me to do some action (perhaps one I don’t entirely trust it to do on my behalf) then I’m still confirming every specific action. At the same time the agent can benefit me by suggesting actions I might not have figured out on my own.</p>
<p>Or imagine a collaboration system where you let someone pull you in part way through their process. A kind of “hey, come look at this.” This is where the diligent intelligent agent can spend its time checking for things, and then bring your attention when it’s appropriate. Many of the same controls we might want for interacting with other people (like a “busy” status) apply well to the agent who also wants to get our attention, but should maybe wait.</p>
<p>Or imagine a “hey, what are you doing, let me see” collaboration mode, where I invite myself to see what you are doing. Maybe I’ve set up an intelligent agent to check for some situation. Anytime you set up <em>any</em> kind of detector like this, you’ll wonder: is it really still looking? Is it looking for the right thing? I think it should have found something, why didn’t it? This is where it would be nice to be able to peek into the agent’s actions, to watch it doing its work.</p>
<p>If applications become more collaboration-aware there are further possibilities. For instance, it would be great if I could participate in a collaboration session in GitHub and edit a file with someone else. Right now the other person can only “edit” if they also have permission to “save”. As GitHub is now this makes sense, but if collaboration tools were available we’d have a valid use case where only one of the people in the collaboration session could save, while the other person can usefully participate. There’s a kind of cooperative interaction in that model that would be perfect for agents.</p>
<p>We can imagine agents participating already in the collaborative environments we have. For instance, when a continuous integration system detects a regression on a branch destined for production, it could create its own GitHub pull request to revert the changes that led to a regression. On Reddit there’s a <a href="https://github.com/Deimos/AutoModerator">bot</a> that I’ve encountered that allows Subreddits to create fairly subtle rules, like allow image posts only on a certain day, ban short comments, check for certain terms, etc. But it’s not something that blocks submission (it’s not part of Reddit itself), instead it uses the same moderator interface that a person does, and it can use this same process to explain to people why their posts were removed, or allow other moderators to intervene when something valid doesn’t happen to fit the rules.</p>
<h3>What about APIs?</h3>
<p>In everything I’ve described agents are interacting with interfaces in the same way a human interacts with the interface. It’s like everything is a screen scraper. The more common technique right now is to use an <span class="caps">API</span>: a formal and stabilized interface to some kind of functionality.</p>
<p>I suggest using the interfaces intended for humans, because those are the interfaces humans understand. When an agent wants to say “I want to submit this post” if you can show the human the filled-in form and show that you want to hit the submit button, you are using what the person is familiar with. If the agent wants to say “this is what I looked for” you can show the data in the context the person would themselves look to.</p>
<p>APIs usually don’t have a staging process like you find in interfaces for humans. We don’t expect humans to act correctly. So we have a shopping cart and a checkout process; you don’t just submit a list of items to a store. You have a composition screen with preview, or interstitial preview. Dangerous or destructive operations get a confirmation step, and that step could serve as a warning to an agent just as well as to a human.</p>
<p>None of this invalidates the reasons to use an <span class="caps">API</span>. And you can imagine APIs with these intermediate steps built in. You can imagine an <span class="caps">API</span> where each action can also be marked as “stage-only” and then returns a link where a human can confirm the action. You can imagine an <span class="caps">API</span> where each data set returned is also returned with the <span class="caps">URL</span> of the equivalent human-readable data set. You can imagine delegation APIs, where instead of giving a category of access to an agent via OAuth, you can ask for some more selective access. All of that would be great, but I don’t think there’s any movement towards this kind of <span class="caps">API</span> design. And why would there be? There’s no one eager to make use of it.</p>
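<p>To make the “stage-only” idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the <code>StagingApi</code> class, the <code>submit</code> and <code>confirm</code> methods, and the <code>example.invalid</code> confirmation <span class="caps">URL</span> are hypothetical names, not any existing <span class="caps">API</span>.</p>

```python
import secrets
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class StagingApi:
    """Hypothetical API wrapper: actions can be staged for human confirmation."""

    base_url: str = "https://example.invalid/confirm/"  # placeholder URL
    _pending: Dict[str, Callable[[], str]] = field(default_factory=dict)

    def submit(self, action: Callable[[], str], stage_only: bool = False) -> dict:
        if not stage_only:
            # Ordinary API behaviour: the action runs immediately.
            return {"status": "done", "result": action()}
        # Stage-only: hold the action and hand back a link for a human.
        token = secrets.token_urlsafe(8)
        self._pending[token] = action
        return {"status": "staged", "confirm_url": self.base_url + token}

    def confirm(self, confirm_url: str) -> dict:
        # Called when the human follows the link and approves the action.
        token = confirm_url.rsplit("/", 1)[-1]
        action = self._pending.pop(token)  # KeyError if unknown or already used
        return {"status": "done", "result": action()}


purchases = []
api = StagingApi()
# The agent stages a purchase instead of making it.
staged = api.submit(lambda: purchases.append("book") or "bought book", stage_only=True)
nothing_bought_yet = purchases == []
# Later, the human follows the link and confirms.
done = api.confirm(staged["confirm_url"])
```

<p>The agent only ever holds a link; the side effect happens when the person confirms, which is roughly the checkout step a human interface would have given us anyway.</p>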
<h3>That fancy Skeuomorphism term from the title</h3>
<p>A <a href="https://en.wikipedia.org/wiki/Skeuomorph#Digital_skeuomorphs">Skeuomorphism</a> is something built to be reminiscent of an existing tool, not out of any necessity, but because it provides some sense of familiarity. Our calendar software looks like a physical calendar. We talk of “folders”. We make our buttons look depressable even though it is all a simulation of a physical control.</p>
<p>This has come to mind when I talk of using the same metaphors for interacting with a computer program that we do for interacting with a human.</p>
<p>When we need a new way for people to work with computers a lot of success has come from finding bridges between our existing practices and a computer-based practice. The desktop instead of the command line, the use of cards on mobile, the many visual metaphors that we use, the way we phrase emails as letters, etc. Sometimes these are just scaffolding while people get used to the new systems (maybe <a href="http://gizmodo.com/what-is-flat-design-508963228">flat design</a> is an example). And of course you can pick the wrong metaphors (or <a href="http://www.youtube.com/watch?v=ZegWedG-jk4">go too far</a>).</p>
<p>In this case the metaphor isn’t using the representation of a physical object in the computer, but using the representation of a fellow human as a stand-in for a program.</p>
<p>The goal is enabling a whole list of <em>maybe</em> actions. Maybe “intelligent” doesn’t really mean “knowledgeable and smart” but “is not formally verifiable as correct” and “successfully addresses a domain that cannot be fully understood”. You don’t need formal <span class="caps">AI</span> for these kinds of tasks. Heuristics don’t need to be sophisticated. But we need interfaces where a computer can make <em>attempts</em> without demanding correctness. And human interaction seems like the perfect model for that.</p>Towards a Next Level of Collaboration2014-03-03T12:40:00-06:002014-03-03T12:40:00-06:00Ian Bickingtag:ianbicking.org,2014-03-03:/blog/2014/03/towards-next-level-of-collaboration.html<p>With <a href="https://togetherjs.com">TogetherJS</a> we’ve been trying to make a usable tool for the web we have, and the browsers we have, and the web apps we have. But we’re also accepting a lot of limitations.</p>
<p>For a particular scope the limitations in TogetherJS are reasonable, but my own goals have been more far-reaching. I am interested in collaboration with as broad a scope as the web itself. (<a href="http://www.ianbicking.org/blog/2014/02/saying-goodbye-to-python.html">But no broader than the web because I’m kind of biased.</a>)
“Collaboration” isn’t quite the right term: it implies a kind of active engagement in creation, but there are more ways to work together than collaboration. TogetherJS was previously called TowTruck, but we wanted to rename it to something more meaningful. While brainstorming we kept coming back to names that included some form of “collaboration” but I strongly resisted it because it’s such a mush-mouthed term with too much baggage and too many preconceptions.</p>
<p>When we came up with “together” it immediately seemed right. Admittedly the word feels a little cheesy (<em>it’s a web built out of hugs and holding hands!</em>) but it covers the broad set of activities we want to enable.</p>
<p>With the experience from TogetherJS in mind I want to spend some time thinking about what a less limited tool would look like. Much of this has become manifest in <a href="https://github.com/mozilla/hotdish">Hotdish</a>, and the notes below have informed its design.</p>
<h2>Degrees of collaboration/interaction</h2>
<p>Intense collaboration is cool, but it’s not comprehensive. I don’t <em>want</em> to always be watching over your shoulder. What will first come to mind is privacy, but that’s not interesting to me. I would rather address privacy by helping you scope your actions, let you interact with your peers or not and act appropriately with that in mind. I don’t want to engage with <em>my</em> collaborators all the time because it’s boring and unproductive and my eyes glaze over. I want to engage with other people <em>appropriately</em>: with all the intensity called for given the circumstances, but also all the passivity that is also sometimes called for.</p>
<p>I’ve started to think in terms of categories of collaboration:</p>
<h3>1. Asynchronous message-based collaboration</h3>
<p>This includes email of course, but also issue trackers, planning tools, any notification system. If you <a href="https://www.google.com/search?q=collaboration+software">search for “collaboration software”</a> this is most of what you find, and much of the innovation is in representing and organizing the messages.</p>
<p>I don’t think I have any particularly new ideas in this well-explored area. That’s not to say there aren’t lots of important ideas, but the work I want to do is in complementing these tools rather than competing with them. But I do want to note that they exist on this continuum.</p>
<h3>2. <strong>Ambient awareness</strong></h3>
<p>This is the awareness of a person’s presence and activity. We have a degree of this with Instant Messaging and chat rooms (<span class="caps">IRC</span>, Campfire, etc). But they don’t show <em>what we are actively doing</em>, just our presence or absence, and in the case of group discussions some of what we’re discussing with other people.</p>
<p>Many tools that indicate presence also include status messages which would purport to summarize a person’s current state and work. I’ve never worked with people who keep those status messages updated. It’s a very explicit approach. At best it devolves into a record of what you <em>had been doing</em>.</p>
<p>A more interesting tool to make people’s presence more <em>present</em> is <a href="https://www.sqwiggle.com/">Sqwiggle</a>, a kind of always-on video conference. It’s not exactly always-on, there is a low-fidelity video with no audio until you start a conversation with someone and it goes to full video and audio. This way you know not only if someone is actually sitting at the computer, but also if they are eating lunch, if they have the furrowed brows of careful concentration, or are frustrated or distracted. Unfortunately most people’s faces only show that they are looking at a screen, with the slightly studious but mostly passive facial expressions that we have when looking at screens.</p>
<p>Instant messaging has grown to include an additional presence indicator: <em>I am currently typing a response</em>. A better fidelity version of this would indicate if I am typing right now, or if I forgot I started typing and switched tabs but left text in the input box, or if I am trying hard to compose my thoughts (typing and deleting), or if I’m pasting something, or if I am about to deliver a soliloquy in the form of a giant message. (Imagine a typing indicator that gives a sense of the number of words you have typed but not sent.)</p>
<p>I like that instant messaging <em>detects</em> your state automatically, using something that you are already engaged with (the text input box). Sqwiggle has a problem here: because you aren’t trying to project any emotions to your computer screen, Sqwiggle catches expressions that don’t mean anything. We can engage with our computers in different ways, there’s something there to express, it’s just not revealed on our faces.</p>
<p>I’d like to add to the activity indicators we have. Like the pages (and web apps) you are looking at (or some privacy-aware subset). I’d like to show <em>how</em> you are interacting with those pages. Are you flopping between tabs? Are you skimming? Scrolling through in a way that shows you are studying the page? Typing? Clicking controls?</p>
<p>I want to show something like the body language of how you are interacting with the computer. First I wondered if we could interpret your actions and show them as things like “reading”, “composing”, “being pissed off with your computer”, etc. But then I thought more about body language. When I am angry there’s no “angry” note that shows up above my head. A furrowed brow isn’t a message, or at least mostly not a message. Body language is what we read from cues that aren’t explicit. And so we might be able to show <em>what</em> a person is doing, and let the person watching figure out <em>why</em>.</p>
<h3>3. Working in <strong>close parallel</strong></h3>
<p>This is where both people (or more than 2 people) are actively working on the same thing, same project, same goal, but aren’t directly supporting each other at every moment.</p>
<p>When you’ve entered into this level of collaboration you’ve both agreed that you are working together — you’re probably actively talking through tasks, and may regularly be relying on each other (“does what I wrote sound right?” or “did you realize this test is failing” etc). A good working meeting will be like this. A bad meeting would probably have been better if you could have stuck to ambient awareness and promoted it to a more intense level of collaboration only as needed.</p>
<h3>4. <strong>Working directly</strong></h3>
<p>This is where you are both locked on a <em>single</em> task. When I write something and say “does what I wrote sound right?” we have to enter this mode: you have to look at exactly what I’m talking about. In some sense “close parallel” may mean “prepared to work directly”.</p>
<p>I have found that video calls are better than audio-only calls, more than I would have expected. It’s not because the video content is interesting. But the video makes you work directly, while being slightly uncomfortable so you are encouraged to acknowledge when you should end the call. In a way you want your senses filled. Or maybe that’s my propensity to distraction.</p>
<p>There’s a lot more to video calls than this (like the previously mentioned body language). But in each feature I suspect there are parallels in collaborative work. Working directly together should show some of the things that video shows when we are focused on a <em>conversation</em>, but can’t show when we are focusing on <em>work</em>.</p>
<h3>5. <strong>Demonstrating</strong> to another person</h3>
<p>This is common for instruction and teaching, but that shouldn’t be the only case we consider. In Hotdish we have often called it “presenting” and “viewing”. In this mode someone is the driver/presenter, and someone is the passenger/viewer. When the presenter focuses on something, you want the viewer to be aware of that and follow along. The presenter also wants to be confident that the viewer is following along. Maybe we want something like how you might say “uh huh” when someone is talking to you — if a listener says nothing it will throw off the talker, and these meaningless indications of active listening are important.</p>
<p>Demonstration could just be a combination of direct work and social convention. Does it need to be specially mediated by tools? I’m not sure. Do we need a talking stick? Can I take the talking stick? Are these interactions like a conversation, where sometimes one person enters into a kind of monologue, but the rhythm of the conversation will shift? If we focus on the demonstration tools we could miss the social interactions we are trying to support.</p>
<h3>Switching modes</h3>
<p>Between each of these styles of interaction I think there must be some kind of positive action. A natural promotion or demotion of your interaction with someone should be mutual. (A counterexample would be the dangling <span class="caps">IM</span> conversation, where you are never sure it’s over.)</p>
<p>At the same time, the movement between modes also builds your shared context and your relationship with the other person. You might be proofing an article with another person, and you say: “clearly this paragraph isn’t making sense, let me just rewrite it, one minute” — now you know you are leaving active collaboration, but you also both know you’ll be reentering it soon. You shouldn’t have to record that expectation with the tool.</p>
<p>I’m reluctant to put boundaries up between these modes, I’d rather tools simply <em>inform</em> people that modes are changing and not <em>ask</em> if they can change. This is part of the principles behind <a href="http://www.ianbicking.org/blog/2014/02/defaulting-to-together.html">Defaulting To Together</a>.</p>
<h2>Ownership</h2>
<p>At least in the context of computers we often have strong notions of <em>ownership</em>. Maybe we don’t have to — maybe it’s because we have to hand off work explicitly, and maybe we have to hand off work explicitly because we lack fluid ways to interact, cooperate, delegate.</p>
<p>With good tools in hand I see “ownership” being exchanged more regularly:</p>
<ul>
<li>
<p>I find some documentation, then show it to you, and now it’s yours to make use of.</p>
</li>
<li>
<p>I am working through a process, get stuck, and need your skills to finish it up. Now it’s yours. But you might hand it back when you unstick me.</p>
</li>
<li>
<p>You are working through something, but are not permitted to complete the operation, you have to hand it over to me for me to complete the last step.</p>
</li>
</ul>
<p>Layered on this we have the normal notions of ownership and control — the login accounts and permissions of the applications we are using. Whether these are in opposition to cooperation or maybe complementary I have not decided.</p>
<h2>Screensharing vs. Peer-to-Peer</h2>
<p>Perhaps a technical aside, but when dealing with real-time collaboration (not asynchronous) there are two distinct approaches.</p>
<p><strong>Screensharing</strong> means one person (and one computer) is “running” the session — that one person is logged in, their page or app is “live”, everyone else sees what they see.</p>
<p>Screensharing doesn’t mean other people can’t interact with the screen, but any interaction has to go through the owner’s computer. In the case of a web page we can share the <span class="caps">DOM</span> (the current visual state of the page) with another person, but we can’t share the Javascript handlers and state, cookies, etc., so most interactions have to go back through the original browser. Any side effects have to make a round trip. Latency is a problem.</p>
<p>It’s hard to figure out exactly what interactivity to implement in a screensharing situation. Doing a view-only interaction is not too hard. There are a few things you can add after that — maybe you let someone touch a form control, suggest that you follow a link, send clicks across the wire — but there’s no clear line to stop at. Worse, there’s no clear line to <em>express</em>. You can implement certain <em>mechanisms</em> (like a click), but these don’t always map to what the user thinks they are doing — something like a drag might involve a mousedown/mousemove/mouseup event, or it might be implemented <a href="https://developer.mozilla.org/en-US/docs/DragDrop/Drag_and_Drop">directly as dragging</a>. Implementing one of those interactions is a lot easier than the other, but the distinction means nothing to the user.</p>
<p>When you implement incomplete interactions you are setting up a situation where a person can do something in the original application that viewers can’t do, even though it <em>looks</em> like the real live application. An uncanny valley of collaboration.</p>
<p>I’ve experimented with <span class="caps">DOM</span>-based screen sharing in <a href="https://github.com/mozilla/browsermirror">Browser Mirror</a>, and you can see this approach in a tool like <a href="https://surfly.com/">Surfly</a>. As I write this a minimal version of this is available in Hotdish.</p>
<p>In <strong>peer-to-peer</strong> collaboration both people are viewing their own version of the live page. Everything works exactly like in the non-collaborative environment. Both people are logged in as themselves. This is the model <a href="https://togetherjs.com">TogetherJS</a> uses, and is also present as a separate mode in Hotdish.</p>
<p>This has a lot of obvious advantages over the problems identified above for screensharing. The big disadvantage is that hardly anything is collaborative by default in this model.</p>
<p>In the context of the web the building blocks we <em>do</em> have are:</p>
<ul>
<li>
<p>URLs. Insofar as a <span class="caps">URL</span> defines the exact interface you look at, then putting both people at the same <span class="caps">URL</span> gives a consistent experience. This works great for applications that use lots of server-side logic. Amazon is pretty great, for example, or Wikipedia. It falls down when content is substantially customized for each person, like the Facebook frontpage or a flight search result.</p>
</li>
<li>
<p>Event echoing: events aren’t based on any internal logic of the program, they are something initiated by the user. So if the user can do something, a remote user can do something. Form fields are the best example of this, as there’s a clear protocol for doing form changes (change the value, fire a <code>change</code> event).</p>
</li>
</ul>
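<p>The form-field protocol mentioned in the list above (change the value, fire a <code>change</code> event) can be sketched in a few lines. This is a Python model rather than browser code: <code>FormField</code> and <code>echo_to</code> are hypothetical stand-ins for a <span class="caps">DOM</span> input and for whatever message channel connects the peers, and it is not how TogetherJS itself is implemented.</p>

```python
from typing import Callable, List


class FormField:
    """Stand-in for a DOM input: a value plus `change` listeners."""

    def __init__(self, value: str = "") -> None:
        self.value = value
        self._listeners: List[Callable[[str], None]] = []

    def on_change(self, listener: Callable[[str], None]) -> None:
        self._listeners.append(listener)

    def set_value(self, value: str) -> None:
        # The protocol: change the value, then fire the change event.
        self.value = value
        for listener in self._listeners:
            listener(value)


def echo_to(remote: FormField) -> Callable[[str], None]:
    """Build a listener that echoes a local change onto a peer's field."""

    def listener(value: str) -> None:
        # Guard against echoing forever: only apply real differences.
        if remote.value != value:
            remote.set_value(value)

    return listener


mine, yours = FormField(), FormField()
mine.on_change(echo_to(yours))
yours.on_change(echo_to(mine))
mine.set_value("hello")  # yours.value is now "hello" as well
```

<p>Because the remote side goes through the same <code>set_value</code> path a local user would, any application code listening for changes reacts the same way on both peers.</p>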
<p>But we <em>don’t</em> have:</p>
<ul>
<li>
<p>Consistent event <em>results</em>: events aren’t state changes, and transferring events about doesn’t necessarily lead to a consistent experience. Consider the modest toggle control, where a click on the toggler element shows or hides some other element. If our hidden states are out of sync (e.g., my toggleable element is hidden, yours is shown), sending the click event between the clients <em>keeps</em> them consistently and perfectly out of sync.</p>
</li>
<li>
<p>Consistent underlying object models. In a single-page app of some sort, or whatever fancy Javascript-driven webapp, a lot of what we see is based on Javascript state and models that are not necessarily consistent across peers. This is in contrast to old-school server-side apps, where there’s a good chance the <span class="caps">URL</span> contains enough information to keep everything consistent, and ultimately the “state” is held on a single server or database that both peers are connecting to. But we can’t sync the client’s object models, as they are not built to support arbitrary modification from the outside. Apps that use a real-time database work well.</p>
</li>
</ul>
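<p>The toggle problem from the list above is easy to reproduce in a toy model. In this Python sketch the <code>Toggle</code> class and the two helper functions are hypothetical; they only illustrate the difference between echoing an event and sending the resulting state.</p>

```python
class Toggle:
    """Stand-in for a show/hide control; a click flips the hidden flag."""

    def __init__(self, hidden: bool) -> None:
        self.hidden = hidden

    def click(self) -> None:
        self.hidden = not self.hidden


def echo_click(local: Toggle, remote: Toggle) -> None:
    # Event echoing: replay the same click on both peers.
    local.click()
    remote.click()


def sync_state(local: Toggle, remote: Toggle) -> None:
    # State syncing: click locally, then send the resulting state.
    local.click()
    remote.hidden = local.hidden


# Start out of sync: mine is hidden, yours is shown.
mine, yours = Toggle(hidden=True), Toggle(hidden=False)

echo_click(mine, yours)
still_out_of_sync = mine.hidden != yours.hidden

sync_state(mine, yours)
now_in_sync = mine.hidden == yours.hidden
```

<p>Echoing the click flips both sides and preserves the mismatch; only the version that transfers state converges. The catch is that transferring state requires knowing what the state <em>is</em>, which is application knowledge a generic tool doesn’t have.</p>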
<p>To make this work the application usually has to support peer-to-peer collaboration to some degree. A <a href="http://www.ianbicking.org/blog/2013/10/togetherjs-a-postmodern-tool.html">messy approach</a> can help, but can never be enough, not complete enough, not robust enough.</p>
<p>So peer-to-peer collaboration offers potentially more powerful and flexible kinds of collaboration, but only with work on the part of each application. We can try to make it as <a href="https://hacks.mozilla.org/2013/10/introducing-togetherjs/">easy as possible</a>, and maybe integrate with tools or libraries that support the kinds of higher-level synchronization we would want, but it’s never reliably easy.</p>
<h2>Synchronized vs. Coordinated Experiences</h2>
<p>Another question: what kind of experiences do we <em>want</em> to create?</p>
<p>The most obvious real-time experience is: everyone sees the same thing. Everything is fully synchronized. In the screensharing model this is what you always get and what you <em>have</em> to get.</p>
<p>The obvious experience is probably a good starting point, but shouldn’t be the end of our thinking.</p>
<p>The trivial example here is the cursor point. We can both be editing content and viewing each other’s edits (close to full sync), but we don’t have to be at exactly the same place. (This is something traditional screensharing has a hard time with, as you are sharing a <em>screen of pixels</em> instead of a <span class="caps">DOM</span>.)</p>
<p>But other more subtle examples exist. Maybe only one person has the permission to save a change. A collaboration-aware application might allow both people to edit, while still only allowing one person to save. (Currently editors will usually be denied to people who don’t have permission to save.)</p>
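<p>As a rough illustration of separating “edit” from “save”, here is a small Python sketch. The <code>SharedDocument</code> class and its fields are hypothetical; a real application would hang this on its existing permission system.</p>

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class SharedDocument:
    """Hypothetical document shared within one collaboration session."""

    text: str = ""        # working copy that everyone in the session sees
    saved_text: str = ""  # what has actually been committed
    can_save: Set[str] = field(default_factory=set)

    def edit(self, user: str, new_text: str) -> None:
        # Anyone in the session may change the working copy.
        self.text = new_text

    def save(self, user: str) -> bool:
        # Only users with save permission can commit the working copy.
        if user not in self.can_save:
            return False
        self.saved_text = self.text
        return True


doc = SharedDocument(can_save={"alice"})
doc.edit("bob", "fixed the typo")
bob_saved = doc.save("bob")      # False: bob can edit but not save
alice_saved = doc.save("alice")  # True: alice commits bob's edit
```

<p>The person without permission still contributes something real; the person with permission remains the one accountable for the save.</p>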
<p>I think there’s fruit in playing with the timing of actions. We don’t have to replay remote actions exactly how they occurred. For example, in a Demonstration context we might detect that when the driver clicks a link the page will change. To the person doing the click the order of events is: find the link, focus attention on the link, move cursor to the link, click. To the viewer the order of events is: cursor moves, maybe a short click indicator, and <em>boom</em> you are at a new page. There’s much less context given to the viewer. But we don’t have to display those events with the original timing. For instance, we could let the mouse hover over its target for a more extended amount of time on the viewer’s side.</p>
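<p>One way to play with replay timing is to rewrite the timestamps before showing events to the viewer. The Python sketch below is only an illustration: the <code>Event</code> record, the <code>stretch_hover</code> function and the one-second default are all assumptions, and “hover” is approximated as the gap between a click and the event just before it.</p>

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    kind: str   # e.g. "mousemove" or "click"
    at_ms: int  # milliseconds since the recording started


def stretch_hover(events: List[Event], min_hover_ms: int = 1000) -> List[Event]:
    """Delay each click so the cursor rests at least min_hover_ms before it."""
    replayed: List[Event] = []
    shift = 0  # total delay added so far; later events move by the same amount
    for i, event in enumerate(events):
        if event.kind == "click" and i > 0:
            hover = event.at_ms - events[i - 1].at_ms
            if hover < min_hover_ms:
                shift += min_hover_ms - hover
        replayed.append(Event(event.kind, event.at_ms + shift))
    return replayed


# The driver moved to the link and clicked 50ms later.
recorded = [Event("mousemove", 0), Event("mousemove", 400), Event("click", 450)]
replayed = stretch_hover(recorded)
```

<p>In the replay the click lands a full second after the cursor arrives, giving the viewer time to notice where the cursor is before the page changes.</p>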
<p>High-level (application-specific) representation of actions could be available. Instead of trying to express what the other person is doing through every click and scroll and twiddling of a form, you might just say “Bob created a new calendar event”.</p>
<p>In the context of something like a bug tracker, you might not want to synchronize the comment field. Instead you might want to show individual fields for all participants on a page/bug. Then I can see the other person’s in-progress comment, even add to it, but I can also compose my own comment as myself.</p>
<p>This is where the peer-to-peer model has advantages, as it will (by necessity) keep the application in the loop. It does not demand that collaboration take one form, but it gives the application an environment in which to build a domain-specific form of collaboration.</p>
<p>We can imagine moving from screenshare to peer-to-peer through a series of enhancements. The first might be: let applications opt in to peer-to-peer collaboration, or implement a kind of transparent-to-the-application screensharing, and from there tweak. Maybe you indicate some scripts should run on the viewer’s side, and some compound <span class="caps">UI</span> components can be manipulated. I can imagine a component system like <a href="http://mozilla.github.io/brick/">Brick</a> where you could identify safe ways to run rich components, avoiding latency.</p>
<h2>How do you package all this?</h2>
<p>Given tools and interactions, what is the actual context for collaboration?</p>
<p>TogetherJS has a model of a persistent session, and you invite people to that session. Only for technical reasons the session is bound to a specific domain, but not a specific page.</p>
<p>In Hotdish we’ve used a group approach: you join a group, and your work clearly happens in the group context or not.</p>
<p>One of the interesting things I’ve noticed when getting feedback about TogetherJS is that people are most interested in controlling and adding to how the sessions are set up. While, as an implementor, I find myself drawn to the tooling and specific experiences of collaboration, there’s just as much value in allowing new and interesting groupings of people. Ways to introduce people, ways to start and end collaboration, ways to connect to people by role instead of identity, and so on.</p>
<p>Should this collaboration be a conversation or an environment? When it is a <em>conversation</em> you lead off with the introduction, the “hello”, the “so why did you call?” and finish with “talk to you later”. When it is an <em>environment</em> you enter the environment and any coparticipants are just there; you don’t preestablish any specific reason to collaborate.</p>
<h2>And in conclusion…</h2>
<p>I’m still developing these ideas. And for each idea the real test is if we can create a useful experience. For instance, I’m pretty sure there’s some ambient information we want to show, but I haven’t figured out what.</p>
<p>Experience has shown that simple history (as in an activity stream) seems too noisy. And is history shown by group or person?</p>
<p>In the past I unintentionally exposed all tab focus and unfocus in TogetherJS, and it felt weird to both expose my own distracted state and my collaborator’s distraction. But part of why it was weird was that in some cases it was simply distraction, but in other cases it was useful multitasking (like researching a question in another tab). Was tab focus too much information or too little?</p>
<p>I am still in the process of figuring out how and where I can explore these questions, build the next thing, and the next thing after that. The tooling I envision doesn’t feel impossibly far away, but there is still more than one iteration of work to be done, maybe many more; I can only see to the next peak.</p>
<p>Who else is thinking about these things? And thinking about how to <strong>build</strong> these things? If you are, or you know someone who is, please <a href="mailto:ian@ianbicking.org">get in contact</a> — I’m eager to talk specifics with people who have been thinking about it too, but I’m not sure how to find these people.</p>A Product Journal: Conception2015-01-15T00:00:00-06:002015-01-15T00:00:00-06:00Ian Bickingtag:ianbicking.org,2015-01-15:/blog/2015/01/product-journal-conception.html<blockquote>
<p>I’m going to try to journal the process of a new product that I’m developing in <a href="https://blog.mozilla.org/services/">Mozilla Cloud Services</a></p>
</blockquote>
<p>When <a href="http://www.ianbicking.org/blog/2014/09/professional-transitions.html">Labs closed and I entered management</a> I decided not to do any programming for a while. I had a lot to learn about management, and that’s what I needed to focus on. Whether I learned what I need to I don’t know, but I have been getting <a href="http://www.ianbicking.org/blog/2015/01/being-a-manager-is-lonely.html">a bit tired</a>.</p>
<p>We went through a fairly extensive planning process towards the end of 2014. I thought it was a good process. We didn’t end up where we started, which is a good sign – often planning processes are just documenting the conventional wisdom and status quo of a group or project, but in a critically engaged process you are open to considering and reconsidering your goals and commitments.</p>
<p>Mozilla is undergoing some stress right now. We <a href="https://blog.mozilla.org/press/2014/11/yahoo-and-mozilla-form-strategic-partnership/">have a new search deal</a>, which is good, but we’ve been seeing <a href="http://www.forbes.com/sites/antonyleather/2014/08/04/google-chrome-browser-market-share-tops-20-leaves-firefox-in-its-dust/">declining market share</a>, which is bad. And then when you consider that desktop browsers are themselves a decreasing share of the market it looks worse.</p>
<p>The first planning around this has been to decrease attrition among our existing users. Longer term much of the focus has been on increasing the quality of our product. A noble goal of course, but does it lead to growth? I suspect it can only address attrition: the people who could use Firefox but don’t won’t have an opportunity to see what we are making. If you have other growth techniques then focusing on attrition can be sufficient. Chrome for instance does significant advertising and has deals to side-load Chrome onto people’s computers. Mozilla doesn’t have the same resources for that kind of growth.</p>
<p>When we finished up the planning process I realized, <em>damn</em>, all our plans were about product quality. And I liked our plan! But something was missing.</p>
<p>This perplexed me for a while, but I didn’t really know what to make of it. Talking with a friend about it he asked <em>then what do you want to make?</em> – a seemingly obvious question that no one had asked me, and somehow hearing the question coming at me was important.</p>
<p>Talking through ideas, I reluctantly kept coming back to sharing. It’s the most incredibly obvious growth-oriented product area, since every use of a product is a way to implore non-users to switch. But sharing is so competitive. When I first started with Mozilla we would obsess over the problem of Facebook and Twitter and silos, and then think about it until we threw our hands up in despair.</p>
<p>But I’ve had this trick up my sleeve that I pull out for one project after another because I think it’s a really good trick: make a static copy of the live <span class="caps">DOM</span>. Mostly you just iterate over the elements, get rid of scripts and stuff, do a few other clever things, use <code>&lt;base href&gt;</code> and you are done! It’s like a screenshot, but it’s also still a webpage. I’ve been trying to do something with this <a href="http://pythonhosted.org/Deliverance/">for a long time</a>. This time let’s use it for sharing…?</p>
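<p>A minimal sketch of that trick, with loud caveats: this is not the product’s code. It works on serialized <span class="caps">HTML</span> in Python rather than on the live <span class="caps">DOM</span> in the browser, it only covers the steps named above (drop scripts and inline <code>on*</code> handlers, add a <code>&lt;base href&gt;</code>), and the names <code>Freezer</code> and <code>freeze</code> are made up for illustration. It also assumes the page has a <code>&lt;head&gt;</code> to put the base tag in, and it silently drops comments.</p>

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class Freezer(HTMLParser):
    """Copy markup, dropping scripts, on* attributes, and any existing <base>."""

    def __init__(self, base_url):
        # Keep entities like &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.base_url = base_url
        self.out = []
        self.in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            return
        if tag == "base":
            return  # replaced by our own <base href> below
        # Crude filter: drops every attribute whose name starts with "on".
        kept = [(k, v) for k, v in attrs if not k.startswith("on")]
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in kept
        )
        self.out.append(f"<{tag}{rendered}>")
        if tag == "head":
            # Relative links keep resolving against the original location.
            self.out.append(f'<base href="{escape(self.base_url, quote=True)}">')

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
        elif tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_script:
            self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def freeze(html_text, base_url):
    """Return a script-free copy of html_text that resolves links via base_url."""
    parser = Freezer(base_url)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

<p>Calling <code>freeze(page_source, page_url)</code> gives back markup that can be stored and served from somewhere else. The “few other clever things” (stylesheets, forms, canvas content, iframes) are exactly what this sketch leaves out.</p>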
<p>So, the first attempt at a concept: freeze the page as though it’s a fancy screenshot, upload it somewhere with a <span class="caps">URL</span>, maybe add some fun features because now it’s disassociated from its original location. The resulting page won’t 404, you can save personalized or dynamic content, we could add highlighting or other features.</p>
<p>The big difference with past ideas I’ve encountered is that here we’re not trying to compete with <em>how</em> anyone shares things, this is a tool to improve <em>what</em> you share. That’s compatible with Facebook and Twitter and <span class="caps">SMS</span> and <em>anything</em>.</p>
<p>If you think pulling a technology out of your back pocket and building a product around it is like putting the cart in front of the horse, well maybe… but you have to start somewhere.</p>
<p>[The next post in the series is <a href="http://www.ianbicking.org/blog/2015/01/product-journal-tech-demo.html">The Tech Demo</a>]</p>A Product Journal: The Technology Demo2015-01-22T00:00:00-06:002015-01-22T00:00:00-06:00Ian Bickingtag:ianbicking.org,2015-01-22:/blog/2015/01/product-journal-tech-demo.html<blockquote>
<p>I’m going to try to journal the process of a new product that I’m developing in <a href="https://blog.mozilla.org/services/">Mozilla Cloud Services</a>. My previous and first post was <a href="http://www.ianbicking.org/blog/2015/01/product-journal-conception.html"><em>Conception</em></a>.</p>
</blockquote>
<p>As I <a href="http://www.ianbicking.org/blog/2015/01/product-journal-conception.html">finished my last post</a> I had a product idea built around a strategy (growth through social tools and sharing) and a technology (freezing or copying the <a href="https://en.wikipedia.org/wiki/Document_Object_Model">markup</a>). But that’s not a concise product definition centered around user value. It’s not even <em>trying</em>. The result is a technology demo, not a product.</p>
<p>In my defense I’m searching for some product, I don’t know what it is, and I don’t know if it exists. I have to push this past a technology demo, but if I have to start with a technology demo then so it goes.</p>
<p>I’ve found a couple specific experiences that help me adapt the product:</p>
<ul>
<li>
<p>I demo the product and I sense an excitement for something I didn’t expect. For example, a view that I thought was just a logical necessity might be what most appeals to someone else. To do this I have to show the tool to people, and it has to include things that <em>I</em> think are somewhat superfluous. And I have to be actively reading the person viewing the demo to sense their excitement.</p>
</li>
<li>
<p>Remind myself continuously of the strategy. It also helps when I remind other people, even if they don’t need reminding – it centers the discussion and my thinking around the goal. In this case there’s a lot of personal productivity use cases for the technology, and it’s easy to drift in that direction. It’s easy because <em>the technology</em> facilitates those use cases. And while it’s cool to make something widely useful, that won’t make this tool work the way I want as a product, or work for Mozilla. (And because I plan to build this on Mozilla’s dime it better work for Mozilla! But that’s a discussion for another post.)</p>
</li>
<li>
<p>I’ll poorly paraphrase something I’m sure someone can source in the comments: <em>a product that people love is one that makes those people feel great about themselves</em>. In this case, makes them feel like a journalist and not just a crank, or makes them feel like they are successfully posing as a professional, or makes them feel like what they are doing is appreciated by other people, or makes them feel like an efficient organizer. In the product design you can exalt the product, try to impress people, try to attract compliments on your own prowess, but love comes when a person is impressed with themselves when they use your product. This advice helps keep me from valuing cleverness.</p>
</li>
</ul>
<p>A common way to pull people out of technology-focused thinking is to ask “what problem does this solve?” While I appreciate this question more than I used to, it still makes me bristle. Why must everything be focused on <em>problems</em>? Why not opportunities! Why? An answer: problems are cases where a person has already articulated a tension and an openness to resolution. You have a customer in waiting. But must we confine ourselves to the partially formed conventional wisdom that makes something a “problem”? (One fair answer to this question is: yes. I remain open to other answers.) Maybe a more positive alternative to “what problem does this solve?” is “what does this let people do that they couldn’t do before?”</p>
<p>What I’m certain of is that you should constantly remember the people using your tool will care most about <em>their</em> interests, goals, and perspective; and will not care much about the interests, goals, or perspective of the tool maker.</p>
<p>So what should this tool do? If not technology, what defines it? A pithy byline might be <em>share better</em>. I don’t like pithy, but maybe a whole bag of pithy:</p>
<ul>
<li>Improving on the <span class="caps">URL</span></li>
<li>Own what you share</li>
<li>Share content, not pointers</li>
<li>Share what you see, anything you see</li>
<li>Every share is a message, make it your message<br> Dammit, why do I feel compelled to noun “share”?</li>
<li>Share the context, the journey, not just the web destination</li>
<li>Own your perspective, don’t give it over to site owners</li>
<li>Know how and when people see what you share</li>
<li>Build better content, even if the publisher doesn’t</li>
<li>Trade in content, not promises for content</li>
<li>Copy/enhance/share</li>
</ul>
<p>No… quantity doesn’t equal quality I suppose. Another attempt:</p>
<p>When you share, you are a publisher. Your medium is the <span class="caps">IM</span> text input, or the Facebook status update, or the email composition window. It seems casual, it seems pithy, but that individual publishing is what the web is built on. I respect everyone as a publisher, every medium as worthy of improvement, and this project will respect your efforts. We will try to make a tool that can make every instance just a little bit better, simple when all you need is simple, polished if you want. We will defer your decisions because you should decide in context, not make decisions in the order that makes our work easier; we will be transparent to you, your audience, and your source; respect for the reader is part of our brand promise, and that adds to the quality of your shares; we believe content is a message, a relationship between you and your audience, and there is no universally appropriate representation; we believe there is order and structure in information, but only when that information is put to use; we believe our beliefs are always provisional and tomorrow it is our prerogative to rebelieve whatever we want most.</p>
<p>Who is <em>we</em>? Just me. A pretentiously royal <em>we</em>. It can’t stay that way for long though. More on that soon…</p>
<p>[The next post in this series is <a href="http://www.ianbicking.org/blog/2015/01/product-journal-mvp.html">To <span class="caps">MVP</span> Or Not To <span class="caps">MVP</span></a>]</p>A Product Journal: To MVP Or Not To MVP2015-01-27T00:00:00-06:002015-01-27T00:00:00-06:00Ian Bickingtag:ianbicking.org,2015-01-27:/blog/2015/01/product-journal-mvp.html<blockquote>
<p>I’m going to try to journal the process of a new product that I’m developing in <a href="https://blog.mozilla.org/services/">Mozilla Cloud Services</a>. My previous post was <a href="http://www.ianbicking.org/blog/2015/01/product-journal-tech-demo.html"><em>The Tech Demo</em></a>, and the first in the series is <a href="http://www.ianbicking.org/blog/2015/01/product-journal-conception.html"><em>Conception</em></a>.</p>
</blockquote>
<h2>The Minimal Viable Product</h2>
<p>The Minimal Viable Product is a popular product development approach at Mozilla, and judging from Hacker News it is popular everywhere (but that is a wildly inaccurate way to judge common practice).</p>
<p>The idea is that you build the smallest thing that could be useful, and you ship it. The idea isn’t to make a great product, but to make <em>something</em> so you can learn in the field. A couple definitions:</p>
<blockquote>
<p>The Minimum Viable Product (<span class="caps">MVP</span>) is a key lean startup concept popularized by Eric Ries. The basic idea is to <strong>maximize validated learning for the least amount of effort</strong>. After all, why waste effort building out a product without first testing if it’s worth it.</p>
</blockquote>
<p>– from <a href="http://practicetrumpstheory.com/how-i-built-my-minimum-viable-product/">How I built my Minimum Viable Product</a> (emphasis in original)</p>
<p>I like this phrase “validated learning.” Another definition:</p>
<blockquote>
<p>A core component of Lean Startup methodology is the build-measure-learn feedback loop. The first step is figuring out the problem that needs to be solved and then developing a minimum viable product (<span class="caps">MVP</span>) to begin the process of learning as quickly as possible. <strong>Once the <span class="caps">MVP</span> is established, a startup can work on tuning the engine.</strong> This will involve measurement and learning and must include actionable metrics that can demonstrate cause and effect question.</p>
</blockquote>
<p>– <a href="http://theleanstartup.com/principles">Lean Startup Methodology</a> (emphasis added)</p>
<p>I don’t like this model at all: “once the <span class="caps">MVP</span> is established, a startup can work on <strong>tuning the engine</strong>.” You <em>tune</em> something that works the way you want it to, but isn’t powerful or efficient or fast enough. You’ve established almost nothing when you’ve created an <span class="caps">MVP</span>, no aspect of the product is validated, it would be premature to tune. But I see this antipattern happen frequently: get an <span class="caps">MVP</span> out quickly, often shutting down critically engaged deliberation in order to Just Get It Shipped, then use that product as the model for further incremental improvements. Just Get It Shipped is okay, incrementally improving products is okay, but together they are boring and uncreative.</p>
<p>There’s another broad discussion to be had another time about how to enable positive and constructive critical engagement around a project. It’s not easy, but that’s where learning happens, and the <strong>purpose of the <span class="caps">MVP</span> is to learn, not to produce</strong>. In contrast I find myself impressed by the sheer willfulness of the <a href="http://www.gamasutra.com/view/feature/131815/the_cabal_valves_design_process_.php">Half-Life development process</a> which apparently involved months of six-hour design meetings, four days a week, producing large and detailed design documents. Maybe I’m impressed because it sounds <em>so exhausting</em>, a feat of endurance. And perhaps it implies that waterfall can work if you invest in it properly.</p>
<h2>Plan plan plan</h2>
<p>I have a certain respect for this development pattern that Dijkstra describes:</p>
<blockquote>
<p><strong>Q:</strong> In practice it often appears that pressures of production reward clever programming over good programming: how are we progressing in making the case that good programming is also cost effective?</p>
<p><strong>A:</strong> Well, it has been said over and over again that the tremendous cost of programming is caused by the fact that it is done by cheap labor, which makes it very expensive, and secondly that people rush into coding. One of the things people learn in colleges nowadays is to think first; that makes the development more cost effective. I know of at least one software house in France, and there may be more because this story is already a number of years old, where it is a firm rule of the house, that for whatever software they are committed to deliver, coding is not allowed to start before seventy percent of the scheduled time has elapsed. So if after nine months a project team reports to their boss that they want to start coding, he will ask: “Are you sure there is nothing else to do?” If they say yes, they will be told that the product will ship in three months. That company is highly successful.</p>
</blockquote>
<p>– from <a href="https://www.cs.utexas.edu/users/EWD/misc/vanVlissingenInterview.html">Interview Prof. Dr. Edsger W. Dijkstra, Austin, 04–03–1985</a></p>
<p>Or, a warning <a href="http://www.linfo.org/q_programming.html">from a page full of these kinds of quotes</a>: “Weeks of programming can save you hours of planning.” The planning process Dijkstra describes is intriguing. It says something like: if you spend two weeks making a plan for how you’ll complete a project in two weeks then it is an appropriate investment to spend another week of planning to save half a week of programming. Or, if you spend a month planning for a month of programming, then you haven’t invested enough in planning to justify that programming work – to ensure the quality, to plan the order of approach, to understand the pieces that fit together, to ensure the foundation is correct, ensure the staffing is appropriate, and so on.</p>
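<p>A rough sketch of that arithmetic, assuming the seventy percent rule is read as “planning must make up at least 70% of the total schedule”. That reading is an interpretation, not something Dijkstra spells out, and the function names are made up for illustration.</p>

```python
def planning_share(planning_weeks, coding_weeks):
    """Fraction of the whole schedule that elapses before coding starts."""
    return planning_weeks / (planning_weeks + coding_weeks)


def min_planning_weeks(coding_weeks, share=0.7):
    """Planning time needed so that planning is at least `share` of the total."""
    return share / (1 - share) * coding_weeks


# Two weeks of planning for two weeks of programming: half the schedule.
even_split = planning_share(2, 2)

# One more week of planning that saves half a week of programming:
# about two thirds of the schedule, closer to the rule but still under it.
extra_week = planning_share(3, 1.5)

# A month of programming would need a bit over two months of planning.
needed_for_one_month = min_planning_weeks(1)
```

<p>Under this reading a month of planning for a month of programming sits at 50%, well short of the threshold, which is the sense in which it is “not enough” planning.</p>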
<p>I believe “Waterfall Design” gets much of its negative connotation from a lack of good design. A Waterfall process requires the design to be <em>very very good</em>. With Waterfall the design is too important to leave it to the experts, to let the architect arrange technical components, the program manager to arrange schedules, the database architect to design the storage, and so on. It’s anti-collaborative, disengaged. It relies on intuition and common sense, and those are not powerful enough. I’ll quote Dijkstra again:</p>
<blockquote>
<p>The usual way in which we plan today for tomorrow is in yesterday’s vocabulary. We do so, because we try to get away with the concepts we are familiar with and that have acquired their meanings in our past experience. Of course, the words and the concepts don’t quite fit because our future differs from our past, but then we stretch them a little bit. Linguists are quite familiar with the phenomenon that the meanings of words evolve over time, but also know that this is a slow and gradual process.</p>
<p>It is the most common way of trying to cope with novelty: by means of metaphors and analogies we try to link the new to the old, the novel to the familiar. Under sufficiently slow and gradual change, it works reasonably well; in the case of a sharp discontinuity, however, the method breaks down: though we may glorify it with the name “common sense”, our past experience is no longer relevant, the analogies become too shallow, and the metaphors become more misleading than illuminating. This is the situation that is characteristic for the “radical” novelty.</p>
<p>Coping with radical novelty requires an orthogonal method. One must consider one’s own past, the experiences collected, and the habits formed in it as an unfortunate accident of history, and one has to approach the radical novelty with a blank mind, consciously refusing to try to link it with what is already familiar, because the familiar is hopelessly inadequate. One has, with initially a kind of split personality, to come to grips with a radical novelty as a dissociated topic in its own right. Coming to grips with a radical novelty amounts to creating and learning a new foreign language that can not be translated into one’s mother tongue. (Any one who has learned quantum mechanics knows what I am talking about.) Needless to say, adjusting to radical novelties is not a very popular activity, for it requires hard work. For the same reason, the radical novelties themselves are unwelcome.</p>
</blockquote>
<p>– from <a href="https://www.cs.utexas.edu/~EWD/transcriptions/EWD10xx/EWD1036.html"><span class="caps">EWD</span> 1036, On the cruelty of really teaching computing science</a></p>
<h2>Research</h2>
<p>All this praise of planning implies you know what you are trying to make. Unlikely!</p>
<p>Coding can be a form of planning. You can’t research how interactions feel without having an actual interaction to look at. You can’t figure out how feasible some techniques are without trying them. Planning without collaborative creativity is dull, planning without research is just documenting someone’s intuition.</p>
<p>The danger is that when you are planning with code, it <em>feels</em> like execution. You can <a href="http://c2.com/cgi/wiki?PlanToThrowOneAway">plan to throw one away</a> to put yourself in the right state of mind, but I think it is better to simply be clear and transparent about <em>why</em> you are writing the code you are writing. Transparent because the danger isn’t just that <em>you</em> confuse your coding with execution, but that anyone else is likely to confuse the two as well.</p>
<p>So code up a storm to learn, code up something usable so people will use it and then you can learn from that too.</p>
<h2>My own conclusion…</h2>
<p>I’m not making an <span class="caps">MVP</span>. I’m not going to make a maximum viable product either – rather, the next step in the project is not to make a viable product. The next stage is research and learning. Code is going to be part of that. Dogfooding will be part of it too, because I believe that’s important for learning. I fear thinking in terms of “<span class="caps">MVP</span>” would let us lose sight of the <em>why</em> behind this iteration – it is a dangerous abstraction during a period of product definition.</p>
<p>Also, if you’ve gotten this far, you’ll see I’m not creating minimal viable blog posts. Sorry about that.</p>A Product Journal: Building for a Demo2015-02-18T00:00:00-06:002015-02-18T00:00:00-06:00Ian Bickingtag:ianbicking.org,2015-02-18:/blog/2015/02/product-journal-building-a-demo.html<p>I’ve been trying to work through a post on technology choices, as I had it in my mind that we should rewrite substantial portions of the product. We’ve just upped the team size to two, adding <a href="http://donovanpreston.blogspot.com/">Donovan Preston</a>, and it’s an opportunity to share in some of these decisions. And get rid of code that was desperately expedient. The server is only <a href="https://github.com/mozilla-services/pageshot/blob/f3df30ccaf64b75426e87325addc6fac373ba220/appengine/pageshotpages/main.py">400ish lines</a>, with some significant copy-and-paste, so we’re not losing any big investment.</p>
<p>Now I wonder if part of the <a href="http://www.joelonsoftware.com/articles/fog0000000069.html">danger of a rewrite</a> isn’t the effort, but that it’s an excuse to go heads-down and starve your situational awareness.</p>
<p>In other news there has been a <a href="http://blog.johnath.com/2015/02/17/home-for-a-rest/">major resignation</a> at Mozilla. I’d read into it largely what Johnathan implies in his post: things seem to be on a good track, so he’s comfortable leaving. But the <span class="caps">VP</span> of Firefox can’t leave without some significant organizational impact. Now is an important time for me to be situationally aware, and for the product itself to show situational awareness. The technical underpinnings aren’t that relevant at this moment.</p>
<p>So instead, if only for a few days, I want to move back into expedient demoable product mode. Now is the time to explain the product to other people in Mozilla.</p>
<p>The choices this implies feel weird at times. What is most important? Security bugs? Hardly! It needs to demonstrate some things to different stakeholders:</p>
<ol>
<li>
<p>There are some technical parts that require demonstration. Can we freeze the <span class="caps">DOM</span> and produce something usable? Only an existence proof is really convincing. Can we do a login system? Of course! So I build out the <span class="caps">DOM</span> freezing and fix bugs in it, but I’m preparing to build a login system where you type in your email address. I’m sure you wouldn’t lie so we’ll just believe you are who you say you are.</p>
</li>
<li>
<p>But I want to get to the interesting questions. Do we require a login for this system? If not, what can an anonymous user do? I don’t have an answer, but I want to engage people in the question. I think one of the best outcomes of a demo is having people think about these questions, offer up solutions and criticisms. If the demo makes everyone really impressed with how smart I am that is very self-gratifying, but it does not engage people with the product, and I want to build engagement. To ask a good question I do need to build enough of the context to clarify the question. I at least need fake logins.</p>
</li>
<li>
<p>I’ve been getting design/user experience help from <a href="http://www.brampitoyo.com/">Bram Pitoyo</a> too, and now we have a number of interesting mockups. More than we can implement in short order. I’m trying to figure out how to integrate these mockups into the demo itself, as simple as “also look at this idea we have”. We should maintain a similar style (colors, basic layout), so that someone can look at a mockup and use all the context that I’ve introduced from the live demo.</p>
</li>
<li>
<p>So far I’ve put no effort into onboarding. A person who picks up the tool may have no idea how it is supposed to be used. Or maybe they would figure it out: I haven’t even thought it through. Since <em>I</em> know how it works, and I’m doing the demo, that’s okay. My in-person narration is the onboarding experience. But even if I’m trying to explain the product internally, I should recognize I’m cutting myself off from an organic growth of interest.</p>
</li>
<li>
<p>There are other stakeholders I keep forgetting about. I need to speak to the <a href="https://www.mozilla.org/en-US/about/manifesto/">Mozilla Mission</a>. I think I have a good story to tell there, but it’s not the conventional wisdom of what it means to embody the mission. I see this as a tool of direct outward-facing individual empowerment, not the mediated power of federation, not the opting-out power of privacy, not the committee-mediated and developer-driven power of standards.</p>
</li>
<li>
<p>Another stakeholder: people who care about the Firefox brand and marketing our products. Right now the tool lacks any branding, and it would be inappropriate to deploy this as a branded product right now. But I can <em>demo</em> a branded product. There may also be room to experiment with a <a href="https://en.wikipedia.org/wiki/Call_to_action_(marketing)">call to action</a>, and to start a discussion about what that would mean. I shouldn’t be afraid to do it really badly, because that starts the conversation, and I’d rather attract the people who think deeply about these things than try to solve them myself.</p>
</li>
</ol>
<p>So I’m off now on another iteration of really scrappy coding, along with some strategic fakery.</p>A Product Journal: As A Working Manager2015-03-10T00:00:00-05:002015-03-10T00:00:00-05:00Ian Bickingtag:ianbicking.org,2015-03-10:/blog/2015/03/product-journal-as-a-working-manager.html<p>One of the bigger changes going from engineer to manager was to redefine what I meant by the question: <em>how are we going to do this?</em> As an engineer I would deconstruct that question to ask what is the software we need to build, and the technical barriers we need to remove, to achieve our goals. As a manager I would deconstruct that question to ask what is the process by which we achieve our goals.</p>
<p>When I wear my manager hat and ask “how are we going to do this?” I get a little frustrated when I get the answer “I don’t know.” But that’s unfair – there are always problems to which we do not know the answer. What makes me frustrated is when the answer comes too quickly, when someone says “I don’t know” because they are missing something they feel they need to come up with an answer. <em>I don’t know</em> because we have to write more code before we know if the idea is feasible. <em>I don’t know</em> because the decision is someone else’s, and so on.</p>
<p>You know! If the decision is someone else’s, then the answer to the question is: we are going to do this by asking that other person what they want and how they are going to make that decision. If we don’t know if the idea is feasible, then the answer to the question is: we are going to do this by exploring the feasibility of this technique, and doing another iteration of planning once we know more. “I don’t know because…” is fine because it is an answer of sorts, it lets the team make an answer in the form of a process. “I don’t know.” – ended with a period – is even okay for a moment, if you treat it as meaning “I don’t know so we are going to do this by learning.” It’s the “I don’t know, let’s move on” that I don’t like.</p>
<p>But I’m being a little unfair. It’s my job as a manager to answer at the process level. While I try very hard not to pigeonhole people, maybe I should also work harder at accepting when people establish bounds to their role. When you are trying to <em>produce</em> it can make sense to stay focused, to resist going meta. When you are working in a team, you should rely on the diverse skills of your teammates to let go of certain parts of the project. It can be okay to go heads-down. (Sometimes; and sometimes everyone on the team must lift their heads and respond to circumstance.)</p>
<p>This is a long-winded way of saying that I appreciate more of the difference in perspective of an engineer and a manager. It’s hard to hold both perspectives at once, and harder still to act on both.</p>
<p>In my new project I am returning to development, and entering into the role of <em>working manager</em>, an odd way to say that I am performing the same tasks that I am also managing. I <a href="http://www.ianbicking.org/blog/2014/09/professional-transitions.html">cut myself off from programming</a> when I started management so that I would not let myself be distracted from a new role and the considerable learning I had to do. Returning to programming, I can tell I was right to do so.</p>
<p>Moving between these two mindsets, and two very different ways of working, is challenging. In both I want to be proactive, but as a manager towards people, and as an engineer towards code. With people I’m investing my time in small chunks, trying to keep a good velocity of communication, watching for dropped balls, and the payoffs are largely indirect and deferred. With code it takes time to navigate my next task, I want to focus, I’m constantly trying to narrow that focus. And the payoff is direct and immediate: working code. This narrowed focus is a way to push forward much more reliably, so long as I know which way is forward.</p>
<p>But I’m a working manager. Is now the right time to investigate that odd log message I’m seeing, or to think about who I should talk to about product opportunities? There’s no metric to compare the priority of two tasks that are so far apart.</p>
<p>If I am going to find time to do development I am a bit worried I have two options:</p>
<ol>
<li>Keep doing programming after hours</li>
<li>Start dropping some balls as a manager</li>
</ol>
<p>I’ve been doing a little of both. To mitigate the effect of dropping balls I’ve tried my best to be transparent about it: I am not doing my best work on X, because I’m trying to do my best work on Y. It may have been effective, but I won’t really know until later; turnaround on relationship feedback takes a while.</p>
<p>An aside: I’ve been learning a bit about <a href="http://www.gv.com/lib/how-google-sets-goals-objectives-and-key-results-okrs">Objectives and Key Results</a>, a kind of quarterly performance analysis structure, and I particularly appreciate how it asks people to attempt to achieve 70% of their identified goals, not 100%. If you commit to 100% then you’ve committed yourself to a plan you made at the beginning of the quarter. You’ve erased your agency to prioritize. </p>
<p>Anyway, onward and upward, and wish me luck in letting the right balls drop.</p>A Product Journal: The Evolutionary Prototype2015-03-20T00:00:00-05:002015-03-20T00:00:00-05:00Ian Bickingtag:ianbicking.org,2015-03-20:/blog/2015/03/product-journal-evolutionary-prototype.html<blockquote>
<p>I’m blogging about the development of a new product in Mozilla, <a href="http://www.ianbicking.org/tag/product-journal.html">look here for my other posts in this series</a></p>
</blockquote>
<p>I came upon a new (for me) term recently: <a href="https://en.wikipedia.org/wiki/Software_prototyping#Evolutionary_prototyping">evolutionary prototyping</a>. This is in contrast to the <a href="https://en.wikipedia.org/wiki/Software_prototyping#Throwaway_prototyping">rapid or throwaway prototype</a>.</p>
<p>Another term for the rapid prototype: the “close-ended prototype.” The prototype with a sunset, unlike the evolutionary prototype which is expected to become the final product, even if every individual piece of work will only end up as disposable scaffolding for the final product.</p>
<blockquote>
<p>The main goal when using Evolutionary Prototyping is to build a very robust prototype in a structured manner and constantly refine it.</p>
</blockquote>
<p>The first version of the product, written primarily late at night, was definitely a throwaway prototype. All imperative jQuery <span class="caps">UI</span> and lots of copy-and-paste code. It served its purpose. I was able to extend that code reasonably well – and I played with many ideas during that initial stage – but it was unreasonable to ask anyone else to touch it, and even I hated the code when I had stepped away from it for a couple weeks. So most of the code is being rewritten for the next phase.</p>
<blockquote>
<p>To minimize risk, the developer does not implement poorly understood features. The partial system is sent to customer sites. As users work with the system, they detect opportunities for new features and give requests for these features to developers. Developers then take these enhancement requests along with their own and use sound configuration-management practices to change the software-requirements specification, update the design, recode and retest.</p>
</blockquote>
<p>Thinking about this, it’s a lot like the <a href="https://en.wikipedia.org/wiki/Minimum_viable_product">Minimum Viable Product</a> approach. Of which <a href="http://www.ianbicking.org/blog/2015/01/product-journal-mvp.html">I am skeptical</a>. And maybe I’m skeptical because I see <span class="caps">MVP</span> as reductive, encouraging the aggressive stripping down of a product, and in the process encouraging design based on conventional wisdom instead of critical engagement. When people push me in that direction I get cagey and defensive (not a great response on my part, just acknowledging it). The framing of the evolutionary prototype feels more humble to me. I don’t want to focus on the question “how can we most quickly get this into users’ hands?” but instead “what do we know we should build, so we can <a href="http://kiriakakis.net/comics/mused/a-day-at-the-park">collect</a> a fuller list of questions we want to answer?”</p>A Product Journal: What Are We Making?2015-04-21T00:00:00-05:002015-04-21T00:00:00-05:00Ian Bickingtag:ianbicking.org,2015-04-21:/blog/2015/04/product-journal-what-are-we-making.html<blockquote>
<p>I’m blogging about the development of a new product in Mozilla, <a href="http://www.ianbicking.org/tag/product-journal.html">look here for my other posts in this series</a></p>
</blockquote>
<p>I’ve managed to mostly avoid talking about what we’re making here. Perhaps it’s shyness: we (the PageShot team) don’t yet know where it’s going, or if we’ll manage to get this into Firefox.</p>
<p>We are making a tool for sharing on the web. This tool creates a new kind of <em>thing</em> to share, it’s not a communication medium of any kind. We’re calling it <strong>PageShot</strong>, similar to a screenshot but with all the power we can add to it since web pages are much more understandable than pixels. (The things it makes we call a <strong>Shot</strong>.)</p>
<p>The tool emphasizes sharing clips or highlights from pages. These can be screenshots (full or part of the screen) or text clippings. Along with those clips we keep an archival copy of the entire web page, preserving the full context of the page you were looking at and the origin of each clip. Generally we try to save as much information and context about the page as we can. We are trying to avoid <em>choices</em>, the burdensome effort to decide what you might want in the future. The more effort you put into using this tool, the more information or specificity you can add to your Shot, but we do what we can to save <em>everything</em> so you can sort it out later if you want.</p>
<p>I mentioned <a href="http://www.ianbicking.org/blog/2015/01/product-journal-conception.html">earlier</a> that I started this idea thinking about how to make use of frozen copies of the <span class="caps">DOM</span>. What we’re working on now looks much more like a screenshotting tool that happens to keep this copy of the page. This change happened in part because of <a href="https://blog.mozilla.org/ux/2015/02/save-share-revisit/">user research done at Mozilla</a> around saving and sharing, where I became aware of just how prevalent screenshots had become to many people.</p>
<figure style="float: right; background-color: rgba(240, 240, 240, 0.4); padding: 5px; width: 50%; border: 1px solid #aaa; border-radius: 3px;"><a href="/static/media/pageshot-early-screenshot.png"><img style="width: 100%; height: auto" src="/static/media/pageshot-early-screenshot.png" /></a><figcaption style="font-size: 80%; line-height: 1em; text-align: center;">The current (rough) state of the tool</figcaption></figure>
<p>It’s not hard to understand the popularity of screenshots, specifically on mobile devices. iPhone users at least have mostly figured out screenshotting, functionality that remains somewhat obscure on desktop devices (and for the life of me I can’t get my Android device to make a screenshot). Also screenshots are the one thing that works across applications – even with an application that supports sharing, you don’t really know what’s going to be shared, but you know what the screenshot will contain. You can also share screenshots with confidence: the recipient won’t have to log in or sign up, they can read it on any device they want, once it has arrived they don’t need a network connection. Screenshots are a reliable tool. A lesson I try to regularly remind myself of: availability beats fidelity.</p>
<p>In a similar vein we’ve seen the rise of the animated gif over the video (though video is resurging now that it’s <em>just a file</em> again), and the smuggling in of long texts to Twitter via images.</p>
<p>A lot of this material moves through communication mediums via links and metadata, but those links and metadata are generally under the control of site owners. It’s up to the site owner what someone sees when they click a link, it’s up to them what the metadata will suggest go into the image preview and description. PageShot gives that control to the person sharing, since each Shot is <em>your link</em>, your copy and your perspective.</p>
<p>As of this moment (April 2015) our designs are still ahead of our implementation, so there’s not a lot to try out yet, but this is what we’re putting together.</p>
<p>If you want to follow along, check out the <a href="https://github.com/mozilla-services/pageshot">repository</a>.</p>A Product Journal: As A Building Block2015-04-23T00:00:00-05:002015-04-23T00:00:00-05:00Ian Bickingtag:ianbicking.org,2015-04-23:/blog/2015/04/product-journal-building-block.html<blockquote>
<p>I’m blogging about the development of a new product in Mozilla, <a href="http://www.ianbicking.org/tag/product-journal.html">look here for my other posts in this series</a></p>
</blockquote>
<p>I teeter between thinking big about <a href="http://www.ianbicking.org/blog/2015/04/product-journal-what-are-we-making.html">PageShot</a> and thinking small. The benefit of thinking small is: how can this tool provide value to people who wouldn’t know if it would provide any value? And: how do we get it done?</p>
<p>Still I can’t help but think big too. The web gave us this incredible way to talk about how we experience the web: the <span class="caps">URL</span>. An incredible amount of stuff has been built on that, search and sharing and archiving and ways to draw people into content and let people skim. Indexes, summaries, APIs, and everyone gets to mint their own URLs and accept anyone else’s URLs, pointing to anything.</p>
<p>But not everyone gets to mint URLs. Developers and site owners get to do that. If something doesn’t have a <span class="caps">URL</span>, you can’t point to it. And every <span class="caps">URL</span> is a pointer, a kind of promise that the site owner has to deliver on, and sometimes doesn’t choose to, or they lose interest.</p>
<p>I want PageShot to give a capability to users, the ability to address anything, because PageShot captures the state of any page at a moment, not an address so someone else can try to recreate that page. The frozen page that PageShot saves is handy for things like capturing or highlighting parts of the page, which I think is the feature people will find attractive, but that’s just a subset of what you might want to do with a snapshot of web content. So I also hope it will be a building block. When you put content into PageShot, you will know it is well formed, you will know it is static and available, you can point to exact locations and recover those locations later. And all via a tool that is accessible to anyone, not just developers. I think there are neat things to be built on that. (And if you do too, I’d be interested in hearing about your thoughts.)</p>A Product Journal: As We May Discuss2015-05-08T00:00:00-05:002015-05-08T00:00:00-05:00Ian Bickingtag:ianbicking.org,2015-05-08:/blog/2015/05/product-journal-as-we-may-discuss.html<blockquote>
<p>I’m blogging about the development of a new product in Mozilla, <a href="http://www.ianbicking.org/tag/product-journal.html">look here for my other posts in this series</a></p>
</blockquote>
<p>In a presentation <a href="https://www.youtube.com/watch?v=2jTctBbX_kw">The Revolution Will Be Annotated</a>, Dan Whaley begins with a very high-minded justification for annotation: that it is essential for our existence that we act wisely, and that we can achieve that through deliberation, and that annotation is a building block for open deliberation.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/2jTctBbX_kw" frameborder="0" allowfullscreen></iframe>
<p>In response, let me digress wildly, and talk about elementary school.</p>
<p>It is common to cite large class sizes as a problem, and small class sizes as an opportunity to improve education. But there is debate about whether class size really matters; it certainly correlates with general privilege, but does it cause improvements? At the same time I’ve become much more familiar with the Montessori philosophy of education, and one of the surprising features is a relatively large ideal classroom size, in the 30s. And why? Dr. Montessori had positive theories about age mixture, community size, the culture of the classroom, and so forth – but I’ll add what I think is a Montessori-style reason it’s okay: it’s okay to have fewer teachers because learning isn’t caused by teachers. Learning is ultimately an internal process, an assimilation and construction of knowledge. Your environment can aid in that process, but the <em>cause</em> is still internal.</p>
<p>I share Dan’s enthusiasm about the importance of dialog to our collective wisdom. But I see dialog as supportive of <em>personal growth</em>, not of collective wisdom – our collective wisdom will increase as we individually grow.</p>
<p>Dan cites one problem with rationalism: we are good at constructing rational arguments to support what we already believed. The annotation remedy is to suppose that what we conveniently leave out of our arguments can be applied later by a diverse set of participants. Annotation makes it harder to make use of fallacies, harder to make use of limited narratives, because the annotator can add them back in.</p>
<p>I will cite another problem with rationalism: even a good rational argument is not good at convincing anyone of anything. A good rational argument is like teaching arithmetic by telling someone that 39301+9402=48703. Maybe even writing out the steps used to make that calculation. You can lay that in front of someone, you can lay a hundred similar examples in front of someone, and they will not learn arithmetic. If the person is disinterested they can just trust you – though then it hardly matters if you were right – but if they are interested I believe that the process of construction is necessary. You have to solve your own math problems to learn math.</p>
<p>Annotation is interesting because it gives another avenue for people to publish their beliefs and enter into dialog. But a global overlay of annotation is not a particularly appealing medium. It’s not a place to come to understanding, to practice the construction of ideas. And I think our collective wisdom depends on an incredible <em>volume</em> of discussion, not just on an increased quality, because you can’t get large scale individual growth without large scale discussion.</p>
<p>I see two features in the typical annotation system: one feature is the ability to talk <em>about</em> other things, with high fidelity. URLs have been a great start to being able to talk about things on the web, but they have some limits. The second feature is the ability to <em>view</em> these annotations implicitly. The second feature is the one I’ve seldom found interesting as a reader, and disagree with as a goal. Viewing annotations as a universal overlay of commentary asks too much of the annotations, provides too little to me as the reader, and I think is an attempt to pursue a kind of rational universal truth that I find little value in. It’s a sense that documents are there to teach us, and annotations make those documents even better teachers.</p>
<p><a href="https://github.com/mozilla-services/pageshot/">PageShot</a> takes a different approach: it gives anyone the ability to talk about anything on the web, but each time you do that you create a new resource, <em>your</em> discussion lives at <em>your</em> document. You can write your commentary for a specific audience, and then give it to that audience, without having your intended audience confused with the original author’s intended audience. You can throw away what you have to say. The person who clicks on your link did it to see what you said, they aren’t some passerby. You can say implicitly through highlighting, <em>here is what I thought was interesting</em>. But PageShot applies no universality to what you say, it is a tool only of dialog. This makes PageShot more modest, but intentionally so.</p>A Product Journal: Community Participation2015-05-13T00:00:00-05:002015-05-13T00:00:00-05:00Ian Bickingtag:ianbicking.org,2015-05-13:/blog/2015/05/product-journal-community-participation.html<blockquote>
<p>I’m blogging about the development of a new product in Mozilla, <a href="http://www.ianbicking.org/tag/product-journal.html">look here for my other posts in this series</a></p>
</blockquote>
<p>Generally at Mozilla we want to engage and activate our community to further what we do. Because all our work is open source, and we default to open on our planning, we have a lot of potential to include people in our work. But removing barriers to participation doesn’t make participation happen.</p>
<p>A couple reasons it’s particularly challenging:</p>
<ol>
<li>
<p>Volunteers and employees work at different paces. Employees can devote more time, and can have pressures to meet deadlines so that sometimes the work just needs to get done. So everything is going fast and a volunteer can have a hard time keeping up. Until the project is cancelled and then <em>wham</em>, the employees are all gone.</p>
</li>
<li>
<p>Employees become acclimated to whatever processes are asked of them, because whether they like it or not that’s the expectation that comes with their paycheck. Sometimes employees put up with stupid shit as a result. And sometimes volunteers aren’t willing to make investments in their process even when it’s the smart thing to do, ‘cause who knows how long you’ll stick around?</p>
</li>
<li>
<p>Employee work has to satisfy organizational goals. The organization can try to keep these aligned with mission goals, and keep the mission aligned with the community, but when push comes to shove the organization’s goals – including the goals that come from the executive team – are going to take priority for employees.</p>
</li>
<li>
<p>Volunteers are unlikely to be devoted to Mozilla’s success. Instead they have their own goals that may intersect with Mozilla’s. This overlap may only occur on one project. And while that’s serendipitous, limited overlap means a limit on the relationships those volunteers can build, and it’s the relationships that are most likely to retain and reward participation.</p>
</li>
</ol>
<p>I have a theory that <em>agency</em> is one of the most important attractors to open source participation. Mozilla, because of its size and because it has a corporate structure, does not offer a lot of personal agency. Though in return it does offer some potential of leverage.</p>
<p>I am not sure what to do with respect to participation in <a href="https://github.com/mozilla-services/pageshot/">PageShot</a>. If I open things up more, will anyone care? What would people care about? Maybe people would care about building a product. Maybe the <a href="http://www.ianbicking.org/blog/2015/04/product-journal-building-block.html">building blocks</a> would be more interesting. We have an <a href="https://github.com/mozilla-services/pageshot/#participation"><span class="caps">IRC</span> channel</a>, but we also meet regularly over video, which I think has been important for us to assimilate the concept and goals of the project. Are there other people who would care to show up?</p>
<p>I’m also somewhat conflicted about trying to bring people in. Where will PageShot end up? The project could be cancelled. It’s open source, sure, but is it <em>interesting</em> as open source if it’s a dead-end addon with no backing site? Our design is focused on making something broadly appealing such that it could be included in the browser – and if things go well, the addon will be part of the browser itself. If that happens (and I hope it will!) even my own agency with respect to the project will be under threat. That’s what it means to get organizational support.</p>
<p>If the project was devolved into a set of libraries, it would be easier to contribute to, and easier for volunteers to find value in their participation. Each piece could be improved on its own, and can live on even if the product that inspired the library does not continue. People who use those libraries will maintain agency, because they can remix those libraries however they want, include them in whatever product of their own conception that they have. The problem: I don’t care about the libraries! And I don’t want this to be a <a href="http://www.ianbicking.org/blog/2015/01/product-journal-tech-demo.html">technology demonstration</a>, I want it to be a product demonstration, and libraries shift the focus to the wrong part.</p>
<p>Despite these challenges, I don’t want to give up on the potential of participation. I just doubt it would look like normal open source participation. I’ve <a href="https://github.com/mozilla-services/pageshot#participation">expanded our participation section</a>, including an invitation to our standup meetings. But mostly I need to know if anyone cares, and if you do: what do you care about and what do you want from your participation?</p>A Product Journal: Objects2015-07-16T00:00:00-05:002015-07-16T00:00:00-05:00Ian Bickingtag:ianbicking.org,2015-07-16:/blog/2015/07/product-journal-objects.html<blockquote>
<p>I’m blogging about the development of a new product in Mozilla, <a href="http://www.ianbicking.org/tag/product-journal.html">look here for my other posts in this series</a></p>
</blockquote>
<p>I’ve been reading the <a href="http://worrydream.com/EarlyHistoryOfSmalltalk/">Early History Of Smalltalk</a>, notes by Alan Kay, and <a href="http://worrydream.com/EarlyHistoryOfSmalltalk/#coda">this</a> small note jumped out at me:</p>
<blockquote>
<p>Another late-binding scheme that is already necessary is to get away from direct protocol matching when a new object shows up in a system of objects. In other words, if someone sends you an object from halfway around the world it will be unusual if it conforms to your local protocols. At some point it will be easier to have it carry even more information about itself–enough so its specifications can be “understood” and its configuration into your mix done by the more subtle matching of inference.</p>
<p>[…]</p>
<p>This higher computational finesse will be needed as the next paradigm shift–that of pervasive networking–takes place over the next five years. Objects will gradually become active agents and will travel the networks in search of useful information and tools for their managers. Objects brought back into a computational environment from halfway around the world will not be able to configure themselves by direct protocol matching as do objects today. Instead, the objects will carry much more information about themselves in a form that permits inferential docking. Some of the ongoing work in specification can be turned to this task.</p>
</blockquote>
<p>An object, sent over the network; it does not exactly have a common protocol, class, or <span class="caps">API</span>, but enough information so it can be understood, matched up with some function or purpose according to inference. We could also assume given this is from Alan Kay that the vision here is that code, not just data, is part of the object and information (though to consider <em>code</em> to be <em>information</em>: that is quite a challenge to our modern sensibilities).</p>
<p>When I read this, it struck me that we have these objects all around us. The web page: remote, transferable, transformable, embodying functionality and data, with rich information suitable for inference.</p>
<p>The web page has a kind of minimal protocol, though nothing is entirely forbidden in how it is interpreted. For instance the page is named in its <code>&lt;title&gt;</code>. But probably it has a better name in its <code>&lt;meta property="og:title"&gt;</code>, should one exist; nothing is truly formal except by how it will be conventionally interpreted. The protocol is flexible. It has internal but opaque state. The object can initiate activity in a few ways, primarily XMLHttpRequest and a small number of <a href="https://developer.mozilla.org/en-US/docs/WebAPI">APIs</a> available to it. The page receives copious input in the form of events.</p>
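<p>To make the naming convention concrete, here is a minimal sketch of that kind of inference: ask a page for its name, prefer og:title when it exists, and fall back to the title element. The names <code>TitleSniffer</code> and <code>infer_page_name</code> are made up for this illustration (they are not part of PageShot or any library), and accepting both the <code>property</code> and <code>name</code> attributes is an assumption about how pages in the wild are written, not a rule from any spec.</p>

```python
from html.parser import HTMLParser


class TitleSniffer(HTMLParser):
    """Hypothetical helper: records the first title element and any og:title."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.og_title = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title" and self.title is None:
            # Only the first title counts; later ones (e.g. inside SVG) are ignored
            self._in_title = True
            self.title = ""
        elif tag == "meta":
            # Assumption: tolerate both property= and name= for Open Graph data
            key = attrs.get("property") or attrs.get("name")
            if key == "og:title" and attrs.get("content"):
                self.og_title = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def infer_page_name(html):
    """Return the conventional name of a page, or None if it offers none."""
    sniffer = TitleSniffer()
    sniffer.feed(html)
    name = sniffer.og_title or sniffer.title
    return name.strip() if name else None
```

<p>Nothing here is a protocol in the strict sense: the function just encodes one conventional reading of the page, and a different consumer could reasonably rank the two names the other way around.</p>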
<p>It’s an impoverished object in so many ways. And it’s hardly what you would call universal, it’s always representing visual pages for the browser. When programming if the browser isn’t our intended audience then we choose something like <span class="caps">JSON</span> or <span class="caps">REST</span>: one dead data, one a possessed and untransferable object (I would assert that in <span class="caps">REST</span> the object is the server and not the document).</p>
<p>And yet the web page is an incredible object! Web pages are sophisticated and well cared for. Our understanding of them is meticulously documented, <em>including</em> the ambiguity. The web stack is something that has not just been “defined” or “fixed”, but also discovered. Web pages contain gateways into a tremendous number of systems, defined around a surprisingly small set of operations.</p>
<p>But we don’t look at them as objects, we don’t try to deduce or infer much about them. They don’t look like the objects we would define were we to design such a system. But if we shift our gaze from design to discovery then the wealth becomes apparent: these might not be the objects we would ask for, but given the breadth and comprehensiveness of web pages they are the objects we should use. And they actually <em>work</em>, they do a ton of useful things.</p>
<p>Stepping back from the specific product of <a href="https://github.com/mozilla-services/pageshot">PageShot</a>, this is the broad direction that excites me: to understand and make use of these objects that are all around us. (Objects to which Mozilla, with its user agent, has unique access.) But we need to look more broadly at what we can do with these objects. PageShot tries one fairly small thing: capture the visual state at one moment, <a href="http://www.ianbicking.org/blog/2015/04/product-journal-building-block.html">maybe do something with that state</a>. If we just had a handful of these operations, exposed properly (not trapped in the depths of monolithic browsers) I think there are some incredible things to be done. Maybe even a way to bridge from the ad hoc to something more formal; as crazy as the web page execution model seems, it has some nice features, and is the widest deployed sandboxing execution model we have.</p>
<p>In this sense <a href="http://www.2ality.com/2015/06/web-assembly.html">Web Assembly</a> and <a href="http://asmjs.org/"><span class="caps">ASM</span>.js</a> are interesting as effectively competitors to JavaScript, but not competitors to the web platform or web-page-as-object. That makes them different from <a href="https://en.wikipedia.org/wiki/Google_Native_Client">Google Native Client</a>. Yes, Web Assembly is essentially another language for the web platform. But Native Client uses <a href="https://en.wikipedia.org/wiki/Google_Native_Client#Pepper">Pepper</a> which is <em>not</em> the web platform, it’s a parallel platform that attempts to mimic the web platform. <span class="caps">ASM</span>.js and Web Assembly are demonstrations that we can change significant parts of the code execution while retaining the outward <span class="caps">API</span> of these pages.</p>
<p>I find this all exciting, but I am somewhat half-hearted in my excitement. Reading The Early History Of Smalltalk there’s a certain spirit to their work that I love and often despair at recreating. There is a visionary aspect, but I think more importantly they took a holistic approach. There’s something exciting about opening your mind to far off concepts (a vision) but then trying to tie them together creatively, trying different approaches in an effort to maintain simplicity, avoid compromises. The computing systems they worked on were like <a href="http://edutechwiki.unige.ch/en/Microworld">Microworlds</a> of their own creation, they could redefine problems, throw away state, reinvent any interface they chose. And maybe that is also available to us: only when we hopelessly despair about problems we cannot fix are we trapped by our legacy. That is, if you accept the web as it is there is a freedom, an agency in that, because you’ve put aside the things you can’t change.</p>
<p>I suspect Alan Kay would take a dim view of this whole notion. He is not a fan of the web. Another observation from that history:</p>
<blockquote>
<p>Four techniques used together—persistent state, polymorphism, instantiation, and methods-as-goals for the object—account for much of the power. None of these require an “object-oriented language” to be employed—<span class="caps">ALGOL</span> 68 can almost be turned to this style—an <span class="caps">OOPL</span> merely focuses the designer’s mind in a particular fruitful direction. However, doing encapsulation right is a commitment not just to abstraction of state, but to eliminate state oriented metaphors from programming.</p>
</blockquote>
<p>I can’t even begin to phrase web pages in these terms. State is a mess: much hosted on remote servers, some in the <span class="caps">URL</span>, some in the process of the running page, some in cookies or localStorage, all of it constantly being copied and thrown away. Is the <span class="caps">URL</span> the class and the <span class="caps">HTML</span> served over <span class="caps">HTTP</span> the instantiation? These are just painful contortions to find analogs. Methods-as-goals is the one that seems most interesting and challenging, because I cannot quite identify the goals behind this whole endeavour. Automation? Insight? Detection? Creation? Is it different from what Google is doing with its spiders? Is there something distinct about interpretation in the context of a user agent? And when the objects are not willing – I am proposing we bend pages to our will, wresting control from the expectations of site owners – can you do any delegation? Is there an object waiting to be smithed that encapsulates the page?</p>
<p>More tensions than resolutions. Wish I had time to bathe in those tensions a bit longer.</p>Conway’s Corollary2015-08-27T00:00:00-05:002015-08-27T00:00:00-05:00Ian Bickingtag:ianbicking.org,2015-08-27:/blog/2015/08/conways-corollary.html<p><a href="http://www.melconway.com/Home/Conways_Law.html">Conway’s Law</a> states:</p>
<blockquote>
<p>organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations</p>
</blockquote>
<p>I’ve always read this as an accusation: we are doomed to recreate the structure of our organizations in the structure of software projects. And further: projects …</p><p><a href="http://www.melconway.com/Home/Conways_Law.html">Conway’s Law</a> states:</p>
<blockquote>
<p>organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations</p>
</blockquote>
<p>I’ve always read this as an accusation: we are doomed to recreate the structure of our organizations in the structure of software projects. And further: projects cannot become their True Selves, cannot realize the most superior design, unless the organization is itself practically structureless. That only without the constraints of structure can the engineer make the truly correct choices. Michelangelo sculpted from marble, a smooth and uniform stone, not from an aggregate, where any hit with the chisel might reveal only the chaotic structure and fault lines of the rock and not his vision.</p>
<p>But most software is built, not revealed. I’m starting to believe that Conway’s observation is a corollary, not so clearly cause-and-effect. Maybe we should work with it, not struggle against it. (With age I’ve lost the passion for pointless struggle.) It’s not that developers can’t imagine a design that goes contrary to the organizational structure, it’s that they can’t <em>ship</em> those designs. What we’re seeing is natural selection. And when through force of will such a design is shipped, whether it survives and is maintained depends on whether the organization changed in the process, whether a structure was created to support that design.</p>
<p>A second skepticism: must a particular construction and modularity of code be paramount? Code is malleable, and its modularity is for the purpose of humans. Most of what we do disappears anyway when the machine takes over – functions are inlined, types erased, the pieces become linked, and the machine doesn’t care one whit about everything we’ve done to make the software comprehensible. Modularity is to serve our purposes. And sometimes organization structure serves a purpose; we change it to meet goals, and we shouldn’t assume the people who change it are just busybodies. But those changes are often aspirational, and so those changes are setting ourselves up for conflict as the new structure probably does not mirror the software design.</p>
<blockquote>
<p>If the parts of an organization (e.g. teams, departments, or subdivisions) do not closely reflect the essential parts of the product, or if the relationship between organizations do not reflect the relationships between product parts, then the project will be in trouble… Therefore: Make sure the organization is compatible with the product architecture
– <a href="https://en.wikipedia.org/wiki/Conway%27s_law#cite_note-5">Coplien and Harrison</a></p>
</blockquote>
<p>So change the architecture! There’s more than one way to resolve these tensions.</p>
<p>A last speculation: as described in the <a href="http://c2.com/cgi/wiki?SecondSystemEffect">Second System Effect</a> we see teams rearchitect systems with excessive modularity and abstraction. Maybe because they remember all these conflicts, they remember all the times organizational structure and product motivations didn’t match architecture. The team makes an incorrect response by creating an architecture that can simultaneously embody all imagined organizational structures, a granularity that embodies not just current organizational tensions but also organizational boundaries that have come and gone. But the value is only in predicting future changes in structure, and only then if you are lucky.</p>
<p>Maybe we should look at Conway’s Law as a prescription: projects <em>should only</em> have hard boundaries where there are organizational boundaries. Soft boundaries and definitions still exist everywhere: just like we give local variables meaningful names (even though outside the function no one can tell the difference), we might also create abstractions and modularity that serve immediate and concrete purposes. But they should only be built for the moment and the task at hand. Extra effort should be applied to being <em>ready</em> to refactor in the future, not predicting and embodying those predictions in present modularity. Perhaps this is another rephrasing of Agile and <a href="http://martinfowler.com/bliki/Yagni.html"><span class="caps">YAGNI</span></a>. Code is a liability, agency over that code is an asset.</p>A Product Journal: CSS Object Model2015-12-29T00:00:00-06:002015-12-29T00:00:00-06:00Ian Bickingtag:ianbicking.org,2015-12-29:/blog/2015/12/product-journal-css-object-model.html<blockquote>
<p>I’m blogging about the development of a new product in Mozilla, <a href="http://www.ianbicking.org/tag/product-journal.html">look here for my other posts in this series</a></p>
</blockquote>
<p>And now for something entirely technical!</p>
<p>We’ve had a contributor from <a href="https://outernet.is/">Outernet</a> exploring ways of using PageShot for capturing pages for distribution on their network. Outernet satellite-based content …</p><blockquote>
<p>I’m blogging about the development of a new product in Mozilla, <a href="http://www.ianbicking.org/tag/product-journal.html">look here for my other posts in this series</a></p>
</blockquote>
<p>And now for something entirely technical!</p>
<p>We’ve had a contributor from <a href="https://outernet.is/">Outernet</a> exploring ways of using PageShot for capturing pages for distribution on their network. Outernet is a satellite-based content distribution network. It’s a neat idea, but one challenge is that it’s <em>very</em> one-way – anyone (given the equipment) can listen in to what the satellites broadcast, but that’s it (at least for the most interesting use cases). Lots of modern websites aren’t set up well for that, so acquiring content can be tricky.</p>
<p>Given that interest I started thinking more about inlining resources. We’ve been hotlinking to resources simply out of laziness. Some things are easy to handle, but <span class="caps">CSS</span> is a bit more annoying because of the indirection of <code>@import</code> and yet more relative URLs. Until I started poking around I had no idea that there is a <a href="https://developer.mozilla.org/en-US/docs/Web/API/CSS_Object_Model"><span class="caps">CSS</span> Object Model</a>!</p>
<p>Given this there is now experimental support for inlining all <span class="caps">CSS</span> rules into the captured page in PageShot. The support is still incomplete, and my understanding of everything you can do with <span class="caps">CSS</span> is still incomplete. But the code isn’t very hard. One fun thing is that we can test each <span class="caps">CSS</span> rule against the page and see if it is needed. Doing this typically allows 80% of rules to be omitted.</p>
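<p>To make the relative-URL problem concrete, here is a minimal sketch of rewriting <code>url(...)</code> references so they are absolute before a rule is inlined. The helper name <code>absolutizeCssUrls</code> and the example URLs are made up for illustration, this is not what PageShot actually does, and a regular expression like this one will miss edge cases such as unquoted URLs containing parentheses.</p>

```javascript
// Hypothetical helper (not part of PageShot): rewrite every url(...) reference
// in a chunk of CSS text so that it is absolute, resolved against the URL of
// the stylesheet it came from (e.g. styleSheet.href).
function absolutizeCssUrls(cssText, baseUrl) {
  return cssText.replace(/url\(\s*(['"]?)(.*?)\1\s*\)/g, (match, quote, ref) => {
    if (ref.startsWith("data:")) {
      // Already inline, nothing to resolve
      return match;
    }
    // The standard URL constructor resolves a relative reference against a base
    return 'url("' + new URL(ref, baseUrl).href + '")';
  });
}
```

<p>Once a rule is moved out of its original stylesheet and into the page, relative references would otherwise resolve against the page <span class="caps">URL</span> instead of the stylesheet <span class="caps">URL</span>, which is why the base has to be passed in explicitly.</p>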
<p>Some highlights of what I’ve learned so far:</p>
<p>There’s two interesting objects: <a href="https://developer.mozilla.org/en-US/docs/Web/API/CSSStylesheet">CSSStyleSheet</a> (which inherits from <a href="https://developer.mozilla.org/en-US/docs/Web/API/Stylesheet">StyleSheet</a>) and <a href="https://developer.mozilla.org/en-US/docs/Web/API/CSSRule">CSSRule</a>.</p>
<p><code>document.styleSheets</code>: a list of all stylesheets: remote (<code>&lt;link&gt;</code>), inline, and imported (<code>@import</code>).</p>
<p><code>styleSheet.href</code>: the <span class="caps">URL</span> of the stylesheet (<code>null</code> if it was inline).</p>
<p><code>styleSheet.cssRules</code>: a list of all the rules in the stylesheet. </p>
<p><code>cssRule.type</code>: there’s <a href="https://developer.mozilla.org/en-US/docs/Web/API/CSSRule#Type_constants">several types of rules</a>. I’ve chosen to ignore everything but <code>STYLE_RULE</code> and <code>MEDIA_RULE</code> out of laziness.</p>
<p><code>cssRule.cssRules</code>: <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/@media">media rules</a> (like <code>@media (max-width: 600px) {.nav {display: none}}</code>) contain sub-rules (<code>.nav {display: none}</code> in this case).</p>
<p><code>cssRule.parentRule</code>: points back to a media rule if there is one.</p>
<p><code>cssRule.parentStyleSheet</code>: points back to the parent stylesheet. There are probably ways of nesting media rules and stylesheets (that can have <code>media</code> attributes) in ways to create compound media rules that I haven’t accounted for.</p>
<p><code>cssRule.cssText</code>: the text of the rule. This includes both selectors and style, or media queries and all the sub-rules. I just split on <code>{</code> to separate the selector or query. I <em>assume</em> these are representations of the parsed <span class="caps">CSS</span>, and so normalized, but I haven’t explored that in detail.</p>
<p>There’s all sorts of ways to create trees of media restrictions and other complexities that I know I haven’t taken account of, but things Mostly Work Anyway.</p>
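<p>As a rough illustration of those trees, here is a sketch that walks up from a rule and collects every media condition that applies to it, including the <code>media</code> attribute of the stylesheet itself. The helper name <code>mediaChain</code> is invented for this example; it only relies on <code>parentRule</code>, <code>parentStyleSheet</code>, <code>type</code>, and <code>media.mediaText</code>, and it ignores other grouping rules like <code>@supports</code>.</p>

```javascript
// CSSRule.MEDIA_RULE has the numeric value 4 in the CSSOM
const MEDIA_RULE = 4;

// Hypothetical helper: list every media condition that applies to a rule,
// outermost first, by walking parentRule and then the owning stylesheet.
function mediaChain(rule) {
  const chain = [];
  for (let parent = rule.parentRule; parent; parent = parent.parentRule) {
    if (parent.type === MEDIA_RULE) {
      // Each step up is further out, so it goes at the front of the list
      chain.unshift(parent.media.mediaText);
    }
  }
  const sheet = rule.parentStyleSheet;
  if (sheet && sheet.media && sheet.media.mediaText) {
    // A media attribute on the stylesheet wraps everything inside it
    chain.unshift(sheet.media.mediaText);
  }
  return chain;
}
```

<p>A compound query could then presumably be built by joining the chain with <code>and</code>, though that is an assumption I haven’t checked against comma-separated media lists.</p>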
<p>Here’s an example that makes use of this to create a single inline stylesheet for a page containing only necessary rules (using <span class="caps">ES6</span>):</p>
<div class="highlight"><pre><span></span><span class="kd">let</span> <span class="nx">allRules</span> <span class="o">=</span> <span class="p">[];</span>
<span class="c1">// CSS rules, some of which may be media queries, form a kind of tree;</span>
<span class="c1">// this puts all the style rules in a flat list</span>
<span class="kd">function</span> <span class="nx">addRules</span><span class="p">(</span><span class="nx">sheet</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">rule</span> <span class="k">of</span> <span class="nx">sheet</span><span class="p">.</span><span class="nx">cssRules</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">rule</span><span class="p">.</span><span class="nx">type</span> <span class="o">==</span> <span class="nx">rule</span><span class="p">.</span><span class="nx">MEDIA_RULE</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">addRules</span><span class="p">(</span><span class="nx">rule</span><span class="p">);</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="nx">rule</span><span class="p">.</span><span class="nx">type</span> <span class="o">==</span> <span class="nx">rule</span><span class="p">.</span><span class="nx">STYLE_RULE</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">allRules</span><span class="p">.</span><span class="nx">push</span><span class="p">(</span><span class="nx">rule</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="c1">// Then we traverse all the stylesheets and grab rules from each:</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">styleSheet</span> <span class="k">of</span> <span class="nb">document</span><span class="p">.</span><span class="nx">styleSheets</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// MediaList has no indexOf(), so copy it into an array before searching</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">styleSheet</span><span class="p">.</span><span class="nx">media</span><span class="p">.</span><span class="nx">length</span> <span class="o">&&</span> <span class="nb">Array</span><span class="p">.</span><span class="nx">from</span><span class="p">(</span><span class="nx">styleSheet</span><span class="p">.</span><span class="nx">media</span><span class="p">).</span><span class="nx">indexOf</span><span class="p">(</span><span class="s2">"all"</span><span class="p">)</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span> <span class="o">&&</span> <span class="nb">Array</span><span class="p">.</span><span class="nx">from</span><span class="p">(</span><span class="nx">styleSheet</span><span class="p">.</span><span class="nx">media</span><span class="p">).</span><span class="nx">indexOf</span><span class="p">(</span><span class="s2">"screen"</span><span class="p">)</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// This is a stylesheet for some media besides screen</span>
<span class="k">continue</span><span class="p">;</span>
<span class="p">}</span>
<span class="nx">addRules</span><span class="p">(</span><span class="nx">styleSheet</span><span class="p">);</span>
<span class="p">}</span>
<span class="c1">// Then we collect the rules up again, clustered by media queries (with</span>
<span class="c1">// rulesByMedia[""] for no media query)</span>
<span class="kd">let</span> <span class="nx">rulesByMedia</span> <span class="o">=</span> <span class="p">{};</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">rule</span> <span class="k">of</span> <span class="nx">allRules</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">let</span> <span class="nx">selector</span> <span class="o">=</span> <span class="nx">rule</span><span class="p">.</span><span class="nx">cssText</span><span class="p">.</span><span class="nx">split</span><span class="p">(</span><span class="s2">"{"</span><span class="p">)[</span><span class="mi">0</span><span class="p">].</span><span class="nx">trim</span><span class="p">();</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span> <span class="nb">document</span><span class="p">.</span><span class="nx">querySelector</span><span class="p">(</span><span class="nx">selector</span><span class="p">))</span> <span class="p">{</span>
<span class="c1">// Skip selectors that don't match anything</span>
<span class="k">continue</span><span class="p">;</span>
<span class="p">}</span>
<span class="kd">let</span> <span class="nx">mediaType</span> <span class="o">=</span> <span class="s2">""</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">rule</span><span class="p">.</span><span class="nx">parentRule</span> <span class="o">&&</span> <span class="nx">rule</span><span class="p">.</span><span class="nx">parentRule</span><span class="p">.</span><span class="nx">type</span> <span class="o">==</span> <span class="nx">rule</span><span class="p">.</span><span class="nx">MEDIA_RULE</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">mediaType</span> <span class="o">=</span> <span class="nx">rule</span><span class="p">.</span><span class="nx">parentRule</span><span class="p">.</span><span class="nx">cssText</span><span class="p">.</span><span class="nx">split</span><span class="p">(</span><span class="s2">"{"</span><span class="p">)[</span><span class="mi">0</span><span class="p">].</span><span class="nx">trim</span><span class="p">();</span>
<span class="p">}</span>
<span class="nx">rulesByMedia</span><span class="p">[</span><span class="nx">mediaType</span><span class="p">]</span> <span class="o">=</span> <span class="nx">rulesByMedia</span><span class="p">[</span><span class="nx">mediaType</span><span class="p">]</span> <span class="o">||</span> <span class="p">[];</span>
<span class="nx">rulesByMedia</span><span class="p">[</span><span class="nx">mediaType</span><span class="p">].</span><span class="nx">push</span><span class="p">(</span><span class="nx">rule</span><span class="p">);</span>
<span class="p">}</span>
<span class="c1">// Now we can create a new clean stylesheet:</span>
<span class="kd">let</span> <span class="nx">lines</span> <span class="o">=</span> <span class="p">[];</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">mediaType</span> <span class="k">in</span> <span class="nx">rulesByMedia</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">mediaType</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">lines</span><span class="p">.</span><span class="nx">push</span><span class="p">(</span><span class="nx">mediaType</span> <span class="o">+</span> <span class="s2">" {"</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">rule</span> <span class="k">of</span> <span class="nx">rulesByMedia</span><span class="p">[</span><span class="nx">mediaType</span><span class="p">])</span> <span class="p">{</span>
<span class="kd">let</span> <span class="nx">padding</span> <span class="o">=</span> <span class="nx">mediaType</span> <span class="o">?</span> <span class="s2">" "</span> <span class="o">:</span> <span class="s2">""</span><span class="p">;</span>
<span class="nx">lines</span><span class="p">.</span><span class="nx">push</span><span class="p">(</span><span class="nx">padding</span> <span class="o">+</span> <span class="nx">rule</span><span class="p">.</span><span class="nx">cssText</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">mediaType</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">lines</span><span class="p">.</span><span class="nx">push</span><span class="p">(</span><span class="s2">"}"</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="kd">let</span> <span class="nx">style</span> <span class="o">=</span> <span class="s2">"&lt;style&gt;"</span> <span class="o">+</span> <span class="nx">lines</span><span class="p">.</span><span class="nx">join</span><span class="p">(</span><span class="s2">"\n"</span><span class="p">)</span> <span class="o">+</span> <span class="s2">"&lt;/style&gt;"</span><span class="p">;</span>
</pre></div>
<p>Obviously there could be rules that apply to <span class="caps">DOM</span> elements that aren’t present <em>right now</em> but could be created. And I’m sure it’s omitting fonts and animations. But it’s fun to hack around with.</p>
<p>It might be fun to use this hooked up to <a href="https://developer.mozilla.org/en-US/docs/Web/API/MutationObserver">mutation observers</a> during your testing and find orphaned rules.</p>
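<p>A minimal sketch of how that might look, assuming you already have a list of selectors pulled out of the stylesheets. The names <code>createOrphanTracker</code>, <code>check</code>, and <code>orphans</code> are made up for this example, and the matching function is passed in so the bookkeeping can be tried outside a browser.</p>

```javascript
// Hypothetical helper: remember which selectors have matched at least once.
// `matches` is any function that returns a truthy value when a selector
// currently matches something, e.g. (sel) => document.querySelector(sel).
function createOrphanTracker(selectors, matches) {
  const unseen = new Set(selectors);
  return {
    // Re-test only the selectors that have never matched so far
    check() {
      for (const selector of unseen) {
        if (matches(selector)) {
          unseen.delete(selector);
        }
      }
    },
    // Selectors that have not matched anything yet
    orphans() {
      return Array.from(unseen);
    },
  };
}

// Assumed browser wiring (not exercised here):
//   const tracker = createOrphanTracker(selectors, (s) => document.querySelector(s));
//   tracker.check();
//   new MutationObserver(() => tracker.check()).observe(
//     document.documentElement,
//     {childList: true, subtree: true, attributes: true});
```

<p>Whatever is left in <code>orphans()</code> at the end of a test run is a candidate for removal, with the usual caveat that it only reflects the states your tests actually reached.</p>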