@fitzyfitzyfitzy

Ex-goat herder. Living on a planet in space. Once and future thing. And this is my blog. It has my name on it, and there are dates, the blog-spoor are everywhere.

detecting a pattern with a convolutional network and no data

Imagine a buddy asks you to write a program to detect a simple visual pattern in a video stream. They plan to print out the pattern, hang it on the wall in a gallery, then replace it digitally with artworks to see how they’ll look in the space.

a pattern in a gallery

“Sure,” you say. This is a totally doable problem, since the pattern is so distinctive.

In classic computer vision, a young researcher’s mind would swiftly turn to edge detection, Hough transforms, morphological operations, and the like. You’d take a few example pictures, start coding something, tweak it, and eventually get something that works more or less on those pictures. Then you’d take a few more and realize it didn’t generalize very well. You’d agonize over the effects of bad lighting, occlusion, scale, perspective, and so on, until finally running out of time and just telling your buddy all the limitations of your detector. And sometimes that would be OK, sometimes not.

Things are different now. Here’s how I’d tackle this problem today, with convnets.

First question: can I get labelled data?

This means, can I get examples of inputs and the corresponding desired outputs? In large quantities? Convnets are thirsty for training data. If I can get enough examples, they have a good shot at learning what I used to program by hand, and doing it better.

I can certainly take pictures and videos of me walking around putting this pattern in different places, but I’d have to label where the pattern is in each frame, which would be a pain. I might be able to do a dozen or so examples, but not thousands or millions. That’s not something I’d do for this alleged “buddy.” You’d have to pay me, and even then I’d just take your money and pay someone else to do it.

Second question: can I generate labelled data?

Can I write a program to generate examples of the pattern in different scenes? Sure, that is easy enough.

Relevance is the snag here. Say I write a quick program to “photoshop” the pattern in anywhere and everywhere over a large set of scenes. The distribution of such pictures will in general be very different from the distribution of real pictures. So we’ll be teaching the network one thing in class and then giving it a final exam on something entirely different.

There are a few cases where this can work out anyway. If we can make the example space we select from varied enough that, along the dimensions relevant to the problem, the real distribution is contained within it, then we can win.

Sounds like a real stretch, but it’ll save hand-labelling data so let’s just go for it! For this problem, we might generate input/output pairs like this:

some generated samples

Here’s how these particular works of art were created:

  • Grab a few million random images from anywhere to use as backgrounds. This is a crude simulation of different scenes. In fact I got really lazy here and just used pictures of dogs and cats I happened to have lying around. I should have included more variety. pixplz is handy for this.
  • Shade the pattern with random gradients. This is a crude simulation of lighting effects.
  • Overlay the pattern on a background, with a random affine transformation. This is a crude simulation of perspective.
  • Save a “mask” version that is all black everywhere except where we just put the pattern. This is the label we need for training.
  • Draw random blobs here and there across the image. This is a crude simulation of occlusion.
  • Overlay another random image on the result so far, with a random level of transparency. This is a crude simulation of reflections, shadows, more lighting effects.
  • Distort the image (and, in lockstep, the mask) randomly. This is a crude simulation of camera projection effects.
  • Apply a random directional blur to the image. This is a crude simulation of motion blur.
  • Sometimes, just leave the pattern out of all this, and leave the mask empty, if you want to let the network know the pattern may not always be present.
  • Randomize the overall lighting.
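
Here’s a rough sketch of what the core of such a generator might look like, in Python with PIL and numpy. Everything here is a placeholder - the sizes, the probabilities, and which steps I bothered to include - rather than the exact recipe:

import random
import numpy as np
from PIL import Image, ImageOps

SIZE = 256

def make_example(pattern, background):
    # pattern: RGBA PIL image; background: any largish PIL image
    bg = background.convert("RGB").resize((SIZE, SIZE))
    mask = Image.new("L", (SIZE, SIZE), 0)
    if random.random() < 0.9:  # sometimes leave the pattern out entirely
        pat = pattern.convert("RGBA")
        # random rotation and scale, a crude stand-in for the affine warp
        pat = pat.rotate(random.uniform(0, 360), expand=True)
        w = max(1, int(pat.width * random.uniform(0.2, 0.8)))
        h = max(1, int(pat.height * random.uniform(0.2, 0.8)))
        pat = pat.resize((w, h))
        # paste at a random spot, possibly hanging off the edge of the frame
        x = random.randint(-w // 2, SIZE - w // 2)
        y = random.randint(-h // 2, SIZE - h // 2)
        bg.paste(pat, (x, y), pat)      # the image gets the pattern...
        solid = Image.new("L", (w, h), 255)
        mask.paste(solid, (x, y), pat)  # ...and the mask gets it in lockstep
    # crude stand-in for reflections: blend in a second image at random alpha
    # (lazily just a mirrored copy here; a real version would use another photo)
    bg = Image.blend(bg, ImageOps.mirror(bg), random.uniform(0.0, 0.3))
    return np.asarray(bg), np.asarray(mask)

A real version would pile on the gradient shading, occluding blobs, directional blur, and lockstep distortion of image and mask in the same spirit.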

The precise details aren’t important; the key is to leave nothing reliable except the pattern you want learned. To classic computer vision eyes, this all looks crazy. There’s occlusion! Sometimes the pattern is only partially in view! Sometimes its edges are all smeared out! Relax about that. Here’s a hacked-together network that should be able to do this. Let’s assume we get images at 256x256 resolution for now.

the model

A summary:

  • Take in the image, 256x256x3.
  • Batch normalize it - because life is too short to be fiddling around with mean image subtraction.
  • Reduce it to grayscale. This is me choosing not to let the network use color in its decisions, mainly so that I don’t have to worry too much about second-guessing what it might pick up on.
  • Apply a series of 2D convolutions and downsampling via max-pooling. This is pretty much what a simple classification network would do. I don’t try anything clever here, and the filter counts are quite random.
  • Once we’re down at low resolution, go fully connected for a layer or two, again just like a classification network. At this point we’ve distilled down our qualitative analysis of the image, at coarse spatial resolution.
  • Now we start upsampling, to get back towards a mask image of the same resolution as our input.
  • Each time we upsample we apply some 2D convolutions and blend in data from filters at the same scale during downsampling. This is a neat trick used in some segmentation networks that lets you successively refine a segmentation using a blend of lower resolution context with higher resolution feature maps. (Random example: https://arxiv.org/abs/1703.00551).
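
For concreteness, here’s what a model of roughly that shape looks like sketched in Keras (assuming tf.keras). The grayscale trick, the fully connected middle, and the skip connections follow the list above, but every filter count and size here is a guess, not the exact model:

import tensorflow as tf
from tensorflow.keras import layers

def build_model(size=256):
    inp = layers.Input((size, size, 3))
    x = layers.BatchNormalization()(inp)             # no mean-image fiddling
    x = layers.Lambda(tf.image.rgb_to_grayscale)(x)  # deny the network color
    skips, filters = [], [8, 16, 32, 64, 128]
    for f in filters:                                # downsampling path
        x = layers.Conv2D(f, 3, padding="same", activation="relu")(x)
        skips.append(x)
        x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)                          # fully connected for a
    x = layers.Dense(256, activation="relu")(x)      # layer or two, down at
    x = layers.Dense(8 * 8 * 32, activation="relu")(x)  # coarse resolution
    x = layers.Reshape((8, 8, 32))(x)                # (8x8 assumes size=256)
    for f, skip in zip(filters[::-1], skips[::-1]):  # upsampling path
        x = layers.UpSampling2D()(x)
        x = layers.Concatenate()([x, skip])          # blend same-scale filters
        x = layers.Conv2D(f, 3, padding="same", activation="relu")(x)
    out = layers.Conv2D(1, 1, activation="sigmoid")(x)  # the mask
    return tf.keras.Model(inp, out)

model = build_model()
model.compile(optimizer="adam", loss="binary_crossentropy")
# then feed it an endless stream of freshly generated (image, mask) pairs,
# e.g. batches of make_example output scaled to [0, 1], via model.fit.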

Tada! We’re done. All we need now is to pick approximately 4 million parameters to realize all those filters. That’s just a call to model.fit or the equivalent in your favorite deep learning framework, and a very very long cup of coffee while training runs. In the end you get a network that can do this on freshly generated images it has never seen:

testing on fresh generated images

I switched to a different set of backgrounds for testing. It’s not perfect by any means - I didn’t spend much time tweaking - but to classical eyes this is already clearly a different kind of beast: no sweat about the pattern being out of view, partial occlusion also no biggie, and so on.

Now the big question - does it generalize? Time to take some pictures of the pattern, first on my screen, and then a print-out when I finally walked all the way upstairs to dust off the printer and bring the pattern to life:

real life testing

Yay! That’s way better than I’d have done by hand. Not perfect but pretty magical and definitely a handy tool! Easy to clean up to do the job at hand.

checking on the gallery image

One very easy way to improve this a lot would be to work at higher resolutions than 256x256 :-)

the year of poop on the desktop

CSV and PSV converter text boxes

So last month an idea surfaced from @j4mie for an alternative data format: poop separated values (PSV). Here’s the complete spec.

PSV Spec

At first, I laughed. Then, maybe I cried a little – 2016 went in a kind of 💩y direction at the end there. Finally I started thinking. I realized that PSV is a brilliant idea, and here’s why.

What we’re talking about here is the :poop: emoji, standardized as U+1F4A9 PILE OF POO in Unicode 6.0. Look at that code point for a minute: U+1F4A9. That’s a big number. That’s outside the Basic Multilingual Plane my friends, and into the astral planes of the Unicode standard. You have to have your act together to deal with this. For example, Mathias Bynens covers all the muddles JavaScript gets into with poop in JavaScript has a Unicode problem. This is not the 128-odd ASCII characters your grandmother grew up with.
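
To see what “astral” means in practice, here’s a quick Python demonstration - the UTF-16 line is exactly where JavaScript’s confusion comes from:

poop = "\U0001F4A9"              # U+1F4A9 PILE OF POO
print(len(poop))                 # 1 - Python 3 counts code points
print(poop.encode("utf-8"))      # b'\xf0\x9f\x92\xa9' - four bytes
print(poop.encode("utf-16-le"))  # b'=\xd8\xa9\xdc' - a surrogate pair, which
                                 # is why "💩".length is 2 in JavaScript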

So isn’t that an argument not to use an astral symbol as a separator? Let’s reflect on the failings of CSV. Jesse Donat has a great list in a piece called Falsehoods-Programmers-Believe-About-CSVs, mirroring similar lists about names, geography and so on. The first 8 falsehoods are all encoding related. With CSV, if your data is a table of numbers, you don’t really have to think about encoding at all. That’s nice right up until the moment a non-ASCII character sneaks in there, at which point it all goes pear-shaped. But if we lead with a mandatory poop symbol, from an astral plane no less, no-one is going to be able to punt on the encoding issue. You have to get it right up front.

The logical conclusion of this idea would be to borrow a string like Iñtërnâtiônàlizætiøn☃💩 from unit tests and use that as the delimiter. But 💩 alone gets us a good way there, and looks cleaner. Err.

Here’s an example of PSV in action. I’m editing a PSV file called checklist.psv that lists my current goals in life. I’m using emacs to edit the file, and git to view differences against a previous version.
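
Assuming the spec boils down to CSV with 💩 swapped in for the comma, a made-up checklist.psv (not my real one) would look like this:

GOAL💩DONE
ship daff release💩yes
inbox zero💩no
world peace💩no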

PSV and daff

I’ve configured git here to use daff to view tabular differences cleanly. I’m doing this on my phone because phones currently excel at showing emoji – the same thing on a laptop also works fine but the poop is less cheerful looking.

One danger with PSV is that people could get sloppy about quoting rules. With CSV, you’ve a good chance of seeing a comma in your data, so you deal with quoting sooner rather than later to disambiguate that. The business data I’ve seen in CSV form has never had 💩 in it, so I could imagine someone skimping on quoting. One solution for that is for more of us to put poop in our names and transactions, Little Bobby Tables style (xkcd, @mopman’s company).

There aren’t a lot of programs supporting PSV yet. So far as I know, daff is the first. The purpose of daff is making tabular diffs and helping with version control of data, but until format converters crop up you can use it to convert to and from psv as follows:

pip install daff           # or npm install daff -g
daff copy foo.csv foo.psv  # convert csv -> psv
daff copy foo.psv foo.csv  # convert psv -> csv

Or you can write into the text boxes at the start of this post :-).

releasing a library for many languages using Haxe

npm/gem/pypi/php packages

Every programming language is a special snowflake with its own idiosyncratic beauty. Porting code from one language to another is an art, requiring dodging and weaving to give idiomatic results. If a library you’d like to use hasn’t been ported to your language, one option is to use a foreign function interface (FFI). A lot of reference implementations get written in C for this reason. The result is definitely not a thing of beauty, but it works.

Haxe gives another option to a library writer who needs to support communities using different languages. We can write the bulk of the library in Haxe, have that automatically transpiled to the languages we care about, and maybe add a little hand-written code in each language to make the API feel comfortable. This scales the effort involved way down.

For example, I wrote the daff library in Haxe and publish it to:

  • npm, for JavaScript users
  • RubyGems, for Ruby users
  • PyPI, for Python users
  • Packagist, for PHP users

It turns out that a bunch of PHP users showed up, giving great feedback. That’s the target I personally know least about and would never have gotten around to supporting without Haxe.

The Ruby language was the one I personally cared most about at the time I started this. Ruby isn’t supported by Haxe, but it turns out to be surprisingly easy to add a target to Haxe that is “good enough” at least to translate code that is just a bunch of logic and algorithms (I did this for ruby here).

There’s an important downside to this approach though: you may not get as many pull requests. Users are not likely to be familiar with Haxe (yet), so working with the source will be a challenge for them. Haxe is a very straightforward, “common denominator” language to read and write – but it is a new language.

How square am I

Let me count the ways:

  • The icon my RSS client uses is of a newspaper printed on paper. RSS, RSS client, newspaper - that counts for three ways right there.
  • My t-shirts say nothing witty. I’m currently wearing one emblazoned with an enigmatic logo for a local bike/walk event and more importantly (if typography is any guide) THE LOGOS AND NAMES OF ALL ITS SPONSORS. There’s a white stain from some goop my daughter spilled on me that I half-heartedly and ineffectually tried to rub off with my nails. Update: I changed t-shirt.
  • My laptop has no funky stickers from events or projects. All there is, is a fading Intel Centrino Inside sticker it came with, that has been gradually rotating under my hand. It is now at a 30 degree angle, with traces of exposed glue on the upper side. I don’t plan to take any action about this anytime soon. Update: the sticker fell off.
  • I don’t have an interesting phone. My phone is as dumb as they come, and not in a hip way, more in a total-neglect and out-of-touchy kind of way. 2016 Update: gave in, now editing this in emacs on a new phone so I don’t even have the retro thing going for me any more. argh how do i type ctrl x ctrl s oh right volume down.
  • I’ve never changed my name on twitter. Why are people doing that? Update: I get this now, thanks to pikesley Update: I mean Only Zuul Update: I mean Galactus of the Left BIG PIPES Nun Of The Above Taylor's Wift Human Blockchain Unicode Batman Lord of the Flues Head of Snorkelling 6MillionBitcoinMan Sharing Economist Lol Cream Peter Gunn Jolly Local Swagman Dandy Highwayman Cubic's Rube Cyber with Rosie Cyber with Roadies Institute of Codeine Santos L. Halper Hugh Jif True Bill Clay Jeremy Kylo Ren New Year's Steve Triangle Man Cognitive Sourpuss Manic Bitcoin Miner Null of Kintyre Safety Third Phineas Gage Thousand-yard Stairs London Supercloud register.register WinstonNilesRumfoord Sabre Wulf Cable Select ISA Bus Sides of March Click Dummy Fatuous Pauper Ringlefinch Alonzo Mosely Ian Bothan R Dweeno Horse it's June When Devs Cry Roko's Obelisk Riemann's Zero Penn Rose-Tile Belouis None Rush Goalie 2-Tractor Auth All Is Aflame Jason Bourne Shell SCSI Terminator X bonsaikitten Kid Charlemagne Fronk-en-shteen Metric Martyr Billy Yum-Yum 2x2 Full-Sack Developer Konix Speedking Metropolitan Liberal Glidd of Glood Sexy Cartoon Fox Elite: Liberal Activist Grudge Gee Suswept Big Dope Shill Britain’s Best Tree Rogue Nun Deus X-Wing Ice-Cream Fan Man of Leisure SCMODS R Dweeno Alternative Fax Auntie Kithera Straight Banana iso8601 or GTFO Fake Gnus Two 9s Uptime Vex Machina several people Proof of Kraftwerk UNNECESSARY HASHTAGS Mutant Tycoon LXC Sale Seb O'Teur Spinach Recall.
  • The things I’m putting on this list and not putting on it no doubt reveal assumptions I’m making about what is hip that confirm further my squaritude in ways I can’t even imagine.
  • I totally get grumpy about the dumb things that kids are doing these days.
  • I totally get grumpy about how much sense the dumb things that kids are doing these days end up making in the end.

Diff and merge CSV files in your git client

Sometimes, you want to version-control your data. As programmers, many of us are used to putting everything in git. For large datasets, that is currently a recipe for sadness, but smaller ones can work just fine.

There are a few hurdles. CSV files have a special tabular structure that git knows nothing about. This means that diffs will be noisier than they need be, and that git may see conflicts when merging where there are none.

For diffs, James Smith has a great explanation and a good start at a solution. In the git client, he proposes a custom git diff driver that understands CSV structure. On the server side, he shows how to tweak gitlab or (via a plugin called CSVHub) github to get pretty diffs using the daff library.

For merges, on the client side, the coopy library has for some time provided a similar merge driver to let git understand and use CSV structure. As of today, daff can do the same, and it is much easier to install.

$ npm install daff -g
$ daff git csv
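
Under the hood, daff git csv just wires up git’s custom diff and merge drivers for CSV files. Doing it by hand would look something like the following - treat the exact flags as a sketch and check the daff README for the real incantation:

git config --global diff.daff-csv.command "daff diff --git"
git config --global merge.daff-csv.name "daff tabular merge"
git config --global merge.daff-csv.driver "daff merge --output %A %O %A %B"
echo "*.csv diff=daff-csv" >> .gitattributes
echo "*.csv merge=daff-csv" >> .gitattributes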

Once that is installed, you’ll get nice diffs produced by the same library James used for his github plugin:

Random diff

And you’ll get nice merges too. Let’s look at an example.

Suppose we have this table stored in digi.csv:

NAME,DIGIT
one,1
two,2
thre,33
four,4
five,5

And in one branch we correct thre to three, and in another branch we correct 33 to 3:

In the first branch:

NAME,DIGIT
one,1
two,2
three,33
four,4
five,5

And in the second:

NAME,DIGIT
one,1
two,2
thre,3
four,4
five,5

If we try to merge these files in vanilla git, we’ll get an ugly conflict:

Auto-merging digi.csv
CONFLICT (content): Merge conflict in digi.csv
Automatic merge failed; fix conflicts and then commit the result.

And the CSV file is no longer a valid CSV file, which is unfortunate if we’re using a CSV-aware editor for it:

NAME,DIGIT
one,1
two,2
<<<<<<< HEAD
thre,3
=======
three,33
>>>>>>> 634275495ecd86c287e292e2719e89a9c1188ed1
four,4
five,5

With a CSV-aware merge driver, we get:

Auto-merging digi.csv
Merge made by recursive.
 digi.csv |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

and the changes are correctly merged:

NAME,DIGIT
one,1
two,2
three,3
four,4
five,5

What if there was a real conflict? Suppose we replaced thre with three in one branch but thirty-three in another? We are told:

1 conflict
Auto-merging digi.csv
CONFLICT (content): Merge conflict in digi.csv
Automatic merge failed; fix conflicts and then commit the result.

And the conflicted file looks like this:

NAME,DIGIT
one,1
two,2
"((( thre ))) thirty three /// three",33
four,4
five,5

This remains a valid CSV file, and so can be edited in a CSV-aware editor - we aren’t suddenly kicked out into needing a text editor.

CosmicOS

Many years ago, I had a list of hobby projects I worked on from time to time, each with a little summary that began: “Until Google solves this problem nicely, …” Most of these problems have now been solved, except this one:

I’d like to be able to communicate with aliens over great distances. Until Google solves this problem nicely, I’m working on a cosmic OS.

So Google hasn’t yet sorted this out, but Hans Freudenthal made a great start back in 1960 with Lincos, a “Language for Cosmic Intercourse.” Lincos starts out in a by-now conventional way (though it was inventing the conventions) with 35 pages describing a message for teaching basic math from first principles. Then it moves on to 12 pages on time. Then (and this is where things get very interesting) a whopping 79 pages on behavior, with imaginary conversations between imaginary personalities called Ha and Hb.

Chapter III: Behavior

Ha and Hb discuss mathematics, since that’s about the only topic for conversation, but that is arbitrary. In their discussions, they introduce useful ideas such as good and bad (in the sense of constructive versus non-constructive). Now we are getting somewhere.

I was struck by the value Freudenthal was able to get from descriptions of extremely basic conversations, and wondered, what could we communicate through richer interactions? What if we described simulated environments that could actually be evaluated, and played forward or reversed, to see full simulated encounters take place? That was the seed for CosmicOS.

The idea with CosmicOS is to start with math, as Freudenthal did, and then build from there to a basic programming language, and then from there to programs and simulations. CosmicOS compiles down to a long sequence drawn from just four arbitrary symbols, which could be encoded and transmitted any way we like:

CosmicOS as digits

In human-readable form, it looks kind of Lisp-y, since that happened to be the syntax that introduced least complications. CosmicOS is communicated as a long series of definitions and demonstrations:

Human-readable CosmicOS

The initial language isn’t super important, because we quickly bootstrap to any language we want. At the time I was writing this part, I was keen on Java, so I wrote a translator for it, targeting what has to be the least efficient JVM ever written.

   ...
   (field q ((int) new))
   (method <init>-V
     (lambda () /
      let ((vars / cell new / make-hash / vector
                    (pair 0 (self)))
           (stack / cell new / vector)) /
      state-machine (vars) (stack) / ? jvm / ? x / cond
         ((= (x) 0) (jvm aload 0))
         ((= (x) 1) (jvm invokespecial <init>-V 0 0))
         ((= (x) 2) (jvm aload 0))
         ((= (x) 3) (jvm iconst 0))
         ((= (x) 4) (jvm putfield q (int)))
         ((= (x) 5) (jvm return))
         (jvm return))
   )
   ...

Then I wrote a little maze game in Java, shoved it in the message, and promptly dropped the whole project for several years :-). But now I’m back and fiddling with it again, mostly at my son’s goading. I’ve brought the project up-to-date enough to be able to get pull requests. You should contribute! You know you want to.

And just so it’s clear: I don’t have any particular belief in extraterrestrials or any special reason to be interested in contacting them. It is an interesting puzzle though, figuring out all the different ways we might try to do so. You may also want to check out the recent Archaeology, Anthropology, and Interstellar Communication book.

The Data Commons Co-op

The Data Commons Co-op is a quirky start-up that I’ve been helping out with. Its job is to maximize the impact of the data held by its members, and reduce costs in managing it. Its members are “alternative economy” organizations of all types. Dan Nordley calls it “perhaps the geekiest of all cooperative organizations on the planet!”

DCC Retreat

The infrastructure for collaborative data projects could be a lot more fun than it is now. Open Data initiatives are pushing things forward quite a bit, primarily with government data in mind. That is sort of a top-down direction of data flow. We’re looking at bottom-up, grass-roots economic organizing: worker co-ops, buying clubs, community gardens, time banks, and so on. There’s a lot of overlap in communities, and potential for network effects. The Data Commons Co-op is a way to pay for the infrastructure that everyone needs and no-one can make happen alone.

So far we’ve produced a simple diff format for tables documented on the Data Protocols site (some background in an Open Knowledge Labs post), along with two programs called daff and coopy for comparing and merging table versions.

Beyond the technology, we’re also figuring out how a culture of sharing can work in the economy. There’s a lot of reflexive data-hoarding and hiding that goes on, which is totally understandable. For individual organizations, the cost of thinking about all the issues around sharing data can outweigh by far any potential benefit. Hopefully the DCC can tilt that equation!

Daff

I wrote daff to better visualize diffs between tables (daff = data diff). You don’t need this if you work with append-only data, for example a stream of events churned out by a sensor or bureaucracy. But if you have a collection of assertions that can change with time or need correcting, then data diffs are handy.

bridge diff
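
The underlying diff format is the tabular one mentioned in the Data Commons Co-op post above: an action column marks changed rows, and changed cells carry old->new. From memory (check the spec for the details), fixing up the digi.csv table from the git post renders roughly as:

@@,NAME,DIGIT
,one,1
,two,2
->,thre->three,33->3
,four,4
,five,5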

daff can be used from the command line, as a library, or on github, using James Smith’s CSVHub. CSVHub can convert a diff like this:

bus diff line based

to something like this:

bus diff

Fragments of identity

Hello good evening and welcome. I'm so totally going to make a website one of these days. In the meantime, here are some fragments of my identity.

@fitzyfitzyfitzy