Thursday, March 28, 2013

The Single Arity (aka Communicating and why I Quit Email: Series Finale)

Aside: In all, I'd much rather elaborate on wheres, joins, and orders by, but I have a few unfinished pieces. So on with it.

Lying here, fading in and out of scope over the waves of recovery from back surgery, I can enjoy in the quiet of these moments the blissful roar of silence as it screams past my bedroom windows: the silence only an empty inbox can deliver. I started contemplating an email-abandon-ship (Part 1) a little short of a year ago. I let it simmer, and shortly after last September, I pulled the plug completely (Part 2).

Ultimately, I think I've proved my hypothesis: as a medium of communication, email is exactly as useful as flinging a fortune cookie into the ocean and hoping for a response. Email guarantees only two things: your fortune will go somewhere; and wherever it does go, it will probably arrive soggy.

With your head tilted at just the right angle, it's an interesting phenomenon. From an implementation perspective, it's very nearly like Microsoft's decision to reverse engineer JavaScript into JScript for Internet Explorer while faithfully and painfully preserving all of its bugs (I believe +Douglas Crockford's actual quote is in episode 2). It's as if the e-postal express pioneers in the early email client space deliberately preserved all of the faults of asynchronous analog communication for the digital revolution. Not satisfied with that, they added spam.

The web blossomed (eventually) atop the HTTP and peer protocols by standardizing the rules for client interfaces, by imposing constraints, by limiting behavior, and by frequently just refusing to do much of anything at all. The HTML spec is a wretched mess not worthy to spread across a puddle for the crossing of a peasant farmer, but it has not yet proved a total failure. Certainly, it's far from heaven for developers; but it has an API that is sometimes/sort of/maybe consistent across clients. An API just good enough to make beautiful things. Yet most end users are only ever vaguely aware of the browsers or the APIs or the compatibility matrices that stand as a middleman to the Internet--many don't know which browser they use, or how or why in the gods' names you would want to switch.

Compare that to email. Developers don't care: there is no standard, no API, nothing to build upon. Client "plugins" are mere novelty items. Developers are either wasting away, throwing great creativity after mediocre, building their own new email client which will solve the problems of the world; or developers are doing something useful with their skills (if only!). Users, on the other hand, can immediately tell you which of the thousand available email clients they have used, hated, and finally endured.

And that fundamental difference speaks to the abyss between email and productivity. No one asks, "How do I use the web?" Billions continue to ask, "How do I set up email?"

We need something better, and I don't think we can monkey patch it onto the bubonic plague infected collection of mail clients propagating throughout the world today.

If you peruse +Kwindla Kramer's post The Cloud I'd Like to See, I think his arguments for the future of distributed file systems paint a nice parallel to the problem of email. In fact, if we think of emails as just files in folders which are distributed, shared, portable, synchronized, secured, encrypted, searchable, editable, deletable, revocable, streamable, revisionable and tightly managed in the same way we already manage file systems (as well as including some of Mr. Kramer's insights), the solution seems a bit cleaner.

Of course, there are many more dimensions to communication beyond asynchronous, unidirectional transmission of content--our communication threads start from notes on napkins at the local cafe, transmogrify into mobile calls, mutate into group emails, and condense over the course of physical meetings until something resembling a decent cup of tea pops out the end. To this end, it's worth remembering that the web itself is a communication platform, not a content delivery platform; and maybe the evolution in multithreaded, asynchronous, highly concurrent, multi-directional, peer-to-peer communication will simply emerge as a natural extension of the web platform itself.

Regardless, I don't know what the solutions will be, but I can say this:

If the urge strikes you to start a message in a bottle, instead consider talking sternly at the nearest wall. It's guaranteed to be more productive.

As always, everything I write belongs to the Public Domain. Please take generously.


Sunday, March 17, 2013

Revisiting: How to Subclass an Array (Really)

Update: +Axel Rauschmayer has an even more succinct post on the subject, which I highly recommend.

In the course of building out my SQL prototype, it's immediately obvious that I have to touch the Array prototype. You could write layers of abstraction to get around this, but in my opinion it is not worth the engineering effort when extending the prototype is cleaner and low risk. Still, we are talking about Array--an object which already has inconsistencies on older browsers and which is ever expanding at the ES spec level, so the idea of subclassing an Array is nice.

Despite my own argument to the contrary over a year ago, I don't think it possible or wise to try Array subclassing. First, let's look at the code I wrote back then:

var array = function () {
    var retArray = Array.prototype.slice.apply(arguments, 0);

    retArray.contains = retArray.contains || function (value) {
        return retArray.indexOf(value) != -1;
    };
    
    return retArray;
};

This has two problems. First, it doesn't execute. We'll get a type exception on the first line. The code needs to be:

var array = function () {
    var slice = Array.prototype.slice;
    var retArray = slice.call(arguments, 0);    

    retArray.contains = retArray.contains || function (value) {
        return retArray.indexOf(value) != -1;
    };
    
    return retArray;
};
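
For anyone who wants that fix spelled out, here is a minimal sketch (the function names are mine, purely for illustration). call takes the arguments for slice one at a time, while apply expects them packed into a single array-like object, which is why handing apply a bare 0 blows up:

```javascript
// Hypothetical helper names, for illustration only.
var slice = Array.prototype.slice;

function viaCall() {
    // call: the arguments for slice are listed one by one
    return slice.call(arguments, 0);
}

function viaApply() {
    // apply: the arguments for slice are packed into an array
    return slice.apply(arguments, [0]);
}

function viaBrokenApply() {
    // apply with a bare number instead of an array-like: throws a TypeError
    return slice.apply(arguments, 0);
}

var brokenError = null;
try {
    viaBrokenApply(1, 2, 3);
} catch (e) {
    brokenError = e;
}

console.log(viaCall(1, 2, 3));                 // [ 1, 2, 3 ]
console.log(viaApply(1, 2, 3));                // [ 1, 2, 3 ]
console.log(brokenError instanceof TypeError); // true
```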

It's important to understand the difference between call and apply. While this code does return a new instance of an array with the new contains method, it hasn't actually subclassed anything: I've only bolted the method onto each returned instance. In order to actually create a new subclass, you first need an abstraction to help think about prototypal inheritance. Let's define it as:

  Object.defineProperty(Function.prototype, 'inheritsFrom', {
      value: function(parentClassOrObject) {
          if (parentClassOrObject.constructor === Function) {
              //Normal Inheritance
              this.prototype = new parentClassOrObject();
              this.prototype.constructor = this;
              this.prototype.parent = parentClassOrObject.prototype;
          }
          else {
              //Pure Virtual Inheritance
              this.prototype = parentClassOrObject;
              this.prototype.constructor = this;
              this.prototype.parent = parentClassOrObject;
          }
          return this;
      }

  });

Function inheritsFrom takes in an object and returns a 'this' which has been scoped as a derived class of the parentClassOrObject. This kind of prototype management is one of the reasons that it is harder (at least for me) to reason with this model. But now we have an abstraction to take care of this portion of the headache for us. Let's write the method to actually instance a new subclass:
  
 function makeSubClass(inheritsFrom, constructorCallBack) {

      //Define the method
      var ret = function() {
          //The body of the constructor
          var slice = Array.prototype.slice;
          var args = slice.call(arguments, 0);
          try {
              if (inheritsFrom) {
                  inheritsFrom.apply(this, args);
              }
              //Optional callBack if we want to inject our own logic on construction
              if (constructorCallBack) {
                  constructorCallBack.apply(this, args);
              }
          }
          catch (e) {
              console.error(e);
          }
      };
      //Do the subclassing
      if (inheritsFrom) {
          ret.inheritsFrom(inheritsFrom);
      }
      return ret;
  }

In a lot of use cases, this pattern will work just fine. And if you were to begin playing with a new array subclass instanced in this way, it would largely behave normally.

var nuArray = makeSubClass(Array); 
var nuInst = new nuArray();
nuInst.push(1);
nuInst[0] === 1; //true
nuInst.length === 1; //true

But you may begin to notice the drawbacks. nuArray must be instanced with the new keyword, and it can't be instanced with data.

var nuArray = makeSubClass(Array); 
var nuInst2 = new nuArray(1,2,3);
nuInst2.length === 0; //true?!?
nuInst2[0] === undefined; //true?!?
//Try adding data by index
nuInst2[0] = 1;
nuInst2[0] === 1; //true
nuInst2.length === 0; //true?!?
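
One way to see where the data goes (a sketch of the mechanism, not code from the project): when Array is invoked as a plain function through apply, it ignores the `this` it was handed and returns a brand new array, which the constructor built by makeSubClass then simply discards.

```javascript
// `target` plays the role of the freshly constructed subclass instance.
var target = {};

// This is effectively what inheritsFrom.apply(this, args) does when the parent is Array.
var returned = Array.apply(target, [1, 2, 3]);

console.log(returned);                   // [ 1, 2, 3 ]
console.log(returned === target);        // false
console.log(Object.keys(target).length); // 0
```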

And from here, the experience continues to degrade. Using most of the Array mutator methods and all of the Array iterator methods will operate on and return Array instances--not instances of your subclass. You'll quickly find instance mutation to be rampant and unpredictable.
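
A small sketch of that leak, using a hand-rolled stand-in (NuArray is a hypothetical name, wired up the same way inheritsFrom would do it): map hands back a plain Array rather than an instance of the subclass, and Array.isArray doesn't even recognize the subclass instance.

```javascript
// Hypothetical stand-in for a subclass produced by makeSubClass(Array).
function NuArray() {}
NuArray.prototype = new Array();
NuArray.prototype.constructor = NuArray;

var inst = new NuArray();
inst.push(1, 2, 3);

// An iterator method: the result is built as an ordinary Array.
var doubled = inst.map(function (n) { return n * 2; });

console.log(inst instanceof NuArray);    // true
console.log(doubled instanceof NuArray); // false
console.log(Array.isArray(doubled));     // true
console.log(Array.isArray(inst));        // false
```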

You can continue down this path and try implementing your own overrides as callbacks. You can get really, really clever with this stuff; but ultimately, in my opinion--as written, Array was never intended to be the parent of a derived class. Just don't go in that pool.

Embrace the extension of native objects, because that use case was clearly planned from the start.

--As always, everything I write, in whatever language I write it, is fully released to the public domain.

Saturday, March 16, 2013

Currying Favor with Partial Application to get JavaScript SQL

There is a domain that attends to curry as one of the quintessential elements of regional cuisine. This is not that domain; though in this domain I would argue we need a paprika.

If the terms curry or partial application are at all unfamiliar to you, I highly recommend Reg Braithwaite's latest opus on the subject. Lending support are +Ben Alman with his own eviscerating tour de force as well as +Axel Rauschmayer with his very, extremely, routinely reputable entry. If you've ever followed me here before, you may remember a fog surrounding an idea that touched on partial application from Good Reads, where I linked to (again to Mr. Braithwaite) the skinny.

Do not be confused, dismayed, disheartened or discouraged. Currying and partial application are not easy to grok. The literature on the subject, while vast, is dense. Even in the most skilled hands, attempts to bring these tomes down from Mt. Sinai have not routinely resulted in greater clarity.

But this is not yet-another-indoctrination on the subject. Defer to the authorities above for that. Here, I'm attempting to exploit the potential for good. Using both ingredients, it is possible to create a semantic for querying JavaScript objects in a syntax that resembles SQL. Are there other solutions to do this? Yes.

Once you've apprehended grokation of the concepts, it should be straightforward to see the implementation of a few standard partial application methods: map, filter and fold. Lots of libraries have already done the diligence and written these for us, but for the sake of having something to write, let's implement them again (in the real world, you're better off taking what functional js or wu.js have already built)!

function curryLeft(func) {
   var slice = Array.prototype.slice;
   var args = slice.call(arguments, 1);
   return function() {
       return func.apply(this, args.concat(slice.call(arguments, 0)));

   }
}

function foldLeft(func,newArray,oldArray) {
    var accumulation = newArray;
    each(oldArray, function(val) {
        accumulation = func(accumulation, val);
    });
    return accumulation;
    
}

function map(func, array) {
    var onIteration = function(accumulation, val) {
        return accumulation.concat(func(val));
    };
    return foldLeft(onIteration, [], array)
}

function filter(func, array) {
    var onIteration = function(accumulation, val) {
        if(func(val)) {
            return accumulation.concat(val);
        } else {
            return accumulation;
        }
    };
    return foldLeft(onIteration, [], array)
}
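
The functions above call an each helper that this post never defines (it presumably lives in the project source). What follows is only a guess at a minimal version, assuming it walks arrays by index and plain objects by key and passes (value, keyOrIndex) to the callback; the third argument that some later calls pass is not modeled. With it in place, curryLeft and foldLeft combine into a partially applied sum:

```javascript
// ASSUMPTION: a stand-in for the undefined `each` helper; the real one may differ.
function each(collection, callback) {
    if (Array.isArray(collection)) {
        for (var i = 0; i < collection.length; i += 1) {
            callback(collection[i], i);
        }
    } else {
        Object.keys(collection).forEach(function (key) {
            callback(collection[key], key);
        });
    }
}

// Copied from the post so the sketch runs on its own.
function curryLeft(func) {
    var slice = Array.prototype.slice;
    var args = slice.call(arguments, 1);
    return function () {
        return func.apply(this, args.concat(slice.call(arguments, 0)));
    };
}

function foldLeft(func, newArray, oldArray) {
    var accumulation = newArray;
    each(oldArray, function (val) {
        accumulation = func(accumulation, val);
    });
    return accumulation;
}

function add(a, b) {
    return a + b;
}

// Pre-fill the first two arguments of foldLeft; only the array is left to supply.
var sum = curryLeft(foldLeft, add, 0);

console.log(sum([1, 2, 3, 4])); // 10
```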

With just these, we can do something that's almost cool. We can extend the native Array class to add some new methods:

Object.defineProperties(Array.prototype, {
    '_where': {
        value: function(func) {
            return filter(func, this);
        }
    },
    '_select': {
        value: function(func) {
            return map(func, this);
        }
    }
});

At this point, given an instance of an Array (thanks to Faker), like:

var somePeople = [
    {"FirstName":"Cristina", "LastName":"Quigley", "PhoneNumber":"1-189-868-2830", "Email":"Imelda@lourdes.ca", "Id":0},
    {"FirstName":"Eriberto", "LastName":"Bailey", "PhoneNumber":"1-749-549-2050 x36612", "Email":"Pamela_Gaylord@ludie.net", "Id":1},
    {"FirstName":"Amina", "LastName":"Schaden", "PhoneNumber":"463-301-9579 x9511", "Email":"Conner_Gusikowski@jolie.tv", "Id":2}];

If we wanted to select the FirstName of each record, we could do something crude like:

somePeople._select(function(row) { return row.FirstName });

But this is still too obtuse. With a little curry, we can make it better. We used partial application to get to '_select', but we can switch gears to curry to get a better query mechanism. (Update: we technically don't need curry here, as we're not abstracting the 'query' object as a parameter. Thanks to Thomas Burette in the comments.) First, let's define a query method:

var query = function(array) {
    var tables = [];
    tables.push(array);
    var _query = {
        tables: tables,
        from: from,
        select: select,
        run: run
    };
    return _query;
};

The select and from methods are straightforward:

function select() {
    var query = this;

    var slice = Array.prototype.slice;
    var args = slice.call(arguments, 0);
    query.columns = query.columns || [];
    each(args, function(argumentValue) {
        query.columns.push(argumentValue);
    });
    return query;
}

function from(array) {
    var query = this;
    query.tables.push(array);
    return query;
}

That leaves only execution. I've deliberately not optimized this method for the purpose of illustration: in the absence of the tools, this is what such code looks like. Look at the redundancy and duplication. Marvel at the inelegance. Appreciate the fact that it works.

function run() {
    var query = this;
    var returnRows = []; // declared up front so run() always returns an array
    if (query.columns && query.columns.length > 0) {
        var results = [];
        each(query.columns, function(columnName) {

            each(query.tables, function(tbl) {
                if (Array.isArray(tbl)) {
                    var res = {};
                    var val = tbl._select(function(val) {
                        return val[columnName];
                    });
                    if (val) {
                        res[columnName] = val;
                        results.push(res);
                    }
                }
            }, true);

        });

        var returnRows = [];
        if(results && results.length > 0) {
            var firstResult = results[0];
            
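            each(firstResult, function(val, key) {
                // Track the row position ourselves, since `each` is not shown in this post
                var rowIndex = 0;
                each(val, function(cell){
                    var row = {};
                    row[key] = cell;
                    each(results.slice(1), function(result) {
                        each(result, function(v,k){
                            // Take the cell at the same position rather than the column's last cell
                            row[k] = v[rowIndex];
                        },true)
                    },true)
                    returnRows.push(row);
                    rowIndex += 1;
                },true);
                
            },true)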
            
        }
        
    }
    return returnRows;
}

Now, this yields a syntax which looks a lot more like SQL:

var newQuery = query(somePeople).select('FirstName', 'LastName');
var results = newQuery.run();

From here, 'where', 'join' (yes I said JOIN), 'orderby' and 'groupby' are all implementation details. This is just a proof-of-concept post, but given some large Faker data sets it already works quite well given its limitations. Refactoring 'run' into a method which utilizes partial application will yield mountains.
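
As a hedged sketch of where such a refactoring might head (this is not the project's code; pick and runSketch are hypothetical names, partial application is done with the built-in Function.prototype.bind, and only the first table is considered):

```javascript
// Hypothetical: copy just the named columns out of one row.
function pick(columns, row) {
    var out = {};
    columns.forEach(function (name) {
        out[name] = row[name];
    });
    return out;
}

// Hypothetical replacement for run(): project every row of the first table.
function runSketch(query) {
    var columns = query.columns || [];
    var table = query.tables[0] || [];
    // bind pre-fills `columns`, leaving a function of the row for map to call
    return table.map(pick.bind(null, columns));
}

// A query object shaped like the one query(...).select(...) builds in the post.
var demoQuery = {
    tables: [[
        { FirstName: 'Cristina', LastName: 'Quigley', Id: 0 },
        { FirstName: 'Eriberto', LastName: 'Bailey', Id: 1 }
    ]],
    columns: ['FirstName', 'LastName']
};

console.log(runSketch(demoQuery));
```

Each row is projected in a single pass, so the column values stay aligned without any of the nested loops.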

As always, everything I blog and code is public domain. You can view the source from the oj-sql project here, collab with me on c9 here, or do whatever strikes your whim. May the wind that strikes your whim be always at your back.