The Big Picture

A little over two years ago, Greg Kobele and I showed independently of each other that the feature refinement strategy that works for case marking and gender agreement can be extended to a vast range of dependencies, including non-local ones. [1,2] As a matter of fact, almost all constraints in the syntactic literature can be compiled directly into the feature system so that they are implicitly enforced by Merge.
As so often in science, Greg and I were standing on the shoulders of giants --- the rare kind of giants with a predilection for formal language theory, who had already shown
- the correspondence between constraints and logical formulas,
- the equivalence of monadic second-order logic and finite-state tree automata,
- a strategy for compiling tree automata directly into grammars.
The Big Picture Explained

If you and your mind aren't, respectively, awestruck and blown by the revelation above, then that's probably due to those fancy-schmancy terms that do an excellent job of obscuring the beauty of the underlying idea.
Constraints and Logic

Let's look at the correspondence of constraints and logic first. As linguists we usually encounter logic in semantics, where it is used as a description language for propositions. But logic can also be used to talk about structures, in particular trees. For example, we might want to ensure that every reflexive is c-commanded by a DP (the crudest of all crude approximations of Principle A). In first-order logic, this would be expressed by the formula
∀x [reflexive(x) → ∃y [DP(y) & c-commands(y,x)]]
"for every node x it holds that if x is a reflexive, then there is a node y that is labeled DP and c-commands x"
A tree is well-formed with respect to our original constraint iff the formula above is true in the tree. For the most part, then, we are simply using logic as a description language for defining constraints.
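To make "the formula is true in the tree" concrete, here is a minimal sketch (the tree encoding, node labels, and helper names are my own assumptions, not part of any particular formalism) that evaluates the Principle A formula on a toy tree by translating each quantifier into a loop:

```python
# Toy tree: node id -> (label, parent id); the root's parent is None.
TREE = {
    0: ("TP", None),
    1: ("DP", 0),        # the subject DP...
    2: ("VP", 0),
    3: ("V", 2),
    4: ("Refl", 2),      # ...c-commands this reflexive
}

def dominates(tree, y, x):
    """True iff y properly dominates x (y is a proper ancestor of x)."""
    parent = tree[x][1]
    while parent is not None:
        if parent == y:
            return True
        parent = tree[parent][1]
    return False

def c_commands(tree, y, x):
    """y c-commands x iff y and x are distinct, neither dominates the
    other, and y's mother dominates x (binary branching assumed)."""
    if y == x or dominates(tree, y, x) or dominates(tree, x, y):
        return False
    mother = tree[y][1]
    return mother is not None and dominates(tree, mother, x)

def satisfies_principle_a(tree):
    """forall x [Refl(x) -> exists y [DP(y) & c-commands(y, x)]]"""
    return all(
        any(tree[y][0] == "DP" and c_commands(tree, y, x) for y in tree)
        for x in tree if tree[x][0] == "Refl"
    )

print(satisfies_principle_a(TREE))  # prints: True
```

Brute-force quantifier evaluation like this is not how one would check constraints efficiently, but it shows directly what it means for a constraint-as-formula to hold of a tree.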
Monadic Second-Order Logic

Just like grammar formalisms, logics differ with respect to their expressive power, so there are some constraints that can be formalized in logic A but not in logic B. Monadic second-order logic (MSO) is a small extension of first-order logic in which one may quantify not only over individual nodes but also over sets of nodes. Set quantification makes it very easy to talk about linguistic domains, which is why MSO has proven extremely useful for the formalization of syntactic constraints. Rogers (1998) actually gives a full MSO implementation of GB with Relativized Minimality, which is very impressive considering the byzantine nature of GB.
There are only two things that are both beyond the reach of MSO and at least of peripheral interest to syntacticians. Despite being a logic, MSO has absolutely no grasp of semantic entailment. Zilch, nada. So MSO cannot regulate the distribution of heads and phrases based on semantic factors such as identity of meaning. It is also impossible to check whether two subtrees are identical, which is a problem if, say, your analysis of ellipsis involves deletion under syntactic identity. Everything else is pretty much fair game, though.
Tree Automata

Finite-state tree automata are closely related to MSO. If you have at least one course in formal language theory under your belt, all you need to know is that tree automata are the tree-analogue of finite-state automata over strings. If you don't (or you lost the belt), here's the quick gist: A tree automaton has a finite set of state symbols, and its job is to decide whether a tree is well-formed by assigning a state symbol to every node in the tree. Crucially, which state a node receives depends only on its label and the states of its daughters. So the assignment of states is a very local process, even more local than Merge. A tree is well-formed iff the state assigned to its root is a designated final state.
Despite their strict locality bound, tree automata are exactly as powerful as MSO --- every tree automaton can be converted into an MSO formula, and every MSO formula can be converted into an equivalent tree automaton. For our purposes, this means that MSO-definable long-distance dependencies can be enforced in a way that's even more local than Merge. Intuitively, one long-distance dependency is decomposed into a chain of many, many local dependencies. It isn't too surprising that Merge can enforce these local dependencies, and that is what allows it to emulate long-distance dependencies as long as they are MSO-definable.
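The decomposition of a long-distance dependency into a chain of local ones can be sketched for the toy Principle A constraint (the state names and tree encoding here are my own assumptions). Each state records just two facts about a subtree: whether its root is a DP, and whether it still contains a reflexive that has not yet found a c-commanding DP. That is all a mother node ever needs to know about its daughters:

```python
from collections import namedtuple

# pending = subtree contains a reflexive not yet c-commanded by a DP
State = namedtuple("State", ["is_dp", "pending"])

def leaf_state(label):
    return State(is_dp=(label == "DP"), pending=(label == "Refl"))

def merge_state(label, left, right):
    # A pending reflexive in one daughter is licensed exactly when the
    # other daughter's root is a DP: that DP is the sister of one of the
    # reflexive's ancestors, hence c-commands it.
    pending = (left.pending and not right.is_dp) or \
              (right.pending and not left.is_dp)
    return State(is_dp=(label == "DP"), pending=pending)

def run(tree):
    """Trees are (label,) for leaves or (label, left, right). A tree is
    well-formed iff the root state is final, i.e. nothing is pending."""
    label = tree[0]
    if len(tree) == 1:
        return leaf_state(label)
    return merge_state(label, run(tree[1]), run(tree[2]))

good = ("TP", ("DP",), ("VP", ("V",), ("Refl",)))
bad  = ("TP", ("AdvP",), ("VP", ("V",), ("Refl",)))
print(run(good).pending, run(bad).pending)  # prints: False True
```

However deeply the reflexive is embedded, the `pending` flag just percolates upward one merge step at a time until a DP sister discharges it --- a long-distance dependency enforced by strictly local bookkeeping.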
Back to Feature Refinement

Now that we have seen the big picture of things, how does it all tie into the kind of feature refinement we used for case and gender agreement? Well, the states the automaton assigns to the nodes in a tree can be pushed directly into the category system. Look at the tree below, where a gray box has been added for each state.
If you keep doing this kind of refinement until all lexical items have been refined in all licit ways, your tenacity will be rewarded with a grammar that generates only those trees that are both generated by the original grammar and deemed well-formed by the automaton.
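Here is a rough sketch of that compilation step (the toy grammar, lexicon, and automaton states are all my own assumptions): every category X is split into refined categories (X, q), one per automaton state q, and a rule survives only if the mother's state follows from the daughters' states under the automaton's transition function:

```python
from itertools import product

# States of a toy "Principle A" automaton, encoded as pairs:
# (root is a DP, subtree still contains an unlicensed reflexive).
STATES = [(dp, pend) for dp in (False, True) for pend in (False, True)]

def delta(label, left, right):
    pending = (left[1] and not right[0]) or (right[1] and not left[0])
    return (label == "DP", pending)

RULES = [("TP", "DP", "VP"), ("VP", "V", "Refl"), ("VP", "V", "DP")]
LEXICON = {"DP": "Mary", "V": "likes", "Refl": "herself"}

# Refine branching rules: (A, delta(A, q1, q2)) -> (B, q1) (C, q2)
refined = []
for (a, b, c), q1, q2 in product(RULES, STATES, STATES):
    refined.append(((a, delta(a, q1, q2)), (b, q1), (c, q2)))
# Refine lexical rules: each word gets the leaf state of its category.
for cat, word in LEXICON.items():
    refined.append(((cat, (cat == "DP", cat == "Refl")), word))

# Only (TP, q) with a non-pending q counts as a valid start category,
# so the refined grammar generates exactly the well-formed trees.
starts = [("TP", q) for q in STATES if not q[1]]
print(len(refined), starts)
```

The blind enumeration over all state pairs produces some refined categories that can never be derived; pruning those unreachable refinements is exactly the "refined in all licit ways" step, and what remains is the intersection of the original grammar with the automaton.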
In sum, if you want to express a constraint via Merge, you formalize it as an MSO formula, convert the formula into an automaton, and compile the state assignments of the automaton directly into the feature system. Charmingly straightforward, isn't it?
[1] Graf, Thomas (2011): Closure Properties of Minimalist Derivation Tree Languages. In Sylvain Pogodalla and Jean-Philippe Prost (eds.), LACL 2011, Lecture Notes in Artificial Intelligence 6736, 96--111.
[2] Kobele, Gregory M. (2011): Minimalist Tree Languages are Closed under Intersection with Recognizable Tree Languages. In Sylvain Pogodalla and Jean-Philippe Prost (eds.), LACL 2011, Lecture Notes in Artificial Intelligence 6736, 129--144.
[3] Rogers, James (1998): A Descriptive Approach to Language-Theoretic Complexity. Stanford: CSLI.