Library Innovation Lab leader talks ‘unbinding the law’ with the Caselaw Access Project – Harvard Law Today

Historically, libraries have been collections: books, multimedia materials and artwork. But increasingly they’re about connections, linking digital data in new and different ways. Harvard Law’s Caselaw Access Project is a state-of-the-art example of that shift.

Source: Library Innovation Lab leader talks ‘unbinding the law’ with the Caselaw Access Project – Harvard Law Today

An idea that would make GitHub really interesting for Open Law

Successfully architected solutions do two things: First, they rely on existing open standards rather than reinventing the wheel. They rely on some of the internet’s greatest hits, things like OAuth and REST, and store data in formats born in the internet age, formats like GeoJSON and markdown. No licenses, no SDKs, just data. Second, they’re built as a dumb core with a smart edge. Upgrading a standard is a monumental task. Upgrading a tool is trivial. But more importantly, there’s room at the edge for experimentation, and with readily available libraries, amazing vehicles of empowerment like, something that nobody knew could exist six months ago, suddenly start appearing over night.

Ben Balter :: That’s not how the internet works

First, go read the article, it’s really good and packed full of interesting points. I’ll wait.
Welcome back!
Now imagine the CFR stored as data on GitHub. A GitCFR repository would be open to all and exposed through GitHub’s APIs. Besides using GeoJSON to locate fire hydrants in your neighborhood, you could use GitCFR to find regulations relevant to the manufacture of those fire hydrants. Any API call would return just a specific piece of the regs that could be displayed as the app builder desires.
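
To make that a little more concrete, here is a rough sketch in Python of what fetching one section might involve. It assumes the GitHub REST endpoint for repository contents, which returns a file's content base64-encoded. The owner `example-gpo`, the repository `gitcfr`, and the file layout are all hypothetical, since no such repository exists, and the sample response is made up so the sketch runs without touching the network.

```python
import base64
import json
from urllib.parse import quote

GITHUB_API = "https://api.github.com"


def contents_url(owner, repo, path, ref=None):
    """Build the URL for GitHub's 'get repository content' endpoint."""
    url = f"{GITHUB_API}/repos/{owner}/{repo}/contents/{quote(path)}"
    if ref:
        # Optional branch, tag, or commit to read from.
        url += f"?ref={quote(ref, safe='')}"
    return url


def decode_contents(response_body):
    """Pull the file text out of a contents-endpoint JSON response."""
    payload = json.loads(response_body)
    if payload.get("encoding") != "base64":
        raise ValueError("expected base64-encoded content")
    return base64.b64decode(payload["content"]).decode("utf-8")


# Hypothetical owner, repository, and path: one file per CFR section.
url = contents_url("example-gpo", "gitcfr", "title-00/part-000/section-000.1.md")

# Made-up stand-in for what the API would send back for that file.
sample_response = json.dumps({
    "encoding": "base64",
    "content": base64.b64encode(
        "§ 0.0 Placeholder section text.".encode("utf-8")
    ).decode("ascii"),
})

section_text = decode_contents(sample_response)
print(url)
print(section_text)
```

The app builder gets back just the text of one section and can display it however they like.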

Of course, any of this would require that the GPO set up a system for loading the CFR to GitHub so we don’t have to worry about issues of authenticity. While anyone can grab the bulk XML of the CFR from the GPO’s FDsys website and load it into GitHub, it really needs to be done by the GPO so that developers can rely on the authenticity of the data.


I know, your first question is “What format?”, but that doesn’t really matter. It could be JSON, Asciidoc, Markdown, XML, anything so long as it’s regular and structured.
That said, it would certainly make for an interesting weekend project to throw some section of the CFR into GitHub and see what can be done with existing API calls.
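
A first pass at that weekend project might look something like the sketch below, which splits a chunk of XML into one JSON record per section, ready to be committed as individual files. The tag names (`SECTION`, `SECTNO`, `SUBJECT`, `P`) are my assumption about the bulk XML and should be checked against the actual files, the sample text is placeholder content rather than real regulations, and the `section-….json` naming scheme is just one possibility.

```python
import json
import xml.etree.ElementTree as ET

# Placeholder XML using assumed tag names; not real CFR text.
SAMPLE_XML = """
<PART>
  <SECTION>
    <SECTNO>§ 0.1</SECTNO>
    <SUBJECT>Placeholder scope.</SUBJECT>
    <P>(a) First placeholder paragraph.</P>
    <P>(b) Second placeholder paragraph.</P>
  </SECTION>
  <SECTION>
    <SECTNO>§ 0.2</SECTNO>
    <SUBJECT>Placeholder definitions.</SUBJECT>
    <P>Only placeholder paragraph.</P>
  </SECTION>
</PART>
"""


def split_sections(xml_text):
    """Return one dict per SECTION element found in the XML."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for section in root.iter("SECTION"):
        records.append({
            "number": (section.findtext("SECTNO") or "").strip(),
            "subject": (section.findtext("SUBJECT") or "").strip(),
            # itertext() keeps text nested inside inline child elements.
            "paragraphs": [
                "".join(p.itertext()).strip() for p in section.findall("P")
            ],
        })
    return records


def file_name(record):
    """Hypothetical naming scheme: '§ 0.1' becomes 'section-0.1.json'."""
    return "section-" + record["number"].replace("§", "").strip() + ".json"


records = split_sections(SAMPLE_XML)
files = {
    file_name(r): json.dumps(r, ensure_ascii=False, indent=2) for r in records
}

for name in files:
    print(name)
```

Each entry in `files` is one small, regular, structured document, which is all the format question really asks for.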

An Open Legal Taxonomy: It Just Starts With A List of Words

Lately I’ve been spending a fair amount of time thinking about taxonomies (and probably ontologies too, but I leave that distinction for another day) and their application to the various CALI and free law projects I work on. CALI maintains its own taxonomy for describing legal education materials. We call it the CALI Topic Grids or just Topics. Originally developed as a way to guide our authors as they created CALI Lessons, the Topics were intended to identify what a professor wants to teach today. A Topic is as specific as the point of law to be covered in a particular classroom session. Like so many of our other resources, the Topics are created by faculty teaching in the area covered by the Topics.

Now, reaching beyond Lessons, we use the Topics to describe podcasts, blog posts, crossword puzzles, chapters and sections of books, and most recently, court opinions. Like any good taxonomy, the Topics not only provide a useful way to describe the contents of a resource but also provide a useful finding aid. Indeed, the Topics are best expressed as an outline, instantly recognizable to law students and faculty. The Topics serve as the headings for the outline with various resources gathered beneath, see for example

As the free law movement in the US grows, one of the most pressing questions that arises is how to categorize and describe the immense body of law. Simple full text searching and basic gathering of metadata about the law is easy enough to accomplish, but all that doesn’t tell us what the law is about. How do I know if a court opinion deals with the formation of a contract or some obscure point of criminal procedure? The short answer is that unless you are using a very large commercial legal data service that includes topics and headnotes in its products, you don’t know what the opinion is really about without reading it. Sure, you may have a clue from the search that turned up the document, but that isn’t really a lot of reliable data. You need to have that opinion tagged with a known taxonomy.

Applying a taxonomy to law seems like a daunting task, but not as impossible as it once was. Once upon a time the idea of applying a taxonomy to the law in the US was pretty much a non-starter because you couldn’t get access to the law you wanted to categorize. The good news is that we’ve gotten past much of that. We now have access to sizable portions of the law in the US, at least enough to begin applying a taxonomy. Which leads us to the question of the taxonomy itself. How do we do that?

The worlds of taxonomy and ontology (again, I know the two are different, but I’m lumping them together here for argument’s sake) are awash in a sea of acronyms and competing standards. Most of that stuff is really about the application of a taxonomy or ontology in a given situation, more about the “how” of describing things. That isn’t the main problem. The main issue is words. At their base, taxonomies or ontologies are just lists of words. Carefully chosen, domain-specific words, but still a list of words. And once you have the words, then you can apply them as you wish.

The creation of a list of words intended to describe the law has been done. Some lists are proprietary and unavailable to the free law movement. Other lists may be too general, more for describing broader collections not individual resources. There is one list, the CALI Topics, that describes specific points of law in individual resources. I would recommend using the CALI Topics as a starting point for creating an open legal taxonomy.

The CALI Topics are not an exhaustive list, but with 41 top level topics and 14 published full Topic Grids they are a good start. The Topics can be expanded to include more top level areas of the law, and complete Topic Grids can be added to make the Topics more comprehensive. Because the Topics exist as just lists of words, they can be adapted to just about any taxonomy/ontology framework/specification.
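
To show how little machinery a list of words needs, here is a small Python sketch that stores a few topics as a nested structure, prints them as an outline, and tags a resource with one of them. The topic names are illustrative stand-ins I made up, not the actual CALI Topics, and the `tag` helper and the resource name are hypothetical.

```python
# Illustrative topics only; these are not the actual CALI Topics.
TOPICS = {
    "Contracts": {
        "Formation": ["Offer", "Acceptance"],
        "Remedies": ["Damages"],
    },
    "Criminal Procedure": {
        "Search and Seizure": ["Warrants"],
    },
}


def flatten(tree, prefix=()):
    """Yield every topic as a path tuple, parents before children."""
    if isinstance(tree, dict):
        items = tree.items()
    else:
        # A plain list holds leaf topics with nothing beneath them.
        items = ((name, None) for name in tree)
    for name, children in items:
        path = prefix + (name,)
        yield path
        if children:
            yield from flatten(children, path)


def outline(tree):
    """Render the topics as an indented outline, two spaces per level."""
    return "\n".join(
        "  " * (len(path) - 1) + path[-1] for path in flatten(tree)
    )


# The same words, flattened into strings any tagging system could store.
VALID_TOPICS = {" > ".join(path) for path in flatten(TOPICS)}


def tag(resource, topic):
    """Attach a topic to a resource, rejecting words not in the list."""
    if topic not in VALID_TOPICS:
        raise KeyError(f"unknown topic: {topic}")
    return {"resource": resource, "topic": topic}


print(outline(TOPICS))
tagged = tag("hypothetical-opinion.pdf", "Contracts > Formation > Offer")
```

The nested structure, the outline, and the flat strings are three views of the same words, which is why the list itself matters more than the framework chosen to hold it.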

By using an existing taxonomy as a base, the free law movement can save a considerable amount of time and effort in getting started on the task of describing the law. The resources saved by adopting an existing taxonomy can then be applied to the really hard problem of actually figuring out how to apply specific terms to a given resource. I have some ideas for that too, but I leave those for another post.

So far it seems like most courts are using PDF, at least the opinions are available.

So, I decided to take a look at state court websites to see what opinions are available and in what format. I’m as far as Kentucky and the only real surprise so far is that Alabama wants me to buy a $200 subscription to search what appears to be a home-brewed legal info system. I have no idea what’s up with that. Beyond that everything else is all about the PDFs. The older sections of archives, if they exist, may include HTML and word processor files, but any sort of FTP or other bulk download mechanism is not to be found.

I’ll be pushing forward with this over the next couple of evenings. If you want to follow along, follow this delicious tag:


State of Delaware Offers Authenticated Regulations Online

The Office of the Registrar of Regulations, publishes various documents which conform to accepted standards regarding authenticated digital content. The Office certifies the authenticated documents published on this website and assures the authenticity of the author, source and origin of the authenticated documents when such document bears the following emblem:

DE certification image

via State of Delaware – Delaware Regulations – Home Page.

This is important stuff. By adding this digital signature to the electronic copy of the Delaware Regulations, viewers of the documents can rely on them as accurate and authentic. That means that there is no need to refer to any other copy of the regulations. This pronouncement of authenticity of the digital copy is something that is key to the success of the open access to law movement. Until the bodies that generate the law authenticate the digital copies that are available, using them is risky.

For example, look at this disclaimer from the Delaware Code website:

DISCLAIMER: Please Note: With respect to the Delaware Code documents available from this site or server, neither the State of Delaware nor any of its employees, makes any warranty, express or implied, including the warranties of merchantability and fitness for a particular purpose, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately-owned rights. This information is provided for informational purposes only. Please seek legal counsel for help on interpretation of individual statutes.

In a nutshell it disclaims authenticity in the online version of the code making it effectively useless without reference to the authenticated (presumably) print version. You cannot legally rely on the electronic copy of the Delaware Code.

It is worth noting that starting with the 2009 – 2010 session, volume 77, Laws of Delaware are also authenticated with digital signatures. These are the session laws of the Delaware General Assembly.

I’m not sure if any other states are making authenticated copies of regulations or codes available or if any courts are offering authenticated opinions, but it is something that needs to be done. In order for the open access to law movement to really work, there needs to be open access to authenticated copies of the law. Simply having access to the raw data, or to unauthenticated copies, is a fine first step, but is really only useful for research and information. After all, the law belongs to all of us, and we have a right to open access to authenticated copies of the law.

HT to Law Librarian Blog for the pointer to the beSpecific link to Delaware Regulations.