Free Law Reporter – My Roadmap

Now that the Free Law Reporter (FLR) has had a few weeks to settle into processing Public.Resource.Org’s Report of Current Opinions (RECOP) XML feeds into valid HTML and producing working .epub ebooks (I even released the code on GitHub), I thought it might be time to lay out where I see FLR heading in the coming months. Right now (5/17/11) you can visit FLR, search through slip opinions from over 60 state and federal jurisdictions, view the documents in HTML, download complete FLR volumes by jurisdiction as ebooks, and download all documents in a search result as an ebook.

Using this as a foundation, I will be adding several additional features to FLR over the coming months:

  • Advanced search and analysis features to tap the power of Solr
  • Create a library to provide a single point of access to the FLR volumes
  • Select specific documents from search results or browsing to add to custom ebooks
  • Increase the size of the corpus that makes up the Free Law Reporter
  • Edit selected documents from search results or browsing to create truly custom ebooks
  • Provide tools for the community to add value to the Free Law Reporter
  • Citations

The milestones that follow will occur in roughly the order they are listed, but there is no set timetable for implementing these features. This is due primarily to the other development projects I have on my plate, including the main CALI website, Classcaster, eLangdell, Legal Education Commons, and the CALIcon website.

As you will note, a number of the features I have in mind for the FLR will require significant community involvement to really materialize. In this context, I see the community as law librarians, law faculty, and law school technologists with an interest in seeing open and unencumbered access to legal resources for everyone.

 

Advanced search and analysis features to tap the power of Solr – The index, analysis, and search features of the FLR are powered by Apache Solr. Right now I am exposing only a minimum of its potential to provide very basic searching of FLR documents. An advanced search option will provide Boolean operators, phrase and term proximity queries, sub queries, and date range queries. The facet search and “More like this” features of Solr will be exposed to provide drill-down capabilities and access to related documents. All of these will provide a much richer and more robust search environment for locating documents.

All of this development will be done in the open with the hope that the community will get involved in shaping how documents are indexed and analyzed, search is done, and results are displayed. Because Solr is an open source project, we have access to the complete inner workings of the engine. Imagine being able to tune Google specifically for legal resources or adjust WestLawNext to work better for law students and faculty. Those are the sorts of things that can be done with FLR because we have control over the system.
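
As a rough illustration of the advanced search features described above, here is a minimal sketch of the request parameters such a search might send to Solr. The parameter names (q, fq, facet, facet.field, mlt, mlt.fl) are standard Solr parameters, but the field names (text, date_filed, jurisdiction) and the helper function are assumptions made for the example and are not taken from the FLR schema or code.

```python
from urllib.parse import urlencode


def build_solr_params(all_terms, near_phrase=None, slop=5,
                      date_from=None, date_to=None,
                      facet_field="jurisdiction", rows=10):
    """Return Solr request parameters for an advanced search.

    The field names "text", "date_filed", and "jurisdiction" are
    hypothetical; a real schema may use different names.
    """
    # Boolean AND across every required term.
    clauses = [f"text:{term}" for term in all_terms]
    if near_phrase:
        # Proximity query: the words must appear within `slop` positions.
        clauses.append(f'text:"{near_phrase}"~{slop}')
    params = [
        ("q", " AND ".join(clauses)),
        ("rows", str(rows)),
        ("wt", "json"),
        # Faceting returns per-value counts that can drive drill-down links.
        ("facet", "true"),
        ("facet.field", facet_field),
        # The MoreLikeThis component suggests related documents.
        ("mlt", "true"),
        ("mlt.fl", "text"),
    ]
    if date_from and date_to:
        # Filter query restricting results to an inclusive date range.
        params.append(
            ("fq", f"date_filed:[{date_from}T00:00:00Z TO {date_to}T23:59:59Z]")
        )
    return params


params = build_solr_params(
    ["negligence", "employer"],
    near_phrase="duty care",
    date_from="2011-01-01",
    date_to="2011-05-17",
)
query_string = urlencode(params)
```

The sketch does not escape special characters in user input, which a real search form would need to do before building the query.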

Create a library to provide a single point of access to the FLR volumes – With dozens of volumes being added to the FLR every week, finding things becomes an issue. Right now a search of the corpus returns links to the FLR volumes that contain specific opinions, which allows for the download of ebooks. That isn’t really helpful if all you want is to download the latest Iowa volume to your ereader. I will add a central library mechanism to track all of those volumes as they are created weekly. This work will use the Open Publication Distribution System (OPDS) Catalog specification to generate feeds that can be consumed by various ereaders and will help locate and track FLR volumes. This OPDS feed will act as an interface that will allow community access to the FLR library. Using the OPDS feed, law libraries could add the Free Law Reporter to their local collections.
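
To give a sense of what an OPDS catalog looks like, here is a minimal sketch that builds one as an Atom feed with an acquisition link for each volume. The Atom namespace, the acquisition link relation, and the EPUB media type come from the Atom and OPDS specifications. The function name, the URLs, and the volume data are made up for illustration, and the output has not been validated against the full OPDS specification.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ACQUISITION = "http://opds-spec.org/acquisition"

# Serialize Atom elements without a namespace prefix.
ET.register_namespace("", ATOM)


def atom(tag):
    """Return the namespace-qualified name for an Atom element."""
    return "{%s}%s" % (ATOM, tag)


def build_catalog(title, feed_id, updated, volumes):
    """Build an OPDS-style acquisition feed listing ebook volumes."""
    feed = ET.Element(atom("feed"))
    ET.SubElement(feed, atom("id")).text = feed_id
    ET.SubElement(feed, atom("title")).text = title
    ET.SubElement(feed, atom("updated")).text = updated
    for vol in volumes:
        entry = ET.SubElement(feed, atom("entry"))
        ET.SubElement(entry, atom("id")).text = vol["id"]
        ET.SubElement(entry, atom("title")).text = vol["title"]
        ET.SubElement(entry, atom("updated")).text = vol["updated"]
        # The acquisition link tells an ereader where to fetch the ebook.
        ET.SubElement(entry, atom("link"), {
            "rel": ACQUISITION,
            "type": "application/epub+zip",
            "href": vol["epub_url"],
        })
    return ET.tostring(feed, encoding="unicode")


# Hypothetical volume data; the URLs are placeholders, not real FLR links.
catalog_xml = build_catalog(
    title="Example Reporter Volumes",
    feed_id="urn:example:catalog",
    updated="2011-05-17T00:00:00Z",
    volumes=[{
        "id": "urn:example:iowa:2011-20",
        "title": "Iowa, week 20 of 2011",
        "updated": "2011-05-17T00:00:00Z",
        "epub_url": "https://example.org/volumes/iowa-2011-20.epub",
    }],
)
```

An ereader or library system that understands OPDS could poll a feed like this and offer each new volume for download as it appears.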

Select specific documents from search results or browsing to add to custom ebooks – Right now you can save the complete results from an FLR search as an ebook. While useful, this approach has drawbacks, including the fact that not all of the documents returned by your search will ultimately be relevant to what you are looking for. I will add the ability to review documents returned in a search and select which documents get included in the ebook. That means that the custom FLR volumes you create will be more relevant to your needs. These custom volumes will be assigned a URL and saved in the FLR library so that they can be shared and downloaded again in the future.

This custom ebook feature will provide a way for faculty and law librarians to assemble custom volumes of FLR documents that can be shared with students or added to a law library’s local collection. With some work, the community can create custom law reporters that are focused on a single topic.

Increase the size of the corpus that makes up the Free Law Reporter – Right now the FLR contains just the slip opinions from Carl Malamud’s RECOP feeds. That means it covers documents issued by over 60 state and federal jurisdictions since about January 1, 2011. This is a very limited scope for a project with this much potential. To expand the scope of the corpus, I plan on adding the approximately 1,000,000 other federal court opinions available on the Public.Resource.Org website. This will push the depth of the FLR collection to include many of the opinions in the Federal Reporter series. I will also add various other sets of documents that are available as HTML (or in XML that can be transformed) such as the U.S. Code to the collection. The addition of these documents will provide greater context for results found through the FLR search interface and more material that can be used to create custom ebooks.

While U.S. Federal material is relatively easy to obtain and incorporate into the FLR, state level material is more difficult to locate and add. Certainly the RECOP feeds provide good access to state appellate court material from January 1, 2011 forward, but the backfile of state court opinions is harder to come by. Likewise, state codes and statutes are often difficult to locate and are usually not available in a downloadable format. Community involvement will be the key to building out the state collections in the FLR. Law librarians are an excellent resource for locating state legal materials, and I would encourage them to work with state courts and governments to obtain access to downloadable opinions and codes that can be incorporated into the FLR.

Edit selected documents from search results or browsing to create truly custom ebooks – It follows that once you can select specific documents for inclusion in custom FLR volumes, you will want to be able to edit those documents to highlight specific points and/or add commentary. Because the source documents for FLR volumes are HTML, I will be able to provide this feature as part of a process that will allow you to search or browse for documents, select those documents for inclusion in a custom ebook, edit those documents as you see fit, and add your own chapters to the ebook. Once the selection and editing are complete, you will be able to save the volume and you will be provided with a URL for the volume that you can share or use to download the ebook.

As with the simple selection and publishing features, this feature will provide a way for faculty and law librarians to assemble custom volumes of FLR documents that can be shared with students or added to a law library’s local collection. With some work, the community can create things like annotated law reporters and statute books. Law faculty can create customized course materials for their students.

Provide tools for the community to add value to the Free Law Reporter – One of the major feature sets I plan to add to the FLR is a set of tools that will allow the community to add value to the collections, for example tools for adding head notes to a document, tagging a document, and adding commentary to a document. These will provide the community with the capability to enhance and extend the value of the FLR. We all need to get involved in making the Free Law Reporter into a resource that is of great value to students, researchers, and the public, a resource that provides free and unencumbered access to legal materials to those who need to learn about the law.

Citations – I have already been asked several times about how one would cite to the Free Law Reporter. My answer has been that right now I would not cite to the Free Law Reporter. The FLR currently only contains slip opinions that are available more easily elsewhere, and any citation should be to the more easily available and recognizable source. I do realize that this is not a satisfactory answer. As the FLR grows it will need to be citable, and that is a very complicated problem. I have included unique identifiers and lots of metadata in the documents added to the FLR so far. What I would like to see happen is that we talk about this and take the opportunity presented by a new law reporter published in a new medium to figure out the best way to create citations for the FLR. I would suggest using the FLR discussion forum for this.

 

This is where I see the Free Law Reporter headed over the coming months. The FLR project is important because it is intended to create a resource that provides free and unencumbered access to legal materials to those who need to learn about the law. It is important because it will provide a way for a community of law librarians and faculty to come together to create this valuable resource.

Disclaimer – The Free Law Reporter is a CALI project. This roadmap is where I would like to see the FLR go and it is not intended to commit CALI to any particular direction on the project.

 

CourtListener.com – US Fed Appellate Court Alerts and Yet Another Legal Search Engine

A mention in the BeSpecific blog tipped me off to an interesting project called CourtListener.com. From the about page:

The goal of the site is to create a free and competitive real time alert tool for the U.S. judicial system.

At present, the site has daily information regarding all precedential opinions issued by the 13 federal circuit courts and the Supreme Court of the United States. Each day, we also have the non-precedential opinions from all of the Circuit courts except the D.C. Circuit. This means that by 5:10pm PST, the database will be updated with the opinions of the day, with custom alerts going out shortly thereafter.

The site was created by Michael Lissner as a master’s thesis project at the UC Berkeley School of Information.

A quick perusal of the site and its associated documents tells us that Michael is using a scraping technique to visit court websites looking for recently released opinions. Once found, the opinions are retrieved, converted from PDF to text, indexed, and stored. Atom RSS feeds are then generated to provide current alerts.
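
The first step of that pipeline, spotting newly posted opinions on a court web page, can be sketched in a few lines. This is not CourtListener's code. It is a minimal illustration using only the Python standard library, and the class name, function name, sample page, case names, and court.example URLs are all made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of links that point at PDF files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".pdf"):
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


def find_new_opinions(page_html, base_url, seen):
    """Return PDF links not seen on earlier visits and remember them."""
    parser = PdfLinkParser(base_url)
    parser.feed(page_html)
    new = [url for url in parser.links if url not in seen]
    seen.update(new)
    return new


# Made-up sample page standing in for a court's "recent opinions" listing.
SAMPLE_PAGE = """
<ul>
  <li><a href="/opinions/11-1001.pdf">Doe v. Roe</a></li>
  <li><a href="/opinions/11-1002.pdf">Smith v. Jones</a></li>
  <li><a href="/about.html">About the court</a></li>
</ul>
"""

seen_urls = {"https://court.example/opinions/11-1001.pdf"}
new_urls = find_new_opinions(
    SAMPLE_PAGE, "https://court.example/recent", seen_urls
)
```

In a real scraper each new URL would then be downloaded, converted to text, indexed, and added to the alert feeds, with each court site likely needing its own handling.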

The site is powered by Python using the Django web framework and is open source, so you can download the code. The backend database is MySQL and search is handled by Sphinx. The conversion from PDF appears to produce plain text. If you register on the site you can create custom alerts based on saved searches.

All in all, CourtListener.com provides another good source for current Federal appellate court opinions. Be sure to check the coverage page to see how far back the site goes for each court. Perhaps the future will bring an expansion to more courts and jurisdictions.

Beware of Openwashing as “Open” Becomes the New Black

The old “open vs. proprietary” debate is over and open won. As IT infrastructure moves to the cloud, openness is not just a priority for source code but for standards and APIs as well. Almost every vendor in the IT market now wants to position its products as “open.” Vendors that don’t have an open source product instead emphasize having a product that uses “open standards” or has an “open API.”

“Openwashing” is a term derived from “greenwashing” to refer to dubious vendor claims about openness. Openwashing brings the old “open vs. proprietary” debate back into play – not as “which one is better” but as “which one is which?”

What does it mean to be open? And how can you tell if a product is really “open”?

via How to Spot Openwashing.

The article goes on to recommend paying close attention to licensing, the community, and a vendor’s proprietary products to see if their software and APIs are truly open source or just wrapped in an open blanket to take advantage of the latest buzzwords.

Over the years I’ve seen a number of instances of openwashing, most notably with companies who built commercial products around a core of open source projects. The companies would make big noise about being open source and such, but community releases would just be a mash-up of other open source projects with the glue and features that comprised the real product they wanted to sell held back as proprietary.

So, buyer/developer beware. That open source based product that looks so cool may really just be a mirage.