Demystifying Text Data with the unstructured Python Library | Saeed Esmaili

In the world of data, textual data stands out as being particularly complex. It doesn’t fall into neat rows and columns like numerical data does. As a side project, I’m developing my own personal AI assistant. The objective is to use the data in my notes and documents to answer my questions. The key benefit is that all data processing will occur locally on my computer, so no documents are uploaded to the cloud and my documents remain private.

To handle such unstructured data, I’ve found the unstructured Python library to be extremely useful. It’s a flexible tool that works with various document formats, including Markdown, XML, and HTML documents.

Demystifying Text Data with the unstructured Python Library — https://saeedesmaili.com/demystifying-text-data-with-the-unstructured-python-library/

What If You Could Store a Complete Document in a URL?

Hashify does not solve a problem, it poses a question: what becomes possible when one is able to store entire documents in URLs?

via Hashify.

Try contemplating this for a few minutes. A whole document. In a URL. Not a link to a document, the actual document. Encoded and stored in a URL. Thousands of characters reduced to a handful in a short URL.
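To make the idea concrete, here is a minimal Python sketch of carrying a whole document inside a URL. It is not Hashify's actual implementation: the host name, the function names, and the choice to compress with zlib before Base64-encoding are all assumptions made for illustration. The encoded URL is still long; getting down to a handful of characters would additionally need a link-shortening service, which this sketch leaves out.

```python
import base64
import zlib
from urllib.parse import urlsplit

# Hypothetical host used only for illustration; not Hashify's real endpoint.
BASE_URL = "https://example.invalid/"


def document_to_url(text: str, base_url: str = BASE_URL) -> str:
    """Pack a text document into the fragment part of a URL."""
    # Compress first so repetitive documents produce a shorter payload.
    compressed = zlib.compress(text.encode("utf-8"), 9)
    # URL-safe Base64 uses only A-Z, a-z, 0-9, '-', '_' and '=' padding.
    payload = base64.urlsafe_b64encode(compressed).decode("ascii")
    return f"{base_url}#{payload}"


def url_to_document(url: str) -> str:
    """Recover the original text from a URL built by document_to_url."""
    payload = urlsplit(url).fragment
    compressed = base64.urlsafe_b64decode(payload.encode("ascii"))
    return zlib.decompress(compressed).decode("utf-8")


if __name__ == "__main__":
    note = "# Shopping list\n\n- apples\n- bread\n- coffee\n"
    link = document_to_url(note)
    print(link)
    # The document travels entirely inside the link itself.
    print(url_to_document(link) == note)
```

Because the payload sits in the URL fragment, a browser would not send it to the server, so decoding would have to happen on the client side. Whether that fits a given use depends on the URL length limits of the tools involved.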

Imagine a blog that is just a store of URLs. A messaging system that trades links instead of text. A commenting API that just delivers URLs.

In education there could be student portfolios that are literally a collection of links. Exam submissions that require only sending the teacher a URL.

The possibilities seem endless and exciting. You can give it a try at http://hashify.me/.