Goal: better, more focused search for www.cali.org.
In general, the plan is to scrape the site into a vector database, make the embeddings in that database available to Llama 2, and provide API endpoints to search and find things.
Hints and pointers.
- Llama2-webui – Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere
- FastAPI – web framework for building APIs with Python 3.7+ based on standard Python type hints
- Danswer – Ask Questions in natural language and get Answers backed by private sources. It makes use of:
  - PostgreSQL – a powerful, open source object-relational database system
  - Qdrant – Vector Database for the next generation of AI applications.
  - Typesense – a modern, privacy-friendly, open source search engine built from the ground up using cutting-edge search algorithms that take advantage of the latest advances in hardware capabilities.
The challenge is to wire together these technologies and then figure out how to get it to play nice with Drupal. One possibility is just to build this with an API and then use the API to interact with Drupal. That approach also offers the possibility of allowing the membership to interact with the API too.
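The scrape, embed, and search flow described above can be sketched in a few lines. This is only a toy illustration using the Python standard library: the bag-of-words `embed` function stands in for a real embedding model, `TinyVectorIndex` stands in for a vector database such as Qdrant, and the page paths and texts are made up for the example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TinyVectorIndex:
    """In-memory stand-in for a vector database (hypothetical helper)."""

    def __init__(self):
        self._docs = []  # list of (url, vector) pairs

    def add(self, url: str, text: str) -> None:
        """Store the vector for one scraped page."""
        self._docs.append((url, embed(text)))

    def search(self, query: str, limit: int = 3) -> list:
        """Return the best-matching pages, highest score first."""
        query_vec = embed(query)
        scored = [(cosine(query_vec, vec), url) for url, vec in self._docs]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [{"url": url, "score": round(score, 3)} for score, url in scored[:limit]]


# Made-up pages standing in for scraped site content.
index = TinyVectorIndex()
index.add("/lessons/contracts", "Interactive lessons on contracts and offer and acceptance")
index.add("/lessons/torts", "Lessons covering torts negligence and duty of care")
index.add("/about", "About the organization and its membership")

results = index.search("negligence lessons")
```

In a real build, the `search` method is roughly what an API endpoint (for example, one written with FastAPI) would call. Because it returns plain dictionaries, either Drupal or the membership could consume the same JSON.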
Here’s a list of resources to get you started with DDEV and Drupal 9. As with setting up any new development (or production) environment, there are a lot of moving parts and it takes some time to get it all right. This list includes “HowTo” articles, tools, and documentation to get it all set up.
- After running ddev config and before running ddev start for the first time, use your favorite editor to edit .ddev/config.yaml to the following:
This will set up DDEV with MySQL 8, PHP 8.1, Drupal 9, and Apache. This matches the dev environment that CALI is using for D9. Check the DDEV docs for more possibilities.
- The DDEV install includes the latest phpMyAdmin to help with MySQL admin. It’s available in a local browser at <projectName>.ddev.site:8036. Use phpMyAdmin to load a dump of the D9 dev database.
- Once WSL2 is set up, use Ubuntu 20.04 to host DDEV.
- DDEV includes git, so that’s a good way to manage Drupal. In the CALI world, use git to grab a copy of the current D9 code base.
There’s plenty of articles out there explaining what the changes are (https://blog.chromium.org/2020/02/samesite-cookie-changes-in-february.html), why they’ve been done (https://www.troyhunt.com/promiscuous-cookies-and-their-impending-death-via-the-samesite-policy) and how to ‘theoretically’ fix them with simple code examples, but we haven’t stumbled upon many articles explaining ‘practical’ solutions to apply to a Drupal site to actually fix the issues that arise due to the stricter cookie policies implemented since the Chrome 80 release.
Source: How to fix the Chrome 80 cookies issue in Drupal
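For context, the stricter policy comes down to two cookie attributes: since Chrome 80, a cookie that must be sent in cross-site requests needs `SameSite=None` and also has to be marked `Secure`. The sketch below only illustrates what such a cookie looks like at the HTTP level, using Python's standard library (the `samesite` attribute needs Python 3.8 or later). It is not the Drupal fix from the linked article, and the cookie name and value are made up.

```python
from http.cookies import SimpleCookie


def cross_site_cookie(name: str, value: str) -> str:
    """Build a Set-Cookie value that stricter browsers accept cross-site."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    cookie[name]["samesite"] = "None"  # opt in to cross-site delivery
    cookie[name]["secure"] = True      # required alongside SameSite=None
    return cookie[name].OutputString()


# Hypothetical session cookie name and value, for illustration only.
header_value = cross_site_cookie("SESSexample", "abc123")
```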
What should a Drupal website for distance learning be like? We will talk about technologies for students’ assignments, collecting statistics on the work done, functionality, Drupal modules, and distributions for its work.
Source: Drupal for e-Learning websites
Here’s a list of JS tour libraries that are open source and currently maintained. Tour libraries give site designers a way to create guides that show off the features of a website through a walkthrough of pop-up dialog boxes. They’re really handy for complex sites.
H5P empowers everyone to create, share and reuse interactive content – all you need is a web browser and a web site that supports H5P.
Source: Set up H5P for Drupal
Built on Drupal, the Getty Institute’s Getty Scholars’ Workspace provides a platform for art historians, and researchers in similar fields, to work collaboratively on multiple projects without having to use several different platforms.
Source: A Drupal-based platform for collaborative research | Opensource.com (https://opensource.com/education/16/3/getty-scholars-workspace)
The platform includes scholar-friendly features like importing Zotero files to create bibliographies and collaboration tools like forums and shared documents. Of course it is Drupal, so it’ll take some configuration to get it going. Worth checking out.
Day 1 of DrupalCamp Atlanta was a short day, just Friday afternoon, but there were plenty of excellent sessions on the agenda. I actually took a fair number of notes and picked up several ideas for making the Drupal sites I run, run better.
I started the afternoon with Building a Better Resource: Improving a Drupal Scholarly Journal Platform. This was a solid presentation by Dan Hansen and Jesse Karlsberg that covered a range of topics, from migrating a legacy Drupal 6 site to Drupal 7 to capturing a scholarly journal workflow with the Maestro module. Of particular interest is some of the custom module work being done for the Southern Spaces site. This includes a text section module that allows content creators to add section-level navigation points to lengthy journal articles, and juicebox inline, which adds a WP-style shortcode for creating Juicebox Galleries easily. Work is also being done to create a distribution for scholarly journals that would be useful for law reviews. Finally, it’s worth noting that authors submit articles to the journal as word processor files, not through the WYSIWYG editor.
Next up was Growth Hacking with Content, Marketing Automation & Drupal presented by Shellie Hutchens, the Director of Marketing at Mediacurrent. The focus here was on marketing automation and integrating Drupal sites with marketing platforms. The idea is to shape visitor experience on your site to engage the viewer and slowly gather information that can be used for highly targeted marketing whether it be sales, brand visibility, or higher levels of engagement. Mediacurrent supports the development of a number of marketing automation modules that tie Drupal to many popular marketing platforms.
to be continued…
Solr is an open source search server based on Apache Lucene. Lucene provides Java-based indexing and a search library, and Solr extends it to provide a variety of APIs and search functionality, including faceted search and hit highlighting, and handles Word and PDF document searching. It also provides caching and replication, making it scalable, robust, and very fast.
Happily, Solr also plays nicely with Drupal, the popular CMS platform. If you want fast and effective search on your Drupal site, installing Solr is a straightforward way of getting it quickly. Until this month, the Apachesolr Drupal module didn’t support the current Solr 4.x schemas, but as of the very latest version of the Apachesolr module, 7.x-1.2, you can now set up Solr 4.x on your Drupal 7 site. This tutorial assumes that you’re running Drupal 7.22, the most up-to-date version, under Apache on a Linux box.
via How to set up Solr 4.2 on Drupal 7 with Apache.
If you’re running Drupal with a lot of nodes to index and you’re not using Solr, you’re missing out on a lot. Though it takes a bit of config to set up, using Solr to index and search your Drupal site is much better than the stock Drupal search.
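To give a feel for the faceted search and hit highlighting mentioned above, here is a minimal sketch that builds a Solr select URL with both features turned on. The `q`, `wt`, `facet`, `facet.field`, `hl`, and `hl.fl` parameters are standard Solr query parameters. The host, the core name `drupal`, and the field names `bundle` and `content` are assumptions for illustration and would need to match your own Solr setup and schema.

```python
from urllib.parse import urlencode


def solr_select_url(base_url: str, core: str, text: str,
                    facet_field: str, highlight_field: str) -> str:
    """Build a Solr /select URL with faceting and highlighting enabled."""
    params = [
        ("q", text),                  # the search terms
        ("wt", "json"),               # ask for a JSON response
        ("facet", "true"),            # turn on faceting
        ("facet.field", facet_field), # field to count facet values for
        ("hl", "true"),               # turn on hit highlighting
        ("hl.fl", highlight_field),   # field to generate snippets from
    ]
    return f"{base_url.rstrip('/')}/solr/{core}/select?{urlencode(params)}"


# Hypothetical local Solr instance, core name, and field names.
url = solr_select_url("http://localhost:8983", "drupal",
                      "civil procedure", "bundle", "content")
```

On a Drupal site the Solr module builds queries like this for you, so the sketch is mainly useful for poking at the index directly while debugging.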