Notes on better search 8/18/2023

Goal: better, more focused search for www.cali.org.

In general, the plan is to scrape the site into a vector database, make the embeddings stored in the vector db available to Llama 2, and provide API endpoints to search and find things.
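To make the retrieval step of that plan concrete, here is a minimal sketch in plain Python. It is only an illustration: the pages in PAGES are hypothetical, the bag-of-words embed() function stands in for a real embedding model, and the in-memory INDEX stands in for a vector database such as Qdrant.

```python
import math
import re
from collections import Counter
from typing import List, Tuple

# Hypothetical stand-in for scraped pages: URL path -> page text.
PAGES = {
    "/lessons/contracts": "Interactive lessons on contract law and remedies",
    "/lessons/torts": "Lessons covering negligence and tort liability",
    "/about": "About the organization and its member law schools",
}


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# The "vector database": every page embedded once, up front.
INDEX = {url: embed(text) for url, text in PAGES.items()}


def search(query: str, top_k: int = 2) -> List[Tuple[str, float]]:
    """Return up to top_k (url, score) pairs, best match first."""
    q = embed(query)
    scored = [(url, cosine(q, vec)) for url, vec in INDEX.items()]
    scored = [(url, score) for url, score in scored if score > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]
```

Calling search("contract law") ranks /lessons/contracts first because it shares the most terms with the query. In the real setup the embedding model and the vector store would replace embed() and INDEX, while the shape of search() would stay roughly the same.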

Hints and pointers.

  • Llama2-webui – Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere
  • FastAPI – web framework for building APIs with Python 3.7+ based on standard Python type hints
  • Danswer – Ask Questions in natural language and get Answers backed by private sources. It makes use of:
    • PostgreSQL – a powerful, open source object-relational database system
    • QDrant – Vector Database for the next generation of AI applications.
    • Typesense – a modern, privacy-friendly, open source search engine built from the ground up using cutting-edge search algorithms, that take advantage of the latest advances in hardware capabilities.

The challenge is to wire together these technologies and then figure out how to get it to play nice with Drupal. One possibility is just to build this with an API and then use the API to interact with Drupal. That approach also offers the possibility of allowing the membership to interact with the API too.
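As a rough sketch of the "build it as an API" idea, the block below exposes a single search endpoint that returns JSON, which Drupal (or a member's own tooling) could call over HTTP. It uses a standard-library WSGI app as a stand-in for FastAPI so that it runs without extra packages. The /search path, the q parameter, and the DOCS data are all assumptions made for illustration.

```python
import json
from urllib.parse import parse_qs

# Hypothetical stand-in for the real search backend.
DOCS = {
    "/lessons/contracts": "Interactive lessons on contract law",
    "/lessons/torts": "Lessons covering negligence and tort liability",
}


def find(query: str) -> list:
    """Naive substring match; a real backend would query the vector db."""
    q = query.lower()
    return [url for url, text in DOCS.items() if q and q in text.lower()]


def app(environ, start_response):
    """WSGI app answering GET /search?q=... with a JSON body."""
    headers = [("Content-Type", "application/json")]
    if environ.get("PATH_INFO") != "/search":
        start_response("404 Not Found", headers)
        return [json.dumps({"error": "not found"}).encode("utf-8")]
    params = parse_qs(environ.get("QUERY_STRING", ""))
    query = params.get("q", [""])[0]
    body = json.dumps({"query": query, "results": find(query)}).encode("utf-8")
    start_response("200 OK", headers + [("Content-Length", str(len(body)))])
    return [body]


def call(path: str, query_string: str = ""):
    """Invoke the app in-process and return (status, parsed JSON)."""
    captured = {}

    def start_response(status, response_headers):
        captured["status"] = status

    environ = {
        "REQUEST_METHOD": "GET",
        "PATH_INFO": path,
        "QUERY_STRING": query_string,
    }
    chunks = app(environ, start_response)
    return captured["status"], json.loads(b"".join(chunks))
```

For local experiments the app can be served with wsgiref.simple_server.make_server from the standard library. Keeping the search logic behind a plain JSON endpoint means the Drupal side only needs an HTTP client, whatever framework ends up hosting the API.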

Resources on using ddev for Drupal 9 development, Windows edition

Resources

Here’s a list of resources to get you started with DDEV and Drupal 9. As with setting up any new development (or production) environment, there are a lot of moving parts and it takes some time to get it all right. This list includes “HowTo” articles, tools, and documentation to get it all set up.

Notes

— After running ddev config and before running ddev start for the first time, use your favorite editor to edit .ddev/config.yaml so that it reads as follows:

name: d9-dev
type: drupal9
docroot: web
php_version: "8.1"
webserver_type: apache-fpm
router_http_port: "80"
router_https_port: "443"
xdebug_enabled: false
additional_hostnames: []
additional_fqdns: []
mariadb_version: ""
mysql_version: "8.0"
nfs_mount_enabled: false
mutagen_enabled: false
use_dns_when_possible: true
composer_version: ""
web_environment: []

This will set up DDEV with MySQL 8, PHP 8.1, Drupal 9, and Apache. This matches the dev environment that CALI is using for D9. Check the DDEV docs for more possibilities.

— The DDEV install includes the latest phpMyAdmin to help with MySQL administration. It’s available in a local browser at <projectName>.ddev.site:8036. Use phpMyAdmin to load a dump of the D9 dev database.

— Once WSL2 is set up, use Ubuntu 20.04 to host DDEV.

— DDEV includes git so that’s a good way to manage Drupal. In the CALI world use git to grab a copy of the current D9 code base.

How to fix the Chrome 80 cookies issue in Drupal

There’s plenty of articles out there explaining what the changes are (https://blog.chromium.org/2020/02/samesite-cookie-changes-in-february.html), why they’ve been done (https://www.troyhunt.com/promiscuous-cookies-and-their-impending-death-via-the-samesite-policy) and how to ‘theoretically’ fix them with simple code examples, but we haven’t stumbled upon many articles explaining ‘practical’ solutions to apply to a Drupal site to actually fix the issues that arise due to the stricter cookie policies implemented since the Chrome 80 release.

Source: How to fix the Chrome 80 cookies issue in Drupal
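For context, the Chrome 80 change treats cookies that carry no SameSite attribute as SameSite=Lax, and it only accepts SameSite=None when the cookie is also marked Secure. The sketch below is not the Drupal fix from the linked article. It only shows, using Python's standard library (3.8 or later), what a Set-Cookie header looks like once a cookie is explicitly marked for cross-site use. The cookie name and value are hypothetical.

```python
from http.cookies import SimpleCookie


def build_cross_site_cookie(name: str, value: str) -> str:
    """Return a Set-Cookie header value that opts in to cross-site use."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    cookie[name]["httponly"] = True
    # SameSite=None is only honored when Secure is set as well.
    cookie[name]["secure"] = True
    cookie[name]["samesite"] = "None"
    return cookie[name].OutputString()


# Hypothetical session cookie name and value, for illustration only.
header = build_cross_site_cookie("SESSexample", "abc123")
```

Whatever the stack, the practical fix comes down to making sure cookies that must travel in cross-site requests carry both attributes, and that the site is served over HTTPS so the Secure flag can take effect.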

Drupal Distribution: Opigno LMS

It allows to very easily create engaging learning paths, assess the knowledge of students, employees or partners, and monitor their achievements thanks to the reporting dashboards. It offers innovative features like adaptive learning depending on the user’s results, automatic skill management, a mobile application, and much more…

  • manage training paths organized in courses, modules, and activities
  • configure adaptive learning paths
  • manage and ensure skill acquisition by students
  • assess students thanks to varied quizzes
  • manage blended learning by combining online modules with in-house sessions and virtual classrooms
  • award certificates to successful students
  • sell your trainings online
  • facilitate interactions thanks to live meetings, forums and chats
  • and much more!

Opigno LMS is fully compliant with SCORM (1.2 and 2004 v3) and Tin Can (xAPI).

It integrates the innovative H5P technology, making it possible to create rich interactive training content.

Source: Opigno LMS

Some JavaScript Tour Libraries

Here’s a list of JS tour libraries that are open source and currently maintained. Tour libraries give site designers a way to create guides that show off the features of a website through a walk-through of pop-up dialog boxes. They’re really handy for complex sites.

Getty Scholars’ Workspace: A Drupal-based platform for collaborative research | Opensource.com

Built on Drupal, the Getty Institute’s Getty Scholars’ Workspace provides a platform for art historians, and researchers in similar fields, to work collaboratively on multiple projects without having to use several different platforms.

A Drupal-based platform for collaborative research | Opensource.com https://opensource.com/education/16/3/getty-scholars-workspace

The platform includes scholar-friendly features like importing Zotero files to create bibliographies and collaboration tools like forums and shared documents. Of course it is Drupal, so it’ll take some configuration to get it going. Worth checking out.

Notes from #DCATL Day 1, part 1

Day 1 of DrupalCamp Atlanta was a short day, just Friday afternoon, but there were plenty of excellent sessions on the agenda. I actually took a fair number of notes and picked up several ideas for making the Drupal sites I run, run better.

I started the afternoon with Building a Better Resource: Improving a Drupal Scholarly Journal Platform. This was a solid presentation by Dan Hansen and Jesse Karlsberg that covered a range of topics, from migrating a legacy Drupal 6 site to Drupal 7 to capturing a scholarly journal workflow with the Maestro module. Of particular interest is some of the custom module work being done for the Southern Spaces site. This includes a text section module that lets content creators add section-level navigation points to lengthy journal articles, and juicebox inline, which adds a WP-style shortcode for creating Juicebox galleries easily. Work is also being done to create a distribution for scholarly journals that would be useful for law reviews. Finally, it’s worth noting that authors submit articles to the journal as word processor files, not through the WYSIWYG editor.

Next up was Growth Hacking with Content, Marketing Automation & Drupal presented by Shellie Hutchens, the Director of Marketing at Mediacurrent. The focus here was on marketing automation and integrating Drupal sites with marketing platforms. The idea is to shape visitor experience on your site to engage the viewer and slowly gather information that can be used for highly targeted marketing whether it be sales, brand visibility, or higher levels of engagement. Mediacurrent supports the development of a number of marketing automation modules that tie Drupal to many popular marketing platforms.

to be continued…

 

Setting Up Apache Solr 4.2 and Drupal 7 For Better Search

Solr is an open source search server based on Apache Lucene. Lucene provides Java-based indexing and a search library, and Solr extends it to provide a variety of APIs and search functionality, including faceted search and hit highlighting, and handles Word and PDF document searching. It also provides caching and replication, making it scalable, robust, and very fast.

Happily, Solr also plays nicely with Drupal, the popular CMS platform. If you want fast and effective search on your Drupal site, installing Solr is a straightforward way of getting it quickly. Until this month, the Apachesolr Drupal module didn’t support the current Solr 4.x schemas, but as of the very latest version of the Apachesolr module, 7.x-1.2, you can now set up Solr 4.x on your Drupal 7 site. This tutorial assumes that you’re running Drupal 7.22, the most up-to-date version, under Apache on a Linux box.

via How to set up Solr 4.2 on Drupal 7 with Apache.

If you’re running Drupal with a lot of nodes to index and you’re not using Solr, you’re missing out on a lot. Though it takes a bit of config to set up, using Solr to index and search your Drupal site is much better than the stock Drupal search.
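To give a feel for the faceted search and hit highlighting mentioned above, here is a small sketch that builds a request URL for Solr's select handler using its standard query parameters (q, rows, wt, facet, facet.field, hl, hl.fl). The base URL and the field names bundle and content are assumptions for illustration; the real values depend on how Solr and the Drupal module are configured.

```python
from urllib.parse import urlencode


def solr_select_url(base_url: str, query: str, facet_field: str,
                    highlight_field: str, rows: int = 10) -> str:
    """Build a Solr select URL with faceting and highlighting enabled."""
    params = [
        ("q", query),
        ("rows", rows),
        ("wt", "json"),                # ask for a JSON response
        ("facet", "true"),             # turn on faceted search
        ("facet.field", facet_field),  # field to count facet values on
        ("hl", "true"),                # turn on hit highlighting
        ("hl.fl", highlight_field),    # field to pull highlights from
    ]
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


# Hypothetical Solr location and field names.
url = solr_select_url("http://localhost:8983/solr/", "copyright",
                      facet_field="bundle", highlight_field="content")
```

Pasting a URL like this into a browser is a quick way to confirm that Solr is indexing the site before wiring it into Drupal's search pages.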