Drupal 11.0 will require PHP 8.3 and MySQL 8.0

Drupal 11 development has reached a point where the system requirements are being raised in the development branch. To prepare core developers for this and to inform the community at large, we are announcing the following requirements for Drupal 11. … Drupal 11 will require PHP 8.3 and older versions of PHP are not supported. … The minimum database requirements for backends supported by Drupal 11 are MySQL 8.0 …

Source: Drupal 11.0 will require PHP 8.3 and MySQL 8.0

StarCoder2 – open source code completion models

StarCoder2 is a family of code generation models (3B, 7B, and 15B), trained on 600+ programming languages from The Stack v2 and some natural language text such as Wikipedia, Arxiv, and GitHub issues. The models use Grouped Query Attention, a context window of 16,384 tokens, with sliding window attention of 4,096 tokens. The 3B & 7B models were trained on 3+ trillion tokens, while the 15B was trained on 4+ trillion tokens. For more details check out the paper.

StarCoder2 @ Github

StarCoder2 is a family of open LLMs for code and comes in 3 different sizes with 3B, 7B and 15B parameters. The flagship StarCoder2-15B model is trained on over 4 trillion tokens and 600+ programming languages from The Stack v2. All models use Grouped Query Attention, a context window of 16,384 tokens with a sliding window attention of 4,096 tokens, and were trained using the Fill-in-the-Middle objective.

StarCoder2 offers three model sizes: a 3 billion-parameter model trained by ServiceNow, a 7 billion-parameter model trained by Hugging Face, and a 15 billion-parameter model trained by NVIDIA using NVIDIA NeMo on NVIDIA accelerated infrastructure:

StarCoder2 @ Hugging Face

 

Drupal 11 is now open for development, Drupal 10.3.x is branched

Starting today, the Drupal 11.x branch is used for building the next major Drupal version, Drupal 11. This means that major version specific changes can now happen on the Drupal 11.x branch. This includes dependency and requirements updates and removal of deprecated API and extensions. Details are available in the allowed changes during Drupal core release cycle document. Drupal 11 is planned to be released either on the week of June 17, week of July 29 or week of December 9, 2024, depending on when beta requirements are completed.

Source: Drupal 11 is now open for development, Drupal 10.3.x is branched

Mountpoint for Amazon S3 – Generally Available and Ready for Production Workloads | AWS News Blog

Mountpoint for Amazon S3 is an open source file client that makes it easy for your file-aware Linux applications to connect directly to Amazon Simple Storage Service (Amazon S3) buckets. Announced earlier this year as an alpha release, it is now generally available and ready for production use on your large-scale read-heavy applications: data lakes, machine learning training, image rendering, autonomous vehicle simulation, ETL, and more. It supports file-based workloads that perform sequential and random reads, sequential (append only) writes, and that don’t need full POSIX semantics.

Mountpoint for Amazon S3 – Generally Available and Ready for Production Workloads

 

Notes on better search 8/18/2023

Goal: better, more focused search for www.cali.org.

In general the plan is to scrape the site to a vector database, enable embeddings of the vector db in Llama 2, provide API endpoints to search/find things.

Hints and pointers.

  • Llama2-webui – Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere
  • FastAPI – web framework for building APIs with Python 3.7+ based on standard Python type hints
  • Danswer – Ask Questions in natural language and get Answers backed by private sources. It makes use of
    • PostgreSQL – a powerful, open source object-relational database system
    • QDrant – Vector Database for the next generation of AI applications.
    • Typesense – a modern, privacy-friendly, open source search engine built from the ground up using cutting-edge search algorithms, that take advantage of the latest advances in hardware capabilities.

The challenge is to wire together these technologies and then figure out how to get it to play nice with Drupal. One possibility is just to build this with an API and then use the API to interact with Drupal. That approach also offers the possibility of allowing the membership to interact with the API too.

Demystifying Text Data with the unstructured Python Library | Saeed Esmaili

In the world of data, textual data stands out as being particularly complex. It doesn’t fall into neat rows and columns like numerical data does. As a side project, I’m in the process of developing my own personal AI assistant. The objective is to use the data within my notes and documents to answer my questions. The important benefit is all data processing will occure locally on my computer, ensuring that no documents are uploaded to the cloud, and my documents will remain private.

To handle such unstructured data, I’ve found the unstructured Python library to be extremely useful. It’s a flexible tool that works with various document formats, including Markdown, , XML, and HTML documents.

Demystifying Text Data with the unstructured Python Library — https://saeedesmaili.com/demystifying-text-data-with-the-unstructured-python-library/