Python – <CONTENT /> v.6

October 25, 2023

Productionizing and scaling Python ML workloads simply | Ray

Scale your compute-intensive Python workloads. From reinforcement learning to large-scale model serving, Ray makes the power of distributed compute easy and accessible to every engineer.

Source: Productionizing and scaling Python ML workloads simply | Ray

August 18, 2023

Notes on better search 8/18/2023

Goal: better, more focused search for www.cali.org.

In general, the plan is to scrape the site into a vector database, make that database's embeddings available to Llama 2, and provide API endpoints for searching and finding things.

Hints and pointers.

Llama2-webui – Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere
FastAPI – web framework for building APIs with Python 3.7+ based on standard Python type hints
Danswer – Ask questions in natural language and get answers backed by private sources. It makes use of:
- PostgreSQL – a powerful, open source object-relational database system
- Qdrant – Vector Database for the next generation of AI applications.
- Typesense – a modern, privacy-friendly, open source search engine built from the ground up using cutting-edge search algorithms, that take advantage of the latest advances in hardware capabilities.

The challenge is to wire together these technologies and then figure out how to get it to play nice with Drupal. One possibility is just to build this with an API and then use the API to interact with Drupal. That approach also offers the possibility of allowing the membership to interact with the API too.
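
As a rough illustration of the scrape, embed, and search flow described above, here is a minimal sketch in plain Python. Everything in it is a stand-in: `embed` is a toy bag-of-words counter rather than a real embedding model, `ToyVectorIndex` is a hypothetical in-memory substitute for a vector database such as Qdrant, and the page paths are made up. In a real build the `search` method is roughly what an API endpoint would call.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorIndex:
    """Hypothetical in-memory substitute for a vector database."""

    def __init__(self):
        self.rows = []  # list of (path, vector) pairs

    def add(self, path, text):
        """Store the vector for one scraped page."""
        self.rows.append((path, embed(text)))

    def search(self, query, limit=3):
        """Return up to `limit` (path, score) pairs, best match first."""
        query_vector = embed(query)
        scored = [(cosine(query_vector, vec), path) for path, vec in self.rows]
        scored.sort(reverse=True)
        return [(path, round(score, 3)) for score, path in scored[:limit] if score > 0]


# Made-up pages standing in for scraped site content.
index = ToyVectorIndex()
index.add("/lessons/contracts-offer", "Contracts lesson on offer and acceptance")
index.add("/lessons/torts-negligence", "Torts lesson on negligence and duty of care")
index.add("/blog/conference", "Conference schedule and registration")

results = index.search("negligence duty", limit=2)
print(results)
```

Swapping the toy pieces for a real embedding model and a real vector store should leave the calling code about the same, which is part of the appeal of putting the whole thing behind an API.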

July 6, 2023

Configuring Jupyter Notebook in Windows Subsystem Linux (WSL2) | by Cristian Saavedra Desmoineaux | Towards Data Science

Here’s a great quick start guide to getting Jupyter Notebook and Lab up and running with the Miniconda environment in WSL2 running Ubuntu. When you’re finished walking through the steps you’ll have a great data science space up and running on your Windows machine.

I am going to explain how to configure Windows 10 and Miniconda to work with Notebooks using WSL2

Source: Configuring Jupyter Notebook in Windows Subsystem Linux (WSL2) | by Cristian Saavedra Desmoineaux | Towards Data Science

June 20, 2023

From Medium :: Mastering AI Summarization: Your Ultimate Productivity Hack

Unlock Your Second Brain with Streamlit and Hugging Face’s Free LLM Summarization: build a Python Webapp running on your PC.

Source: Mastering AI Summarization: Your Ultimate Productivity Hack

This uses a smaller language model tailored to text summarization. Maybe a good path for assessing student short answers and essays.

February 17, 2023

Customizing GPT-3 for Your Application :: OpenAI

Developers can now fine-tune GPT-3 on their own data, creating a custom version tailored to their application. Customizing makes GPT-3 reliable for a wider variety of use cases and makes running the model cheaper and faster.

You can use an existing dataset of virtually any shape and size, or incrementally add data based on user feedback. With fine-tuning, one API customer was able to increase correct outputs from 83% to 95%. By adding new data from their product each week, another reduced error rates by 50%.

Source: Customizing GPT-3 for Your Application

April 4, 2022

Scraping the Teknoids Mailman PiperMail Archive

Putting this here in case anyone finds themselves in need of something to scrape a Pipermail web archive of a Mailman mailing list. This bit of Python 3 is based on a bit of Python 2 I found at Scraping GNU Mailman Pipermail Email List Archives. The only changes I made from the original are to update some things to work in Python 3. It works well for my purposes, generating a single text file of the teknoids list archive from 2005 to today.

#!/usr/bin/env python3
# Scrape a Mailman Pipermail archive into a single text file.

import gzip
from io import BytesIO

import requests
from lxml import html

listname = 'teknoids'
url = 'https://lists.teknoids.net/pipermail/' + listname + '/'

# The archive index is a table; the third column links to each downloadable file.
response = requests.get(url)
tree = html.fromstring(response.text)
filenames = tree.xpath('//table/tr/td[3]/a/@href')


def emails_from_filename(filename):
    """Download one archive file and return its contents as bytes."""
    print(filename)
    response = requests.get(url + filename)
    if filename.endswith('.gz'):
        # Gzipped archives are decompressed in memory.
        contents = gzip.GzipFile(fileobj=BytesIO(response.content)).read()
    else:
        contents = response.content
    return contents


contents = [emails_from_filename(filename) for filename in filenames]

# Reverse the index order before joining everything into one blob.
contents.reverse()
contents = b"\n\n\n\n".join(contents)

with open(listname + '.txt', 'wb') as filehandle:
    filehandle.write(contents)
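
Once the single text file exists, a natural next step is breaking it back into individual messages. The sketch below is one way that might look. It assumes each message in the archive begins with a separator line of the form `From name at domain  Mon Jan  3 10:00:00 2005`, which should be checked against the actual file. The helper names `split_messages` and `subject_of` and the sample text are made up for illustration.

```python
import re

# Assumption: every message starts with a separator line such as
# "From alice at example.org  Mon Jan  3 10:00:00 2005".
SEPARATOR = re.compile(r"^From \S+ at \S+\s+\w{3} \w{3}\s+\d{1,2} ", re.MULTILINE)


def split_messages(archive_text):
    """Split archive text into a list of individual message strings."""
    starts = [match.start() for match in SEPARATOR.finditer(archive_text)]
    ends = starts[1:] + [len(archive_text)]
    return [archive_text[start:end].strip() for start, end in zip(starts, ends)]


def subject_of(message):
    """Return the first Subject: header in a message, or '' if there is none."""
    match = re.search(r"^Subject: (.*)$", message, re.MULTILINE)
    return match.group(1).strip() if match else ""


# Made-up sample in the assumed format, not real list traffic.
sample = (
    "From alice at example.org  Mon Jan  3 10:00:00 2005\n"
    "From: alice at example.org (Alice)\n"
    "Date: Mon, 3 Jan 2005 10:00:00 -0500\n"
    "Subject: [teknoids] First post\n"
    "\n"
    "Hello, list.\n"
    "\n\n\n\n"
    "From bob at example.org  Tue Feb  1 09:30:00 2005\n"
    "From: bob at example.org (Bob)\n"
    "Subject: [teknoids] Re: First post\n"
    "\n"
    "Welcome!\n"
)

messages = split_messages(sample)
subjects = [subject_of(message) for message in messages]
print(len(messages), subjects)
```

To run this against the real output, read the file with something like `open('teknoids.txt', encoding='utf-8', errors='replace').read()` and pass the result to `split_messages`. The `errors='replace'` part is a guess that an archive this old mixes character encodings.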

December 3, 2020

KNN (K-Nearest Neighbors) is Dead! | by Marie Stephen Leo | Towards AI | Dec, 2020 | Medium

Source: https://medium.com/towards-artificial-intelligence/knn-k-nearest-neighbors-is-dead-fc16507eb3e

Learning how to apply some of the algorithms mentioned in this article would likely improve students’ and teachers’ ability to locate CALI resources and allow us to build a useful recommender system.

February 27, 2018


Tag: Python