- The Data Science Dossier
LLMs, marketers FOMO and Pattern Matching with Postgres
More from Delta, Plotly and Dremio

Back to the usual stuff this week after last week's Search special. I find the rate of development coming out of Databricks interesting, but I also wonder how everyone keeps up. Every week brings a flurry of announcements about the latest AI this and that. It feels like in 2024, if a product doesn’t have AI attached to it, the marketers go into FOMO mode, which seems a little nonsensical.

With that said, I think data folks' fear that their jobs are on the line is a little misguided. I expect a significant uptake in self-hosted open-source LLMs, tuned to specific topics depending on the business. To me at least, that means there will be a need for search experts, data modelling experts, data quality experts and so on because, whilst it may not use the traditional “data warehouse” ETL structure, someone will still have to ensure the quality and consistency of the data being fed into these AI models. And, of course, with all this, are the data warehouse, lake, and lakehouse going away? Of course not. The future is bright, my friend.
From the community
Matthew Powers, Developer Advocate at Databricks, points out that Apache Druid has recently merged an extension to use Delta tables. The extension leverages the Delta Kernel project, which aims to abstract away the Delta processing logic, making it easier for projects to use Delta tables without having to rewrite all the implementation code in Java or Rust. This looks worth keeping an eye on.
Whilst we're talking Delta, Delta Lake 3.1 has shipped, with a whole bunch of improvements in the Spark and Flink implementations.
There was an interesting webinar last week from Dremio, demonstrating how S&P Global is building an Azure-based data lakehouse using the Dremio stack.
Following an interesting chat with the folks at Plotly Dash, Dave Gibbon shared with us the website he maintains to help track public Dash demonstration sites. So for a bit of dashboarding inspiration, make sure you check it out.
If you’re interested in Postgres, consider submitting a talk to pgDay Paris or pgConf BE, where some of the most knowledgeable Postgres practitioners get together to discuss all things Postgres.
On the blog
Keeping with the Postgres theme, this week's blog takes a quick look at similarity matching with Postgres. The pg_trgm extension provides a number of functions and operators for building a pattern matcher in Postgres, which helps when you need to triage wonky data and the like.
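To give a feel for what trigram matching does, here is a minimal Python sketch that approximates the idea behind pg_trgm's `similarity()` function for plain ASCII text. It is an illustration under stated assumptions, not the extension's actual implementation: the helper names `trigrams` and `best_matches` are made up for this example, and the 0.3 cut-off is assumed to mirror pg_trgm's default similarity threshold. In Postgres itself you would enable the extension with `CREATE EXTENSION pg_trgm` and call `similarity()` directly.

```python
import re


def trigrams(text):
    """Approximate pg_trgm trigram extraction (ASCII only, an assumption)."""
    grams = set()
    # Lowercase the text and treat anything non-alphanumeric as a word break.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # Each word is padded with two leading spaces and one trailing space.
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams in either string."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_matches(query, candidates, threshold=0.3):
    """Hypothetical helper: candidates at or above the threshold, best first."""
    scored = [(similarity(query, c), c) for c in candidates]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)


if __name__ == "__main__":
    for score, name in best_matches("Postgres", ["Postgress", "postgre", "MySQL"]):
        print(f"{score:.2f}  {name}")
```

The appeal for wonky data is that a misspelling only disturbs the handful of trigrams around the typo, so near-duplicates still score highly while unrelated strings share almost nothing.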

I’m Tom Barber
I Help Customers Develop Cloud-Based Platforms for Rapid Startup Launch and MVP Success. Find out more on my website.