- The Data Science Dossier
Does winglang solve all IaC issues? Uber deploy uVitals and more!

Welcome to 2024! We’ve all made it through another trip around the sun and are ready for another year of exciting innovation. I think we can safely say last year was dominated by LLMs and how companies can leverage them in their business.
Of course, deploying LLMs also brings the hosting, training and security issues that come packaged alongside them. If you need help and support there, you know who to ask!
In the meantime, thank you to all the subscribers of this newsletter. It's a fun and exciting sector to work in, and I enjoy sharing all the latest news and innovation with you all.
From the community
First up today we have an interesting project brought to my attention by a community member. This project is Winglang, which aims to combine both infrastructure and runtime code in the same codebase. I’d be very curious to see how this works in practice. It reminds me of GWT, for those of us terrible web developers who hoped to create great web apps by writing Java code; it turned out it wasn’t that easy. But the jury is out, so let me know if you’ve tried it and what you think!
Uber show off their anomaly detection and alerting system, uVitals, an unsupervised ML platform specialising in detecting errors in multidimensional datasets. Fun stuff!
Eagle-eyed LLM nerds spotted Microsoft switching the license of the Phi-2 model to MIT, so businesses can leverage the model without fear of the wrath of Microsoft.
Folks who still follow Hitachi Vantara and Pentaho will have seen their release of what they are calling Pentaho+, aimed at streaming data across the enterprise, data quality and data cataloguing. There was also a rumour that the latest release of Pentaho Community Edition would be their last. We’ll see what happens here, but Pentaho still lives on over at Hitachi Vantara towers, although the numbers on their homepage look entirely made up...
Databricks continue to be all in on Gen AI, and you’d be forgiven for forgetting they deal primarily in Apache Spark. Continuing down the Gen AI route, they have released a suite of tools to help with AI model development.
Interesting play from Dremio, who have teamed up with Carahsoft, the US public sector IT solutions provider. They are clearly trying to get more Dremio installs in federal locations, and it seems a good move all around.
Lastly, we’ve got a blog post from Confluent that I saw doing the rounds earlier this month and thought would be useful. It introduces Apache Flink to Apache Kafka users and shows how it can enhance stream processing. I do think 2024 will see a large uptick in streaming data use cases, so getting informed about Apache Flink seems a good way to start the year!
On the blog
At the back end of 2023, we took a dive into what Canonical Data Fabric is and how to spin it up as a new user. The blog and video are available on our site, where we take a look at how to deploy Spark jobs to Kubernetes, how to spin up Kafka with Juju and more!
Finally, don’t forget we’ve been working hard on our podcast and video content. Just before Christmas we looked at how to deal with scale and security in AWS in the run-up to the festive period. You can find that and more right here.

I’m Tom Barber
I assist businesses in maximizing the value of their data, enhancing efficiency and performance, and gaining deeper insights. Find out more on my website.