Tech — Kristina Young

Blog posts about machine learning in production

The notebook anti-pattern

In the past few years there has been a sharp increase in tools that try to solve the challenge of bringing machine learning models to production. One thing these tools seem to have in common is the incorporation of notebooks into production pipelines. This article explains why the drive towards using notebooks in production is an anti-pattern, and offers some suggestions along the way.

Personal blog | Towards data science

Testing your ML pipelines

When it comes to data products, there is a common misconception that they cannot be put through automated testing. Although some parts of the pipeline cannot go through traditional testing methodologies because of their experimental and stochastic nature, most of the pipeline can. In addition, the more unpredictable algorithms can be put through specialised validation processes.
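As a rough illustration of that split, the sketch below tests a deterministic step with an exact assertion and a stochastic step with a seeded, tolerance-based check. The functions `normalise` and `noisy_estimate` are hypothetical stand-ins invented for this example and are not taken from the post.

```python
import random
import statistics


def normalise(values):
    """Deterministic step (hypothetical): scale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def noisy_estimate(true_value, rng):
    """Stochastic step (hypothetical): the true value plus Gaussian noise."""
    return true_value + rng.gauss(0.0, 1.0)


def test_normalise_is_exact():
    # Deterministic code can be tested with plain equality.
    assert normalise([2, 4, 6]) == [0.0, 0.5, 1.0]


def test_noisy_estimate_within_tolerance():
    # Stochastic code: fix the seed and assert on an aggregate with a tolerance.
    rng = random.Random(0)
    samples = [noisy_estimate(10.0, rng) for _ in range(2000)]
    assert abs(statistics.mean(samples) - 10.0) < 0.2
```

The tolerance of 0.2 is an arbitrary choice for this sketch; in practice it would be derived from the expected variance of the step under test.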

Personal blog | Towards data science

Monitoring ML pipelines

I have spoken a lot in this blog about the process of bringing machine learning code to production. However, once the models are in production you are not done; you are just getting started. The model will have to face its worst enemy: The Real World!

Personal blog | Towards data science

Automated model serving to mobile devices

The most common approach to deploying machine learning models is to expose an API endpoint. This API endpoint would generally be called via a POST method containing the input data for the model as the body, and responding with the output of the model. However, an API endpoint is not always the most appropriate solution to your use case.
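A minimal sketch of the endpoint pattern described above, using only the Python standard library. The `predict` function, the `features` and `prediction` JSON keys, and the port are assumptions made for illustration; a real service would load a trained model and validate its input.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a model: returns the sum of the inputs."""
    return sum(features)


def handle_body(body: bytes) -> bytes:
    """Turn a JSON request body into a JSON response body."""
    payload = json.loads(body)
    result = predict(payload["features"])
    return json.dumps({"prediction": result}).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The model input arrives as the POST body.
        length = int(self.headers.get("Content-Length", 0))
        response = handle_body(self.rfile.read(length))
        # The model output is sent back as the response.
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def serve(port=8000):
    """Start the server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the endpoint locally. Keeping the request handling in `handle_body` means the logic can be tested without opening a socket.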

Personal blog

Terraforming a Spark cluster on Amazon

This post is about setting up the infrastructure to run your Spark jobs on a cluster hosted on Amazon.

Personal blog

Spark Word2Vec: lessons learned

This post summarises some of the lessons learned while working with Spark’s Word2Vec implementation. You may also be interested in the previous post “Problems encountered with Spark ml Word2Vec”.

Personal blog

Conference talks

AMLD 2020

Testing your machine learning pipelines, AI & Industry track

Switzerland

DV Hive 2018

Let evolution do the guessing - how to evolve neural networks

Berlin

MCubed 2020

Monitoring your ML Pipelines

London

DV Hive 2019

Flaming the notebook: ML in the real world

Berlin | group presentation

IEEE CEC 2014

Cooperative DynDE for temporal data clustering

Beijing

DV Hive 2020

A Geospatial Dig

Berlin

DV Hive 2019

Buzz can crack too: How to test your ML pipelines

Berlin

IEEE CEC 2013

A cooperative multi-population approach to clustering temporal data

Cancun

DV Hive 2018

Deployment of machine learning models on mobile devices

Berlin | group presentation

LNCS

Dynamic differential evolution algorithm for clustering temporal data

Sofia
