The Flywheel of Machine Learning Systems

5 years ago

https://medium.com/@guyernest/the-flywheel-of-machine-learning-systems-50aa6d992382

Many companies want to ride the wave of machine learning and AI and are looking for ways to build such systems into their business. Machine learning, artificial intelligence, and deep learning in particular are relatively new technologies, and the number of experts in this domain is limited. The main mistake some of these companies make is to start with the technology rather than with the business need. They hire a couple of data scientists, give them access to the databases, and ask them to build something interesting from the data. It is true that you can find some interesting anecdotes in the data, but building a successful system requires a different process.

A successful team must include at least four different disciplines: product managers (the business owners, who focus on which problem to solve), data engineers (which data can we use), data scientists (how to build good models with the data), and DevOps engineers (how to get the models to touch customers’ lives). These are different people using different tools, and they must all work together to solve the team's business problem. The concept of “full stack developers” does not apply to such systems. Even if you can find a data scientist who is also knowledgeable about data transformations (Pandas in Python, for example) and about model serving, she would most likely not be able to take the system to the scale needed to spin the flywheel. The data engineers should focus on building and scaling the data pipelines and the data lake, the data scientists on building and scaling the model pipelines, and the DevOps engineers on building and optimizing the deployment pipelines.