As machine learning evolves, we need to update the definition of ‘data scientist’
In the early days of machine learning, hiring good statisticians was the key challenge for AI projects. Since then, machine learning has shifted from its early focus on statistics toward a greater emphasis on computation. As building algorithms has become simpler and the applications for AI technology have multiplied, human resources professionals in AI face a new challenge: not only are data scientists in short supply, but what makes a data scientist successful has changed.
Divergence between statistical models and neural networks
As recently as six years ago, there were minimal differences between statistical models (usually logistic regressions) and neural networks. A neural network offered slightly greater separation capacity, that is, slightly better statistical performance, at the cost of being a black box. Because the two had similar potential, the choice between them was determined by the requirements of each scenario and by the type of professional available to create the algorithm.
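To make this comparison concrete, here is a minimal Python sketch that is not taken from the original article. It assumes a toy XOR dataset and hand-picked (not trained) network weights, and every function and variable name is hypothetical. It illustrates two points: a logistic regression computes the same thing as a neural network with no hidden layer, and adding a hidden layer is what provides extra separation capacity.

```python
import itertools
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_regression(x, weights, bias):
    """Statistical model: sigmoid of a weighted sum of the inputs."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def forward(x, layers):
    """Feed-forward pass; each layer is (weight_rows, biases) with sigmoid units."""
    activations = list(x)
    for weight_rows, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weight_rows, biases)
        ]
    return activations


def accuracy(predict, data):
    """Fraction of examples whose 0.5-thresholded output matches the label."""
    hits = sum((predict(x) > 0.5) == bool(label) for x, label in data)
    return hits / len(data)


# Toy dataset (an assumption for illustration): XOR is not linearly separable.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

# 1) A network with no hidden layer is a logistic regression with the same weights.
single_neuron = [([[1.5, -2.0]], [0.25])]
same_output = all(
    math.isclose(forward(x, single_neuron)[0],
                 logistic_regression(x, [1.5, -2.0], 0.25))
    for x, _ in XOR
)

# 2) Best logistic regression found by a coarse grid search over integer weights.
grid = range(-5, 6)
best_logistic = max(
    accuracy(lambda x: logistic_regression(x, (w1, w2), b), XOR)
    for w1, w2, b in itertools.product(grid, repeat=3)
)

# 3) One hidden layer with hand-picked weights: the hidden units approximate
#    OR and AND, and the output unit fires for "OR but not AND".
two_layer = [
    ([[20.0, 20.0], [20.0, 20.0]], [-10.0, -30.0]),  # hidden layer
    ([[20.0, -20.0]], [-10.0]),                      # output layer
]
network_accuracy = accuracy(lambda x: forward(x, two_layer)[0], XOR)

print(same_output)       # True
print(best_logistic)     # 0.75
print(network_accuracy)  # 1.0
```

The weights are fixed by hand only to keep the sketch deterministic; in a real project they would be learned from data. On problems that are close to linearly separable, the gap between the two models would be much smaller than on this deliberately difficult toy case.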
More recently, though, neural networks have evolved to support many layers. This approach, known as deep learning, makes it possible, among other things, to exploit unstructured data such as text, voice, images, and videos effectively and in new ways. Increased processing capacity, image identifiers, simultaneous translators, text interpreters, and other innovations have set neural networks further apart from statistical models. With this evolution comes the need for data scientists with new skills.
Unchanging elements of building algorithms
Despite the changes in algorithm structures and capabilities, the process of constructing high-quality predictive models still follows a series of steps that has changed little. The ability to perform each step of this process efficiently and creatively matters more than the particular method used or the fit it achieves.
Process to build a supervised algorithm