What Are Model-Centric and Data-Centric AI?

Tram Ho

Hello everyone, today I will write about a slightly different topic than usual: what are model-centric and data-centric AI, and how do they differ? As everyone knows, data and models are both important foundations of an AI system. Both components play an important role in developing a robust system, but which one should you focus on more? In this article, we’ll look at the data-centric and model-centric approaches and compare them.

Model-centric approach

Figure: model-driven approach

A model-centric approach means focusing on using the right set of machine learning algorithms, programming languages, and AI platforms to build high-quality machine learning models. This involves choosing the best model architecture. In this approach, we usually keep the data fixed and improve the code or model architecture. This approach has led to great progress in machine learning and deep learning algorithms.
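To make the "keep the data fixed, improve the model" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the toy points, the names TRAIN and VALID, and the choice of a hand-written k-nearest-neighbour classifier as the "model". The dataset never changes; only the model setting k is tuned.

```python
from collections import Counter

# Hypothetical toy data: 2-D points labelled "a" or "b". It stays fixed.
TRAIN = [
    ((1.0, 1.0), "a"), ((1.5, 1.0), "a"), ((1.0, 1.5), "a"),
    ((4.0, 4.0), "b"), ((4.5, 4.0), "b"), ((4.0, 4.5), "b"),
    ((3.8, 3.8), "a"),  # a noisy point sitting inside the "b" cluster
]
VALID = [
    ((1.2, 1.2), "a"), ((4.2, 4.2), "b"),
    ((3.85, 3.85), "b"), ((0.9, 1.1), "a"),
]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def knn_predict(train, x, k):
    """Predict the majority label among the k training points nearest to x."""
    nearest = sorted(train, key=lambda item: sq_dist(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, valid, k):
    """Fraction of validation points that k-NN classifies correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)


# Model-centric loop: the data never changes, only the model setting does.
scores = {k: accuracy(TRAIN, VALID, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, "best k:", best_k)
```

On this toy data, k=1 is fooled by the noisy training point, while larger values of k vote it down. The noisy point itself is never touched, which is exactly what makes the loop model-centric.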

Currently, most AI applications are model-centric, and most academic research likewise focuses on models and model improvement. According to Andrew Ng, more than 90% of research papers in the field of AI focus on modeling. One reason is that it is very difficult to build large datasets that become universally recognized benchmarks, and data collection itself is also quite difficult.

Data-centric approach

Figure: data-driven approach

The data-centric approach to AI focuses on getting the right kind of data to build high-performance, high-quality machine learning models. Unlike model-centric AI, the focus shifts from improving the model to obtaining high-quality training data.

Today, AI models have matured and most companies hold data of their own, so data becomes the core of every decision-making process. Data-centric companies, also described as data-driven, rely on data to analyze their business operations and adjust their strategies to suit their needs and increase benefits for the company. With this approach, results can be more accurate, organized, and transparent, which can help the organization run more smoothly. In machine learning terms, the approach involves systematically changing and improving datasets to increase the accuracy of machine learning applications. Working with data is its central goal.
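As a rough illustration of "systematically improving the dataset" while the model code stays fixed, here is a small sketch. The toy points, the names TRAIN and TEST, the fixed 1-nearest-neighbour model, and the neighbour-vote rule for flagging suspicious labels are all assumptions made for this example, not a prescribed method. In practice, flagged rows would normally go to a human for review rather than being dropped automatically.

```python
from collections import Counter

# Hypothetical toy data: 2-D points labelled "a" or "b".
TRAIN = [
    ((1.0, 1.0), "a"), ((1.5, 1.0), "a"), ((1.0, 1.5), "a"),
    ((4.0, 4.0), "b"), ((4.5, 4.0), "b"), ((4.0, 4.5), "b"),
    ((3.8, 3.8), "a"),  # probably mislabelled: it sits in the "b" cluster
]
TEST = [
    ((1.2, 1.2), "a"), ((4.2, 4.2), "b"),
    ((3.85, 3.85), "b"), ((0.9, 1.1), "a"),
]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_labels(train, x, k, skip=None):
    """Labels of the k training points nearest to x, ignoring index `skip`."""
    candidates = [(sq_dist(p, x), label)
                  for i, (p, label) in enumerate(train) if i != skip]
    return [label for _, label in sorted(candidates)[:k]]


def predict(train, x):
    """The fixed model: 1-nearest-neighbour. This code is never changed."""
    return nearest_labels(train, x, k=1)[0]


def accuracy(train, test):
    """Fraction of test points the fixed model classifies correctly."""
    return sum(predict(train, x) == y for x, y in test) / len(test)


def flag_suspects(train, k=3):
    """Indices whose label disagrees with the majority of their k neighbours."""
    suspects = []
    for i, (p, label) in enumerate(train):
        votes = Counter(nearest_labels(train, p, k, skip=i))
        if votes.most_common(1)[0][0] != label:
            suspects.append(i)
    return suspects


# Data-centric loop: same model, better data.
suspects = flag_suspects(TRAIN)
cleaned = [row for i, row in enumerate(TRAIN) if i not in suspects]
before = accuracy(TRAIN, TEST)
after = accuracy(cleaned, TEST)
print("suspect rows:", suspects, "accuracy:", before, "->", after)
```

The model function is identical before and after; the only thing that changed is the training set.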

Compare the two methods above

For data scientists and machine learning engineers, a model-centric approach seems more exciting. This is understandable, because researchers get to apply their knowledge to solve a particular problem. On the other hand, no one wants to spend all day labeling data, as it is seen as time-consuming and boring work :v.

However, even though data is so important in today’s machine learning, it is often overlooked. As a result, hundreds of hours are wasted refining a model trained on faulty data. The faulty data is most likely the underlying cause of your model’s lower accuracy, and that cause has nothing to do with model optimization.
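One cheap check for faulty data, before spending hours on the model, is to look for the same input labelled in different ways. The sketch below assumes a hypothetical list of (text, label) pairs from a sentiment-labelling task; the function name find_label_conflicts and the sample rows are made up for illustration.

```python
from collections import defaultdict

# Hypothetical annotations: the same review text labelled by different people.
EXAMPLES = [
    ("Great product", "positive"),
    ("great product ", "negative"),
    ("Arrived late", "negative"),
    ("arrived late", "negative"),
    ("Not bad", "positive"),
    ("not bad", "neutral"),
]


def find_label_conflicts(examples):
    """Map each normalized text to its labels when more than one label occurs."""
    labels_by_text = defaultdict(set)
    for text, label in examples:
        # Normalize so trivial differences (case, spaces) do not hide duplicates.
        labels_by_text[text.strip().lower()].add(label)
    return {text: sorted(labels)
            for text, labels in labels_by_text.items() if len(labels) > 1}


conflicts = find_label_conflicts(EXAMPLES)
for text, labels in conflicts.items():
    print(f"{text!r} has conflicting labels: {labels}")
```

Rows reported here are good candidates for a second look by an annotator, since no amount of model tuning can learn a consistent answer from them.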

Model-centric | Data-centric
Working with the model is the central goal | Working with data is the central goal
Model optimization | Data collection and processing
Labels are inconsistent | Data consistency is key
Data is kept fixed after normalization | Code/algorithm is kept fixed
The model is improved iteratively | Data quality is improved

Table: Comparison of Model-centric and Data-centric


In my opinion, combining the data-centric and model-centric approaches harmoniously is the most effective way. As I mentioned above, collecting data is difficult, and not every company has enough data to take a data-centric approach. In that case, you have to focus on improving model quality instead.

Thank you for reading my post.




Reference: https://neptune.ai/blog/data-centric-vs-model-centric-machine-learning




Source : Viblo