Contents

1. How do you generally handle a data set when using machine learning models?

2. What is the difference between supervised learning and unsupervised learning?

3. What are training error and test error?


1. How do you generally handle a data set when using machine learning models?

After obtaining a data set, first check it for missing values and handle any that are found. Next, look at the number of samples, the number of dimensions, and the meaning of each feature to get a preliminary understanding of the data. Then split the data into a training set and a test set, train the model on the training set to obtain its parameters, apply the trained model to the test set, and compute the test error. When several models are being compared, evaluate them by their error on held-out data and select the best one; in practice that comparison is usually made on a separate validation set or by cross-validation, so that the test set still gives an unbiased estimate for the chosen model.
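The workflow above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the toy data, the choice to drop incomplete rows (one of several ways to handle missing values), the 80/20 split, and the helper names `fit_line` and `mse` are all hypothetical and not part of the original answer.

```python
import random

# Hypothetical toy data: (feature, target) pairs; None marks a missing value.
raw = [(float(x), 2.0 * x + 1.0) for x in range(20)]
raw[3] = (None, 7.0)
raw[11] = (11.0, None)

# 1. Check for missing values and handle them (assumption: drop incomplete rows).
n_missing = sum(1 for row in raw if None in row)
data = [row for row in raw if None not in row]

# 2. Look at the size and dimensionality of the cleaned data.
n_samples, n_columns = len(data), len(data[0])

# 3. Split into a training set and a test set (80/20, fixed seed for repeatability).
rng = random.Random(0)
rng.shuffle(data)
cut = int(0.8 * len(data))
train, test = data[:cut], data[cut:]

def fit_line(rows):
    """Fit y = w * x + b by ordinary least squares; returns the parameters (w, b)."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    w = sxy / sxx
    return w, mean_y - w * mean_x

def mse(rows, w, b):
    """Mean squared error of the line (w, b) on the given rows."""
    return sum((w * x + b - y) ** 2 for x, y in rows) / len(rows)

# 4. Train on the training set, then measure the error on the test set.
w, b = fit_line(train)
test_error = mse(test, w, b)
print(n_missing, n_samples, n_columns, len(train), len(test), test_error)
```

Because the toy targets are noise-free, the fitted line recovers the underlying relationship and the test error is essentially zero; with real data the same steps apply but the error will not vanish.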

2. What is the difference between supervised learning and unsupervised learning?

Supervised learning: the model is trained on data that carries label information. Typical tasks are classification and regression.

Unsupervised learning: the model is trained on data without label information. A typical task is clustering.
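To make the contrast concrete, here is a small sketch in plain Python. The data, the nearest-neighbour classifier `predict_1nn`, and the one-dimensional two-cluster routine `two_means` are illustrative assumptions chosen for brevity; they are not methods named in the original answer.

```python
# Supervised: every training sample comes with a label.
labeled = [(1.0, "low"), (1.5, "low"), (2.0, "low"),
           (8.0, "high"), (8.5, "high"), (9.0, "high")]

def predict_1nn(x, training):
    """Classify x with the label of its nearest labeled training point."""
    nearest = min(training, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

# Unsupervised: the same features, but the labels are not available.
unlabeled = [x for x, _ in labeled]

def two_means(xs, iters=10):
    """Group 1-D points into two clusters (k-means with k = 2); returns the two centers."""
    c0, c1 = min(xs), max(xs)  # start from the two extreme points
    for _ in range(iters):
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return c0, c1

print(predict_1nn(2.4, labeled))  # uses the labels to name the class
print(two_means(unlabeled))       # finds group centers without any labels
```

The classifier can answer "which class?" only because labels were supplied during training, while the clustering routine discovers that the points form two groups but has no names for them.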

3. What are training error and test error?

Training error: the error of the model (learner) on the training set, also called the empirical error.

Test error: the error of the model (learner) on new samples (the test set). It is used as an estimate of the generalization error, which is the error on unseen data in general.
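The two quantities can differ sharply, which a short sketch may help illustrate. The example below assumes a 1-nearest-neighbour classifier and a misclassification rate as the error measure; the data (including one deliberately noisy training label) and the names `predict_1nn` and `error_rate` are hypothetical.

```python
# Hypothetical training set; the label at x = 3.0 is deliberately noisy.
train = [(1.0, "low"), (2.0, "low"), (3.0, "high"),
         (7.0, "high"), (8.0, "high"), (9.0, "high")]
# Hypothetical new samples that the model has not seen during training.
test = [(1.2, "low"), (2.9, "low"), (3.4, "low"), (7.4, "high")]

def predict_1nn(x, training):
    """Classify x with the label of its nearest training point."""
    nearest = min(training, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

def error_rate(rows, training):
    """Fraction of rows whose predicted label differs from the true label."""
    wrong = sum(predict_1nn(x, training) != y for x, y in rows)
    return wrong / len(rows)

train_error = error_rate(train, train)  # error on the training set
test_error = error_rate(test, train)    # error on new samples

print(train_error)  # 0.0
print(test_error)   # 0.5
```

A 1-nearest-neighbour model memorizes its training points, so its training error here is zero, yet the noisy label causes mistakes on nearby new samples. This is why a low training error alone says little about performance on unseen data.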

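Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. Please include a link to the original source and this notice when reposting.
Article link: https://blog.csdn.net/weixin_52198808/article/details/130903725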