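TensorFlow already has a function for building features by crossing columns, tf.feature_column.crossed_column, but it is aimed mostly at categorical data. What about numeric data?
For example, suppose there are already two columns: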
age = tf.feature_column.numeric_column("age")
education_num = tf.feature_column.numeric_column("education_num")
If I want to create a third and a fourth feature column from age and education_num, like this:
my_feature = age * education_num
my_another_feature = age * age
How can I do that?
You can declare a custom numeric column and compute its values in your DataFrame inside the input function (https://www.tensorflow.org/get_started/input_fn):
import pandas as pd
import tensorflow as tf

# Existing features
age = tf.feature_column.numeric_column("age")
education_num = tf.feature_column.numeric_column("education_num")
# Declare a custom column just like other columns
my_feature = tf.feature_column.numeric_column("my_feature")
...
# Add to the list of features (square brackets, so this is a list rather than a set)
feature_columns = [ ... age, education_num, my_feature, ... ]
...
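def input_fn():
    df_data = pd.read_csv("input.csv")
    df_data = df_data.dropna(how="any", axis=0)
    # Manually update the dataframe
    df_data["my_feature"] = df_data["age"] * df_data["education_num"]
    # `labels` is not defined in this snippet; it needs to be a pandas Series
    # whose index matches df_data after the dropna above.
    return tf.estimator.inputs.pandas_input_fn(x=df_data,
                                               y=labels,
                                               batch_size=100,
                                               num_epochs=10,
                                               shuffle=True)  # set explicitly; TF 1.x expects a bool here
...
# input_fn() returns the function built by pandas_input_fn, which is what train expects
model.train(input_fn=input_fn())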
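
The same pattern covers the second feature from the question (age * age), and it helps to take the labels from the frame only after dropping incomplete rows so that features and labels stay aligned. Here is a minimal pandas-only sketch of that preparation step. The helper names add_derived_features and split_features_and_labels, the "label" column name, and the sample values are all made up for illustration; they are not from the question or from TensorFlow.

```python
import pandas as pd


def add_derived_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the two product features added."""
    out = df.copy()
    # Interaction term between the two numeric inputs.
    out["my_feature"] = out["age"] * out["education_num"]
    # Squared term: same pattern, single source column.
    out["my_another_feature"] = out["age"] * out["age"]
    return out


def split_features_and_labels(df: pd.DataFrame, label_column: str):
    """Drop incomplete rows first, then split, so x and y share one index."""
    clean = df.dropna(how="any", axis=0)
    labels = clean[label_column]
    features = add_derived_features(clean.drop(columns=[label_column]))
    return features, labels


# Tiny in-memory frame standing in for input.csv (hypothetical values).
sample = pd.DataFrame({
    "age": [30.0, 45.0, None],
    "education_num": [10.0, 13.0, 9.0],
    "label": [0, 1, 0],
})
features, labels = split_features_and_labels(sample, "label")
print(features)
```

The resulting features and labels could then be passed as x and y to pandas_input_fn, with a matching tf.feature_column.numeric_column("my_another_feature") declared next to my_feature.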