[ML Algorithms Implemented in Python Series] Polynomial Features

admin

February 1, 2018


Machine Learning in Action

Polynomial regression usually requires expanding the original features into polynomial features. For example, given the features

$$ x_1,x_2,...,x_n $$

the degree-2 polynomial features after the transformation are:

$$ 1, x_1, x_2, \ldots, x_n, x_1x_1, x_1x_2, \ldots, x_1x_n, x_2x_2, \ldots, x_nx_n $$
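To get a feel for how many columns this expansion produces, here is a minimal sketch. The helper name `count_poly_terms` is made up for illustration. It enumerates the terms with `itertools.combinations_with_replacement` and checks the count against the closed form C(n + d, d), the number of monomials of total degree at most d in n variables, constant term included.

```python
import itertools
import math


def count_poly_terms(n_features, degree):
    """Count the terms of total degree 0..degree by enumerating them (hypothetical helper)."""
    count = 0
    for d in range(degree + 1):
        # Each multiset of d feature indices corresponds to exactly one product term;
        # d = 0 gives the single empty product, i.e. the constant term.
        count += sum(1 for _ in itertools.combinations_with_replacement(range(n_features), d))
    return count


for n, d in [(2, 2), (3, 2), (10, 3)]:
    # Closed form: number of monomials of total degree <= d in n variables.
    assert count_poly_terms(n, d) == math.comb(n + d, d)
    print(n, d, count_poly_terms(n, d))
```

With 2 features and degree 2 this gives 6 terms, and with 10 features and degree 3 it already gives 286, so the expanded matrix grows quickly.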

Suppose the data is:

import numpy as np

x = np.array([[1, 2],[3, 4]])
x

array([[1, 2],
       [3, 4]])

Using the `sklearn` library

The code is as follows:

from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(degree=2)

poly.fit_transform(x)

array([[ 1.,  1.,  2.,  1.,  2.,  4.],
       [ 1.,  3.,  4.,  9., 12., 16.]])
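For this example, the columns of the output follow the order of the formula above: the constant, then the degree-1 terms, then the degree-2 products. The sketch below uses plain Python to reproduce the second row and label each column. The helper `labelled_terms` is hypothetical and is not part of `sklearn`.

```python
import itertools
import math


def labelled_terms(sample, degree=2):
    """Return (label, value) pairs for one sample (hypothetical helper)."""
    pairs = [("1", 1)]  # constant (bias) term
    for d in range(1, degree + 1):
        # Index combinations with repetition, e.g. (0, 0), (0, 1), (1, 1) for d = 2.
        for idx in itertools.combinations_with_replacement(range(len(sample)), d):
            label = "*".join(f"x{i + 1}" for i in idx)
            value = math.prod(sample[i] for i in idx)
            pairs.append((label, value))
    return pairs


row = labelled_terms([3, 4])
print(row)
# [('1', 1), ('x1', 3), ('x2', 4), ('x1*x1', 9), ('x1*x2', 12), ('x2*x2', 16)]
```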

My implementation

The code is as follows:

import itertools
import functools
import numpy as np


class PolynomialFeatures(object):

    def __init__(self, degree=2):
        assert isinstance(degree, int)
        self.degree = degree
        
    def transform(self, x):
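        # If the input is 1-D, reshape it into an n x 1 matrix
        if x.ndim == 1:
            x = x[:, None]

        # Transpose so that iterating over x_t yields the feature columns
        x_t = x.T

        # Columns of the output matrix; the first is the constant (bias) term, all ones
        features = [np.ones(len(x))]
        # Build the terms degree by degree, for degrees 1 through self.degree inclusive
        for degree in range(1, self.degree + 1):
            # itertools.combinations_with_replacement(iterable, r) returns an iterator
            # over all length-r combinations of the elements, where an element may be
            # combined with itself. For example,
            # itertools.combinations_with_replacement([1, 2, 3], 2) yields
            #  (1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)
            # Here the iterable is the transpose of x, so these are combinations
            # of feature columns
            for items in itertools.combinations_with_replacement(x_t, degree):
                # functools.reduce applies the lambda cumulatively from left to right,
                # feeding each intermediate result into the next call. The example
                # from the official documentation:
                # reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])
                # computes ((((1+2)+3)+4)+5)
                # Here it multiplies the chosen columns element-wise, and the product
                # is appended to features
                features.append(functools.reduce(lambda a, b: a * b, items))
        # Transpose back so that each row is a sample, then return
        return np.asarray(features).T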
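The two standard-library tools that the comments describe can be tried on their own. The following is a small standalone sketch using only `itertools` and `functools`; the variable names are chosen for illustration.

```python
import functools
import itertools

# All size-2 combinations of [1, 2, 3], where an element may be paired with itself.
pairs = list(itertools.combinations_with_replacement([1, 2, 3], 2))
print(pairs)
# [(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)]

# reduce folds the sequence from the left: ((((1 + 2) + 3) + 4) + 5)
total = functools.reduce(lambda a, b: a + b, [1, 2, 3, 4, 5])
print(total)  # 15

# The same fold with multiplication turns each pair into a single product.
products = [functools.reduce(lambda a, b: a * b, p) for p in pairs]
print(products)  # [1, 2, 3, 4, 6, 9]
```

In `transform`, the elements being combined are whole NumPy columns instead of scalars, so each product is computed element-wise across all samples.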

my_poly = PolynomialFeatures(2)
my_poly.transform(x)

array([[ 1.,  1.,  2.,  1.,  2.,  4.],
       [ 1.,  3.,  4.,  9., 12., 16.]])
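As a rough illustration of where these features are used, the sketch below fits a polynomial regression by ordinary least squares. Everything in it is an assumption made for the example: `poly_features` is a hypothetical function form of the transform above, the data is synthetic and noise-free, and `np.linalg.lstsq` stands in for whatever regression routine one would actually use.

```python
import functools
import itertools

import numpy as np


def poly_features(x, degree=2):
    """Function form of the transform above (hypothetical helper name)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    columns = [np.ones(len(x))]  # constant (bias) column
    for d in range(1, degree + 1):
        for items in itertools.combinations_with_replacement(x.T, d):
            columns.append(functools.reduce(lambda a, b: a * b, items))
    return np.asarray(columns).T


# Synthetic, noise-free data from y = 1 + 2x + 3x^2 (made up for this sketch).
x_demo = np.linspace(-2.0, 2.0, 9)
y_demo = 1.0 + 2.0 * x_demo + 3.0 * x_demo ** 2

# Ordinary least squares on the expanded features [1, x, x^2].
design = poly_features(x_demo, degree=2)
coef, *_ = np.linalg.lstsq(design, y_demo, rcond=None)
print(np.round(coef, 6))
```

Because the data is generated without noise, the fitted coefficients should come out very close to 1, 2 and 3, matching the columns 1, x and x^2.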

