>>> import pandas as pd
>>> obj1 = pd.Series([7.3, -2.5, 3.4, 1.5], index=['a', 'c', 'd', 'e'])
>>> obj2 = pd.Series([-2.1, 3.6, -1.5, 4, 3.1], index=['a', 'c', 'e', 'f', 'g'])
>>> obj1
a    7.3
c   -2.5
d    3.4
e    1.5
dtype: float64
>>>
>>> obj2
a   -2.1
c    3.6
e   -1.5
f    4.0
g    3.1
dtype: float64
>>>
>>> obj1 + obj2
a    5.2
c    1.1
d    NaN
e    0.0
f    NaN
g    NaN
dtype: float64
>>> import numpy as np
>>> import pandas as pd
>>> obj1 = pd.DataFrame(np.arange(9.).reshape((3, 3)), columns=list('bcd'), index=['Ohio', 'Texas', 'Colorado'])
>>> obj2 = pd.DataFrame(np.arange(12.).reshape((4, 3)), columns=list('bde'), index=['Utah', 'Ohio', 'Texas', 'Oregon'])
>>> obj1
            b    c    d
Ohio      0.0  1.0  2.0
Texas     3.0  4.0  5.0
Colorado  6.0  7.0  8.0
>>>
>>> obj2
          b     d     e
Utah    0.0   1.0   2.0
Ohio    3.0   4.0   5.0
Texas   6.0   7.0   8.0
Oregon  9.0  10.0  11.0
>>>
>>> obj1 + obj2
            b   c     d   e
Colorado  NaN NaN   NaN NaN
Ohio      3.0 NaN   6.0 NaN
Oregon    NaN NaN   NaN NaN
Texas     9.0 NaN  12.0 NaN
Utah      NaN NaN   NaN NaN
>>> import numpy as np
>>> import pandas as pd
>>> frame = pd.DataFrame(np.arange(12.).reshape((4, 3)), columns=list('bde'), index=['AA', 'BB', 'CC', 'DD'])
>>> frame
      b     d     e
AA  0.0   1.0   2.0
BB  3.0   4.0   5.0
CC  6.0   7.0   8.0
DD  9.0  10.0  11.0
>>>
>>> series = frame.iloc[0]
>>> series
b    0.0
d    1.0
e    2.0
Name: AA, dtype: float64
>>>
>>> frame - series
      b    d    e
AA  0.0  0.0  0.0
BB  3.0  3.0  3.0
CC  6.0  6.0  6.0
DD  9.0  9.0  9.0
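If an index label is missing from either the DataFrame's columns or the Series' index, both objects are reindexed to the union of the two sets of labels, and positions without a match are filled with NaN: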
>>> import numpy as np
>>> import pandas as pd
>>> frame = pd.DataFrame(np.arange(12.).reshape((4, 3)), columns=list('bde'), index=['AA', 'BB', 'CC', 'DD'])
>>> frame
      b     d     e
AA  0.0   1.0   2.0
BB  3.0   4.0   5.0
CC  6.0   7.0   8.0
DD  9.0  10.0  11.0
>>>
>>> series = pd.Series(range(3), index=['b', 'e', 'f'])
>>> series
b    0
e    1
f    2
dtype: int64
>>>
>>> frame + series
      b   d     e   f
AA  0.0 NaN   3.0 NaN
BB  3.0 NaN   6.0 NaN
CC  6.0 NaN   9.0 NaN
DD  9.0 NaN  12.0 NaN
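To make the union reindexing concrete, the sketch below performs the alignment by hand and compares it with what `+` does automatically. The names `union`, `manual`, and `automatic` are illustrative only, and this mirrors the observable behaviour rather than describing how pandas implements alignment internally.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12.).reshape((4, 3)),
                     columns=list('bde'),
                     index=['AA', 'BB', 'CC', 'DD'])
series = pd.Series(range(3), index=['b', 'e', 'f'])

# Union of the frame's column labels and the series' index labels.
union = frame.columns.union(series.index)

# Reindex both operands onto the union by hand, then add.
# Labels missing from either side become NaN after reindexing.
manual = frame.reindex(columns=union) + series.reindex(union)

# The automatic alignment performed by `+`.
automatic = frame + series

print(list(union))
print(manual.equals(automatic))
```

Columns `d` and `f` each exist on only one side, so every value in them is NaN in both results.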
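To match on rows and broadcast across columns instead, you must use one of the arithmetic methods; the axis you pass is the axis to match on. In the example below, the goal is to match on the DataFrame's row index (axis='index' or axis=0) and broadcast across the columns: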
>>> import numpy as np
>>> import pandas as pd
>>> frame = pd.DataFrame(np.arange(12.).reshape((4, 3)), columns=list('bde'), index=['AA', 'BB', 'CC', 'DD'])
>>> frame
      b     d     e
AA  0.0   1.0   2.0
BB  3.0   4.0   5.0
CC  6.0   7.0   8.0
DD  9.0  10.0  11.0
>>>
>>> series = frame['d']
>>> series
AA     1.0
BB     4.0
CC     7.0
DD    10.0
Name: d, dtype: float64
>>>
>>> frame.sub(series, axis='index')
      b    d    e
AA -1.0  0.0  1.0
BB -1.0  0.0  1.0
CC -1.0  0.0  1.0
DD -1.0  0.0  1.0
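For readers who know NumPy broadcasting, row matching can be pictured as subtracting a column vector from the underlying array. The following is a rough sketch of that analogy, assuming the row labels already line up; `by_method` and `by_numpy` are illustrative names, and the NumPy version ignores labels entirely, which is exactly what the pandas method adds on top.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12.).reshape((4, 3)),
                     columns=list('bde'),
                     index=['AA', 'BB', 'CC', 'DD'])
series = frame['d']

# pandas: match the Series against the row index, broadcast over columns.
by_method = frame.sub(series, axis='index')

# NumPy analogy: reshape the 4 values into a (4, 1) column vector so that
# it broadcasts across the 3 columns of the (4, 3) array.
column_vector = series.to_numpy()[:, None]
by_numpy = pd.DataFrame(frame.to_numpy() - column_vector,
                        index=frame.index, columns=frame.columns)

print(by_method.equals(by_numpy))
```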
【01x04】Pandas arithmetic methods
The Pandas arithmetic methods are listed in the table below:
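| Method | Reversed counterpart | Description |
| --- | --- | --- |
| add() | radd() | Addition (+) |
| sub(), subtract() | rsub() | Subtraction (-) |
| mul(), multiply() | rmul() | Multiplication (*) |
| pow() | rpow() | Exponentiation (**) |
| truediv(), div(), divide() | rdiv() | Division (/) |
| floordiv() | rfloordiv() | Floor division (//) |
| mod() | rmod() | Modulo (%) |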
Each reversed counterpart is the original method name prefixed with r, and it swaps the two operands:
>>> import numpy as np
>>> import pandas as pd
>>> obj = pd.DataFrame(np.arange(12.).reshape((3, 4)), columns=list('abcd'))
>>> obj
     a    b     c     d
0  0.0  1.0   2.0   3.0
1  4.0  5.0   6.0   7.0
2  8.0  9.0  10.0  11.0
>>>
>>> 1 / obj
       a         b         c         d
0    inf  1.000000  0.500000  0.333333
1  0.250  0.200000  0.166667  0.142857
2  0.125  0.111111  0.100000  0.090909
>>>
>>> obj.rdiv(1)
       a         b         c         d
0    inf  1.000000  0.500000  0.333333
1  0.250  0.200000  0.166667  0.142857
2  0.125  0.111111  0.100000  0.090909
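The same operand swap applies to the other reversed methods. The short check below compares a few methods with their operator forms; the small frame and the variable names are made up for illustration, and its values avoid zero so that no division produces inf.

```python
import numpy as np
import pandas as pd

# Values 1.0 .. 6.0, so no division by zero occurs below.
obj = pd.DataFrame(np.arange(1., 7.).reshape((2, 3)), columns=list('abc'))

# A forward method puts the DataFrame on the left of the operator.
sub_matches = obj.sub(10).equals(obj - 10)

# A reversed method puts the DataFrame on the right of the operator.
rsub_matches = obj.rsub(10).equals(10 - obj)
rpow_matches = obj.rpow(2).equals(2 ** obj)
rfloordiv_matches = obj.rfloordiv(7).equals(7 // obj)

print(sub_matches, rsub_matches, rpow_matches, rfloordiv_matches)
```

Order only matters for non-commutative operations: `radd()` and `rmul()` give the same values as `add()` and `mul()`, while `rsub()`, `rdiv()`, `rpow()`, `rfloordiv()`, and `rmod()` do not match their forward forms in general.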
>>> import pandas as pd
>>> import numpy as np
>>> obj1 = pd.DataFrame(np.arange(12.).reshape((3, 4)), columns=list('abcd'))
>>> obj2 = pd.DataFrame(np.arange(20.).reshape((4, 5)), columns=list('abcde'))
>>>
>>> obj2.loc[1, 'b'] = np.nan
>>>
>>> obj1
     a    b     c     d
0  0.0  1.0   2.0   3.0
1  4.0  5.0   6.0   7.0
2  8.0  9.0  10.0  11.0
>>>
>>> obj2
      a     b     c     d     e
0   0.0   1.0   2.0   3.0   4.0
1   5.0   NaN   7.0   8.0   9.0
2  10.0  11.0  12.0  13.0  14.0
3  15.0  16.0  17.0  18.0  19.0
>>>
>>> obj1 + obj2
      a     b     c     d   e
0   0.0   2.0   4.0   6.0 NaN
1   9.0   NaN  13.0  15.0 NaN
2  18.0  20.0  22.0  24.0 NaN
3   NaN   NaN   NaN   NaN NaN
>>>
>>> obj1.add(obj2, fill_value=10)
      a     b     c     d     e
0   0.0   2.0   4.0   6.0  14.0
1   9.0  15.0  13.0  15.0  19.0
2  18.0  20.0  22.0  24.0  24.0
3  25.0  26.0  27.0  28.0  29.0
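One detail the example above does not exercise: `fill_value` stands in for a value that is missing on only one side. When the same position is missing in both operands, the result there is still NaN. A minimal sketch with two made-up single-column frames (`left`, `right`):

```python
import numpy as np
import pandas as pd

left = pd.DataFrame({'x': [1.0, np.nan, np.nan]})
right = pd.DataFrame({'x': [10.0, 20.0, np.nan]})

# Row 0: both present        -> 1.0 + 10.0
# Row 1: missing on the left -> fill_value (0) + 20.0
# Row 2: missing on both     -> stays NaN, fill_value is not applied
result = left.add(right, fill_value=0)
print(result)
```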
>>> import numpy as np
>>> import pandas as pd
>>> obj = pd.Series([1, np.nan, 2, None, 3], index=list('abcde'))
>>> obj
a    1.0
b    NaN
c    2.0
d    NaN
e    3.0
dtype: float64
>>>
>>> obj.fillna(0)
a    1.0
b    0.0
c    2.0
d    0.0
e    3.0
dtype: float64
>>>
>>> obj.fillna(method='ffill')
a    1.0
b    1.0
c    2.0
d    2.0
e    3.0
dtype: float64
>>>
>>> obj.fillna(method='bfill')
a    1.0
b    2.0
c    2.0
d    3.0
e    3.0
dtype: float64
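Recent pandas releases deprecate the `method` argument of `fillna()`, so the last two calls above may emit a warning or fail, depending on the installed version. The dedicated `ffill()` and `bfill()` methods should produce the same fills; the sketch below uses them on the same Series.

```python
import numpy as np
import pandas as pd

obj = pd.Series([1, np.nan, 2, None, 3], index=list('abcde'))

# Forward fill: each missing value takes the last valid value before it.
forward = obj.ffill()

# Backward fill: each missing value takes the next valid value after it.
backward = obj.bfill()

print(forward.tolist())
print(backward.tolist())
```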