python - DataFrame's elements as numpy arrays
I'm trying to change a DataFrame's values like this: df['tokens'] = tokens, where tokens is a 2-D np.array. I expected to get a column in which each element is a 1-D np.array, but I found that each element took only the first element of the corresponding 1-D array. Is there a way to store arrays in a DataFrame's elements?
Is this what you want?
In [26]: df = pd.DataFrame(np.random.rand(5,2), columns=list('ab'))

In [27]: df
Out[27]:
          a         b
0  0.513723  0.886019
1  0.197956  0.172094
2  0.131495  0.476552
3  0.678821  0.106523
4  0.440118  0.802589

In [28]: arr = df.values

In [29]: arr
Out[29]:
array([[ 0.51372311,  0.88601887],
       [ 0.19795635,  0.17209383],
       [ 0.13149478,  0.47655197],
       [ 0.67882124,  0.10652332],
       [ 0.44011802,  0.80258924]])

In [30]: df['c'] = arr.tolist()

In [31]: df
Out[31]:
          a         b                                           c
0  0.513723  0.886019    [0.5137231110962795, 0.8860188692834928]
1  0.197956  0.172094  [0.19795634688449892, 0.17209383434042336]
2  0.131495  0.476552  [0.13149477867656167, 0.47655196508193576]
3  0.678821  0.106523    [0.6788212365523125, 0.10652331756477551]
4  0.440118  0.802589    [0.44011802077658635, 0.8025892383754725]
Timing for a 5M-row DataFrame:
In [36]: big = pd.concat([df] * 10**6, ignore_index=True)

In [38]: big.shape
Out[38]: (5000000, 2)

In [39]: arr = big.values

In [40]: %timeit arr.tolist()
1 loop, best of 3: 2.27 s per loop

In [41]: %timeit list(arr)
1 loop, best of 3: 3.62 s per loop
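If each cell should hold an actual 1-D np.array rather than a Python list, the list(arr) variant from the timing comparison is one way to get there. Below is a minimal sketch; the small tokens array and the id column are made-up stand-ins for the question's data, not taken from it.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's 2-D `tokens` array (3 rows, 2 columns).
tokens = np.arange(6, dtype=float).reshape(3, 2)
df = pd.DataFrame({"id": [10, 11, 12]})

# list(tokens) iterates over the first axis and yields one 1-D ndarray per row;
# pandas then stores them in an object-dtype column, one array per cell.
df["tokens"] = list(tokens)

first = df["tokens"].iloc[0]
print(type(first).__name__, first.shape)
print(df["tokens"].dtype)
```

With arr.tolist() the cells hold plain Python lists instead, which was the faster option in the timings above, so the choice probably depends on whether later code needs NumPy operations on each cell.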