Get first row of dataframe in Python Pandas based on criteria
Let's say that I have a dataframe like this one:
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]], columns=['a', 'b', 'c'])
>>> df
   a  b  c
0  1  2  1
1  1  3  2
2  4  6  3
3  4  3  4
4  5  4  5
The original table is more complicated, with more columns and rows.
I want to get the first row that fulfils some criteria. Examples:
- Get first row where a > 3 (returns row 2)
- Get first row where a > 4 AND b > 3 (returns row 4)
- Get first row where a > 3 AND (b > 3 OR c > 2) (returns row 2)
But if there isn't any row that fulfils the specific criteria, then I want to get the first one after sorting descending by a (or, in other cases, by b, c etc.)
- Get first row where a > 6 (returns row 4 by ordering by a desc and taking the first one)
I was able to do it by iterating on the dataframe (I know that's crap :P). So, I would prefer a more pythonic way to solve it.
This tutorial is a very good one for pandas slicing. Make sure you check it out. Onto some snippets... To slice a dataframe with a condition, you use this format:
>>> df[condition]
This will return a slice of your dataframe which you can index using iloc. Here are your examples:
Get first row where a > 3 (returns row 2)
>>> df[df.a > 3].iloc[0]
a    4
b    6
c    3
Name: 2, dtype: int64
If what you actually want is the row number, rather than using iloc, it would be df[df.a > 3].index[0].
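To make the difference between the two concrete, here is a minimal sketch on the same example frame. The variable names (mask, first_row, first_label, empty_raises) are only for illustration. It also shows what happens when nothing matches, which is why the last case needs special handling: indexing an empty slice with iloc[0] raises an IndexError.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
    columns=["a", "b", "c"],
)

mask = df.a > 3                  # boolean Series, one entry per row
first_row = df[mask].iloc[0]     # the first matching row, as a Series
first_label = df[mask].index[0]  # the index label of that row

# When no row matches, the slice is empty and .iloc[0] raises IndexError.
try:
    df[df.a > 6].iloc[0]
    empty_raises = False
except IndexError:
    empty_raises = True
```

Note that index[0] gives the index label, which only coincides with the row position when the frame has the default 0..n-1 index, as it does here.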
Get first row where a > 4 AND b > 3:
>>> df[(df.a > 4) & (df.b > 3)].iloc[0]
a    5
b    4
c    5
Name: 4, dtype: int64
Get first row where a > 3 AND (b > 3 OR c > 2) (returns row 2)
>>> df[(df.a > 3) & ((df.b > 3) | (df.c > 2))].iloc[0]
a    4
b    6
c    3
Name: 2, dtype: int64
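The parentheses around each comparison matter: in Python, & and | bind more tightly than > does, so leaving them out changes how the expression is parsed. One way to keep longer conditions readable is to name each mask first and combine them afterwards. This is just a sketch of that style, with variable names made up for the example:

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
    columns=["a", "b", "c"],
)

# Each comparison produces a boolean Series aligned with df's index.
a_gt_3 = df.a > 3
b_gt_3 = df.b > 3
c_gt_2 = df.c > 2

# Combine element-wise: & is AND, | is OR.
combined = a_gt_3 & (b_gt_3 | c_gt_2)

first = df[combined].iloc[0]  # first row where the combined mask is True
```

Rows 2, 3 and 4 satisfy the combined condition here, so the first match is row 2.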
Now, with your last case we can write a function that handles the default case of returning the descending-sorted frame:
>>> def series_or_default(x, condition, default_col, ascending=False):
...     sliced = x[condition]
...     if sliced.shape[0] == 0:
...         return x.sort_values(default_col, ascending=ascending).iloc[0]
...     return sliced.iloc[0]
>>>
>>> series_or_default(df, df.a > 6, 'a')
a    5
b    4
c    5
Name: 4, dtype: int64
As expected, it returns row 4.
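Since the question also mentions falling back to b, c etc., here is the same helper as a standalone script, called with a different default column. It is a small variation rather than a different method: the only change is using the .empty attribute for the emptiness check. The result names (by_a, by_b, matched) are just for the example.

```python
import pandas as pd


def series_or_default(x, condition, default_col, ascending=False):
    """Return the first row of x matching condition, or, if none match,
    the first row of x after sorting by default_col."""
    sliced = x[condition]
    if sliced.empty:  # no row satisfied the condition
        return x.sort_values(default_col, ascending=ascending).iloc[0]
    return sliced.iloc[0]


df = pd.DataFrame(
    [[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
    columns=["a", "b", "c"],
)

by_a = series_or_default(df, df.a > 6, "a")     # no match: largest a
by_b = series_or_default(df, df.a > 6, "b")     # no match: largest b
matched = series_or_default(df, df.a > 3, "a")  # match exists: first match
```

With no match, sorting by a gives row 4 and sorting by b gives row 2 (b is 6 there). When the condition does match, the default column is ignored and the first matching row comes back.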