Get first row of dataframe in Python Pandas based on criteria


Let's say I have a dataframe like this one:

import pandas as pd
df = pd.DataFrame([[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]], columns=['a', 'b', 'c'])

>>> df
   a  b  c
0  1  2  1
1  1  3  2
2  4  6  3
3  4  3  4
4  5  4  5

The original table is more complicated, with more columns and rows.

I want to get the first row that fulfils some criteria. Examples:

  1. Get the first row where a > 3 (returns row 2)
  2. Get the first row where a > 4 and b > 3 (returns row 4)
  3. Get the first row where a > 3 and (b > 3 or c > 2) (returns row 2)

But if there isn't any row that fulfils the specific criteria, then I want the first one after sorting descending by a (or, in other cases, by b, c, etc.)

  1. Get the first row where a > 6 (returns row 4, by ordering by a descending and taking the first one)

I was able to do it by iterating over the dataframe (I know that's crap :P). So, I'd prefer a more pythonic way to solve it.

This tutorial is a very good one for pandas slicing. Make sure you check it out. Onto some snippets... To slice a dataframe with a condition, you use this format:

>>> df[condition] 

This will return a slice of your dataframe, which you can index using iloc. Here are your examples:

  1. Get the first row where a > 3 (returns row 2)

    >>> df[df.a > 3].iloc[0]
    a    4
    b    6
    c    3
    Name: 2, dtype: int64

If what you actually want is the row number, then rather than using iloc, it would be df[df.a > 3].index[0].

  2. Get the first row where a > 4 and b > 3 (returns row 4)

    >>> df[(df.a > 4) & (df.b > 3)].iloc[0]
    a    5
    b    4
    c    5
    Name: 4, dtype: int64

  3. Get the first row where a > 3 and (b > 3 or c > 2) (returns row 2)

    >>> df[(df.a > 3) & ((df.b > 3) | (df.c > 2))].iloc[0]
    a    4
    b    6
    c    3
    Name: 2, dtype: int64
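To make the pieces in these slices explicit, here is a small sketch using the same frame as the question. The variable names (mask, first_row, first_label, empty) are only illustrative. It shows that the condition is just a boolean Series, how iloc[0] and index[0] differ, and what happens when no row matches.

```python
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame([[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
                  columns=['a', 'b', 'c'])

mask = df.a > 3                   # boolean Series: one True/False per row
first_row = df[mask].iloc[0]      # first matching row, returned as a Series
first_label = df[mask].index[0]   # index label of that same row

print(mask.tolist())   # [False, False, True, True, True]
print(first_label)     # 2

# When nothing matches, the slice is empty and .iloc[0] raises IndexError.
empty = df[df.a > 6]
try:
    empty.iloc[0]
except IndexError:
    print("no row matched")
```

The empty-slice case is the one that needs separate handling, since iloc[0] has nothing to return there.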

Now, for your last case, we can write a function that handles the default case of returning the first row of the descending-sorted frame:

>>> def series_or_default(x, condition, default_col, ascending=False):
...     sliced = x[condition]
...     if sliced.shape[0] == 0:
...         return x.sort_values(default_col, ascending=ascending).iloc[0]
...     return sliced.iloc[0]
>>>
>>> series_or_default(df, df.a > 6, 'a')
a    5
b    4
c    5
Name: 4, dtype: int64

As expected, it returns row 4.
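For the "other cases by b, c etc." part of the question, the same function can probably be reused by changing default_col and ascending. The sketch below repeats the function so that it runs on its own; the variable names (no_match, by_b_desc, by_c_asc, hit) are only illustrative.

```python
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame([[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
                  columns=['a', 'b', 'c'])

def series_or_default(x, condition, default_col, ascending=False):
    # Same function as in the answer, repeated so the snippet is self-contained.
    sliced = x[condition]
    if sliced.shape[0] == 0:
        # Nothing matched: fall back to the first row after sorting.
        return x.sort_values(default_col, ascending=ascending).iloc[0]
    return sliced.iloc[0]

no_match = df.a > 6  # no row satisfies this condition

by_b_desc = series_or_default(df, no_match, 'b')                 # row with the largest b
by_c_asc = series_or_default(df, no_match, 'c', ascending=True)  # row with the smallest c
hit = series_or_default(df, df.a > 3, 'b')                       # condition matches, fallback unused

print(by_b_desc.name, by_c_asc.name, hit.name)  # 2 0 2
```

Note that when the sort column contains ties for the top value, which of the tied rows comes first may depend on the sort algorithm, so it is safer to fall back on a column with a unique extreme value or pass kind='stable' to sort_values.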

