Spark: conditional replacement that keeps already filled values


I want to fill missing (null) values in Spark conditionally, to make sure I have considered each corner case of the data, rather than simply filling everything with one replacement value.

A sample could look like this:

    case class FooBar(foo: String, bar: String)

    val mydf = Seq(("a", "first"), ("b", "second"), ("c", null), ("third", "foobar"), ("somemore", null))
      .toDF("foo", "bar")
      .as[FooBar]

    +--------+------+
    |     foo|   bar|
    +--------+------+
    |       a| first|
    |       b|second|
    |       c|  null|
    |   third|foobar|
    |somemore|  null|
    +--------+------+

Unfortunately,

    mydf
      .withColumn(
        "bar",
        when(($"foo" === "c") && ($"bar".isNull), "somereplacement")
      ).show

resets all the other, regular values in the column:

    +--------+---------------+
    |     foo|            bar|
    +--------+---------------+
    |       a|           null|
    |       b|           null|
    |       c|somereplacement|
    |   third|           null|
    |somemore|           null|
    +--------+---------------+

And

    mydf
      .withColumn(
        "bar",
        when(
          (($"foo" === "c") && ($"bar".isNull)) ||
          (($"foo" === "somemore") && ($"bar".isNull)),
          "somereplacement"
        )
      ).show

which I would like to use to fill in values for different classes / categories of foo, does not work well either.

I am curious how to fix this.

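A when without otherwise returns null for every row that does not match the condition, which is why the regular values disappear. Use otherwise to fall back to the original column: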

    when(
      (($"foo" === "c") && ($"bar".isNull)) ||
      (($"foo" === "somemore") && ($"bar".isNull)),
      "somereplacement"
    ).otherwise($"bar")
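To see the otherwise variant end to end, here is a minimal sketch. It assumes a local SparkSession; the object name `OtherwiseSketch`, the helper `fillBar`, and the app name are hypothetical and only serve the illustration. The sample rows mirror the question's data.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

// Hypothetical wrapper object, used only for this illustration.
object OtherwiseSketch {
  // Replace a null `bar` only for the listed `foo` categories.
  def fillBar(df: DataFrame): DataFrame =
    df.withColumn(
      "bar",
      when(
        (col("foo") === "c" || col("foo") === "somemore") && col("bar").isNull,
        "somereplacement"
      ).otherwise(col("bar")) // non-matching rows keep their original value
    )

  def main(args: Array[String]): Unit = {
    // Local session for experimentation; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("otherwise-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq[(String, String)](
      ("a", "first"), ("b", "second"), ("c", null),
      ("third", "foobar"), ("somemore", null)
    ).toDF("foo", "bar")

    fillBar(df).show()
    spark.stop()
  }
}
```

With this data, the rows for c and somemore should end up with somereplacement, while first, second and foobar are left untouched.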

Or use coalesce:

    coalesce(
      $"bar",
      when(($"foo" === "c") || ($"foo" === "somemore"), "somereplacement")
    )

The reason you may prefer coalesce is simply less typing: it returns the first non-null argument, so you don't have to repeat $"bar".isNull in every condition.
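Since the question asks about different classes / categories of foo, the coalesce form can probably be combined with chained when calls so that each category gets its own replacement. This is only a sketch under assumptions: the object name `CoalesceSketch`, the helper `fillBarPerCategory`, the replacement strings, and the extra row ("d", null) are hypothetical and not part of the original data.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, when}

// Hypothetical wrapper object, used only for this illustration.
object CoalesceSketch {
  // Keep `bar` when it is non-null; otherwise pick a replacement per `foo` category.
  def fillBarPerCategory(df: DataFrame): DataFrame =
    df.withColumn(
      "bar",
      coalesce(
        col("bar"),
        // Chained when without otherwise: categories not listed here yield null,
        // so their missing values stay null instead of being filled blindly.
        when(col("foo") === "c", "replacement_for_c")
          .when(col("foo") === "somemore", "replacement_for_somemore")
      )
    )

  def main(args: Array[String]): Unit = {
    // Local session for experimentation; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("coalesce-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq[(String, String)](
      ("a", "first"), ("c", null), ("somemore", null),
      ("d", null) // hypothetical extra row: no rule for "d", so it stays null
    ).toDF("foo", "bar")

    fillBarPerCategory(df).show()
    spark.stop()
  }
}
```

Leaving out a final otherwise is deliberate here: any category without an explicit rule keeps its null, which makes unhandled corner cases easy to spot afterwards.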

