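My requirement is to extract the order number from the comment column. The order number always starts with R and should be added to the table as a new column.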
Input data:
code,id,mode,location,status,comment
AS-SD,101,Airways,hyderabad,D,order got delayed R1657
FY-YT,102,Airways,Delhi,ND,R7856 package damaged
TY-OP,103,Airways,Pune,D,Order number R5463 not received
Expected output:
AS-SD,101,Airways,hyderabad,D,order got delayed R1657,R1657
FY-YT,102,Airways,Delhi,ND,R7856 package damaged,R7856
TY-OP,103,Airways,Pune,D,Order number R5463 not received,R5463
I tried this in spark-sql, using the following query:
val r = sqlContext.sql("select substring(comment, PatIndex('%[0-9]%',comment, length(comment))) as number from A")
However, I get the following error:
org.apache.spark.sql.AnalysisException: undefined function PatIndex; line 0 pos 0
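
For context, PatIndex is a T-SQL function and is not one of Spark SQL's built-in functions, which is consistent with the "undefined function" error. The extraction itself can be sketched in plain Scala with a regular expression. This is only an illustration under the assumption that an order number is always the letter R followed by digits; the object name OrderNumberSketch and the helper extractOrderNumber are hypothetical and not part of the original query.

```scala
// Minimal sketch, not a definitive implementation.
// Assumption: an order number is "R" followed by one or more digits.
object OrderNumberSketch {
  // \b keeps the match from starting in the middle of a longer word
  private val OrderPattern = """\bR\d+""".r

  // Hypothetical helper: returns the first order number found in the
  // comment, or an empty string when the comment contains none.
  def extractOrderNumber(comment: String): String =
    OrderPattern.findFirstIn(comment).getOrElse("")

  def main(args: Array[String]): Unit = {
    // Sample rows copied from the input data above (header omitted)
    val rows = Seq(
      "AS-SD,101,Airways,hyderabad,D,order got delayed R1657",
      "FY-YT,102,Airways,Delhi,ND,R7856 package damaged",
      "TY-OP,103,Airways,Pune,D,Order number R5463 not received"
    )
    rows.foreach { row =>
      // The comment is the last comma-separated field in each row
      val comment = row.split(",", -1).last
      println(s"$row,${extractOrderNumber(comment)}")
    }
  }
}
```

In Spark SQL itself, the built-in regexp_extract function takes the same kind of pattern, for example regexp_extract(comment, 'R[0-9]+', 0) in place of the substring/PatIndex expression. That is a direction to try against the Spark version in use rather than a tested query.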