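Multiple choice: Consider the following business scenario. Users' web access log files are already stored on HDFS. Each access record has three fields (name, gender, and online time) separated by ",". The task is to print all female users whose total online time is greater than two hours. Which of the following code snippets implement this scenario?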
(A)
    sc.textFile("/date/file/path").map(_.split(",")).map(p => FemaleInfo(p(0), p(1), p(2).trim.toInt)).toDF.registerTempTable("FemaleinfoTable")
    sqlContext.sql("select name, sum(stayTime) as stayTime from FemaleinfoTable where gender = 'female' group by name").filter("stayTime >= 120").collect().foreach(println)
(B)
    sc.textFile("/date/file/path").map(_.split(",")).map(p => FemaleInfo(p(0), p(1), p(2).trim.toInt)).toDF.registerTempTable("FemaleinfoTable")
    sqlContext.sql("select name, sum(stayTime) as stayTime from FemaleinfoTable where gender = 'female'").filter("stayTime >= 120").collect().foreach(println)
(C)
    val text = sc.textFile("/date/file/path")
    val date = text.filter(_.contains("female"))
    val femaleDate: RDD[(String, Int)] = date.map { line => val t = line.split(','); (t(0), t(2).toInt) }.reduceByKey(_ + _)
    val result = femaleDate.filter(line => line._2 > 120)
    result.collect().map(x => x._1 + ',' + x._2).foreach(println)
(D)
    val text = sc.textFile("/date/file/path")
    val date = text.filter(_.contains("female"))
    val femaleDate: RDD[(String, Int)] = date.map { line => val t = line.split(','); (t(0), t(2).toInt) }
    val result = femaleDate.filter(line => line._2 > 120)
    result.collect().map(x => x._1 + ',' + x._2).foreach(println)
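
The logic the scenario asks for (keep female records, sum online time per name, keep totals above two hours) can be sketched without a Spark cluster using plain Scala collections. This is only an illustrative sketch: the object name `FemaleOnlineTime`, the helper `femalesOver`, and the sample records are all hypothetical, and the online time is assumed to be in minutes, which is what the threshold of 120 in the options suggests.

```scala
// Hypothetical sample records in the "name,gender,onlineTime" format.
// The values are invented for illustration; minutes are assumed as the unit.
object FemaleOnlineTime {
  val sampleLines: Seq[String] = Seq(
    "LiuYang,female,20",
    "YuanJing,male,10",
    "LiuYang,female,110",
    "GuoYijun,male,5",
    "CaiXuyu,female,50",
    "CaiXuyu,female,60"
  )

  // Sum online time per name for female users and keep only the totals
  // that are strictly greater than the threshold.
  def femalesOver(lines: Seq[String], thresholdMinutes: Int): Map[String, Int] =
    lines
      .map(_.split(",").map(_.trim))
      .collect { case Array(name, gender, time) if gender == "female" => (name, time.toInt) }
      .groupBy(_._1)
      .map { case (name, records) => name -> records.map(_._2).sum }
      .filter { case (_, total) => total > thresholdMinutes }

  def main(args: Array[String]): Unit =
    femalesOver(sampleLines, 120).foreach { case (name, total) => println(s"$name,$total") }
}
```

The sketch aggregates before it filters, so a user whose individual sessions are all short can still exceed the threshold in total. It also compares the gender field exactly, whereas a substring test on the whole line would additionally match any record whose name happens to contain "female".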