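Requirement: given a file that lists each child together with a parent, output every grandparent and grandchild pair. This is a single-table join: the table is joined with itself on the person who appears both as a parent and as a child.

Input file (单表关联.txt):
child parent
Tom Lucy
Tom Jack
Jone Lucy
Jone Jack
Lucy Mary
Lucy Ben
Jack Alice
Jack Jesse
Terry Alice
Terry Jesse
Philip Terry
Philip Alma
Mark Terry
Mark Alma

Result:
grand child
Alice Tom
Jesse Tom
Alice Jone
Jesse Jone
Ben Tom
Mary Tom
Ben Jone
Mary Jone
Alice Philip
Jesse Philip
Alice Mark
Jesse Mark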
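Before looking at the MapReduce version, it may help to see the join itself in plain Java. The following is only a minimal in-memory sketch, not part of the job: the class name GrandJoinSketch and the method grandPairs are made up for illustration, and the sample rows are a small subset of the input above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of the self-join; no Hadoop involved.
// GrandJoinSketch and grandPairs are hypothetical names.
class GrandJoinSketch {
    // Each row is {child, parent}; returns "grandparent child" strings.
    static List<String> grandPairs(String[][] rows) {
        // Index: person -> that person's parents.
        Map<String, List<String>> parentsOf = new HashMap<>();
        for (String[] r : rows) {
            parentsOf.computeIfAbsent(r[0], k -> new ArrayList<>()).add(r[1]);
        }
        List<String> out = new ArrayList<>();
        for (String[] r : rows) {
            // r[1] is a parent of r[0], so the parents of r[1]
            // are grandparents of r[0].
            for (String grand : parentsOf.getOrDefault(r[1], new ArrayList<>())) {
                out.add(grand + " " + r[0]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"Tom", "Lucy"}, {"Tom", "Jack"}, {"Lucy", "Mary"}, {"Lucy", "Ben"}
        };
        for (String pair : grandPairs(rows)) {
            System.out.println(pair);
        }
    }
}
```

The MapReduce job below reaches the same pairs without a shared in-memory index: the shuffle groups records by the middle person instead.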
1.Mapper.class
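import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SingleMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        // Skip the header line.
        if (line.contains("child ") || line.contains("parent")) {
            return;
        }
        // line.split("\t") threw an ArrayIndexOutOfBoundsException here,
        // so tokenize on any whitespace instead.
        StringTokenizer tokens = new StringTokenizer(line);
        // Only consume complete (child, parent) pairs.
        while (tokens.countTokens() >= 2) {
            String child = tokens.nextToken();
            String parent = tokens.nextToken();
            // Tag 1: the key is the middle person, the value is that person's child (grandchild side).
            context.write(new Text(parent), new Text("1," + child));
            // Tag 0: the key is the middle person, the value is that person's parent (grandparent side).
            context.write(new Text(child), new Text("0," + parent));
        }
    }
}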
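To make the tagging concrete, here is a small Hadoop-free sketch of what the map step writes for a single input line. The class name TagEmitSketch and the method emit are hypothetical; each returned array stands in for one context.write(key, value) call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical stand-in for the map step, without Hadoop types.
class TagEmitSketch {
    // Returns the {key, value} pairs the map step would write for one line.
    static List<String[]> emit(String line) {
        List<String[]> out = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        if (tokens.countTokens() < 2) {
            return out; // blank or incomplete line: nothing to emit
        }
        String child = tokens.nextToken();
        String parent = tokens.nextToken();
        out.add(new String[] {parent, "1," + child}); // grandchild side
        out.add(new String[] {child, "0," + parent}); // grandparent side
        return out;
    }

    public static void main(String[] args) {
        for (String[] kv : emit("Tom Lucy")) {
            System.out.println(kv[0] + "\t" + kv[1]);
        }
    }
}
```

For the line "Tom Lucy" this yields the key Lucy with value "1,Tom" and the key Tom with value "0,Lucy", so every person ends up as a key that collects both their children and their parents.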
2.Reducer.class
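import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SingleReduce extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void setup(Context context)
            throws IOException, InterruptedException {
        // Runs once per reduce task, so each output file starts with this header.
        context.write(new Text("grand"), new Text("child"));
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        List<String> left = new ArrayList<String>();  // grandchildren
        List<String> right = new ArrayList<String>(); // grandparents
        // All values here share the same key (the middle person).
        for (Text v : values) {
            String s = v.toString();
            // Test the tag as a prefix; contains("1") would also match a name containing the digit 1.
            if (s.startsWith("1,")) {
                left.add(s.split(",", 2)[1]);
            } else {
                right.add(s.split(",", 2)[1]);
            }
        }
        // Cartesian product of the two sides.
        for (int i = 0; i < left.size(); i++) {
            for (int j = 0; j < right.size(); j++) {
                context.write(new Text(right.get(j)), new Text(left.get(i)));
            }
        }
    }
}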
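The reduce step for one key can be sketched in plain Java as follows. This is an illustration under assumptions, not the job's code: ReduceJoinSketch and join are hypothetical names, and the tag is checked as a "1," prefix so that a name containing the digit 1 cannot be mistaken for a tagged child.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for one reduce call, without Hadoop types.
class ReduceJoinSketch {
    // values: the tagged records that arrived under one key (the middle person).
    static List<String> join(List<String> values) {
        List<String> children = new ArrayList<>(); // tag 1
        List<String> parents = new ArrayList<>();  // tag 0
        for (String v : values) {
            String name = v.substring(v.indexOf(',') + 1);
            if (v.startsWith("1,")) {
                children.add(name);
            } else {
                parents.add(name);
            }
        }
        // Cross product: every parent of the key is a grandparent
        // of every child of the key.
        List<String> out = new ArrayList<>();
        for (String child : children) {
            for (String grand : parents) {
                out.add(grand + "\t" + child);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Values as they might arrive for the key "Lucy".
        for (String row : join(Arrays.asList("1,Tom", "1,Jone", "0,Mary", "0,Ben"))) {
            System.out.println(row);
        }
    }
}
```

A key that has only children or only parents produces nothing, which is why people such as Mary or Tom in the sample data contribute no rows on their own.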
3.Driver.class
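import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SingleDriver {
    public static void main(String[] args)
            throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        Path outfile = new Path("file:///D:/输出结果/singleout");
        FileSystem fs = outfile.getFileSystem(conf);
        // Delete a previous output directory, otherwise the job refuses to start.
        if (fs.exists(outfile)) {
            fs.delete(outfile, true);
        }
        Job job = Job.getInstance(conf);
        job.setJarByClass(SingleDriver.class);
        job.setJobName("Single Table Join");
        job.setMapperClass(SingleMapper.class);
        job.setReducerClass(SingleReduce.class);
        // Map output and final output both use Text keys and Text values.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path("file:///D:/测试数据/单表关联.txt/"));
        FileOutputFormat.setOutputPath(job, outfile);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}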
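Summary: when a join is used to resolve a table-relation query in MapReduce, the crucial step is settling on the tag that marks which side of the join each record belongs to. The join field is normally emitted as the key, the tagged values that arrive under the same key are compared and filtered by tag, and the matching pairs are finally written out with context.write().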