about云开发


[Q&A] MapReduce: top N line items per order

若余相思28 posted on 2018-11-7 11:52:10
Requirement: use MapReduce to find, for each order ID in the data below, the top N line items by total amount (price × quantity).

order001,u001,小米6,1999.9,2
order001,u001,雀巢咖啡,99.0,2
order001,u001,安慕希,250.0,2
order001,u001,经典红双喜,200.0,4
order001,u001,防水电脑包,400.0,2
order002,u002,小米手环,199.0,3
order002,u002,榴莲,15.0,10
order002,u002,苹果,4.5,20
order002,u002,肥皂,10.0,40
order003,u001,小米6,1999.9,2
order003,u001,雀巢咖啡,99.0,2
order003,u001,安慕希,250.0,2
order003,u001,经典红双喜,200.0,4
order003,u001,防水电脑包,400.0,2
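For reference, the requirement can be stated as a small plain-Java program with no Hadoop involved. This is only a sketch of the intended result, not the MapReduce solution; the class name `TopNPerOrderSketch` and the `Item` helper are hypothetical. It groups lines by order ID, sorts each group by amount in descending order, and keeps the first n.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Plain-Java reference for the requirement; hypothetical helper, not part of the job.
class TopNPerOrderSketch {

    // One parsed input line; amount = price * quantity.
    static final class Item {
        final String orderId, userId, name;
        final float amount;

        Item(String line) {
            String[] f = line.split(",");
            orderId = f[0];
            userId = f[1];
            name = f[2];
            amount = Float.parseFloat(f[3]) * Integer.parseInt(f[4]);
        }

        @Override
        public String toString() {
            return orderId + "," + userId + "," + name + "," + amount;
        }
    }

    // Groups lines by order ID, sorts each group by amount (descending), keeps the first n.
    static Map<String, List<Item>> topN(List<String> lines, int n) {
        Map<String, List<Item>> byOrder = new LinkedHashMap<>();
        for (String line : lines) {
            Item item = new Item(line);
            byOrder.computeIfAbsent(item.orderId, k -> new ArrayList<>()).add(item);
        }
        for (Map.Entry<String, List<Item>> e : byOrder.entrySet()) {
            List<Item> items = e.getValue();
            // List.sort is stable, so items with equal amounts keep their input order.
            items.sort(Comparator.comparingDouble((Item i) -> i.amount).reversed());
            e.setValue(new ArrayList<>(items.subList(0, Math.min(n, items.size()))));
        }
        return byOrder;
    }

    public static void main(String[] args) {
        // Reads the comma-separated lines from standard input and prints the top 2 per order.
        List<String> lines = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8))
                .lines()
                .filter(l -> !l.trim().isEmpty())
                .collect(Collectors.toList());
        topN(lines, 2).values().forEach(items -> items.forEach(System.out::println));
    }
}
```

Feeding the sample lines above on standard input prints two lines per order ID, each in the form orderId,userId,productName,amount.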


Code

Bean class:
package com.red.mr.order.TeachDuan;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;
public class OrderBean implements WritableComparable<OrderBean>{
    private String orderId;
    private String userId;
    private String pdtName;
    private float price;
    private int number;
    private float amountFee;
    public void set(String orderId, String userId, String pdtName, float price, int number) {
        this.orderId = orderId;
        this.userId = userId;
        this.pdtName = pdtName;
        this.price = price;
        this.number = number;
        this.amountFee = price * number;
    }
    public String getOrderId() {
        return orderId;
    }
    public void setOrderId(String orderId) {
        this.orderId = orderId;
    }
    public String getUserId() {
        return userId;
    }
    public void setUserId(String userId) {
        this.userId = userId;
    }
    public String getPdtName() {
        return pdtName;
    }
    public void setPdtName(String pdtName) {
        this.pdtName = pdtName;
    }
    public float getPrice() {
        return price;
    }
    public void setPrice(float price) {
        this.price = price;
    }
    public int getNumber() {
        return number;
    }
    public void setNumber(int number) {
        this.number = number;
    }
    public float getAmountFee() {
        return amountFee;
    }
    public void setAmountFee(float amountFee) {
        this.amountFee = amountFee;
    }
    @Override
    public String toString() {
        return this.orderId + "," + this.userId + "," + this.pdtName + "," + this.price + "," + this.number + ","
                + this.amountFee;
    }
    public int compareTo(OrderBean o) {
        return this.orderId.compareTo(o.getOrderId())==0?Float.compare(o.getAmountFee(), this.getAmountFee()):this.orderId.compareTo(o.getOrderId());
    }
    public void write(DataOutput out) throws IOException {
        out.writeUTF(this.orderId);
        out.writeUTF(this.userId);
        out.writeUTF(this.pdtName);
        out.writeFloat(this.price);
        out.writeInt(this.number);
    }
    public void readFields(DataInput in) throws IOException {
        this.orderId = in.readUTF();
        this.userId = in.readUTF();
        this.pdtName = in.readUTF();
        this.price = in.readFloat();
        this.number = in.readInt();
        this.amountFee = this.price * this.number;
    }
}


Custom partitioner class:
package com.red.mr.order.TeachDuan;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Partitioner;
/**
 * Create by ljh on 2018/11/7
 */
public class OrderIdPartitioner extends Partitioner<OrderBean, NullWritable> {

    @Override
    public int getPartition(OrderBean key, NullWritable nullWritable, int i) {
        return (key.getOrderId().hashCode() & Integer.MAX_VALUE) % i;
    }
}
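The partition formula can be tried in isolation. The sketch below repeats the same arithmetic on the three order IDs from the sample data, without any Hadoop classes; the class and method names are hypothetical. With two reduce tasks, order001 and order003 should land in the same partition and order002 in the other, because their hash codes differ only through the last character.

```java
// Stand-alone illustration of the formula used in OrderIdPartitioner (no Hadoop classes).
class PartitionSketch {

    // Same arithmetic as getPartition: clear the sign bit, then take the remainder.
    static int partitionFor(String orderId, int numReduceTasks) {
        return (orderId.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        for (String id : new String[] {"order001", "order002", "order003"}) {
            System.out.println(id + " -> partition " + partitionFor(id, 2));
        }
    }
}
```

Partitioning by order ID alone guarantees that all records of one order reach the same reducer, which is a precondition for grouping them there.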


Custom grouping comparator class:
package com.red.mr.order.TeachDuan;
import org.apache.hadoop.io.WritableComparator;
/**
 * Create by ljh on 2018/11/7
 */
public class OrderIdGroupingComparator extends WritableComparator {

    public OrderIdGroupingComparator() {
        super(OrderBean.class,true);
    }
    @Override
    public int compare(Object a, Object b) {
        OrderBean o1 = (OrderBean) a;
        OrderBean o2 = (OrderBean) b;
        return o1.getOrderId().compareTo(o2.getOrderId());
    }
}


Main class:
package com.red.mr.order.TeachDuan;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class OrderTopn {
    public static class OrderTopnMapper extends Mapper<LongWritable, Text, OrderBean, NullWritable>{
        OrderBean orderBean = new OrderBean();
        NullWritable v = NullWritable.get();
        @Override
        protected void map(LongWritable key, Text value,
                           Mapper<LongWritable, Text, OrderBean, NullWritable>.Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            orderBean.set(fields[0], fields[1], fields[2], Float.parseFloat(fields[3]), Integer.parseInt(fields[4]));
            context.write(orderBean,v);
        }

    }

    public static class OrderTopnReducer extends Reducer< OrderBean, NullWritable,  OrderBean, NullWritable>{
        /**
         * Although reduce() receives a single key parameter, the contents of key
         * change every time the iterator advances.
         */
        @Override
        protected void reduce(OrderBean key, Iterable<NullWritable> values,
                              Reducer<OrderBean, NullWritable, OrderBean, NullWritable>.Context context)
                throws IOException, InterruptedException {
            int i=0;
            for (NullWritable v : values) {
                context.write(key, v);
                if(++i==3) return;
            }
        }

    }
    public static void main(String[] args) throws Exception {

        Configuration conf = new Configuration(); // by default only core-default.xml and core-site.xml are loaded
        conf.setInt("order.top.n", 2);
        Job job = Job.getInstance(conf);
        job.setJarByClass(OrderTopn.class);
        job.setMapperClass(OrderTopnMapper.class);
        job.setReducerClass(OrderTopnReducer.class);
        job.setPartitionerClass(OrderIdPartitioner.class);
        job.setGroupingComparatorClass(OrderIdGroupingComparator.class);
        job.setNumReduceTasks(2);
        job.setMapOutputKeyClass(OrderBean.class);
        job.setMapOutputValueClass(NullWritable.class);
        job.setOutputKeyClass(OrderBean.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.setInputPaths(job, new Path("d:/mrData/order/input"));
        FileOutputFormat.setOutputPath(job, new Path("d:/mrData/order/output-1"));
        boolean res = job.waitForCompletion(true);
        System.exit(res?0:1);
    }
}



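Problem: when I check the output after the job finishes, I do not get the top N records; instead the program writes out all the records, which is not what I want. I debugged it repeatedly and found that in the reduce phase each distinct key triggers its own call to reduce(). But I overrode the grouping comparator, didn't I? Shouldn't the framework treat records with the same orderId as one key? Did I override it incorrectly? I checked and it looks right to me. Can anyone help?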
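One possible cause worth checking, stated here as an assumption about the Hadoop version in use: in the `WritableComparator` source as commonly shipped, the byte-level `compare` deserializes both keys and then calls `compare(WritableComparable, WritableComparable)`. If that holds, overriding only `compare(Object, Object)` is never reached during grouping, and the key's own `compareTo` (order ID plus amount) decides the groups. That would match the output reported in this thread: only records with equal amounts share a group, so no group ever reaches the cutoff of 3. The sketch below reproduces the dispatch pattern with stand-in classes; `Key`, `BaseComparator`, and the two subclasses are hypothetical and are not Hadoop types.

```java
// Self-contained mimic of the overload dispatch in question; these are NOT Hadoop classes.
class OverloadSketch {

    // Stand-in for the composite key: ordered by order ID, then amount descending.
    static final class Key implements Comparable<Key> {
        final String orderId;
        final float amount;

        Key(String orderId, float amount) {
            this.orderId = orderId;
            this.amount = amount;
        }

        @Override
        public int compareTo(Key o) {
            int c = orderId.compareTo(o.orderId);
            return c != 0 ? c : Float.compare(o.amount, amount);
        }
    }

    // Stand-in for a framework comparator that has two overloads of compare.
    static class BaseComparator {
        // Entry point the "framework" calls. The arguments have static type Key,
        // so the compiler binds this call to compare(Key, Key), not compare(Object, Object).
        final int compareKeys(Key a, Key b) {
            return compare(a, b);
        }

        int compare(Key a, Key b) {
            return a.compareTo(b);
        }

        int compare(Object a, Object b) {
            return compare((Key) a, (Key) b);
        }
    }

    // Overrides only the Object overload: compareKeys never reaches it.
    static class ByObjectOverload extends BaseComparator {
        @Override
        int compare(Object a, Object b) {
            return ((Key) a).orderId.compareTo(((Key) b).orderId);
        }
    }

    // Overrides the typed overload: this is the one compareKeys dispatches to.
    static class ByTypedOverload extends BaseComparator {
        @Override
        int compare(Key a, Key b) {
            return a.orderId.compareTo(b.orderId);
        }
    }

    public static void main(String[] args) {
        Key big = new Key("order001", 3999.8f);
        Key small = new Key("order001", 198.0f);
        // Non-zero: the natural ordering is still used, so the keys look different.
        System.out.println(new ByObjectOverload().compareKeys(big, small));
        // Zero: both keys fall into the same group.
        System.out.println(new ByTypedOverload().compareKeys(big, small));
    }
}
```

If the same pattern applies to the job above, changing the signature in OrderIdGroupingComparator to `compare(WritableComparable a, WritableComparable b)` would be the thing to try. This has not been run against the poster's cluster.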






s060403072 posted on 2018-11-7 19:06:51
Last edited by s060403072 on 2018-11-7 19:10

for (NullWritable v : values) {
                context.write(key, v);
                if(++i==3) return;
            }
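This only limits the output of a single reducer; with multiple reducers it is a different story.
For example, if one reducer outputs 2 records, two reducers output 4.
So your output depends on the number of reducers. As a first step, set the number of reducers to 1.
Also, ==3 looks wrong; it should be ==2.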
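On the cutoff itself: the loop sits inside reduce(), so the limit applies to each key group that reduce() receives. The posted main method stores 2 under "order.top.n", but the reducer hard-codes 3; one option, assuming the standard `Configuration.getInt(name, defaultValue)` call, is to read the value in the reducer via `context.getConfiguration().getInt("order.top.n", 3)`. The sketch below shows only the "take the first n from an iterable" logic in plain Java; `FirstNSketch` and `firstN` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of a parameterized cutoff; not Hadoop code.
class FirstNSketch {

    // Copies at most n elements, stopping as soon as the limit is reached.
    static <T> List<T> firstN(Iterable<T> values, int n) {
        List<T> out = new ArrayList<>();
        if (n <= 0) {
            return out;
        }
        for (T v : values) {
            out.add(v);
            if (out.size() == n) {
                break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> sortedAmounts = Arrays.asList(5, 4, 3, 2, 1);
        System.out.println(firstN(sortedAmounts, 2));
    }
}
```

Because the values arrive already sorted by amount within a group, taking the first n is enough; no further sorting is needed at this point.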

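若余相思28 (original poster) posted on 2018-11-8 23:26:43
Quoting s060403072, posted on 2018-11-7 19:06:
for (NullWritable v : values) {
                context.write(key, v);
                if(++i==3) ...

I set the number of reduce tasks to 1, but the result is still wrong: I still get all the records, which is not what I want. The ==3 was arbitrary; I only wanted to output the first n records. I simply did not use the parameter from the YARN client.

Try running it yourself...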




This is the top-2 result I want:
order001,u001,小米6,3999.8
order001,u001,经典红双喜,800.0
order003,u001,小米6,3999.8
order003,u001,经典红双喜,800.0


order002,u002,小米手环,597.0
order002,u002,肥皂,400.0


------------------------------------------------------------------------------------------------------------------------

But this is what I actually get:
order002,u002,小米手环,199.0,3,597.0
order002,u002,肥皂,10.0,40,400.0
order002,u002,榴莲,15.0,10,150.0
order002,u002,苹果,4.5,20,90.0


order001,u001,小米6,1999.9,2,3999.8
order001,u001,防水电脑包,400.0,2,800.0
order001,u001,经典红双喜,200.0,4,800.0
order001,u001,安慕希,250.0,2,500.0
order001,u001,雀巢咖啡,99.0,2,198.0
order003,u001,小米6,1999.9,2,3999.8
order003,u001,经典红双喜,200.0,4,800.0
order003,u001,防水电脑包,400.0,2,800.0
order003,u001,安慕希,250.0,2,500.0
order003,u001,雀巢咖啡,99.0,2,198.0


These results are with top N set to 2 and 2 reduce tasks. The program above does not produce what I expected!



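s060403072 posted on 2018-11-9 11:58:22
Last edited by s060403072 on 2018-11-9 13:25
Quoting 若余相思28, posted on 2018-11-8 23:26:
I set the number of reduce tasks to 1, but the result is still wrong: I still get all the records, which is not what I want. The ==3 was arbitrary, ...

Setting the number of reducers to 1 is straightforward. As for other values, the actual count may depend on several factors; it is not necessarily whatever you set. Check how many reducers actually ran.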

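若余相思28 (original poster) posted on 2018-11-9 13:21:17
Quoting s060403072, posted on 2018-11-9 11:58:
Setting the number of reducers to 1 is straightforward. As for other values, the actual count may depend on several factors; it is not necessarily whatever you set. Check how many ...

What result do you get when you run it?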