Abstract: To improve the efficiency of data mining, a merge calculation algorithm based on a unit hierarchical tree is proposed. Taking equipment maintenance support data as an example, a bottom-up, layer-by-layer merge calculation model starting from the leaf nodes is established, and the MapReduce parallel computing model is improved. The Hadoop Map/Reduce distributed processing architecture is deployed in fully distributed mode to perform attribute reduction and data merging over the unit hierarchical tree, and the speedup and scalability of the MapReduce distributed model are analyzed. The results show that the method performs well and has some theoretical value.
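
To make the bottom-up, layer-by-layer merge concrete, the following is a minimal single-process sketch in Python, not the Hadoop implementation described in the paper. It assumes the unit hierarchy is given as a child-to-parent mapping and that each leaf unit carries one numeric value to be merged by summation. The unit names, the sample values, and the helpers `depth` and `merge_bottom_up` are hypothetical and chosen only for illustration. Each layer is processed as a map step that emits `(parent, value)` pairs, followed by a reduce step that sums the values per parent.

```python
from collections import defaultdict


def depth(unit, parent):
    """Number of edges from `unit` up to the root (the root has depth 0)."""
    d = 0
    while parent.get(unit) is not None:
        unit = parent[unit]
        d += 1
    return d


def merge_bottom_up(parent, leaf_values):
    """Roll leaf values up the unit tree one layer at a time.

    parent:      dict mapping every unit to its parent (root maps to None)
    leaf_values: dict mapping leaf units to a numeric value
    """
    totals = defaultdict(int, leaf_values)

    # Group units by depth so that the deepest layer is merged first.
    by_depth = defaultdict(list)
    for unit in parent:
        by_depth[depth(unit, parent)].append(unit)

    for d in sorted(by_depth, reverse=True):
        if d == 0:
            continue  # the root has no parent to merge into
        # Map step: each unit in this layer emits (parent, its current total).
        emitted = [(parent[u], totals[u]) for u in by_depth[d]]
        # Reduce step: sum the emitted values per parent key.
        for key, value in emitted:
            totals[key] += value

    return dict(totals)


# Hypothetical three-level unit hierarchy with made-up leaf values.
parent = {
    "brigade": None,
    "battalion_1": "brigade",
    "battalion_2": "brigade",
    "company_a": "battalion_1",
    "company_b": "battalion_1",
    "company_c": "battalion_2",
}
leaf_values = {"company_a": 3, "company_b": 5, "company_c": 2}

result = merge_bottom_up(parent, leaf_values)
```

In this sketch each layer depends only on the totals of the layer below it, so the units within one layer could be processed independently. That independence is what a MapReduce job per layer would exploit, although the paper's actual job design may differ.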