Graph pattern matching, which aims to discover structural patterns in graphs, is considered one of the most fundamental graph mining problems in many real applications. Despite previous efforts, existing systems face two main challenges. First, inherent symmetry existing in patterns can introduce a large amount of redundant computation. Second, different matching orders for a pattern have significant performance differences and are quite hard to predict. When these factors are mixed, this problem becomes extremely complicated. High efficient pattern matching remains an open problem currently. To address these challenges, we propose GraphPi, a high performance distributed pattern matching system. GraphPi utilizes a new algorithm based on 2-cycles in group theory to generate multiple sets of asymmetric restrictions, where each set can eliminate redundant computation completely. We further design an accurate performance model to determine the optimal matching order and asymmetric restriction set for efficient pattern matching. We evaluate GraphPi on Tianhe-2A supercomputer. Results show that GraphPi outperforms the state-ofthe-art system, by up to 105X for 6 real-world graph datasets on a single node. We also scale GraphPi to 1,024 computing nodes (24,576 cores).