Graph classification is important for different scientific applications; it can be exploited in various problems related to bioinformatics and cheminformatics. Given their graphs, there is increasing need for classifying small molecules to predict their properties such as activity, toxicity or mutagenicity. Using subtrees as feature set for graph classification in kernel methods has been shown to perform well in classifying small molecules. It is also well-known that feature selection can improve the performance of classifiers. However, most of the graph kernels are not selective in choosing which subtrees to include in the set of features. Instead, they use all subtrees of a certain property as their feature set. We argue that not all the latter features are needed for effective classification. In this paper, we investigate the effect of selecting subset of the subtrees as features for graph kernels, i.e., we try to identify and keep useful features; all the remaining subtrees are eliminated. A masking procedure, which boils down to feature selection, is proposed for classifying graphs. We conducted experiments on several molecule classification datasets; the results demonstrate the applicability and effectiveness of the proposed feature selection process.