BERT is a language model that captures dependencies between words through the Transformer's self-attention mechanism, and it is believed to encode latent syntactic information inside the model. However, few studies exploit this information for syntactic structure analysis. Tree-Transformer, an unsupervised syntactic parser, has been proposed to analyze the syntactic structure of input sentences using the self-attention mechanism. Meanwhile, research on syntactic parsing in natural language processing has produced a large amount of annotated parse data. Therefore, in addition to the unsupervised training of the Tree-Transformer, we propose a hierarchical error back-propagation method that exploits the loss between the output of each Transformer layer and the teacher parse data, and we develop a supervised learning method for syntactic structure analysis based on the Transformer.
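The core idea of the proposed hierarchical error back-propagation can be sketched as summing a supervised loss computed at every Transformer layer against the same teacher parse labels. The sketch below is a minimal illustration under that assumption; the function names, the use of cross-entropy, and the data layout are all hypothetical and not taken from the authors' implementation.

```python
import math

def layer_loss(predicted_probs, gold_labels):
    """Hypothetical per-layer supervised loss: cross-entropy between one
    layer's predicted label distributions and the teacher parse labels."""
    return -sum(math.log(p[y]) for p, y in zip(predicted_probs, gold_labels))

def hierarchical_loss(per_layer_probs, gold_labels):
    """Sum the supervised loss over the output of every Transformer layer,
    so gradients flow back into each layer directly (the hierarchical
    back-propagation idea described in the abstract, as we interpret it)."""
    return sum(layer_loss(probs, gold_labels)
               for probs in per_layer_probs)
```

In a real training loop this hierarchical term would be added to the Tree-Transformer's unsupervised objective, so that shallow layers receive a direct error signal from the parse annotations rather than only an indirect one through the final layer.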