XR-Transformer: Fast Multi-Resolution Transformer Fine-tuning for XMC
Extreme multi-label text classification (XMC) seeks to find relevant labels from an extremely large label collection for a given text input. Many real-world applications can be formulated as XMC problems, such as recommendation systems, document tagging, and semantic search. Recently, transformer-based XMC methods such as X-Transformer and LightXML have shown significant improvement over other XMC methods. Despite leveraging pre-trained transformer models for text representation, fine-tuning transformer models on a large label space is still computationally expensive, even with powerful GPUs. XR-Transformer is a novel recursive approach that accelerates this procedure by recursively fine-tuning transformer models on a series of multi-resolution objectives related to the original XMC objective function. Empirical results show that XR-Transformer takes significantly less training time than other transformer-based XMC models while achieving new state-of-the-art results. In particular, on the public Amazon-3M dataset with 3 million labels, XR-Transformer is not only 20x faster than X-Transformer but also improves Precision@1 from 51% to 54%.
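To make the multi-resolution idea concrete, below is a minimal sketch in NumPy only. It is not the authors' implementation: the real method fine-tunes a transformer encoder and uses a carefully built hierarchical label tree, whereas here a simple linear scorer stands in for the encoder and the helper names (build_label_tree, coarsen) are illustrative. The sketch only shows the training loop structure: cluster the labels into a coarse-to-fine hierarchy, then train on progressively finer label resolutions.

```python
# Hedged sketch of multi-resolution training over a label hierarchy.
# Assumptions: a linear scorer replaces the transformer encoder, and the
# label clustering is a crude nearest-centroid split, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)

def build_label_tree(label_feats, branch=8, depth=3):
    """Recursively split labels so level d has up to branch**(d+1) clusters.
    Returns one label->cluster assignment array per level, coarse to fine."""
    n_labels = label_feats.shape[0]
    assignments = []
    parent = np.zeros(n_labels, dtype=int)            # start: one cluster
    for _ in range(depth):
        child = np.empty(n_labels, dtype=int)
        next_id = 0
        for c in np.unique(parent):
            idx = np.where(parent == c)[0]
            k = min(branch, len(idx))
            # crude split: assign each label to its nearest random centroid
            centroids = label_feats[rng.choice(idx, k, replace=False)]
            d = ((label_feats[idx, None, :] - centroids[None]) ** 2).sum(-1)
            child[idx] = next_id + d.argmin(axis=1)
            next_id += k
        assignments.append(child.copy())
        parent = child
    return assignments

def coarsen(Y, label_to_cluster):
    """Aggregate an instance-by-label matrix into an instance-by-cluster one."""
    Yc = np.zeros((Y.shape[0], label_to_cluster.max() + 1))
    for j, c in enumerate(label_to_cluster):
        Yc[:, c] = np.maximum(Yc[:, c], Y[:, j])
    return Yc

# Toy data: 200 documents, 64-dim features, 1000 labels.
X = rng.normal(size=(200, 64))
Y = (rng.random((200, 1000)) < 0.01).astype(float)
label_feats = rng.normal(size=(1000, 16))             # e.g. label text embeddings

levels = build_label_tree(label_feats)

for level, label_to_cluster in enumerate(levels):
    Yc = coarsen(Y, label_to_cluster)
    # "Fine-tune" on the coarser objective; in XR-Transformer the transformer
    # encoder is fine-tuned at each level and carried over to the next,
    # finer-resolution objective.
    W = np.linalg.lstsq(X, Yc, rcond=None)[0]
    loss = ((X @ W - Yc) ** 2).mean()
    print(f"level {level}: {Yc.shape[1]} clusters, mse={loss:.4f}")
```

The key point the sketch illustrates is that each level's objective is a coarsened version of the original XMC objective, so the model is bootstrapped on cheap, low-resolution targets before facing the full label space.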
In this video, I will talk about the following: What is the broad idea behind XR-Transformer? How is XR-Transformer trained? How does XR-Transformer perform?
For more details, please see:
Zhang, Jiong, Wei-Cheng Chang, Hsiang-Fu Yu, and Inderjit Dhillon. “Fast multi-resolution transformer fine-tuning for extreme multi-label text classification.” Advances in Neural Information Processing Systems 34 (2021): 7267-7280.