Pratik Jawanpuria
2020
Geometry-aware domain adaptation for unsupervised alignment of word embeddings
Pratik Jawanpuria | Mayank Meghwanshi | Bamdev Mishra
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
We propose a novel manifold-based geometric approach for learning an unsupervised alignment of word embeddings between source and target languages. Our approach formulates alignment learning as a domain adaptation problem over the manifold of doubly stochastic matrices. This viewpoint arises from the aim of aligning the second-order information of the two language spaces. The rich geometry of the doubly stochastic manifold allows us to employ an efficient Riemannian conjugate gradient algorithm for the proposed formulation. Empirically, the proposed approach outperforms a state-of-the-art optimal-transport-based approach on the bilingual lexicon induction task across several language pairs. The performance improvement is more significant for distant language pairs.
Learning Geometric Word Meta-Embeddings
Pratik Jawanpuria | Satya Dev N T V | Anoop Kunchukuttan | Bamdev Mishra
Proceedings of the 5th Workshop on Representation Learning for NLP
We propose a geometric framework for learning meta-embeddings of words from different embedding sources. Our framework transforms the embeddings into a common latent space in which simple operations, such as averaging or concatenating the different embeddings of a given word, are more appropriate. The proposed latent space arises from two geometric transformations: source-embedding-specific orthogonal rotations and a common Mahalanobis metric scaling. Empirical results on several word similarity and word analogy benchmarks illustrate the efficacy of the proposed framework.