Only the graded parts were completed.
Operations on word vectors
Welcome to your first assignment of this week!
Because word embeddings are very computationally expensive to train, most ML practitioners load a pre-trained set of embeddings.
After this assignment you will be able to:
- Load pre-trained word vectors, and measure similarity using cosine similarity
- Use word embeddings to solve word analogy problems such as Man is to Woman as King is to __.
- Modify word embeddings to reduce their gender bias
Let's get started! Run the following cell to load the packages you will need.
In [ ]:
import numpy as np
from w2v_utils import *
Next, let's load the word vectors. For this assignment, we will use 50-dimensional GloVe vectors to represent words. Run the following cell to load word_to_vec_map.
In [ ]:
words, word_to_vec_map = read_glove_vecs('data/glove.6B.50d.txt')
You've loaded:
- words: set of words in the vocabulary.
- word_to_vec_map: dictionary mapping words to their GloVe vector representation.
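To make the shape of these two objects concrete, here is a minimal sketch of how a GloVe-style text file could be parsed into them. It assumes each line holds a word followed by space-separated floats. The function name load_glove_text and the tiny in-memory sample are hypothetical, and this is not necessarily how read_glove_vecs in w2v_utils is implemented.

```python
import io
import numpy as np

def load_glove_text(lines):
    """Parse GloVe-style text lines into (words, word_to_vec_map).

    Hypothetical helper for illustration only; assumes each line is
    a word followed by space-separated float components.
    """
    words = set()
    word_to_vec_map = {}
    for line in lines:
        parts = line.strip().split()
        if not parts:
            continue  # skip blank lines
        word = parts[0]
        words.add(word)
        # Convert the remaining string tokens into a float vector.
        word_to_vec_map[word] = np.array(parts[1:], dtype=np.float64)
    return words, word_to_vec_map

# Tiny in-memory stand-in for a GloVe file (made-up 3-dimensional vectors).
sample = io.StringIO("king 0.1 0.2 0.3\nqueen 0.2 0.1 0.4\n")
demo_words, demo_map = load_glove_text(sample)
```

With the real 50-dimensional file, each value in the dictionary would be a vector of shape (50,) rather than (3,).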
You've seen that one-hot vectors do not do a good job of capturing which words are similar. GloVe vectors provide much more useful information about the meaning of individual words. Let's now see how you can use GloVe vectors to decide how similar two words are.
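The one-hot limitation can be illustrated with a small sketch. The three-word vocabulary below is made up for the example and is not taken from the GloVe data. Every pair of distinct one-hot vectors has a dot product of 0, so "father" looks exactly as unrelated to "mother" as it does to "ball".

```python
import numpy as np

vocab = ["father", "mother", "ball"]  # made-up toy vocabulary
one_hot = np.eye(len(vocab))          # row i is the one-hot vector of vocab[i]

# Dot products between all pairs of one-hot vectors:
# 1 on the diagonal (a word with itself), 0 for every distinct pair.
pairwise = one_hot @ one_hot.T
```

Dense embeddings such as GloVe avoid this because related words can point in similar directions.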
1 - Cosine similarity
To measure how similar two words are, we need a way to measure the degree of similarity between the embedding vectors of the two words. Given two vectors $u$ and $v$, cosine similarity is defined as follows:
$$\text{CosineSimilarity}(u, v) = \frac{u \cdot v}{\|u\|_2 \, \|v\|_2} = \cos(\theta) \tag{1}$$
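where $u \cdot v$ is the dot product (inner product) of the two vectors, $\|u\|_2$ is the L2 norm (length) of $u$, and $\theta$ is the angle between $u$ and $v$.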
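Equation (1) can be sketched in NumPy as follows. This is one straightforward way to compute it, not necessarily the assignment's reference solution. The name cosine_similarity_sketch and the toy vectors are illustrative, and the function assumes neither input is the zero vector.

```python
import numpy as np

def cosine_similarity_sketch(u, v):
    """Cosine of the angle between u and v, following equation (1).

    Assumes u and v are 1-D arrays of equal length and that neither
    is the zero vector (otherwise the denominator would be 0).
    """
    dot = np.dot(u, v)            # u . v
    norm_u = np.linalg.norm(u)    # ||u||_2
    norm_v = np.linalg.norm(v)    # ||v||_2
    return dot / (norm_u * norm_v)

# Toy 2-D vectors chosen so the angles are easy to see.
a = np.array([1.0, 0.0])
b = np.array([0.0, 2.0])   # perpendicular to a
c = np.array([3.0, 0.0])   # same direction as a, different length
```

Here a and c point the same way, so their similarity is 1 regardless of length; a and b are perpendicular, giving 0; and a and -c point in opposite directions, giving -1. With the loaded embeddings, the same function could be called on word_to_vec_map entries for any two words in the vocabulary.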