1 Using TensorFlow / Keras
Keras provides cosine similarity as a loss function.
import tensorflow as tf

vector1 = tf.constant([1, 2, 3], dtype=tf.float32)
vector2 = tf.constant([1, 2, 3], dtype=tf.float32)

# Keras implements cosine similarity as a loss, so the sign is flipped
similarity = tf.keras.losses.cosine_similarity(vector1, vector2)
print(f'similarity type: {type(similarity)}')
print(f'similarity: {similarity}')
print(f'similarity = {similarity.numpy()}')  # Output: -0.99999988
Output:
similarity type: <class 'tensorflow.python.framework.ops.EagerTensor'>
similarity: -0.9999998807907104
similarity = -0.9999998807907104
Note that the result Keras computes is negative. This is because it is meant to be a loss function: the more similar two vectors are, the closer their cosine similarity is to 1, and the smaller the loss should be. Keras therefore attaches a minus sign to the conventional cosine similarity.
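Negating the loss value recovers the conventional similarity. The sign convention can be sketched in pure Python (keras_style_cosine_loss is a hypothetical helper mirroring the behavior described above, not a Keras API):

```python
import math

# Hypothetical helper mimicking Keras' sign convention:
# it returns the NEGATED cosine similarity, as a loss would.
def keras_style_cosine_loss(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(a * a for a in y))
    return -dot / (norm_x * norm_y)

loss = keras_style_cosine_loss([1, 2, 3], [1, 2, 3])
print(loss)   # close to -1.0: identical vectors give the smallest loss
print(-loss)  # negate to recover the usual cosine similarity, close to 1.0
```

The same negation applies to the tensor returned by tf.keras.losses.cosine_similarity.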
2 Using Raw Python Only
Writing the function in raw Python is not hard either; the code is as follows:
import math

def cosine_similarity(x, y):
    dot_product = sum(i*j for i, j in zip(x, y))
    norm_x = math.sqrt(sum(i*i for i in x))
    norm_y = math.sqrt(sum(i*i for i in y))
    return dot_product / (norm_x * norm_y)

similarity2 = cosine_similarity([1, 2, 3], [4, 5, 6])
print(f'similarity2 = {similarity2}')
Output:
similarity2 = 0.9746318461970762
This way there is no negative-sign issue.
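As a quick sanity check of this definition, the function should return 1 for identical vectors, 0 for orthogonal ones, and -1 for opposite ones (a self-contained sketch repeating the function above for runnability):

```python
import math

def cosine_similarity(x, y):
    dot_product = sum(i*j for i, j in zip(x, y))
    norm_x = math.sqrt(sum(i*i for i in x))
    norm_y = math.sqrt(sum(i*i for i in y))
    return dot_product / (norm_x * norm_y)

# Identical vectors -> ~1, orthogonal -> 0, opposite -> ~-1
print(cosine_similarity([1, 2, 3], [1, 2, 3]))  # close to 1.0
print(cosine_similarity([1, 0], [0, 1]))        # 0.0
print(cosine_similarity([1, 2], [-1, -2]))      # close to -1.0
```

These boundary values make cosine similarity convenient as a bounded measure of vector direction, independent of magnitude.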