How to Compute Cosine Similarity: Python Examples

1 Using TensorFlow / Keras

Keras includes cosine similarity among its loss functions.

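import tensorflow as tf
import math  # used by the plain-Python version in section 2

vector1 = tf.constant([1, 2, 3], dtype=tf.float32)
vector2 = tf.constant([1, 2, 3], dtype=tf.float32)  # same as vector1, so the true cosine similarity is 1

# Keras treats cosine similarity as a loss, so this returns the negated value
similarity = tf.keras.losses.cosine_similarity(vector1, vector2)

print(f'similarity type: {type(similarity)}')
print(f'similarity: {similarity}')
print(f'similarity = {similarity.numpy()}')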

Output

similarity type: <class 'tensorflow.python.framework.ops.EagerTensor'>
similarity: -0.9999998807907104
similarity = -0.9999998807907104

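Note that the result Keras computes is negative. This is because Keras exposes cosine similarity as a loss function. The more similar two vectors are, the closer their cosine similarity is to 1, and the smaller the loss should be. Keras therefore puts a minus sign in front of the conventional cosine similarity, so two vectors pointing in the same direction give a value close to -1 instead of 1.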
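To make the sign convention concrete, here is a short worked derivation. The symbol \mathcal{L}_{\text{Keras}} is just a label chosen here for the value returned by the Keras loss; it is not notation from the Keras documentation.

```latex
\[
\cos(\mathbf{x}, \mathbf{y})
  = \frac{\mathbf{x} \cdot \mathbf{y}}{\lVert \mathbf{x} \rVert \, \lVert \mathbf{y} \rVert},
\qquad
\mathcal{L}_{\text{Keras}}(\mathbf{x}, \mathbf{y}) = -\cos(\mathbf{x}, \mathbf{y})
\]
\[
\mathbf{x} = \mathbf{y} = (1, 2, 3):
\quad
\cos(\mathbf{x}, \mathbf{y}) = \frac{1 + 4 + 9}{\sqrt{14}\,\sqrt{14}} = \frac{14}{14} = 1,
\qquad
\mathcal{L}_{\text{Keras}}(\mathbf{x}, \mathbf{y}) = -1
\]
```

The printed value -0.9999998807907104 is not exactly -1, most likely because of float32 rounding in the computation.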

2 Using Plain Python Only

Writing the function in plain Python is not hard either. The code is as follows:

import math

def cosine_similarity(x, y):
    # Dot product of the two vectors
    dot_product = sum(i*j for i, j in zip(x, y))
    # Euclidean (L2) norm of each vector
    norm_x = math.sqrt(sum(i*i for i in x))
    norm_y = math.sqrt(sum(i*i for i in y))
    return dot_product / (norm_x * norm_y)

similarity2 = cosine_similarity([1, 2, 3], [4, 5, 6])

print(f'similarity2 = {similarity2}')

Output

similarity2 = 0.9746318461970762

This version returns the conventional cosine similarity, so there is no negative sign to deal with.
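The plain-Python function divides by the product of the two norms, so it raises ZeroDivisionError if either vector is all zeros, and zip() silently ignores extra elements when the lengths differ. The following is a minimal sketch of a more defensive variant. The name safe_cosine_similarity is hypothetical, and returning 0.0 for a zero vector is an assumption made here, not something the original code specifies.

```python
import math

def safe_cosine_similarity(x, y):
    """Cosine similarity with basic input checks (hypothetical helper)."""
    if len(x) != len(y):
        # zip() would silently truncate the longer vector, so fail loudly instead.
        raise ValueError("vectors must have the same length")
    dot_product = sum(i * j for i, j in zip(x, y))
    norm_x = math.sqrt(sum(i * i for i in x))
    norm_y = math.sqrt(sum(i * i for i in y))
    if norm_x == 0 or norm_y == 0:
        # Assumption: define the similarity with a zero vector as 0.0.
        return 0.0
    return dot_product / (norm_x * norm_y)

print(safe_cosine_similarity([1, 2, 3], [4, 5, 6]))  # about 0.9746, as before
print(safe_cosine_similarity([0, 0, 0], [4, 5, 6]))  # 0.0 by the convention chosen here
```

Whether a zero vector should yield 0.0, raise an error, or return NaN depends on the application, so treat that branch as a placeholder to adapt.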
