How do you compute cosine similarity? A Python example

1 Using TensorFlow / Keras

Keras includes cosine similarity among its loss functions.

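import tensorflow as tf
import math  # used by the raw Python version in section 2

# Two identical vectors, so their conventional cosine similarity is 1
vector1 = tf.constant([1, 2, 3], dtype=tf.float32)
vector2 = tf.constant([1, 2, 3], dtype=tf.float32)

# Compute the Keras cosine similarity loss (assumed here to be the
# functional form, tf.keras.losses.cosine_similarity)
similarity = tf.keras.losses.cosine_similarity(vector1, vector2)

print(f'similarity type: {type(similarity)}')
print(f'similarity: {similarity}')
print(f'similarity = {similarity.numpy()}')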

Output

similarity type: <class 'tensorflow.python.framework.ops.EagerTensor'>
similarity: -0.9999998807907104
similarity = -0.9999998807907104

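Note that the value Keras returns is negative. This is because Keras treats it as a loss function: the more similar two vectors are, the closer their cosine similarity is to 1, and the smaller the loss should be. Keras therefore puts a minus sign in front of the conventional cosine similarity, so that minimizing the loss maximizes the similarity.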
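The sign convention can be illustrated without TensorFlow. The following is a minimal sketch, not Keras's actual source: it assumes the loss amounts to scaling both vectors to unit length and negating their dot product. The name cosine_similarity_loss is a hypothetical helper introduced only for this illustration.

```python
import math

def cosine_similarity_loss(y_true, y_pred):
    """Hypothetical helper: negated cosine similarity of two equal-length vectors."""
    # Scale each vector to unit length (assumes neither vector is all zeros).
    norm_true = math.sqrt(sum(v * v for v in y_true))
    norm_pred = math.sqrt(sum(v * v for v in y_pred))
    unit_true = [v / norm_true for v in y_true]
    unit_pred = [v / norm_pred for v in y_pred]
    # The dot product of unit vectors is the cosine similarity; negate it for the loss.
    return -sum(a * b for a, b in zip(unit_true, unit_pred))

loss_same = cosine_similarity_loss([1, 2, 3], [1, 2, 3])          # close to -1
loss_opposite = cosine_similarity_loss([1, 2, 3], [-1, -2, -3])   # close to +1
print(loss_same, loss_opposite)
print(-loss_same)  # flipping the sign recovers the conventional similarity
```

Under this convention, vectors pointing the same way give a loss near -1 and vectors pointing in opposite directions give a loss near +1, so negating the returned value yields the conventional similarity.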

2 Using only raw Python

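Writing the function in raw Python is not hard either. The code below relies only on the math module imported in the first listing: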

def cosine_similarity(x, y):
    dot_product = sum(i*j for i, j in zip(x, y))
    norm_x = math.sqrt(sum(i*i for i in x))
    norm_y = math.sqrt(sum(i*i for i in y))
    return dot_product / (norm_x * norm_y)

similarity2 = cosine_similarity([1,2,3], [4, 5, 6])

print(f'similarity2 = {similarity2}')

Output

similarity2 = 0.9746318461970762

This version returns the conventional cosine similarity, so the negative sign no longer appears.
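The raw Python function has two edge cases worth knowing about: it raises ZeroDivisionError when either vector is all zeros, and zip silently truncates when the two vectors differ in length. The sketch below shows one possible way to guard against both. The name safe_cosine_similarity and the choice to raise ValueError are assumptions made for this illustration, not part of the original code.

```python
import math

def safe_cosine_similarity(x, y):
    """Hypothetical variant of cosine_similarity that validates its inputs."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot_product = sum(i * j for i, j in zip(x, y))
    norm_x = math.sqrt(sum(i * i for i in x))
    norm_y = math.sqrt(sum(i * i for i in y))
    # Cosine similarity is undefined when a vector has zero length.
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product / (norm_x * norm_y)

print(safe_cosine_similarity([1, 2, 3], [4, 5, 6]))
print(safe_cosine_similarity([1, 0], [0, 1]))  # orthogonal vectors
```

Raising an exception is only one design choice; returning 0.0 for a zero vector is another common convention, depending on what the caller expects.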
