1 Using TensorFlow / Keras
Keras ships cosine similarity as one of its built-in loss functions.
import tensorflow as tf

vector1 = tf.constant([1, 2, 3], dtype=tf.float32)
vector2 = tf.constant([1, 2, 3], dtype=tf.float32)

# Keras computes cosine similarity as a loss, so the sign is flipped
similarity = tf.keras.losses.cosine_similarity(vector1, vector2)

print(f'similarity type: {type(similarity)}')
print(f'similarity: {similarity}')           # Output: -0.9999998807907104
print(f'similarity = {similarity.numpy()}')  # Output: -0.9999998807907104
Output:
similarity type: <class 'tensorflow.python.framework.ops.EagerTensor'>
similarity: -0.9999998807907104
similarity = -0.9999998807907104
Note that the value Keras returns is negative. This is because it is meant to be used as a loss function: the more similar two vectors are, the closer their cosine similarity gets to 1, yet a loss should shrink as the vectors become more similar, so Keras negates the conventional cosine similarity.
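If you need the conventional positive value, one option (a minimal sketch, reusing the same vectors as above) is to simply flip the sign of what Keras returns:

import tensorflow as tf

vector1 = tf.constant([1, 2, 3], dtype=tf.float32)
vector2 = tf.constant([1, 2, 3], dtype=tf.float32)

# Negating the Keras loss recovers the usual cosine similarity
loss = tf.keras.losses.cosine_similarity(vector1, vector2)
similarity = -loss.numpy()
print(f'similarity = {similarity}')  # roughly 1.0 for identical vectors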
2 Using Raw Python Only
Writing the function in raw Python is not hard either; the code is as follows:
import math

def cosine_similarity(x, y):
    # dot product of x and y
    dot_product = sum(i * j for i, j in zip(x, y))
    # L2 norms of x and y
    norm_x = math.sqrt(sum(i * i for i in x))
    norm_y = math.sqrt(sum(i * i for i in y))
    return dot_product / (norm_x * norm_y)

similarity2 = cosine_similarity([1, 2, 3], [4, 5, 6])
print(f'similarity2 = {similarity2}')
Output:
similarity2 = 0.9746318461970762
This way there is no negative-sign issue.
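As a quick sanity check (a sketch that assumes the cosine_similarity function defined above is in scope), the raw-Python result should agree with the negated Keras loss:

import tensorflow as tf

# Both values should be roughly 0.9746318 for [1, 2, 3] and [4, 5, 6]
py_sim = cosine_similarity([1, 2, 3], [4, 5, 6])
keras_sim = -tf.keras.losses.cosine_similarity(
    tf.constant([1, 2, 3], dtype=tf.float32),
    tf.constant([4, 5, 6], dtype=tf.float32),
).numpy()
print(f'raw python: {py_sim:.7f}, keras (negated): {keras_sim:.7f}')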