What is faiss?

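Many people do not know how face recognition actually works. Given a gallery of 1,000 enrolled faces, working out who each face detected in the current video frame belongs to is a hard task. Why?
From the algorithm's point of view, you have to loop over every detected face and compute its distance to each of the 1,000 gallery faces, then take the smallest distance. If that smallest distance is below some threshold, we consider the person identified. Now imagine a frame with 5 people in it: every single frame costs 5,000 distance computations. That is a lot of computation.
The faiss library exists to solve exactly this problem.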
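
To make that cost concrete, here is a minimal brute-force sketch of the matching loop described above, in plain C++ without faiss. The helper names `squared_l2` and `identify`, and the convention of returning -1 for "nobody recognized", are my own illustration and not part of any library; the threshold value is something you would have to tune for your own features.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Squared Euclidean distance between two vectors of equal length.
float squared_l2(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Hypothetical helper: returns the index of the closest gallery vector,
// or -1 when even the best match is farther away than `threshold`.
int identify(const std::vector<std::vector<float>>& gallery,
             const std::vector<float>& face, float threshold) {
    int best = -1;
    float best_dist = std::numeric_limits<float>::max();
    for (std::size_t i = 0; i < gallery.size(); ++i) {
        // One distance computation per enrolled face.
        const float d = squared_l2(gallery[i], face);
        if (d < best_dist) {
            best_dist = d;
            best = static_cast<int>(i);
        }
    }
    return (best_dist <= threshold) ? best : -1;
}
```

Calling `identify` once per detected face gives the 5 x 1,000 = 5,000 distance computations per frame mentioned above.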

What problem does it solve?

Let us restate the problem above in a more general form:

  • I have 10,000 data points, each 128-dimensional (think of them as vectors);
  • I have 3 vectors, also 128-dimensional.

So here is the problem. What I want to do is: for each of my 3 vectors, find the most similar vectors among those 10,000. At its core this is a similarity search.
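
As a baseline, the similarity search just described can be sketched as an exhaustive top-k scan. This is only an illustration in plain C++: the function name `knn_brute_force` and its return type are assumptions of mine, and squared Euclidean distance is used as the similarity measure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Brute-force top-k search over row-major data.
// xb holds nb database vectors and xq holds nq query vectors, each of length dim.
// For every query, returns the k closest (squared distance, row index) pairs,
// nearest first.
std::vector<std::vector<std::pair<float, long>>> knn_brute_force(
    const std::vector<float>& xb, std::size_t nb,
    const std::vector<float>& xq, std::size_t nq,
    std::size_t dim, std::size_t k) {
    assert(xb.size() == nb * dim && xq.size() == nq * dim && k <= nb);
    std::vector<std::vector<std::pair<float, long>>> result(nq);
    for (std::size_t q = 0; q < nq; ++q) {
        std::vector<std::pair<float, long>> scored(nb);
        for (std::size_t i = 0; i < nb; ++i) {
            float sum = 0.0f;
            for (std::size_t j = 0; j < dim; ++j) {
                const float diff = xb[i * dim + j] - xq[q * dim + j];
                sum += diff * diff;
            }
            scored[i] = {sum, static_cast<long>(i)};
        }
        // Only the first k entries need to end up in sorted order.
        std::partial_sort(scored.begin(),
                          scored.begin() + static_cast<std::ptrdiff_t>(k),
                          scored.end());
        scored.resize(k);
        result[q] = scored;
    }
    return result;
}
```

With 10,000 database vectors and 3 queries this performs 30,000 full distance computations per call, which is the work a library like faiss is built to do efficiently.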

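faiss handles this problem very well. Next time you run into a face-matching problem, never say you know nothing about it. Remember that this site is here to give you that moment of sudden clarity: plenty of things you will not learn at school or at work, unorthodox tricks included, can be found here.
One last remark: many companies' face-recognition attendance machines, bank risk-control systems, and some closely guarded homegrown algorithms are said to be built on faiss. Once you have learned it, you might even be able to pass yourself off in an interview as having worked on some large-scale intelligent search project.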

Getting started

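I will not cover how to build faiss here. You can use either the GPU or the CPU version; the CPU version is already very fast and handles everyday face matching without trouble.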

#include "faiss/IndexFlat.h"
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <string>

using namespace std;

int main()
{

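    // Build a database of 10000 vectors and 3 query vectors.
    // This demo uses 64 dimensions; the 128-dimensional case from the text works the same way.
    int n_dim = 64;
    float *xb = new float[10000 * n_dim];
    // the 3 query vectors whose nearest neighbours we want to find
    float *xq = new float[3 * n_dim];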

    for (int i = 0; i < 10000; i++)
    {
        for (int j = 0; j < n_dim; j++)
        {
            xb[n_dim * i + j] = drand48();
        }
        xb[n_dim * i] += i / 1000.;
    }

    // Log the first 4 database vectors
    for (int i = 0; i < 4; i++)
    {
        for (int j = 0; j < n_dim; j++)
        {
            cout << xb[n_dim * i + j] << " ";
        }
        cout << endl;
    }

    for (int i = 0; i < 3; i++)
    {
        for (int j = 0; j < n_dim; j++)
        {
            xq[n_dim * i + j] = drand48();
        }
        xq[n_dim * i] += i / 1000.;
    }

    cout << "\n\n to query data: \n";
    for (int i = 0; i < 3; i++)
    {
        for (int j = 0; j < n_dim; j++)
        {
            cout << xq[n_dim * i + j] << " ";
        }
        cout << endl;
    }
    cout << "data init done.\n";

    // create a flat (exhaustive search) L2 index over n_dim-dimensional vectors
    faiss::IndexFlatL2 index(n_dim);
    index.add(10000, xb);

    cout << "ntotal: " << index.ntotal << endl;

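    const int k = 5;
    // I holds the row indices of the k nearest neighbours for each of the 3 queries.
    // faiss labels are 64-bit integers (idx_t); long matches that on 64-bit Linux.
    long *I = new long[3 * k];
    // D holds the matching squared L2 distances
    float *D = new float[3 * k];
    // for each of the 3 query vectors in xq, find its k closest database vectors
    index.search(3, xq, k, D, I);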

    // log out result
    cout << "I: \n";
    for (int i = 0; i < 3; i++)
    {
        for (int j = 0; j < k; j++)
        {
            // cout << I[i * k + j] << " ";
            printf("%5ld ", I[i * k + j]);
        }
        cout << endl;
    }

    cout << "Data: \n";
    for (int i = 0; i < 3; i++)
    {
        cout << "to query: " << to_string(i) << ": \n";
        for (int j = 0; j < k; j++)
        {
            // print the matched database row and its squared L2 distance
            cout << "find at " << I[i*k+j] << " row\n";
            for(int r = 0; r < n_dim; r++)
            {
                cout << xb[I[i*k+j]*n_dim + r] << " ";
            }
            cout << "\ndistance: " << D[i*k+j] << endl;
            cout << endl;
        }
        cout << endl;
    }

    // release the buffers allocated with new[]
    delete[] I;
    delete[] D;
    delete[] xb;
    delete[] xq;
    return 0;
}

That is the whole program.

When you run it, the output looks like this:

I: 
  225    65   861   419    55 
  273   798   843   248   442 
  218   234  1036   666   462 
Data: 
to query: 0: 
find at 225 row
0.466021 0.157434 0.358225 0.925175 0.767356 0.947012 0.426969 0.643445 0.738131 0.728681 0.314912 0.606148 0.734014 0.061149 0.52515 0.278242 0.410853 0.575736 0.476858 0.716448 0.50381 0.735936 0.711614 0.726204 0.79187 0.532111 0.249043 0.495299 0.48978 0.614022 0.223693 0.907543 0.0861676 0.330432 0.339118 0.531724 0.497999 0.489021 0.570725 0.116967 0.153322 0.270842 0.472179 0.243972 0.0404449 0.671394 0.255326 0.286576 0.119363 0.52595 0.962982 0.749311 0.73053 0.681197 0.808226 0.349587 0.660384 0.160472 0.46143 0.408649 0.438705 0.338531 0.404171 0.470306 
distance: 6.27085

find at 65 row
0.505255 0.28922 0.631373 0.918576 0.181592 0.588332 0.775476 0.771283 0.0793275 0.0847964 0.315319 0.457572 0.224771 0.68705 0.926852 0.529911 0.739419 0.801049 0.290878 0.515319 0.409873 0.803053 0.356623 0.800135 0.20588 0.337549 0.828714 0.880484 0.693433 0.33952 0.226299 0.812698 0.125389 0.490504 0.235795 0.0835198 0.562164 0.505707 0.443148 0.0495714 0.335637 0.114078 0.42344 0.50769 0.314323 0.874968 0.449576 0.556408 0.00430368 0.504196 0.199993 0.413889 0.478442 0.230585 0.068269 0.254272 0.369448 0.832987 0.353961 0.913728 0.79681 0.52022 0.420583 0.766039 
distance: 6.4031

find at 861 row
0.873469 0.750993 0.857495 0.47256 0.782863 0.476003 0.349614 0.16134 0.344077 0.440439 0.867413 0.116205 0.171824 0.629315 0.713136 0.260927 0.967486 0.329049 0.28973 0.461988 0.661987 0.554029 0.603161 0.257836 0.201252 0.928316 0.446082 0.876206 0.614237 0.349172 0.832989 0.635202 0.271704 0.97169 0.485299 0.455559 0.63825 0.581464 0.072068 0.324728 0.792392 0.16425 0.034549 0.0866274 0.333672 0.478069 0.0246297 0.588204 0.258855 0.11482 0.710857 0.715049 0.921586 0.976496 0.221699 0.178565 0.194623 0.247428 0.15257 0.577608 0.537408 0.921213 0.537721 0.514654 
distance: 6.47911

find at 419 row
0.693233 0.36702 0.463789 0.302791 0.984381 0.798606 0.906685 0.64936 0.0597382 0.439838 0.716839 0.842966 0.506091 0.833202 0.584998 0.932343 0.911039 0.919794 0.232314 0.546304 0.0473816 0.0195894 0.626377 0.205462 0.737214 0.979008 0.729008 0.884622 0.301808 0.451535 0.563585 0.958041 0.546344 0.120554 0.381652 0.531712 0.808454 0.425854 0.0982931 0.864027 0.226094 0.415601 0.989373 0.537119 0.34752 0.254235 0.265532 0.469958 0.260998 0.604158 0.500142 0.980621 0.0147639 0.136788 0.761824 0.394391 0.631748 0.278178 0.53514 0.157486 0.105327 0.558782 0.282595 0.819856 
distance: 6.49069

find at 55 row
0.257709 0.0859596 0.00411785 0.823344 0.965238 0.336131 0.316234 0.899132 0.213677 0.97542 0.556926 0.0236263 0.574166 0.493997 0.623751 0.57421 0.683779 0.610097 0.420989 0.864925 0.623292 0.580468 0.386176 0.413755 0.26925 0.924793 0.676345 0.805018 0.2051 0.206813 0.469955 0.489271 0.927807 0.877865 0.461383 0.285862 0.701429 0.77285 0.705987 0.413759 0.633338 0.0535123 0.551255 0.0130009 0.676162 0.759635 0.827795 0.53618 0.0360931 0.585662 0.715104 0.306481 0.346177 0.620009 0.511843 0.441329 0.344913 0.0284955 0.777338 0.229621 0.29236 0.634718 0.146859 0.692429 
distance: 6.49767


to query: 1: 
find at 273 row
0.677296 0.585997 0.0553738 0.864789 0.11853 0.827149 0.297492 0.408547 0.451095 0.45367 0.615399 0.612543 0.60031 0.406916 0.458278 0.481905 0.700731 0.90727 0.408412 0.206795 0.823983 0.566926 0.59377 0.655704 0.638159 0.445796 0.902688 0.653724 0.129332 0.134317 0.967511 0.759879 0.574875 0.46029 0.897517 0.471399 0.793974 0.0578429 0.996367 0.790528 0.176969 0.704368 0.732001 0.829726 0.905669 0.0661185 0.901356 0.307939 0.368865 0.725505 0.69113 0.123736 0.000904269 0.164443 0.449403 0.66592 0.529893 0.710248 0.799975 0.63966 0.0221227 0.416043 0.928486 0.212836 
distance: 6.14798

find at 798 row
1.15928 0.543615 0.224626 0.916172 0.90518 0.659441 0.629947 0.537924 0.392073 0.941471 0.678197 0.965487 0.375038 0.749908 0.534098 0.72416 0.0459989 0.523811 0.047497 0.290382 0.520448 0.744092 0.222655 0.898367 0.122647 0.521324 0.713795 0.836127 0.107169 0.505237 0.360213 0.851549 0.7897 0.682583 0.140249 0.429699 0.412399 0.0805599 0.921543 0.215152 0.0774941 0.573476 0.801407 0.926418 0.791248 0.19095 0.780052 0.697448 0.565905 0.227274 0.683294 0.853651 0.830031 0.374929 0.170605 0.921973 0.251035 0.262592 0.927703 0.937581 0.83409 0.52176 0.67702 0.359982 
distance: 7.00169

find at 843 row
1.10931 0.359975 0.801474 0.728754 0.416373 0.778199 0.736728 0.122723 0.951517 0.880154 0.411319 0.0441091 0.0909334 0.596108 0.899065 0.226211 0.851917 0.839522 0.478784 0.623026 0.487713 0.907234 0.684378 0.838186 0.0877218 0.997805 0.762191 0.782982 0.0210723 0.363053 0.948904 0.623911 0.0989273 0.61534 0.858033 0.178676 0.60604 0.254504 0.660599 0.468336 0.798998 0.952731 0.70202 0.570841 0.381107 0.0833921 0.0768369 0.45147 0.589095 0.331361 0.428048 0.288646 0.314958 0.549034 0.618986 0.454374 0.540626 0.786055 0.791562 0.748264 0.890162 0.224111 0.851973 0.489301 
distance: 7.12636

find at 248 row
0.914648 0.305961 0.495205 0.874023 0.458822 0.943957 0.123607 0.133754 0.907432 0.911847 0.363161 0.314546 0.202649 0.797917 0.437428 0.503002 0.262552 0.976047 0.784578 0.62072 0.691318 0.266727 0.324286 0.473234 0.679304 0.286232 0.618868 0.0670787 0.529058 0.418717 0.312213 0.859206 0.467795 0.116242 0.738621 0.284527 0.729952 0.488854 0.51562 0.11724 0.903801 0.54024 0.556299 0.648388 0.777709 0.724234 0.87198 0.759778 0.391615 0.924164 0.977455 0.212833 0.397751 0.312046 0.679121 0.585377 0.815007 0.539857 0.619751 0.530813 0.679936 0.203051 0.599092 0.419228 
distance: 7.18848

find at 442 row
1.0593 0.958399 0.857755 0.508242 0.723734 0.845 0.94509 0.0920004 0.567447 0.681435 0.850301 0.365953 0.510369 0.75197 0.782022 0.022814 0.521615 0.261907 0.370121 0.737841 0.363219 0.766053 0.350891 0.469102 0.227089 0.296823 0.472253 0.626026 0.208972 0.558516 0.252179 0.418061 0.554042 0.701471 0.819957 0.956868 0.160271 0.355167 0.986034 0.256701 0.220898 0.932148 0.708058 0.346972 0.169928 0.00512246 0.362594 0.232563 0.761832 0.48826 0.386363 0.380883 0.368631 0.741925 0.43805 0.0155159 0.567612 0.779756 0.850531 0.505262 0.177619 0.84065 0.72049 0.0713463 
distance: 7.37195


to query: 2: 
find at 218 row
0.789956 0.127211 0.350287 0.484015 0.820771 0.461537 0.850525 0.660183 0.323255 0.473371 0.439952 0.570391 0.405359 0.830808 0.880974 0.497624 0.7853 0.99512 0.59151 0.969219 0.305285 0.502814 0.413822 0.859267 0.512544 0.774588 0.463374 0.0351958 0.938322 0.272807 0.294392 0.643575 0.378597 0.757543 0.371534 0.423003 0.438085 0.83781 0.409435 0.534455 0.90046 0.128706 0.319977 0.500031 0.176647 0.396892 0.405317 0.893858 0.656467 0.786754 0.745251 0.403624 0.447224 0.889079 0.971002 0.0132697 0.879131 0.254961 0.392535 0.0976032 0.648378 0.322591 0.679886 0.684816 
distance: 6.59329

find at 234 row
1.17942 0.76557 0.759288 0.291068 0.189831 0.529297 0.131956 0.0411172 0.376996 0.0831702 0.784108 0.650577 0.109366 0.147236 0.319566 0.446089 0.189489 0.874222 0.369946 0.776055 0.188854 0.781561 0.278282 0.275609 0.926888 0.811937 0.033379 0.00249271 0.604275 0.454331 0.219904 0.625784 0.868002 0.301576 0.705079 0.28289 0.371428 0.632659 0.808288 0.213335 0.272214 0.737722 0.58116 0.0190753 0.627956 0.0410134 0.676979 0.344506 0.300955 0.302649 0.801941 0.884695 0.969953 0.61508 0.437414 0.152995 0.321124 0.766734 0.000507648 0.120141 0.809633 0.175838 0.796751 0.192669 
distance: 6.62435

find at 1036 row
1.64188 0.552404 0.11612 0.661468 0.488598 0.852641 0.787818 0.178657 0.280292 0.0725081 0.984331 0.478286 0.0196322 0.632602 0.875309 0.24 0.601203 0.0437491 0.370092 0.719343 0.313233 0.649225 0.851744 0.186134 0.689183 0.810082 0.0880837 0.596978 0.460979 0.153177 0.222988 0.19667 0.475848 0.91624 0.51381 0.100563 0.421953 0.638504 0.719034 0.608129 0.44102 0.108308 0.0407314 0.725532 0.732938 0.380898 0.903016 0.585978 0.0749376 0.000878291 0.793043 0.620847 0.486585 0.125832 0.92559 0.187296 0.724833 0.762369 0.714592 0.150298 0.747136 0.645827 0.371833 0.665408 
distance: 6.70976

find at 666 row
0.930705 0.0620238 0.735576 0.953759 0.495099 0.339989 0.204248 0.118341 0.329635 0.669598 0.594073 0.501553 0.0600229 0.250359 0.533784 0.233426 0.288357 0.553333 0.108614 0.543632 0.379357 0.45851 0.774319 0.322116 0.965752 0.264567 0.631253 0.872114 0.429717 0.48243 0.088136 0.384168 0.925825 0.690061 0.385414 0.89012 0.656293 0.609912 0.707917 0.496024 0.598189 0.122142 0.48215 0.731042 0.758647 0.435827 0.0392148 0.839645 0.795122 0.787899 0.822416 0.756708 0.683459 0.146435 0.249589 0.574608 0.200415 0.673407 0.802742 0.289194 0.665659 0.504368 0.838461 0.0141903 
distance: 6.78529

find at 462 row
0.521379 0.306417 0.0181092 0.444188 0.338208 0.555521 0.346579 0.233428 0.301085 0.672265 0.856702 0.345093 0.112477 0.983967 0.726925 0.198079 0.442038 0.150574 0.112896 0.963589 0.775355 0.474437 0.980524 0.547615 0.134789 0.138315 0.56777 0.639174 0.840585 0.751075 0.393751 0.619752 0.838916 0.622288 0.834107 0.316815 0.585204 0.559258 0.76408 0.893745 0.338895 0.832426 0.421515 0.788684 0.0436307 0.577539 0.575335 0.499603 0.231649 0.976224 0.993221 0.13217 0.803416 0.401813 0.605905 0.216065 0.246232 0.546508 0.263178 0.105177 0.225365 0.711612 0.495208 0.824331 
distance: 6.83551

From a data set of 10,000 vectors we found the vectors most similar to our 3 query vectors: for each query, the 5 nearest neighbours together with their distances.

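Note that faiss does not take the square root of the result. The reason is simple: the square root is monotonic, so the squared distances already give the same ordering, and taking the root for every candidate would be extra work for nothing. If you need the actual Euclidean distance, take the square root of the final results yourself.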
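
If you do want true Euclidean distances, a small post-processing step is enough. The sketch below is plain C++; the helper name `to_euclidean` is hypothetical, and it assumes you have copied the values from D into a `std::vector<float>`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Converts squared L2 distances (as stored in D) into Euclidean distances.
// The square root is monotonic, so the ordering of the results is unchanged.
std::vector<float> to_euclidean(const std::vector<float>& squared) {
    std::vector<float> out;
    out.reserve(squared.size());
    for (float d : squared) {
        assert(d >= 0.0f);  // squared distances are never negative
        out.push_back(std::sqrt(d));
    }
    return out;
}
```

Applied to the first query's best match above, a squared distance of 6.27085 becomes a Euclidean distance of about 2.504, and the five neighbours stay in the same order.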

So the next question is: how do you extract the face features in the first place?

Stay tuned to our algorithm platform, http://codes.strangeai.pro , where we will keep posting the latest and fastest face detection and matching algorithms.