This tutorial shows you how to build a personalized recommender system based on Milvus.

Environment requirements

The following table lists recommended configurations, which have been tested:

| Component | Recommended configuration |
| --- | --- |
| CPU | Intel® Core™ i7-7700K CPU @ 4.20GHz |
| GPU | GeForce GTX 1050 Ti 4GB |
| Memory | 32GB |
| OS | Ubuntu 18.04 |
| Software | Milvus 0.10.0, pymilvus 0.2.13, PaddlePaddle 1.6.1 |

The data source is the MovieLens million-scale dataset (ml-1m), created by GroupLens Research. Refer to ml-1m-README for more information.
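For orientation, the ml-1m README describes each line of users.dat as UserID::Gender::Age::Occupation::Zip-code, with ages stored as range codes (1, 18, 25, and so on). The sketch below shows one way such a line could be turned into the 0-based age, gender, and job codes used later in this tutorial. The function name and the age-to-index mapping are illustrative assumptions, not code taken from the bootcamp repository.

```python
# Age range codes as listed in the ml-1m README, in ascending order.
ML1M_AGE_CODES = [1, 18, 25, 35, 45, 50, 56]


def parse_user_line(line):
    """Parse one users.dat line: UserID::Gender::Age::Occupation::Zip-code.

    Returns (user_id, age_index, gender_code, job_code). Mapping ages to
    indexes 0..6 and genders to 0 (male) / 1 (female) is an assumption
    chosen to match the argument table of infer_milvus.py.
    """
    user_id, gender, age, occupation, _zip_code = line.strip().split("::")
    age_index = ML1M_AGE_CODES.index(int(age))
    gender_code = 0 if gender == "M" else 1
    return int(user_id), age_index, gender_code, int(occupation)


# Sample line in the users.dat format.
print(parse_user_line("1::F::1::10::48067"))  # (1, 0, 1, 10)
```

Under these assumptions, the sample user corresponds to an "Under 18", female, "K-12 student" profile.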

Follow the steps below to build a recommender system:

  1. Train the model.
    # Clone the bootcamp repository (0.10.0 branch), then run train.py
    $ git clone -b 0.10.0 https://github.com/milvus-io/bootcamp.git
    $ cd bootcamp/solutions/recommender_system/
    $ python3 train.py
    

This command generates a model file recommender_system.inference.model in the same folder.

  2. Generate test data.
    # Download movie data movies_origin.txt to the same folder
    $ wget https://raw.githubusercontent.com/milvus-io/bootcamp/0.5.3/demo/recommender_system/movies_origin.txt
    # Generate test data. The -f parameter is followed by the movie data filename.
    $ python3 get_movies_data.py -f movies_origin.txt
    

The above commands generate movies_data.txt in the same folder.

  3. Use Milvus for personalized recommendation by running the following command:
    # Milvus recommends movies based on the user's age, gender, and job
    $ python3 infer_milvus.py -a <age> -g <gender> -j <job> [-i]
    # Example 1
    $ python3 infer_milvus.py -a 0 -g 1 -j 10 -i
    # Example 2
    $ python3 infer_milvus.py -a 6 -g 0 -j 16
    

The following table describes the arguments of infer_milvus.py.

| Argument | Description |
| --- | --- |
| -a/--age | Age group. 0: "Under 18", 1: "18-24", 2: "25-34", 3: "35-44", 4: "45-49", 5: "50-55", 6: "56+" |
| -g/--gender | Gender. 0: male, 1: female |
| -j/--job | Job. 0: "other" or not specified, 1: "academic/educator", 2: "artist", 3: "clerical/admin", 4: "college/grad student", 5: "customer service", 6: "doctor/health care", 7: "executive/managerial", 8: "farmer", 9: "homemaker", 10: "K-12 student", 11: "lawyer", 12: "programmer", 13: "retired", 14: "sales/marketing", 15: "scientist", 16: "self-employed", 17: "technician/engineer", 18: "tradesman/craftsman", 19: "unemployed", 20: "writer" |
| -i/--infer | (Optional) Converts the test data to vectors and imports them into Milvus. |

Note: -i/--infer is required the first time you use Milvus for personalized recommendation, and again whenever you retrain and regenerate the model.
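To make the argument ranges concrete, here is a minimal sketch of a command-line parser that accepts the same flags and rejects out-of-range codes. It is an illustration built with Python's standard argparse module, not the parser that infer_milvus.py actually uses; the build_parser name and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Sketch of the infer_milvus.py command-line interface")
    # Age group codes 0..6, from "Under 18" to "56+".
    parser.add_argument("-a", "--age", type=int, choices=range(7),
                        required=True, help="age group code (0-6)")
    # Gender codes: 0 for male, 1 for female.
    parser.add_argument("-g", "--gender", type=int, choices=(0, 1),
                        required=True, help="gender code (0 or 1)")
    # Job codes 0..20, from "other" to "writer".
    parser.add_argument("-j", "--job", type=int, choices=range(21),
                        required=True, help="job code (0-20)")
    # Optional switch: convert test data to vectors and import them.
    parser.add_argument("-i", "--infer", action="store_true",
                        help="vectorize the test data and import it")
    return parser


# Same arguments as Example 1 in the steps above.
args = build_parser().parse_args(["-a", "0", "-g", "1", "-j", "10", "-i"])
print(args.age, args.gender, args.job, args.infer)  # 0 1 10 True
```

Restricting each flag with choices means an invalid code such as -a 7 fails at parse time rather than deep inside the inference code.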

The result displays the top 5 movies that the specified user might be interested in:

    get infer vectors finished!
    Server connected.
    Status(code=0, message='Create table successfully!')
    rows in table recommender_demo: 3883
    Top      Ids     Title           Score
    0        3030    Yojimbo         2.9444923996925354
    1        3871    Shane           2.8583481907844543
    2        3467    Hud             2.849525213241577
    3        1809    Hana-bi         2.826111316680908
    4        3184    Montana         2.8119677305221558

To verify the result, run python3 infer_paddle.py. Paddle and Milvus generate the same result.
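Conceptually, the ranking step amounts to scoring every movie vector against the user vector and keeping the highest scores. The sketch below is a brute-force version of that idea in plain Python. It assumes inner-product similarity, which this tutorial does not state explicitly, and it uses made-up two-dimensional vectors and movie IDs; Milvus performs the equivalent search at scale rather than looping as shown here.

```python
import heapq


def inner_product(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k(user_vector, movie_vectors, k=5):
    """Return the k (movie_id, score) pairs with the highest scores.

    movie_vectors maps a movie ID to its feature vector. The similarity
    metric (inner product) is an assumption made for this sketch.
    """
    scored = ((movie_id, inner_product(user_vector, vec))
              for movie_id, vec in movie_vectors.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy data: hypothetical movie IDs and vectors, not taken from ml-1m.
movies = {
    101: [1.0, 0.0],
    102: [0.5, 0.5],
    103: [0.0, 1.0],
}
user = [2.0, 1.0]
print(top_k(user, movies, k=2))  # [(101, 2.0), (102, 1.5)]
```

A brute-force scan like this costs one score computation per movie, which is fine for a few thousand items but is the part a vector search engine is meant to take over as the collection grows.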