
I have been researching how to build a newsfeed that orders its items by relevance.

I have been reading about Facebook's EdgeRank and Etsy's Activity Feed Architecture, and I can't figure out how these companies calculate affinity.

In both cases the database stores affinity as a percentage-style value, for example 0.75.

Suppose I have a set of activity items like these:

User X commented on User Y
User X liked User Y

How would you derive an affinity percentage from them?

ericraio
  • The links you've mentioned are pretty great. The details of **Edge Affinity** already describe well how FB calculates it: based on your actions and how *close* you are to the person you are performing the action on. Are you specifically asking how to get a **percentage** affinity, or something else? – Gaurav Ramanan Oct 06 '15 at 19:40
  • @GauravRamanan After some research, I noticed that I could calculate an "affinity score" that tallies up all the interactions and adds a time aspect to them. But I am baffled as to why other companies opted to store a percentage-based value. – ericraio Oct 06 '15 at 20:38

1 Answer

As I understand your question, you are specifically asking why and how we calculate a percentage rather than an absolute affinity score.

Choosing the priority of social stories with EdgeRank is (theoretically) the same problem as choosing a page for a search result with PageRank, which in turn is the same as choosing one ad out of many for a particular search query.

Take the case of PageRank. According to Wikipedia:

[Figure: Wikipedia diagram for PageRank]

Mathematical PageRanks for a simple network, expressed as percentages. (Google uses a logarithmic scale.) Page C has a higher PageRank than Page E ...

Also see this -> http://www2007.org/posters/poster893.pdf

Similarly, I suspect the real reason for using percentages is to normalize the EdgeRank so that all News Feed stories are brought to the same scale.

Taking your example, let's say there are two pages, P1 and P2, that our user A has liked, and both have posted updates. Suppose Facebook's affinity formula for pages is (hypothetically) this:

popularity of page * frequency of user interaction

Or

number of page likes / time since last interaction from user A.

Say for Page P1 this is 100000 / 2 = 50000 and for Page P2 it is 2000 / 10 = 200.

So Page P1's news story wins because it is the more popular page, and its story will be shown before P2's.

But this also has to compete with stories from users, where the formula can be totally different, say:

Number of Mutual Friends * Number of Posts shared on their Wall

For a post from another user B this can be 1000 * 10 = 10000. When this value competes with P1's 50000, it loses in absolute terms. But having 1000 mutual friends is a really big deal! Arguably B's story should appear before P1's.

A solution to this would be to normalize all affinity scores to the range 0 to 1, so that scores produced by different formulas compete on a fair, relative scale.
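As a rough illustration of that idea, here is a minimal sketch that divides each raw score by the largest score of the same kind, so pages are scaled against pages and users against users. The raw scores for P1, P2 and B are the hypothetical numbers from above; user C, the `normalize_by_kind` helper and the choice of divide-by-max scaling are my own assumptions, not anything Facebook has published.

```python
from collections import defaultdict

def normalize_by_kind(raw_scores):
    """Scale each raw score to [0, 1] relative to the largest score of its kind.

    raw_scores is a list of (kind, name, score) tuples. Divide-by-max is only
    one possible normalization and is an assumption made for this sketch.
    """
    max_per_kind = defaultdict(float)
    for kind, _name, score in raw_scores:
        max_per_kind[kind] = max(max_per_kind[kind], score)
    return {
        name: (score / max_per_kind[kind] if max_per_kind[kind] > 0 else 0.0)
        for kind, name, score in raw_scores
    }

raw_scores = [
    ("page", "P1", 50000.0),  # hypothetical page score from the example
    ("page", "P2", 200.0),    # hypothetical page score from the example
    ("user", "B", 10000.0),   # hypothetical user score from the example
    ("user", "C", 2500.0),    # invented extra user so B has something to be scaled against
]

normalized = normalize_by_kind(raw_scores)
# Highest normalized affinity first
ranked = sorted(normalized, key=normalized.get, reverse=True)
print(normalized)
print(ranked)
```

After scaling, B is no longer outranked by P1 just because the page formula happens to produce bigger numbers; both sit at the top of their own scale and the remaining EdgeRank factors can break the tie.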

The second part of the EdgeRank formula is Edge Weight, which is similar to the above concept, but Edge Weight specifically focuses on the type of interaction: like, comment, share, etc.
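To tie this back to the original question ("User X commented on User Y", "User X liked User Y"), here is a minimal sketch of tallying interactions with a weight per interaction type and a time decay, then dividing by the strongest connection to get a value between 0 and 1. The weights, the 7-day half-life, user Z, and the function names are all assumptions for illustration; the real values used by Facebook or Etsy are not public as far as I know.

```python
from datetime import datetime, timedelta

# Hypothetical weights per interaction type (assumed, not Facebook's real values)
EDGE_WEIGHTS = {"like": 1.0, "comment": 2.0, "share": 3.0}
# Assumed half-life: an interaction counts half as much after 7 days
HALF_LIFE_DAYS = 7.0

def raw_affinity(interactions, now):
    """Sum weight * time decay over a list of (interaction_type, timestamp)."""
    total = 0.0
    for kind, when in interactions:
        age_days = (now - when).total_seconds() / 86400.0
        decay = 0.5 ** (age_days / HALF_LIFE_DAYS)
        total += EDGE_WEIGHTS.get(kind, 0.0) * decay
    return total

def affinity_fractions(interactions_by_target, now):
    """Scale one user's raw affinities to [0, 1] relative to their strongest one."""
    raw = {
        target: raw_affinity(items, now)
        for target, items in interactions_by_target.items()
    }
    top = max(raw.values(), default=0.0)
    if top == 0.0:
        return {target: 0.0 for target in raw}
    return {target: score / top for target, score in raw.items()}

now = datetime(2015, 10, 7)
# Activity items created by User X, grouped by the user they target
interactions_of_x = {
    "Y": [("comment", now), ("like", now - timedelta(days=7))],
    "Z": [("like", now - timedelta(days=14))],
}

fractions = affinity_fractions(interactions_of_x, now)
print(fractions)
```

The stored value is then just this fraction (for example 0.75), which can be recomputed periodically as new activity items arrive and old ones decay.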

All of these "competition algorithms" (EdgeRank, PageRank, etc.) are closely related to what are called Recommendation Systems. Such algorithms must normalize scores to a common scale in order to show results in a relevant order!

I'll keep adding to this answer if I find something more!

Gaurav Ramanan
  • My pleasure, @ericraio! Based on your interest in this, I strongly recommend you read up on **Recommendation Systems**; they're used everywhere from Facebook to e-commerce sites. Will add more info if I come across any :) Truly honored that you say *worth all your rep*! Unfortunately SE doesn't automatically award the bounty to the selected answer :P – Gaurav Ramanan Oct 07 '15 at 05:47
  • Yeah of course! I was looking into http://prediction.io and their universal recommendation template, would need to figure out how I would be sending events to the engine. – ericraio Oct 07 '15 at 06:06