database - Tracking the number of times web content has been viewed within a page?
I was reading more about Quora's answer ranking algorithm, and came across the following feature, which I'm trying to reverse engineer:
http://blog.quora.com/improved-answer-ranking-follow-up
"With the new answer ranking, we are not focusing solely on the absolute number of upvotes and downvotes; we are also considering the level of attention an answer has received. For example: if 20 people see an answer, and 20 of them upvote it, that may be a stronger quality signal than if thousands of people see an answer and 100 upvote it."
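To make the quoted example concrete, here is a small sketch that compares the two answers by upvotes per view. The function name `upvote_rate` and the figure of 2,000 views standing in for "thousands" are assumptions for illustration only; Quora has not published the formula it actually uses.

```python
def upvote_rate(upvotes: int, views: int) -> float:
    """Fraction of viewers who upvoted; 0.0 when nobody has seen the answer."""
    if views <= 0:
        return 0.0
    return upvotes / views


# Answer A: 20 people saw it and all 20 upvoted.
rate_a = upvote_rate(20, 20)
# Answer B: assumed 2,000 views ("thousands") and 100 upvotes.
rate_b = upvote_rate(100, 2000)

print(rate_a, rate_b)  # 1.0 0.05
```

Answer B has five times as many upvotes, yet only 5% of its viewers upvoted it. Computing this kind of signal is impossible without a per-answer view count.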
A little background on Quora: it's similar to Stack Exchange in layout. Someone posts a question, people reply with answers, and the site ranks and displays the answers on a single page.
Given that the answers are compiled into a single page, how does Quora keep track of the number of views each individual answer has gotten?
Potential hypothesis:
- Each individual answer is stored in the database, along with a counter of how many times it has been fetched.
- When a user first visits the page, the first few answers are fetched from the database and shown on the page. As the user scrolls down, more answers are dynamically fetched through additional requests.
- Each time an answer is fetched from the database, its counter is incremented, tracking the number of times the answer has been seen by viewers.
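The hypothesis above can be sketched as follows, using an in-memory SQLite database as a stand-in for the real store. The table name `answers`, the column `fetch_count`, and the helper `fetch_answers` are all hypothetical; nothing here reflects Quora's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE answers ("
    "id INTEGER PRIMARY KEY, body TEXT, fetch_count INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO answers (id, body) VALUES (?, ?)",
    [(i, f"answer {i}") for i in range(1, 6)],
)


def fetch_answers(conn, offset, limit):
    """Return one page of answers and bump each returned answer's counter."""
    rows = conn.execute(
        "SELECT id, body FROM answers ORDER BY id LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()
    # One write per fetched answer: every read request now also updates the database.
    conn.executemany(
        "UPDATE answers SET fetch_count = fetch_count + 1 WHERE id = ?",
        [(row[0],) for row in rows],
    )
    conn.commit()
    return rows


first_page = fetch_answers(conn, offset=0, limit=2)   # initial page load
second_page = fetch_answers(conn, offset=2, limit=2)  # user scrolled down
first_again = fetch_answers(conn, offset=0, limit=2)  # a second visitor arrives

counts = dict(conn.execute("SELECT id, fetch_count FROM answers ORDER BY id"))
print(counts)  # {1: 2, 2: 2, 3: 1, 4: 1, 5: 0}
```

Note that the counter really measures fetches, not views: an answer that was fetched but never scrolled into sight would still be counted.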
Concerns with this approach:
- Every single request requires database updates, which would worsen the database workload.
- Instead of batch-fetching 10-20 answers when the user loads the page, the site would have to fetch 1-2 answers at a time, every time the user scrolls to the bottom of the page. This would worsen latency and the user experience, as the user would have to keep waiting for additional content to show up.
Are these real concerns that would blow up at scale? Or can they be managed?
Here is some speculation on how it's done.
Storing view stats
Yes, Quora would need to store views per answer, which is commonly done at scale by app developers. However, you imply storing the count in the same place as the answer, whereas in practice you would store it separately, in a medium that's optimised more for fast writes and less for reliability (it's okay if you miss a few views due to a server outage; it's less okay if you don't save a user's answer). For example, the counts could be stored in Redis, which keeps the stats in memory and, in its default configuration, snapshots them to disk periodically (as often as once a minute under heavy write load). Or you could store them in Memcached and write your own periodic process to dump the results to the main database.
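As a rough illustration of the "fast writes, periodic dump" idea, the sketch below buffers counts in a plain in-process `Counter` (standing in for Redis or Memcached) and flushes them to a SQLite table (standing in for the main database). The class `BufferedViewCounter` and the table `answer_stats` are made-up names, and a real deployment would trigger `flush()` from a timer or scheduled job.

```python
import sqlite3
from collections import Counter


class BufferedViewCounter:
    """Collects view counts in memory and writes them to the database in batches."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = Counter()  # stand-in for Redis or Memcached

    def record_view(self, answer_id):
        # Cheap in-memory increment; lost if the process dies before flush().
        self.pending[answer_id] += 1

    def flush(self):
        """Apply pending counts with one UPDATE per answer, then clear the buffer."""
        batch, self.pending = self.pending, Counter()
        self.conn.executemany(
            "UPDATE answer_stats SET views = views + ? WHERE answer_id = ?",
            [(count, answer_id) for answer_id, count in batch.items()],
        )
        self.conn.commit()
        return len(batch)


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE answer_stats ("
    "answer_id INTEGER PRIMARY KEY, views INTEGER NOT NULL DEFAULT 0)"
)
db.executemany("INSERT INTO answer_stats (answer_id) VALUES (?)", [(1,), (2,)])

counter = BufferedViewCounter(db)
for answer_id in [1, 1, 2, 1]:
    counter.record_view(answer_id)

rows_written = counter.flush()
totals = dict(db.execute("SELECT answer_id, views FROM answer_stats ORDER BY answer_id"))
print(rows_written, totals)  # 2 {1: 3, 2: 1}
```

Four views turn into two database writes here, and the ratio improves as traffic concentrates on popular answers. The trade-off is the one described above: a crash between flushes loses the buffered views.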
Counting views
It's unlikely that views are counted the way you describe, i.e. by how many times the data is requested, because a distributed architecture should be caching this kind of content in the browser and at intermediate points along the way. It's more likely that they track views directly in the browser and in their apps, by checking, upon scroll events, whether an element has become visible. The client would then periodically upload a bulk list of viewed items.
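The client-side logic might look roughly like the sketch below. It is written in Python only to keep the logic easy to run and test; in a browser the same idea would live in a scroll handler or use the Intersection Observer API. The `ViewTracker` class, the pixel layout, and the JSON payload shape are all assumptions for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ViewTracker:
    viewport_height: int
    seen: set = field(default_factory=set)       # answers already counted this visit
    pending: list = field(default_factory=list)  # seen but not yet uploaded

    def on_scroll(self, scroll_top, answers):
        """answers maps answer_id -> (top, bottom) pixel offsets within the page."""
        view_top, view_bottom = scroll_top, scroll_top + self.viewport_height
        for answer_id, (top, bottom) in answers.items():
            visible = top < view_bottom and bottom > view_top  # any overlap counts
            if visible and answer_id not in self.seen:
                self.seen.add(answer_id)
                self.pending.append(answer_id)

    def build_upload(self):
        """Return a JSON payload of newly viewed answers and clear the queue."""
        payload = json.dumps({"viewed": self.pending})
        self.pending = []
        return payload


layout = {"a1": (0, 400), "a2": (400, 900), "a3": (900, 1500)}
tracker = ViewTracker(viewport_height=600)
tracker.on_scroll(0, layout)    # a1 and a2 overlap the 0..600 viewport
tracker.on_scroll(100, layout)  # nothing new becomes visible
tracker.on_scroll(500, layout)  # viewport 500..1100 now reaches a3
print(tracker.build_upload())   # {"viewed": ["a1", "a2", "a3"]}
```

Because each answer is queued only once per visit and the uploads are batched, the server receives one small request every so often rather than a write per answer per scroll, and the answers themselves can still be fetched in large, cacheable batches.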