SQL
  • Tips
  • SQL
    • Database Basics
    • SQL Basics
    • SQL Syntax
    • Retrieve Data: SELECT
    • Sort Data: ORDER BY
    • Filter Data: WHERE
    • Calculated Fields
    • Aggregate Functions
    • Group Data: GROUP BY
    • Subqueries
    • Join Tables
    • Combine Queries: UNION
    • Control Flow Statements
    • IF function
    • Handle NULL
    • Date
    • Numeric
    • String
    • Notes
  • Table/Database
    • Insert
    • Delete
    • Update
    • Table
    • Database
    • Stored Procedure
  • Misc
    • SQL vs NoSQL
    • 大数据
    • Why SQL instead of Excel + VBA?
  • sqlzoo
    • world table
    • nobel table
    • football data
    • movie data
    • Teacher Department Data
    • Edinburgh Buses data
  • Leetcode
    • 175_Combine Two Tables
    • 176_Second Highest Salary
    • 177_Nth Highest Salary
    • 178_Rank Scores
    • 180_Consecutive Numbers
    • 181_Employees Earning More Than Their Managers
    • 182_Duplicate Emails
    • 183_Customers Who Never Order
    • 184_Department Highest Salary
    • 185_Department Top Three Salaries
    • 196_Delete Duplicate Emails
    • 197_Rising Temperature
    • 570_Managers with at Least 5 Direct Reports
    • 578_Get Highest Answer Rate Question
    • 579_Find Cumulative Salary of an Employee
    • 584_Find Customer Referee
    • 586_Customer Placing the Largest Number of Orders
    • 595_Big Countries
    • 596_Classes More Than 5 Students
    • 597_Friend Requests I: Overall Acceptance Rate
    • 601_Human Traffic of Stadium
    • 602_Friend Requests II: Who Has the Most Friends
    • 603_Consecutive Available Seats
    • 607_Sales Person
    • 608_Tree Node
    • 610_Triangle Judgement
    • 612_Shortest Distance in a Plane
    • 613_Shortest Distance in a Line
    • 619_Biggest Single Number
    • 620_Not Boring Movies
    • 626_Exchange Seats
    • 627_Swap Salary
  • Facebook 面经题
    • spam filter
    • marketplace
    • instagram
    • session
    • message confirmation
Powered by GitBook
On this page
  • Table: user_actions
  • Table: reviewer_removals
  • Q1: How many posts were reported yesterday for each report Reason?
  • Q2: What percent of daily content that users view on Facebook is actually Spam?
  • Q3: How to find the user who abuses this spam system?

Was this helpful?

  1. Facebook 面经题

spam filter

[2019-1-14, 2018-11-5]

Table: user_actions

ds (STRING) | user_id (BIGINT) |post_id (BIGINT) |action (STRING) | extra (STRING)

'2018-07-01'| 1209283021 | 329482048384792 | 'view' |

'2018-07-01'| 1209283021 | 329482048384792 | 'like' |

'2018-07-01'| 1938409273 | 349573908750923 | 'reaction' | 'LOVE'

'2018-07-01'| 1209283021 | 329482048384792 | 'comment' | 'Such nice Raybans'

'2018-07-01'| 1238472931 | 329482048384792 | 'report' | 'SPAM'

'2018-07-01'| 1298349287 | 328472938472087 | 'report' | 'NUDITY'

'2018-07-01'| 1238712388 | 329482048384792 | 'reshare' | 'I wanted to share with you all'

Table: reviewer_removals

ds (STRING) | reviewer_id (BIGINT) | post_id (BIGINT) |

'2018-07-01'| 3894729384729078 | 329482048384792 |

'2018-07-01'| 8477594743909585 | 388573002873499 |

Q1: How many posts were reported yesterday for each report Reason?

SELECT extra, count(post_id)
FROM user_actions
WHERE ds = CAST(SUBDATE(curdate(), 1) AS STRING) and action = 'report'
GROUP BY extra

Q2: What percent of daily content that users view on Facebook is actually Spam?

SELECT ds, avg(case when (action = 'report' and extra = 'SPAM') or r.post_id IS NOT NULL then 1 else 0 end)
FROM user_actions AS a
    LEFT JOIN reviewer_removals as r on a.post_id = r.post_id
GROUP BY ds
ORDER BY ds

Q3: How to find the user who abuses this spam system?

SELECT user_id, count(a.post_id) AS num_report, count(r.post_id) AS num_real_spam
FROM user_action AS a LEFT JOIN reviewer_removals AS r on a.post_id = r.post_id
WHERE action = 'report' and extra = 'SPAM'
GROUP BY user_id

Q3: Facebook has decided to be proactive about SPAM, instead of merely reactive. We decide to address the SPAM problem through a Machine Learning solution predicting whether a given post is Indeed SPAM. We want to use the predictions in order to downrank/deprioritize suspected SPAM from news feed. Q3的问题是如何来评估这个machine leaning有没有用.

PRODUCT:

Q1. Facebook用machine learning 建了一个model来rank content以达到filter spam的目的,需要关注什么metrics来评价这个model

Q2. 在用ab testing的时候发现用了新的spam model之后revenue下降了。面试官确定了首先这个model不会touch到ads,就是说ads不会被filter out。并且DAU/WAU/MAU和time spend没有变化,也就是说user方面没有变化。那么可能的原因是什么。

然后我问面试官revenue主要来自什么,面试官说是click ads。我说那么ads click的revenue主要可以break down成#user x CTR/CTP x price/click. 这个情况下只可能有变化的是CTR,也就是说因为用来新的model以后,这个平台的整体content质量更高了,那么user就更喜欢花更多时间去explore这些content,那么点击广告的时间就相对来说哦变少了,revenue也下降了。面试官说是这样的,采用新的model之后用户可能会花更多时间去看video之类的,那么用在ads上的时间就变少了。

PreviousFacebook 面经题Nextmarketplace

Last updated 5 years ago

Was this helpful?