session

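-- There are two tables, both user logs.

-- Table A: each time a user enters the page, a unique session_id is assigned, and the session ends when the user leaves. Every action the user takes during the session generates one record.

-- Table B: records how long each session lasted.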

-- A: date, session_id, user_id, act ('enter', 'exit', 'post') -- rows can be duplicated

-- B: date, session_id, time_spent (sec) -- (date, session_id) is the primary key.

Q1: Generate the average number of sessions per user per day. This asks for num_of_sessions / num_of_users for each date.

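-- A has one row per action, so count each session once with DISTINCT;
-- * 1.0 avoids integer division in engines that truncate.
SELECT date, COUNT(DISTINCT session_id) * 1.0 / COUNT(DISTINCT user_id) AS avg_sessions_per_user
FROM A
GROUP BY date
ORDER BY date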
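
A quick way to sanity-check the Q1 logic is to run it against a few made-up rows. The sketch below uses Python's built-in sqlite3 module; the sample data, the function name avg_sessions_per_user and the column alias avg_sessions are assumptions for illustration, not part of the original question. It counts each session once with DISTINCT, because table A holds one row per action rather than one row per session.

```python
import sqlite3

# Hypothetical sample data: (date, session_id, user_id, act)
SAMPLE_ROWS = [
    ("2024-01-01", "s1", "u1", "enter"),
    ("2024-01-01", "s1", "u1", "post"),
    ("2024-01-01", "s1", "u1", "exit"),
    ("2024-01-01", "s2", "u1", "enter"),
    ("2024-01-01", "s2", "u1", "exit"),
    ("2024-01-01", "s3", "u2", "enter"),
    ("2024-01-01", "s3", "u2", "exit"),
    ("2024-01-02", "s4", "u1", "enter"),
    ("2024-01-02", "s4", "u1", "exit"),
]

def avg_sessions_per_user(rows):
    """Return [(date, sessions / users), ...] ordered by date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE A (date TEXT, session_id TEXT, user_id TEXT, act TEXT)"
    )
    conn.executemany("INSERT INTO A VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT date,
               COUNT(DISTINCT session_id) * 1.0 / COUNT(DISTINCT user_id)
                   AS avg_sessions
        FROM A
        GROUP BY date
        ORDER BY date
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

print(avg_sessions_per_user(SAMPLE_ROWS))
# [('2024-01-01', 1.5), ('2024-01-02', 1.0)]
```

On 2024-01-01 there are three sessions (s1, s2, s3) shared by two users, hence 1.5. Counting rows instead of distinct sessions would count s1 three times, once per action.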

Q2: Generate the number of users per time interval, in order to measure how many users spend a given amount of time.

This question needs each session's duration to be taken into account first, and only then the aggregation.

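-- A has one row per action, so reduce it to one row per session before joining;
-- otherwise each session's time_spent is summed once per action.
SELECT total_time, COUNT(user_id) AS num_users
FROM
    (
    SELECT s.user_id, SUM(B.time_spent) AS total_time
    FROM (SELECT DISTINCT date, session_id, user_id FROM A) AS s
    JOIN B ON s.date = B.date AND s.session_id = B.session_id
    GROUP BY s.user_id
    ) AS temp
GROUP BY total_time
ORDER BY total_time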
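
The Q2 approach can be checked the same way. This is a minimal sketch using Python's built-in sqlite3 module with made-up rows; the function name users_per_time_spent, the alias total_time and the sample values are assumptions for illustration. It first collapses A to one row per session, joins to B on both key columns, sums time per user, and then counts users per total.

```python
import sqlite3

# Hypothetical sample data. A: (date, session_id, user_id, act)
A_ROWS = [
    ("2024-01-01", "s1", "u1", "enter"),
    ("2024-01-01", "s1", "u1", "post"),
    ("2024-01-01", "s1", "u1", "exit"),
    ("2024-01-01", "s2", "u1", "enter"),
    ("2024-01-01", "s2", "u1", "exit"),
    ("2024-01-01", "s3", "u2", "enter"),
    ("2024-01-01", "s3", "u2", "exit"),
    ("2024-01-01", "s4", "u3", "enter"),
    ("2024-01-01", "s4", "u3", "exit"),
]
# B: (date, session_id, time_spent in seconds)
B_ROWS = [
    ("2024-01-01", "s1", 60),
    ("2024-01-01", "s2", 40),
    ("2024-01-01", "s3", 100),
    ("2024-01-01", "s4", 30),
]

def users_per_time_spent(a_rows, b_rows):
    """Return [(total_time, num_users), ...] ordered by total_time."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE A (date TEXT, session_id TEXT, user_id TEXT, act TEXT)"
    )
    conn.execute(
        "CREATE TABLE B (date TEXT, session_id TEXT, time_spent INTEGER, "
        "PRIMARY KEY (date, session_id))"
    )
    conn.executemany("INSERT INTO A VALUES (?, ?, ?, ?)", a_rows)
    conn.executemany("INSERT INTO B VALUES (?, ?, ?)", b_rows)
    query = """
        SELECT total_time, COUNT(user_id) AS num_users
        FROM (
            SELECT s.user_id, SUM(B.time_spent) AS total_time
            FROM (SELECT DISTINCT date, session_id, user_id FROM A) AS s
            JOIN B ON s.date = B.date AND s.session_id = B.session_id
            GROUP BY s.user_id
        ) AS temp
        GROUP BY total_time
        ORDER BY total_time
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

print(users_per_time_spent(A_ROWS, B_ROWS))
# [(30, 1), (100, 2)]
```

Here u1 spends 60 + 40 = 100 seconds and u2 spends 100 seconds, so two users land on 100, while u3 alone lands on 30. Joining A to B without the DISTINCT step would add s1's 60 seconds three times, once per action row.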
