This is a follow-up to the question "MySql Select rows with 30 minutes difference in date". Although it is similar in concept, the solution needed might be different.
I have a MySql-8.0/MariaDb-10.4 table that contains a list of site visits of different visitors:
I want to create a query that returns the last visit of each visit session, where a new session starts when a visit's CreatedAt is 30 minutes or more after the same visitor's previous visit.
So in my case, the query should return rows 7, 12 and 13 (Id column). Note also that a session can last longer than 30 minutes, as long as each visit follows the previous one by less than 30 minutes.
The neat solution suggested by @EugenRieck was as follows:
SELECT late.*
FROM activities AS late
LEFT JOIN activities AS early
  ON late.VisitorId = early.VisitorId
 AND late.CreatedAt > early.CreatedAt
 AND late.CreatedAt <= DATE_ADD(early.CreatedAt, INTERVAL +30 MINUTE)
WHERE early.Id IS NULL
-- Maybe: AND late.VisitorId='26924c19-3cd1-411e-a771-5ebd6806fb27'
-- Maybe: ORDER BY late.CreatedAt
It works great, but it returns the first visit in each visit session, not the last visit. I tried to modify it to work as I wanted, but with no luck. Please help.
This is a variant of the gaps-and-islands problem, but you can handle it using lead(). Just check whether the next createdAt is more than 30 minutes after the value in a given row. If it is, that row is the last row of a session:
select a.*
from (select a.*,
             lead(createdAt) over (partition by visitorid order by createdat) as next_ca
      from activities a
     ) a
where next_ca > createdAt + interval 30 minute;
Usually, in this situation you would want the last row for each visitor as well, since it has no next visit and so next_ca is null. You would get that by adding or next_ca is null to the where clause.
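Putting the two conditions together, the complete query might look like the sketch below. It assumes the table and column names from the question (activities, Id, VisitorId, CreatedAt) and that two visits by the same visitor never share an identical CreatedAt; if they can, adding Id to the order by would be one way to make the ordering deterministic.

```sql
-- Last visit of every session, including each visitor's most recent session.
select a.*
from (select a.*,
             -- timestamp of the same visitor's next visit, or null if there is none
             lead(CreatedAt) over (partition by VisitorId order by CreatedAt) as next_ca
      from activities a
     ) a
where next_ca > CreatedAt + interval 30 minute  -- next visit starts a new session
   or next_ca is null;                          -- no later visit at all
```

If you would rather stay close to the LEFT JOIN from the question, swapping the roles of the two aliases should give the same rows: keep every visit that has no later visit by the same visitor within the following 30 minutes. This is an untested variation on the quoted query, not something taken from the original answer.

```sql
-- Anti-join variant: a visit ends a session when nothing follows it within 30 minutes.
SELECT early.*
FROM activities AS early
LEFT JOIN activities AS late
  ON late.VisitorId = early.VisitorId
 AND late.CreatedAt > early.CreatedAt
 AND late.CreatedAt <= DATE_ADD(early.CreatedAt, INTERVAL 30 MINUTE)
WHERE late.Id IS NULL;
```

The window-function version reads the table once per visitor partition, while the anti-join compares each row against its neighbours, so on larger tables the lead() form is usually the one to try first.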