Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Statistics relative to previous period are misleading #1650

Open
2 tasks done
RealOrangeOne opened this issue Jan 30, 2022 · 9 comments
Open
2 tasks done

Statistics relative to previous period are misleading #1650

RealOrangeOne opened this issue Jan 30, 2022 · 9 comments

Comments

@RealOrangeOne
Copy link
Contributor

Past Issues Searched

  • I have searched open and closed issues to make sure that the bug has not yet been reported

Issue is a Bug Report

  • This is a bug report and not a feature request, nor asking for self-hosted support

Describe the bug

On the dashboard, under "Unique visitors", is a percentage showing how today's numbers compare to yesterday's.

Unfortunately, it seems these are based on the entire stats of the previous day, rather than just those at the current time. At least, I think that's the case.

Expected behavior

These "previous period" statistics take into account how far through the current period we are.

For example, when looking at the "Day" statistics, and it's 4pm, it should only consider stats for yesterday up to 4pm, so the data is actually comparable.

Otherwise, the values aren't exceptionally useful unless going through extreme growth / decline in traffic.

Screenshots

image

Environment

N/A
@metmarkosaric
Copy link
Contributor

thanks for reporting @RealOrangeOne! yes, this is correct for the "today" view. there's a thread on this here #344

@ccmaris
Copy link

ccmaris commented Feb 9, 2022

Hey ho 👋

I just had a quick conversation with @ukutaht about contributing to Plausible, and I feel that this is an excellent first issue for me to get my feet wet.

Therefore, I'll start tackling this (probably this weekend) unless somebody else has already started with it.

BTW feel free to assign it to me if you want.

Cheers,
Chris

@metmarkosaric
Copy link
Contributor

oh that's nice to hear @ccmaris! would be a great improvement to Plausible to standardize comparisons so they're compared to exact same amount of time. thanks for your contribution!

@ccmaris
Copy link

ccmaris commented Mar 27, 2022

Hey there 👋
This would be tricky to tackle because the date range passed to the query doesn't specify the exact hour/time. Therefore, I can see two possible solutions:

  1. Expand the query.ex date_range attribute in a backward-compatible way that allows it to also include hours (big but proper change).
  2. Add a catch after the query is dispatched in the part of the codebase that satisfies the query so that it adapts in case the date_range is a single day (hacky and error-prone).

Which solution would you guys prefer?
And do you have any further suggestions here that I might have missed?

@ukutaht
Copy link
Contributor

ukutaht commented Mar 28, 2022

You're right, it's quite a bit trickier than I initially thought. I'm leaning towards option 2 because:

  1. Changing the query.ex is probably a bigger rabbit hole than it might seem at first :)
  2. We already do override some dates when actually querying the database to have more fine-grained control for realtime stats. It wouldn't be completely new.

I think adding another clause here that limits the query to the current hour of the day (when period=day) would be the most minimal change. As long as there's a test case for it I'm not worried about the error-proneness.

We may need to rethink the Query struct in the future but I would also look at the other exceptions we have made in that case and approach it holistically.

@ccmaris ccmaris removed their assignment Jun 5, 2022
@ccmaris
Copy link

ccmaris commented Jun 5, 2022

I had a look and think that this is tackled now ?!
I also didn't have as much free time as I thought :/

@djanda97
Copy link
Contributor

@ukutaht
Is something like this what you had in mind?

In lib/plausible/stats/base.ex:

  def utc_boundaries(%Query{period: "day"}, _timezone) do
    last_datetime = NaiveDateTime.utc_now() |> Timex.shift(seconds: 5)
    first_datetime = NaiveDateTime.utc_now() |> Timex.shift(days: -1)

    {first_datetime, last_datetime}
  end

@ukutaht
Copy link
Contributor

ukutaht commented Jan 17, 2023

Sorry no, I don't think this will work.

I think we need to add another case to to the shift_back function in query.ex:

def shift_back(%__MODULE__{period: "month"} = query, site) do

We already adjust the previous period for month and year ranges to make sure they only include a comparable time period from the previous one. Another clause needs to be added for the day period to make sure we include the same number of hours as are shown for the current query period.

@djanda97
Copy link
Contributor

Ah I see, that makes sense. The shift_back function looks like it handles the month and year periods similarly, so I'll try adding another case for the day period.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants