nssp pipeline code #1952
Conversation
Ran the pipeline and it seems to be pulling correctly, and the fields that are generated make sense. Generally looks good; I have some minor cosmetic suggestions.
I guess the archive differ is the only part that really remains?
Ran these as it suggests and things run fine. The linter is a little angry. Either way
```python
limit = 50000  # maximum limit allowed by SODA 2.0
while True:
    page = client.get("rdmq-nq56", limit=limit, offset=offset)
    if not page:
        break  # exit the loop if no more results
    results.extend(page)
    offset += limit
```
Suggested change:

```python
limit = 50_000
max_ii = 100  # sentinel: only stays at 100 if every page below is non-empty
for ii in range(100):
    page = client.get("rdmq-nq56", limit=limit, offset=offset)
    if not page:
        max_ii = ii
        break  # exit the loop if no more results
    results.extend(page)
    offset += limit
if max_ii == 100:
    raise ValueError("client has pulled 100x the socrata limit")
```
This is probably fine, but `while True` freaks me out. Feel free to use or not.
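An alternative sketch using `for`/`else` (same hypothetical `client`/`results`/`offset` setup as the suggestion above) drops the sentinel variable entirely; the `else` branch runs only if the loop finishes all 100 iterations without hitting `break`:

```python
for ii in range(100):
    page = client.get("rdmq-nq56", limit=limit, offset=offset)
    if not page:
        break  # no more results
    results.extend(page)
    offset += limit
else:
    # only reached if the loop never broke, i.e. all 100 pages were full
    raise ValueError("client has pulled 100x the socrata limit")
```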
@dsweber2 why did you choose 100 here for the limit in your rewrite? I believe 50k is the per-page item limit, not the total limit. So theoretically we could get an infinite-page result.
100 was maybe too low, though it would correspond to 5,000,000 items (100 pages of 50,000 each). If we're pulling more than that, it should be quite a while down the road, or something has gone wrong. (Or I may be misunderstanding how item counts work.)
This file seems to be unused: nssp/tests/test_data/page.txt
Some nits and style suggestions. Question about which geos we're reporting.
Some remaining documentation cleanup. I also have a few questions about the data. Looks like there is data that is not getting included in the aggregation step -- we'll want to think about what to do with that.
nssp/delphi_nssp/run.py
Outdated
```python
elif geo == "hrr" or geo == "msa":
    df = df[['fips', 'val', 'timestamp']]
    df = geo_mapper.add_population_column(df, geocode_type="fips", geocode_col="fips")
    df = geo_mapper.add_geocode(df, "fips", geo, from_col="fips", new_col="geo_id")
```
note: This step adds many rows (238k -> 395k) in the HRR aggregation case. It appears this is because HRRs are made up of zip codes, which cross county boundaries. So some FIPS codes map to multiple HRRs.
This step drops many rows (238k -> 92k) in the MSA case, because many counties don't fall into an MSA.
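To illustrate the row growth in the HRR case, here is a toy sketch (made-up FIPS codes, HRR names, and weights, not the real crosswalk): a county whose zip codes straddle two HRRs appears once per HRR, so the merge multiplies its rows.

```python
import pandas as pd

# toy crosswalk: fips 06001's zips straddle two hypothetical HRRs,
# so it maps to both, with population-share weights
crosswalk = pd.DataFrame({
    "fips":   ["06001", "06001", "06003"],
    "hrr":    ["hrr_a", "hrr_b", "hrr_a"],
    "weight": [0.7, 0.3, 1.0],
})

df = pd.DataFrame({"fips": ["06001", "06003"], "val": [10.0, 20.0]})

# 2 input rows -> 3 output rows, because 06001 matches two crosswalk rows
merged = df.merge(crosswalk, on="fips")
print(merged)
```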
nssp/delphi_nssp/run.py
Outdated
```python
    generate the relevant population amounts, and create a weighted but
    unnormalized column, derived from `column_aggregating`
    """
    # set the weight of places with na's to zero
```
question: What is the meaning of a missing value in the incoming data? Censored for privacy? 0? Too small a sample size to accurately report?
nssp/delphi_nssp/run.py
Outdated
```python
df = geo_mapper.add_geocode(df, "fips", geo, from_col="fips", new_col="geo_id")
df = generate_weights(df, "val")
df = weighted_geo_sum(df, "geo_id", "val")
df = df.groupby(["timestamp", "geo_id", "val"]).sum(numeric_only=True).reset_index()
```
question: I'm confused about this line. I don't think we need to do another sum; I thought we were done on the previous line. This step is actually removing dates/geos where the final aggregated value is `NaN`, but other values are unchanged. If that is our goal, the code should be more clear about that goal (e.g. it's unclear why this groups by `val`).

question: do we want to remove `NaN`s? I can't remember what we've done in the past.
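If removing the `NaN` rows really is the goal, a more explicit sketch (made-up data; assuming the final aggregated column is named `val`) would be:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "timestamp": ["2024-04-13", "2024-04-13", "2024-04-20"],
    "geo_id": ["a", "b", "a"],
    "val": [1.0, np.nan, 2.0],
})

# drop the date/geo rows whose final aggregated value is NaN,
# leaving every other row unchanged
df = df.dropna(subset=["val"]).reset_index(drop=True)
print(df)  # the ("2024-04-13", "b") row is gone
```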
nssp/delphi_nssp/run.py
Outdated
```python
        return np.nan
    return np.nansum(x)


def weighted_geo_sum(df: pd.DataFrame, geo: str, sensor: str):
```
praise: Nice! Thanks for adding the aggregation in. Now we'll have an actual reference for how to aggregate non-summable signals 👏
We'll also need to do a statistical review of the created data. Because there's no formal process for this, you're free to do whatever you think is reasonable/sufficient. I'd say a thorough analysis would include:

- Example analysis 1, analysis 2. Part of analysis for wastewater.
- Getting the stakeholder to look at your plots and approve if they look reasonable.

Some of this investigation will be useful to include in the signal documentation. I'd recommend putting the plots in the GitHub issue so they're easy to find later.
nssp/delphi_nssp/run.py
Outdated
```python
df = generate_weights(df, "val")
df = weighted_geo_sum(df, "geo_id", "val")
```
note: once the population and new geo_id columns have been added on, a simple weighted mean can be done with

```python
df.groupby(["timestamp", "geo_id"]).apply(lambda x: sum(x.population * x.val) / x.population.sum())
```

but this doesn't handle the `NaN`s appropriately. Actually I'm surprised that there isn't a built-in weighted mean function. `numpy` has an average function that takes weights but also doesn't let you ignore `NaN`s.
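A minimal sketch of a `NaN`-ignoring weighted mean (the helper name and the toy data here are hypothetical, not the indicator's actual code):

```python
import numpy as np
import pandas as pd

def nan_weighted_mean(values, weights):
    """Weighted mean that ignores entries whose value is NaN."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mask = ~np.isnan(values)
    if not mask.any() or weights[mask].sum() == 0:
        return np.nan
    return np.average(values[mask], weights=weights[mask])

df = pd.DataFrame({
    "timestamp": ["2024-04-13"] * 3,
    "geo_id": ["a", "a", "b"],
    "population": [100.0, 300.0, 50.0],
    "val": [2.0, np.nan, 4.0],
})

# per-group weighted mean that skips the NaN instead of propagating it
out = df.groupby(["timestamp", "geo_id"]).apply(
    lambda g: nan_weighted_mean(g["val"], g["population"])
)
print(out)
```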
yeah, I had to write these as a way to handle `NaN` correctly for nwss and they're getting reused here. I plan to eventually add them to the geomapper. It is slower than using precomputed weights, so I'm also debating adding those. There's probably also room for optimizing them, though.

We may want to find a domain expert to comment on the assumption that "# of ER visits" is proportional to "county population" and similar assumptions that actually make this work in other contexts.
Yeah, at least an acknowledgement of the assumption would be good to include in the documentation for this data source.
Talking with @dshemetov, apparently in the HRR case the weighted fips -> hrr map is already present, so we may not need to do this. He may add the weighted fips -> msa map in the near future, since the data is already there for it.

Oh, one thing I would add from our discussion: CA does have state level data, somehow. I guess the reporting pipelines must be different.
Does it make sense to censor these? Pros:
Cons:
Given we're pulling from a public data source, I would trust them to have done the censoring correctly. Maybe this is historically a bad assumption?
Some correlation data. I'm comparing the state level data with hhs.

Flu: the overall (Spearman) correlation is 0.87, so they are strongly correlated in general (which one very much expects).

- Fixed State, over time
- Fixed Time over states: significantly lower on average. I think we have a mild example of Simpson's paradox or something along those lines.
- Lag correlations

Covid: overall correlation is 0.81, so a bit worse than flu but still quite high.

- Covid Fixed States over time
- Covid Fixed Time over states: significantly lower on average than both the corresponding flu and the spatial average.
- Covid Lag analysis
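For reference, a self-contained sketch of how these three correlation views can be computed with pandas (toy data and hypothetical column names; the actual analysis code isn't in this thread):

```python
import pandas as pd

# hypothetical merged frame: one row per (state, week) with both signals
merged = pd.DataFrame({
    "state": ["ca", "ca", "ca", "ny", "ny", "ny"],
    "week":  [1, 2, 3, 1, 2, 3],
    "nssp_val": [1.0, 2.0, 4.0, 2.0, 3.0, 5.0],
    "hhs_val":  [1.1, 2.2, 3.9, 1.8, 3.1, 5.2],
})

# overall correlation across all rows
overall = merged["nssp_val"].corr(merged["hhs_val"], method="spearman")

# fixed state, over time: one correlation per state
by_state = merged.groupby("state").apply(
    lambda g: g["nssp_val"].corr(g["hhs_val"], method="spearman")
)

# fixed time, over states: one correlation per week
by_week = merged.groupby("week").apply(
    lambda g: g["nssp_val"].corr(g["hhs_val"], method="spearman")
)

print(overall, by_state, by_week, sep="\n")
```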
Quick analysis of nssp signals source backfill behavior

I set up a cron job to run the following chunk of code on a server every day for the past 2 weeks. The chunk creates a daily snapshot of the data available through the Socrata API.

Code to take daily snapshot from source api:

```python
import pandas as pd
from sodapy import Socrata
from datetime import date

today = date.today()

socrata_token = 'sOcrAt4t0k3n'  # replace with your own Socrata token
client = Socrata("data.cdc.gov", socrata_token)

results = []
offset = 0
limit = 50000  # maximum limit allowed by SODA 2.0
while True:
    page = client.get("rdmq-nq56", limit=limit, offset=offset)
    if not page:
        break  # exit the loop if no more results
    results.extend(page)
    offset += limit

df_ervisits = pd.DataFrame.from_records(results)
df_ervisits.to_csv(f'{today}.csv', index=False)
```

Grab those daily snapshots of data and put them into a dictionary of dataframes for analysis.

Setup for analysis:

```python
import os
import pandas as pd

# Get the current directory
current_dir = os.getcwd()

# Initialize an empty dictionary to store the dataframes
dataframes = {}

# Iterate over each file in the current directory
for file in os.listdir(current_dir):
    if file.endswith(".csv"):
        # Get the name of the CSV file without the extension
        name = os.path.splitext(file)[0]
        # Read the CSV file into a dataframe
        df = pd.read_csv(file)
        # Store the dataframe in the dictionary with the name as the key
        dataframes[name] = df
```

We can see here that (at least recently) the API content is updated every week, some time between Thursday night and Friday afternoon.

Compare snapshot code:

```python
from datetime import datetime, timedelta

dataframes = {key: dataframes[key] for key in sorted(dataframes)}
for date1 in dataframes:
    date2 = datetime.strptime(date1, "%Y-%m-%d") + timedelta(days=1)
    date2 = date2.strftime("%Y-%m-%d")
    if date2 not in dataframes:
        continue
    if dataframes[date1].equals(dataframes[date2]):
        print(f"Snapshot {date1} and Snapshot {date2} are alike.")
    else:
        print(f"Snapshot {date1} and Snapshot {date2} are different.")
```

Output:

Specifically, every week the source updates with info from the week before.

```python
for datedf in dataframes:
    df = dataframes[datedf]
    # Convert the 'week_end' column to datetime
    df['week_end'] = pd.to_datetime(df['week_end'])
    # Get the row with the most recent date
    most_recent_row = df.loc[df['week_end'].idxmax()]
    print(f"Most recent row in {datedf}:")
    print(most_recent_row['week_end'])
```

More comparison, specifically of the case between the data available on 20240418 and 20240419, here.
We should get this merged and running on staging @nmdefries @melange396. Every Friday that we don't have it running on staging is revisions that we're losing. What blockers remain?
I'm working through a big backlog of PRs, but this one is now at the top of my list. @minhkhul: we should be able to get this running on staging before it's merged, right?
@dsweber2 @melange396 I wrote some code a month ago to call the API every day and keep all the data returned on a separate server. These can be transformed and loaded as patches later. So no worries about missing data.
Oh, good! I thought I remembered having a discussion about that but wasn't sure if it was for this or something else.
I'm gonna do a deeper look at the core code, but I did a first pass to see if I could narrow down which files I needed to inspect more closely; I knocked off 29/46 files that way!
There are a few things that seem to be out of scope for this PR, like all the `max-line-length=120` changes in the pylint configs from other indicators... Are they going to break the lint steps in other indicators?
"max_age":15, | ||
"max_age":13, |
duplicate key; which value is appropriate? `sir_complainsalot/params.json.template` claims "13" for this, but `ansible/templates/nssp-params-prod.json.j2` has `"max_expected_lag": {"all": "15"}`.
my bad, this came from an apparently botched rebase. Minh specifically set it to 13.
what about the "15" in the other location? are those not referring to the ~same thing?
fixed. They are all uniformly max 13 days now, which is based on analysis of API data in April 2024.
| published on | most recent data (`week_end`) | lag (days) |
| --- | --- | --- |
| 2024-04-17 | 2024-04-06 | 12 |
| 2024-04-18 | 2024-04-06 | 13 |
| 2024-04-19 | 2024-04-13 | 7 |
| 2024-04-20 | 2024-04-13 | 8 |
| 2024-04-21 | 2024-04-13 | 9 |
| 2024-04-22 | 2024-04-13 | 10 |
| 2024-04-23 | 2024-04-13 | 11 |
| 2024-04-24 | 2024-04-13 | 12 |
| 2024-04-25 | 2024-04-13 | 13 |
| 2024-04-26 | 2024-04-20 | 7 |
| 2024-04-27 | 2024-04-20 | 8 |
| 2024-04-29 | 2024-04-20 | 10 |
| 2024-04-30 | 2024-04-20 | 11 |
| 2024-05-01 | 2024-04-20 | 12 |
nssp/setup.py
Outdated
```python
setup(
    name="delphi_nssp",
    version="0.0.1",
```
`nssp/delphi_nssp/__init__.py` claims version "0.1.0"... We should probably make them agree.
switching to `0.1.0` to match other initial release indicators. Though I did notice that e.g. `claims_hosp/version.cfg` has `current_version = 0.3.54`, which seems weird given that its `setup.py` says `0.1.0`.
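One way to prevent this kind of drift (a sketch, assuming `__init__.py` defines a `__version__` string; not necessarily how this repo's bump2version setup works) is to declare the version once and parse it in `setup.py`:

```python
import re
from pathlib import Path

from setuptools import setup

# single source of truth: the version string in the package __init__
init_text = Path("delphi_nssp/__init__.py").read_text()
version = re.search(r'__version__\s*=\s*"(.+?)"', init_text).group(1)

setup(
    name="delphi_nssp",
    version=version,
)
```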
oof, that's a problem with our bump2version setup and/or our conventions... I'll make an issue to look at it.
```r
local({

  # the requested version of renv
  version <- "1.0.7"
```
this file seems to be imported from some external source. Will we need to do anything to keep it synced/up-to-date? I presume (alongside `notebooks/renv.lock`) it should continue to "just work" until we need it to do something it can't already do?
I'm not 100% sure this is the right place to keep these notebooks to be honest, but I feel like the current set of notes about adding indicators is way too scattered. None of this folder is really intended to be automated.

`renv/` and `renv.lock` are maintained via a CLI in R, and allow for pinning versions of dependencies (in this case, for the notebooks).
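For reference, the usual renv workflow for keeping that lockfile current uses renv's own commands (standard renv API, nothing repo-specific):

```r
# install.packages("renv")  # one-time, if renv isn't installed yet

renv::init()      # set up renv/ and renv.lock for the project
renv::snapshot()  # record currently installed package versions into renv.lock
renv::restore()   # reinstall the pinned versions on a fresh checkout
```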
Description
Add 8 signals from source nssp