Unable to treat variable as continuous measure #109

sgummidipundi · 2020-12-27T04:42:36Z

Hello! Would just like to say fantastic package and great syntax for the function.

I seem to be having an issue with creating a table with continuous values. I'm sure I am probably doing something incorrectly on my end since it is basic functionality. When I try to do an easy example with a single continuous variable I get an output like below:

It is odd because clearly it is reading it as non-normal as I have specified (as indicated by the 'median [Q1, Q3]'), but it seems to only give counts and frequencies, essentially treating it as categorical. I have also verified that the variable is of type float64. Are there any suggestions on how I can proceed and have it treated as a continuous measure?

Thanks in advance

tompollard · 2020-12-30T18:41:44Z

Hi @sgummidipundi, you've raised a good point, which is that there is no "continuous" argument. At the moment, tableone expects you to define the categorical variables using the "categorical" argument. Anything else is then treated as continuous. I can see how this is confusing, especially when (as in your case) there are no categorical variables.

If you don't specify which variables are categorical, then tableone attempts to guess (and, from your example, clearly doesn't do a great job!). In your example, you would need to provide an empty categorical argument, i.e. `categorical=[]`. I've tried to recreate the example below:

1. Generate sample data

# import packages
import pandas as pd
import tableone

# create sample dataframe
x = ([0.0] * 41639 + 
     [0.2] * 3 +
     [0.25] * 1 +
     [1] * 3 +
     [10] * 806 +
     [100] * 816 +
     [1000] * 1488 +
     [10000] * 57 +
     [100000] * 3 +
     [11000] * 2 +
     [117000] * 7 +
     [12] * 1 +
     [1200] * 267 +
     [12000] * 51)

data = pd.DataFrame(x, columns=["x"])

2. Create summary table, allowing tableone to guess the data type

Based on the large number of observations and the limited number of unique values, tableone (incorrectly!) guesses that x is categorical.

t1 = tableone.tableone(data)
print(t1.tabulate(tablefmt = "github"))

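|          |          | Missing | Overall      |
|----------|----------|---------|--------------|
| n        |          |         | 45144        |
| x, n (%) | 0.0      | 0       | 41639 (92.2) |
|          | 0.2      |         | 3 (0.0)      |
|          | 0.25     |         | 1 (0.0)      |
|          | 1.0      |         | 3 (0.0)      |
|          | 10.0     |         | 806 (1.8)    |
|          | 100.0    |         | 816 (1.8)    |
|          | 1000.0   |         | 1488 (3.3)   |
|          | 10000.0  |         | 57 (0.1)     |
|          | 100000.0 |         | 3 (0.0)      |
|          | 11000.0  |         | 2 (0.0)      |
|          | 117000.0 |         | 7 (0.0)      |
|          | 12.0     |         | 1 (0.0)      |
|          | 1200.0   |         | 267 (0.6)    |
|          | 12000.0  |         | 51 (0.1)     |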
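
The guess in step 2 can be pictured with a small stand-alone heuristic. The sketch below is an illustration only: `looks_categorical` and its `max_unique_ratio` threshold are hypothetical names and values made up for this example, not tableone's actual rule. It just shows why a numeric column with many rows and very few distinct values is easy to mistake for a categorical one.

```python
# Hypothetical heuristic, NOT tableone's implementation: flag a column as
# "probably categorical" when distinct values are a tiny share of all rows.

def looks_categorical(values, max_unique_ratio=0.005):
    """Return True if unique/total falls below the (assumed) threshold."""
    values = list(values)
    if not values:
        return False
    return len(set(values)) / len(values) < max_unique_ratio

# Same value counts as the sample data in step 1
counts = {0.0: 41639, 0.2: 3, 0.25: 1, 1: 3, 10: 806, 100: 816, 1000: 1488,
          10000: 57, 100000: 3, 11000: 2, 117000: 7, 12: 1, 1200: 267,
          12000: 51}
x = [value for value, n in counts.items() for _ in range(n)]

print(len(set(x)), len(x))            # 14 distinct values in 45144 rows
print(looks_categorical(x))           # True: ratio is about 0.0003
print(looks_categorical(range(500)))  # False: every value is distinct
```

Whatever rule a library actually uses, data shaped like this sits at the extreme end of any unique-to-total ratio, so declaring the categorical columns explicitly is the more predictable route.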

3. Create summary table with the `categorical` argument

t2 = tableone.tableone(data, categorical=[])
print(t2.tabulate(tablefmt = "github"))

|              |    | Missing | Overall       |
|--------------|----|---------|---------------|
| n            |    |         | 45144         |
| x, mean (SD) |    | 0       | 93.5 (1764.8) |
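
As a cross-check that does not depend on tableone, the continuous summary in step 3 can be reproduced with Python's standard `statistics` module. This is a sketch under the assumption that the SD is the sample standard deviation; the variable names are illustrative. It also prints the median and quartiles, the quantities behind a 'median [Q1, Q3]' style summary, which for data dominated by zeros are all 0.

```python
import statistics

# Same value counts as the sample data in step 1
counts = {0.0: 41639, 0.2: 3, 0.25: 1, 1: 3, 10: 806, 100: 816, 1000: 1488,
          10000: 57, 100000: 3, 11000: 2, 117000: 7, 12: 1, 1200: 267,
          12000: 51}
x = [float(value) for value, n in counts.items() for _ in range(n)]

mean = statistics.fmean(x)
sd = statistics.stdev(x)  # sample standard deviation (assumed convention)

# quantiles(..., n=4) returns the three cut points Q1, median, Q3
q1, median, q3 = statistics.quantiles(x, n=4)

print(f"n: {len(x)}")
print(f"mean (SD): {mean:.1f} ({sd:.1f})")
print(f"median [Q1, Q3]: {median} [{q1}, {q3}]")
```

The mean and SD should come out close to the values in the table above. Because more than 90% of the observations are zero, the median and both quartiles are zero as well, so the mean (SD) and median [Q1, Q3] summaries paint very different pictures of this variable.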
