enforce_privacy does not work? #1145

Open
gDanzel opened this issue May 4, 2024 · 1 comment

gDanzel commented May 4, 2024

System Info

OS version: win11
Python version: 3.11
The current version of pandasai being used: 2.0.36

🐛 Describe the bug

Sample data still appears in the prompt even when enforce_privacy is set to True.

The code is below:

import pandasai.pandas as pd
from pandasai import Agent
from pandasai.helpers import get_openai_callback
from pandasai.llm import OpenAI, GoogleGemini

from data.sample_dataframe import dataframe

llm = OpenAI()

agent = Agent([pd.DataFrame(dataframe)], config={"llm": llm, "enforce_privacy": True, "verbose": True})
with get_openai_callback() as cb:
    response = agent.chat("Get the top 3 GDP countries.")
    print(response)
    print(cb)

The printed prompt shows that the dataframe is still sent with data:

2024-05-04 15:08:41 [INFO] Question: Get the top 3 GDP countries.
2024-05-04 15:08:42 [INFO] Running PandasAI with openai LLM...
2024-05-04 15:08:42 [INFO] Prompt ID: 50302077-57f3-482a-a823-64e2be596f5d
2024-05-04 15:08:42 [INFO] Executing Pipeline: GenerateChatPipeline
2024-05-04 15:08:42 [INFO] Executing Step 0: ValidatePipelineInput
2024-05-04 15:08:42 [INFO] Executing Step 1: CacheLookup
2024-05-04 15:08:42 [INFO] Executing Step 2: PromptGeneration
2024-05-04 15:08:46 [INFO] Using prompt: <dataframe>
dfs[0]:10x3
country,gdp,happiness_index
Spain,19294482071552,6.38
Japan,14631844184064,7.23
China,3435817336832,7.22
</dataframe>




Update this initial code:
\```python
\# TODO: import the required dependencies
import pandas as pd

\# Write code here

\# Declare result var: 
type (possible values "string", "number", "dataframe", "plot"). Examples: { "type": "string", "value": f"The highest salary is {highest_salary}." } or { "type": "number", "value": 125 } or { "type": "dataframe", "value": pd.DataFrame({...}) } or { "type": "plot", "value": "temp_chart.png" }

### QUERY
Get the top 3 GDP countries.

Variable dfs: list[pd.DataFrame] is already declared.

At the end, declare "result" variable as a dictionary of type and value.

If you are asked to plot a chart, use "matplotlib" for charts, save as png.

Generate python code and return full updated code:
2024-05-04 15:08:46 [INFO] Executing Step 3: CodeGenerator
2024-05-04 15:08:49 [INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2024-05-04 15:08:49 [INFO] Prompt used:

dfs[0]:10x3
country,gdp,happiness_index
Spain,19294482071552,6.38
Japan,14631844184064,7.23
China,3435817336832,7.22

Update this initial code:

# TODO: import the required dependencies
import pandas as pd

# Write code here

# Declare result var: 
type (possible values "string", "number", "dataframe", "plot"). Examples: { "type": "string", "value": f"The highest salary is {highest_salary}." } or { "type": "number", "value": 125 } or { "type": "dataframe", "value": pd.DataFrame({...}) } or { "type": "plot", "value": "temp_chart.png" }

QUERY

Get the top 3 GDP countries.

Variable dfs: list[pd.DataFrame] is already declared.

At the end, declare "result" variable as a dictionary of type and value.

If you are asked to plot a chart, use "matplotlib" for charts, save as png.

Generate python code and return full updated code:

2024-05-04 15:08:49 [INFO] Code generated:
```
# TODO: import the required dependencies
import pandas as pd

# Write code here

top_3_gdp_countries = dfs[0].nlargest(3, 'gdp')

# Declare result var

result = {
    "type": "dataframe",
    "value": top_3_gdp_countries
}
```

2024-05-04 15:08:49 [INFO] Executing Step 4: CachePopulation
2024-05-04 15:08:49 [INFO] Executing Step 5: CodeCleaning
2024-05-04 15:08:49 [INFO]
Code running:
```
top_3_gdp_countries = dfs[0].nlargest(3, 'gdp')
result = {'type': 'dataframe', 'value': top_3_gdp_countries}
```
2024-05-04 15:08:49 [INFO] Executing Step 6: CodeExecution
2024-05-04 15:08:49 [INFO] Executing Step 7: ResultValidation
2024-05-04 15:08:49 [INFO] Answer: {'type': 'dataframe', 'value':          country             gdp  happiness_index
0  United States  19294482071552             6.94
9          China  14631844184064             5.12
8          Japan   4380756541440             5.87}
2024-05-04 15:08:49 [INFO] Executing Step 8: ResultParsing
         country             gdp  happiness_index
0  United States  19294482071552             6.94
9          China  14631844184064             5.12
8          Japan   4380756541440             5.87
Tokens Used: 340
	Prompt Tokens: 270
	Completion Tokens: 70
Total Cost (USD): $ 0.000240

Process finished with exit code 0
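
To confirm the leak without reading the log by eye, here is a minimal sketch. `find_leaked_values` is a hypothetical helper, not part of pandasai. The prompt text and the three rows are copied from the log above, and the check assumes the prompt serializes the dataframe as comma-separated lines.

```python
def find_leaked_values(prompt: str, records: list) -> dict:
    """Return, per column, the real cell values that appear verbatim in the prompt.

    Hypothetical helper for illustration only; it is not a pandasai API.
    """
    # Split every prompt line on commas so whole CSV cells are compared,
    # not arbitrary substrings.
    cells = {
        cell.strip()
        for line in prompt.splitlines()
        for cell in line.split(",")
    }
    leaked = {}
    for row in records:
        for column, value in row.items():
            text = str(value)
            if text in cells:
                found = leaked.setdefault(column, [])
                if text not in found:
                    found.append(text)
    return leaked


# Dataframe section of the prompt, copied from the log.
PROMPT = """dfs[0]:10x3
country,gdp,happiness_index
Spain,19294482071552,6.38
Japan,14631844184064,7.23
China,3435817336832,7.22
"""

# Real rows, copied from the printed answer in the log.
RECORDS = [
    {"country": "United States", "gdp": 19294482071552, "happiness_index": 6.94},
    {"country": "China", "gdp": 14631844184064, "happiness_index": 5.12},
    {"country": "Japan", "gdp": 4380756541440, "happiness_index": 5.87},
]

leaks = find_leaked_values(PROMPT, RECORDS)
print(leaks)
```

On the log above this flags the country names China and Japan and two of the GDP figures. The sampled rows do not appear to match real rows one-to-one, but real cell values still reach the prompt.
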

@Hrishikesh-Dutta0078

I'm facing the same issue. enforce_privacy worked up to v2.0.28.
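
The two reports together suggest a regression somewhere after v2.0.28 and no later than 2.0.36, so a small version check can tell whether an environment is newer than the last release reported to work. This is only a sketch: `LAST_REPORTED_GOOD` and the helper names are made up for illustration, and the exact release that introduced the change is not known from this thread.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

# Last version reported in this thread as honoring enforce_privacy (assumption
# taken from the comment above, not verified against the changelog).
LAST_REPORTED_GOOD = (2, 0, 28)


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the leading numeric release segment, e.g. '2.0.36' -> (2, 0, 36)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_newer_than_last_good(installed: str) -> bool:
    """True if the version is newer than the last one reported to work."""
    return parse_version(installed) > LAST_REPORTED_GOOD


def installed_pandasai_version() -> Optional[str]:
    """Return the installed pandasai version, or None if it is not installed."""
    try:
        return version("pandasai")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pandasai_version()
    if found is None:
        print("pandasai is not installed")
    elif is_newer_than_last_good(found):
        print(f"pandasai {found} is newer than the last version reported to work")
    else:
        print(f"pandasai {found} is at or below the last version reported to work")
```
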
