FEAS is an automated theorem proving (ATP) agent specifically designed for solving functional equation problems within the Lean theorem prover environment. It builds upon the foundational work of the COPRA project (https://github.com/trishullab/copra) but is tailored exclusively for Lean, focusing on specialized techniques relevant to this domain.
The setup for FEAS follows the same procedure as the COPRA project, tailored for Lean. Below are the relevant setup steps from the COPRA README, adapted for FEAS:
-
Create and Activate a Miniconda Environment:
- Set up a
Miniconda
environment and activate it.
- Set up a
-
Project Setup:
- Navigate to the project's root directory and execute the setup script:
./src/scripts/setup.sh
.
- Navigate to the project's root directory and execute the setup script:
-
Configure Environment Variables for Lean:
- Add the following line to your
.bashrc
file:export PATH="/home/$USER/.elan/bin:$PATH"
- Add the following line to your
-
API Key Configuration:
- Create a file named
.secrets/openai_key.json
in the project root directory containing your OpenAI API key:{ "organization": "<your-organization-id>", "api_key": "<your-api-key>" }
Note: The
setup.sh
script includes some configurations for Coq and Isabelle which are not needed for FEAS. - Create a file named
-
FunEq dataset Lean Project Configuration:
- Run the following commands in
data/benchmarks/FunEq
directory:leanpkg configure leanproject get-mathlib-cache leanproject build
- Run the following commands in
To run FEAS experiments, execute src/main/eval_benchmark.py
with the specified parameters:
-
eval_settings=n_60_dfs_gpt4_128k always
-
Set
prompt_settings
according to the desired agent:lean_few_shot
for few-shotlean_dfs
for COPRA agentlean_dfs_block
for FEAS agentlean_dfs_block_strategy
for FEAS agent with heuristics
-
Set
benchmark
based on the problem category of FunEq:simple_funeq
for simple problemsintermediate_funeq
for intermediate-level problemsimo_a1_funeq
for IMO shortlisted A1 problemsimo_funeq
for other IMO shortlisted problemsfuneq
for running all problems of FunEq