{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":728769934,"defaultBranch":"main","name":"gemini-benchmark","ownerLogin":"neulab","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2023-12-07T16:50:17.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/22324665?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1703301991.0","currentOid":""},"activityList":{"items":[{"before":"d4a98815d9858c160ec1c7a7b819cb176c4d46f4","after":"45d7e65a15ed1691d74fe1d068c8ccab97d25a6e","ref":"refs/heads/main","pushedAt":"2024-01-04T23:34:32.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"aashiqmuhamed","name":"Aashiq Muhamed","path":"/aashiqmuhamed","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17514579?s=80&v=4"},"commit":{"message":"FLORES scoring script seems to have hyps and refs flipped (#44)","shortMessageHtmlLink":"FLORES scoring script seems to have hyps and refs flipped (#44)"}},{"before":"6b3b8f18c3fbaa6df947f00fc49b87802c5e063e","after":"d4a98815d9858c160ec1c7a7b819cb176c4d46f4","ref":"refs/heads/main","pushedAt":"2023-12-26T17:10:49.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"},"commit":{"message":"Update README.md (#43)\n\nlangauge -> language","shortMessageHtmlLink":"Update README.md (#43)"}},{"before":"c206c88ff5e7587e5ccf08fdba1e4f2252006a26","after":"6b3b8f18c3fbaa6df947f00fc49b87802c5e063e","ref":"refs/heads/main","pushedAt":"2023-12-23T14:56:21.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"},"commit":{"message":"Added new outputs for math reasoning tasks (#40)\n\n* Added new outputs for math and text reasoning tasks\r\n\r\n* Added new outputs for text reasoning tasks\r\n\r\n* Added new scripts\r\n\r\n* Added prompt only for math tasks","shortMessageHtmlLink":"Added new outputs for math reasoning tasks (#40)"}},{"before":"e31a41af7637abcaef996f03ea2bc4649211a7ad","after":"3f499f448f18dbcffb5f1c3c21fc6e06be714074","ref":"refs/heads/Math_Text_Reasoning_Results","pushedAt":"2023-12-23T14:47:21.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"snat1505027","name":"Syeda Nahida Akter","path":"/snat1505027","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/18405970?s=80&v=4"},"commit":{"message":"Added prompt only for math tasks","shortMessageHtmlLink":"Added prompt only for math tasks"}},{"before":"381e60bc848e6cb59094f3dd9f5fcddd65abbbfa","after":"c206c88ff5e7587e5ccf08fdba1e4f2252006a26","ref":"refs/heads/main","pushedAt":"2023-12-23T11:23:09.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"},"commit":{"message":"remove safety filter, update mixtral to mixtral-instruct, use built-in ua eval (#42)\n\n* initial commit for webarena\r\n\r\n* proper diretory\r\n\r\n* resolve submodule\r\n\r\n* resolve submodule\r\n\r\n* no na\r\n\r\n* support call with litellm\r\n\r\n* visualize\r\n\r\n* WebArena: experiment and zeno analysis code\r\n\r\n* remove safety filter, update mixtral to mixtral-instruct, use built-in ua eval","shortMessageHtmlLink":"remove safety filter, update mixtral to mixtral-instruct, use built-i…"}},{"before":"56eec3bc084ba4c3d81a2d7433570a74734ba40a","after":"381e60bc848e6cb59094f3dd9f5fcddd65abbbfa","ref":"refs/heads/main","pushedAt":"2023-12-23T11:22:32.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"},"commit":{"message":"Mixtral support and outputs (#41)\n\n* Resolved merge conflict\r\n\r\n* Added zeno visualization script\r\n\r\n* Merge conflict resolved\r\n\r\n* Updated gemini results without filtering and mixtral support\r\n\r\n* Committing outputs for mistral","shortMessageHtmlLink":"Mixtral support and outputs (#41)"}},{"before":null,"after":"ed27b9c5b3a501f8206ba4c675ed8090543b9750","ref":"refs/heads/agent","pushedAt":"2023-12-23T03:26:31.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"oootttyyy","name":"Tianyue Ou","path":"/oootttyyy","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49041600?s=80&v=4"},"commit":{"message":"merge","shortMessageHtmlLink":"merge"}},{"before":null,"after":"d6ff33882cf491a82273c41f357e3cddc2f9f4b7","ref":"refs/heads/translation-dev","pushedAt":"2023-12-23T00:25:42.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"aashiqmuhamed","name":"Aashiq Muhamed","path":"/aashiqmuhamed","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17514579?s=80&v=4"},"commit":{"message":"Committing outputs for mistral","shortMessageHtmlLink":"Committing outputs for mistral"}},{"before":"dce31b2358e20f0014fe78a6516dde929b4dbcf1","after":null,"ref":"refs/heads/translation-dev","pushedAt":"2023-12-23T00:25:28.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"aashiqmuhamed","name":"Aashiq Muhamed","path":"/aashiqmuhamed","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17514579?s=80&v=4"}},{"before":null,"after":"dce31b2358e20f0014fe78a6516dde929b4dbcf1","ref":"refs/heads/translation-dev","pushedAt":"2023-12-23T00:20:13.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"aashiqmuhamed","name":"Aashiq Muhamed","path":"/aashiqmuhamed","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17514579?s=80&v=4"},"commit":{"message":"Updated gemini results without filtering and mixtral support","shortMessageHtmlLink":"Updated gemini results without filtering and mixtral support"}},{"before":"512dece3da9021785aabfccaca13e6fd3d9a367a","after":"e31a41af7637abcaef996f03ea2bc4649211a7ad","ref":"refs/heads/Math_Text_Reasoning_Results","pushedAt":"2023-12-23T00:09:38.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"snat1505027","name":"Syeda Nahida Akter","path":"/snat1505027","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/18405970?s=80&v=4"},"commit":{"message":"Added new scripts","shortMessageHtmlLink":"Added new scripts"}},{"before":"3c529cc8182d7f4574a900b4d4045835bd04e598","after":null,"ref":"refs/heads/aashiq-dev","pushedAt":"2023-12-23T00:06:23.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"aashiqmuhamed","name":"Aashiq Muhamed","path":"/aashiqmuhamed","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17514579?s=80&v=4"}},{"before":null,"after":"3c529cc8182d7f4574a900b4d4045835bd04e598","ref":"refs/heads/aashiq-dev","pushedAt":"2023-12-23T00:04:42.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"aashiqmuhamed","name":"Aashiq Muhamed","path":"/aashiqmuhamed","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17514579?s=80&v=4"},"commit":{"message":"Updated gemini results without filtering and mixtral support","shortMessageHtmlLink":"Updated gemini results without filtering and mixtral support"}},{"before":null,"after":"512dece3da9021785aabfccaca13e6fd3d9a367a","ref":"refs/heads/Math_Text_Reasoning_Results","pushedAt":"2023-12-22T23:45:10.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"snat1505027","name":"Syeda Nahida Akter","path":"/snat1505027","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/18405970?s=80&v=4"},"commit":{"message":"Added new outputs for math and text reasoning tasks","shortMessageHtmlLink":"Added new outputs for math and text reasoning tasks"}},{"before":"fe7a80c9f4423bdca529dfd18691d060f5d61e6e","after":"56eec3bc084ba4c3d81a2d7433570a74734ba40a","ref":"refs/heads/main","pushedAt":"2023-12-22T23:32:46.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"oootttyyy","name":"Tianyue Ou","path":"/oootttyyy","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49041600?s=80&v=4"},"commit":{"message":"Update README.md with new results (#39)","shortMessageHtmlLink":"Update README.md with new results (#39)"}},{"before":null,"after":"b5430c0d39ab0e32220e0096a9150844072f4fd3","ref":"refs/heads/new_results_1222","pushedAt":"2023-12-22T23:29:33.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"},"commit":{"message":"Update README.md with new results","shortMessageHtmlLink":"Update README.md with new results"}},{"before":"030ba5565e3e37549d16f953fd78e247f35c40df","after":"fe7a80c9f4423bdca529dfd18691d060f5d61e6e","ref":"refs/heads/main","pushedAt":"2023-12-22T18:57:51.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"},"commit":{"message":"Updates to code processing (#37)\n\n* Made some updates to code\r\n\r\n* Update to code processing\r\n\r\n* Remove unused imports","shortMessageHtmlLink":"Updates to code processing (#37)"}},{"before":"9a04b5c7a9b1b20352c540120232502076f6b264","after":"10f368e800c466e09a92eaa5cf08b0c72cb2e179","ref":"refs/heads/code_updates","pushedAt":"2023-12-22T18:44:14.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"},"commit":{"message":"Remove unused imports","shortMessageHtmlLink":"Remove unused imports"}},{"before":"6d6055c0b976c7538eab943e8f441778fa3e7e28","after":"9a04b5c7a9b1b20352c540120232502076f6b264","ref":"refs/heads/code_updates","pushedAt":"2023-12-22T18:42:28.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"},"commit":{"message":"Update to code processing","shortMessageHtmlLink":"Update to code processing"}},{"before":null,"after":"6d6055c0b976c7538eab943e8f441778fa3e7e28","ref":"refs/heads/code_updates","pushedAt":"2023-12-22T13:28:57.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"},"commit":{"message":"Made some updates to code","shortMessageHtmlLink":"Made some updates to code"}},{"before":"f124d28687c4cbe3feff51ccc966e8d7d95bf6b2","after":"030ba5565e3e37549d16f953fd78e247f35c40df","ref":"refs/heads/main","pushedAt":"2023-12-21T17:42:26.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"},"commit":{"message":"Wrap up updated MMLU and Code (#36)","shortMessageHtmlLink":"Wrap up updated MMLU and Code (#36)"}},{"before":null,"after":"56c02b3dd37155e2285a0640bafa97801a5230f6","ref":"refs/heads/mmlu-code-update","pushedAt":"2023-12-21T17:27:30.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"yuzc19","name":"Zichun Yu","path":"/yuzc19","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/53684662?s=80&v=4"},"commit":{"message":"Wrap up updated MMLU and Code","shortMessageHtmlLink":"Wrap up updated MMLU and Code"}},{"before":"c91f474063b828d1e09e7d36adaddbe1d693ebd7","after":"f124d28687c4cbe3feff51ccc966e8d7d95bf6b2","ref":"refs/heads/main","pushedAt":"2023-12-20T12:41:17.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"},"commit":{"message":"Fix missing word inREADME.md (#30)","shortMessageHtmlLink":"Fix missing word inREADME.md (#30)"}},{"before":null,"after":"cf08fd91947c461f977b013e128578e7a0665c64","ref":"refs/heads/readme_missing_word","pushedAt":"2023-12-20T12:08:25.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"},"commit":{"message":"Fix missing word inREADME.md","shortMessageHtmlLink":"Fix missing word inREADME.md"}},{"before":"621ac097837bcd5c1e1607b02587279a35adf923","after":"c91f474063b828d1e09e7d36adaddbe1d693ebd7","ref":"refs/heads/main","pushedAt":"2023-12-19T13:38:39.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"snat1505027","name":"Syeda Nahida Akter","path":"/snat1505027","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/18405970?s=80&v=4"},"commit":{"message":"Update README.md with more detail and links (#28)","shortMessageHtmlLink":"Update README.md with more detail and links (#28)"}},{"before":null,"after":"d5778ff85efca5a34556a9c3ba326c57905956db","ref":"refs/heads/readme_upgrade","pushedAt":"2023-12-19T13:18:52.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"},"commit":{"message":"Update README.md with more detail and links","shortMessageHtmlLink":"Update README.md with more detail and links"}},{"before":"3cdddcf68febba63b82c75417166e1c66a071f51","after":"621ac097837bcd5c1e1607b02587279a35adf923","ref":"refs/heads/main","pushedAt":"2023-12-19T13:08:52.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"},"commit":{"message":"Update README.md (#27)\n\nAdded Description","shortMessageHtmlLink":"Update README.md (#27)"}},{"before":null,"after":"3e508b967196b87d712ea21bdfecf9c0ee3d8add","ref":"refs/heads/README-Description","pushedAt":"2023-12-19T12:55:44.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"snat1505027","name":"Syeda Nahida Akter","path":"/snat1505027","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/18405970?s=80&v=4"},"commit":{"message":"Update README.md\n\nAdded Description","shortMessageHtmlLink":"Update README.md"}},{"before":"17f612cc2814bf8ff1bc6002aeb027d14c44475d","after":null,"ref":"refs/heads/ac-viz-math","pushedAt":"2023-12-19T00:54:53.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"}},{"before":"52fc5740f4c37e969ce23ab104e1c5de7d0d3a4c","after":null,"ref":"refs/heads/MMLU","pushedAt":"2023-12-19T00:54:53.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"neubig","name":"Graham Neubig","path":"/neubig","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/398875?s=80&v=4"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAD2B93mwA","startCursor":null,"endCursor":null}},"title":"Activity · neulab/gemini-benchmark"}