{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":547806116,"defaultBranch":"main","name":"text-generation-inference","ownerLogin":"huggingface","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2022-10-08T10:26:28.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/25720743?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1716818597.0","currentOid":""},"activityList":{"items":[{"before":"a03cc02a7319869e38d4364118b9cbaf181820c1","after":"7b368b76445e5f0a287f2d69b0a5ddee59d0b3ca","ref":"refs/heads/upgrade_axum","pushedAt":"2024-05-27T16:35:57.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Fixing doc.","shortMessageHtmlLink":"Fixing doc."}},{"before":"26d3519ff2de0a8368689dcd76b2f4302f976bb5","after":"a03cc02a7319869e38d4364118b9cbaf181820c1","ref":"refs/heads/upgrade_axum","pushedAt":"2024-05-27T16:14:24.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Disabled ngrok for good.","shortMessageHtmlLink":"Disabled ngrok for good."}},{"before":"4c6c39e4918cfedea32c6d0cd5895d1b2de4b78a","after":"26d3519ff2de0a8368689dcd76b2f4302f976bb5","ref":"refs/heads/upgrade_axum","pushedAt":"2024-05-27T16:09:28.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Upgrading axum=0.7","shortMessageHtmlLink":"Upgrading axum=0.7"}},{"before":"0732b9d2f0fb9a4dd9753bdabe3ddb7d452c49cf","after":"b7ffa287f228e065c45a99684e73b862a5166fac","ref":"refs/heads/main","pushedAt":"2024-05-27T15:31:07.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"drbh","name":"drbh","path":"/drbh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9896130?s=80&v=4"},"commit":{"message":"fix small typo and broken link (#1958)\n\n# What does this PR do?\r\n\r\nFix a typo; fix a broken link; add one sentence in the guidance docs to\r\nmake the word \"grammar\" less abstract\r\n\r\n\r\n## Before submitting\r\n- [x] This PR fixes a typo or improves the docs (you can dismiss the\r\nother checks if that's the case).\r\n- [ ] Did you read the [contributor\r\nguideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests),\r\n      Pull Request section?\r\n- [ ] Was this discussed/approved via a Github issue or the\r\n[forum](https://discuss.huggingface.co/)? Please add a link\r\n      to it if that's the case.\r\n- [ ] Did you make sure to update the documentation with your changes?\r\nHere are the\r\n[documentation\r\nguidelines](https://github.com/huggingface/transformers/tree/main/docs),\r\nand\r\n[here are tips on formatting\r\ndocstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).\r\n- [ ] Did you write any new necessary tests?\r\n\r\n\r\n## Who can review?\r\n\r\nAnyone in the community is free to review the PR once the tests have\r\npassed. Feel free to tag\r\nmembers/contributors who may be interested in your PR.\r\n\r\n@drbh","shortMessageHtmlLink":"fix small typo and broken link (<a class=\"issue-link js-issue-link\" data-error-text=\"Failed to load title\" data-id=\"2319345760\" data-permission-text=\"Title is private\" data-url=\"https://github.com/huggingface/text-generation-inference/issues/1958\" data-hovercard-type=\"pull_request\" data-hovercard-url=\"/huggingface/text-generation-inference/pull/1958/hovercard\" href=\"https://github.com/huggingface/text-generation-inference/pull/1958\">#1958</a>)"}},{"before":"dd696891ac5ac6a891f26d19e8d9e013deea3155","after":null,"ref":"refs/heads/processor-config-chat-template","pushedAt":"2024-05-27T14:03:17.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"}},{"before":"a401c83c355d3b66ad158f4798b58bb5c696caac","after":"0732b9d2f0fb9a4dd9753bdabe3ddb7d452c49cf","ref":"refs/heads/main","pushedAt":"2024-05-27T14:03:16.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Processor config chat template (#1954)\n\nThis PR loads the `processor_config` similar to the `tokenizer_config`\r\nand uses the processor_config's chat_template if the tokenizer_config\r\ndoes not include one. These changes enable chat with idefics2","shortMessageHtmlLink":"Processor config chat template (<a class=\"issue-link js-issue-link\" data-error-text=\"Failed to load title\" data-id=\"2318180196\" data-permission-text=\"Title is private\" data-url=\"https://github.com/huggingface/text-generation-inference/issues/1954\" data-hovercard-type=\"pull_request\" data-hovercard-url=\"/huggingface/text-generation-inference/pull/1954/hovercard\" href=\"https://github.com/huggingface/text-generation-inference/pull/1954\">#1954</a>)"}},{"before":"95944b67578c9a6b78a343e742686db183ecff95","after":"25312cb8be7fc28a271cbb6f2b874fee7d95f292","ref":"refs/heads/pip-installable","pushedAt":"2024-05-27T13:00:37.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"drbh","name":"drbh","path":"/drbh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9896130?s=80&v=4"},"commit":{"message":"fix: set cuda arch list prior to vllm build","shortMessageHtmlLink":"fix: set cuda arch list prior to vllm build"}},{"before":"6f30a13afa522d18e0c40d06d5d09fb35c4a8760","after":null,"ref":"refs/heads/bugfix/preserve-quantized-dtype-simpler","pushedAt":"2024-05-27T12:41:29.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"}},{"before":"9231098f3a9b2f0fe7f6652f10f02f4d8f551143","after":"a401c83c355d3b66ad158f4798b58bb5c696caac","ref":"refs/heads/main","pushedAt":"2024-05-27T12:41:28.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Fix GPTQ for models which do not have float16 at the default dtype (simpler) (#1953)\n\n# What does this PR do?\r\n\r\nFix GPTQ for models which do not have float16 at the default dtype\r\n\r\nBefore this change GPTQ models would not work if the model's default\r\ndata type is not `float16`. For example, Gemma GPTQ models would fail\r\nbecause the default dtype of Gemma is `bfloat16`. There are two issues:\r\n\r\nIf the default `dtype` is not `float16`, the quantizer's `float16`\r\nparameters get converted to that dtype. The kernels cannot deal\r\nwith non-`float16` types. The same applies to inputs of quantized ops.\r\n\r\nThis is resolved by setting the dtype of gptq/awq-quantized models to\r\n`float16`.\r\n\r\nSimpler version of #1951.\r\n\r\n**Draft:** just testing...\r\n\r\n## Before submitting\r\n- [ ] This PR fixes a typo or improves the docs (you can dismiss the\r\nother checks if that's the case).\r\n- [x] Did you read the [contributor\r\nguideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests),\r\n      Pull Request section?\r\n- [ ] Was this discussed/approved via a Github issue or the\r\n[forum](https://discuss.huggingface.co/)? Please add a link\r\n      to it if that's the case.\r\n- [ ] Did you make sure to update the documentation with your changes?\r\nHere are the\r\n[documentation\r\nguidelines](https://github.com/huggingface/transformers/tree/main/docs),\r\nand\r\n[here are tips on formatting\r\ndocstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).\r\n- [ ] Did you write any new necessary tests?\r\n\r\n\r\n## Who can review?\r\n\r\nAnyone in the community is free to review the PR once the tests have\r\npassed. Feel free to tag\r\nmembers/contributors who may be interested in your PR.\r\n\r\n<!-- Your PR will be replied to more quickly if you can figure out the\r\nright person to tag with @\r\n\r\n\r\n@OlivierDehaene OR @Narsil\r\n\r\n -->","shortMessageHtmlLink":"Fix GPTQ for models which do not have float16 at the default dtype (s…"}},{"before":"fb83cb85c7dc62cd6bbd51130b1712e8959c9447","after":"b9b5051abc197e5e13e47054591103e62fd06791","ref":"refs/heads/bugfix/preserve-quantized-dtype","pushedAt":"2024-05-27T08:05:18.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"Fix GPTQ for models which do not have float16 at the default dtype\n\nBefore this change GPTQ models would not work if the model's default\ndata type is not `float16`. For example, Gemma GPTQ models would fail\nbecause the default dtype of Gemma is `bfloat16`. There are two issues:\n\n1. If the default `dtype` is not `float16`, the quantizer's `float16`\n   parameters get converted to that dtype. The kernels cannot deal\n   with non-`float16` types.\n\nThis change resolves this issue by excluding quantizer parameters\nfrom data type conversions.\n\n2. Quantized models will typically have `float16` parameters. However,\n   the default dtype was set to model's default. So, if a quantized\n   Gemma uses `float16`, all parameters are converted to `bfloat16` since\n   it is the model's default. This fails in quantized gemm, because it\n   expects `float16` arguments.\n\nThis is resolved by setting the dtype of gptq/awq-quantized models to\n`float16`. (We cannot use `torch_dtype` from the config, because it\noften does not correspond to the dtype of the parameters.)","shortMessageHtmlLink":"Fix GPTQ for models which do not have float16 at the default dtype"}},{"before":"9ce8640cae7403e61ce7abb939bbde8038ff8c8d","after":"6f30a13afa522d18e0c40d06d5d09fb35c4a8760","ref":"refs/heads/bugfix/preserve-quantized-dtype-simpler","pushedAt":"2024-05-27T08:02:50.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"Fix GPTQ for models which do not have float16 at the default dtype\n\nBefore this change GPTQ models would not work if the model's default\ndata type is not `float16`. For example, Gemma GPTQ models would fail\nbecause the default dtype of Gemma is `bfloat16`. There are two issues:\n\nIf the default `dtype` is not `float16`, the quantizer's `float16`\nparameters get converted to that dtype. The kernels cannot deal\nwith non-`float16` types. The same applies to inputs of quantized ops.\n\nThis is resolved by setting the dtype of gptq/awq-quantized models to\n`float16`.","shortMessageHtmlLink":"Fix GPTQ for models which do not have float16 at the default dtype"}},{"before":"364b4497d0438de2ce8b8df45e77300b63c46dfd","after":null,"ref":"refs/heads/bugfix/gemma-prefix","pushedAt":"2024-05-27T07:58:06.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"}},{"before":"d32e33bd489f2419e579f5d423073791ee19f789","after":"9231098f3a9b2f0fe7f6652f10f02f4d8f551143","ref":"refs/heads/main","pushedAt":"2024-05-27T07:58:06.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"Fix (flash) Gemma prefix and enable tests","shortMessageHtmlLink":"Fix (flash) Gemma prefix and enable tests"}},{"before":"52c13cdad4256d9645ec09daa911fd2aea1aa00b","after":"95944b67578c9a6b78a343e742686db183ecff95","ref":"refs/heads/pip-installable","pushedAt":"2024-05-27T04:26:17.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"drbh","name":"drbh","path":"/drbh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9896130?s=80&v=4"},"commit":{"message":"feat: compile vllm for cuda after flash_attn","shortMessageHtmlLink":"feat: compile vllm for cuda after flash_attn"}},{"before":"93409ea03841247bb9a5c413429853c48b8e5caf","after":"dd696891ac5ac6a891f26d19e8d9e013deea3155","ref":"refs/heads/processor-config-chat-template","pushedAt":"2024-05-27T04:00:03.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"drbh","name":"drbh","path":"/drbh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9896130?s=80&v=4"},"commit":{"message":"fix: adjust for idefics2 template","shortMessageHtmlLink":"fix: adjust for idefics2 template"}},{"before":null,"after":"93409ea03841247bb9a5c413429853c48b8e5caf","ref":"refs/heads/processor-config-chat-template","pushedAt":"2024-05-27T03:49:40.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"drbh","name":"drbh","path":"/drbh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9896130?s=80&v=4"},"commit":{"message":"feat: check processor_config chat template if not in tokenizer_config","shortMessageHtmlLink":"feat: check processor_config chat template if not in tokenizer_config"}},{"before":"00e5a89fef4bdcdf6dde42abff9843dc09381df4","after":"52c13cdad4256d9645ec09daa911fd2aea1aa00b","ref":"refs/heads/pip-installable","pushedAt":"2024-05-27T03:21:09.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"drbh","name":"drbh","path":"/drbh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9896130?s=80&v=4"},"commit":{"message":"feat: compile vllm for cuda after flash_attn","shortMessageHtmlLink":"feat: compile vllm for cuda after flash_attn"}},{"before":"33043f8255f8335d7b49b5ca27746aebcf478405","after":null,"ref":"refs/heads/add-medusa-training-docs","pushedAt":"2024-05-27T03:13:58.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"drbh","name":"drbh","path":"/drbh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9896130?s=80&v=4"}},{"before":null,"after":"33043f8255f8335d7b49b5ca27746aebcf478405","ref":"refs/heads/add-medusa-training-docs","pushedAt":"2024-05-27T03:13:13.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"drbh","name":"drbh","path":"/drbh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9896130?s=80&v=4"},"commit":{"message":"fix: improve text and typos","shortMessageHtmlLink":"fix: improve text and typos"}},{"before":"ef8088b4b032f0c8d01b94d0107fe1decb108b6c","after":"9acf4307c94067c8ad60f569980f029afaa37d0f","ref":"refs/heads/kserve-endpoints","pushedAt":"2024-05-27T03:03:55.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"drbh","name":"drbh","path":"/drbh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9896130?s=80&v=4"},"commit":{"message":"fix: cleanup and improve api docs","shortMessageHtmlLink":"fix: cleanup and improve api docs"}},{"before":null,"after":"9ce8640cae7403e61ce7abb939bbde8038ff8c8d","ref":"refs/heads/bugfix/preserve-quantized-dtype-simpler","pushedAt":"2024-05-25T09:00:09.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"Fix GPTQ for models which do not have float16 at the default dtype\n\nBefore this change GPTQ models would not work if the model's default\ndata type is not `float16`. For example, Gemma GPTQ models would fail\nbecause the default dtype of Gemma is `bfloat16`. There are two issues:\n\nIf the default `dtype` is not `float16`, the quantizer's `float16`\nparameters get converted to that dtype. The kernels cannot deal\nwith non-`float16` types. The same applies to inputs of quantized ops.\n\nThis is resolved by setting the dtype of gptq/awq-quantized models to\n`float16`.","shortMessageHtmlLink":"Fix GPTQ for models which do not have float16 at the default dtype"}},{"before":"4e786230e58304569b66c9a892f58031bbd84299","after":"fb83cb85c7dc62cd6bbd51130b1712e8959c9447","ref":"refs/heads/bugfix/preserve-quantized-dtype","pushedAt":"2024-05-25T08:53:27.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"Fix GPTQ for models which do not have float16 at the default dtype\n\nBefore this change GPTQ models would not work if the model's default\ndata type is not `float16`. For example, Gemma GPTQ models would fail\nbecause the default dtype of Gemma is `bfloat16`. There are two issues:\n\n1. If the default `dtype` is not `float16`, the quantizer's `float16`\n   parameters get converted to that dtype. The kernels cannot deal\n   with non-`float16` types.\n\nThis change resolves this issue by excluding quantizer parameters\nfrom data type conversions.\n\n2. Quantized models will typically have `float16` parameters. However,\n   the default dtype was set to model's default. So, if a quantized\n   Gemma uses `float16`, all parameters are converted to `bfloat16` since\n   it is the model's default. This fails in quantized gemm, because it\n   expects `float16` arguments.\n\nThis is resolved by setting the dtype of gptq/awq-quantized models to\n`float16`. (We cannot use `torch_dtype` from the config, because it\noften does not correspond to the dtype of the parameters.)","shortMessageHtmlLink":"Fix GPTQ for models which do not have float16 at the default dtype"}},{"before":"bbd6675c5313205189333254fc1f489cbe5e48be","after":"4e786230e58304569b66c9a892f58031bbd84299","ref":"refs/heads/bugfix/preserve-quantized-dtype","pushedAt":"2024-05-25T08:45:11.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"Fix GPTQ for models which do not have float16 at the default dtype\n\nBefore this change GPTQ models would not work if the model's default\ndata type is not `float16`. For example, Gemma GPTQ models would fail\nbecause the default dtype of Gemma is `bfloat16`. There are two issues:\n\n1. If the default `dtype` is not `float16`, the quantizer's `float16`\n   parameters get converted to that dtype. The kernels cannot deal\n   with non-`float16` types.\n\nThis change resolves this issue by excluding quantizer parameters\nfrom data type conversions.\n\n2. Quantized models will typically have `float16` as their `torch_dtype`.\n   However, this was ignored, the default dtype was set to model's\n   default. So, if a quantized Gemma uses `float16`, all parameters\n   are converted to `bfloat16` because that is the model's default.\n   This fails in quantizer gemm, because it expects `float16` arguments.\n\nThis is resolved by setting the dtype of gptq/awq-quantized models to\n`float16`.","shortMessageHtmlLink":"Fix GPTQ for models which do not have float16 at the default dtype"}},{"before":null,"after":"bbd6675c5313205189333254fc1f489cbe5e48be","ref":"refs/heads/bugfix/preserve-quantized-dtype","pushedAt":"2024-05-24T19:08:59.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"Fix GPTQ for models which do not have float16 at the default dtype\n\nBefore this change GPTQ models would not work if the model's default\ndata type is not `float16`. For example, Gemma GPTQ models would fail\nbecause the default dtype of Gemma is `bfloat16`. There are two issues:\n\n1. If the default `dtype` is not `float16`, the quantizer's `float16`\n   parameters get converted to that dtype. The kernels cannot deal\n   with non-`float16` types.\n\nThis change resolves this issue by excluding quantizer parameters\nfrom data type conversions.\n\n2. Quantized models will typically have `float16` as their `torch_dtype`.\n   However, this was ignored, the default dtype was set to model's\n   default. So, if a quantized Gemma uses `float16`, all parameters\n   are converted to `bfloat16` because that is the model's default.\n   This fails in quantizer gemm, because it expects `float16` arguments.\n\nThis is resolved by setting the dtype to `torch_dtype`.","shortMessageHtmlLink":"Fix GPTQ for models which do not have float16 at the default dtype"}},{"before":"01e4442ef6061fa3fa32cd4852d018116c6b07fe","after":"d654cf989a541ace9101b99b30908dcb1fe74673","ref":"refs/heads/flashdecoding","pushedAt":"2024-05-24T16:10:51.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Speedup flashdecoding.","shortMessageHtmlLink":"Speedup flashdecoding."}},{"before":null,"after":"364b4497d0438de2ce8b8df45e77300b63c46dfd","ref":"refs/heads/bugfix/gemma-prefix","pushedAt":"2024-05-24T15:35:49.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"Fix (flash) Gemma prefix and enable tests","shortMessageHtmlLink":"Fix (flash) Gemma prefix and enable tests"}},{"before":null,"after":"3c74cf9cd41338af4e55755a3dadada99fb8fc51","ref":"refs/heads/flashinfer","pushedAt":"2024-05-24T15:32:36.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Flashinfer test.","shortMessageHtmlLink":"Flashinfer test."}},{"before":"d580aa1f4db0414fd18cfb158194042b3ea26a82","after":"713aaf19c9edf1fa560e99ae4666db1d9a9f6ffb","ref":"refs/heads/feature/exl2","pushedAt":"2024-05-24T14:54:10.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"Avoid exl2-specific layer logic","shortMessageHtmlLink":"Avoid exl2-specific layer logic"}},{"before":"63e72033b784f0efdc128611f9425de329204db5","after":"01e4442ef6061fa3fa32cd4852d018116c6b07fe","ref":"refs/heads/flashdecoding","pushedAt":"2024-05-24T14:18:07.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"REvert changes in modeling.","shortMessageHtmlLink":"REvert changes in modeling."}},{"before":"cacba5f21fa5b34e82b7440379aaea939a2c4f80","after":"63e72033b784f0efdc128611f9425de329204db5","ref":"refs/heads/flashdecoding","pushedAt":"2024-05-24T14:16:18.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Less intrusive.","shortMessageHtmlLink":"Less intrusive."}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEVRUw7QA","startCursor":null,"endCursor":null}},"title":"Activity · huggingface/text-generation-inference"}