[WIP] Upstream encoder/decoder support based on multiple blocktables #161

Draft
wants to merge 233 commits into base: main

Commits on Feb 22, 2024

  1. d7f3964
  2. 5574081
  3. 344020c
  4. 95529e3
  5. 93dc5a2
  6. fd5dcc5
  7. c530e2c
  8. 6f32cdd
  9. 4caf704
  10. 57f0449

Commits on Feb 23, 2024

  1. f7c1234

Commits on Feb 25, 2024

  1. ef978fe

Commits on Feb 26, 2024

  1. 70f3e8e
  2. Optimize Triton MoE Kernel (vllm-project#2979)

    Co-authored-by: Cade Daniel <edacih@gmail.com>
    pcmoritz and cadedaniel committed Feb 26, 2024
    cfc15a1
  3. d6e4a13

Commits on Feb 27, 2024

  1. d9f726c
  2. c1c0d00
  3. 4dd6416
  4. Support Orion model (vllm-project#2539)

    Co-authored-by: zhangdacheng <zhangdacheng@ainirobot.com>
    Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
    3 people committed Feb 27, 2024
    48a8f4a
  5. 2410e32
  6. 4bd18ec
  7. e0ade06
  8. 8b430d7
  9. Enable GQA support in the prefix prefill kernels (vllm-project#3007)

    Signed-off-by: Tao He <sighingnow@gmail.com>
    sighingnow committed Feb 27, 2024
    71bcaf9

Commits on Feb 28, 2024

  1. a868310
  2. e46fa5d
  3. 3b7178c
  4. 929b4f2

Commits on Feb 29, 2024

  1. t5-small

    Jin Shang authored and js8544 committed Feb 29, 2024
    dd82ba3
  2. fix

    js8544 committed Feb 29, 2024
    f2fd579
  3. 01a5d18
  4. a6d471c
  5. 9289e57
  6. bfdcfa6
  7. lint

    js8544 committed Feb 29, 2024
    2fb6905
  8. 2c08ff2
  9. 29a8d6a
  10. Add guided decoding for OpenAI API server (vllm-project#2819)

    Co-authored-by: br3no <breno@veltefaria.de>
    Co-authored-by: simon-mo <simon.mo@hey.com>
    3 people committed Feb 29, 2024
    703e42e

Commits on Mar 1, 2024

  1. 54d3544
  2. 27ca23d
  3. docs: Add tutorial on deploying vLLM model with KServe (vllm-project#2586)

    Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
    terrytangyuan committed Mar 1, 2024
    49d849b
  4. fix relative import path of protocol.py (vllm-project#3134)

    Co-authored-by: huohuarong <huohuarong@zuoshouyisheng.com>
    Huarong and Huarong committed Mar 1, 2024
    90fbf12
  5. be58c3b
  6. Integrate Marlin Kernels for Int4 GPTQ inference (vllm-project#2497)

    Co-authored-by: Robert Shaw <114415538+rib-2@users.noreply.github.com>
    Co-authored-by: alexm <alexm@neuralmagic.com>
    3 people committed Mar 1, 2024
    c0c2335
  7. 82091b8
  8. 70837fd
  9. 42a6e2b
  10. allow user chose log level by --log-level instead of fixed 'info'. (vllm-project#3109)

    Co-authored-by: zixiao <shunli.dsl@alibaba-inc.com>
    Co-authored-by: Simon Mo <simon.mo@hey.com>
    3 people committed Mar 1, 2024
    29e70e3

Commits on Mar 2, 2024

  1. e3fd30d
  2. Merge pull request #1 from afeldman-nm/enc_dec_t5

    T5 enc/dec example file; linting/formatting
    js8544 committed Mar 2, 2024
    db726e6
  3. 43e920e
  4. 431f014
  5. 37fcf99
  6. baee28c
  7. Merge pull request #2 from afeldman-nm/enc_dec_t5

    Small PR for debug print statements
    js8544 committed Mar 2, 2024
    4bf056b
  8. Add Automatic Prefix Caching (vllm-project#2762)

    Co-authored-by: ElizaWszola <eliza@neuralmagic.com>
    Co-authored-by: Michael Goin <michael@neuralmagic.com>
    3 people committed Mar 2, 2024
    ce4f5a2

Commits on Mar 3, 2024

  1. d65fac2
  2. 996d095

Commits on Mar 4, 2024

  1. Make it easy to profile workers with nsight (vllm-project#3162)

    Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
    pcmoritz and ywang96 committed Mar 4, 2024
    17c3103
  2. d0fae88
  3. 901cf4c
  4. 27a7b07
  5. enable --gpu-memory-utilization in benchmark_throughput.py (vllm-project#3175)

    Co-authored-by: zixiao <shunli.dsl@alibaba-inc.com>
    AllenDou and zixiao committed Mar 4, 2024
    9cbc7e5
  6. [Minor fix] The domain dns.google may cause a socket.gaierror exception (vllm-project#3176)

    Co-authored-by: guofangze <guofangze@kuaishou.com>
    ttbachyinsda and guofangze committed Mar 4, 2024
    76e8a70
  7. Push logprob generation to LLMEngine (vllm-project#3065)

    Co-authored-by: Avnish Narayan <avnish@anyscale.com>
    Yard1 and avnishn committed Mar 4, 2024
    22de452
  8. Add health check, make async Engine more robust (vllm-project#3015)

    Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
    Yard1 and zhuohan123 committed Mar 4, 2024
    ff578ca
  9. Fix the openai benchmarking requests to work with latest OpenAI apis (vllm-project#2992)

    Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
    wangchen615 and ywang96 committed Mar 4, 2024
    9a4548b

Commits on Mar 5, 2024

  1. [ROCm] enable cupy in order to enable cudagraph mode for AMD GPUs (vllm-project#3123)

    Co-authored-by: lcskrishna <lollachaitanya@gmail.com>
    hongxiayang and lcskrishna committed Mar 5, 2024
    05af6da
  2. 8a5060f
  3. 29d6f44
  4. a4950ba
  5. small cleanup

    afeldman-nm committed Mar 5, 2024
    9c03760
  6. 8999ec3

Commits on Mar 6, 2024

  1. [Fix] Avoid pickling entire LLMEngine for Ray workers (vllm-project#3207)

    Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
    njhill and Yard1 committed Mar 6, 2024
    2efce05
  2. Merge pull request #3 from afeldman-nm/enc_dec_t5

    fix _make_tensor_with_pad args change which broke decoder scenarios
    js8544 committed Mar 6, 2024
    9f20ccf
  3. 24aecf4
  4. a33ce60
  5. 4cb3b92

Commits on Mar 7, 2024

  1. d3c04b6
  2. Update requirements-dev.txt to include package for benchmarking scripts. (vllm-project#3181)

    Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
    wangchen615 and zhuohan123 committed Mar 7, 2024
    cbf4c05
  3. 2daf23a
  4. 385da2d
  5. arg naming fix

    afeldman-nm committed Mar 7, 2024
    6d6dccd
  6. 8cbba46

Commits on Mar 8, 2024

  1. b35cc93
  2. d2339d6
  3. c59e120
  4. 1ece1ae
  5. 99c3cfb
  6. 1cb0cc2
  7. c2c5e09

Commits on Mar 9, 2024

  1. f48c679
  2. 8437bae

Commits on Mar 10, 2024

  1. 0bba88d
  2. e4a28e5

Commits on Mar 11, 2024

  1. 9e8744a
  2. 4b59f00
  3. 2f8844b
  4. 657061f
  5. 4c92270
  6. c9415c1
  7. 654865e

Commits on Mar 12, 2024

  1. 7035178
  2. dbec357
  3. docs: Add BentoML deployment doc (vllm-project#3336)

    Signed-off-by: Sherlock113 <sherlockxu07@gmail.com>
    Sherlock113 committed Mar 12, 2024
    b0925b3
  4. llm_engine.py conflict resolution; removed prefix caching code; Sequence constructor call takes is_encoder_decoder, eos_token_id, lora_request calls; set is_encoder_decoder field in constructor

    afeldman-nm committed Mar 12, 2024
    4b2a121
  5. actually updated Sequence constructor to take i_encoder_decoder, eos_token_id, lora_request arguments

    afeldman-nm committed Mar 12, 2024
    a93c17d
  6. xformers.py accept incoming changes; replace paged_attention function with import of PagedAttentionImpl

    afeldman-nm committed Mar 12, 2024
    a62c3af
  7. c31921f
  8. attempt at fixing model_runner conflicts related to encoder/decoder & prefix caching; low confidence of success

    afeldman-nm committed Mar 12, 2024
    0c78be9
  9. e25e6b8
  10. refactoring, including: moved enc_dec_attention.py into vllm/model_executor/layers/attention

    afeldman-nm committed Mar 12, 2024
    7f70d76
  11. 36c8291
  12. 08f268a
  13. augmented paged attention with context_lens, max_context_len, block_tables arguments to override input_metadata values; tests still pass but enc/dec still fails

    afeldman-nm committed Mar 12, 2024
    b9b0600
  14. linting/formatting fixes

    afeldman-nm committed Mar 12, 2024
    63e9dca
  15. 4d7e5a8

Commits on Mar 13, 2024

  1. 49a3c86
  2. 602358f
  3. [Fix] Fix quantization="gptq" when using Marlin (vllm-project#3319)

    Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
    DreamTeamWangbowen and WoosukKwon committed Mar 13, 2024
    b167109
  4. e221910
  5. ba8dc95
  6. 739c350
  7. ae0ccb4
  8. 7e9bd08
  9. Fix lint (vllm-project#3388)

    Yard1 committed Mar 13, 2024
    c33afd8
  10. eeab52a

Commits on Mar 14, 2024

  1. 81653d9
  2. a37415c
  3. [Kernel] change benchmark script so that result can be directly used; tune moe kernel in A100/H100 with tp=2,4,8 (vllm-project#3389)

    youkaichao committed Mar 14, 2024
    8fe8386
  4. 06ec486
  5. Add args for mTLS support (vllm-project#3410)

    Co-authored-by: Daniel Clark <daniel.clark@ibm.com>
    declark1 and Daniel Clark committed Mar 14, 2024
    c17ca8e
  6. dfc7740
  7. Fix assertion failure in Qwen 1.5 with prefix caching enabled (vllm-project#3373)

    Co-authored-by: Cade Daniel <edacih@gmail.com>
    chenxu2048 and cadedaniel committed Mar 14, 2024
    54be8a0
  8. b983ba3

Commits on Mar 15, 2024

  1. 78b6c48
  2. [Misc] add HOST_IP env var (vllm-project#3419)

    Co-authored-by: Simon Mo <simon.mo@hey.com>
    youkaichao and simon-mo committed Mar 15, 2024
    b522c44
  3. 21539e6
  4. 253a980
  5. 429284d
  6. a7c8716
  7. [Fix] Add args for mTLS support (vllm-project#3430)

    Co-authored-by: declark1 <daniel.clark@ibm.com>
    declark1 and declark1 committed Mar 15, 2024
    03d37f2
  8. Fixes the misuse/mixuse of time.time()/time.monotonic() (vllm-project#3220)

    Signed-off-by: Tao He <sighingnow@gmail.com>
    Co-authored-by: simon-mo <simon.mo@hey.com>
    sighingnow and simon-mo committed Mar 15, 2024
    14b8ae0
  9. 604f235
  10. a7af453
  11. 8fa7357
  12. fb96c1e

Commits on Mar 16, 2024

  1. 10585e0
  2. bb7a219
  3. [Misc] PR templates (vllm-project#3413)

    Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
    youkaichao and zhuohan123 committed Mar 16, 2024
    413366e
  4. 0b60121
  5. fixed example

    afeldman-nm committed Mar 16, 2024
    d44257e
  6. 19c5c4b
  7. 3123f15
  8. 14e3f9a
  9. cf6ff18
  10. fix lint

    simon-mo committed Mar 16, 2024
    ad50bf4
  11. 8e67598
  12. 120157f
  13. 6b78837

Commits on Mar 17, 2024

  1. [Misc] Use dataclass for InputMetadata (vllm-project#3452)

    Co-authored-by: youkaichao <youkaichao@126.com>
    WoosukKwon and youkaichao committed Mar 17, 2024
    abfc4f3
  2. 93348d9

Commits on Mar 18, 2024

  1. 9101d83
  2. 8c654c0
  3. 482b0ad
  4. 097aa0e
  5. c0c17d4
  6. 9fdf3de
  7. 49eedea
  8. b30880a

Commits on Mar 19, 2024

  1. b37cdce
  2. 6a9c583
  3. ef65dcf
  4. 7341c77
  5. c614cfe
  6. merged upstream

    afeldman-nm committed Mar 19, 2024
    c2f97b6
  7. 2a60c9b
  8. cc63d03
  9. 63e8b28
  10. 0536ff5
  11. 20478c4

Commits on Mar 20, 2024

  1. [PREFIX CACHING FOLLOW UP] A bunch of fixes to block allocator performance when automatic prefix caching is disabled (vllm-project#3357)

    Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
    ElizaWszola and zhuohan123 committed Mar 20, 2024
    9474e89
  2. 4ad521d
  3. 5ee1449
  4. 84eaa68
  5. ba8ae1d
  6. 80e2548
  7. [1/n] Triton sampling kernel (vllm-project#3186)

    Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
    Yard1 and ywang96 committed Mar 20, 2024
    426ec4e
  8. 6e435de
  9. f1c0fc3

Commits on Mar 21, 2024

  1. 523e30e
  2. [PREFIX CACHING FOLLOW UP] OrderedDict-based evictor (vllm-project#3431)

    Co-authored-by: rsnm2 <rshaw@neuralmagic.com>
    Co-authored-by: Luka <luka@paperspace>
    3 people committed Mar 21, 2024
    6ebd02b
  3. 3bbff9e
  4. 4c07dd2
  5. 8657323
  6. [Misc] Bump up transformers to v4.39.0 & Remove StarCoder2Config (vllm-project#3551)

    Co-authored-by: Roy <jasonailu87@gmail.com>
    Co-authored-by: Roger Meier <r.meier@siemens.com>
    3 people committed Mar 21, 2024
    c188ecb
  7. b7050ca

Commits on Mar 22, 2024

  1. ea5f14e
  2. e90fc21
  3. f721096
  4. merged in upstream-main

    afeldman-nm committed Mar 22, 2024
    7d4972c
  5. 23a5da5
  6. e32fb9c
  7. ae1c368
  8. 691c2c1
  9. e240eb4
  10. t5 Sampler does not pass vocab size to constructor; input_metadata.prompt_lens is treated as a list in T5

    afeldman-nm committed Mar 22, 2024
    cbfba8e
  11. add_request now correctly swaps decoder_prompt, prompt in encoder/decoder mode; removed encoder/decoder argument of Sequence

    afeldman-nm committed Mar 22, 2024
    501551c
  12. 08435e4

Commits on Mar 25, 2024

  1. wip multi blocktable

    afeldman-nm committed Mar 25, 2024
    6e459a2
  2. wip

    afeldman-nm committed Mar 25, 2024
    8e1ca33
  3. e097732
  4. 2a44585

Commits on Mar 26, 2024

  1. 91a4608

Commits on Mar 27, 2024

  1. inefficient but effective & Attention-wrapper-compatible implementation of relative position encoding based on packed-variable-length-sequences

    afeldman-nm committed Mar 27, 2024
    d0c5e36

Commits on Mar 28, 2024

  1. wip cross-attention

    afeldman-nm committed Mar 28, 2024
    3737d5b

Commits on Apr 1, 2024

  1. first pass at enc/dec support that runs e2e but doesn't produce correct T5 inference result. Nothing is broken by this commit, unless there is a subsequent commit with changes in order to pass regression tests.

    afeldman-nm committed Apr 1, 2024
    38946ed
  2. 3c39f55
  3. 4ec2fde
  4. works on bsz = 1

    afeldman-nm committed Apr 1, 2024
    38f55ed
  5. intermediate activations for prompt_run look right! Decoded token looks wrong though. Added not_causal option for attn_bias to kernel interface contracts; also switched to batch size 1 to avoid incorrectness likely caused by packed-variable-sequence-length mask having zeroes rather than -inf's

    afeldman-nm committed Apr 1, 2024
    1aedc80

Commits on Apr 2, 2024

  1. wip

    afeldman-nm committed Apr 2, 2024
    c1258b4

Commits on Apr 3, 2024

  1. passing with t5-small

    afeldman-nm committed Apr 3, 2024
    0af1022

Commits on Apr 4, 2024

  1. vLLM T5 matches nativegit status! fixes: decode-phase cross-input-metadata has correct blocktable, slot_mapping=None, and correct (max) context length(s) (derived from prompt); decode-phase decoder self-attention relative position encoding mask has 1 x K geometry where 1 is the number of new tokens generated in a step and K is context length padded to the nearest multiple of block size, and also mask is reshuffled with contiguous (); ensured general correctness of cross-attention input_metadata; modified T5 example script to prevent HF/vLLM T5 instances from being length limited; net effect: batch-size 1 seems to work but batch-size >1 not supported

    afeldman-nm committed Apr 4, 2024
    9e8d234
  2. f5242a0
  3. de0fd31
  4. WIP google/flan-t5-xxxx

    afeldman-nm committed Apr 4, 2024
    5a67647
  5. removed print statement

    afeldman-nm committed Apr 4, 2024
    ed05d47

Commits on Apr 10, 2024

  1. batched enc/dec example

    afeldman-nm committed Apr 10, 2024
    d5a8b92

Commits on Apr 12, 2024

  1. f555f5d

Commits on Apr 17, 2024

  1. bs >1 prefill works

    afeldman-nm committed Apr 17, 2024
    2c12b44
  2. small change to examples

    afeldman-nm committed Apr 17, 2024
    dba02b2
  3. db201b6