generated from jackyzha0/quartz
Showing 19 changed files with 202 additions and 18 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
--- | ||
title: Binary or Step function | ||
tags: | ||
- activation | ||
- ai | ||
- neural_networks | ||
--- | ||
[[Activation functions|Activation function]], mostly used in [[neural networks]]. | ||
|
||
$$\begin{split}f(x) = \begin{cases} | ||
0, & \text{if } x < 0, \\ | ||
1, & \text{if } x \ge 0 | ||
\end{cases}\end{split}$$ |
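A minimal Python sketch of the piecewise definition above. The function name `binary_step` is an assumption chosen for illustration.

```python
def binary_step(x: float) -> int:
    """Return 1 when x >= 0 and 0 otherwise, matching the two cases above."""
    return 1 if x >= 0 else 0


print([binary_step(v) for v in (-2.0, 0.0, 3.5)])  # [0, 1, 1]
```

The output is constant on each side of zero, so the gradient is zero wherever it is defined. That is why the step function does not work with gradient-based training.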
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
--- | ||
title: Hyperbolic tangent function (tanh) | ||
tags: | ||
- ai | ||
- activation | ||
- neural_networks | ||
--- | ||
[[Activation functions|Activation function]], mostly used in [[neural networks]]. | ||
|
||
$$f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$ |
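A small Python sketch that evaluates the standard exponential definition of tanh, $(e^{x} - e^{-x}) / (e^{x} + e^{-x})$, and compares it with the standard library's `math.tanh`. The function name `tanh_from_exp` is an assumption for illustration.

```python
import math


def tanh_from_exp(x: float) -> float:
    """Hyperbolic tangent computed directly from exponentials."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


# The direct formula agrees with the library implementation for moderate inputs.
for value in (-2.0, 0.0, 0.5):
    assert abs(tanh_from_exp(value) - math.tanh(value)) < 1e-12
```

The direct formula overflows for large `|x|` because `math.exp` raises `OverflowError`, so `math.tanh` is the safer choice in practice.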
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
--- | ||
title: Hyperparameters | ||
tags: | ||
- ml | ||
- ai | ||
--- | ||
ML algorithms learn their own parameters from data. The settings that regulate how that learning takes place are not learned; they are chosen beforehand and are named "hyper"-parameters. | ||
|
||
Examples: | ||
|
||
- [[Learning rate]] | ||
- [[Epochs]] | ||
- [[Early Stop]] | ||
- [[Generation size]] | ||
- [[Mutation rate]] | ||
- [[Number of clusters]] | ||
- [[Neural network architecture]] | ||
- [[Activation functions|Activation function]] | ||
- [[Weight initialization values]] |
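To make the distinction concrete, here is a minimal sketch of gradient descent on a one-parameter model `y = w * x`. The weight `w` is a learned parameter, while `learning_rate` and `epochs` are hyperparameters fixed before training. The function name and the toy data are assumptions for illustration.

```python
def fit_slope(xs, ys, learning_rate=0.01, epochs=200):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0  # learned parameter, updated by the training loop
    n = len(xs)
    for _ in range(epochs):  # hyperparameter: number of passes over the data
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad  # hyperparameter: step size
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated with a true slope of 2
print(fit_slope(xs, ys))
```

With the defaults the estimate converges towards 2. On this data, a learning rate of 0.2 makes the updates overshoot and diverge, which is why hyperparameters usually need tuning.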
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
--- | ||
title: Linear function | ||
tags: | ||
- ai | ||
- neural_networks | ||
- activation | ||
--- | ||
Very simple [[activation functions|Activation function]], mostly used in [[neural networks]]. | ||
|
||
$$f(x) = x$$ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
--- | ||
title: ReLU | ||
tags: | ||
- activation | ||
- ai | ||
- neural_networks | ||
--- | ||
[[Activation functions|Activation function]], mostly used in [[neural networks]]. | ||
|
||
ReLU stands for "Rectified Linear Unit". | ||
|
||
$$f(x) = \max(0, x)$$ |
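A minimal Python sketch of ReLU for scalar inputs. The function name `relu` is an assumption for illustration.

```python
def relu(x: float) -> float:
    """Pass positive inputs through unchanged and clamp negatives to zero."""
    return max(0.0, x)


print([relu(v) for v in (-3.0, 0.0, 2.5)])  # [0.0, 0.0, 2.5]
```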
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
--- | ||
title: Sigmoid | ||
tags: | ||
- activation | ||
- neural_networks | ||
- ai | ||
--- | ||
[[Activation functions|Activation function]], mostly used in [[neural networks]]. | ||
|
||
$$f(x) = \frac{1}{1 + e^{-x}}$$ |
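A Python sketch of the sigmoid. For negative inputs it uses the algebraically equivalent form $e^{x} / (1 + e^{x})$, so `math.exp` is never called with a large positive argument. The function name `sigmoid` is an assumption for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow in math.exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so z lies in [0, 1) and cannot overflow
    return z / (1.0 + z)


print(sigmoid(0.0))  # 0.5
```

The output always lies between 0 and 1, and `sigmoid(x) + sigmoid(-x)` equals 1 up to floating-point rounding.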
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
--- | ||
title: Softmax | ||
tags: | ||
- activation | ||
- ai | ||
- neural_networks | ||
--- | ||
[[Activation functions|Activation function]], mostly used in [[neural networks]]. | ||
|
||
$$f(X)_i = \frac{e^{x_i}}{\sum_{j} e^{x_j}}$$ |
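A Python sketch of softmax over a list of scores. It subtracts the maximum score before exponentiating, a common trick that leaves the result unchanged but avoids overflow. The function name `softmax` is an assumption for illustration.

```python
import math
from typing import List, Sequence


def softmax(xs: Sequence[float]) -> List[float]:
    """Turn a non-empty sequence of scores into probabilities that sum to 1."""
    shift = max(xs)  # subtracting the max does not change the ratios
    exps = [math.exp(x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
print(probs)
```

Each output depends on every input, which distinguishes softmax from the element-wise activation functions: the normalising sum runs over the whole vector.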
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
--- | ||
title: vLLM | ||
tags: | ||
- ai | ||
- llm | ||
--- | ||
vLLM uses **PagedAttention**, an attention algorithm that manages attention keys and values efficiently. According to the vLLM authors, vLLM with PagedAttention delivers up to 24x higher throughput than HuggingFace Transformers without requiring any model architecture changes. | ||
|
||
In the autoregressive decoding process, all the input tokens to the LLM produce their attention key and value tensors, and these tensors are kept in GPU memory to generate the next tokens. These cached key and value tensors are often referred to as the KV cache. The KV cache is: | ||
|
||
- _Large:_ Takes up to 1.7GB for a single sequence in LLaMA-13B. | ||
- _Dynamic:_ Its size depends on the sequence length, which is highly variable and unpredictable. | ||
|
||
PagedAttention partitions the KV cache of each sequence into blocks, each block containing the keys and values for a fixed number of tokens. During the attention computation, the PagedAttention kernel identifies and fetches these blocks efficiently. | ||
|
||
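The block partitioning can be pictured with a toy block table. This is only a sketch of the bookkeeping idea, not vLLM's implementation or API: the names `BlockAllocator`, `PagedSequence`, and `BLOCK_SIZE`, and the block size of 4 tokens, are assumptions for illustration.

```python
BLOCK_SIZE = 4  # tokens per block; assumed value for illustration


class BlockAllocator:
    """Hands out physical block ids from a fixed pool."""

    def __init__(self, num_blocks: int) -> None:
        self.free = list(range(num_blocks))

    def allocate(self) -> int:
        return self.free.pop()


class PagedSequence:
    """Tracks which physical block holds each logical block of one sequence."""

    def __init__(self, allocator: BlockAllocator) -> None:
        self.allocator = allocator
        self.block_table = []  # logical block index -> physical block id
        self.num_tokens = 0

    def append_token(self) -> None:
        # A new block is claimed only when the previous one is full.
        if self.num_tokens % BLOCK_SIZE == 0:
            self.block_table.append(self.allocator.allocate())
        self.num_tokens += 1

    def locate(self, token_index: int):
        """Return (physical block id, offset inside the block) for a token."""
        return self.block_table[token_index // BLOCK_SIZE], token_index % BLOCK_SIZE


allocator = BlockAllocator(num_blocks=8)
seq = PagedSequence(allocator)
for _ in range(9):
    seq.append_token()
print(seq.block_table, seq.locate(5))
```

Because blocks are claimed on demand, a sequence of 9 tokens occupies 3 blocks rather than a buffer sized for the longest possible sequence, and the blocks do not need to be contiguous.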
PagedAttention has another key advantage: efficient memory sharing. For example, in _parallel sampling_, multiple output sequences are generated from the same prompt, so the KV cache blocks for the prompt can be shared between those sequences. | ||
|
||
PagedAttention’s memory sharing greatly reduces the memory overhead of complex sampling algorithms, such as parallel sampling and beam search, cutting their memory usage by up to 55%. | ||
|
||
## Sources | ||
|
||
- [vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention](https://blog.vllm.ai/2023/06/20/vllm.html) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
--- | ||
title: API Gateway | ||
tags: | ||
- architecture | ||
--- | ||
An API gateway acts as a single entry point for client requests. The API gateway is responsible for request routing, composition, and protocol translation. It also provides additional features like authentication, authorization, caching, and rate limiting. | ||
|
||
The API Gateway: | ||
|
||
- parses and validates the attributes in the HTTP request. | ||
- checks allow/deny lists. | ||
- authenticates and authorizes through an identity provider. | ||
- applies rate-limiting rules. | ||
- routes the request to the relevant backend service by path matching. | ||
- transforms the request into the appropriate protocol and forwards it to backend microservices. | ||
- handles any errors that may arise during request processing for graceful degradation of service. | ||
- implements resiliency patterns like [[circuit brakes|circuit breakers]] to detect failures and prevent overloading interconnected services, avoiding cascading failures. | ||
- utilizes observability tools for logging, monitoring, tracing, and debugging. | ||
- can optionally cache responses to common requests to improve responsiveness. | ||
|
||
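Several of the steps listed above can be sketched as a single request pipeline. This is a toy, in-process illustration rather than a real gateway: the names `Request` and `Gateway`, the API-key check standing in for an identity provider, and the sliding-window rate limit are all assumptions.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Request:
    path: str
    client_id: str
    api_key: str = ""


@dataclass
class Gateway:
    routes: dict  # path prefix -> backend callable (hypothetical backends)
    deny_list: set = field(default_factory=set)
    valid_keys: set = field(default_factory=set)
    max_requests: int = 5  # allowed requests per client per window
    window_seconds: float = 60.0
    _hits: dict = field(default_factory=dict)

    def handle(self, req: Request, now: float = None):
        now = time.monotonic() if now is None else now
        if not req.path.startswith("/"):  # validate the request
            return 400, "bad request"
        if req.client_id in self.deny_list:  # check allow/deny lists
            return 403, "forbidden"
        if req.api_key not in self.valid_keys:  # authenticate
            return 401, "unauthorized"
        recent = [t for t in self._hits.get(req.client_id, [])
                  if now - t < self.window_seconds]
        if len(recent) >= self.max_requests:  # apply rate limiting
            return 429, "too many requests"
        self._hits[req.client_id] = recent + [now]
        for prefix, backend in self.routes.items():  # route by path matching
            if req.path.startswith(prefix):
                try:
                    return 200, backend(req)
                except Exception:  # degrade gracefully on backend errors
                    return 502, "bad gateway"
        return 404, "not found"


gateway = Gateway(routes={"/users": lambda r: "users"},
                  deny_list={"bad"}, valid_keys={"k"}, max_requests=2)
print(gateway.handle(Request("/users/1", "client-a", "k"), now=0.0))
```

The order of the checks matters: cheap rejections (validation, deny list) run before the more expensive ones, and only authenticated requests count against the rate limit.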
The API gateway is different from a load balancer. While both handle network traffic, the API gateway operates at the application layer, mainly handling HTTP requests; the load balancer mostly operates at the transport layer.[^bbg] | ||
|
||
[^bbg]: [ByteByteGo: 6 More Microservices Interview Questions](https://blog.bytebytego.com/p/6-more-microservices-interview-questions) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
--- | ||
title: API Key Authentication | ||
tags: | ||
- security | ||
- http | ||
--- | ||
Assigns unique keys to users or applications, sent in headers or parameters; while simple, it might lack the security features of token-based or OAuth methods.[^bbg91] | ||
|
||
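A minimal server-side sketch of an API key check. The header name `X-API-Key` is a common convention rather than a standard, and the key store `API_KEYS` and its contents are hypothetical.

```python
import hmac

# Hypothetical key store: API key -> owning application.
API_KEYS = {"demo-key-123": "reporting-service"}


def authenticate(headers: dict):
    """Return the owner of the presented key, or None if it is unknown."""
    presented = headers.get("X-API-Key", "")
    for known_key, owner in API_KEYS.items():
        # compare_digest avoids leaking information through timing differences.
        if hmac.compare_digest(presented.encode(), known_key.encode()):
            return owner
    return None


print(authenticate({"X-API-Key": "demo-key-123"}))  # reporting-service
```

The key identifies the calling application but carries no expiry or scope of its own, which is part of why it is considered weaker than token-based or OAuth methods.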
[^bbg91]: [ByteByteGo EP91: REST API Authentication Methods](https://blog.bytebytego.com/p/ep91-rest-api-authentication-methods) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
--- | ||
title: Basic Authentication | ||
tags: | ||
- http | ||
- security | ||
--- | ||
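Involves sending a username and password with each request. The credentials are only Base64-encoded, not encrypted, so the method is insecure unless the connection itself is encrypted (for example, with HTTPS).[^bbg91] | ||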
|
||
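A short sketch of how the `Authorization` header for Basic authentication is built and read back. The function names are assumptions, and the credentials are the example pair used in RFC 7617.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header value for HTTP Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header: str):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = header.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authorization header")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


header = basic_auth_header("Aladdin", "open sesame")
print(header)  # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Decoding takes one call and no secret, which shows that Base64 is an encoding and not a form of encryption.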
[^bbg91]: [ByteByteGo EP91: REST API Authentication Methods](https://blog.bytebytego.com/p/ep91-rest-api-authentication-methods) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
--- | ||
title: OAuth Authentication | ||
tags: | ||
- security | ||
- http | ||
--- | ||
Enables limited third-party access to user resources without revealing the user's credentials, by issuing access tokens after the user authenticates.[^bbg91] | ||
|
||
[^bbg91]: [ByteByteGo EP91: REST API Authentication Methods](https://blog.bytebytego.com/p/ep91-rest-api-authentication-methods) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
--- | ||
title: REST Authentication methods | ||
tags: | ||
- http | ||
- security | ||
--- | ||
- [[Basic Authentication]] | ||
- [[Token Authentication]] | ||
- [[OAuth Authentication]] | ||
- [[API Key Authentication]] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
--- | ||
title: Token Authentication | ||
tags: | ||
- security | ||
- http | ||
--- | ||
Uses generated tokens, like JSON Web Tokens (JWT), exchanged between client and server, offering enhanced security without sending login credentials with each request.[^bbg91] | ||
|
||
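A standard-library sketch of a signed token in the JWT shape (`header.payload.signature`, signed with HMAC-SHA256). It is an illustration under stated assumptions, not a complete JWT implementation: the secret and function names are hypothetical, and claims such as expiry are not validated. A maintained JWT library is the better choice in practice.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical signing key, known only to the server


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(header: str, payload: str) -> str:
    message = f"{header}.{payload}".encode("ascii")
    return _b64url(hmac.new(SECRET, message, hashlib.sha256).digest())


def issue_token(claims: dict) -> str:
    """Issue a signed token after the user has logged in once."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{_sign(header, payload)}"


def verify_token(token: str):
    """Return the claims if the signature is valid, otherwise None."""
    parts = token.split(".")
    if len(parts) != 3 or not token.isascii():
        return None
    header, payload, signature = parts
    if not hmac.compare_digest(_sign(header, payload), signature):
        return None
    return json.loads(_b64url_decode(payload))


token = issue_token({"sub": "alice"})
print(verify_token(token))  # {'sub': 'alice'}
```

Subsequent requests carry only the token, so the password is sent once at login rather than with every request.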
[^bbg91]: [ByteByteGo EP91: REST API Authentication Methods](https://blog.bytebytego.com/p/ep91-rest-api-authentication-methods) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
--- | ||
title: Redis AOF (Append Only File) | ||
tags: | ||
- redis | ||
- databases | ||
--- | ||
AOF persistence logs every write operation received by the server. These operations can then be replayed at server startup, reconstructing the original dataset. Commands are logged using the same format as the Redis protocol itself.[^redis] | ||
|
||
Unlike a write-ahead log, the Redis AOF log is a write-after log. Redis first executes a command to modify the data in memory and then writes that command to the log file.[^bbg91] | ||
|
||
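The write-after ordering and the replay at startup can be sketched with a toy key-value store. This is not how Redis is implemented: the class `TinyStore` is hypothetical, only a `SET` command is supported, and the log uses plain text lines instead of the Redis protocol format.

```python
import os
import tempfile


class TinyStore:
    """In-memory store that appends each write to a log after applying it."""

    def __init__(self, log_path: str) -> None:
        self.log_path = log_path
        self.data = {}

    def set(self, key: str, value: str) -> None:
        self.data[key] = value  # 1. modify the data in memory first
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(f"SET {key} {value}\n")  # 2. then append to the log

    def replay(self) -> None:
        """Rebuild the dataset by re-running every logged command in order."""
        self.data.clear()
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                command, key, value = line.rstrip("\n").split(" ", 2)
                if command == "SET":
                    self.data[key] = value


log_path = os.path.join(tempfile.mkdtemp(), "appendonly.log")
store = TinyStore(log_path)
store.set("greeting", "hello world")
store.set("counter", "1")

restarted = TinyStore(log_path)  # simulates a fresh process with empty memory
restarted.replay()
print(restarted.data)
```

One consequence of the write-after order is that a crash between steps 1 and 2 loses the most recent command, because it was applied in memory but never reached the log.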
[^bbg91]: [ByteByteGo EP91: REST API Authentication Methods](https://blog.bytebytego.com/p/ep91-rest-api-authentication-methods) | ||
[^redis]: [Redis persistence](https://redis.io/docs/management/persistence/) |