Quartz sync: Feb 10, 2024, 10:40 PM
AlphaGit committed Feb 11, 2024
1 parent 39b376e commit 3b4ad7c
Showing 19 changed files with 202 additions and 18 deletions.
16 changes: 6 additions & 10 deletions content/ai/Activation functions.md
@@ -5,7 +5,6 @@ tags:
- activation
- neural networks
---

Activation functions are part of a "neuron" in a neural network. They introduce a non-linearity so that the network can learn more than linear (or polynomial) relationships between the input and output data.

It is called an "activation function" because it decides how much that particular neuron participates in generating the output.
@@ -23,15 +22,12 @@ None of these rules are unbreakable, but good guidelines.

Examples of activation functions:

- ReLU (Rectified Linear Unit): $f(x) = \max(0, x)$
- Binary/Step: $$\begin{split}f(x) = \begin{cases}
0, & \text {if } x < 0, \\
1, & \text{if } x \ge 0
\end{cases}\end{split}$$
- Sigmoid: $f(x) = \frac{1}{1 + e^{-x}}$
- Linear: $f(x) = x$
- Hyperbolic tangent function (tanh): $f(x) = \frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$
- Softmax: $f(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}$
- [[ReLU]]
- [[Binary or Step function]]
- [[Sigmoid]]
- [[Linear function]]
- [[Hyperbolic tangent function (tanh)]]
- [[Softmax]]
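
For quick intuition, a minimal NumPy sketch of a few of these (the helper names are my own, not from any particular library):

```python
import numpy as np

def relu(x):
    # max(0, x), element-wise
    return np.maximum(0, x)

def sigmoid(x):
    # squashes any real number into (0, 1)
    return 1 / (1 + np.exp(-x))

def tanh(x):
    # squashes any real number into (-1, 1)
    return np.tanh(x)

def softmax(x):
    # turns a vector of scores into a probability distribution;
    # subtracting the max keeps the exponentials numerically stable
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([-2.0, 0.0, 3.0])
print(relu(x), sigmoid(x), softmax(x))
```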

## Sources

13 changes: 13 additions & 0 deletions content/ai/Binary or Step function.md
@@ -0,0 +1,13 @@
---
title: Binary or Step function
tags:
- activation
- ai
- neural_networks
---
[[Activation functions|Activation function]], mostly used in [[neural networks]].

$$\begin{split}f(x) = \begin{cases}
0, & \text {if } x < 0, \\
1, & \text{if } x \ge 0
\end{cases}\end{split}$$
8 changes: 4 additions & 4 deletions content/ai/Extreme Learning Machines.md
@@ -6,8 +6,7 @@ tags:
- science
- papers
---

Extreme Learning Machines are single hidden-layer feed-foreward [[neural networks]]. They are one of the [[neural network]] approaches to timeseries forecasting (opposed to [[statistical timeseries forecasting]]).
Extreme Learning Machines are single hidden-layer feed-forward [[neural networks]]. They are one of the [[neural network]] approaches to timeseries forecasting (as opposed to [[statistical timeseries forecasting]]).

Original paper by Huang et al., 2004.

@@ -21,9 +20,10 @@ The training process consists of these steps:

In short, with a single shot we can avoid the multi-step process of iterative training and the backpropagation algorithm that is usually used with feed-forward neural networks.
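
A rough sketch of that single-shot step under the usual formulation: the hidden-layer weights are drawn at random and frozen, and the output weights are solved in closed form with a pseudo-inverse (names and shapes here are illustrative, not from the paper):

```python
import numpy as np

def train_elm(X, y, hidden_size=100, rng=np.random.default_rng(0)):
    # 1. random, fixed weights and biases for the single hidden layer
    W = rng.normal(size=(X.shape[1], hidden_size))
    b = rng.normal(size=hidden_size)
    # 2. hidden activations (any non-linearity works; tanh here)
    H = np.tanh(X @ W + b)
    # 3. output weights solved in one shot via the pseudo-inverse (least squares)
    beta = np.linalg.pinv(H) @ y
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```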

The tuning of the network will mostly be around its hyperparameters:
The tuning of the network will mostly be around its [[Hyperparameters|hyperparameters]]:

- Hidden layer size
- Selection of activation function
- Selection of [[Activation functions|activation function]]
- Selection of input sources
- Selection of the distribution for random values used in the initialization step

10 changes: 10 additions & 0 deletions content/ai/Hyperbolic tangent function (tanh).md
@@ -0,0 +1,10 @@
---
title: Hyperbolic tangent function (tanh)
tags:
- ai
- activation
- neural_networks
---
[[Activation functions|Activation function]], mostly used in [[neural networks]].

$$f(x) = \tanh(x) = \frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$$
19 changes: 19 additions & 0 deletions content/ai/Hyperparameters.md
@@ -0,0 +1,19 @@
---
title: Hyperparameters
tags:
- ml
- ai
---
Since ML algorithms learn the parameters that make them operate, the parameters that regulate how that learning takes place are called "hyper"-parameters.

Examples:

- [[Learning rate]]
- [[Epochs]]
- [[Early Stop]]
- [[Generation size]]
- [[Mutation rate]]
- [[Number of clusters]]
- [[Neural network architecture]]
- [[Activation functions|Activation function]]
- [[Weight initialization values]]
4 changes: 3 additions & 1 deletion content/ai/LLM Speed Performance.md
@@ -63,10 +63,12 @@ prefill = torch.compile(
### 4. Improve the attention mechanism

The attention mechanism (which weighs every token against the changing context) is also a quadratic algorithm: all tokens attend to all tokens, leading to $N^2$ scaling.
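
For intuition, the naive computation materializes the full $N \times N$ score matrix (a minimal single-head sketch, not an optimized implementation):

```python
import torch

def naive_attention(q, k, v):
    # q, k, v: (N, d); every token attends to every token
    scores = (q @ k.T) / q.shape[-1] ** 0.5   # (N, N) matrix: quadratic in sequence length
    return torch.softmax(scores, dim=-1) @ v
```
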
#### 3.1. Use vLLM or paged-attention
#### 3.1. Use vLLM or paged attention

Both techniques act as a middle step between the memory available and the memory required. It works much like an additional level of the memory hierarchy: chunks are loaded onto the device while the rest is paged out and kept at a lower level.

[[vLLM]]

#### 3.2. FlashAttention

Instead of storing the full attention matrix in HBM, FlashAttention computes the dot products block-wise, so that the intermediate results stay in fast on-chip memory (SRAM) rather than being written back to HBM.
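
A simplified sketch of the block-wise idea using an online softmax (single head, blocked keys/values only, no query tiling, masking, or kernel fusion — the real FlashAttention is a fused GPU kernel):

```python
import torch

def blockwise_attention(q, k, v, block_size=64):
    """Attention over blocked keys/values; never builds the full (n, n) matrix."""
    n, d = q.shape
    scale = d ** -0.5
    out = torch.zeros_like(q)
    row_max = torch.full((n, 1), float("-inf"))   # running max of scores per query
    row_sum = torch.zeros(n, 1)                   # running softmax denominator
    for start in range(0, n, block_size):
        kb = k[start:start + block_size]          # (b, d) block of keys
        vb = v[start:start + block_size]          # (b, d) block of values
        scores = (q @ kb.T) * scale               # (n, b): one block at a time
        block_max = scores.max(dim=-1, keepdim=True).values
        new_max = torch.maximum(row_max, block_max)
        correction = torch.exp(row_max - new_max) # rescale what was accumulated so far
        p = torch.exp(scores - new_max)
        row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
        out = out * correction + p @ vb
        row_max = new_max
    return out / row_sum
```
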
10 changes: 10 additions & 0 deletions content/ai/Linear function.md
@@ -0,0 +1,10 @@
---
title: Linear function
tags:
- ai
- neural_networks
- activation
---
A very simple [[Activation functions|activation function]], mostly used in [[neural networks]].

$$f(x) = x$$
12 changes: 12 additions & 0 deletions content/ai/ReLU.md
@@ -0,0 +1,12 @@
---
title: ReLU
tags:
- activation
- ai
- neural_networks
---
[[Activation functions|Activation function]], mostly used in [[neural networks]].

ReLU stands for "Rectified Linear Unit".

$$f(x) = \max(0, x)$$
10 changes: 10 additions & 0 deletions content/ai/Sigmoid.md
@@ -0,0 +1,10 @@
---
title: Sigmoid
tags:
- activation
- neural_networks
- ai
---
[[Activation functions|Activation function]], mostly used in [[neural networks]].

$$f(x) = \frac{1}{1 + e^{-x}}$$
10 changes: 10 additions & 0 deletions content/ai/Softmax.md
@@ -0,0 +1,10 @@
---
title: Softmax
tags:
- activation
- ai
- neural_networks
---
[[Activation functions|Activation function]], mostly used in [[neural networks]].

$$f(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}$$
4 changes: 1 addition & 3 deletions content/ai/benchmarks/MTEB.md
@@ -9,7 +9,7 @@ tags:
- embedding
- papers
---
> MTEB consists of 58 datasets covering 112 languages from 8 embedding tasks: Bitext mining, classification, clustering, pair classification, reranking, retrieval, STS and summarization.
> MTEB consists of 58 datasets covering 112 languages from 8 embedding tasks: Bitext mining, classification, [[clustering]], pair classification, reranking, retrieval, STS and summarization.
## Tasks

@@ -34,6 +34,4 @@ Code available at: https://github.com/embeddings-benchmark/mteb

Leaderboard in HuggingFace: https://huggingface.co/spaces/mteb/leaderboard

Other [[NLP Benchmarks]]

[^MTEB]: [MTEB: Massive Text Embedding Benchmark](https://arxiv.org/pdf/2210.07316.pdf)
23 changes: 23 additions & 0 deletions content/ai/llms/vLLM.md
@@ -0,0 +1,23 @@
---
title: vLLM
tags:
- ai
- llm
---
vLLM utilizes **PagedAttention**, an attention algorithm that effectively manages attention keys and values. vLLM equipped with PagedAttention sets a new state of the art in LLM serving: it delivers up to 24x higher throughput than HuggingFace Transformers, without requiring any model architecture changes.

In the autoregressive decoding process, all the input tokens to the LLM produce their attention key and value tensors, and these tensors are kept in GPU memory to generate the next tokens. These cached key and value tensors are often referred to as the KV cache. The KV cache is:

- _Large:_ Takes up to 1.7GB for a single sequence in LLaMA-13B.
- _Dynamic:_ Its size depends on the sequence length, which is highly variable and unpredictable.

PagedAttention partitions the KV cache of each sequence into blocks, each block containing the keys and values for a fixed number of tokens. During the attention computation, the PagedAttention kernel identifies and fetches these blocks efficiently.
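
A toy sketch of that bookkeeping (not vLLM's actual API): each sequence gets a block table mapping logical token positions to physical cache blocks, and a physical block is allocated only when a new one is needed:

```python
import torch

class PagedKVCache:
    """Toy illustration: fixed-size physical blocks shared by all sequences."""
    def __init__(self, num_blocks, block_size, d):
        self.block_size = block_size
        self.k = torch.zeros(num_blocks, block_size, d)
        self.v = torch.zeros(num_blocks, block_size, d)
        self.free_blocks = list(range(num_blocks))
        self.block_tables = {}  # sequence id -> list of physical block ids

    def append(self, seq_id, pos, k_t, v_t):
        table = self.block_tables.setdefault(seq_id, [])
        if pos % self.block_size == 0:
            table.append(self.free_blocks.pop())  # allocate a block only when needed
        block = table[pos // self.block_size]
        offset = pos % self.block_size
        self.k[block, offset] = k_t
        self.v[block, offset] = v_t
```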

PagedAttention has another key advantage: efficient memory sharing. For example, in _parallel sampling_, multiple output sequences are generated from the same prompt.

PagedAttention's memory sharing greatly reduces the memory overhead of complex sampling algorithms, such as parallel sampling and beam search, cutting their memory usage by up to 55%.

## Sources

- [vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention](https://blog.vllm.ai/2023/06/20/vllm.html)

23 changes: 23 additions & 0 deletions content/architecture/API Gateway.md
@@ -0,0 +1,23 @@
---
title: API Gateway
tags:
- architecture
---
An API gateway acts as a single entry point for client requests. The API gateway is responsible for request routing, composition, and protocol translation. It also provides additional features like authentication, authorization, caching, and rate limiting.

The API Gateway:

- parses and validates the attributes in the HTTP request.
- checks allow/deny lists.
- authenticates and authorizes through an identity provider.
- applies rate-limiting rules.
- routes the request to the relevant backend service by path matching.
- transforms the request into the appropriate protocol and forwards it to backend microservices.
- handles any errors that may arise during request processing for graceful degradation of service.
- implements resiliency patterns like [[circuit breakers]] to detect failures and prevent overloading interconnected services, avoiding cascading failures.
- utilizes observability tools for logging, monitoring, tracing, and debugging.
- can optionally cache responses to common requests to improve responsiveness.
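
A minimal sketch of two of those responsibilities — path-based routing plus a naive per-client rate limit (the routes, limits, and backend names are made up for illustration):

```python
import time

ROUTES = {"/users": "http://user-service.internal", "/orders": "http://order-service.internal"}
RATE_LIMIT, WINDOW = 100, 60  # requests per client per minute
_hits: dict[str, list[float]] = {}

def handle(client_id: str, path: str) -> str:
    # rate limiting: drop requests beyond the per-client budget
    now = time.time()
    hits = [t for t in _hits.get(client_id, []) if now - t < WINDOW]
    if len(hits) >= RATE_LIMIT:
        return "429 Too Many Requests"
    _hits[client_id] = hits + [now]
    # routing: forward to the backend whose path prefix matches
    for prefix, backend in ROUTES.items():
        if path.startswith(prefix):
            return f"forward to {backend}{path}"
    return "404 Not Found"
```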

The API gateway is different from a load balancer. While both handle network traffic, the API gateway operates at the application layer, mainly handling HTTP requests; the load balancer mostly operates at the transport layer.[^bbg]

[^bbg]: [ByteByteGo: 6 More Microservices Interview Questions](https://blog.bytebytego.com/p/6-more-microservices-interview-questions)
9 changes: 9 additions & 0 deletions content/architecture/API Key Authentication.md
@@ -0,0 +1,9 @@
---
title: API Key Authentication
tags:
- security
- http
---
Assigns unique keys to users or applications, sent in headers or parameters; while simple, it might lack the security features of token-based or OAuth methods.[^bbg91]

[^bbg91]: [ByteByteGo EP91: REST API Authentication Methods](https://blog.bytebytego.com/p/ep91-rest-api-authentication-methods)
9 changes: 9 additions & 0 deletions content/architecture/Basic Authentication.md
@@ -0,0 +1,9 @@
---
title: Basic Authentication
tags:
- http
- security
---
Involves sending a username and password with each request; without transport encryption (e.g., HTTPS) the credentials are exposed.[^bbg91]

[^bbg91]: [ByteByteGo EP91: REST API Authentication Methods](https://blog.bytebytego.com/p/ep91-rest-api-authentication-methods)
9 changes: 9 additions & 0 deletions content/architecture/OAuth Authentication.md
@@ -0,0 +1,9 @@
---
title: OAuth Authentication
tags:
- security
- http
---
Enables third-party limited access to user resources without revealing credentials by issuing access tokens after user authentication.[^bbg91]

[^bbg91]: [ByteByteGo EP91: REST API Authentication Methods](https://blog.bytebytego.com/p/ep91-rest-api-authentication-methods)
10 changes: 10 additions & 0 deletions content/architecture/REST Authentication methods.md
@@ -0,0 +1,10 @@
---
title: REST Authentication methods
tags:
- http
- security
---
- [[Basic Authentication]]
- [[Token Authentication]]
- [[OAuth Authentication]]
- [[API Key Authentication]]
9 changes: 9 additions & 0 deletions content/architecture/Token Authentication.md
@@ -0,0 +1,9 @@
---
title: Token Authentication
tags:
- security
- http
---
Uses generated tokens, like JSON Web Tokens (JWT), exchanged between client and server, offering enhanced security without sending login credentials with each request.[^bbg91]

[^bbg91]: [ByteByteGo EP91: REST API Authentication Methods](https://blog.bytebytego.com/p/ep91-rest-api-authentication-methods)
12 changes: 12 additions & 0 deletions content/databases/Redis AOF (Append Only File).md
@@ -0,0 +1,12 @@
---
title: Redis AOF (Append Only File)
tags:
- redis
- databases
---
AOF persistence logs every write operation received by the server. These operations can then be replayed at server startup, reconstructing the original dataset. Commands are logged using the same format as the Redis protocol itself.[^redis]

Unlike a write-ahead log, the Redis AOF log is a write-after log. Redis executes commands to modify the data in memory first and then writes it to the log file.[^bbg91]
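
A toy sketch of that write-after ordering (not Redis code, just the idea): each command mutates the in-memory state first and is appended to the log afterwards; replaying the log on startup rebuilds the dataset:

```python
data = {}

def execute(log_path, command, key, value=None):
    # write-after log: mutate memory first, then persist the command
    if command == "SET":
        data[key] = value
    elif command == "DEL":
        data.pop(key, None)
    with open(log_path, "a") as log:
        log.write(f"{command} {key} {value or ''}\n")

def replay(log_path):
    # on startup, re-run every logged command to reconstruct the dataset
    with open(log_path) as log:
        for line in log:
            command, key, value = line.rstrip("\n").split(" ", 2)
            if command == "SET":
                data[key] = value
            elif command == "DEL":
                data.pop(key, None)
```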

[^bbg91]: [ByteByteGo EP91: REST API Authentication Methods](https://blog.bytebytego.com/p/ep91-rest-api-authentication-methods)
[^redis]: [Redis persistence](https://redis.io/docs/management/persistence/)
