Understanding Swift Performance

📅 2020.5.14 (THU)

WWDC2016 | Session : 416 | Category : Performance

🔗 Understanding Swift Performance - WWDC 2016 - Videos - Apple Developer

/WWDC2016/images/Understanding-Swift-Performance/Untitled.png

Swift has a variety of first class types and various mechanisms for code reuse and dynamism.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%201.png

Are value or reference semantics more appropriate?

How dynamic do you need this abstraction to be?

Taking performance implications into account often helps guide me to a more idiomatic solution.

Understand the implementation to understand performance

/WWDC2016/images/Understanding-Swift-Performance/Untitled%202.png

Dimensions of Performance

/WWDC2016/images/Understanding-Swift-Performance/Untitled%203.png

/WWDC2016/images/Understanding-Swift-Performance/Untitled%204.png

If we want to write fast Swift code, we're going to need to avoid paying for dynamism and runtime that we're not taking advantage of.

Allocation

Swift automatically allocates and deallocates memory on your behalf.

Stack

/WWDC2016/images/Understanding-Swift-Performance/Untitled%205.png

When we call into a function, we can allocate the memory we need just by trivially decrementing the stack pointer to make space. And when we've finished executing our function, we can trivially deallocate that memory just by incrementing the stack pointer back up to where it was before we called this function.

It's literally the cost of assigning an integer, so it is fast. This is in contrast to the heap, which is more dynamic but less efficient than the stack.

Heap

The heap lets you do things the stack can't, like allocating memory with a dynamic lifetime.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%206.png

Because multiple threads can be allocating memory on the heap at the same time, the heap needs to protect its integrity using locking or other synchronization mechanisms. This is a pretty large cost.

Struct

/WWDC2016/images/Understanding-Swift-Performance/Untitled%207.png

Before we even begin executing any code, we've allocated space on the stack for our point1 instance and our point2 instance. Because Point is a struct, the x and y properties are stored inline on the stack.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%208.png

When we go to construct our point with an x of 0 and a y of 0, all we're doing is initializing that memory we've already allocated on the stack.

When we assign point1 to point2, we're just making a copy of that point and initializing the point2 memory that, again, we'd already allocated on the stack.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%209.png

Note that point1 and point2 are independent instances: assigning 5 to point2.x leaves point1.x at 0.

This is known as value semantics.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2010.png

We use point1, use point2, and we're done executing our function. So we can trivially deallocate the memory for point1 and point2 just by incrementing the stack pointer back up to where it was when we entered our function.
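A minimal sketch of the struct walkthrough above. The slide's code is not reproduced in these notes, so the shape of `Point` (two `Double` properties) and the function name `structDemo` are assumptions made for illustration.

```swift
// Assumed shape of the Point struct from the slides.
struct Point {
    var x: Double
    var y: Double
}

// Hypothetical helper that mirrors the walkthrough.
func structDemo() -> (Double, Double) {
    let point1 = Point(x: 0, y: 0)  // local value, typically stored inline on the stack
    var point2 = point1             // copies the value; no sharing
    point2.x = 5                    // mutates only the copy
    return (point1.x, point2.x)
}

print(structDemo())  // (0.0, 5.0)
```

The two results differ because each variable owns its own copy of x and y, which is the value semantics described above.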

Class

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2011.png

Instead of the actual storage for the properties of Point, we allocate stack memory for the references point1 and point2, which refer to memory that will be allocated on the heap.

When it allocates on the heap, Swift actually allocates 4 words of storage, in contrast to the 2 words allocated when Point was a struct. The extra 2 words are managed by Swift on our behalf.

Because Point is a class, assignment copies the reference rather than the contents of the point.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2012.png

This is known as reference semantics and can lead to unintended sharing of state.
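For comparison, here is a sketch of the same walkthrough with `Point` declared as a class. The initializer and the helper name `classDemo` are assumptions for illustration, not the slide's exact code.

```swift
// Assumed class version of Point.
final class Point {
    var x: Double
    var y: Double
    init(x: Double, y: Double) {
        self.x = x
        self.y = y
    }
}

// Hypothetical helper that mirrors the walkthrough.
func classDemo() -> (Double, Double) {
    let point1 = Point(x: 0, y: 0)  // instance storage is allocated on the heap
    let point2 = point1             // copies the reference, not the contents
    point2.x = 5                    // visible through point1 as well
    return (point1.x, point2.x)
}

print(classDemo())  // (5.0, 5.0)
```

Both names refer to the same instance, so the write through point2 is observed through point1. That is the unintended sharing the notes warn about.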

We saw that classes are more expensive to construct than structs because classes require a heap allocation. Because classes are allocated on the heap and have reference semantics, classes have some powerful characteristics like identity and indirect storage.

If you don't need those characteristics, it is better to use a struct.

And structs aren't prone to the unintended sharing of state like classes are.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2013.png

This function (makeBalloon in the slide) is called frequently while the app runs, every time the user scrolls, so it needs to be fast.

→ Add a caching layer

(Store each image once it has been generated so that it never has to be generated again.)

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2014.png

**String is not a strong type for a cache key.**

  • The key is supposed to identify an image, but nothing stops you from passing an arbitrary string such as a dog's name. → Not safe.
  • String stores the contents of its characters indirectly on the heap. → Every call to makeBalloon incurs a heap allocation, even on a cache hit.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2015.png

**Using a struct is safer.**

  • Structs are first-class types in Swift, so a struct can be used as a dictionary key (provided it conforms to Hashable).
  • It no longer requires a heap allocation; the key is allocated on the stack.
  • Safer and faster!

Reference Counting

How does Swift know when it is safe to deallocate memory it allocated on the heap?

⇒ Swift keeps a count of the total number of references to each instance on the heap. When the count drops to 0, nothing points to the instance any more, so it is safe to deallocate the memory.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2016.png

  • The count can be incremented and decremented from multiple threads, so these operations must be thread safe.
  • Reference counting is a very frequent operation. → The cost can add up.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2017.png

The Swift compiler inserts the retain and release calls for you.

  • retain : atomically increment the reference count
  • release : atomically decrement the reference count
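One way to observe the effect of the compiler-inserted retain and release calls is to watch an instance through a weak reference, which does not keep it alive. This is a sketch with a hypothetical `Point` class and helper function, not the slide's code.

```swift
// Hypothetical class used only to observe deallocation.
final class Point {
    var x = 0.0, y = 0.0
}

// Returns true if the instance has been deallocated once its strong references are gone.
func referenceCountingDemo() -> Bool {
    weak var observer: Point?      // weak: does not contribute to the strong count
    do {
        let point1 = Point()       // one strong reference
        let point2 = point1        // a second strong reference to the same instance
        observer = point1
        point2.x = 5
    }                              // both strong references are released by here
    return observer == nil         // the weak reference was cleared on deallocation
}

print(referenceCountingDemo())  // true
```

Once the last strong reference is released, the count reaches 0, the instance is deallocated, and the weak reference reads as nil.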

Class

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2018.png

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2019.png

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2020.png

Struct

With a struct that contains no references, no reference counting takes place.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2021.png

Struct containing references

String stores its characters on the heap, so it is reference counted.

UIFont is a class, so it is reference counted.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2022.png

label holds 2 references.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2023.png

Copying it adds 2 more references.

Because classes are allocated on the heap, Swift manages the lifetime of that heap allocation with reference counting.

If a struct contains references, it pays for reference counting too. ⇒ The more references it holds, the more reference counting overhead it incurs.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2024.png

🆕 value type UUID

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2025.png

The result is more type safe, performs better, and is far more convenient to write.
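A sketch of the kind of model the UUID note points to. The `Attachment` struct, its fields and the `MimeType` enum are assumptions for illustration, since the slide's code is not reproduced in these notes. The idea is to replace String fields with value types that can only hold valid values.

```swift
import Foundation

// Hypothetical enum replacing a free-form String MIME type.
enum MimeType: String {
    case jpeg = "image/jpeg"
    case png = "image/png"
    case gif = "image/gif"
}

// Hypothetical model: uuid uses Foundation's UUID value type instead of a String.
struct Attachment {
    let fileURL: URL
    let uuid: UUID
    let mimeType: MimeType

    // Fails if the string is not one of the supported MIME types.
    init?(fileURL: URL, uuid: UUID, mimeType: String) {
        guard let mimeType = MimeType(rawValue: mimeType) else { return nil }
        self.fileURL = fileURL
        self.uuid = uuid
        self.mimeType = mimeType
    }
}

let url = URL(fileURLWithPath: "/tmp/photo.png")
let valid = Attachment(fileURL: url, uuid: UUID(), mimeType: "image/png")
let invalid = Attachment(fileURL: url, uuid: UUID(), mimeType: "text/plain")
print(valid != nil, invalid != nil)  // true false
```

Neither `UUID` nor the enum is reference counted, so a struct modeled this way carries fewer references than one that stores both fields as String.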

Method Dispatch

Static

When a method is called at run time, Swift needs to execute the correct implementation.

Static dispatch means the implementation to execute can be determined at compile time.

At run time, the program can then jump directly to the correct implementation.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2026.png

inlining

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2027.png

The compiler knows which implementations are going to be executed.

drawAPoint(point)
⬇️
param.draw() ⇒ // static dispatch
⬇️
Point.draw implementation

// no call stack overhead
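The chain above can be sketched as follows. The body of `draw()` is a placeholder that returns a String so the result can be inspected; the real slide presumably draws to the screen.

```swift
struct Point {
    var x, y: Double
    // Placeholder body, assumed for illustration.
    func draw() -> String { "Point.draw at (\(x), \(y))" }
}

// param has the concrete type Point, so the compiler knows at compile time
// that param.draw() means Point.draw and may inline the call.
func drawAPoint(_ param: Point) -> String {
    param.draw()
}

let point = Point(x: 0, y: 0)
print(drawAPoint(point))  // Point.draw at (0.0, 0.0)
```

Because every call target is known statically, the optimizer is free to replace the whole chain with the body of Point.draw.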

Dynamic

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2028.png

Dynamic dispatch cannot be resolved at compile time. The implementation is looked up at run time and then jumped to.

On its own, dynamic dispatch is not much more expensive than static dispatch. There's just one level of indirection. The larger cost is that it prevents the compiler from inlining and applying further optimizations.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2029.png

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2030.png

We can create an array of these things, and they are all the same size because the array stores them by reference.

For d.draw(), the compiler cannot tell which implementation to call: d could be a Point or it could be a Line.

The compiler adds another field to classes: a pointer to the type information of that class, which is stored in static memory.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2031.png

And so when we go and call draw, what the compiler actually generates on our behalf is a lookup through the type to something called the virtual method table, stored with the type in static memory, which contains a pointer to the correct implementation to execute.
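A sketch of the class-based polymorphism described above. The class names follow the notes; the String-returning `draw()` bodies are placeholders assumed for illustration.

```swift
class Drawable {
    func draw() -> String { "Drawable" }
}

final class Point: Drawable {
    override func draw() -> String { "Point" }
}

final class Line: Drawable {
    override func draw() -> String { "Line" }
}

// Every element is a reference of the same size, so the array can hold both kinds.
let drawables: [Drawable] = [Point(), Line()]

// d.draw() is resolved at run time through the instance's type information.
let output = drawables.map { d in d.draw() }
print(output)  // ["Point", "Line"]
```

The call site is identical for both elements; which override runs depends on the dynamic type of each instance.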

Final Class

If you never intend for a class to be subclassed, you can mark it as final. The compiler can then dispatch calls to its methods statically.
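A small sketch of a final class. The class and its placeholder method are assumptions for illustration.

```swift
// final: no subclass can override draw(), so there is only one possible implementation.
final class Point {
    var x = 0.0, y = 0.0
    func draw() -> String { "Point.draw" }
}

// With a final class the call target is known at compile time.
func drawAPoint(_ param: Point) -> String {
    param.draw()
}

print(drawAPoint(Point()))  // Point.draw
```

Instances still live on the heap and are reference counted; final only affects how method calls are dispatched.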

Protocol Types

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2032.png

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2033.png

The structs Line and Point do not share the inheritance relationship that V-table dispatch requires.

How does Swift dispatch to the correct method?

→ A table-based mechanism called the Protocol Witness Table (PWT)

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2034.png

There is one table per type that implements the protocol. The entries in the table link to that type's implementations.
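A sketch of protocol-based polymorphism with value types. The property names and the String-returning `draw()` are assumptions for illustration.

```swift
protocol Drawable {
    func draw() -> String
}

// Two unrelated structs: no common superclass, so there is no V-table to share.
struct Point: Drawable {
    var x, y: Double
    func draw() -> String { "Point" }
}

struct Line: Drawable {
    var x1, y1, x2, y2: Double
    func draw() -> String { "Line" }
}

// Elements are stored as values of the protocol type.
let drawables: [Drawable] = [Point(x: 0, y: 0), Line(x1: 0, y1: 0, x2: 1, y2: 1)]

// Each call is dispatched through the conforming type's protocol witness table.
let output = drawables.map { d in d.draw() }
print(output)  // ["Point", "Line"]
```

The same call site reaches Point.draw or Line.draw depending on the value stored in each element.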

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2035.png

Line needs 4 words and Point needs 2 words. → They are not the same size, but an array wants to store its elements at fixed offsets.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2036.png

**Point (2 words)**

The first 3 words in that existential container are reserved for the valueBuffer.

Small types like our Point, which only needs two words, fit into this valueBuffer.

**Line (4 words)**

Swift allocates memory on the heap, stores the value there, and stores a pointer to that memory in the existential container.
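The sizes involved can be inspected with `MemoryLayout`. This is a sketch that assumes a 64-bit platform (8-byte words) and the struct shapes shown. The breakdown of the container into a 3-word buffer plus 2 words of metadata is how the current implementation is laid out, not something the language guarantees.

```swift
protocol Drawable {
    func draw() -> String
}

struct Point: Drawable {        // two Doubles: 2 words on a 64-bit platform
    var x, y: Double
    func draw() -> String { "Point" }
}

struct Line: Drawable {         // four Doubles: 4 words on a 64-bit platform
    var x1, y1, x2, y2: Double
    func draw() -> String { "Line" }
}

let word = MemoryLayout<Int>.size
let pointSize = MemoryLayout<Point>.size
let lineSize = MemoryLayout<Line>.size
// The protocol-typed value has one fixed size regardless of what it holds:
// a 3-word valueBuffer plus references to the type's metadata and witness table.
let containerSize = MemoryLayout<Drawable>.size

print(pointSize / word, lineSize / word, containerSize / word)
```

On a 64-bit platform this is expected to report 2, 4 and 5 words. Point fits in the 3-word buffer, while Line does not and is stored on the heap.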

The Value Witness Table (VWT)

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2037.png

  1. Allocate memory on the heap and store a pointer to that memory inside the valueBuffer of the existential container.
  2. Swift needs to copy the value from the source of the assignment that initializes our local variable into the existential container.
  3. The copy entry of the value witness table does the correct thing and copies the value into the heap storage that the valueBuffer points to.
  4. The program continues until we reach the end of the lifetime of our local variable.
  5. Swift calls the deallocate function in that table.

The Existential Container

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2038.png

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2039.png

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2040.png

→ This work is what enables combining value types such as struct Line and struct Point together with protocols to get dynamic behavior, dynamic polymorphism.

Protocol Type Stored Properties

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2041.png

  • Existential Container inline
  • Large values on the heap
  • Supports dynamic polymorphism

→ This representation allows storing a differently typed value later in the program.

Expensive Copies of Large Values

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2042.png

Heap allocation is expensive, and this uses 4 of them?

→ The existential container has room for 3 words, and a reference fits in those 3 words because a reference is just one word.

References Fit in the Value Buffer

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2043.png

If second is created by copying first, only the reference count needs to be incremented, but doesn't that cause unintended sharing of state?

→ Copy on write

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2044.png

Indirect Storage with Copy-on-Write

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2045.png

When we come to modify, mutate our value, we first check the reference count. Is it greater than 1 ?

If the reference count is greater than 1, we create a copy of our Line storage and mutate that.
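A sketch of indirect storage with copy-on-write, using the standard library's `isKnownUniquelyReferenced(_:)` for the reference-count check. The `LineStorage` layout, the `move(dx:dy:)` method and the accessor are assumptions for illustration.

```swift
protocol Drawable {
    func draw() -> String
}

// The large payload lives in a class instance on the heap.
final class LineStorage {
    var x1, y1, x2, y2: Double
    init(x1: Double, y1: Double, x2: Double, y2: Double) {
        self.x1 = x1; self.y1 = y1; self.x2 = x2; self.y2 = y2
    }
    init(copying other: LineStorage) {
        x1 = other.x1; y1 = other.y1; x2 = other.x2; y2 = other.y2
    }
}

// Line itself is a single reference, so it fits in the value buffer.
struct Line: Drawable {
    private var storage: LineStorage

    init(x1: Double, y1: Double, x2: Double, y2: Double) {
        storage = LineStorage(x1: x1, y1: y1, x2: x2, y2: y2)
    }

    var x1: Double { storage.x1 }

    func draw() -> String { "Line" }

    mutating func move(dx: Double, dy: Double) {
        // Copy only if another Line value shares this storage.
        if !isKnownUniquelyReferenced(&storage) {
            storage = LineStorage(copying: storage)
        }
        storage.x1 += dx; storage.x2 += dx
        storage.y1 += dy; storage.y2 += dy
    }
}

let first = Line(x1: 0, y1: 0, x2: 1, y2: 1)
var second = first            // copies the reference only
second.move(dx: 5, dy: 0)     // storage is shared, so it is copied before the write
print(first.x1, second.x1)    // 0.0 5.0
```

Reads and plain copies stay cheap, and the heap allocation is paid only when a shared value is actually mutated. That preserves value semantics for Line.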

Copy Using Indirect Storage

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2046.png

Copying now only increments a reference count, which is a lot cheaper than a heap allocation.

Summary - Protocol Types

  • Dynamic polymorphism
  • Indirection through Witness Tables and Existential Container
  • Copying of large value causes heap allocation

Generic code

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2047.png

Swift will bind the generic type T to the type used at this call site.

The generic parameter T in this call context is bound to the type Point.
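A sketch contrasting the generic form with the protocol-typed form. The function names and the String-returning `draw()` are assumptions for illustration.

```swift
protocol Drawable {
    func draw() -> String
}

struct Point: Drawable {
    var x, y: Double
    func draw() -> String { "Point" }
}

struct Line: Drawable {
    var x1, y1, x2, y2: Double
    func draw() -> String { "Line" }
}

// Generic: T is bound to one concrete type per call site (static polymorphism).
func drawACopy<T: Drawable>(_ local: T) -> String {
    local.draw()
}

// Protocol type: the argument can be any Drawable, decided at run time.
func drawACopyDynamic(_ local: Drawable) -> String {
    local.draw()
}

let fromGeneric = drawACopy(Point(x: 0, y: 0))                          // T == Point here
let fromProtocol = drawACopyDynamic(Line(x1: 0, y1: 0, x2: 1, y2: 1))
print(fromGeneric, fromProtocol)  // Point Line
```

Because the generic call site fixes T to Point, the compiler has enough information to generate a version of drawACopy specific to Point when the function body is visible to it.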

Implementation of Generic Methods

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2048.png

Because we have one type per call context, Swift does not use an existential container here. Instead, it can pass both the value witness table and the protocol witness table of the type used at this call site as additional arguments to the function.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2049.png

And then during execution of that function, when we create a local variable for the parameter, Swift will use the value witness table to allocate any necessary buffers on the heap and execute the copy from the source of the assignment to the destination.

Storage of Local Variables

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2050.png

Is this any faster? Is this any better?

This static form of polymorphism enables the compiler optimization called specialization of generics.

Specialization of Generics

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2051.png

When Does Specialization Happen?

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2052.png

Whole Module Optimization

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2053.png

If we compile those 2 files separately, then when I come to compile the file UsePoint, the definition of my Point is no longer available because the compiler has compiled those 2 files separately.

However, with whole module optimization, the compiler will compile both files together as one unit and will have insight into the definition of the Point file and optimization can take place.

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2054.png

Specialized Generics - Class Type

Performance characteristics are like those of class types:

  • Heap allocation on creating an instance
  • Reference counting
  • Dynamic method dispatch through V-Table

/WWDC2016/images/Understanding-Swift-Performance/Untitled%2055.png