



A hardware 3D image generator

Jefferson Chaves Ferreira – João Paulo Condé Oliveira Prado  
Advisor: Cíntia Borges Margi – Co-advisor: Pedro Maat Costa Massolino

# Presentation outline

1. Motivation
2. Theoretical background
3. Methodology
4. Architecture
5. Implementation
6. Results
7. Conclusions

# Motivation

- Optics
  - Reflection
  - Refraction
- Lighting effects
  - Illumination
  - Shadows
- Challenge
  - Computational cost



*Ray tracing*



A car rendered by a *ray tracing* processor. Taken from (HANIKA, 2007).



Comparison between rasterization and *ray tracing*. Taken from (HOWARD, 2007).

# Theoretical background

## Specular reflection

$$\vec{V} = \vec{I} - 2\vec{N}(\vec{I} \cdot \vec{N})$$

## Refraction

$$\vec{t} = \frac{n_1}{n_2} \vec{i} - \left( \frac{n_1}{n_2} (\vec{i} \cdot \vec{n}) + \sqrt{1 - \left( \frac{n_1}{n_2} \right)^2 (1 - (\vec{i} \cdot \vec{n})^2)} \right) \vec{n}$$

Taken from (GREVE, 2007).
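The reflection and refraction formulas above can be sketched in software as follows. This is only an illustrative sketch, not the project's reference model: the function names `dot`, `reflect` and `refract` are hypothetical, vectors are plain tuples, and it assumes that the incident direction and the normal are unit vectors, with the normal facing the incoming ray. A negative value under the square root is treated as total internal reflection, in which case no refracted ray is returned.

```python
import math


def dot(a, b):
    """Dot product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def reflect(i, n):
    """Reflected direction V = I - 2 N (I . N); n is assumed to be a unit normal."""
    k = 2.0 * dot(i, n)
    return tuple(ic - k * nc for ic, nc in zip(i, n))


def refract(i, n, n1, n2):
    """Refracted direction t for a ray going from index n1 into index n2.

    Assumes i and n are unit vectors and n faces the incoming ray (i . n <= 0).
    Returns None when the radicand is negative (total internal reflection).
    """
    eta = n1 / n2
    cos_i = dot(i, n)
    radicand = 1.0 - eta * eta * (1.0 - cos_i * cos_i)
    if radicand < 0.0:
        return None
    k = eta * cos_i + math.sqrt(radicand)
    return tuple(eta * ic - k * nc for ic, nc in zip(i, n))
```

For example, `refract((0.6, -0.8, 0.0), (0.0, 1.0, 0.0), 1.0, 1.5)` bends the ray towards the normal, while the same call with a glass-to-air ray at a steep enough angle returns `None`.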

# k-d tree



$$C = \frac{1}{SA(N_{curr})} \left[ SA(LCN_{curr})(Trig_{left} + Trig_{both}) + SA(RCN_{curr})(Trig_{right} + Trig_{both}) \right]$$

Surface area heuristic (HAVRAN, 2000)
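A minimal sketch of how the cost above could be evaluated for one candidate split. The symbol meanings are assumptions, since the slide does not define them: SA is taken as the surface area of a node's axis-aligned bounding box, LCN and RCN as the left and right child nodes of the current node, and the Trig terms as the number of triangles lying only on the left, only on the right, or on both sides of the split plane. The function names are hypothetical.

```python
def box_surface_area(lo, hi):
    """Surface area of an axis-aligned box given its min and max corners."""
    dx, dy, dz = (h - l for l, h in zip(lo, hi))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def sah_cost(sa_node, sa_left, sa_right, trig_left, trig_right, trig_both):
    """Cost C of a split: child areas weighted by the triangles each child holds."""
    weighted = (sa_left * (trig_left + trig_both)
                + sa_right * (trig_right + trig_both))
    return weighted / sa_node


# Usage: unit cube split by the plane x = 0.25 (illustrative numbers only).
node = box_surface_area((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
left = box_surface_area((0.0, 0.0, 0.0), (0.25, 1.0, 1.0))
right = box_surface_area((0.25, 0.0, 0.0), (1.0, 1.0, 1.0))
cost = sah_cost(node, left, right, trig_left=2, trig_right=6, trig_both=1)
```

A tree builder would typically evaluate this cost for several candidate planes and keep the cheapest one.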

# Methodology



Design methodology employed. Adapted from (ERBAS, 2006).

# Architecture



# Implementation



C++



SYNOPSYS®

XILINX®

# Results (I)



*Stanford bunny*



Polyhedron (80 faces)

# Results (II)



# Results (III)

## Resource Usage Report for COPROC

Mapping to part: xc5vfx70tff1136-1

### Cell usage:

| Cell    | Usage     |
|---------|-----------|
| DSP48E  | 57 uses   |
| FD      | 3 uses    |
| FDE     | 290 uses  |
| FDR     | 2 uses    |
| FDRE    | 118 uses  |
| GND     | 9 uses    |
| LD      | 244 uses  |
| MUXCY   | 2 uses    |
| MUXCY_L | 5420 uses |
| VCC     | 13 uses   |
| XORCY   | 5337 uses |
| LUT1    | 44 uses   |
| LUT2    | 839 uses  |
| LUT3    | 266 uses  |
| LUT4    | 1757 uses |
| LUT5    | 2454 uses |
| LUT6    | 2117 uses |
| LUT6_2  | 113 uses  |
| BUFG    | 1 use     |

I/O Register bits: 0  
Register bits not including I/Os: 413 (0%)  
Latch bits not including I/Os: 244 (0%)

DSP48s: 57 of 128 (44%)

Global Clock Buffers: 1 of 32 (3%)

Number of unique control sets: 5  
C(Clk), R(Reset), S(GND), CE(unl\_ready\_1\_sqmuxa\_i) : 1  
C(Clk), R(unl\_reset\_0\_i\_lut6\_2\_06), S(GND), CE(N\_5\_i) : 117  
C(Clk), CLR(GND), PRE(GND), CE(ready\_0\_sqmuxa) : 290  
C(Clk), CLR(GND), PRE(GND), CE(VCC) : 3  
C(Clk), R(Reset), S(GND), CE(VCC) : 2

Total load per clock:  
coproc|Clk: 419

Mapping Summary:

**Total LUTs: 7590 (15%)**

# Results (IV)

Performance Summary  
\*\*\*\*\*

Worst slack in design: -18.963

| Starting Clock | Requested Frequency | Estimated Frequency | Requested Period (ns) | Estimated Period (ns) | Slack (ns) |
|----------------|---------------------|---------------------|------------------|------------------|---------|
| coproc clk     | 9.6 MHz             | 8.1 MHz             | 104.657          | 123.620          | -18.963 |
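The figures in this table are related by simple arithmetic, sketched below. The sketch assumes the periods are in nanoseconds, which is consistent with the reported frequencies, and that slack is the requested period minus the estimated period. The helper names are hypothetical.

```python
def period_ns_to_mhz(period_ns):
    """Clock frequency in MHz for a period given in nanoseconds."""
    return 1000.0 / period_ns


def slack_ns(requested_period_ns, estimated_period_ns):
    """Timing slack: negative when the design is slower than requested."""
    return requested_period_ns - estimated_period_ns


# Values copied from the coproc clk row of the table.
requested_ns = 104.657
estimated_ns = 123.620

requested_mhz = period_ns_to_mhz(requested_ns)
estimated_mhz = period_ns_to_mhz(estimated_ns)
worst_slack = slack_ns(requested_ns, estimated_ns)
```

The negative slack means the estimated critical path is longer than the requested clock period, so the constraint is not met at the requested frequency.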

# Results (V)

```
#-----#
# Starting program par
# par -w -ol high system_map.ncd system.ncd system.pcf
#-----#
Device Utilization Summary:

Number of BUFGs                           9 out of 32      28%
Number of BUFIOS                          8 out of 80      10%
Number of DCM_ADVs                        1 out of 12       8%
Number of DSP48Es                        73 out of 128     57%
Number of FIFO36_72_EXPs                  2 out of 148      1%
   Number of LOCed FIFO36_72_EXPs         2 out of 2      100%
Number of IDELAYCTRLS                     3 out of 22      13%
Number of ILOGICs                       106 out of 800     13%
   Number of LOCed ILOGICs                8 out of 106      7%
Number of External IOBs                 230 out of 640     35%
   Number of LOCed IOBs                 230 out of 230    100%
Number of IODELAYS                       80 out of 800     10%
   Number of LOCed IODELAYS               8 out of 80      10%
Number of JTAGPPCs                        1 out of 1      100%
Number of OLOGICS                       223 out of 800     27%
Number of PLL_ADVs                        2 out of 6       33%
Number of PPC440s                         1 out of 1      100%
Number of RAMB36_EXPs                     5 out of 148      3%
   Number of LOCed RAMB36_EXPs            2 out of 5       40%
Number of Slice Registers              8859 out of 44800   19%
   Number used as Flip Flops           8613
   Number used as Latches               192
   Number used as LatchThrus             54
Number of Slice LUTS                  16977 out of 44800   37%
Number of Slice LUT-Flip Flop pairs   19608 out of 44800   43%
```

**Number of Slices: 6299 out of 11200 (56%)**

# Results (VI)

| Constraint                     | Period requirement | Actual period (direct) | Actual period (derivative) | Timing errors (direct) | Timing errors (derivative) |
|--------------------------------|--------------------|------------------------|----------------------------|------------------------|----------------------------|
| TS_sys_clk_pin                 | 10.000ns           | 4.000ns                | **12.315ns(?)**            | 0                      | **60(!?)**                 |
| TS_clock_generator_0_clock_gen | 40.000ns           | 39.752ns               | N/A                        | 0                      | 0                          |
| TS_clock_generator_0_clock_gen | 10.000ns           | **12.315ns(?)**        | N/A                        | **60(!?)**             | 0                          |
| TS_clock_generator_0_clock_gen | 7.500ns            | 7.459ns                | N/A                        | 0                      | 0                          |
| TS_clock_generator_0_clock_gen | 5.000ns            | 4.506ns                | N/A                        | 0                      | 0                          |
| TS_clock_generator_0_clock_gen | 5.000ns            | 4.964ns                | N/A                        | 0                      | 0                          |

# Results (VII)

Total load per clock:

**normalAndIntersectionPoint**|clockSqrt: 325

**normalAndIntersectionPoint**|clock: 18

Mapping Summary:

**Total LUTs: 13527 (108%) (!!!)**

Mapper successful!

Process took 0h:10m:00s realtime, 0h:09m:50s cputime

# Conclusions

- Basic objectives achieved
- Difficulties encountered:
  - Debugging: errors that were hard to find
  - Tools: incompatibilities and poor documentation
  - Changes to the architecture
- A "complete" engineering project

# References

ERBAS, C. *System-Level Modeling and Design Space Exploration for Multiprocessor Embedded System-on-Chip Architectures*. PhD thesis, Informatics Institute, Univ. of Amsterdam, Nov. 2006.

GREVE, B. de. Reflections and Refractions in Ray Tracing. 2007. Available at: <http://www.flipcode.com/archives/reflection_transmission.pdf>. [Accessed 4-10-2011].

HAVRAN, V. *Heuristic Ray Shooting Algorithms*. PhD thesis, Czech Technical University, 2000.

HOWARD, J. *Real Time Ray-Tracing: The End of Rasterization?* 2007. Available at: <http://blogs.intel.com/research/2007/10/real_time_raytracing_the_end_o.php>. [Accessed 12-12-2011].



Thank you!  
Questions?