RC v1.2.0 #10

alessandrocapotondi · 2019-08-02T12:19:07Z

Changelog: v1.2.0 - 2019-08-02

Fixed

Fix #8. Fixed hero_dma_memcpy_async API. In case of big memory transfers, some DMA job were leacked, resulting on the termination of DMA channels available.

Changed

Added API to access HW cycles counters.

### Added - Add API to access and control the HW cycle counters on the HERO accelerator-side: - `hero_rt_start_cycle_cnt()` - `hero_rt_reset_cycle_cnt()` - `hero_rt_stop_cycle_cnt()` - `hero_rt_get_cycles()` - Add `testsuite` for CI.

## Changelog: v1.2.0 - 2019-08-02 ### Fixed - Fix [#8](#8). Fixed `hero_dma_memcpy_async` API. In case of big memory transfers, some DMA job were leacked, resulting on the termination of DMA channels available. ### Changed - Added API to access HW cycles counters.

vogelpi

Thanks @alessandrocapotondi for creating this PR. I suggest the following two things:

The proposed fix for Job IDs are dropped when crossing page boundaries #8 should be tested on the board.
The target library should also provide a timer implementation for the host side. This can for example use Linux system functions internally.

vogelpi · 2019-08-05T14:33:16Z

host/hero-target.c

+hero_rt_get_cycles()
+{
+  return 0x0;
+}


Missing newline

vogelpi · 2019-08-05T14:36:56Z

host/hero-target.c

+void
+hero_rt_start_cycle_cnt()
+{
+	return;


Shouldn't there be a corresponding timer implementation for the host side as well? I think this would be useful when doing profiling and comparing host vs. accelerator execution. What do you think?

The PULP timer returns the number passed of clock cycles. How would you implement this on the host while staying consistent with the unit (clock cycles)?

I see two options:

You are using functions like clock_gettime() on the host which gives you the time in seconds and nanoseconds. You can then convert this into accelerator clock cycles or host clock cycles depending on what comparison you want to do. The frequencies of these two clocks you can get from the pulp struct defined and set up by libpulp (see hero-support), namely pulp->pulp_clk_freq_mhz and pulp->host_clk_freq_mhz. Alternatively, you can also get the host clock frequency from sysfs.

Despite executing on the host, you can still use the timer inside the accelerator directly. How to do this is illustrated here:
https://github.com/pulp-platform/hero-support/blob/6762d055953089eea215c60a58ad3a62c6b717e2/libpulp/src/pulp.c#L503
You just need to be careful that the accelerator is not manipulating the timer simultaneously.

Best regards,
Pirmin

vogelpi · 2019-08-05T14:49:47Z

pulp/hero-target.c

-    // just wait for the last one...
-    dma = (hero_dma_job_t)plp_dma_memcpy_priv(ext_addr,loc_addr,size_tmp,ext2loc);
+    //dma_job = (hero_dma_job_t)plp_dma_memcpy_priv(ext_addr,loc_addr,size_tmp,ext2loc);
+    dma_cmd = plp_dma_getCmd(ext2loc, size, PLP_DMA_1D, PLP_DMA_TRIG_EVT, PLP_DMA_NO_TRIG_IRQ, PLP_DMA_PRIV);


I am not sure if this change is sufficient to fix the issue described in #8 . This should definitely be tested. @ctbur could you have a look please?

Yes, I will test it as soon as my other DMA and compiler issues are resolved. This way I won't have to create a custom test case.

ctbur

The new hero_dma_memcpy_async works great, it's even faster than the old implementation by 0.5 - 1.0 bytes per cycle, depending on transfer size and whether DRAM is involved or not. Just the typo in line 62 needs to be fixed.

ctbur · 2019-08-06T06:56:41Z

pulp/hero-target.c

@@ -58,7 +58,8 @@ hero_dma_memcpy_async(void *dst, void *src, int size)
  int ext2loc;
  unsigned int ext_addr_tmp, ext_addr, loc_addr;
  int size_tmp = size;
-  hero_dma_job_t dma = 0;
+  hero_dma_job_t dma_job = plp_dma_counter_alloc();
+  uin32_t dma_cmd;


Typo: uin32_t => uint32_t

alessandrocapotondi added 2 commits October 17, 2018 23:20

Release v1.2.0

f649590

### Added - Add API to access and control the HW cycle counters on the HERO accelerator-side: - `hero_rt_start_cycle_cnt()` - `hero_rt_reset_cycle_cnt()` - `hero_rt_stop_cycle_cnt()` - `hero_rt_get_cycles()` - Add `testsuite` for CI.

alessandrocapotondi requested review from andreaskurth and vogelpi as code owners August 2, 2019 12:19

alessandrocapotondi changed the title ~~Rel v1.2.0~~ RC v1.2.0 Aug 2, 2019

vogelpi reviewed Aug 5, 2019

View reviewed changes

ctbur suggested changes Aug 6, 2019

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RC v1.2.0 #10

RC v1.2.0 #10

alessandrocapotondi commented Aug 2, 2019

vogelpi left a comment

vogelpi Aug 5, 2019

vogelpi Aug 5, 2019

ctbur Aug 5, 2019 •

edited

vogelpi Aug 6, 2019

vogelpi Aug 5, 2019

ctbur Aug 5, 2019

ctbur left a comment

ctbur Aug 6, 2019

RC v1.2.0 #10

Are you sure you want to change the base?

RC v1.2.0 #10

Conversation

alessandrocapotondi commented Aug 2, 2019

Changelog: v1.2.0 - 2019-08-02

Fixed

Changed

vogelpi left a comment

Choose a reason for hiding this comment

vogelpi Aug 5, 2019

Choose a reason for hiding this comment

vogelpi Aug 5, 2019

Choose a reason for hiding this comment

ctbur Aug 5, 2019 • edited

Choose a reason for hiding this comment

vogelpi Aug 6, 2019

Choose a reason for hiding this comment

vogelpi Aug 5, 2019

Choose a reason for hiding this comment

ctbur Aug 5, 2019

Choose a reason for hiding this comment

ctbur left a comment

Choose a reason for hiding this comment

ctbur Aug 6, 2019

Choose a reason for hiding this comment

ctbur Aug 5, 2019 •

edited