Skip to content
This repository has been archived by the owner on Jan 6, 2022. It is now read-only.

RC v1.2.0 #10

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
4 changes: 4 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -2,3 +2,7 @@
*build/
*.o
*~
*testsuite
*__pycache__
*junit-reports

8 changes: 8 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,14 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/), and this project adheres to
[Semantic Versioning](http://semver.org).

## v1.2.0 - 2018-10-17
### Fixed
- Fix [#8](https://github.com/pulp-platform/libhero-target/issues/8). Fixed `hero_dma_memcpy_async` API. In case of big memory transfers, some DMA job were leacked, resulting on the termination of DMA channels available.

### Changed
- Added API to access HW cycles counters.


## v1.1.0 - 2018-09-25
### Fixed
- Fix [#22](https://github.com/pulp-platform/hero-sdk/issues/22)
Expand Down
24 changes: 24 additions & 0 deletions host/hero-target.c
Original file line number Diff line number Diff line change
Expand Up @@ -96,3 +96,27 @@ hero_rt_core_id(void)
{
return omp_get_thread_num();
}

void
hero_rt_start_cycle_cnt()
{
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be a corresponding timer implementation for the host side as well? I think this would be useful when doing profiling and comparing host vs. accelerator execution. What do you think?

Copy link

@ctbur ctbur Aug 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PULP timer returns the number passed of clock cycles. How would you implement this on the host while staying consistent with the unit (clock cycles)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see two options:

  1. You are using functions like clock_gettime() on the host which gives you the time in seconds and nanoseconds. You can then convert this into accelerator clock cycles or host clock cycles depending on what comparison you want to do. The frequencies of these two clocks you can get from the pulp struct defined and set up by libpulp (see hero-support), namely pulp->pulp_clk_freq_mhz and pulp->host_clk_freq_mhz. Alternatively, you can also get the host clock frequency from sysfs.

  2. Despite executing on the host, you can still use the timer inside the accelerator directly. How to do this is illustrated here:
    https://github.com/pulp-platform/hero-support/blob/6762d055953089eea215c60a58ad3a62c6b717e2/libpulp/src/pulp.c#L503
    You just need to be careful that the accelerator is not manipulating the timer simultaneously.

Best regards,
Pirmin

}

void
hero_rt_reset_cycle_cnt()
{
return;
}

void
hero_rt_stop_cycle_cnt()
{
return;
}

int
hero_rt_get_cycles()
{
return 0x0;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing newline

38 changes: 38 additions & 0 deletions inc/hero-target.h
Original file line number Diff line number Diff line change
Expand Up @@ -178,6 +178,44 @@ void hero_l2free(void * a);
\return The core ID.
*/
int hero_rt_core_id();

/** Start the processor cycle counter.

Start to commute the cycle counter. Note, at boot, the starting value is unpredictable because the counter is not automatically resetted during the `hero_rt_start_cycle_cnt` function. Reset must be manually controlled by the user. Following, the correct initialization sequence for the cycle counter:

```
hero_rt_reset_cycle_cnt();
hero_rt_start_cycle_cnt();
```

\return void.
*/
void hero_rt_start_cycle_cnt();

/** Reset the processor cycle counter.

Reset to 0 the processor cycle counter value.

\return void.
*/
void hero_rt_reset_cycle_cnt();

/** Stop the processor cycle counter.

Stop the processor cycle counter commutation. The cycle counter value can be readed using the function `hero_rt_get_cycles()`. The cycles counting can be resumed using the function `hero_rt_start_cycles_cnt()`.

Note. The counter value must be resetted manually by the user.

\return void.
*/
void hero_rt_stop_cycle_cnt();

/** Get the processor cycle counter value.

\return the current cycle counter value, or 0 if unavailable.
*/
int hero_rt_get_cycles();

//FIXME: hero_rt_info();
//FIXME: hero_rt_error();

Expand Down
35 changes: 31 additions & 4 deletions pulp/hero-target.c
Original file line number Diff line number Diff line change
Expand Up @@ -58,7 +58,8 @@ hero_dma_memcpy_async(void *dst, void *src, int size)
int ext2loc;
unsigned int ext_addr_tmp, ext_addr, loc_addr;
int size_tmp = size;
hero_dma_job_t dma = 0;
hero_dma_job_t dma_job = plp_dma_counter_alloc();
uin32_t dma_cmd;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: uin32_t => uint32_t


// get direction
if ((unsigned int) dst < ARCHI_CLUSTER_GLOBAL_ADDR(0) ||
Expand Down Expand Up @@ -100,15 +101,17 @@ hero_dma_memcpy_async(void *dst, void *src, int size)
pulp_tryread((unsigned *)ext_addr_tmp);
pulp_tryread((unsigned *)((ext_addr + size_tmp - 1) & 0xFFFFFFFC));

// just wait for the last one...
dma = (hero_dma_job_t)plp_dma_memcpy_priv(ext_addr,loc_addr,size_tmp,ext2loc);
//dma_job = (hero_dma_job_t)plp_dma_memcpy_priv(ext_addr,loc_addr,size_tmp,ext2loc);
dma_cmd = plp_dma_getCmd(ext2loc, size, PLP_DMA_1D, PLP_DMA_TRIG_EVT, PLP_DMA_NO_TRIG_IRQ, PLP_DMA_PRIV);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this change is sufficient to fix the issue described in #8 . This should definitely be tested. @ctbur could you have a look please?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I will test it as soon as my other DMA and compiler issues are resolved. This way I won't have to create a custom test case.

__asm__ __volatile__ ("" : : : "memory");
plp_dma_cmd_push(dma_cmd, loc_addr, ext_addr);

size -= size_tmp;
ext_addr += size_tmp;
loc_addr += size_tmp;
}

return dma;
return dma_job;
}

void
Expand Down Expand Up @@ -152,3 +155,27 @@ hero_rt_core_id(void)
{
return rt_core_id();
}

void
hero_rt_start_cycle_cnt()
{
start_timer();
}

void
hero_rt_reset_cycle_cnt()
{
reset_timer();
}

void
hero_rt_stop_cycle_cnt()
{
stop_timer();
}

int
hero_rt_get_cycles()
{
return get_time();
}
12 changes: 12 additions & 0 deletions testset.cfg
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
from plptest import *

TestConfig = c = {}

tests = Testset(
name = 'libhero-target',
files = [
'testsuite/timers/testset.cfg',
]
)

c['testsets'] = [ tests ]
3 changes: 3 additions & 0 deletions testsuite/timers/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
CSRCS = main.c

-include ${HERO_OMP_EXAMPLES_DIR}/common/default.mk
79 changes: 79 additions & 0 deletions testsuite/timers/main.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
/*
* HERO HelloWorld Example Application
*
* Copyright 2018 ETH Zurich, University of Bologna
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include <omp.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <time.h>
#include <hero-target.h>

struct timespec start, stop;
double start_ns, stop_ns, exe_time;

#pragma omp declare target
void test(int *ret, int *aux)
{
int cycles1=0x0;
int cycles2=0x0;

hero_rt_reset_cycle_cnt();
hero_rt_start_cycle_cnt();

for(int i=0; i<1000; i++)
*aux+=hero_rt_get_cycles();

hero_rt_stop_cycle_cnt();
cycles1=hero_rt_get_cycles();

if(cycles1<=0x0)
*ret=1;
if(cycles1<1000)
*ret=2;

cycles2=hero_rt_get_cycles();
if(cycles1!=cycles2)
*ret=3;

hero_rt_start_cycle_cnt();
for(int i=0; i<1000; i++)
*aux+=hero_rt_get_cycles();
hero_rt_stop_cycle_cnt();
cycles1=hero_rt_get_cycles();
if(cycles1==cycles2)
*ret=4;

hero_rt_reset_cycle_cnt();
cycles1=hero_rt_get_cycles();
if(cycles1!=0x0)
*ret=5;
}
#pragma omp end declare target

int main(int argc, char *argv[])
{
int ret=0;
int aux=0;

omp_set_default_device(BIGPULP_MEMCPY);

#pragma omp target map(tofrom:ret,aux)
test(&ret,&aux);

return ret;
}
8 changes: 8 additions & 0 deletions testsuite/timers/testset.cfg
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
from plptest import *
TestConfig = c = {}
def check_output(config, output):
return(output.find("make: *** [run] Error") == -1, None)

c['tests'] = [
Test(name = 'timers', commands = [ Shell('clean', 'make clean'), Shell('build', 'make all'), Shell('run', 'make run'), Check('check', check_output) ], timeout=1000000),
]