Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File reader with repetitions; Correct cmake configuration with QThread library local install; Simplify add_port recursion helper and use unordered_set for kernelkeeper #161

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

BamboWu
Copy link
Contributor

@BamboWu BamboWu commented Jun 17, 2022

  • ## Description
  1. In order to adjust benchmark working size without bringing huge files, we let filereader kernel optionally take an argument, repetitions, which is the number of times a file is read repeatedly to feed the dataflow graph with arbitrary data to process.
  2. The QThread include path was added with leading -I and the compiling options generated by cmake would be collapsed (missing space between two -I) making include directories invalid.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

Changes are individually tested with the word search example.

  • Runs locally on Windows
  • Runs locally on Linux
  • Runs locally on OS X

Details

QThread library is compiled and installed at /bambo/qthread_install.

RaftLib build is configured with the cmake command:

$ QTHREAD_PATH=/bambo/qthread_install/ cmake -DCMAKE_INSTALL_PREFIX=/bambo/raft_install ..

Then make install.

To compile the search example:

$ export PKG_CONFIG_PATH=/mnt/ampere3/bambo/bmkvl/raft_install/pkgconfig/:$PKG_CONFIG_PATH
$ g++ -o search -DQTHREAD_FOUND $(pkg-config --cflags raftlib) ../examples/cppalgo/search/bmh.cpp $(pkg-config --libs raftlib)

To run the search example:

$ ./search ../examples/cppalgo/search/alice.txt
# or with 2 repetitions
$ ./search ../examples/cppalgo/search/alice.txt 2

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Simplify the port_helper struct that was used for recursively adding
ports. Two template helper functions should be neat.
Change kernelkeeper to use unordered_set instead of set should make it
more scalable (O(logn) access time reduced to O(1) access time).
There is a dependency between class kpair and kernel, so their method
implementations have to stay in a source file. By introducing the meta
data structure KernelPortMeta, kpair avoids to invoke the kernel method
directly. Instead, kernel populates the KernelPortMeta structure ahead
and kpair gets kernel pointer, port name, in/out port number from
KernelPortMeta. This way, the mutual dependency between kpair and kernel
is simplified to be kernel depends on kpair.
@BamboWu BamboWu force-pushed the ut_dev branch 2 times, most recently from d010b28 to 88bc40d Compare February 7, 2023 03:05
BamboWu and others added 9 commits February 18, 2023 18:26
There was a minor bug when compile with BENCHMARK flag defined and this
commit fixes it. Exposing libut uthread pointer helps us to trace the
task scheduling.
Use unordered_map could lower the portmap_t access time complexity.
The size of dst_kernels could be immediately taken to set waitgroup.
The support for non-string port name was broken due to some bugs fixed
in this commit.
Now to switch using compiler hashed port names, set `-DSTRING_NAMES=0`
for cmake command.
Please note that the compiler must support C++20 to have this feature.
… the test cases, will probably just make a sed script or something to convert them
This commit changes the logic to increment the blocked counter, so that
the fraction function can give the real fraction to reflect the blocked
ratio. Thus, dynalloc could have its condition satisfied to do resize.
Another bug fixed by this commit is for those non-trivially-copyable
data types such as std::string, should mark the FIFOs not resizable and
avoid memcpy() those data and get undefined behaviors.
The length_signal should be sizeof( Signal ) * max_cap, otherwise in
resizing, the memcpy() for signal buffer could corrupt memory.
Export setPtrMap(), setPtrSet(), setInPeekSet(), setOutPeekSet() so they
can be called from benchmarks.
Copy link
Collaborator

@mr-j0nes mr-j0nes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, so many changes here. I get no error when compiling the library on my end, but I do get errors when compiling the examples.

What I do is:

cd build
cmake -DBUILD_EXAMPLES=Yes ..
make

Also I do get errors when trying to compile my software with this version, here some of the warnings and errors:

/develop/sandbox/thread/RaftLib/./raftinc/defs.hpp:110:5: error: ‘raft::port_key_type operator""_port(const char*,
 std::size_t)’ defined but not used [-Werror=unused-function]                                                                 
  110 |     operator""_port( const char *input, std::size_t len )

/develop/myapp/src/StreamBuilder.cpp:211:62: error: no match for ‘operator>>’ (opera
nd types are ‘KernelPortMeta’ and ‘KernelPortMeta’) 

Is the backward compatibility intended to be preserved with these changes?

@jonathan-beard
Copy link
Member

I think the original intent is to provides backwards/forwards compatibility. I should be able to compile these over the weekend to help out. The _port syntax should provide a compile-time numerical mapping to the string that is unique, and therefore faster port lookup (assuming no hash collisions). Definitely lots of changes, most for the better (and bug fixes), but I definitely don't want to break old apps while doing so.

find_package( QThreads )
find_package( UT )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably need to make sure we add documentation for this one, as in what it is, where to get it, and how to compile + limitations. There's definitely advantages over QThreads.

fr read( argv[ 1 ], (fr::offset_type) term.length(), 1 );
if( argc > 2 )
{
repetitions = atoi( argv[ 2 ] );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably use the included cmdargs package for the examples, I'll see if I can make the change for this one, then push to the same pull request. @jonathan-beard

{
repetitions = atoi( argv[ 2 ] );
}
fr read( argv[ 1 ], (fr::offset_type) term.length(), 1, repetitions );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for this one, parse with cmd-args so we get a menu for the example then use the assigned variable here.

* all the currently allocated FIFO objects within the streaming
* graph. This is primarily for instrumentation puposes.
* @author: Jonathan Beard
* @version: Tue Sep 16 20:20:06 2014
*
*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BamboWu , did UT (or yourself) want to also add yourself to the author line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one should be fine, I think.
The most of modifications I did is on another branch armq_dev, and I added UT in the header there.

{
for( FIFO *f : allocated_fifo )
{
#if RAFT_DUMP_STATS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonathan-beard , find a better solution for this. Perhaps define a shared memory hook that is automatically dumped to if a telemetry agent is attached?

src->setFIFO( fifo );
dst->setFIFO( fifo );
/** NOTE: this list simply speeds up the monitoring if we want it **/
allocated_fifo.insert( fifo );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to double check that the monitoring thread (if it's used) still traverses this structure (e.g., to dump periodic frames of queue stats.

stream << "service rate: " << s.service_rate << "\n";
return( stream );
}
std::uint16_t occ_in = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonathan-beard , double check to see if the service rate instrumentation and such is still functional. I think @BamboWu was primarily using the enqueue/dequeue blocked stats, would be nice to have the service rate estimates going again for initial transport buffer sizing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants